Troubleshooting a Linux Server with No Free Disk Space

June 24, 2019

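Today I happened to reboot my server and found that MySQL would no longer start. Starting the service by hand made no difference and only produced the following errors.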

  Jun 24 15:22:32 ubuntu systemd[1]: Starting MySQL Community Server...
  Jun 24 15:22:33 ubuntu systemd[1]: mysql.service: Main process exited, code=exited, status=1/FAILURE

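My server has only 20 GB of disk, so my first guess was that the disk had filled up and MySQL could not start because there was no storage left for it to use. I checked the usage of each mount point with df, and sure enough the / mount point was full, and the /boot mount point had no space left either.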

  root@ubuntu:~# df -lh
  Filesystem      Size  Used Avail Use% Mounted on
  udev            484M     0  484M   0% /dev
  tmpfs           101M  5.6M   96M   6% /run
  /dev/sda2        20G   19G  129M 100% /
  tmpfs           504M     0  504M   0% /dev/shm
  tmpfs           5.0M     0  5.0M   0% /run/lock
  tmpfs           504M     0  504M   0% /sys/fs/cgroup
  /dev/sda1       361M  359M     0 100% /boot
  tmpfs           101M     0  101M   0% /run/user/0
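
Picking out the full mount points from a longer table can be left to a small filter. The following is only a sketch: the function name full_mounts and the 90% default threshold are my own illustrative choices, and it assumes POSIX-format output (df -P) with no spaces in the mount point paths.

```shell
# Print mount points whose usage is at or above a threshold (default 90%).
# Reads `df -P` style output on stdin, so it also works on saved output.
# Assumption: mount point paths contain no spaces.
full_mounts() {
    threshold="${1:-90}"
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # "100%" -> "100"
        if (use + 0 >= t) print $6, $5
    }'
}

# Usage on a live system (local filesystems only):
#   df -lP | full_mounts 90
```

Reading from stdin rather than calling df inside the function keeps the filter easy to try against output copied from another machine.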

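When /boot runs out of space, in my experience the usual cause is too many Linux kernels: after Ubuntu installs a new kernel, the old ones stay on the system, so the old kernels need to be cleaned up.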

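apt-get autoremove normally takes care of this, but in my case it reported a string of errors, and running apt-get -f install as suggested did not succeed either. So I decided to remove the old kernels by hand.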

  root@ubuntu:~# sudo apt-get autoremove
  Reading package lists... Done
  Building dependency tree
  Reading state information... Done
  You might want to run 'apt-get -f install' to correct these.
  The following packages have unmet dependencies:
   linux-image-generic : Depends: linux-image-4.4.0-151-generic but it is not installed or
                                  linux-image-unsigned-4.4.0-151-generic but it is not installed
                         Recommends: thermald but it is not installed
   linux-modules-extra-4.4.0-151-generic : Depends: linux-image-4.4.0-151-generic but it is not installed or
                                                    linux-image-unsigned-4.4.0-151-generic but it is not installed
  E: Unmet dependencies. Try using -f.

First, check the running kernel version. My server is running 4.4.0-150.

  root@ubuntu:~# uname -r
  4.4.0-150-generic

Next, list the kernels that can be removed. The first column of the listing below is the dpkg package state:

  • rc: the package has been removed and only its configuration files remain
  • ii: the package is installed; because the running kernel is filtered out, these are the candidates for removal
  • iU: the package has been unpacked but not yet configured, so its installation is unfinished (do not remove)

  root@ubuntu:~# dpkg -l | tail -n +6 | grep -E 'linux-image-[0-9]+' | grep -Fv "$(uname -r)"
  ii linux-image-4.4.0-130-generic 4.4.0-130.156 amd64 Linux kernel image for version 4.4.0 on 64 bit x86 SMP
  ii linux-image-4.4.0-134-generic 4.4.0-134.160 amd64 Linux kernel image for version 4.4.0 on 64 bit x86 SMP
  ii linux-image-4.4.0-137-generic 4.4.0-137.163 amd64 Linux kernel image for version 4.4.0 on 64 bit x86 SMP
  ii linux-image-4.4.0-139-generic 4.4.0-139.165 amd64 Linux kernel image for version 4.4.0 on 64 bit x86 SMP
  ii linux-image-4.4.0-141-generic 4.4.0-141.167 amd64 Linux kernel image for version 4.4.0 on 64 bit x86 SMP
  ii linux-image-4.4.0-142-generic 4.4.0-142.168 amd64 Linux kernel image for version 4.4.0 on 64 bit x86 SMP
  ii linux-image-4.4.0-143-generic 4.4.0-143.169 amd64 Signed kernel image generic
  ii linux-image-4.4.0-145-generic 4.4.0-145.171 amd64 Signed kernel image generic
  ii linux-image-4.4.0-148-generic 4.4.0-148.174 amd64 Signed kernel image generic
  ii linux-image-4.4.0-62-generic 4.4.0-62.83 amd64 Linux kernel image for version 4.4.0 on 64 bit x86 SMP
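
The same selection can be written as a single awk filter that checks the state column explicitly instead of relying on tail and two grep passes. This is a sketch under assumptions: the function name removable_kernels is hypothetical, and it only excludes the running kernel, so a newer kernel that is installed but not yet booted would still be listed and should be kept by hand.

```shell
# List installed ("ii") versioned kernel image packages other than the
# running kernel. Reads `dpkg -l` output on stdin; $1 is the running
# release as printed by `uname -r`.
removable_kernels() {
    running="$1"
    awk -v run="linux-image-$running" '
        $1 == "ii" && $2 ~ /^linux-image-[0-9]/ && $2 != run { print $2 }'
}

# Usage on a live system:
#   dpkg -l | removable_kernels "$(uname -r)"
```

Matching on the first column means rc and iU entries never reach the output, and the header lines of dpkg -l are skipped without counting them.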

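With the list of removable kernels in hand I tried to uninstall them with dpkg --purge, but unfortunately that did not work either. The image package could not be removed because linux-image-extra depends on it, and purging linux-image-extra failed with No space left on device while the initramfs was being regenerated.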

  root@ubuntu:~# sudo dpkg --purge linux-image-4.4.0-130-generic
  dpkg: dependency problems prevent removal of linux-image-4.4.0-130-generic:
   linux-image-extra-4.4.0-130-generic depends on linux-image-4.4.0-130-generic.
  dpkg: error processing package linux-image-4.4.0-130-generic (--purge):
   dependency problems - not removing
  Errors were encountered while processing:
   linux-image-4.4.0-130-generic
  root@ubuntu:~# sudo dpkg --purge linux-image-extra-4.4.0-130-generic
  (Reading database ... 454615 files and directories currently installed.)
  Removing linux-image-extra-4.4.0-130-generic (4.4.0-130.156) ...
  depmod: FATAL: could not load /boot/System.map-4.4.0-130-generic: No such file or directory
  run-parts: executing /etc/kernel/postinst.d/apt-auto-removal 4.4.0-130-generic /boot/vmlinuz-4.4.0-130-generic
  run-parts: executing /etc/kernel/postinst.d/initramfs-tools 4.4.0-130-generic /boot/vmlinuz-4.4.0-130-generic
  update-initramfs: Generating /boot/initrd.img-4.4.0-130-generic
  W: mdadm: /etc/mdadm/mdadm.conf defines no arrays.
  gzip: stdout: No space left on device
  E: mkinitramfs failure cpio 141 gzip 1
  update-initramfs: failed for /boot/initrd.img-4.4.0-130-generic with 1.
  run-parts: /etc/kernel/postinst.d/initramfs-tools exited with return code 1
  dpkg: error processing package linux-image-extra-4.4.0-130-generic (--purge):
   subprocess installed post-removal script returned error exit status 1
  Errors were encountered while processing:
   linux-image-extra-4.4.0-130-generic

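All of the approaches above failed for the same reason: the /boot mount point was completely full. Kernel-related operations, such as those run by apt, need some free disk space themselves, so none of them could succeed. The only option left was to delete some files purely by hand.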

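First I went into /boot and listed its files with du. Every file whose name does not end in 4.4.0-150-generic belongs to an old kernel, apart from the config and System.map files of the half-installed 4.4.0-151.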

  root@ubuntu:~# cd /boot
  root@ubuntu:/boot# du -sk * | sort -n
  1 retpoline-4.4.0-139-generic
  1 retpoline-4.4.0-142-generic
  12 lost+found
  188 config-4.4.0-139-generic
  188 config-4.4.0-142-generic
  188 config-4.4.0-143-generic
  188 config-4.4.0-145-generic
  188 config-4.4.0-148-generic
  188 config-4.4.0-150-generic
  188 config-4.4.0-151-generic
  1229 abi-4.4.0-139-generic
  1230 abi-4.4.0-142-generic
  3830 System.map-4.4.0-139-generic
  3830 System.map-4.4.0-142-generic
  3831 System.map-4.4.0-143-generic
  3831 System.map-4.4.0-145-generic
  3833 System.map-4.4.0-148-generic
  3834 System.map-4.4.0-150-generic
  3834 System.map-4.4.0-151-generic
  6859 grub
  7030 vmlinuz-4.4.0-139-generic
  7045 vmlinuz-4.4.0-142-generic
  7050 vmlinuz-4.4.0-145-generic
  7052 vmlinuz-4.4.0-143-generic
  7057 vmlinuz-4.4.0-148-generic
  7059 vmlinuz-4.4.0-150-generic
  9390 initrd.img-4.4.0-138-generic
  38910 initrd.img-4.4.0-139-generic
  38987 initrd.img-4.4.0-141-generic
  38997 initrd.img-4.4.0-143-generic
  38998 initrd.img-4.4.0-142-generic
  39657 initrd.img-4.4.0-145-generic
  39945 initrd.img-4.4.0-148-generic
  39946 initrd.img-4.4.0-150-generic
Based on the list above, I deleted the unneeded files with rm. Running du again showed that the files had been removed.

  root@ubuntu:/boot# sudo rm -rf /boot/*-4.4.0-{138,139,141,143,142,145,148}-*
  root@ubuntu:/boot# du -sk * | sort -n
  12 lost+found
  188 config-4.4.0-150-generic
  188 config-4.4.0-151-generic
  3834 System.map-4.4.0-150-generic
  3834 System.map-4.4.0-151-generic
  6859 grub
  7059 vmlinuz-4.4.0-150-generic
  39946 initrd.img-4.4.0-150-generic
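
A more cautious variant of the deletion above is to print the candidate files first and only remove them after reading the list. The sketch below is a dry run that deletes nothing. The function name stale_boot_files is hypothetical, and it assumes kernel files follow the name-release pattern seen in the listing, with every release ending in -generic.

```shell
# Print files in a boot directory that do not belong to any of the kernel
# releases given after the directory argument. This is a dry run: nothing
# is deleted.
stale_boot_files() {
    dir="$1"
    shift
    for f in "$dir"/*-[0-9]*-generic; do
        [ -f "$f" ] || continue        # skip directories and an unmatched glob
        keep=no
        for rel in "$@"; do
            case "$f" in
                *-"$rel") keep=yes ;;  # file name ends in a release to keep
            esac
        done
        [ "$keep" = yes ] || printf '%s\n' "$f"
    done
}

# Usage: keep the running kernel and the partially installed newer one.
#   stale_boot_files /boot "$(uname -r)" 4.4.0-151-generic
```

Because the glob only matches versioned file names, entries such as grub and lost+found never appear in the output.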

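After the deletion, usage of the /boot mount point immediately dropped by 80%. At that point apt works again, and the following command completely removes the unneeded kernels and packages.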

  sudo apt-get autoremove
