Fixing No Disk Space Problem Caused by Snapshots in OpenSUSE

01/16/2017


Problem

I ran into a problem with my OpenSUSE Leap 42.1 system today. When I started it, OpenSUSE could not enter the desktop and left the message “No space left on device”. Luckily, I was still able to operate the system through a terminal. Running the df -h command revealed the current disk usage:

Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G  4.0K  3.9G   1% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G   11M  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda9        21G   20G     0 100% /
/dev/sda9        21G   20G     0 100% /var/tmp
/dev/sda9        21G   20G     0 100% /var/spool
/dev/sda9        21G   20G     0 100% /var/opt
/dev/sda9        21G   20G     0 100% /var/lib/pgsql
/dev/sda9        21G   20G     0 100% /var/lib/mailman
/dev/sda9        21G   20G     0 100% /var/lib/named
/dev/sda2       256M   39M  218M  15% /boot/efi
/dev/sda9        21G   20G     0 100% /var/lib/mysql
/dev/sda9        21G   20G     0 100% /var/lib/mariadb
/dev/sda10       30G  8.3G   22G  29% /home
/dev/sda9        21G   20G     0 100% /var/lib/libvirt/images
/dev/sda9        21G   20G     0 100% /var/crash
/dev/sda9        21G   20G     0 100% /tmp
/dev/sda9        21G   20G     0 100% /srv
/dev/sda9        21G   20G     0 100% /opt
/dev/sda9        21G   20G     0 100% /boot/grub2/x86_64-efi
/dev/sda9        21G   20G     0 100% /boot/grub2/i386-pc
/dev/sda9        21G   20G     0 100% /usr/local
/dev/sda9        21G   20G     0 100% /var/log
/dev/sda9        21G   20G     0 100% /.snapshots

Among the output, /dev/sda9 is a Btrfs volume with many subvolumes. As the output shows, disk space on this volume had been exhausted. How did this happen?
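
To confirm that all those mount points really belong to one Btrfs filesystem, you can list its subvolumes directly. This is the standard btrfs-progs invocation, which is installed by default on OpenSUSE:

# Every /dev/sda9 mount point above is a subvolume of one Btrfs filesystem.
sudo btrfs subvolume list /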

(Note: if you just want the solution, skip to the “Summary” section near the end of this article.)

Cause

Further analysis showed that a huge amount of disk space was consumed by /.snapshots, which stores the snapshots Btrfs generates in OpenSUSE. Snapshots record system files at a certain point in time so that they can be restored later if needed. Running ls -al /.snapshots lists the directory's contents:

total 4.0K
drwxr-xr-x 1 root root   32 Jan 14  2016 1
drwxr-xr-x 1 root root   66 Jul 28 03:09 217
drwxr-xr-x 1 root root   98 Jul 28 03:09 218
drwxr-xr-x 1 root root   66 Aug 11 22:53 234
drwxr-xr-x 1 root root   98 Aug 11 22:53 235
drwxr-xr-x 1 root root   66 Sep 12 21:07 260
drwxr-xr-x 1 root root   98 Sep 12 21:07 261
drwxr-xr-x 1 root root   66 Oct 14 13:24 300
drwxr-xr-x 1 root root   98 Oct 14 13:24 301
drwxr-xr-x 1 root root   66 Oct 24 14:28 312
drwxr-xr-x 1 root root   98 Oct 24 14:28 313
drwxr-xr-x 1 root root   66 Oct 27 05:24 316
drwxr-xr-x 1 root root   98 Oct 27 05:24 317
drwxr-xr-x 1 root root   66 Nov  3 17:30 318
drwxr-xr-x 1 root root   98 Nov  3 17:30 319
drwxr-xr-x 1 root root   66 Nov  3 17:31 320
drwxr-xr-x 1 root root   98 Nov  3 17:31 321
drwxr-xr-x 1 root root   66 Nov 10 13:06 322
drwxr-xr-x 1 root root   98 Nov 10 13:06 323
drwxr-xr-x 1 root root   66 Nov 20 12:43 324
drwxr-xr-x 1 root root   98 Nov 20 12:43 325
drwxr-xr-x 1 root root   66 Nov 26 21:22 326
drwxr-xr-x 1 root root   98 Nov 26 21:22 327
-rw-r----- 1 root root 2.6K Nov 26 21:22 grub-snapshot.cfg

Each directory with a numeric name stood for one snapshot. Running du -sh on a directory, for example du -sh 323, showed its total size: 9.0G!

However, even though each snapshot was about 8-9G, that does not mean each one occupied that much space exclusively (the whole volume was only 21G). Because of how snapshots work, content can be shared by multiple snapshots while its size is counted in every snapshot that references it.

For example, suppose directories A, B, and C consume 3G, 10G, and 5G of disk space respectively. Now snapshot1 is created to include A and B, while snapshot2 includes B and C. The reported size of snapshot1 will be 3G+10G=13G and that of snapshot2 will be 10G+5G=15G, but since B is stored only once, the actual space consumed by the two snapshots together is not 28G but 3G+10G+5G=18G.
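
If you want the real exclusive figures instead of an estimate, newer versions of btrfs-progs provide a btrfs filesystem du command that reports total versus exclusive usage per subvolume. A minimal sketch, assuming your btrfs-progs version already ships the command (each snapper snapshot lives in /.snapshots/<number>/snapshot):

# "Total" counts all data a snapshot references; "Exclusive" counts data
# that would actually be freed by deleting that snapshot alone.
sudo sh -c 'btrfs filesystem du -s /.snapshots/*/snapshot'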

To view a detailed list of snapshots, use sudo snapper ls:

Type   | #   | Pre # | Date                     | User | Cleanup | Description           | Userdata     
-------+-----+-------+--------------------------+------+---------+-----------------------+--------------
single | 0   |       |                          | root |         | current               |              
single | 1   |       | Thu Jan 14 05:40:57 2016 | root |         | first root filesystem |              
pre    | 217 |       | Thu Jul 28 03:06:07 2016 | root | number  | zypp(packagekitd)     | important=yes
post   | 218 | 217   | Thu Jul 28 03:09:01 2016 | root | number  |                       | important=yes
pre    | 234 |       | Thu Aug 11 22:51:06 2016 | root | number  | zypp(packagekitd)     | important=yes
post   | 235 | 234   | Thu Aug 11 22:53:19 2016 | root | number  |                       | important=yes
pre    | 260 |       | Mon Sep 12 21:05:52 2016 | root | number  | zypp(packagekitd)     | important=yes
post   | 261 | 260   | Mon Sep 12 21:07:17 2016 | root | number  |                       | important=yes
pre    | 300 |       | Fri Oct 14 13:20:31 2016 | root | number  | zypp(packagekitd)     | important=yes
post   | 301 | 300   | Fri Oct 14 13:24:16 2016 | root | number  |                       | important=yes
pre    | 312 |       | Mon Oct 24 14:26:34 2016 | root | number  | zypp(packagekitd)     | important=yes
post   | 313 | 312   | Mon Oct 24 14:28:09 2016 | root | number  |                       | important=yes
pre    | 316 |       | Thu Oct 27 05:24:34 2016 | root | number  | zypp(zypper)          | important=no 
post   | 317 | 316   | Thu Oct 27 05:24:38 2016 | root | number  |                       | important=no 
pre    | 318 |       | Thu Nov  3 17:29:58 2016 | root | number  | zypp(packagekitd)     | important=no 
post   | 319 | 318   | Thu Nov  3 17:30:09 2016 | root | number  |                       | important=no 
pre    | 320 |       | Thu Nov  3 17:30:25 2016 | root | number  | zypp(packagekitd)     | important=no 
post   | 321 | 320   | Thu Nov  3 17:31:45 2016 | root | number  |                       | important=no 
pre    | 322 |       | Thu Nov 10 13:06:29 2016 | root | number  | zypp(packagekitd)     | important=no 
post   | 323 | 322   | Thu Nov 10 13:06:55 2016 | root | number  |                       | important=no 
pre    | 324 |       | Sun Nov 20 12:42:08 2016 | root | number  | zypp(packagekitd)     | important=no 
post   | 325 | 324   | Sun Nov 20 12:43:47 2016 | root | number  |                       | important=no 
pre    | 326 |       | Sat Nov 26 21:21:03 2016 | root | number  | zypp(packagekitd)     | important=no 
post   | 327 | 326   | Sat Nov 26 21:22:19 2016 | root | number  |                       | important=no 

As you can see, the snapshot numbers (ids) in the second column matched the directories in /.snapshots. Most snapshots came in pairs because they were created automatically before and after a system event.

As mentioned before, it is difficult to tell how much space each snapshot actually occupies. Nevertheless, there is a way to estimate it. The sudo snapper status command lists all file changes between two snapshots, one change per line. By counting the lines in the output, we can tell how many files have changed. For example:

$ sudo snapper status 318..319 | wc -l
9135

The output means that 9135 files were added, deleted, or modified between snapshots 318 and 319. All of those old file versions were kept in the snapshots instead of being removed from disk; that is a lot of wasted space if you never need them!
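
To get an overview of the churn between every pre/post pair at once, the snapper ls output can be fed through a small loop. This is a rough sketch that parses the table shown above with awk; the field numbers may need adjusting if your snapper version formats the columns differently:

# For each "post" row, read its pre/post numbers and count changed files.
sudo snapper ls | awk -F'|' '$1 ~ /post/ {print $3, $2}' | while read pre post; do
  printf '%s..%s: ' "$pre" "$post"
  sudo snapper status $pre..$post | wc -l
done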

Solution

Now that the cause was found, solving the problem was not difficult at all. The command to delete an unwanted snapshot is sudo snapper delete --sync # (# is the snapshot id). After I deleted one old snapshot, 9% of the disk space was freed immediately. The Snapper manual recommends deleting a pre/post pair of snapshots together. Since there were many snapshots ranging from 217 to 327 on my disk, I used the following one-liner to delete them all at once (snapper simply reports an error for any number in the range that does not exist, and the loop moves on):

for i in $(seq 217 327); do sudo snapper delete --sync $i; done
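
Depending on your snapper version, delete also accepts a number range, which avoids the loop entirely (check man snapper to confirm your version supports it):

sudo snapper delete --sync 217-327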

Problem solved! There was a lot more free space on the disk now.

Since the disk space was now released, OpenSUSE started normally after a reboot. However, it would be preferable to prevent this from happening again, so we need to know how those snapshots were generated automatically. The OpenSUSE Administration Reference provides a detailed explanation: by default, the use of YaST and Zypper triggers automatic snapshot creation when one of the following things happens:

  1. One or more packages are installed, updated, or removed with YaST or Zypper.
  2. The system is administered with YaST (one snapshot is created when an administration module starts and another when it closes).

Snapshots created in these two cases are called Installation Snapshots and Administration Snapshots respectively. The two functions are enabled by plugin packages installed on the system. To me, the second one is the more annoying, since a pair of snapshots is created every time you open and close an administration tool, regardless of whether you actually changed anything.

So, in order to restrict automatic snapshot creation, one of the following two methods can be applied:

(I) Disable automatic snapshots

Automatic snapshot creation can be switched off by removing the plugin packages that provide it (for example, the snapper-zypp-plugin package is what hooks snapshot creation into Zypper). The downside of this method is that, without those snapshots, it becomes harder to restore files to a certain point when needed. This can be mitigated by manually creating a snapshot before making important changes:

snapper create --description "Describe the snapshot"
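
If you want to mimic the pre/post pairs that the automatic plugins create, snapper supports that as well. A small sketch using the --type, --print-number, and --pre-number options from the snapper man page; the description text is just an example:

# Create a "pre" snapshot and remember its number.
PRE=$(sudo snapper create --type pre --print-number --description "before manual change")
# ... make your changes, then close the pair with a matching "post" snapshot.
sudo snapper create --type post --pre-number $PRE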

(II) Change automatic clean-up setting

In the earlier sudo snapper ls output, there was a column named “Cleanup”. It indicates the automatic clean-up scheme; by default, the scheme for all automatic snapshots is “number”. The detailed configuration can be found in /etc/snapper/configs/root. Open the file and look for the following section:

# run daily number cleanup
NUMBER_CLEANUP="yes"

# limit for number cleanup
NUMBER_MIN_AGE="1800"
NUMBER_LIMIT="10"
NUMBER_LIMIT_IMPORTANT="10"

As the settings suggest, under the number scheme snapshots are deleted once their count exceeds a limit. With the settings above, for example, if there are more than 10 snapshots, the excess ones older than 1800 seconds are removed. Snapshots marked “important” are counted against their own limit (NUMBER_LIMIT_IMPORTANT), separately from the rest. Snapper always deletes the oldest snapshots first when cleaning up.

Thus, if you are troubled by too many automatic snapshots, change the number limit to something smaller, for example 2. The change can be made directly in the config file above, or by executing sudo snapper -c root set-config "NUMBER_LIMIT=2". The automatic clean-up will then keep fewer snapshots, saving more space.

It is worth noting that in OpenSUSE 42.1 the automatic clean-up runs on a daily basis, so simply changing the number limit does not affect existing snapshots instantly. To apply the new limit immediately, execute /etc/cron.daily/suse.de-snapper to initiate a clean-up right away.
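
As far as I can tell, that cron script essentially invokes snapper's cleanup subcommand for each configured snapper config, so an equivalent way to trigger the number-based clean-up by hand is:

# Apply the NUMBER_* limits from /etc/snapper/configs/root right now.
sudo snapper cleanup number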

Summary

To free the space occupied by snapshots and prevent further space consumption, do the following (a combined script follows the list):
(All commands should be run as root or with sudo.)

  1. List existing snapshots
    snapper ls

  2. Find the snapshot ids in the “#” column, or go to step 4 directly.

  3. Delete unnecessary snapshots by id.
    snapper delete --sync [id]

  4. Set automatic snapshot clean-up number limit to a smaller value.
    snapper -c root set-config "NUMBER_LIMIT=2"
    snapper -c root set-config "NUMBER_LIMIT_IMPORTANT=2"

  5. Execute /etc/cron.daily/suse.de-snapper to perform clean-up.
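
Putting it all together, here is a minimal sketch of the whole clean-up as one script; the 217-327 range is from my system, so adjust it to match your own snapper ls output:

#!/bin/sh
# Run as root. Delete the old automatic snapshots (range is an example).
for i in $(seq 217 327); do
  snapper delete --sync $i
done
# Keep at most 2 regular and 2 "important" automatic snapshots from now on.
snapper -c root set-config "NUMBER_LIMIT=2"
snapper -c root set-config "NUMBER_LIMIT_IMPORTANT=2"
# Trigger the clean-up immediately instead of waiting for the daily cron job.
/etc/cron.daily/suse.de-snapper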

References

OpenSUSE Administration Reference
