Database Backup Compression
On my Athlon X2 5200+, compressing the SQL backup:
$ ls -l ~/backup/mythconverg-1214-20090403154601.sql
-rw-r--r-- 1 me group 321488312 Apr 3 16:34 /home/me/backup/mythconverg-1214-20090403154601.sql
using gzip, bzip2, and xz (from the XZ Utils package, which uses the same LZMA compression algorithms used by 7-zip) gives the following results:
Compressor | Compression Time | Size After Compression | Size Compared to Uncompressed | Size Compared to gzip'ed |
None (uncompressed) | N/A | 307 MiB | 100% | 569.0% |
gzip | 14.3 seconds | 54 MiB | 17.6% | 100% |
bzip2 | 121.2 seconds (2min 1.2sec) | 45 MiB | 14.8% | 84.1% |
xz | 539 seconds (8min 59sec) | 30 MiB | 9.9% | 56.2% |
So, for mythconverg backup compression, gzip seems to be the winner for most users' needs once both space savings and CPU usage are factored in. It is extremely fast, and the space saved by using xz instead of gzip (23.6 MiB on a rather large 307 MiB backup file) is /not/ worth the processor time, let alone the energy cost[*], of the better compression algorithm. The same would not hold for other uses, where files are compressed infrequently and each compressed file is transmitted many times over bandwidth that costs more than CPU time.
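The percentages in the table and the 23.6 MiB figure can be re-derived from the byte counts that ls -l reports on this page. The following is a minimal sketch, assuming a POSIX shell with awk available; the helper names pct, mib, and report are made up for illustration.

```shell
#!/bin/sh
# Byte counts copied from the ls -l listings on this page.
RAW=321488312   # uncompressed .sql
GZ=56496450     # .sql.gz
BZ2=47531259    # .sql.bz2
XZ=31731304     # .sql.xz

# pct PART WHOLE: PART as a percentage of WHOLE, one decimal place.
pct() {
    awk -v p="$1" -v w="$2" 'BEGIN { printf "%.1f\n", 100 * p / w }'
}

# mib BYTES: BYTES converted to MiB (1 MiB = 1048576 bytes), one decimal place.
mib() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1048576 }'
}

# report NAME BYTES: one summary line per compressor.
report() {
    echo "$1: $(mib "$2") MiB, $(pct "$2" "$RAW")% of uncompressed, $(pct "$2" "$GZ")% of gzip"
}

report gzip "$GZ"
report bzip2 "$BZ2"
report xz "$XZ"
echo "xz saves $(mib $((GZ - XZ))) MiB over gzip"
```

Working from the byte counts rather than the rounded MiB values avoids compounding rounding errors in the ratio columns.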
gzip
$ time gzip ~/backup/mythconverg-1214-20090403154601.sql
real 0m14.329s
user 0m13.907s
sys 0m0.421s
$ ls -l ~/backup/mythconverg-1214-20090403154601.sql*
-rw-r--r-- 1 me group 56496450 Apr 3 16:34 /home/me/backup/mythconverg-1214-20090403154601.sql.gz
bzip2
$ time bzip2 ~/backup/mythconverg-1214-20090403154601.sql
real 2m1.168s
user 2m0.704s
sys 0m0.445s
$ ls -l ~/backup/mythconverg-1214-20090403154601.sql*
-rw-r--r-- 1 me group 47531259 Apr 3 16:34 /home/me/backup/mythconverg-1214-20090403154601.sql.bz2
xz
$ time xz ~/backup/mythconverg-1214-20090403154601.sql
real 8m59.496s
user 8m58.710s
sys 0m0.654s
$ ls -l ~/backup/mythconverg-1214-20090403154601.sql*
-rw-r--r-- 1 me group 31731304 Apr 3 16:34 /home/me/backup/mythconverg-1214-20090403154601.sql.xz
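To repeat the comparison on your own backup without replacing the original file, something like the following could be used. It is only a sketch: compare_compressors is a hypothetical helper, it measures whole-second wall-clock time with date rather than the real/user/sys breakdown that time gives, and it skips any compressor that is not installed.

```shell
#!/bin/sh
# compare_compressors FILE
# Prints "<tool> <seconds> s <bytes> bytes" for each installed compressor.
compare_compressors() {
    src=$1
    for tool in gzip bzip2 xz; do
        # Skip compressors that are not installed on this system.
        command -v "$tool" >/dev/null 2>&1 || continue
        start=$(date +%s)
        # -c writes the compressed data to stdout, so FILE is left untouched;
        # wc -c counts the compressed bytes without storing them on disk.
        bytes=$(( $("$tool" -c "$src" | wc -c) ))
        end=$(date +%s)
        printf '%s %d s %d bytes\n' "$tool" "$((end - start))" "$bytes"
    done
}
```

It would be called as compare_compressors ~/backup/mythconverg-1214-20090403154601.sql. Because the compressed output is piped into wc instead of written to disk, the timings exclude write I/O and may come out slightly lower than a normal run.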
[*] The 23.6 MiB (24,765,146 bytes) that xz saves over gzip costs $0.0024765146 in drive space at $0.10/GB, or $0.00198121168 at $0.08/GB. Assuming 8 min of additional 100% CPU usage and a difference of 18 W between idle and 100% CPU usage, the compression consumes 2.4 Wh = 0.0024 kWh, which costs $0.00024 at $0.10/kWh. At first glance, the compression seems to be 10 times cheaper than the drive space. However, the energy cost is paid again for every backup, while the drive space is reused as old backups are deleted or rotated, so with frequent backups the drive space turns out to be cheaper over time.
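The footnote's arithmetic can be checked with a few lines of shell. This is a sketch under the footnote's own assumptions (18 W extra draw, 8 minutes, $0.10/kWh, $0.10/GB with 1 GB = 10^9 bytes); the function names are made up, and the break-even figure additionally assumes the extra drive space is a one-off cost per retained backup while the energy is spent on every run.

```shell
#!/bin/sh
# energy_cost WATTS MINUTES PRICE_PER_KWH: cost in dollars of one compression run.
energy_cost() {
    awk -v w="$1" -v m="$2" -v p="$3" 'BEGIN { printf "%.5f\n", w * m / 60 / 1000 * p }'
}

# storage_cost BYTES PRICE_PER_GB: cost in dollars of the extra drive space.
storage_cost() {
    awk -v b="$1" -v p="$2" 'BEGIN { printf "%.5f\n", b / 1e9 * p }'
}

# breakeven BYTES PRICE_PER_GB WATTS MINUTES PRICE_PER_KWH:
# number of runs after which the energy spent exceeds the one-off space cost.
breakeven() {
    awk -v b="$1" -v g="$2" -v w="$3" -v m="$4" -v k="$5" \
        'BEGIN { printf "%.1f\n", (b / 1e9 * g) / (w * m / 60 / 1000 * k) }'
}

SAVED=$((56496450 - 31731304))   # bytes xz saves over gzip

echo "energy per run:  \$$(energy_cost 18 8 0.10)"
echo "space, one-off:  \$$(storage_cost "$SAVED" 0.10)"
echo "break-even runs: $(breakeven "$SAVED" 0.10 18 8 0.10)"
```

Under these assumptions the energy spent on xz overtakes the cost of the extra space after roughly ten backups per retained file, which is the sense in which the drive space is cheaper over time.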