So how good is the new xz compressor?

Recently Roger Mason sent in a patch for T2 adding a new compressor named xz, apparently a successor of lzma.

The Google results were a bit thin on comparisons between xz, lzma, and the well-known bzip2 (and gzip).

So here goes a tiny test: I modified T2 to add support for compressing the binary packages with xz, and got these results from a quick test (all compressors run with -9 for best compression, sizes in bytes):

204800 lzma-4.32.7.tar
75248 lzma-4.32.7.tar.gz
74304 lzma-4.32.7.tar.bz2
62736 lzma-4.32.7.tar.xz
62655 lzma-4.32.7.tar.lzma
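A comparison like this can be approximated without the command-line tools. The following is a minimal sketch, assuming Python's standard gzip, bz2 and lzma modules as stand-ins for the gzip, bzip2, xz and lzma programs; the function name and the sample payload are made up for illustration, and the sizes will not match the tools byte for byte because headers and defaults differ slightly.

```python
import bz2
import gzip
import lzma


def compressed_sizes(data: bytes, level: int = 9) -> dict:
    """Return the size in bytes of `data` under each format at the given level."""
    return {
        "raw": len(data),
        "gz": len(gzip.compress(data, compresslevel=level)),
        "bz2": len(bz2.compress(data, compresslevel=level)),
        # FORMAT_XZ writes the .xz container, FORMAT_ALONE the legacy .lzma one.
        "xz": len(lzma.compress(data, format=lzma.FORMAT_XZ, preset=level)),
        "lzma": len(lzma.compress(data, format=lzma.FORMAT_ALONE, preset=level)),
    }


if __name__ == "__main__":
    # Hypothetical stand-in payload; replace with the bytes of a real tarball.
    sample = b"int main(void) { return 0; } /* filler source text */\n" * 5000
    for name, size in compressed_sizes(sample).items():
        print(f"{size:>10} {name}")
```

Reading a real tarball with open(path, "rb").read() and passing the bytes in gives one line per format, in the same layout as the listing above.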

Bzip2 does not particularly shine, and neither does xz … OK, let us try something bigger that contains more raw text, the apache package:

14090240 apache-2.2.16.tar
3098510 apache-2.2.16.tar.gz user 0m1.040s
2331497 apache-2.2.16.tar.bz2 user 0m2.124s
2069252 apache-2.2.16.tar.xz user 0m6.700s
2037338 apache-2.2.16.tar.lzma user 0m13.089s

Hm, xz still does not come out smallest, but at least it is not as exorbitantly slow as lzma …

One last try for xz to show its potential: let’s use the millions of lines of code that form the current Linux kernel:

412897280 linux-2.6.35.tar
88300782 linux-2.6.35.tar.gz user 0m35.118s
69305709 linux-2.6.35.tar.bz2 user 0m54.431s
57065123 linux-2.6.35.tar.lzma user 7m50.453s
56921708 linux-2.6.35.tar.xz user 5m44.266s

Finally! Some 140kB smaller than lzma and noticeably faster (5m44s versus 7m50s), but still more than 6 times(!!!) slower than bzip2, sigh.
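For anyone who wants to check the arithmetic, the ratios can be worked out directly from the kernel table above. This is only a small sketch: the numbers are copied from the table, while the dictionary and function names are made up for illustration.

```python
# Figures copied from the linux-2.6.35 table: (size in bytes, user time in seconds).
ORIGINAL_SIZE = 412897280
RESULTS = {
    "gz": (88300782, 35.118),
    "bz2": (69305709, 54.431),
    "lzma": (57065123, 7 * 60 + 50.453),
    "xz": (56921708, 5 * 60 + 44.266),
}


def bytes_saved(bigger: str, smaller: str) -> int:
    """How many bytes `smaller` saves compared to `bigger`."""
    return RESULTS[bigger][0] - RESULTS[smaller][0]


def time_ratio(slower: str, faster: str) -> float:
    """How many times longer `slower` ran than `faster` (user time)."""
    return RESULTS[slower][1] / RESULTS[faster][1]


if __name__ == "__main__":
    print("xz saves over lzma:", bytes_saved("lzma", "xz"), "bytes")
    print(f"xz time / bzip2 time: {time_ratio('xz', 'bz2'):.2f}")
    print(f"lzma time / xz time:  {time_ratio('lzma', 'xz'):.2f}")
    for name, (size, _) in RESULTS.items():
        print(f"{name:>4}: {100 * size / ORIGINAL_SIZE:.1f}% of original size")
```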

All run on an Intel(R) Xeon(R) CPU X5365 @ 3.00GHz; apparently even the new tools (lzma, xz) only used one of its cores (by default).

Your mileage may vary.

2 Responses to “So how good is the new xz compressor?”

  1. Koen Vervloesem Says:

    I’m curious about the DEcompression times for these algorithms; can you measure them? I ask because xz generally decompresses faster than bzip2.

    That’s also the reason why Fedora, Arch Linux and Slackware are using xz as default for their packages now. The slow compression doesn’t matter, because it is done by the maintainers, and only when a package is updated. But because decompression of the package is faster than with bzip2 and the compressed packages are smaller (with faster downloads as a result), xz gives a faster package management system from the user’s point of view.

  2. René Says:

    Indeed, for me compression mostly meant the storage space (or bandwidth) used, as well as the time spent compressing. Here are decompression numbers where xz indeed looks better:

    bzip2 -d < linux-2.6.35.tar.bz2 user 0m11.705s
    lzma -d < linux-2.6.35.tar.lzma user 0m5.304s
    xz -d < linux-2.6.35.tar.lzma user 0m5.276s
    gzip -d < linux-2.6.35.tar.gz user 0m2.832s

    (All to > /dev/null)

    So indeed, when decompressing, xz looks better than bzip2 …
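A decompression measurement of this kind can also be sketched in-process. The following assumes Python's standard gzip, bz2 and lzma modules rather than the command-line tools used above, so absolute times will differ; the function names and the sample payload are hypothetical, and time.process_time() is used as a rough equivalent of the "user" figure.

```python
import bz2
import gzip
import lzma
import time


def timed_decompress(decompress, blob: bytes):
    """Run one decompressor and return (CPU seconds, decompressed length)."""
    start = time.process_time()
    out = decompress(blob)
    return time.process_time() - start, len(out)


def decompression_times(data: bytes, level: int = 9) -> dict:
    """Compress `data` with each format, then time only the decompression."""
    blobs = {
        "bzip2": (bz2.decompress, bz2.compress(data, compresslevel=level)),
        "lzma": (lzma.decompress,
                 lzma.compress(data, format=lzma.FORMAT_ALONE, preset=level)),
        "xz": (lzma.decompress,
               lzma.compress(data, format=lzma.FORMAT_XZ, preset=level)),
        "gzip": (gzip.decompress, gzip.compress(data, compresslevel=level)),
    }
    # The output is discarded after measuring, much like writing to /dev/null.
    return {name: timed_decompress(fn, blob) for name, (fn, blob) in blobs.items()}


if __name__ == "__main__":
    # Hypothetical stand-in payload; a real tarball gives more meaningful timings.
    sample = b"static int example(void) { return 42; }\n" * 200000
    for name, (seconds, size) in decompression_times(sample).items():
        print(f"{name:>6}: {seconds:.3f}s CPU for {size} bytes")
```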
