What a flat-out lie. You can't beat bzip2 by anything more than a few percent, except on the occasional obscure crafted file. Prove it, I say. You say "often". I invite you to pick any exe on teh intarweb, compress it with bzip2 and lzma, and post the results to something like fileshack.
Did you even try it yourself first? It's not hard to test. Here's the first random, moderately sized file I thought of to try:
kristian@mars:~/Temp$ ls -l libc-2.5.so*
-rwxr-xr-x 1 kristian kristian 1216808 2007-10-07 21:53 libc-2.5.so
-rw-r--r-- 1 kristian kristian 435125 2007-10-07 21:53 libc-2.5.so.7z
-rwxr-xr-x 1 kristian kristian 536797 2007-10-07 21:53 libc-2.5.so.bz2
That's about 100 KB smaller than bzip2 -9. For a very large file that may be downloaded many times, the absolute savings are even larger:
kristian@mars:~/Temp$ ls -lh syllable-0.6.4.iso*
-rw-r--r-- 1 kristian kristian 106M 2007-10-07 21:58 syllable-0.6.4.iso
-rw-r--r-- 1 kristian kristian 77M 2007-10-07 22:02 syllable-0.6.4.iso.7z
-rw-r--r-- 1 kristian kristian 82M 2007-10-07 21:58 syllable-0.6.4.iso.bz2
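For anyone who prefers relative numbers, the sizes in the two listings above can be turned into percentages. This is only a minimal sketch, assuming awk is installed; the helper name pct_smaller is made up for illustration and is not part of any compression tool.

```shell
#!/bin/bash
# pct_smaller A B: print how much smaller A is than B, as a percentage of B.
# (Hypothetical helper, named here for illustration only.)
pct_smaller() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (1 - a / b) * 100 }'
}

pct_smaller 435125 536797   # libc-2.5.so: .7z vs .bz2, sizes in bytes
pct_smaller 77 82           # syllable ISO: sizes in MB as rounded by ls -lh
```

On those figures the 7z file comes out roughly 19% smaller for libc and roughly 6% smaller for the ISO, so the ISO gap is larger in bytes but smaller in relative terms.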
Edited 2007-10-07 21:04
The data I transfer daily in lzma format is my own customers' point of sale data. I can't very well send you that. So try this instead:
#!/bin/bash
# Archive /etc once, then compress the same tarball with each tool.
cd /etc || exit 1
tar -c -f ~/test.tar .
bzip2 < ~/test.tar > ~/test.tar.bz2
lzma -9 < ~/test.tar > ~/test.tar.lzma
Here are the results on my FC7 system:
-rw-r--r-- 1 root root 114114560 2007-10-07 16:39 test.tar
-rw-r--r-- 1 root root 9486794 2007-10-07 16:40 test.tar.bz2
-rw-r--r-- 1 root root 6953453 2007-10-07 16:44 test.tar.lzma
That's 27% smaller than the bzip2 file, on a test I pulled randomly out of the air.
The *big* news here is that there are actually bzip2 zealots in this world! Who'd have thought?
Would you like a valium?
Edited 2007-10-07 22:02
On the other hand, in my experience LZMA compressors tend to take several times longer than other programs to compress data... (recent test with about a gig of .fits files from my research; gzip 5:48, bzip2 11:49, 7zip 56:51; none really compressed them very much though gzip was worst and 7zip was best) So it's a tradeoff between compression ratio and speed. If you don't care about compression time, LZMA is awesome.
DigitalAxis,
No. These days you have a choice with lzma. The package I use accepts switches -1 through -9, like gzip. At -1, compression is about on par with bzip2, at comparable speed. -7 is the default and the sweet spot, balancing compression speed and memory usage against compression effectiveness.
-9 is slower, yes. But it gets better results.
Gzip is still good for when you need speed. Lzop is great for when you need *blazing* speed with, still, remarkably effective compression. (Well... all things considered.)
Bzip2 is still good for... I'm not sure what. But it's popular. ;-)
lzma has been hampered by implementations with incredibly obtuse user interfaces, unfortunately. I just recently extricated myself from that mess. These days I just use it like gzip.
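The speed-versus-ratio tradeoff argued over in this exchange is easy to measure on your own data. What follows is a rough sketch, not a rigorous benchmark: the function names bench and compare are invented for illustration, timing is only to the nearest second, and it assumes each tool accepts gzip-style -1/-9 and -c switches when reading from stdin (true of gzip and bzip2; check your lzma package).

```shell
#!/bin/bash
# bench FILE LABEL COMMAND [ARGS...]
# Feed FILE to COMMAND on stdin and print "<label> <output bytes> <elapsed seconds>".
# (Hypothetical helper; COMMAND must read stdin and write stdout.)
bench() {
    local file=$1 label=$2 start end size
    shift 2
    start=$(date +%s)
    size=$(( $("$@" < "$file" | wc -c) ))
    end=$(date +%s)
    printf '%s %d %d\n' "$label" "$size" "$((end - start))"
}

# compare FILE: run each installed tool at its fastest and slowest level.
compare() {
    local file=$1 tool level
    for tool in gzip bzip2 lzma; do
        # Skip tools that are not installed on this machine.
        command -v "$tool" >/dev/null 2>&1 || continue
        for level in 1 9; do
            bench "$file" "$tool-$level" "$tool" "-$level" -c
        done
    done
}

# Example: compare ~/test.tar
```

Running compare on a tarball like the one from the /etc test prints one line per tool and level, which makes it easy to see where the extra minutes stop buying smaller files.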
Edited 2007-10-07 22:41







Kudos to them for recognizing the value of lzma compression. I often see lzma produce 30-40% smaller file sizes than bzip2.