LZ77 Very Slow

Discussion about everything. New games, 3d math, development tips...

LZ77 Very Slow

Postby Alpha Omega » Sun Oct 16, 2011 1:15 pm

I made an implementation of the LZ77 algorithm to learn the ins and outs of file compression. I compared the compression ratios to that of ZIP and for files with heavy repetition they were not as good but not terrible. However the code is soooo slow. It works quick for small files (of course) but large files >=5 MB can take minutes!!! I am using a 4096 byte moving window varient for the sliding window. I know the searching is costly for possible searching 4096 times per byte in the file, my question is how do ZIP and other compression programs execute so damn fast!? How do you profile your code for speed? I am trying to find the bottle neck in the code but am having trouble :(.
User avatar
Alpha Omega
 
Posts: 284
Joined: Wed Oct 29, 2008 12:07 pm

Re: LZ77 Very Slow

Postby CuteAlien » Sun Oct 16, 2011 7:07 pm

Use a profiler. On Linux there is gprof and on Windows a free profiler is "very sleepy" (really, that's it's name ^_^). Or add timer-code into your functions (for example like this: viewtopic.php?t=13069&highlight=).
IRC: #irrlicht on irc.freenode.net
My patches&stuff: http://www.michaelzeilfelder.de/irrlicht.htm
Games with Irrlicht: http://www.irrgheist.com/
News: http://www.reddit.com/r/irrlicht/
User avatar
CuteAlien
Admin
 
Posts: 5349
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany

Re: LZ77 Very Slow

Postby Alpha Omega » Thu Oct 20, 2011 1:18 am

Thanks, that program is very good. I replaced the brute force search method I was using with a 256 x 256 index table. I am now compressing 50 MB @ 26.66 seconds and decompressing @ 10.00 seconds. Right now I am getting > 50% compression on most text files. I am going to implement a huffman code algorithm before the LZ77 (I think 7zip does this?) to hopefully achieve a good final compression.

I know Irrlicht uses zlib for compression right? For games and file compression, I know the decompression speed is most desired so you can decompress on the fly and put the data right into the game, right? I am hoping to get the final version to about 10 seconds compression and 5 seconds decompression by the time I am finished. I just don't know if the compression % will be as good, we will see.
User avatar
Alpha Omega
 
Posts: 284
Joined: Wed Oct 29, 2008 12:07 pm

Re: LZ77 Very Slow

Postby mongoose7 » Thu Oct 20, 2011 1:46 am

Huffman is not good. Its only distinction is that, given a particular example, it will compress it better than any other method, but it does not then compress other data well. That is, you can set it up to compress just one piece of data.
mongoose7
 
Posts: 514
Joined: Wed Apr 06, 2011 12:13 pm

Re: LZ77 Very Slow

Postby Alpha Omega » Thu Oct 20, 2011 2:14 am

Huffman will only not work if the file contains all 256 different byte combinations that are of equal probability. So prolly not good with image files but text files/log files can achieved great compression. Also using huffman algorithm will not eliminate the redundancy in the bit stream, as I see it replacing an 8 bit combination with a 3 bit combination will still result in the same encoded pattern giving further compression and faster LZ77 execution.

But the best compression is with arithmetic coding models, however the speed is terrible. This algorithm will be for very quick execution to possibly encode information through transmissions.

I have a problem with Huffman algorithm when it comes to transmitting the table used during compression. So maybe I will choose 2-3 predetermined look up tables to do the encoding, prolly more inefficient but will keep the speed cost low.
User avatar
Alpha Omega
 
Posts: 284
Joined: Wed Oct 29, 2008 12:07 pm

Re: LZ77 Very Slow

Postby mongoose7 » Thu Oct 20, 2011 9:34 am

Alpha Omega wrote:Huffman will only not work if the file contains all 256 different byte combinations that are of equal probability. So prolly not good with image files but text files/log files can achieved great compression. Also using huffman algorithm will not eliminate the redundancy in the bit stream, as I see it replacing an 8 bit combination with a 3 bit combination will still result in the same encoded pattern giving further compression and faster LZ77 execution.

Well, I was going to walk on by, and maybe I should, but:

2^3 is 8. So you are compressing files with only 8 distinct characters in them?
mongoose7
 
Posts: 514
Joined: Wed Apr 06, 2011 12:13 pm

Re: LZ77 Very Slow

Postby Alpha Omega » Thu Oct 20, 2011 10:19 pm

No I was just using it as an example. Obviously you have use to use more bits for less probable characters. Anyway your right, huffman code is not very efficient compared to arithmetic coding, and I think I can make an implementation that is pretty quick, combining a LZW style dictionary into the arithmetic coding algorithm.
User avatar
Alpha Omega
 
Posts: 284
Joined: Wed Oct 29, 2008 12:07 pm

Re: LZ77 Very Slow

Postby hybrid » Fri Oct 21, 2011 11:04 am

Maybe try bzip or lzma
hybrid
Admin
 
Posts: 13942
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany

Re: LZ77 Very Slow

Postby REDDemon » Tue Nov 22, 2011 1:06 am

The fastest should be gzip. lzma has very good compression but is slower (not to much). Every compression algorithm has pros and contros (for games i think that zlib and gzip are very good.. other formats are good for server/clients wich needs to minimize bandwith when downloading). I never tried to use directly libraries like zlib or libpng. I spent some time on data crypt/decrypt streams. but not on data compression.
OpenGL is not hard. What you have to do is just explained in specifications. What is hard is dealing with poor OpenGL implementations.
User avatar
REDDemon
 
Posts: 831
Joined: Tue Aug 31, 2010 8:06 pm
Location: Genova (Italy)

Re: LZ77 Very Slow

Postby Virion » Tue Nov 22, 2011 3:49 am

idsoft used zlib (I think) for doom3 and the compression is pretty good.
The assets for Doom 3 are 3.6GB uncompressed, and only 1.44GB after being packed up.

http://www.iddevnet.com/doom3/packfiles.php
User avatar
Virion
 
Posts: 2102
Joined: Mon Dec 18, 2006 5:04 am
Location: Malaysia

Re: LZ77 Very Slow

Postby CuteAlien » Fri Dec 02, 2011 9:44 am

Jake03 a bot or not - that is the question. Typical bot pattern (the kind that copy-pastes texts which sound similar from elsewhere and adds spam a few days later), but could also be uninformed coder who asks for code-help without showing code...
IRC: #irrlicht on irc.freenode.net
My patches&stuff: http://www.michaelzeilfelder.de/irrlicht.htm
Games with Irrlicht: http://www.irrgheist.com/
News: http://www.reddit.com/r/irrlicht/
User avatar
CuteAlien
Admin
 
Posts: 5349
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany

Re: LZ77 Very Slow

Postby hybrid » Fri Dec 02, 2011 12:27 pm

Yep, actually the mentioned code is missing. But assuming this was just a copy-paste error, this post might make sense. But well, looking at the two other posts, it's either bot or troll, so maybe ready for deletion anyway...
hybrid
Admin
 
Posts: 13942
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany

Re: LZ77 Very Slow

Postby CuteAlien » Tue Dec 06, 2011 10:17 am

Ok, Jake03 turned out to be a bot (he added spam now). Grr.. really, really hate that combination of delayed spam + bots copying text that sounds similar to thread-text from somewhere.
IRC: #irrlicht on irc.freenode.net
My patches&stuff: http://www.michaelzeilfelder.de/irrlicht.htm
Games with Irrlicht: http://www.irrgheist.com/
News: http://www.reddit.com/r/irrlicht/
User avatar
CuteAlien
Admin
 
Posts: 5349
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany

Re: LZ77 Very Slow

Postby REDDemon » Tue Dec 06, 2011 8:33 pm

What about adding extra antibot questions for users wich have less than X messages? So before posting any message on the forum if you have less than 30 messages a question will be asked.
OpenGL is not hard. What you have to do is just explained in specifications. What is hard is dealing with poor OpenGL implementations.
User avatar
REDDemon
 
Posts: 831
Joined: Tue Aug 31, 2010 8:06 pm
Location: Genova (Italy)

Re: LZ77 Very Slow

Postby CuteAlien » Tue Dec 06, 2011 9:11 pm

We're not even sure if that's bots anymore - or just people in some poor country which are payed for spamming. But the idea is sound - I'll ask Yoran if something like that could be done.
IRC: #irrlicht on irc.freenode.net
My patches&stuff: http://www.michaelzeilfelder.de/irrlicht.htm
Games with Irrlicht: http://www.irrgheist.com/
News: http://www.reddit.com/r/irrlicht/
User avatar
CuteAlien
Admin
 
Posts: 5349
Joined: Mon Mar 06, 2006 2:25 pm
Location: Tübingen, Germany

Next

Return to Off-topic

Who is online

Users browsing this forum: No registered users and 0 guests