6/07/2011

06-07-11 - How to read an LZ compressed file

An example of the kind of shite I'm doing in Oodle these days.

You have an LZ-compressed file on disk that you want to get decompressed into memory as fast as possible. How do you do this ?

Well, first of all, you make your compressor write in independent chunks so that the decompressor can run on multiple chunks at the same time with threads. But to start you need to know where the chunks are in the file, so the first step is :


1. Fire an async read of the first 64k of the file to get the header.

the header will tell you where all the independent chunks are. (aside : in final Oodle there may also be an option to aglomerate all the headers of all the files, so you may already have this first 64k in memory).

So after that async read is finished, you want to fire a bunch of decomps on the chunks, so the way to do this is :


2. Make a "Worklet" (async function callback) which parses the header ; set the Worklet to run when the IO op #1
finishes.

I used to do this by having the WorkMgr get a signal from IO thread (which still happens) but I now also have a mechanism to just run Worklets directly on the IO thread, which is preferrable for Worklets that are super trivial like this one.

Now, if the file is small you could just have your Worklet #2 read the rest of the file and then fire async works on each one, but if the file is large that means you are waiting a long time for the IO before you start any decomp work, so that's not ideal, instead what we do is :


3. In Worklet #2, after parsing header, fire an async IO for each independent compressed chunk.  For each chunk, create
a decompression Worklet which is dependent on the IO of that chunk (and also neighbors, since due to IO
sector alignment the compression boundaries and IO boundaries are not quite the same).

So what this will do is start a bunch of IO's that then retire one by one, as each one retires it starts up the decomp task for that chunk. This means you start decompressing almost immediately and for large files you keep the CPU and IO busy the whole time.

Finally the main thread needs a way to wait for this all to be done. But the handles to the actual decompression async tasks don't exist until async task #2 runs, so the main thread can't wait on them directly. Instead :


4. At the time of initial firing (#1), create an abstract waitable handle and set it to "pending" state; then
pass this handle through your async chain.  Task #2 should set it to needing "N to go", since it's the first
point that knows the count, and then the actual async decompresses in #3 should decrement that counter.  So
the main thread can wait on it being "0 to go".

You can think of this as a sempahore, though in practice I don't use a semaphore because there are some OS's where that's not possible (sadly).

What the client sees is just :


AsyncHandle h = OodleLZ_StartDecompress( fileName );

Async_IsPending(h); ?

Async_Block(h);

void * OodleLZ_GetFinishedDecompress( h );

if they just want to wait on the whole thing being done. But if you're going to parse the decompressed file, it's more efficient to only wait on the first chunk being decompressed, then parse that chunk, then wait on the next chunk, etc. So you need an alternate API that hands back a bunch of handles, and then a streaming File API that does the waiting for you.

No comments:

old rants