LEGO LOCO Decompression Tool


Greets all,


So it's been a while since my initial research on the "mysterious bitmaps" from LEGO LOCO, and after a short burst of research and coding I have finally figured out the format. First and foremost, the format is not bitmap-specific at all: it is a compression format used by every file that appears to be 'unreadable'. The game appears to load a file as follows, which is how it tells compressed and uncompressed files apart:

  • Find requested file
  • If it doesn't exist, return 0
  • If it does exist, request a file handle and copy it to RAM
  • Check the value at 0x5. If it's 0x1, the file is compressed. Otherwise, return the RAM offset of the already copied file
  • Load the 32-bit word at offset 0x0 and malloc that amount of space
  • Pass compressed file and malloc location to decompression function
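
As a rough sketch of the loading steps above (not the game's actual code), the check could look something like this in Java. The class name `LocoLoader` and the `Decoder` interface are hypothetical, and the little-endian byte order is an assumption based on the game being a Windows/x86 program.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Path;

final class LocoLoader {
    /** Hypothetical hook standing in for the game's decompression function. */
    interface Decoder {
        void decode(byte[] compressed, byte[] out);
    }

    /** True when the value at offset 0x5 is 0x1, the marker described above. */
    static boolean isCompressed(byte[] file) {
        return file.length >= 8 && file[5] == 0x01;
    }

    /** Reads the 32-bit word at offset 0x0 (assumed little-endian) as an unsigned value. */
    static long outputSize(byte[] file) {
        return ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN).getInt(0) & 0xFFFFFFFFL;
    }

    /** Mirrors the listed steps: find, copy to RAM, check marker, allocate, decode. */
    static byte[] load(Path path, Decoder decoder) throws IOException {
        if (!Files.exists(path)) {
            return null;                        // the game returns 0 here
        }
        byte[] file = Files.readAllBytes(path); // "copy it to RAM"
        if (!isCompressed(file)) {
            return file;                        // already usable as-is
        }
        // Java arrays are int-indexed, so sizes above 2^31 - 1 throw here.
        byte[] out = new byte[Math.toIntExact(outputSize(file))];
        decoder.decode(file, out);
        return out;
    }
}
```

Passing the decoder in as a parameter keeps the detection logic separate from the decompression itself.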

Now for the good stuff: the compression format and the entire file format demystified. The file consists of an 8-byte header followed by the actual file body, which can be as large as a 32-bit value allows. While the file is being decompressed, it is considered to actually start at byte 0x8.


The file is divided into two parts: the table and the data. The table is nominally only 0x400 bytes long, starting at offset 0x400 (0x408 counting the header), but it can be larger if needed. The data is as long as it needs to be. Together they use a Huffman compression method, with the table being the Huffman tree and the data being the bit stream.


The word at 0x4 in the header determines the seed position in the Huffman tree. The seed position is multiplied by 4 to get the actual offset of the seed; this offset does not include the header. Once the seed is loaded, each 32-bit word is read from the data and consumed one bit at a time by dividing it by 2. If the rightmost (least significant) bit was 1 before the division, the next tree offset is shifted by 0x2. The next offset is therefore (((seed * 2) + flag) * 2), where flag is that bit. A 16-bit value is then loaded from that offset and used as the next seed. If this value is less than 0x100, it is interpreted as a byte: the byte is written to the output and the seed is reset to the bottom of the tree. This process repeats until all bytes have been written.
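

To make the walk through the tree a bit more concrete, here is a minimal Java sketch of the loop as described in this post. It is not the code from the tool itself. Several things are assumptions: the class name `LocoDecodeSketch`, little-endian byte order, that the word at 0x4 is 16 bits wide, that the bit stream carries on within a word after a byte has been written, and that the data is padded to whole 32-bit words. The post does not say where the data begins, so `dataOffset` (counted from byte 0x8, like the tree offsets) is left to the caller.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class LocoDecodeSketch {
    static final int HEADER_SIZE = 8;

    /**
     * @param file       the whole compressed file, header included
     * @param dataOffset start of the bit stream, not counting the header
     *                   (hypothetical parameter; the post does not give this value)
     */
    static byte[] decode(byte[] file, int dataOffset) {
        ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        int outSize = buf.getInt(0);          // word at 0x0: size to allocate
        int seed = buf.getShort(4) & 0xFFFF;  // word at 0x4: starting tree position
        byte[] out = new byte[outSize];       // sketch only: sizes above 2^31 - 1 fail here

        int written = 0;
        int node = seed;
        int pos = HEADER_SIZE + dataOffset;
        while (written < outSize) {
            int word = buf.getInt(pos);       // next 32 bits of the stream
            pos += 4;
            for (int bit = 0; bit < 32 && written < outSize; bit++) {
                int flag = word & 1;          // rightmost bit before the division
                word >>>= 1;                  // the "divide by 2" step
                int offset = ((node * 2) + flag) * 2;
                int next = buf.getShort(HEADER_SIZE + offset) & 0xFFFF;
                if (next < 0x100) {
                    out[written++] = (byte) next; // leaf: emit the byte
                    node = seed;                  // and start over from the seed
                } else {
                    node = next;                  // inner node: keep walking
                }
            }
        }
        return out;
    }
}
```

Each tree entry is a pair of 16-bit values, which is why the seed is multiplied by 4 and the flag picks the first or second half of the pair.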


If that was a bit difficult to grasp, the source code of the decompression tool may make it easier to follow.


With that said and done, on to the decompression tool. It's written in Java and is run from the command line using java -jar locodecomp.jar <infile> <outfile>. You can download the precompiled version here or view the source code on my GitHub here. The source code is fairly straightforward, so if anyone wants to port it to another language, feel free to do so. As a note, I included a small bitmap fixer which can patch in a header for certain bitmaps. It takes a blank bitmap header and stitches certain pieces of the raw bitmap together to form a readable bitmap. Since the raw bitmaps are a bit inconsistent in how they define width and height, it calculates them using the associated .dat file. The width and height can also be adjusted in a hex editor. Because it relies on .dat files for width and height, it usually only works with buildings, as they are the only bitmaps whose size is defined inside the .dat file. For sounds I'm not sure what header information is missing, but I do know that most of the sounds can be played in Audacity by importing them as raw data and setting the format to unsigned 8-bit PCM at a sample rate of 22050 Hz.
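

For the sounds, one way to skip the manual Audacity import would be to put a standard WAV header in front of the raw samples. The Java sketch below does that under some assumptions: the class name `RawPcmToWav` is hypothetical, mono is a guess (the post only gives unsigned 8-bit PCM at 22050 Hz), and it treats the entire input as sample data. It is not part of locodecomp and says nothing about what header the game itself expects.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class RawPcmToWav {
    /** Wraps unsigned 8-bit PCM samples in a canonical 44-byte RIFF/WAVE header. */
    static byte[] wrap(byte[] pcm, int sampleRate, int channels) {
        int pad = pcm.length & 1;                       // RIFF chunks are padded to an even length
        ByteBuffer b = ByteBuffer.allocate(44 + pcm.length + pad)
                                 .order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        b.putInt(36 + pcm.length + pad);                // bytes remaining after this field
        b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        b.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        b.putInt(16);                                   // size of the PCM fmt chunk
        b.putShort((short) 1);                          // format tag 1 = uncompressed PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(sampleRate * channels);                // byte rate at 1 byte per sample
        b.putShort((short) channels);                   // block align at 1 byte per sample
        b.putShort((short) 8);                          // bits per sample
        b.put("data".getBytes(StandardCharsets.US_ASCII));
        b.putInt(pcm.length);
        b.put(pcm);                                     // any pad byte is left as zero
        return b.array();
    }
}
```

Calling `RawPcmToWav.wrap(samples, 22050, 1)` and writing the result to a .wav file should give something most players can open, provided the raw file really is plain sample data.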


Questions, comments, concerns? Feel free to let me know. I'm pretty excited about this myself, since if we can convert all the bitmaps to a normal format, we could potentially open up more expansive modding between the bitmaps and the .dat files.



Tauka Usanake

We're only halfway there now. Time to make a compression tool so we can change the images LOCO uses.


So what are we limited to right now? I don't want to assume we can add new objects right away, even though it should be easier than figuring out how this worked. I think we have a general idea from a resource read from the loco.exe, but I could be wrong.
