Bionicle: The Game Game Archive Extracting (Research and Tools)


JrMasterModelBuilder

These are more like research tools than actual modding tools, but here we go:

BIONICLE: The Game stores all of its assets in an archive format with the header "VOLT". Within those archives are some additional archives I've been tinkering with. The ones I've had some luck with have the header "BIGB" followed by some command-line build arguments. Thanks to Google, I found a program made to decompress and extract the contents of nearly identical files here: http://wiki.gbatemp.net/wiki/The_Conduit That program targets a slightly different version of the archive format, so I rewrote it in Python below. So, without further ado, my partial file format specs and Python scripts.

Main archives:


#VOLT format

0-3 - VOLT - header

4-7 - UNKNOWN - always 02 00 00 00

8-11 - NUMBER OF FILES - decimal number of files

12-15 - FILE LIST LENGTH

16-? - 12 bytes for each file. Most likely three 4-byte blocks.

    Block1 = UNKNOWN

    Block2 = 01 00 00 00 always AFAIK.

    Block3 = Offset of file info in file list - 0 first time.


#And for each of those files referenced:

8 bytes = File offset.

8 bytes = File size.

? bytes = File name (null terminated).
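To make the layout above concrete, here is a minimal sketch of reading the VOLT header and its 12-byte index entries with Python's struct module. The field names are my own, and the meaning of the "unknown" dwords is still a guess.

```python
import struct

def parse_volt_index(data):
    """Parse the VOLT header and the 12-byte index entries (a sketch;
    field names are guesses based on the partial spec above)."""
    if data[0:4] != b"VOLT":
        raise ValueError("not a VOLT archive")
    # 4-7 unknown (always 02 00 00 00), 8-11 file count, 12-15 file list length
    unknown, total_files, list_length = struct.unpack_from("<III", data, 4)
    entries = []
    offset = 16
    for _ in range(total_files):
        # Three 4-byte blocks: unknown, always 1, offset into the file info list.
        unk, one, info_offset = struct.unpack_from("<III", data, offset)
        entries.append({"unknown": unk, "one": one, "info_offset": info_offset})
        offset += 12
    return total_files, list_length, entries
```

From each entry's info_offset you would then read the 8-byte file offset, 8-byte size, and null-terminated name described above.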

Most of the sub-archives:

#BIGB format.

0-3	 - ASCII  = File Header (BIGB)

4-7	 - UINT32 = Offset of data - 16 (probably offset after header)

8-11    - UINT32 = Offset of data after name and before creation arguments - 12 (the amount to skip forward)

12-15   - UINT32 = Version Number? Always 0x01000000

16-79   - ASCII  = Null terminated name of some kind.

80-127  - ASCII  = Null terminated text of some kind. (normal)

128-131 - UINT32 = Segment 1 Uncompressed Size? 0 means file contains no data.

132-135 - UINT32 = Segment 2 Uncompressed Size? 0 means file contains no data.

136-139 - UINT32 = Segment 1 Compressed Size? 0 means file contains no data.

140-143 - UINT32 = Segment 2 Compressed Size? 0 means file contains no data.

144-399 - ASCII  = Null terminated build command-line arguments.
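Read as a flat struct, the fixed part of that header can be sketched like this in Python. The segment fields follow how the extractor script below uses them (decompressed sizes at 128 and 132, compressed sizes at 136 and 140); nothing here is confirmed, and the field names are my own.

```python
import struct

def parse_bigb_header(data):
    """Sketch of the fixed 400-byte BIGB header described above."""
    if data[0:4] != b"BIGB":
        raise ValueError("not a BIGB file")
    data_offset = struct.unpack_from("<I", data, 4)[0] + 16
    args_offset = struct.unpack_from("<I", data, 8)[0] + 12
    version = struct.unpack_from("<I", data, 12)[0]
    name = data[16:80].split(b"\x00", 1)[0].decode("ascii", "replace")
    text = data[80:128].split(b"\x00", 1)[0].decode("ascii", "replace")
    # Per the extractor: decompressed sizes first, then compressed sizes.
    seg1_decomp, seg2_decomp, seg1_comp, seg2_comp = struct.unpack_from("<IIII", data, 128)
    args = data[144:400].split(b"\x00", 1)[0].decode("ascii", "replace")
    return {
        "data_offset": data_offset,
        "args_offset": args_offset,
        "version": version,
        "name": name,
        "text": text,
        "segments": [(seg1_decomp, seg1_comp), (seg2_decomp, seg2_comp)],
        "args": args,
    }
```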

(NOTE: The data contained within is RLE-compressed; my Python scripts below decompress these blocks during extraction.)

VOLTExtractor.py:

"""
    VOLT Extractor - VOLT archive extractor.

    Copyright (C) 2012 JrMasterModelBuilder

    You accept full responsibility for how you use this program.

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
"""

import os
import sys

def extract(path):
    def uint32(offset):
        if bytesAre == "str":
            return ord(fileData[offset]) + (ord(fileData[offset+1]) * 256) + (ord(fileData[offset+2]) * 65536) + (ord(fileData[offset+3]) * 16777216)
        else:
            return fileData[offset] + (fileData[offset+1] * 256) + (fileData[offset+2] * 65536) + (fileData[offset+3] * 16777216)

    #Check if this version of Python treats bytes as int or str.
    bytesAre = type(b'a'[0]).__name__

    print("PROCESSING: " + path)
    #Open the file if valid.
    try:
        with open(path, "rb") as f:
            fileData = f.read()
    except IOError:
        print("\tERROR: Failed to read file.")
        return False

    if len(fileData) < 4 or fileData[0:4] != b"VOLT":
        print("\tERROR: Not a VOLT file.")
        return False

    print("\tEXTRACTING: Please wait.")

    fileList = []
    fileOffset = 0
    #Skip over the header.
    fileOffset += 8

    #Read in info on the file.
    totalFiles = uint32(fileOffset)
    fileOffset += 4
    indexSize = uint32(fileOffset)
    fileOffset += 4

    #Read the initial index to find index entry offsets in the next index.
    for i in range(totalFiles):
        #Add file to the list.
        fileList.append({"filename":"", "offset":0, "size":0, "indexoffset":uint32(fileOffset+8), "unknown":uint32(fileOffset)})
        fileOffset += 12

    #Remember where the index is.
    indexOffset = fileOffset
    for i,v in enumerate(fileList):
        #Set the file offset to the index start plus the offset in the index.
        fileOffset = indexOffset + v["indexoffset"]
        #Set the file offset.
        fileList[i]["offset"] = uint32(fileOffset)
        fileOffset += 8 #Skip mystery null block as well.
        #Set the file size.
        fileList[i]["size"] = uint32(fileOffset)
        fileOffset += 8 #Skip mystery null block as well.
        #Read in the name until the null byte.
        if bytesAre == "str":
            while fileData[fileOffset] != b"\x00":
                fileList[i]["filename"] += fileData[fileOffset]
                fileOffset += 1
        else:
            while fileData[fileOffset] != 0x00:
                fileList[i]["filename"] += chr(fileData[fileOffset])
                fileOffset += 1
        #And skip the null byte for the next entry.
        fileOffset += 1

    #Create output path from input path.
    outFolder = path.split(os.sep)
    endFolder = outFolder[-1].split(".")
    extension = endFolder[-1]
    endFolder = ".".join(endFolder[0:-1]) + "_" + extension
    outFolder[-1] = endFolder
    outFolder = os.sep.join(outFolder)

    #Create the first non-existent folder to extract to.
    if os.path.exists(outFolder):
        i = 1
        while os.path.exists(outFolder + "_" + str(i)):
            i += 1
        outFolder = outFolder + "_" + str(i)

    #Make the output folder.
    os.makedirs(outFolder)

    #Create a log file to log extracted files.
    with open(outFolder + os.sep + "__extract.log", "w") as log:
        #Write the header to the log file.
        log.write(path.split(os.sep).pop() + "\ttotalfiles: " + str(totalFiles) + "\tindexsize: " + str(indexSize) + "\r\nfilename\toffset\tsize\tindexoffset\tunknown\r\n")
        #Loop through the list of files, saving them.
        for a in fileList:
            #Write the file.
            with open(outFolder + os.sep + a["filename"], "wb") as f:
                #Write data to the log.
                log.write(a["filename"] + "\t" + str(a["offset"]) + "\t" + str(a["size"]) + "\t" + str(a["indexoffset"]) + "\t" + str(a["unknown"]) + "\r\n")
                f.write(fileData[a["offset"]:a["offset"]+a["size"]])

    print("\tCOMPLETE: " + str(len(fileList)) + " files extracted.")
    return True

#Detect if executable or not.
fileName = sys.argv[0].split(os.sep).pop()
if fileName[-3:] == ".py" or fileName[-4:] == ".pyw":
    runCommand = "python " + fileName
else:
    runCommand = fileName

if len(sys.argv) > 1:
    for i in range(1, len(sys.argv)):
        extract(sys.argv[i])
else:
    print("VOLT Extractor 1.0\n\nThis program will extract VOLT archives to an adjacent folder.\n\nCOPYRIGHT:\n\t(C) 2012 JrMasterModelBuilder\n\nLICENSE:\n\tGNU GPLv3\n\tYou accept full responsibility for how you use this program.\n\nUSAGE:\n\t" + runCommand + " <LIST_OF_FILE_PATHS>")

BIGBExtractor.py:

"""

    BIGB Extractor - BIGB archive extractor.


    Copyright (C) 2012 JrMasterModelBuilder


    You accept full responsibility for how you use this program.


    This program is free software: you can redistribute it and/or modify

    it under the terms of the GNU General Public License as published by

    the Free Software Foundation, either version 3 of the License, or

    (at your option) any later version.


    This program is distributed in the hope that it will be useful,

    but WITHOUT ANY WARRANTY; without even the implied warranty of

    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

    GNU General Public License for more details.


    You should have received a copy of the GNU General Public License

    along with this program.  If not, see <http://www.gnu.org/licenses/>.

"""



import os

import sys


def extract(path):

    def uint32(offset):

	    if bytesAre == "str":

		    return ord(fileData[offset]) + (ord(fileData[offset+1]) * 256) + (ord(fileData[offset+2]) * 65536) + (ord(fileData[offset+3]) * 16777216)

	    else:

		    return fileData[offset] + (fileData[offset+1] * 256) + (fileData[offset+2] * 65536) + (fileData[offset+3] * 16777216)


    def decompress(offsetStart, length, decompLength):

	    outputData = []

	    offset = offsetStart

	    endBlock = offset + length

	    outputOffset = 0

	    while True:

		    commandByte = fileData[offset]

		    if bytesAre == "str":

			    commandByte = ord(commandByte)

#		    print(offset, commandByte)

		    offset += 1


		    if commandByte & 0x80:

			    #Back reference.

#			    print("Back reference.")

			    byte2 = fileData[offset]

			    if bytesAre == "str":

				    byte2 = ord(byte2)

			    offset += 1

			    #Make up for taking 2 bytes.

			    dupeBytes = (byte2 & 0xF) + 2

			    #Bytes to go back.

			    srcRel = commandByte

			    srcRel = (srcRel << 4 | byte2 >> 4)

			    srcRel = (srcRel ^ 0xFFF) + 2

			    srcOffset = outputOffset - srcRel


			    if srcOffset < 0:

				    print("\tERROR: Back reference before start of file.")

				    return outputData


			    #Append the back reference to the output.

			    outputData.extend(outputData[srcOffset:srcOffset+dupeBytes])

			    #Skip forward.

			    outputOffset += dupeBytes


		    elif commandByte & 0x40:

			    #RLE.

#			    print("RLE.")

			    #How many bytes, no point if less than 2.

			    byteCount = (commandByte & 0x3F) + 2

			    repeatedByte = fileData[offset]

			    offset += 1

			    #Append the byte to the output however many times.

			    for _ in range(byteCount):

				    outputData.append(repeatedByte)

			    #Skip forward.

			    outputOffset += byteCount


		    else:

			    #Literal.

#			    print("Literal.")

			    #How many bytes, at least 1.

			    byteCount = (commandByte & 0x3F) + 1

			    #Append the bytes to the output.

			    outputData.extend(fileData[offset:offset+byteCount])

			    #Skip forward.

			    offset += byteCount

			    outputOffset += byteCount


		    if offset >= endBlock or len(outputData) >= decompLength:

			    if offset > endBlock:

				    print("\tERROR: Decompressing ran over.")

			    if len(outputData) > decompLength:

				    print("\tERROR: Too many bytes decompressed.")

			    return outputData


    #Check if this version of Python treats bytes as int or str

    bytesAre = type(b'a'[0]).__name__


    print("PROCESSING: " + path)

    #Open the file if valid.

    try:

	    with open(path, "rb") as f:

		    fileData = f.read()

    except IOError:

	    print("\tERROR: Failed to read file.")

	    return False


    if len(fileData) < 4 or fileData[0:4] != b"BIGB":

	    print("\tERROR: Not a BIGB file.")

	    return False


    print("EXTRACTING: Please wait.")


    fileOffset = 0


    #Skip over the header.

    fileOffset += 4


    #Get the offset of the data including the 16 byte header.

    dataOffset = uint32(fileOffset) + 16

    fileOffset += 4


    #Get the dictionary offset jump including the 12 bytes of header already read (ignore 0x01000000 version number?).

    fileOffset = uint32(fileOffset) + 12


    #Get data about the 2 compressed blocks.

    fileBlocks = [

	    {

		    "decomp":uint32(fileOffset),

		    "comp":uint32(fileOffset+8)

	    }

	    ,

	    {

		    "decomp":uint32(fileOffset+4),

		    "comp":uint32(fileOffset+12)

	    }

    ]

    fileOffset += 16

#    print(fileBlocks)


    buildArguments = ""

    #Read in the arguments until the null byte.

    if bytesAre == "str":

	    while fileData[fileOffset] != b"\x00":

		    buildArguments += fileData[fileOffset]

		    fileOffset += 1

    else:

	    while fileData[fileOffset] != 0x00:

		    buildArguments += chr(fileData[fileOffset])

		    fileOffset += 1


    #Skip to the data.

    fileOffset = dataOffset


    #Create output path from input path.

    outFolder = path.split(os.sep)

    endFolder = outFolder[-1].split(".")

    extension = endFolder[-1]

    endFolder = ".".join(endFolder[0:-1]) + "_" + extension

    outFolder[-1] = endFolder

    outFolder = os.sep.join(outFolder)


    #Create the first non-existent folder to extract to.

    if os.path.exists(outFolder):

	    i = 1

	    while(os.path.exists(outFolder + "_" + str(i))):

		    i += 1		    

	    outFolder = outFolder + "_" + str(i)


    #Make the output folder.

    os.makedirs(outFolder)


    #Create a log file to log extracted files.

    with open((outFolder + os.sep + "__extract.log"), "w") as log:

	    #Write the header to the log file.

	    p = "BUILD ARGUMENTS:\r\n\t" + buildArguments + "\r\n\r\n"

	    print(p)

	    log.write(p)


	    #Loop through the list of files, saving them.

	    for i,a in enumerate(fileBlocks):

		    #Get the data from each file in local variables for speed.

		    comp = a["comp"]

		    decomp = a["decomp"]


		    p = "BLOCK " + str(i) + ":\r\n\tOFFSET: " + str(fileOffset) + "\r\n\tCOMPRESSED: " + str(comp) + "\r\n\tDECOMPRESSED: " + str(decomp) + "\r\n"

		    if comp == 0:

			    p += "\tNOTE: Block does not exist.\r\n\r\n"

			    print(p)

			    log.write(p)

			    continue


		    #Write the file.

		    with open(outFolder + os.sep + "BLOCK_" + str(i) + ".bin", "wb") as f:

			    #Check if compressed.

			    if comp < decomp:

				    data = decompress(fileOffset, comp, decomp)

			    elif comp == decomp:

				    data = fileData[fileOffset:fileOffset+comp]

			    else:
				    p += "\tERROR: Decompressed size less than compressed size.\r\n"
				    #Treat as empty so the size check below still runs.
				    data = []

			    leng = len(data)

			    p += "\tOUTPUTSIZE: " + str(leng)

			    if leng == decomp:

				    p += " (EQUAL)"

			    if leng < decomp:

				    p += " (LESS)"

			    if leng > decomp:

				    p += " (MORE)"

			    p += "\r\n\r\n"

			    #If bytes are strings, compensate.

			    if bytesAre == "str":

				    #If a list, convert to byte array for writing.

				    if type(data).__name__ == "list":

					    data = bytearray(data)

				    f.write(data)

			    else:

				    f.write(bytes(data))


		    #Log data on each file.

		    print(p)

		    log.write(p)


		    #Increment the offset for the next block.

		    fileOffset += comp


    print("COMPLETE: Files extracted.\r\n\r\n")

    return True



#Detect if executable or not.

fileName = sys.argv[0].split(os.sep).pop()

if fileName[-3:] == ".py" or fileName[-4:] == ".pyw":

    runCommand = "python " + fileName

else:

    runCommand = fileName


if len(sys.argv) > 1:

    for i in range(1, len(sys.argv)):

	    extract(sys.argv[i])

else:

    print("BIGB Extractor 1.0\n\nThis program will extract BIGB archives to an adjacent folder.\n\nCOPYRIGHT:\n\t(C) 2012 JrMasterModelBuilder\n\nLICENSE:\n\tGNU GPLv3\n\tYou accept full responsibility for how you use this program.\n\nUSAGE:\n\t" + runCommand + " <LIST_OF_FILE_PATHS>")



These scripts should work on Python 2.7 through 3.3, but will be fastest on 3.0 or higher.
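The 2.7-to-3.3 compatibility hinges on one quirk: indexing a bytes object yields a one-character str on Python 2 but an int on Python 3, which is why the scripts branch on bytesAre. A small helper (my own restatement, not from the scripts) makes the little-endian reads portable either way:

```python
def byte_at(data, i):
    """Return the byte at index i as an int on both Python 2 and 3."""
    b = data[i]
    return ord(b) if isinstance(b, str) else b

def uint32_le(data, offset):
    """Little-endian unsigned 32-bit read, like the scripts' uint32()."""
    return sum(byte_at(data, offset + k) << (8 * k) for k in range(4))
```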

Some of the decompressed blocks contain what appears to be a list of multiple WAV headers, followed by the audio data. No idea how that all works yet.


Just wondering, how did you get your hands on the archives? From all the topics I've read, only a handful were sent, and attempts to get a Beta disc failed.

Nice work! I am unsure what all of this means, since I am not a dev, but it still means a game that was never released is closer to being modded! (Wait. That whole sentence sounds weird. :P)


JrMasterModelBuilder

Just wondering, how did you get your hands on the archives? From all the topics I've read, only a handful were sent, and attempts to get a Beta disc failed.

Nice work! I am unsure what all of this means, since I am not a dev, but it still means a game that was never released is closer to being modded! (Wait. That whole sentence sounds weird. :P)

Uh, this is actually about BIONICLE: The Game, which was released in 2003, not The Legend of Mata Nui, cancelled in 2001. I do have one archive from TLoMN though, onua.blk, which was uploaded by Mark/DB a while back. As soon as we get our website back online, I'll give you the link. Haven't had much luck with those yet, but it appears to be RLE-encoded as well.

Anyway, I've been doing some research into the files when decompressed. The ones that look like audio files have a list of WAV file headers that have no "data" blocks in their "RIFF" blocks. They do have a "strm" block (which, according to my research, is a non-standard block) containing 8 bytes: read as UINT32s, the first 4 are sometimes 0 and sometimes not, and the last 4 are always something higher than 0. My initial thought was that these blocks reference audio streams outside the RIFF blocks, but my attempts to create new WAV files by mashing up the headers and data into new files have been unsuccessful.
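Whatever the "strm" chunk turns out to mean, walking a RIFF body chunk by chunk makes headers like these easy to inspect. A generic walker (my own sketch, nothing game-specific) looks like this:

```python
import struct

def list_riff_chunks(data):
    """Return (chunk_id, size) for each top-level chunk in a RIFF container."""
    if data[0:4] != b"RIFF":
        raise ValueError("not a RIFF container")
    end = 8 + struct.unpack_from("<I", data, 4)[0]
    pos = 12  # skip "RIFF", the size dword, and the form type (e.g. "WAVE")
    chunks = []
    while pos + 8 <= end:
        chunk_id = data[pos:pos+4].decode("ascii", "replace")
        size = struct.unpack_from("<I", data, pos + 4)[0]
        chunks.append((chunk_id, size))
        pos += 8 + size + (size & 1)  # chunk bodies are word-aligned
    return chunks
```

Running this over one of those headerless WAVs would list the "fmt " and "strm" chunks with their sizes, which is a quick way to dump the 8-byte strm payloads for comparison.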


  • 4 weeks later...
JrMasterModelBuilder

Game Extractor by Watto Studios. That extracted everything IIRC.

Even the archives inside the archives?


Game Extractor by Watto Studios. That extracted everything IIRC.

Even the archives inside the archives?

[image: Inception meme]

Sorry, off-topic, but I thought it was funny :af:


Game Extractor by Watto Studios. That extracted everything IIRC.

Even the archives inside the archives?

Can't remember exactly, but I don't think so, although it's possible it might. I'll buy it again and get back to you.


JrMasterModelBuilder

Just wondering, how did you get your hands on the archives? From all the topics I've read, only a handful were sent, and attempts to get a Beta disc failed.

Nice work! I am unsure what all of this means, since I am not a dev, but it still means a game that was never released is closer to being modded! (Wait. That whole sentence sounds weird. :P)

Uh, this is actually about BIONICLE: The Game, which was released in 2003, not The Legend of Mata Nui, cancelled in 2001. I do have one archive from TLoMN though, onua.blk, which was uploaded by Mark/DB a while back. As soon as we get our website back online, I'll give you the link. Haven't had much luck with those yet, but it appears to be RLE-encoded as well.

The link I promised:

http://biomediaproject.com/bmp/files/otherfiles/TheLegendOfMataNui/

Game Extractor by Watto Studios. That extracted everything IIRC.

Even the archives inside the archives?

Can't remember exactly, but I don't think so, although it's possible it might. I'll buy it again and get back to you.

Just tried Game Extractor; it doesn't extract the sub-archives.


  • 1 year later...
maver1k_XVII

I recently found my old CD with the game discussed here, and after stumbling upon this topic I decided to try extracting the archives. I never really used Python scripts like that before (although I actually code model-importing scripts in Python :P ), but I figured out how to make them work and extracted all the .vol and .avl files.

 

As for the BIGB files that were extracted from the .vol archives, I encountered a few issues. Of all the BIGBs, only those with the .PCM extension seem to extract properly. Is this supposed to happen, or could it be because I'm doing something wrong?

 

Anyways, I poked around those .bin files and I was able to make a script to view them in Noesis. Here are examples of what's inside:

[screenshots of extracted map geometry viewed in Noesis]

 

I was hoping that PCM stands for "PC Model" or "PC Mesh", but it turns out it was "PC Map". :lol: And yeah, as you can see from the pictures, even though many objects look correct there are a lot of odd triangles. I have absolutely no idea if I'll manage to make the script any better because I never really worked with maps before. I'm thinking about posting it here in its current state, but is there any point, considering it is not very useful at the moment?

 

Oh, and by the way, I think I found textures in the .bin files and a way to identify them, but I don't know how to open them. Here's what I've figured out so far:

  • 0x6C5C5400
  • 12 bytes of unknown values
  • offset to image (from start of the .bin)
  • ---Format of the image header
  • 0x00010100 - magic
  • 3 bytes of unknown values
  • byte - bits per pixel?
  • int32 - null?
  • int16 - width or height
  • int16 - width or height
  • int16 - always 0x0800?
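Under the assumption that those hex values are raw byte sequences as written, the record described in the list can be parsed like this. Everything here mirrors the guesses above (the "?" fields especially); nothing is confirmed.

```python
import struct

TEX_MARKER = b"\x6C\x5C\x54\x00"  # the 0x6C5C5400 marker, taken as raw bytes

def parse_texture_record(data, pos):
    """Sketch of the texture record and image header guessed at above."""
    if data[pos:pos+4] != TEX_MARKER:
        raise ValueError("no texture marker at this position")
    # marker (4 bytes) + 12 unknown bytes, then the offset to the image
    image_offset = struct.unpack_from("<I", data, pos + 16)[0]
    hdr = data[image_offset:image_offset + 18]
    if hdr[0:4] != b"\x00\x01\x01\x00":  # image header magic
        raise ValueError("no image header magic")
    bpp = hdr[7]  # bits per pixel? (after 3 unknown bytes)
    # int32 null?, then int16 width/height, int16 height/width, int16 0x0800?
    dim_a, dim_b, const = struct.unpack("<HHH", hdr[12:18])
    return {"image_offset": image_offset, "bpp": bpp,
            "dims": (dim_a, dim_b), "const": const}
```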

maver1k_XVII

Sorry for doubleposting, but I have a big update.

 

I've been tinkering with the files of this game and the tech demo of the unreleased Bionicle 2: City of Legends, and I had a few breakthroughs.

 

Let's start with Bionicle 1. First of all, I managed to figure out the image format. Here are some examples:

[images: XQmaXRk.png, OdPtDs3.png, pEpt7FS.png, tcFmeza.gif]

Secondly, I finally extracted those .PCS and .PCSI archives by altering the BIGB script a little, and as a result I gained access to the character models and their textures. The format is almost identical to the levels, but there are a lot of problems I still need to solve. That includes texture assignment and telling the models apart (yes, as silly as it may sound, currently I can only view the contents of a .bin as a single model, although it contains multiple). Again, here are some examples:

[screenshots of extracted character models]

Most models don't have textures assigned because I can only do this manually at the moment. The models generally consist of dozens of parts and it's not always obvious which texture goes where.

 

Moving on to Bionicle 2: the BIGB files are not too different and, most importantly, are not compressed. I managed to extract some of them with a simple QuickBMS script, but I'm still not sure if it works correctly. The maps and models have a somewhat similar format, but at the same time I found it much more difficult to handle. I managed to find the level and the model of Toa Matau, but I am almost clueless about the format apart from the geometry, which is the same as in Bionicle 1. I also figured out how to decode the textures, but the only way I can extract them is through a hex editor. Here are pictures of the models:

[screenshots of the Toa Matau model and level]

And most of the textures for Matau:

[image: f0DDo5w.png]

 

I'm certainly going to work more with all this stuff; it's the first time I've had so much success with reversing file formats :D.


  • 7 years later...

Hello! I'm about to do an almost 8-year necro, so stay put.

 

I recently tried to understand I-Ninja's file format (another game by Argonaut Games), and it seems very close to what the Bionicle games and maybe other games (Ben 10?) use.

This thread was very valuable to me, and although I haven't yet found a proper way to handle BIGB files, I did manage to follow the instructions to make a VOLT archive parser & extractor.

 

It's in C#, and the full source and release are on GitHub: https://github.com/Rackover/VOLTArcUnpacker/

 

[screenshot: image.png]

 

Now I look forward to adding BIGB support to it, but I'm not in a hurry, so that may happen soon or never, depending on my free time and motivation :)

Feel free to use this unpacker for the Bionicle games, and I'll stay subscribed to this thread to see if there are any developments to this story.

 

Have a nice day!
