Friday, January 24, 2014

Getting C2 chemicals descriptions (chemicals.def file format)

What a sneaky bastard I am.
This was supposed to be a C2 Norn autopsy article! (I swear I'm not doing that to avoid awarding anyone the promised carrot.)

I just realised something was still missing from the last article.
In the C1 chemicals information article we were able to pull out both chemical names and descriptions from the chemicals.txt file.

So where's that information in C2?

It doesn't seem to exist in the default game, but you can find it in the C2 genetics kit, hiding as the "chemicals.def" file (pretty bad disguise if you ask me).

If we open that file up in a hex editor, it will look strangely familiar:

A 2 byte "\x00\x00" header before the first entry this time.

Like the other files, this one is a flat collection of string entries preceded by a 2 byte header (this time it's "00 00").
The only change is that the length prefixing each entry is no longer a single byte but a word (2 bytes).

You can see that the length for "Pain" is now described as "04 00" rather than simply "04" as in the other files.
It makes sense if you want to use descriptions longer than 255 characters (the maximum value a single byte can hold).
But it also means that those aren't strictly the "Cstring" type described over at the CDN.
(Check out the C2 History file parsing post for details on long Cstrings if you don't remember why)
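
To make the byte versus word difference concrete, here is a small Python sketch. The only thing taken from the file is the "04 00" prefix of "Pain"; reading it as little-endian is my assumption, based on the fact that "04 00" has to decode to 4.

```python
import struct

# C1-style prefix: a single byte, so a length can be at most 255.
c1_length = b"\x04"[0]

# chemicals.def-style prefix: 2 bytes. "04 00" only decodes to 4 if the
# low byte comes first (little-endian), hence the "<H" format.
(c2_length,) = struct.unpack("<H", b"\x04\x00")

# Largest length a 2 byte prefix can describe.
(max_length,) = struct.unpack("<H", b"\xff\xff")

print(c1_length, c2_length, max_length)  # prints: 4 4 65535
```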

If we get to the end of the chemicals list, we find a 4 byte "\x00\x00\x00\x00" separator, and then a list of the chemical descriptions in the same order as the chemicals themselves.
This is very similar to the C1 chemicals.txt format. (We can also be sure that those 4 bytes are an actual separator and not 2 empty string entries, since we saw in the previous article that "Antigen 7" is the last entry of the chemical list.)

After the 4 byte "\x00\x00\x00\x00" separator begins the list of chemical description entries. As usual, these are strings prefixed with their length, except that the length is now coded as a word (2 bytes) rather than a byte.

I won't be going into the details of how to write the corresponding parser for this file as you could easily extrapolate that from the C1 chemicals.txt and C2 allchemicals.str parsers.
This will probably be a good reader exercise at this point.
(Also I'm too lazy to write it right now, this being my second article in a row this evening. And I find that the information it contains is not that vital at this point anyway.)

Just to sum things up, here is the formal chemicals.def file format. It is similar to the C1 one: chemical entries, a separator, and then chemical description entries. The only difference is that here every string entry is prefixed with its length coded on 2 bytes rather than one.

<2 byte header> : always "\x00\x00"
<2 bytes "chemical1 length">< X bytes chemical1 name>
<2 bytes "chemical2 length">< X bytes chemical2 name>
...
<2 bytes "chemicalN length">< X bytes chemicalN name>
<4 bytes separator> : always "\x00\x00\x00\x00" 
<2 bytes "chemdesc1 length">< X bytes chemical1 description>
<2 bytes "chemdesc2 length">< X bytes chemical2 description>
...
<2 bytes "chemdescN length">< X bytes chemicalN description>
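
For anyone who wants something to compare their exercise against, here is a minimal Python sketch of a parser for the layout above. It is not taken from the genetics kit or from the earlier parsers, and it leans on a few assumptions of mine: lengths are little-endian (since "04 00" means 4), the text decodes as latin-1, the name list never contains two empty names in a row (otherwise they would be indistinguishable from the separator), and the descriptions run to the end of the file. The names `read_entry` and `parse_chemicals_def` are made up for this example.

```python
import struct

SEPARATOR = b"\x00\x00\x00\x00"

def read_entry(data, offset):
    """Read one 2 byte length-prefixed string; return (text, next_offset)."""
    (length,) = struct.unpack_from("<H", data, offset)  # assumed little-endian
    start = offset + 2
    end = start + length
    if end > len(data):
        raise ValueError("entry runs past the end of the data")
    return data[start:end].decode("latin-1"), end  # encoding is an assumption

def parse_chemicals_def(data):
    """Return (names, descriptions) from the raw bytes of chemicals.def."""
    if data[:2] != b"\x00\x00":
        raise ValueError("unexpected header")
    offset = 2

    # Chemical names, up to the 4 byte separator.
    names = []
    while data[offset:offset + 4] != SEPARATOR:
        if offset >= len(data):
            raise ValueError("separator not found")
        name, offset = read_entry(data, offset)
        names.append(name)
    offset += len(SEPARATOR)

    # Descriptions, assumed to run to the end of the file.
    descriptions = []
    while offset < len(data):
        description, offset = read_entry(data, offset)
        descriptions.append(description)
    return names, descriptions

if __name__ == "__main__":
    # Tiny hand-made sample, not real file content: two names, then two
    # descriptions (the second one empty).
    sample = (b"\x00\x00"
              b"\x04\x00Pain"
              b"\x09\x00Antigen 7"
              b"\x00\x00\x00\x00"
              b"\x05\x00Hurts"
              b"\x00\x00")
    print(parse_chemicals_def(sample))
```

On the real file you would pass in the bytes read with `open(path, "rb").read()`, and then pair names and descriptions with `zip(names, descriptions)`, since both lists are in the same order.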



OK, this time I promise, the next article will really be about performing C2 autopsies!

