Firefox Cache2 Storage Breakdown

Mozilla introduced a new cache storage format, cache2, for the Firefox browser in version 27. It was off by default until version 32, when it was turned on. Mozilla says in its recent statements that cache2 is more efficient and speeds up the browser.

Here is a good write-up about the previous version of cache in case you encounter it. Pretty much every forensic tool supports it—which brings me to my next point.


My colleague Carl Purser is the one who told me about this new version. He showed me the new files and such, and then we wondered together about what tools support the new version. I sent out a tweet, but only got crickets in response.

This got me curious about the structure, so I grabbed a copy of the latest Firefox source code and got to gettin’.


There are two types of files in the cache. A file called “index” is exactly that. It holds records for each of the files that are tracked in the cache2 folder. The other type is the cached file itself, which is named with a SHA-1 hash of the file’s URL. The meat is in the cache files, but let’s get the index file out of the way first.
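To illustrate the naming scheme, here is a minimal Python sketch that turns a string into a 40-character hex SHA-1 name. The function name is my own, and the exact string Firefox feeds into the hash is an assumption on my part (it may carry more than the bare URL), so verify against a known cache entry before relying on it.

```python
import hashlib


def cache_file_name(key: str) -> str:
    """Return the upper-case hex SHA-1 digest of a cache key.

    Assumption: the key is hashed as UTF-8 text. The precise key
    string Firefox uses may differ from the plain URL.
    """
    return hashlib.sha1(key.encode("utf-8")).hexdigest().upper()


if __name__ == "__main__":
    # Hypothetical URL, used only to show the shape of the output.
    print(cache_file_name("http://www.example.com/"))
```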

Index File

The index file starts off with a 12-byte header.

Currently the version is 1. If this value changes, it will be time to research again. You will likely find the dirty flag on, since DFIR best practices still say to ‘pull the plug.’ It’s OK. I haven’t noticed anything significant from it. The last-modified timestamp can be interpreted easily with DCode.
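A rough Python sketch of reading that header follows. It assumes the 12 bytes are three big-endian 32-bit integers in the order version, last modified, dirty flag, and that the timestamp is Unix seconds. Both the field order and the function name are my assumptions, so check them against the source code or a hex view of your own index file.

```python
import struct
from datetime import datetime, timezone

# Assumed layout: version, last modified, dirty flag (3 x uint32, big-endian).
INDEX_HEADER = struct.Struct(">III")


def parse_index_header(data: bytes) -> dict:
    """Decode the first 12 bytes of a cache2 index file."""
    version, last_modified, dirty = INDEX_HEADER.unpack_from(data, 0)
    return {
        "version": version,
        # Assumption: the timestamp is seconds since the Unix epoch.
        "last_modified": datetime.fromtimestamp(last_modified, tz=timezone.utc),
        "dirty": bool(dirty),
    }
```

If the version comes back as anything other than 1, treat the rest of the layout as unverified.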

The records are fairly simple and start immediately after the header.

Forensically, there isn’t a whole lot to see here, but the work is done and it makes sense to share. Perhaps someone else can spot something important.

Cache Files

The cache files start as the files that come over the wire. After the end of the original content, Firefox stores a bunch of metadata from the web server. To locate the metadata, you have to read the last four bytes of the file as a Big Endian integer, which gives you the offset where the metadata begins.
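Reading that offset takes only a few lines of Python. This is a minimal sketch and the function name is my own. It treats the last four bytes as an unsigned big-endian integer and does a basic sanity check.

```python
import struct


def metadata_offset(cache_file: bytes) -> int:
    """Return the offset where the metadata starts in a cache2 file."""
    if len(cache_file) < 4:
        raise ValueError("file is too short to hold a metadata offset")
    # Last four bytes, unsigned 32-bit, big-endian (network byte order).
    (offset,) = struct.unpack(">I", cache_file[-4:])
    if offset > len(cache_file) - 4:
        raise ValueError("metadata offset points past the end of the file")
    return offset
```

Everything before the returned offset is the original content, and everything from the offset up to the final four bytes is metadata.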

At this location, we have more math to do. The metadata starts with a 4-byte hash of the file content. Then there are hashes applied to the cache file content in chunks of 262,144 bytes. For each chunk of the file, there’s a two-byte portion of a hash value. The file I’m using here is 193,903 bytes.

193,903 / 262,144 = 0.739681243896484375

The last chunk will always be 262,144 bytes or less. If there is any remainder from the division, then round up. This file is under the chunk size, so we just round up to one chunk. This means we skip 6 bytes (4 for the hash, 2 for the single chunk hash).
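The same arithmetic as a small Python sketch. The function names are mine, and I am assuming the size that matters is the size of the original content, meaning the bytes before the metadata.

```python
CHUNK_SIZE = 262_144  # bytes of content covered by each two-byte chunk hash


def chunk_count(content_size: int) -> int:
    """Number of chunks, rounding up when there is any remainder."""
    return -(-content_size // CHUNK_SIZE)  # integer ceiling division


def hash_block_length(content_size: int) -> int:
    """Bytes to skip: 4-byte content hash plus 2 bytes per chunk."""
    return 4 + 2 * chunk_count(content_size)


if __name__ == "__main__":
    print(hash_block_length(193_903))  # the example file: one chunk, 6 bytes
```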

Now we’re at the meat of the metadata. If you haven’t caught on yet, these numbers are all stored in Big Endian. It is referred to as network byte order in the code.

At the end of the URI is a NULL byte (0x00). Following the NULL byte are attributes stored as NULL-terminated name/value pairs. Here is an attribute named ‘security-info’ followed by a NULL byte. The value of this attribute is Base64-encoded and contains information about the HTTPS stream that was used to fetch this file.

There is no length property so you just go until you find the NULL byte. The next attribute is named ‘request-method’ with GET as the value.

Then there is a ‘response-head’ attribute, whose value is the HTTP response data sent from the server.
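Walking those pairs can be sketched in Python as below. The function name is my own, and it assumes you have already sliced out just the attribute bytes: everything after the URI’s NULL byte and before the trailing four-byte offset.

```python
def parse_attributes(blob: bytes) -> dict:
    """Split NULL-terminated name/value pairs into a dictionary."""
    parts = blob.split(b"\x00")
    # A trailing terminator leaves one empty element at the end.
    if parts and parts[-1] == b"":
        parts.pop()
    attrs = {}
    # Names sit at even positions, values at odd positions.
    for i in range(0, len(parts) - 1, 2):
        name = parts[i].decode("latin-1")
        attrs[name] = parts[i + 1].decode("latin-1")
    return attrs
```

Decoding as latin-1 is a deliberate choice here: it never fails on arbitrary bytes, which is handy when the input may be damaged.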

Automated Tools
Finally, to follow the EnCase forensic training methodology, here is the customary “easy way” after digging into the details of the data. It’s a personal goal of mine to start learning Python, so this worked out to be a nice, simple project to get a start. I posted the code on GitHub to share with the community.

Simon Key wrote an EnCase Evidence Processor plugin in EnScript. This will create a LEF to contain parsed data from the cache files. It doesn’t go after the index file because it doesn’t contain enough data on its own and doesn’t really add much beyond the metadata inside the cache files.

The results of the Module are added in the Records tab. There you will find the files stripped of the added cache2 metadata, and also decompressed if they were transferred in gzip compression.
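For a feel of what that stripping and decompressing involves, here is a small Python sketch. It is not the module’s code. The function name is mine, and detecting gzip by its two magic bytes (rather than by the Content-Encoding header in the metadata) is a shortcut I am assuming is good enough for illustration.

```python
import gzip
import struct


def extract_content(cache_file: bytes) -> bytes:
    """Return the original content with cache2 metadata removed."""
    # The last four bytes give the big-endian offset where metadata starts.
    (offset,) = struct.unpack(">I", cache_file[-4:])
    content = cache_file[:offset]
    # Assumption: gzip bodies are recognized by the 0x1f 0x8b magic bytes.
    if content[:2] == b"\x1f\x8b":
        content = gzip.decompress(content)
    return content
```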

All of this will be added to our Advanced Internet Examinations course shortly, along with a grip of other new material. Check back soon to see the updated syllabus, or follow me on Twitter for updates as they come out.

You can find the Evidence Processor Module in the EnCase App Central store here.

Commercial Tools

So far, there are three commercial tools that I know of that support this cache2 structure. I have listed them on my GitHub page and I will keep that list up to date as I am told about more tools supporting the format. Here is the list as of now:
Digital Detective
Foxton Software
If you know of other tools, please reach out to me and I will be happy to update the list. Drop a comment below if you have any thoughts about what I’ve covered here.
James Habben
