It’s always fun to play with new toys, and when the new hotness is a purpose-built, linearly scalable, password-cracking behemoth, how can one not share? I did a bit of digging while running a two-server Tableau Password Recovery setup through its paces in our labs here in Pasadena, California, and while I found many good tools and tutorials for password cracking, I found it difficult to differentiate the theoretically possible from the actually practical. Here are some thoughts from that process.

Robert Bond

I’ve been fortunate enough to meet a number of forensic investigators—both in law enforcement and inside corporations—and to hear a little about how they do their work. All of us in every line of work have preferred tools, checkpoints, and workflows, so it can be very easy to procrastinate on making the change to a new version of a favorite tool. However, I’m genuinely excited to tell you that, if you’ve been waiting for the right time to upgrade to EnCase Forensic version 7, that time is now.

EnScript and Python: Exporting Many Files for Heuristic Processing - Part 1

James Habben with Chet Hosmer

I discovered something very cool this year at CEIC: people actually read my blog posts! The realization came when I found out there were two sessions focusing on Python, and both of them talked about my #en2py techniques that I presented in this blog last year.

One of the sessions, Heuristic Reasoning with Python and EnCase, was presented by the Python forensics guy, Chet Hosmer. I got a chance to chat with him after his session, and the discussion led to what you are about to read. Chet has a number of Python scripts that can make a difference in forensic cases, and we decided a joint blog post would be a fun way to touch on the integration between EnCase and Python with another technique. This will be a two-part post, with the first part focusing on getting the files out. The second will get a little fancier by putting a GUI on the front to accept processing options. I will now let Chet explain the benefits of his work.

Function and Benefits of Heuristic Reasoning

Applying heuristics during deep-dive investigation allows us to apply rules of thumb during the process. In order to bring this to light, we chose to integrate a Python script that performs what I call heuristic indexing of binary files. Binary files like memory snapshots, executable files, and photographic images have ASCII text embedded within the binary data. Extracting these “text sequences or remnants” and then making sense of them can be a challenge.

The issue with traditional approaches like dictionary comparisons or keyword lists is the occurrence of misspelled words, slang, technical jargon, malware strings, filenames, and function names. These can all be missed because they are not in the dictionary or keyword list; an example is shown in the Casey Anthony investigation. Another traditional approach would be to report on all “text sequences or remnants,” but this can result in a voluminous number of nonsensical text strings that can overwhelm investigators.

My approach (originally outlined in my text, Python Forensics) uses a set of 400,000 common English words (loosely, a mini corpus) to generate a weighted heuristic model. I have since created additional models for the medical and pharmaceutical domains, and I’m working on one for common words used within text messages.
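To make the idea of a model built from a word list concrete, here is a minimal Python 3 sketch. The weighting function is a stand-in chosen only for illustration; the actual weighting used in Python Forensics is not described in this post, and the names heuristic_value and build_model are hypothetical.

```python
def heuristic_value(word):
    """Stand-in weighting (an assumption, not the author's algorithm).

    Reduces a word to one number using a position-weighted rolling sum of
    character codes, so the same word always maps to the same value.
    """
    value = 0
    for position, char in enumerate(word.lower(), start=1):
        value = (value * 31 + ord(char) * position) % 1000000007
    return value


def build_model(words, min_length=4):
    """Return the set of heuristic values for every purely alphabetic word
    that is at least min_length characters long."""
    return {
        heuristic_value(word)
        for word in words
        if len(word) >= min_length and word.isalpha()
    }


# A tiny sample list standing in for the 400,000-word corpus.
sample_corpus = ["evidence", "forensic", "password", "registry", "a1", "the"]
model = build_model(sample_corpus)
print(len(model), "weighted values in the model")
```

A real model would be built once from the full word list and saved to disk, which is presumably the role a file such as matrix.txt plays in the script shown later.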

Using Python, I load the specific weighted heuristics into a set. Then, during the process of extracting “text sequences or remnants” from the binary file(s), the same algorithm that was used to build the weighted heuristics is applied to each extracted sequence. The calculated heuristic is then used as a lookup value. If the value is found in the loaded weighted set, the word is considered probable and reported. One final step I should mention: most languages have what are referred to as “stop words” (whenever, always, another, elsewhere, etc.), and English is no exception. These stop words are filtered from the final list as they typically have little probative value. Each identified word that passes these filters is stored in a dictionary, one of the great built-in data structures within Python. Dictionaries hold key/value pairs; in this case the key is the probable word string and the value is the number of times the word is discovered. This allows me to then produce a resulting list of probable words sorted either alphabetically or by frequency of occurrence.
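The flow just described (extract, calculate, look up, filter stop words, count) can be sketched in a few lines of Python 3. This is an illustration under assumptions, not the pyIndex code: the weighting function is a stand-in, the regular expression is one simple way to pull letter runs out of binary data, and the names extract_sequences and index_probable_words are hypothetical.

```python
import re
from collections import Counter

# Stop words taken from the examples given in the text above.
STOP_WORDS = {"whenever", "always", "another", "elsewhere"}

# Runs of four or more ASCII letters inside binary data.
LETTER_RUN = re.compile(rb"[A-Za-z]{4,}")


def heuristic_value(word):
    """Stand-in weighting (an assumption, not the author's algorithm)."""
    value = 0
    for position, char in enumerate(word.lower(), start=1):
        value = (value * 31 + ord(char) * position) % 1000000007
    return value


def extract_sequences(data):
    """Yield each letter run found in a bytes object, lowercased."""
    for match in LETTER_RUN.finditer(data):
        yield match.group().decode("ascii").lower()


def index_probable_words(data, model):
    """Count extracted sequences whose heuristic value is in the model."""
    counts = Counter()
    for sequence in extract_sequences(data):
        if sequence in STOP_WORDS:
            continue  # little probative value, so skip it
        if heuristic_value(sequence) in model:
            counts[sequence] += 1
    return counts


# A toy model and a toy "binary file" for demonstration only.
model = {heuristic_value(w) for w in ["password", "registry", "always"]}
blob = b"\x00\x01password\xffqzxv\x00registry\x7fPASSWORD\x10always\x00"

counts = index_probable_words(blob, model)
by_frequency = counts.most_common()   # most frequent first
alphabetical = sorted(counts.items())  # sorted by word
print(by_frequency)
print(alphabetical)
```

A Counter is a dictionary subclass, so it matches the key/value layout described above while saving the bookkeeping of checking whether a word has been seen before.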

Therefore, the bottom line benefits of heuristic indexing include:
  1. Accurate identification of a broad set of probable words from binary data
  2. Slang, technical jargon, filenames, and misspelled words are also identified
  3. Nonsense strings are filtered out
  4. Common stop words are ignored
  5. Results can be sorted by frequency of occurrence or alphabetically
  6. New weighted heuristic models can be created

In order to apply this method more broadly to a case instead of a single file, we needed a way for EnCase (via an EnScript) to export multiple selected files to be processed by the Python script. I turned to James, the EnScript guru, for help.

Method of Choosing Files

In my previous posts, I used a simple technique in EnScript to send the highlighted file out from EnCase to the local disk to allow for a Python script to access the data. This works great for Python scripts that are designed to process one file at a time, but it is not very efficient for the examiner when that one file has not been pinpointed yet. There are many Python scripts out there that are designed to process a whole set of files in a designated folder.

In another post, I looped through files in the case, but I was targeting certain filenames known to contain evidence from Windows 8 Phone apps. The structure there is similar to what I have here, but the interaction with Python is the difference.

Chet and I talked at CEIC about how to do exactly this in EnScript, and came to the conclusion that the rest of the world should know about this as well! OK, maybe not the world, but I’m sure you appreciate that we didn’t keep this buried in some dark closet somewhere.

I have talked about ItemIteratorClass before, but it was in a simple post about the changes in EnScript from EnCase v6 to v7. This is the class that gives us access to all of the files in the case. There are a lot of options explained in that post, so I won’t drag it out here. The mode we will focus on is CURRENTVIEW_SELECTED, which will give us a collection of the files that the examiner has blue-checked in the EnCase interface before running the EnScript.

Because we are processing multiple files, the execution of the Python script needs to happen once the loop is complete. The loop will be doing the work of identifying selected files and exporting them to the disk.

EnScript Walkthrough

The usage of ItemIteratorClass starts off with setting some values in variables. I defined these as global variables for reasons you will see in part 2. The mode I chose here allows an examiner to blue-check any number of files in EnCase, and send this collection to the EnScript for export.

The NOPROXY option is used because I am not looking to get any hashes calculated, and skipping them speeds up the loop. The NORECURSE option is also used to speed up the loop. With a mode based on the current view, recursing into compound files isn’t possible anyway.

Then we enter into the loop to find all of the files. There's a fairly bulky chunk of code here, but it has a purpose behind it. When you are dealing with files from evidence, you are potentially pulling files from folders all over the drive. Chances are good you will find a couple files with the same name. On line 22, I am using a GUID that is generated by EnCase and is unique inside the evidence file. Lines 20-23 all together are modifying the filename to include this GUID, but also retain the same extension for identity.
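The renaming idea is easy to show outside of EnScript. The following Python 3 sketch is only an analogy: in the EnScript the GUID comes from EnCase, while here uuid4 stands in for it, and export_name is a hypothetical helper.

```python
import os
import uuid


def export_name(original_name, guid):
    """Insert a GUID before the extension so same-named files cannot collide."""
    stem, extension = os.path.splitext(original_name)
    return "{0}_{1}{2}".format(stem, guid, extension)


# Two files with the same name from different folders get distinct names.
# uuid4 is a stand-in for the GUID that EnCase would supply.
first = export_name("report.docx", uuid.uuid4().hex)
second = export_name("report.docx", uuid.uuid4().hex)
print(first)
print(second)
```

Keeping the extension at the end means tools that decide how to handle a file by its extension still work on the exported copy.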

There is a little irritation that pops up when you use any of the modes focusing on the current view. It locks that view in EnCase for the examiner running the EnScript while the iterator is active. Line 31 happens immediately after the looping export code, and this clears the iterator to release the view for the examiner while Python does its thing. Little things matter!

Depending on the Python script you are using and the amount of data you are processing, you may have to adjust the timeout value on line 41. If this value is not large enough, the output from Python will be either missing or cut short.
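The same trade-off shows up whenever one program waits on another. As an illustration only (this is Python 3 using the standard subprocess module, not the EnScript code, and run_with_timeout is a hypothetical helper), a caller that gives up too early gets missing or partial output:

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run a command and return (finished, output_text).

    If the timeout expires first, the child is stopped and whatever output
    was captured so far is returned with finished set to False.
    """
    try:
        completed = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired as expired:
        partial = expired.output or ""
        if isinstance(partial, bytes):
            partial = partial.decode("utf-8", errors="replace")
        return False, partial
    return True, completed.stdout


# A quick child process finishes well inside a generous timeout.
quick = [sys.executable, "-c", "print('report line')"]
finished, output = run_with_timeout(quick, timeout_seconds=30)
print(finished, repr(output))
```

When sizing the timeout, it is safer to estimate from the largest batch of files you expect to send than from a quick test on one or two.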

You're getting a two-for-one deal in this joint blog post, because Chet is now going to explain some Python code. (I don’t want to read any complaints about the length of this post!)

Python Walkthrough

The overview of the Python script is shown in the figure below:

The Script employs a Heuristic Model created from one or more word dictionaries. Dictionaries and vernaculars can be expanded through the training of the model. The Heuristic Indexer receives selected file(s) from EnCase and then extracts possible word strings from each of the files. Heuristics are calculated for each extracted string and then compared against the Heuristic Model. The result is a report that is delivered back to EnCase.

For Part I of the blog I want to focus on the primary integration between James’ EnScript and the Python Heuristic Indexer.

The main entry point for the Python Script prints out some information messages and then obtains the path and individual filenames exported by the EnScript by parsing the command line arguments. Then for each file found, the IndexAllWords() function is called to perform the string extraction and subsequent Heuristic analysis.  I have highlighted the key lines of the Python script.

Python Main Entry Point

# Main program for pyIndex

if __name__ == "__main__":

    # Print Script Basics
    print "\nHeuristic Indexer v 1.1 CEIC 2015"
    print "Python Forensics, Inc. \n"

    print "Script Started", str(

    # Obtain the arguments passed in by the Enscript
    # In Phase I the only argument passed is
    # path where the EnScript copied the selected files

    targetPath = ParseCommandLine()
    print "Processing EnCase Target Path: ", targetPath

    # using the targetPath, obtain a list of filenames
    # using the Python os module

    targetList = os.listdir(targetPath)

    # Creating an object to process
    # probable words
    # the matrix.txt file contains heuristic model

    wordCheck = classWordHeuristics("matrix.txt")

    # Now we can iterate through the list of files
    # Calling the IndexAllWords() function for each
    # file. The IndexAllWords() performs the word
    # extraction, heuristic processing and reports
    # results back to EnCase via Standard Out

    for eachFile in targetList:
        fullPath = os.path.join(targetPath, eachFile)
        print "####################################"
        print "## Processing File: ", eachFile
        print "####################################\n"
        IndexAllWords(fullPath, wordCheck)
    print "Script Ended", str(

    # Script End
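
The body of ParseCommandLine() is not shown in this excerpt. As a rough idea of what such a function might look like, here is a Python 3 sketch using argparse. It assumes the EnScript passes the export folder as a single positional argument; the name parse_command_line, the argument name, and the validation step are all hypothetical.

```python
import argparse
import os
import tempfile


def parse_command_line(argv=None):
    """Return the absolute path of the folder holding the exported files.

    argv defaults to the real command line; passing a list makes the
    function easy to try out without launching a separate process.
    """
    parser = argparse.ArgumentParser(
        description="Index probable words in files exported from EnCase"
    )
    parser.add_argument(
        "target_path", help="folder where the EnScript copied the selected files"
    )
    args = parser.parse_args(argv)

    # Fail early with a readable message if the folder is missing.
    if not os.path.isdir(args.target_path):
        parser.error("not a directory: " + args.target_path)
    return os.path.abspath(args.target_path)


# Demonstration with a temporary folder standing in for the export folder.
demo_folder = tempfile.mkdtemp()
print(parse_command_line([demo_folder]))
```

Checking the folder before any processing starts means a bad path is reported immediately rather than surfacing as an error from os.listdir() partway through the run.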

Results: So What Do I Get From All of This?

Here is a screen shot and an abbreviated excerpt from an actual EnCase / Python marriage.

Closing Thoughts

James: This was a new (and exciting) opportunity for me to have a guest author in a joint post. I am so happy to hear that my #en2py techniques have helped others. EnCase is a powerful platform on its own, but enhancing it with the libraries available in other languages and tools just makes everything that much better for examiners. I hope you find this useful and thanks for taking the time to read through this!

Chet: The catalyst behind Python Forensics, Inc. is to create a collaborative environment for the rapid development of new investigative scripts that can directly benefit investigators.  I hope this blog will get you interested in developing and/or using EnScripts and Python in your next endeavor.  I would like to thank James for his enthusiasm for the project and I look forward to Part II.

The Final Details

Download the EnScript here.
Download Chet's Python script here.
Look for an email invitation and announcements on Twitter about an upcoming webinar we're planning with Chet called, "EnCase and Python: Extending Your Investigative Capabilities."

Chet Hosmer
Founder of

James Habben
Master Instructor at Guidance Software

My Thoughts on CEIC 2015

CEIC 2015 is Over

This year’s CEIC is over. After a long and relaxing holiday weekend, it feels almost like it was months ago. I really enjoy being involved with CEIC every year because it gives me a chance to catch up with old friends and meet new ones. The real reason (at least the one we tell our bosses) we all go to CEIC is for the great sessions. There were so many of them this year that I wish I could have cloned myself to see them all. To make it a bit more difficult, CEIC is not just a training conference for me since I am part of the team putting it on. I wanted to put down some of my experiences from this year.

The most rewarding thing to me during the entire conference is to hear from past students about their success in completing the EnCE certification. The only way to achieve that cert is by dedication and perseverance. I get thanks from them for teaching classes they attended, but I didn’t take the test. Their excitement and enthusiasm is infectious and I love it! Congratulations to everyone who passed the 1st phase during CEIC, and good luck on the 2nd.

If you didn’t get to attend CEIC this year, you missed a good one. Try again for next year, and I think you will be well rewarded.

Some Sessions

Because I am part of the setup and operations of CEIC, I am not usually able to attend full sessions, but there are a few that I really enjoyed and want to mention.

Monday started off great hearing about new features in IEF from Jamie McQuaid and Rob Maddox of Magnet Forensics in Investigating a User’s Internet Activity across Computers, Smartphones and Tablets. This team knows how to stay on top of industry trends and to enhance their tools with a quick response. It is great to know that Guidance has a partner dedicated to examiners like we are.

A must-see for me is Tracking the Use of USB Storage on Windows 8 by Colin Cree. He has been researching USB artifacts on Windows for many years, and somehow seems to find new intricacies every year. No disappointment this year!

It’s a safe bet on the SANS crew. I enjoyed APT Attacks Exposed: Network, Host, Memory and Malware Analysis since you can never learn too much about how others operate and think. It helps us all grow, and I am glad that Rob Lee, Anuj Soni, Chad Tilbury, and Jake Williams are sharing their experiences.

I am a firm believer in everyone learning to code as a skill. Mari DeGrazia and Ron Dormido laid out a great foundation in Practical Python Forensics for those wanting to learn Python as their language. Extra points since they showed how to integrate EnCase and Python!

Memory forensics has become a huge source of information in all types of investigations, and Jamie Levy knows this better than most. As a part of the Volatility team, she is an immense resource, and she shared that knowledge in Rootkits, Exfil and APT: RAM Conquers All to help us all. I learned a lot about using Volatility from this session. I also learned about her Twitter handle outside of the session, but I will leave it to her to spread that.

My Sessions

I had a lot of fun this year talking in my sessions. I talked about how you can expand EnScript with .NET and Python code. It was exciting to me since everyone seemed to also be excited about the possibilities. I also got a chance to speak with Matt McFadden about EnCase Portable and the huge potential it has for examiners. Got to share how I used Portable on a case to handle a location with 4 examiners and 60+ computers, and we were done before dinner! Talked to many after the session that were excited about using it at home.

Deserved Recognition

Lastly, I wanted to give some recognition for a couple people from the Guidance Software team that really make CEIC the conference that it is. The entire Guidance team works really hard for this event, but these two really make it shine.

There is a technical team that I am part of every year, and it is managed by Jamey Tubbs from the training division. He puts in a ton of hours, before many of you even register for CEIC, working with the event team, hotel technical staff, and our computer rental vendor. Our conference stands apart from many others because of the large-scale labs with supplied computers, and it would not be the same without him.

On the event team, we are lucky to have Jennifer Iwata take on CEIC this year. She has been involved for a couple years, but she was the boss this year and knocked it out of the park. I think this was the smoothest CEIC yet for the operational staff and I heard the same from many others as well. I am sure that she is already on top of planning an even better CEIC for next year!

Until you read from me again!
James Habben

Digital Forensic Notables and Top-flight Instructors On Tap at CEIC 2015

(This is Part 3 of a 3-part series on the all-new and enhanced digital forensics labs and lectures at CEIC 2015.)

The first post in this series talked about how we're expanding on the core competency of the EnCase community who converge on CEIC each year. The second post drilled down into the plethora and diversity of digital artifacts and showcased sessions designed to address these exploding challenges. In this final post, we present the marquee of acclaimed industry experts who will be on hand to teach new technologies and tools and share hard-earned insight from decades of experience in digital investigations.