Firstly, all the usual "new normal" well wishes to you all, and I hope you and your families are doing OK amidst our national pandemic!
In my previous post and Video 59, which was as recent as, cough cough, Nov 2019 (!!!), I introduced the basics of developing X-Tensions for X-Ways Forensics. I promised then that I'd follow up "soon after" with more. And I promise, I did intend to. But just two months later the world got flipped upside down by Coronavirus, and the entire daily existence of juggling work, home life, kids not being at school, and not being able to go anywhere had 'a bit of an impact'. This narrative is being recorded with family life going on in the background and a noisy fan trying to keep me cool in my box room, but I hope it has some value at least.
This narrative and the accompanying video cover two aspects of X-Ways Forensics: your options for exporting files using X-Ways itself, and your options for handling files as a developer of X-Tensions.
First, before I go on to talk about X-Tensions, I would like to discuss some new(ish) export options that have arrived in the main tool of X-Ways Forensics in the last 8 months or so. Of particular mention is the ability to export file objects as PDF files and as text files!
This is a massive addition to XWF and something my friends and I have wanted (the PDF export in particular) for a long time. When you need to send some files to someone from a forensic tool, the ability to do so as PDF is a game changer. So I am delighted the ever responsive developers at X-Ways Software added this back in v19.9, in Sept 2019. What does this mean exactly? It means, simply, that you can right click one or more files, choose "Recover\Copy", tick the box for "Convert to PDF format", and then whatever the file object is, it will be exported as PDF if possible. This could be a single e-mail or Word document, or it could be a picture file, or it could even be an entire PST e-mail cabinet as one enormous PDF.
As if that were not enough, the ability also exists to export the file as text only. Now, I must say that in addition to the appeal to enable exporting as PDF, this more unusual feature was something I, for one, asked for. Perhaps there were others as well, as I am sure the guys at X-Ways Software don't just take my requests in isolation. In our age of data linking, "big data" and the myriad of tools we often need to use these days, the ability to easily select a bunch of files and simply have them all exported as text files is really useful.
The reason it is particularly useful to me, though, is programmatic. Soon after adding this feature, they added a series of flags to their API function "XWF_OpenItem", and this is where the meat of this discussion leads and why I want to talk about it.
As an X-Tension developer, if you could not interact with files, you would be rather stuck. The function XWF_OpenItem is the function you need to call to obtain a handle to a file object (typically you call it at least once, or repeatedly for multiple items, via the core function XT_ProcessItem), and then you can interact with it. Once you have that handle, you can do pretty much whatever you like with the file: read the entire thing into a memory buffer, read a 4 or 8 byte header, read a sector, read multiple sectors one after the other, read the entire thing as a memory stream, or anything else. And once you can do that, you can use any programmatic function calls you like and get the output you require, including using existing functions that you may have written years ago. And remember, X-Tensions are compiled language DLLs, so your previous native C or C++ or Delphi code can work with it. And when you are done, you release the handle (using XWF_Close).
So, prior to v20.0 of X-Ways Forensics, if you wanted to read the text of a given file in an X-Tension you were developing, you could do so IF the file itself was rather basic in structure. For example, a handle to an e-mail file or an RTF file would allow you to simply read the content of the file (using XWF_Read), and then you could run your textification routines over the buffer into another buffer or memory stream and save it as a file. Indeed, I did just that using my own routines. But I wasn't really happy with my implementation. Anyway, my point is that you'd 'simply' get the default handle that X-Ways gives you for the file, and although you can read every byte from start to end, put it in a buffer and do whatever you need to do, there is often more to do with more complex files. For example, if the file was not a basic file and was instead something like a Microsoft Word 2013 docx file, or an OpenOffice ODT file, then the handle you get is not a handle to what the user sees when they click "Preview" in the viewing component; it is a handle to the entire file. Those of you who understand how these files are structured know where this is leading: such files are compressed containers, like zip files. In fact they have a PK header (0x50 4B 03 04). And they effectively contain a filesystem of their own that includes a number of XML files. In the case of MS Word files, the text that you see in the viewing component is stored in "\word\document.xml", and for OpenOffice it is stored in "content.xml". A whole series of other XML files then dictate how this data is shown to the user, such as fonts, sizes, etc. And of course there are also Excel files and so on to think about.
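As a small illustration of the PK header point, here is a minimal standalone sketch in Python (not X-Tension code, and nothing to do with the X-Ways API itself). It simply checks whether a buffer you have already read starts with the zip signature mentioned above. The function name is my own invention for the example.

```python
# The zip "local file header" signature: 0x50 4B 03 04, i.e. "PK\x03\x04".
ZIP_LOCAL_HEADER = b"PK\x03\x04"


def looks_like_zip_container(buffer: bytes) -> bool:
    """Return True if the buffer begins with the zip local file header.

    Hypothetical helper for illustration only: a docx or ODT file read from
    its raw handle would normally start with these four bytes.
    """
    return buffer[:4] == ZIP_LOCAL_HEADER
```

If the check passes, the raw bytes are a compressed container rather than readable text, which is exactly why reading the default handle from start to end is not enough for these file types.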
So, as a developer, prior to XWF v20.0, if all you want is the text body of a Word document and you have no care for any other aspect of it, you need the parsed XML content of document.xml. Which means your code has to get the handle to the docx file first, then open the content into a buffer or save it straight out to a temp area, decompress the docx file (so another handle needed), filewalk the output to word\document.xml, get a handle to that (so another handle), open it, parse the XML tags and then you have the content you need. I did all this originally more or less, and then I asked the developers at X-Ways if they would be so kind as to add a flag to XWF_OpenItem instead so that I could simply get a handle to the text view of a file. That would give me a handle to the viewing component view of the file, instead of the raw file as stored on disk. And I am happy to report that this came with XWF v20.0, currently in Beta but due for release soon I suspect.
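To give a feel for how much work the manual route involves, here is a rough standalone sketch in Python of the steps described above: treat the raw docx bytes as a zip, pull out word/document.xml, and collect the text from the XML. It is not X-Tension code and not how X-Ways does it internally; the function name is hypothetical, and it assumes a well-formed docx whose body text sits in standard WordprocessingML paragraph and text elements.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Standard WordprocessingML namespace used inside word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_body_text(raw: bytes) -> str:
    """Return the plain body text of a docx given its raw bytes.

    Hypothetical helper for illustration: one line of output per paragraph,
    ignoring fonts, styles, headers, footers and everything else.
    """
    # Step 1: the docx is a zip container, so open it as one.
    with zipfile.ZipFile(io.BytesIO(raw)) as container:
        # Step 2: locate and read the XML part that holds the body text.
        xml_bytes = container.read("word/document.xml")

    # Step 3: parse the XML and gather the text runs of each paragraph.
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

Even in a high level language with a zip and XML library to hand, that is three separate stages for one file type, and ODT or Excel files would each need their own variant. Being handed a handle to the text view directly removes all of it.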
So, as of v20.0, all you have to do is switch the flag to your requirements and this will dictate what kind of handle you get to the file object. And by using XWF_GetSize() you can match any buffers to the size of the file object, so you can create, use and release the right amount of memory for the item currently being dealt with by XT_ProcessItem, which makes for easy and smooth memory management. Because let's face it, nobody likes memory leaks, or their associated costs or loss of production time :-(
But there is more. Not only can you get a handle to the text based version of the file, but you can also choose between UTF8 and UTF16. And did I mention you can also now get a handle to a PDF version of the file (where supported)!! So, as a developer, you can now code a solution that says "With these selected files, convert them to PDF and/or to Text, read them, parse them, search them, do a little dance with them and then save them to location X". This is mind blowingly useful because it means you can automate tasks for your users via X-Tensions. The list of flags available now (and this has pretty much tripled in recent months!) is below, and is from the API website authored by X-Ways Software AG :
In the video below, I have prepared a basic example and I show you how you can utilise some of these new flag options. Excuse the not so great sound....a fan was blowing! From there, I will leave you to explore and enjoy this new world of creative freedom :
https://youtu.be/n4nDtx-BYpg