Microsoft Office File Format Internals: An MS Office document is organized internally using OLE Structured Storage, a systematic organization of the components that make up the document. Each document has a root component which contains storage and stream components. OLE Structured Storage is analogous to a file system: 'storage' components are equivalent to directories and 'stream' components are equivalent to files. A storage component may exist on its own, or it may contain one or more sub-storage and stream components. The root component may also hold stream components directly.
The actual implementation details are defined in The Windows Compound Binary File Format specification.
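To get a feel for what that specification describes, here is a minimal sketch that decodes a few fields of the compound file header in plain Ruby, without any gem. The field offsets come from my reading of the header layout, and the names parse_cfb_header and CFB_MAGIC are made up for this example, so treat it as an illustration rather than a reference parser.

```ruby
# Sketch: decode a few fields from the first bytes of a compound file.
# Offsets are assumptions based on the published header layout.
CFB_MAGIC = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1".b

def parse_cfb_header(bytes)
  raise ArgumentError, "header too short" if bytes.nil? || bytes.bytesize < 76
  bytes = bytes.b
  raise ArgumentError, "not a compound file" unless bytes[0, 8] == CFB_MAGIC

  # Five little-endian 16-bit values starting at offset 24.
  minor, major, byte_order, sector_shift, mini_shift = bytes[24, 10].unpack("v5")
  # Little-endian 32-bit values at offsets 44, 48 and 56.
  fat_sectors, first_dir = bytes[44, 8].unpack("V2")
  mini_cutoff = bytes[56, 4].unpack1("V")

  {
    minor_version: minor,
    major_version: major,
    byte_order: byte_order,
    sector_size: 1 << sector_shift,
    mini_sector_size: 1 << mini_shift,
    fat_sectors: fat_sectors,
    first_directory_sector: first_dir,
    mini_stream_cutoff: mini_cutoff
  }
end

# Usage: ruby cfb_header.rb sample2.doc
if __FILE__ == $0 && ARGV.size == 1
  header = File.binread(ARGV[0], 512)
  parse_cfb_header(header).each { |key, value| puts "#{key}: #{value}" }
end
```

The sector size and the first directory sector are the two values needed to locate the directory entries, which is where the storage and stream names live.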
Most of my research on the MS Office file format was conducted using the Ruby OLE library, which allows easy, abstracted read-write access to the various streams and storages packed in the internal OLE structures. Install the ruby-ole gem before trying out any of the examples below.
Examples:
Dumping the OLE structure of a given Word document:
user@sigsegv$ oletool --tree sample2.doc
- #<Dirent:"Root Entry">
|- #<Dirent:"1Table" size=34907 data="^\004\032\000\022...">
|- #<Dirent:"\001CompObj" size=121 data="\001\000\376\377\003...">
|- #<Dirent:"MsoDataStore">
| \- #<Dirent:"F\303\223\303\216\303\226U\303\2261\303\2305U4\303\217\303\2201BEKP\303\235N\303\203\303\200==">
| |- #<Dirent:"Item" size=216 data="<b:So...">
| \- #<Dirent:"Properties" size=341 data="<?xml...">
|- #<Dirent:"WordDocument" size=15429 data="\354\245\301\000}...">
|- #<Dirent:"\005SummaryInformation" size=4096 data="\376\377\000\000\005...">
\- #<Dirent:"\005DocumentSummaryInformation" size=4096 data="\376\377\000\000\005...">
user@sigsegv$
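The same kind of listing can be produced from code with a short recursive walk. The sketch below is duck-typed: it works on any object that responds to name, dir?, size and children. As far as I know the entries returned by ruby-ole (starting from ole.root) respond to these, but that is an assumption here, and the Node struct is only a stand-in so the example runs without the gem or a sample file.

```ruby
# Sketch: render a storage/stream tree as indented lines.
# `entry` is any object responding to name, dir?, size and children.
def ole_tree_lines(entry, depth = 0)
  indent = "  " * depth
  if entry.dir?
    lines = ["#{indent}[storage] #{entry.name.inspect}"]
    (entry.children || []).each do |child|
      lines.concat(ole_tree_lines(child, depth + 1))
    end
    lines
  else
    ["#{indent}[stream]  #{entry.name.inspect} (#{entry.size} bytes)"]
  end
end

# Hypothetical stand-in for a directory entry: a node with a children
# array is a storage, a node with children == nil is a stream.
Node = Struct.new(:name, :size, :children) do
  def dir?
    !children.nil?
  end
end

sample = Node.new("Root Entry", 0, [
  Node.new("1Table", 34907, nil),
  Node.new("MsoDataStore", 0, [Node.new("Item", 216, nil)]),
  Node.new("WordDocument", 15429, nil)
])
puts ole_tree_lines(sample)
```

With the gem installed, the call would presumably be puts ole_tree_lines(ole.root) on an opened Ole::Storage object.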
Sample code to display the size of the WordDocument stream inside a doc file:
#!/usr/bin/ruby
require 'rubygems'
require 'ole/storage'
ole = Ole::Storage.new("sample2.doc")
buf = ole.file.read("/WordDocument")
ole.close
puts "WordDocument stream size: #{buf.size}"
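The WordDocument stream read above begins with the File Information Block (FIB). As a rough sketch, a few of its header fields can be pulled out with String#unpack alone. The offsets used here (wIdent at 0, nFib at 2, fcMin at 24, fcMac at 28) are my assumption for the older Word 97-2003 layout, and the method names are invented for this example. For fast-saved or Unicode documents the fcMin..fcMac range is not simply the text, so this only illustrates the idea.

```ruby
# Sketch: read a few FIB header fields from the raw WordDocument stream.
# Offsets are assumptions for the Word 97-2003 binary layout.
FIB_MAGIC = 0xA5EC # expected wIdent value

def parse_fib_header(stream)
  stream = stream.b
  raise ArgumentError, "stream shorter than FIB header" if stream.bytesize < 32

  w_ident, n_fib = stream[0, 4].unpack("v2")   # two little-endian 16-bit values
  raise ArgumentError, "unexpected wIdent" unless w_ident == FIB_MAGIC

  fc_min, fc_mac = stream[24, 8].unpack("V2")  # two little-endian 32-bit values
  { wIdent: w_ident, nFib: n_fib, fcMin: fc_min, fcMac: fc_mac }
end

# Return the raw bytes between fcMin and fcMac.
def main_text_bytes(stream)
  fib = parse_fib_header(stream)
  raise ArgumentError, "fcMac is before fcMin" if fib[:fcMac] < fib[:fcMin]
  stream.b[fib[:fcMin], fib[:fcMac] - fib[:fcMin]]
end

# Usage, given `buf` from the previous example:
#   fib = parse_fib_header(buf)
#   puts "nFib: 0x%04x, text range: %d..%d" % [fib[:nFib], fib[:fcMin], fib[:fcMac]]
```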
Sample code to display only the text part of a doc file:
require 'rubygems'
require 'ole/storage'
require 'lib/fib'   # local FIB parser providing Word::FIB (not part of the ruby-ole gem)

if __FILE__ == $0
  # Expect exactly one argument: the path to a .doc file.
  if ARGV.size != 1
    abort "usage: #{$0} <file.doc>"
  end

  ole = Ole::Storage.new(ARGV[0])
  docbuf = ole.file.read("/WordDocument")
  fib = Word::FIB.load(ole)
  ole.close

  # fcMin and fcMac are the start and end offsets of the text
  # inside the WordDocument stream.
  off_start = fib.fcMin
  off_end = fib.fcMac
  puts "Text offset start: #{off_start}"
  puts "Text offset end: #{off_end}"

  text = docbuf[off_start, off_end - off_start]
  puts text.inspect
end
Reverse Engineering a Microsoft Office Patch: Patches for the Microsoft Office suite are usually distributed by Microsoft as self-extracting executables that wrap MSP or MSI packages, so extracting them is not quite the same as for other patches.
Step 1:
After fetching the patch installer executable, the first thing to do is to have the installer extract the MSI/MSP installer packages:
officexp-KB-XXX.exe /C /T:e:\ms08-042-extracted\
The above command extracts the actual patch installer files to the e:\ms08-042-extracted\ directory. Among the extracted files there will be an MSI or MSP file, which is the main patch installer.
Step 2:
The MSI/MSP files are special OLE structured installer packages. Details can be found here, here. There is also a utility for extracting MSI/MSP files here.
msix.exe WINWORD.msp /out:e:\ms08-042-extracted\ /ext
This should extract all the table data and other relevant information, along with a CAB file containing the actual patch binaries we are interested in. Find the CAB file among the extracted files, extract it normally using WinZIP/WinRAR etc., and BANG!
Bug Hunting: A good number of bugs, including, in theory, security vulnerabilities, were discovered using very trivial bit- and byte-alteration fuzzing of various structures, including the File Information Block (FIB) in Word documents, random structures in the TableStream, etc. A number of structures in the file formats, particularly the Word file format, have sizes that are themselves read from the document. These areas can be good vectors for fuzzing, particularly when multiple structures are loaded from the file using size values read from the file itself.
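The bit-alteration idea can be sketched in a few lines of Ruby. This is not the fuzzer used for the research, only a minimal illustration: flip_bits is a made-up helper that flips random bits inside a chosen byte range of a buffer, and the example range (offset 24, length 8) is just my pick for two FIB offset fields. Writing the mutated stream back into a document and feeding it to Word is not shown.

```ruby
# Sketch: flip `flips` randomly chosen bits inside data[offset, length]
# and return the mutated copy. The input buffer is left untouched.
def flip_bits(data, offset:, length:, flips: 1, rng: Random.new)
  if offset < 0 || length <= 0 || offset + length > data.bytesize
    raise ArgumentError, "range outside buffer"
  end

  mutated = data.dup.force_encoding(Encoding::BINARY)
  flips.times do
    pos = offset + rng.rand(length)  # byte position inside the range
    bit = 1 << rng.rand(8)           # single-bit mask
    mutated.setbyte(pos, mutated.getbyte(pos) ^ bit)
  end
  mutated
end

# Usage, given `buf` holding the WordDocument stream:
#   seed = 1234                      # keep the seed to reproduce a crash
#   fuzzed = flip_bits(buf, offset: 24, length: 8, flips: 2, rng: Random.new(seed))
```

Passing a seeded Random makes each test case reproducible, which matters once a mutation actually crashes the parser. Note that two flips can land on the same bit and cancel out.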