PowerShell: Decompiling Compiled HTML Help (.CHM) Files and Data Wrangling


WHAT IS COMPILED HTML HELP (.CHM)?

Microsoft Compiled HTML Help is a proprietary Microsoft online help format consisting of a collection of HTML pages, an index, and other navigation tools. The files are compressed and deployed in a binary format with the extension .CHM, for Compiled HTML. The format is often used for software documentation, such as the help files of the Sysinternals tools.

APPROACH

Today a friend and I were looking for a way to decompile .chm files into HTML and then parse the HTML DOM to extract some information. After some googling, I found that Windows ships with a command-line utility, HH.exe, which can decompile .CHM files to HTML through its -decompile option.

So I wrapped the commands in a PowerShell function.
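A minimal sketch of such a wrapper might look like the following. It is an illustration, not the original function: the name `Expand-ChmFile` and its parameters are hypothetical, and it assumes `hh.exe` can be found on the PATH. The only `hh.exe` feature it relies on is the `hh.exe -decompile <output folder> <chm file>` form.

```powershell
# Hypothetical wrapper around "hh.exe -decompile"; Windows only.
function Expand-ChmFile {
    param(
        [Parameter(Mandatory)]
        [string]$Path,          # the .chm file to decompile

        [Parameter(Mandatory)]
        [string]$Destination    # folder that will receive the HTML files
    )

    # Resolve to full paths so hh.exe does not depend on the working directory.
    $chm = (Resolve-Path -LiteralPath $Path).ProviderPath
    if (-not (Test-Path -LiteralPath $Destination)) {
        New-Item -ItemType Directory -Path $Destination | Out-Null
    }
    $out = (Resolve-Path -LiteralPath $Destination).ProviderPath

    # hh.exe is a GUI program, so -Wait is needed to block until it has finished.
    # The paths are quoted in case they contain spaces.
    Start-Process -FilePath 'hh.exe' -ArgumentList '-decompile', "`"$out`"", "`"$chm`"" -Wait

    # Return the extracted files so they can be piped into further processing.
    Get-ChildItem -LiteralPath $out -Recurse -File
}
```

With a help file at a made-up location such as `C:\Tools\procmon.chm`, the call would be `Expand-ChmFile -Path C:\Tools\procmon.chm -Destination C:\Temp\procmon`.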

I then extracted the required information from the decompiled HTML with a short piece of code.
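As a rough illustration of the extraction step, the sketch below pulls the page title out of every decompiled HTML file. Two assumptions to note: the function name `Get-HtmlTitle` is hypothetical, and it uses a regular expression in place of a real DOM parse, which is enough for a simple field like `<title>` but not for nested markup. What information is actually "required" will differ from case to case.

```powershell
# Hypothetical helper: list the <title> of each .htm/.html file under a folder.
function Get-HtmlTitle {
    param(
        [Parameter(Mandatory)]
        [string]$Path    # folder containing the decompiled HTML files
    )

    Get-ChildItem -Path $Path -Recurse -File |
        Where-Object { $_.Extension -in '.htm', '.html' } |
        ForEach-Object {
            # -Raw reads the whole file as one string; the cast turns $null
            # (returned for an empty file) into an empty string.
            $content = [string](Get-Content -LiteralPath $_.FullName -Raw)

            # (?is): case-insensitive, and "." also matches newlines.
            $match = [regex]::Match($content, '(?is)<title[^>]*>(.*?)</title>')

            [pscustomobject]@{
                File  = $_.Name
                Title = if ($match.Success) { $match.Groups[1].Value.Trim() } else { $null }
            }
        }
}
```

The objects it returns can be sent on to `Sort-Object`, `Where-Object`, or `Export-Csv` for further wrangling.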

HOW TO RUN

Here I chose the Compiled HTML Help file of ProcMon.exe (Process Monitor, a Sysinternals tool) as a sample .chm file.

Hope you find it useful, happy learning 🙂

Prateek Singh  
