Thursday, August 5, 2010

Object Models of Microsoft Office

Despite its hideous looks and varied shortcomings, the Component Object Model (COM) technology from Microsoft does make interfacing with Windows applications a lot easier. Notably, programmatic manipulation of Microsoft Office files (Word documents and Excel spreadsheets, mostly) is almost always implemented by way of automation – i.e., driving the corresponding Office application through its COM API to perform the desired actions.

COM is language-agnostic and widely supported, making it a readily available option in most programming environments. Accordingly, the web is littered with tutorials on the basics of setting up a COM session and performing simple MS Office automation tasks; almost any language available on Windows is covered, from C++ through C# all the way to Python. However, a more in-depth API reference is often missing; small articles, forum posts and code snippets can be found for any one particular topic, but learning even moderately complex tasks can quickly turn into a scavenger hunt for bits and pieces of information scattered across a dozen websites.

Microsoft Office Development with Visual Studio is MSDN's misleadingly titled article on Office automation. It was last updated in 2001, so it even lacks a .NET section; despite that, it remains an invaluable resource, as it contains one of the most complete Object Models of the Office suite still available – and those have seldom changed (if at all) since then. It's still not the definitive reference one would hope for – I had trouble finding information on how to open an Excel file, and what I did eventually find was barely enough – but it's certainly a very valuable addition to the information pool.
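Since opening an Excel file was the part I struggled with, here is a minimal sketch of how it might look from Python. It assumes the pywin32 package (which provides win32com.client.Dispatch) and a Windows machine with Excel installed. The function name read_first_cell and its dispatch parameter are my own illustrative inventions, not part of any Office API; the parameter only exists so the COM entry point can be swapped out when trying the code on a machine without Excel. The object-model calls themselves (Workbooks.Open, Worksheets, Range, Close, Quit) come from the Excel Object Model.

```python
import os


def read_first_cell(path, dispatch=None):
    """Open an Excel workbook through COM and return the value of cell A1.

    `dispatch` is a hypothetical hook for substituting the COM entry point;
    by default the real one from pywin32 is used.
    """
    if dispatch is None:
        # Imported lazily: pywin32 is only available on Windows.
        from win32com.client import Dispatch as dispatch

    excel = dispatch("Excel.Application")  # obtain the Excel automation object
    excel.Visible = False                  # keep the Excel window hidden
    try:
        # Excel does not resolve relative paths against the script's
        # working directory, so pass an absolute path.
        workbook = excel.Workbooks.Open(os.path.abspath(path))
        try:
            return workbook.Worksheets(1).Range("A1").Value
        finally:
            workbook.Close(False)  # False: discard any changes
    finally:
        excel.Quit()  # ask Excel to exit once we are done
```

On a Windows machine, calling read_first_cell("report.xls") (the file name is just a placeholder) should return whatever sits in the top-left cell of the first worksheet. The nested try/finally blocks are there so that the workbook gets closed and Excel is told to quit even when something goes wrong along the way.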