Getting started with OOXML

It can be hard to know where to start when working with a new technology. I recently started work on a project that relies heavily on OOXML, and I thought I’d share the websites and resources I found most useful with other people who are just starting out.

Open XML Developer - http://openxmldeveloper.org/

Open XML Developer is the best all-in-one OOXML site on the web. It features OOXML news, articles, examples and an active community. If you have questions that aren’t already answered there, the site has forums on just about every OOXML topic you could think of.

Microsoft SDK for Open XML Formats - http://msdn2.microsoft.com/en-us/library/bb448854.aspx

This is Microsoft’s SDK for working with Open XML. Right now the name is slightly confusing, as the SDK only provides an API over the OOXML package, not the OOXML file formats themselves. You can read a docx, for example, and pick out all the individual style, formatting and document parts, but the actual part contents are still XML that you must read and write yourself. The SDK is still in preview at the moment, so I’m sure that support for the markup languages will improve as time goes on.
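To make the package versus markup distinction concrete, here is a minimal sketch in Python rather than the SDK itself. It opens a package as a zip archive, pulls out one part, and then reads the WordprocessingML by hand. It assumes the main document part lives at word/document.xml, which is what Word conventionally writes; strictly speaking the part should be located through the package relationships. The helper names (paragraph_texts, make_sample_package) are made up for this illustration, and the sample package is deliberately stripped down, so it is not a complete docx that Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Main WordprocessingML namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_texts(package_bytes):
    """Return the text of each w:p paragraph in word/document.xml.

    Assumption: the main document part is stored at word/document.xml.
    """
    # Package level: the docx is a zip archive and each part is an entry.
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        part_xml = package.read("word/document.xml")

    # Markup level: the part is plain XML that we have to walk ourselves.
    root = ET.fromstring(part_xml)
    texts = []
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = [t.text or "" for t in para.iter(f"{{{W_NS}}}t")]
        texts.append("".join(runs))
    return texts


def make_sample_package(paragraphs):
    """Build a tiny in-memory package holding only word/document.xml.

    Hypothetical helper for the demo; a real docx also needs
    [Content_Types].xml and relationship parts.
    """
    body = "".join(
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>" for text in paragraphs
    )
    document = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_sample_package(["Hello OOXML", "Second paragraph"])
    for line in paragraph_texts(sample):
        print(line)
```

The first half of paragraph_texts is roughly the level the SDK preview works at (finding and opening parts); the second half is the XML reading you still do yourself.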

Open XML Explained - http://openxmldeveloper.org/articles/1970.aspx

Open XML Explained is the first book on Open XML development and is freely available to download. The book is 128 pages long and provides a good high-level introduction to OOXML and the three main markup languages: WordprocessingML, SpreadsheetML and PresentationML.

Ecma Office Open XML specification - http://www.ecma-international.org/publications/standards/Ecma-376.htm

If you really want to dig into the details of OOXML, the specification is the best place to look. Although there has been much rending of clothing and gnashing of teeth over the specification’s 6000-page length, that page count includes introductions and primers to the specification. Also, the markup reference document, which is by far the largest of the specification documents, is padded out significantly, with many elements and attributes described a number of times.

  • Part 1: Fundamentals (174 pages): Gives an overview of Open XML packages and the parts that make up the markup languages.
  • Part 2: Open Packaging Convention (129 pages): Goes into more detail of the Open XML package conventions.
  • Part 3: Primer (472 pages): Describes the markup languages and how they work. Recommended as a good introduction to OOXML.
  • Part 4: Markup Language Reference (5219 pages): Provides descriptions of every element and attribute. There is a lot of detail in this document but repetition also contributes to its size. I have found using the document links is a good way to navigate the content and find what you are looking for.
  • Part 5: Markup Compatibility and Extensibility (43 pages): Describes how additional markup can be added to the format while still conforming to the specification.

WinRAR - http://www.rarlab.com/

I’m sure there are better tools for this, but I have been using WinRAR to explore existing OOXML packages. Since the packages are just zip archives, any zip tool will let you view the contents.
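If you would rather script the exploration than click through an archive tool, the same idea can be sketched with Python's standard zipfile module. The function name list_parts is made up for this example, and the two-entry archive in the demo is only a stand-in for a real package.

```python
import io
import zipfile


def list_parts(source):
    """Return (name, uncompressed size in bytes) for each entry in a package.

    `source` can be a file path or a binary file-like object, since
    zipfile.ZipFile accepts either.
    """
    with zipfile.ZipFile(source) as package:
        return [(info.filename, info.file_size) for info in package.infolist()]


if __name__ == "__main__":
    # Stand-in archive for the demo; point list_parts at a real .docx,
    # .xlsx or .pptx path to see an actual package layout.
    demo = io.BytesIO()
    with zipfile.ZipFile(demo, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", "<doc/>")
    demo.seek(0)

    for name, size in list_parts(demo):
        print(f"{size:>8}  {name}")
```

For a quick look without writing any code, `python -m zipfile -l` followed by the file name lists the entries from the command line.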

If you know of any other OOXML resources I’d love to hear about them.
