
Word 2007 Math

(Updated 06-22-07)

Last year we reported to eXtyles® customers that Word 2007's new equation editor did not support MathML. We have since learned that this is not correct. Several recent blog postings, both by Microsoft employees:

http://blogs.msdn.com/brian_jones/archive/2006/08/16/700494.aspx

http://blogs.msdn.com/murrays/archive/2007/03/16/math-find-replace-and-rich-text-searches.aspx

http://blogs.msdn.com/murrays/archive/2007/06/05/science-and-nature-have-difficulties-with-word-2007-mathematics.aspx

and by non-Microsoft employees:

http://dpcarlisle.blogspot.com/2007/04/xhtml-and-mathml-from-office-20007.html

http://www.robweir.com/blog/2007/04/math-markup-marked-down.html

have shed new light on this situation.

Word 2007 can convert equations to and from MathML via the clipboard. However, this feature is not turned on by default, and a recent Google search shows that the control for turning it on or off is virtually undocumented by Microsoft. To turn it on, select the Insert Ribbon and add an equation (using the new Word 2007 equation editor). In the Design Ribbon that appears, click the down arrow to the right of "Tools" to open the Equation Options dialog, and there select the option "Copy MathML to the Clipboard as plain text".

The transformation that allows you to copy/paste equations via MathML is driven by two XSLT scripts (omml2mml.xsl and mml2omml.xsl). These scripts can be used outside of Word if you are reading or manipulating DOCX XML files directly.
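For readers who work with DOCX XML directly, the following is a minimal sketch of how the scripts might be used outside of Word. It is an illustration under stated assumptions, not a tested production tool. It assumes the equations are stored as m:oMath elements in the word/document.xml part of the DOCX package, that the third-party lxml package is installed for the XSLT step, and that you supply the path to your own copy of omml2mml.xsl. The function names extract_omml and omml_to_mathml are hypothetical.

```python
"""Sketch: pull Word 2007 equations (OMML) out of a DOCX and convert them to MathML."""
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the Office math (OMML) elements used by the new equation editor.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"
ET.register_namespace("m", M_NS)


def extract_omml(docx):
    """Return each <m:oMath> element of the main document part as serialized bytes.

    `docx` is a file path or a binary file-like object.
    Assumption: the main document part is word/document.xml.
    """
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    equations = []
    for omath in root.iter(f"{{{M_NS}}}oMath"):
        omath.tail = None  # do not serialize text that follows the element
        equations.append(ET.tostring(omath))
    return equations


def omml_to_mathml(omml_bytes, stylesheet_path):
    """Apply omml2mml.xsl to one serialized <m:oMath> element and return a string.

    Assumption: `stylesheet_path` points at your copy of omml2mml.xsl;
    its location depends on the Office installation and is not checked here.
    """
    from lxml import etree  # imported lazily so extraction works without lxml

    transform = etree.XSLT(etree.parse(stylesheet_path))
    return str(transform(etree.fromstring(omml_bytes)))
```

A call such as omml_to_mathml(extract_omml("paper.docx")[0], "omml2mml.xsl") would then convert the first equation, where both file names are placeholders. A non-empty result from extract_omml also serves as a quick check for whether a submitted DOCX file uses the new equation editor at all.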

So if Word 2007 supports MathML, then what's all the fuss about? We at Inera still see several problems:

1. If a Word 2007 file is opened in an earlier version of Word, the equations from the new equation editor are converted to graphics.

This is the key problem. Most publishers are not yet ready to upgrade to Word 2007, and if they edit documents in earlier versions of Word, the equations must be re-keyed for the remainder of the publication workflow (typesetting and/or XML production).

2. Scholarly publishers aren't ready to upgrade to Word 2007.

Most publishers are a long way from upgrading to Word 2007, for two main reasons. First, other systems in the publication workflow, most notably online submission/peer review systems and editorial tools, are not yet compatible with DOCX files. If the surrounding infrastructure doesn't support DOCX, there's no impetus to switch internally. Second, the user interface is very different from that of previous versions of Word, and this change brings transition issues, especially for editors, who may balk at the degree of change and who see Word as a tool for doing their work, not a tool to relearn with each new release.

3. We have concerns about the Microsoft MathML transform.

Even if all parties in the workflow change to Word 2007, we still have concerns about the Microsoft MathML transform. At the micro level, we have talked with several XML professionals who have tested this transform, and all noted bugs within an hour of testing. So we do not believe that the transform is "ready for prime time." At the macro level, we at Inera developed a transform many years ago from a (different) linear format to MathML. Even after much engineering work, that transform was not flawless, and we discovered that certain linear constructs could not be unambiguously converted to correct MathML without human intervention. So we have concerns about the degree to which this Microsoft conversion can be made robust. As an additional data point, Design Science, whose MathType has been in the business of math much longer than Microsoft, is still tuning its MathML conversion to this day because it's just not a simple task.

So where does all this leave us? We expect that publishers will inevitably have to accept DOCX files, and they must prepare for this day sooner rather than later. However, until internal systems, especially in editorial, have been switched to Word 2007, we strongly recommend that publishers discourage authors from using the new Word 2007 equation editor. Instead, they should recommend that authors continue to use the legacy equation editor, which can be accessed via Insert Object on the Insert Ribbon. Only when publishers have fully converted internally to Word 2007 will it be reasonable to consider accepting DOCX files with the new equation format. Even then, we believe that the transform to MathML requires further testing and tuning before it can be used in production. For this reason, we recommend updating instructions to authors to advise use of the legacy equation editor for the foreseeable future.

And where does eXtyles fit in with all of this? As of June 1, 2007, eXtyles is compatible with Word 2007, except for equations inserted with the new equation editor. We have a strategy for this last part that we expect to implement over the summer, and we hope to report by September that eXtyles is fully compatible with Word 2007. However, this strategy depends on Word 2007's MathML transforms, and so it will only be as robust as the transforms themselves.

Note: Design Science, which produces MathType, has posted a press release about Word 2007, equations, and scientific journal submissions. This press release includes a link to instructions for adding a button for the legacy equation editor to the Word 2007 quick-access toolbar.

Information about the Word 2007 DOCX file format is available here.

 
 ____________________________________ 
 

Inera, eXtyles, and refXpress
are registered trademarks
of Inera Inc.
