
Word 2007 Scholarly Publishing Update

(Updated 04-07-08)

Last June, we wrote about our concerns regarding Word 2007 in scholarly publishing workflows. As the New Year begins, it is a good time to revisit those concerns.

Scholarly Publishing Status

In late July 2007, a group of scholarly publishers and vendors met with Microsoft to discuss these issues (see meeting summary by Howard Ratner). In response, Microsoft set up a web site in December 2007 with information for scholarly authors, publishers, and the vendors who serve publishers.

Today, most scholarly publishers are not ready to accept DOCX files, and several (e.g., Science, Nature) have explicitly asked authors to avoid using Word 2007 or to use it in compatibility mode. Many of the specialized applications used by publishers are not yet fully compatible with Office 2007. For example, most online submission systems are not fully compatible with Office 2007 as of this writing, and so authors find they must revert to DOC format for manuscript submission to journals. Inera expects most of the specialized applications to become fully compatible during 2008.

In an unscientific survey of its customers, Inera has found that few DOCX files have actually been submitted to their journals. Inera believes that submission-system incompatibility is not the only reason; the collaborative nature of scholarly research may also contribute. When a researcher emails a DOCX file for review, a recipient who lacks Word 2007 is likely to report being unable to open the file, especially if the recipient's institution does not grant the administrator privileges needed to install the Word 2003 compatibility pack. After such failed attempts to share documents, authors may set Word 2007 to save in DOC rather than DOCX format by default, so a significant number of Word 2007 users may not be using the new format at all. For these reasons, Inera believes that DOCX files will not be widely used by researchers before 2009.

eXtyles® and Word 2007 Equation Builder

Inera completed eXtyles compatibility for Word 2007 last October. However, during testing, our engineering team encountered two significant problems with Word's new Equation Builder feature.

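1. Word 2007 has a frequently occurring bug in which some characters in Equation Builder equations become corrupt when DOCX files are saved in RTF format (part of the eXtyles process). For example, note the equation in the following excerpt:

[image: excerpt showing the equation as it appears in the DOCX file]

and its appearance after the file is saved to RTF:

[image: the same excerpt after the file is saved to RTF]

The "W" changes to "r". In broader testing, a more common occurrence is for Chinese characters to appear, such as when the "n" in this line:

[image: line containing the "n" as it appears in the DOCX file]

is saved to RTF:

[image: the same line after the file is saved to RTF]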
On March 18, 2008, Microsoft resolved this problem via a hotfix. Inera advises all organizations that use RTF as part of their workflow when handling DOCX files to obtain this patch from Microsoft.

2. The transform provided in Word 2007 to convert Equation Builder math to MathML (OMML2MML.XSL) has bugs. For example, the expression:

[image: expression containing a summation sign with "i,j" beneath it]

is incorrectly converted by Word 2007 to this MathML:

<munder>
  <mo>&#x2211;</mo>
  <mi>i</mi><mo>,</mo><mi>j</mi>
</munder>
<mrow>
</mrow>

The correct MathML is:

<munder>
  <mo>&#x2211;</mo>
  <mrow>
    <mi>i</mi><mo>,</mo><mi>j</mi>
  </mrow>
</munder>

Though Inera has not conducted exhaustive testing of the OMML-to-MathML transform, we remain concerned that several problems were encountered in testing fewer than 100 equations.
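
The structural error in the example above can be caught mechanically. In MathML, munder takes exactly two child elements (a base and an underscript), so a munder with four children is malformed. The following is a minimal Python sketch of such a check; it is not part of eXtyles or of Microsoft's transform. The function name find_arity_errors and the FIXED_ARITY table are illustrative assumptions, and the table covers only a handful of MathML elements.

```python
import xml.etree.ElementTree as ET

# Required child counts for a few MathML elements (illustrative subset only).
FIXED_ARITY = {
    "mfrac": 2,
    "msub": 2,
    "msup": 2,
    "munder": 2,
    "mover": 2,
    "munderover": 3,
}


def find_arity_errors(fragment):
    """Return (tag, child_count) for each element with the wrong child count."""
    # Wrap the fragment in a root element so that a fragment with several
    # top-level elements still parses as one XML document.
    root = ET.fromstring("<math>" + fragment + "</math>")
    errors = []
    for elem in root.iter():
        # Drop any "{namespace}" prefix so namespaced MathML is handled too.
        local_name = elem.tag.rsplit("}", 1)[-1]
        expected = FIXED_ARITY.get(local_name)
        if expected is not None and len(elem) != expected:
            errors.append((local_name, len(elem)))
    return errors


# The two fragments from the text, written on one line each.
WORD_OUTPUT = (
    "<munder><mo>&#x2211;</mo><mi>i</mi><mo>,</mo><mi>j</mi></munder>"
    "<mrow></mrow>"
)
CORRECTED = (
    "<munder><mo>&#x2211;</mo>"
    "<mrow><mi>i</mi><mo>,</mo><mi>j</mi></mrow></munder>"
)

if __name__ == "__main__":
    print(find_arity_errors(WORD_OUTPUT))
    print(find_arity_errors(CORRECTED))
```

Run against the two fragments, the check flags the munder with four children in Word's output and reports nothing for the corrected version. A check of this kind finds only child-count errors; it cannot confirm that the MathML faithfully represents the original equation.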

Inera has found that there are no differences among the transforms shipped with the beta, release, and SP1 versions of Word 2007. We believe that Microsoft's transform has not been adequately tested and debugged. This problem stems in part from lack of use. Many users have not discovered the transform because it is undocumented by Microsoft (a search in Word 2007 SP1 Help for "MathML" yields no results).

It is likely that many other problems will need to be flushed out before the OMML-to-MathML transform will be sufficiently robust for use by scholarly publishers. Microsoft is working on an update to the transforms, but we do not have any information about a release plan.

Summary

As of January 2008, most scholarly publishers are not ready to accept DOCX files. Inera expects that most third-party applications essential for scholarly publishers will be compatible with Office 2007 by the end of 2008. Beyond the lack of fully compatible systems, most publishers we have (unscientifically) surveyed have not upgraded their editorial and production staff to Office 2007. Inera expects such upgrades to start slowly in 2008 but not to accelerate before 2009.

For 2008, Inera recommends that publishers continue to develop and test their workflows for handling DOCX files and postpone accepting such files from authors until all systems are fully tested.

Inera recommends that publishers who handle content with any significant amount of math decline to accept files that use Microsoft's Equation Builder until the two specific issues listed above are resolved by Microsoft and fully tested by publishers.
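
A publisher who wants to follow this recommendation needs a way to tell whether a submitted DOCX file contains Equation Builder math. One possible approach, sketched below in Python, is to look for oMath elements in the Office Math namespace inside the main document part. The function names are hypothetical. The sketch assumes the main part is stored at word/document.xml (the usual location, although the package relationships are the authoritative source), and it ignores math in footnotes, headers, and other parts.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used in Word 2007 DOCX files for Office Math and for the
# main word-processing markup.
OMML_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"
WORD_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_equation_builder_math(docx_file):
    """Count m:oMath elements in word/document.xml (assumed main part)."""
    with zipfile.ZipFile(docx_file) as archive:
        document_xml = archive.read("word/document.xml")
    root = ET.fromstring(document_xml)
    return sum(1 for _ in root.iter("{%s}oMath" % OMML_NS))


def make_sample_docx(with_math):
    """Build a tiny in-memory package for demonstration; not a complete DOCX."""
    math = "<m:oMath><m:r><m:t>x</m:t></m:r></m:oMath>" if with_math else ""
    body = (
        '<w:document xmlns:w="%s" xmlns:m="%s">'
        "<w:body><w:p>%s</w:p></w:body></w:document>" % (WORD_NS, OMML_NS, math)
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", body)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(count_equation_builder_math(make_sample_docx(True)))
    print(count_equation_builder_math(make_sample_docx(False)))
```

A nonzero count means the file contains at least one Equation Builder equation and could be routed for special handling. The function accepts either a file path or a file-like object, so it could run on an uploaded file before the file is written to disk.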

Additional information about Word 2007 math is available here.

Additional information about the Word 2007 DOCX file format is available here.

 
 

Inera, eXtyles, and refXpress are registered trademarks of Inera Inc.
