Creating semantic structure using paragraph styles, part 2: How eXtyles uses paragraph styles

In our long experience of helping editorial and production staff transition to using eXtyles, we’ve observed that making the mental leap from formatting Word content to applying semantic structure to Word content is often one of the most challenging aspects of adopting an XML-driven workflow.

What are some differences between formatting and semantically structuring documents? Why is this challenging transition so important to make? What are some benefits of a semantic mindset? In this 3-part series of posts, we’ll discuss these questions.

Are you sitting comfortably? Then let’s begin!

Working with styles in Word

Many publishers use Word styles in preparing manuscripts for typesetting, and these processes are often of long standing. InDesign is very good at translating input Word styles into InDesign paragraph and character styles to help typesetters work more effectively, but its predecessors in this space, including PageMaker, QuarkXPress, and FrameMaker, also used this technique.

Word provides some basic default styles (including headings, block quotes, lists, and “normal” paragraph text), which can be used to style simple documents. Many publishers create their own sets of styles, either universal or specific to particular publication types or disciplines. These style sets can be fairly bare-bones or enormously complex, and can be organized in a wide variety of ways. Learning to navigate a particular publisher’s style library can be a substantial part of the training process for new in-house or freelance staff.

As we discussed in Part 1, it’s likely you’re already using Word styles to convey information in your publishing workflow. Since you’re already doing the work of applying the styles, let’s leverage that work to build structure into your content!

How many styles do you need?

Part of the onboarding process for new eXtyles customers is usually some kind of conversation about existing style libraries. While there is theoretically no set limit on how many styles we can support in a customer template, practical limitations do exist, and it’s not unusual for us to suggest pruning down the number of styles you use. For example, if your style template includes separate styles for bulleted and numbered lists, as well as separate styles for the first, middle, and last items in each list type, we will likely recommend condensing down to a single list style; eXtyles can auto-detect the list type from the label and tag it accordingly in the XML.
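To make the idea of detecting a list type from its label more concrete, here is a minimal sketch in Python. It is not how eXtyles itself works internally; the label patterns and the function name detect_list_type are assumptions made up for illustration, and the returned values simply mirror common JATS list-type values.

```python
import re

# Hypothetical label patterns; a real configuration would likely cover many more cases.
BULLET_LABEL = re.compile(r"^[•◦▪*-]\s+")
NUMBER_LABEL = re.compile(r"^\(?\d+[.)]\s+")


def detect_list_type(paragraph_text: str) -> str:
    """Guess the list type from the label that starts a list paragraph."""
    text = paragraph_text.lstrip()
    if BULLET_LABEL.match(text):
        return "bullet"
    if NUMBER_LABEL.match(text):
        return "order"
    # No recognizable label: fall back to an unlabeled list.
    return "simple"
```

With a check like this, a single list paragraph style is enough: the label at the start of each item, not the style name, tells you whether the list is bulleted or numbered.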

We also often recommend or require adding styles based on our analysis of your content, your style libraries, and your output target (e.g., JATS or BITS XML).

There are several reasons it makes sense to keep the number of styles as small as you can: sprawling style libraries can invite confusion, increase the risk of human error, and slow down the styling process. Working with us on your eXtyles configuration is a great opportunity to review your entire library of styles and look at which ones it makes sense to keep, which you can say goodbye to, and whether there are any new styles you need to add!

Why would you need to add styles? In Part 1 of this series, we talked about the importance of distinguishing between what things are and what things look like. Semantic structure is all about the former, and this is where design-driven style templates in Word can start to create problems.

For example, in many journal designs, front-matter headings (for abstracts, lists of abbreviations, etc.) and back-matter headings (for appendices, reference lists or bibliographies, endnotes, etc.) are visually identical to level 1 headings in the body of the article. If your goal is to produce consistently formatted PDF files, it makes perfect sense to use, say, H1 or Heading1 in all these cases: you get the visual results you want, and you avoid style sprawl.

But when you move from a design-driven workflow to a structure-driven workflow using XML, you need to think about this decision differently, because even if front-matter headings, level 1 headings, and back-matter headings look the same, their function is different.
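A short sketch may help show why function matters more than appearance. The Python below uses three hypothetical paragraph style names (AbstractHead, Head1, and RefHead are invented for this example, not eXtyles defaults) and places each heading in a different part of a simplified JATS-like article, even though all three might look identical on the page.

```python
import xml.etree.ElementTree as ET

# Hypothetical Word style names -> path of JATS elements that wrap the heading.
STYLE_TO_PATH = {
    "AbstractHead": ["front", "article-meta", "abstract"],
    "Head1": ["body", "sec"],
    "RefHead": ["back", "ref-list"],
}


def build_article(paragraphs):
    """Place each (style, text) heading where its function says it belongs."""
    article = ET.Element("article")
    regions = {name: ET.SubElement(article, name) for name in ("front", "body", "back")}
    for style, text in paragraphs:
        path = STYLE_TO_PATH[style]
        parent = regions[path[0]]
        # Reuse intermediate wrappers (e.g. article-meta) if they already exist.
        for tag in path[1:-1]:
            found = parent.find(tag)
            parent = found if found is not None else ET.SubElement(parent, tag)
        container = ET.SubElement(parent, path[-1])
        ET.SubElement(container, "title").text = text
    return article


article = build_article([
    ("AbstractHead", "Abstract"),
    ("Head1", "Introduction"),
    ("RefHead", "References"),
])
print(ET.tostring(article, encoding="unicode"))
```

If all three headings carried the same style, nothing in the Word file would say which one belongs in the front matter, which in the body, and which in the back matter.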

Making your content flexible—that is, setting it up to be published in a variety of formats and to be revisited and repurposed in the future—means getting serious about understanding structure and applying it consistently and accurately. Using Word styles in a way that clarifies the structure of your content rather than obscuring it can be a key first step!

How eXtyles can help

When our configuration team sets up eXtyles for your organization, we’ll work with your existing style template as part of our efforts to make your workflow transition as smooth as possible. We’ll likely suggest deleting or consolidating some styles, adding some new ones, and perhaps renaming a few whose legacy names no longer match their current (or future!) use in your documents and might therefore cause confusion.

Our goal is that once your eXtyles implementation is up and running, you’ll have a set of Word paragraph styles that

  • Meets the structural needs of your content
  • Is friendly to the eXtyles users in your organization
  • Does not depart too drastically from what in-house and freelance editors expect
  • Can be accurately mapped to your chosen DTD during XML export
  • Produces effective outputs for page layout and other content transformations

Your Word template will also include a substantial number of character styles. Most of these will be applied automatically by various eXtyles processes:

  • Styles that identify structural elements within a reference entry, such as bib_article and bib_fname (applied by eXtyles Bibliographic References)
  • Styles that identify in-text citations or callouts that will be exported as internal links, such as cite_fn and xref (applied by eXtyles Citation Matching)
  • Styles that will be exported as external links, such as url and bib_url (applied by eXtyles URL Checking) or bib_doi (applied by eXtyles Crossref Linking and Correction)
  • Styles that identify elements of author metadata, such as au_fname and au_deg (applied by eXtyles Author Processing)

An important point to remember: the eXtyles export filter can map Word paragraph and character styles to XML elements irrespective of whether the names match! In the examples below, the Word paragraph style RptTitle is mapped to the BITS element <book-title> (top), and the Word character style cite_bib is mapped to the BITS element <xref> (bottom).

Screenshots illustrating how paragraph and character styles in a Word file map onto elements in an XML file
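The mapping described above can be pictured as a simple lookup table. The following Python sketch is only an illustration of the idea, not the eXtyles export filter: the two dictionaries and the export_run function are hypothetical, and a real <xref> would normally also carry attributes (such as the target of the link) that are omitted here.

```python
from xml.sax.saxutils import escape

# Style names on the left, XML element names on the right; they need not match.
PARAGRAPH_STYLE_MAP = {"RptTitle": "book-title"}
CHARACTER_STYLE_MAP = {"cite_bib": "xref"}


def export_run(style: str, text: str, style_map: dict) -> str:
    """Wrap styled text in the XML element that its Word style maps to."""
    element = style_map[style]
    # Escape &, <, and > so the text is safe inside XML.
    return f"<{element}>{escape(text)}</{element}>"
```

Because the lookup is explicit, a style can keep a name that is familiar to your editors while still exporting to whatever element your chosen DTD requires.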

In Part 3, we’ll discuss how to overcome some of the most common challenges and pain points in moving from a format mindset to a structure mindset.