eXtyles and preprint citations

Back in March 2020, the exploding number of preprints—and citations to those preprints—in response to the COVID-19 pandemic posed an urgent new challenge to eXtyles.

We’re happy to announce that starting with Build 4550, we’ve updated eXtyles to handle preprint citations in your reference lists!

How it works

eXtyles parses references to preprints, but doesn’t restructure them. (This is similar to how eXtyles handles references to conference proceedings.)

That’s because guidelines for citing preprints are still maturing, which means that it’s safer to leave decisions about which elements to include and how to format those elements to the judgement of experienced human editors. It’s also because we see wide variations in the citation data submitted by authors, and we can’t always count on finding all the elements necessary to create a complete citation.

If your eXtyles configuration includes Crossref Linking and Correction, you’ll notice that eXtyles now queries Crossref for preprints, just as it does for other types of references, and retrieves the DOI for any preprint that has one deposited with Crossref.

Here’s what you should see:

  • If Crossref returns a preprint DOI, eXtyles will insert it.
  • If Crossref returns a DOI for the published version instead—which is something we’re seeing quite often—eXtyles can still insert the DOI for the preprint, as long as the preprint and article metadata are correctly linked in Crossref.
  • If the Crossref metadata indicates a final publication DOI for the preprint, eXtyles will also add a comment with the DOI and other publication information for the journal article, so that you can see what’s happened to the reference since the author cited the preprint.
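As a rough illustration of the three cases above, here is a minimal Python sketch. It is not eXtyles code. It assumes records shaped like Crossref REST API work records, where preprints have the type `posted-content` and linked records carry `is-preprint-of` or `has-preprint` entries under `relation`. The function names, the comment wording, and the `10.5555` DOIs are hypothetical placeholders.

```python
def resolve_preprint_dois(record):
    """Return (preprint_doi, published_doi) for one Crossref-style work record.

    Either value is None when the record does not supply it.
    """
    relation = record.get("relation", {})

    def related_doi(kind):
        # Take the first related DOI of the given relation kind, if any.
        for item in relation.get(kind, []):
            if item.get("id-type") == "doi":
                return item.get("id")
        return None

    if record.get("type") == "posted-content":
        # The match is the preprint itself; look for a linked published version.
        return record.get("DOI"), related_doi("is-preprint-of")
    # The match is the published version; look for a linked preprint.
    return related_doi("has-preprint"), record.get("DOI")


def annotate(record):
    """Decide which DOI to insert and whether to add a comment (hypothetical)."""
    preprint_doi, published_doi = resolve_preprint_dois(record)
    comment = f"Published version: doi:{published_doi}" if published_doi else None
    return {"insert_doi": preprint_doi, "comment": comment}


# Placeholder records: a preprint linked to its article, and an unlinked article.
linked_preprint = {
    "type": "posted-content",
    "DOI": "10.5555/preprint.1",
    "relation": {"is-preprint-of": [{"id-type": "doi", "id": "10.5555/article.1"}]},
}
unlinked_article = {"type": "journal-article", "DOI": "10.5555/article.1"}

print(annotate(linked_preprint))
print(annotate(unlinked_article))
```

In this sketch, the published article’s DOI is never inserted as the preprint’s DOI. If the match is the article and no linked preprint is recorded, the result carries only a comment.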

The XML of preprint citations exported from eXtyles to JATS complies with JATS4R recommendations for citing preprints.
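For readers unfamiliar with JATS, the sketch below builds a preprint citation with Python’s standard `xml.etree.ElementTree`. It illustrates the general shape only and is not actual eXtyles output. It assumes the citation is tagged as an `element-citation` with `publication-type="preprint"`, with the preprint server name in `source`, which is our reading of the JATS4R recommendation; check the recommendation itself for details. The function name and the DOI are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET


def build_preprint_citation(surname, given_names, title, server, year, doi=None):
    """Build a JATS-style element-citation for a preprint (illustrative only)."""
    citation = ET.Element("element-citation", {"publication-type": "preprint"})
    group = ET.SubElement(citation, "person-group", {"person-group-type": "author"})
    name = ET.SubElement(group, "name")
    ET.SubElement(name, "surname").text = surname
    ET.SubElement(name, "given-names").text = given_names
    ET.SubElement(citation, "article-title").text = title
    # The preprint server takes the place a journal title would normally occupy.
    ET.SubElement(citation, "source").text = server
    ET.SubElement(citation, "year").text = year
    if doi:
        ET.SubElement(citation, "pub-id", {"pub-id-type": "doi"}).text = doi
    return citation


citation = build_preprint_citation(
    surname="Dunham",
    given_names="I",
    title=(
        "FORGE: A tool to discover cell specific enrichments of "
        "GWAS associated SNPs in regulatory regions"
    ),
    server="bioRxiv",
    year="2014",
    doi="10.5555/preprint.1",  # placeholder, not the real DOI for this preprint
)
xml_text = ET.tostring(citation, encoding="unicode")
print(xml_text)
```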

Crossref linking and preprints

When you run Crossref Linking and Correction, you may see inconsistent results for preprints. eXtyles does its best, but the results depend on how Crossref resolves link queries. Unfortunately, the same query can return different matches from one run to the next, and Crossref doesn’t always have complete metadata.

To illustrate, here’s an example of an input reference and the “ideal” output, in which eXtyles has added the preprint DOI to the reference and the comment gives information about the final publication as a journal article:

Input

1. Dunham I. FORGE: A tool to discover cell specific enrichments of GWAS associated SNPs in regulatory regions. bioRxiv. 2014

Output

Same reference entry as above, parsed into JATS reference elements, with the preprint DOI added and a comment giving publication info (including DOI) for the subsequent published version.

In this case, eXtyles can provide both DOIs, because the preprint server and the publisher have provided all the necessary metadata and Crossref has correctly cross-referenced the two DOIs.

However, we also see cases like this one:

Input

2. Lou B, Li T, Zheng S, et al. Serology characteristics of SARS-CoV-2 infection since the exposure and post symptoms onset. medRxiv. Preprint posted March 27, 2020.

Output A (final publication info missing)

Same reference entry as above, parsed into JATS reference elements, with the preprint DOI added but no final publication info.

Output B (preprint DOI missing)

Same reference entry as above, parsed into JATS reference elements, with no DOI added; final publication info is given in a comment.

Outputs A and B come from submitting exactly the same input to Crossref in two separate queries, less than a minute apart.

In Output A, eXtyles was able to find the correct DOI for the preprint, but because the preprint and the published journal article aren’t linked in the Crossref database, there’s no comment providing information about the published article.

In Output B, eXtyles has instead found the DOI for the published journal article in the Crossref database. eXtyles is smart enough to recognize that this isn’t the correct DOI for the preprint, so it inserts a comment instead of just adding the DOI to the reference. But since the preprint and the final publication aren’t linked in the Crossref database, eXtyles can’t add the preprint DOI.

Why now?

John Shaw of Sage recently commented that preprints are “a mature process that’s not at all mature.” We had originally hoped to have this update go live in April, but ran smack into John’s statement: we’ve found a more varied and inconsistent data set, both from authors and at Crossref, than we originally anticipated!

Read all about what we learned from our development work in this Scholarly Kitchen guest post.

Thanks to the work of our Development team, we’re now able to deploy a robust solution for preprint citations.

Questions? Need more information? Contact us at [email protected]!