How can I only return a few metadata fields instead of all of them when I look up a DOI?

Hi! I’m not a developer, but I’m so excited about the possibilities with Crossref! Here’s what I need help with:

/works/10.1007/978-3-540-68161-8_1

is working great for me to get the metadata from lists of DOIs, but I don’t need ALL the metadata.

On the documentation page under funders I was reading that

Crossref metadata records can be quite large. Sometimes you just want a few elements from the schema. You can “select” a subset of elements to return using the select parameter. This can make your API calls much more efficient. For example:

/works?select=DOI,prefix,title

Really what I need is title, author, published, and I found those elements in the list in the link above.

I thought I could build a query like this:

/works/{doi}/works?select=DOI,prefix,title

but that structure is returning an error.

I also thought maybe the parameters are specified simply by replacing “agency” below with the metadata field I need,

/works/{doi}/agency

to

/works/{doi}/title

but that didn’t work either!

If I understand correctly, those requests are all with the REST API, and I was also experimenting with OpenURL, something like:

/openurl?pid={email}&format=unixref&id=doi%3A10.1577%2FH02-043&noredirect=true

or DOI-to-metadata query requesting UNIXSD results

/servlet/query?pid={email}&format=unixsd&id=10.1577%2FH02-043

but I am running into the same problem - how do I narrow down the metadata fields returned?

Hello, and thanks for your question.

I think the ‘select’ parameter in the REST API is what you’re after, but you need to narrow down which DOIs/works you’re looking for before you use it.

Select won’t work if you request an individual DOI directly, like 10.1007/978-3-540-68161-8_1.
In that case, you can only get the full record using
https://api.crossref.org/works/10.1007/978-3-540-68161-8_1

But, for example, say you wanted to get metadata for all DOIs in that journal “Beiträge zum ausländischen öffentlichen Recht und Völkerrecht”.

You could filter by its ISSN and then use the select parameter like this:
https://api.crossref.org/works?filter=issn:0172-4770&select=title,author,published&rows=1000

That will return the article title, authors, and publication date for the first 1000 records out of a total of 2765 records containing that journal’s ISSN, 0172-4770.

To page through to the subsequent records, you’d need to use the cursor or offset parameters, because there’s a limit of 1000 records per request.
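For anyone who ends up scripting this, here is a minimal Python sketch of cursor paging over the ISSN query above. It assumes the public `https://api.crossref.org` host, that the first request passes `cursor=*`, and that each response carries a `next-cursor` value inside `message`. The function names (`build_page_url`, `fetch_json`, `iter_works`) are made up for illustration and are not part of any Crossref library.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.crossref.org/works"  # assumed public API host


def build_page_url(issn, fields, cursor="*", rows=1000):
    """Build one page request: filter by ISSN, select a few fields, pass a cursor."""
    params = {
        "filter": f"issn:{issn}",
        "select": ",".join(fields),
        "rows": rows,
        "cursor": cursor,
    }
    return API_BASE + "?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Download a URL and decode the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def iter_works(issn, fields, fetch=fetch_json, rows=1000):
    """Yield every record for an ISSN, following next-cursor until a page is empty."""
    cursor = "*"
    while True:
        message = fetch(build_page_url(issn, fields, cursor, rows))["message"]
        items = message.get("items", [])
        if not items:
            break
        yield from items
        cursor = message.get("next-cursor")
        if cursor is None:
            break
```

Something like `list(iter_works("0172-4770", ["title", "author", "published"]))` would then collect all of the journal’s records rather than only the first 1000. The `fetch` argument is there only so the loop can be tried out without touching the network.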

If you want to narrow down the content you’re looking for in other ways than by journal, let me know, and I’ll be happy to run through a few other examples.

You also might find these posts useful:
Getting Started with REST API Queries
Using Postman for API Queries

-Shayn

Thanks for the question Morgan, and thanks for the response Shayn!

I’m following up just to say that I thought of a workaround for using the select parameter for single DOIs.

If you request a DOI using the filter=doi parameter, you can add the select parameter as well. For example: https://api.crossref.org/works?filter=doi:10.1007/978-3-540-68161-8_1&select=DOI,prefix,title

The full list of “selectable” metadata elements is listed in the error message when an invalid select value is requested, e.g. https://api.crossref.org/works?select=asdf
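If you later want to script the filter=doi workaround instead of pasting URLs by hand, a rough Python sketch could look like this. It assumes the public `https://api.crossref.org` host and that the matching record comes back as the first entry of `message.items`. The helper names `single_doi_url` and `fetch_selected` are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def single_doi_url(doi, fields):
    """Build a /works query that filters to one DOI and selects only some fields."""
    params = {"filter": f"doi:{doi}", "select": ",".join(fields)}
    return "https://api.crossref.org/works?" + urllib.parse.urlencode(params)


def fetch_selected(doi, fields):
    """Return the trimmed record for one DOI, or None if nothing matched."""
    with urllib.request.urlopen(single_doi_url(doi, fields)) as response:
        items = json.load(response)["message"]["items"]
    return items[0] if items else None
```

For example, `fetch_selected("10.1007/978-3-540-68161-8_1", ["title", "author", "published"])` would request only those three fields for that DOI.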

THANKS!!! The workaround is perfect, since I’m using a Google Sheets plugin called “API Connector,” and I can plug in hundreds of requests for individual DOIs all at once to then populate rows corresponding to each DOI! I am so happy to have the list from the error message, thanks for including that too! For future researchers who might want to do something similar, here are the selectable metadata elements:

abstract, URL, member, posted, score, created, degree, update-policy, short-title, license, ISSN, container-title, issued, update-to, issue, prefix, approved, indexed, article-number, clinical-trial-number, accepted, author, group-title, DOI, is-referenced-by-count, updated-by, event, chair, standards-body, original-title, funder, translator, published, archive, published-print, alternative-id, subject, subtitle, published-online, publisher-location, content-domain, reference, title, link, type, publisher, volume, references-count, ISBN, issn-type, assertion, deposited, page, content-created, short-container-title, relation, editor
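In case it helps anyone doing the same thing outside a spreadsheet plugin, here is a small Python sketch that flattens one returned record into a row with title, authors, and published date. It relies on the usual shape of the JSON output: `title` is a list of strings, `author` is a list of objects with `given`/`family` (or `name` for organizations), and `published` holds a `date-parts` list. The function name `flatten_work` and the sample record are invented for illustration. Note that the row’s DOI column is only filled if DOI is included in the select list.

```python
def flatten_work(work):
    """Turn one Crossref work record into a flat row of strings."""
    titles = work.get("title") or []
    authors = work.get("author") or []
    names = [
        # Personal names use given/family; organizations use a single "name".
        " ".join(p for p in (a.get("given"), a.get("family")) if p) or a.get("name", "")
        for a in authors
    ]
    date_parts = (work.get("published") or {}).get("date-parts") or [[]]
    # date-parts may hold year only, year-month, or year-month-day.
    published = "-".join(f"{part:02d}" for part in date_parts[0] if part is not None)
    return {
        "DOI": work.get("DOI", ""),
        "title": titles[0] if titles else "",
        "authors": "; ".join(names),
        "published": published,
    }


# Made-up sample record, not real Crossref data.
sample = {
    "DOI": "10.1234/example",
    "title": ["An Example Title"],
    "author": [{"given": "Ada", "family": "Lovelace"}, {"name": "Example Consortium"}],
    "published": {"date-parts": [[2008, 5]]},
}
row = flatten_work(sample)
```

Each row could then be written out with Python’s `csv.DictWriter` and opened in a spreadsheet.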

Is there somewhere with descriptions/explanations of these elements?

Thanks @ppolischuk, that’s a good reminder! I forgot about the trick of filtering by a DOI.

“Is there somewhere with descriptions/explanations of these elements?”

Unfortunately, we don’t have a full metadata output schema for the REST API yet, or documentation that defines every field. Some of the API-specific elements (for example, the difference between the published, updated, indexed, and created dates) are clarified in the API documentation.

Most of the bibliographic metadata elements are straightforward (e.g. abstract, title, ISSN), but for others, you may need to look to our metadata input schema.

The problem is that, because the metadata is input in XML and output (in the REST API) in JSON, the metadata elements aren’t always named exactly the same. For example, ‘license’ in the JSON output is <license_ref> in the XML input, and ‘container-title’ in the JSON is <full_title> in the XML input for journals but a differently named element in the XML input for books, proceedings, etc.

But if you have any questions about specific metadata elements, just let us know, and we’ll be happy to clarify.