I’m using the event data which, in theory, is a great resource. However, I’m finding lots of errors and I’m struggling to understand how these inaccuracies are finding their way into the data. Is there somewhere to report problems?
I have examples I can post, but I didn’t want to post too much information here.
Thanks, could you post a couple of examples here so I can understand a bit better the kind of problems you’re talking about? Some of it is inherent to the data and there’s not too much we can do about it, but we’d like to know if there’s something going wrong when we process and display events.
Here are a couple of examples, the titles of the referencing articles are what flagged the issue: they are completely outside the domain of the published datasets:
Example 1. Events for object DOI 10.5285/ecb17680-da2e-49ae-b250-2d04a6a08d2a
This is a dataset entitled: Survival and performance of Speckled Wood butterflies in relation to microclimate
API call: https://0-api-eventdata-crossref-org.pugwash.lib.warwick.ac.uk/v1/events?from-occurred-date=2022-09-01&obj-id=10.5285/ecb17680-da2e-49ae-b250-2d04a6a08d2a
Extract of results (we aren’t interested in the wiki links but I include some here to illustrate the problem):
Knowledge as a Factor Associated with Lifestyle in Controlling Hypertension (doi:10.31965/infokes.vol20.iss2.930)
Atmospheric air pollution of the largest cities of the Volyn region: prerequisites, consequences and ways to solve the problem (doi:10.26565/2410-7360-2022-56-16)
Wikipedia article: Catalan countries (https://en.wikipedia.org/w/index.php?title=Catalan_Countries&oldid=1114129809)
Wikipedia article: Dionisio Arango Mejía (https://es.wikipedia.org/w/index.php?title=Dionisio_Arango_Mej%C3%ADa&oldid=145743414)
Example 2. Events for object DOI 10.5285/507a5e1f-e056-454c-8ff6-d185f3da8556
This is a dataset entitled: Water chemistry, hydrology and fluvial carbon data for two Amazonian small streams
API call: https://0-api-eventdata-crossref-org.pugwash.lib.warwick.ac.uk/v1/events?from-occurred-date=2022-09-01&obj-id=10.5285/507a5e1f-e056-454c-8ff6-d185f3da8556
Extract of results:
Empirical evidence of management control system in the emerging market (doi:10.22495/cbsrv3i2art10)
Reflections on the absence of investigative skills in accounting and auditing students (doi:10.24142/rvc.n26a4)
Predicting the Financial Failures of Manufacturing Companies Trading in the Borsa Istanbul (2007-2019) (doi:10.4236/jfrm.2021.104023)
Inflaming public debate: a methodology to determine origin and characteristics of hate speech about sexual and gender diversity on Twitter (doi:10.3145/epi.2023.ene.06)
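For anyone wanting to build similar extracts, here is a minimal Python sketch that groups the subjects in an events response by source and skips the Wikipedia ones. It assumes the JSON has the events under `message` → `events`, each with `source_id` and `subj_id` fields; the function name, the sample data and the `some-other-source` label are made up for illustration and are not real API output.

```python
from collections import defaultdict


def summarise_events(response, skip_sources=("wikipedia",)):
    """Group subject IDs by source_id, skipping unwanted sources.

    Assumes the response layout {"message": {"events": [...]}}.
    """
    events = response.get("message", {}).get("events", [])
    by_source = defaultdict(list)
    for event in events:
        source = event.get("source_id", "unknown")
        if source in skip_sources:
            continue
        by_source[source].append(event.get("subj_id"))
    return dict(by_source)


# Hand-written sample shaped like a response (not real output).
sample = {
    "message": {
        "events": [
            {
                "source_id": "wikipedia",
                "subj_id": "https://en.wikipedia.org/w/index.php?title=Catalan_Countries&oldid=1114129809",
                "obj_id": "https://doi.org/10.5285/ecb17680-da2e-49ae-b250-2d04a6a08d2a",
            },
            {
                "source_id": "some-other-source",  # hypothetical label
                "subj_id": "https://doi.org/10.31965/infokes.vol20.iss2.930",
                "obj_id": "https://doi.org/10.5285/ecb17680-da2e-49ae-b250-2d04a6a08d2a",
            },
        ]
    }
}

print(summarise_events(sample))
```

Grouping by source makes it easier to see which kind of event the out-of-domain subjects come from.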
Many thanks for the examples. They look to be related to how we interpret events that we find using domains. In short, we look for domains on websites that match those we know publishers use (e.g. http://revista.religacion.com), then use several methods to determine a DOI based on the webpage.
In all of the cases above, the method used was ‘landing-page-meta-tag’, and it ended up matching the wrong DOI.
I’ve created a ticket for us to look into it. If you’d like to add other examples there please go ahead.
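To give a rough idea of what a meta-tag lookup of this kind involves, here is a small Python sketch that pulls DOI-like strings out of a landing page's `<meta>` tags. This is not our actual implementation: the tag names (`citation_doi`, `dc.identifier`), the DOI pattern and the function names are assumptions chosen for illustration.

```python
import re
from html.parser import HTMLParser

# Loose DOI pattern and meta tag names; both are illustrative assumptions.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")
DOI_META_NAMES = {"citation_doi", "dc.identifier"}


class MetaDoiParser(HTMLParser):
    """Collect DOI-like values from <meta name=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.dois = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in DOI_META_NAMES:
            match = DOI_PATTERN.search(attr.get("content") or "")
            if match:
                self.dois.append(match.group(0))


def dois_from_landing_page(html):
    """Return DOI-like strings found in the page's meta tags, in order."""
    parser = MetaDoiParser()
    parser.feed(html)
    return parser.dois


# Made-up page with placeholder DOIs.
page = (
    '<html><head>'
    '<meta name="citation_doi" content="10.1234/example.1">'
    '<meta name="DC.Identifier" content="doi:10.1234/example.2">'
    '</head></html>'
)
print(dois_from_landing_page(page))
```

A lookup like this trusts whatever the page declares, so a page whose meta tags carry a DOI belonging to some other item would plausibly produce a wrong match of the kind described above.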