Centre for Science and Technology Studies, 2333AL Leiden, Zuid Holland

What do we lose when MAG goes away?

Bianca Kramer and Cameron Neylon

Fri 3 Sep 2021 | 15:00 - 16:15 (CEST)

Download the recording of this Seminar

Code and data on GitHub

Presentation on GitHub

Microsoft announced in June that the Microsoft Academic Graph (MAG) product will be retired at the end of 2021. MAG has become an important resource for many who are exploring how open data might replace existing proprietary data sources, as well as for the exploratory development of new capabilities. It provides wide disciplinary coverage and some of the most comprehensive metadata of any freely available and openly licensed dataset.

A common strategy for many projects and services (e.g. Unpaywall, Lens, Semantic Scholar, the Curtin Open Knowledge Initiative and others) has been to use MAG to enrich Crossref metadata. Crossref metadata is open, but does not have the same coverage as MAG, both in terms of objects (e.g. books and articles without DOIs) and metadata completeness (e.g. the proportion of objects with associated affiliations, abstracts or citations). MAG also provides specific elements that Crossref does not, including a subject classification, which is useful in many contexts.
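As a rough illustration of the enrichment strategy described above, the following Python sketch fills missing Crossref fields from MAG records matched on DOI and compares metadata completeness before and after. It is a minimal sketch under stated assumptions, not the speakers' method: the record layout, the field names, the DOIs and the toy data are all hypothetical, and it only covers objects that have a DOI in both sources.

```python
from typing import Dict, Optional

# Hypothetical field names chosen for illustration only.
FIELDS = ("affiliations", "abstract", "citations")

Record = Dict[str, Optional[object]]


def enrich(crossref: Dict[str, Record], mag: Dict[str, Record]) -> Dict[str, Record]:
    """Fill fields that are empty in Crossref with MAG values matched on DOI."""
    enriched = {}
    for doi, record in crossref.items():
        extra = mag.get(doi, {})
        # Prefer the Crossref value; fall back to MAG when it is missing or empty.
        enriched[doi] = {f: record.get(f) or extra.get(f) for f in FIELDS}
    return enriched


def coverage(records: Dict[str, Record]) -> Dict[str, float]:
    """Proportion of records with a non-empty value for each field."""
    total = len(records)
    if total == 0:
        return {f: 0.0 for f in FIELDS}
    return {f: sum(1 for r in records.values() if r.get(f)) / total for f in FIELDS}


# Made-up records keyed by made-up DOIs (assumption: both sources share DOIs).
crossref = {
    "10.1234/a": {"affiliations": None, "abstract": "text", "citations": None},
    "10.1234/b": {"affiliations": ["Univ X"], "abstract": None, "citations": None},
}
mag = {
    "10.1234/a": {"affiliations": ["Univ Y"], "abstract": None, "citations": 12},
}

before = coverage(crossref)
after = coverage(enrich(crossref, mag))
# Per-field share of records whose metadata exists only thanks to MAG.
gap = {f: after[f] - before[f] for f in FIELDS}
```

The per-field difference in `gap` gives one simple way to express how much completeness depends on MAG. Note that this sketch treats a citation count of zero as missing, and it says nothing about objects without DOIs, which would need a different matching approach.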

In this talk we will discuss our strategy and initial results examining what we will lose when MAG is retired. While the existing data will remain available, as it was provided as open data, it will rapidly become stale. The completeness of Crossref metadata provided by publishers is improving, but not at a rate that will replace the lost MAG coverage within the next few years. In addition, MAG data themselves are also not complete, and there are gaps not filled by either Crossref or MAG.

How big are the gaps? How can they be measured? And what strategies could be developed to fill them? What risks are there for the various projects that are offering to replace MAG, and for those who rely on them? What other sources of open metadata can play a role? And perhaps most importantly, what prospects are there for leveraging the plans of Crossref and others to provide a route for community-sourced data to develop long-term solutions for the provision of trusted, sustainable and accurate open metadata for the future?


  • Bianca Kramer, Utrecht University Library, Utrecht University, Utrecht, Netherlands.
  • Cameron Neylon, Centre for Culture and Technology, School of Media, Creative Arts and Social Inquiry, Curtin University, Western Australia.