Shortcut: WD:WikiProject Source

Wikidata:WikiCite


WikiCite WikiProject (formerly WikiProject Source Metadata)

WikiCite: creating a shared bibliographic repository for all Wikimedia projects
Bibliographic data form a big and well-connected part of Wikidata
Example query based on such metadata: Gallery of authors of scientific articles that have been published on this day of the year
A representation of how different aspects of the WikiCite ecosystem fit together
The world needs citations, and citations need source metadata.
The English word "citation" sung by a soprano singer
Metadata on the way from a physical copy of a book into Inventaire, a Wikidata front-end
Various tools exist to assist with the entry and curation of metadata, e.g. the Wikidata Edit Framework depicted here.
Topics co-occurring with Zika virus (Q202864)
Usage history of some key Wikidata properties around bibliographic and citation data as of January 2019.

The aims of WikiCite are:

  • to act as a hub for work in Wikidata involving citation data and bibliographic data as part of the broader WikiCite initiative.
  • to define a set of properties that can be used by citations, infoboxes, and Wikisource.
  • to map and import all relevant metadata that currently is spread across Commons, Wikipedia, and Wikisource.
  • to establish methods to interact with this metadata from different projects.
  • to create a large open bibliographic database within Wikidata.
  • to reveal, build, and maintain community stakeholdership for the inclusion and management of source metadata in Wikidata.

There have been various proposals over the years for similar projects (see meta:WikiCite for details). Now that Wikidata is here, we can make it happen.

Current activities

Past activities

  • A WikiCite Roadmap of the future of bibliographic data in Wikidata has been discussed.
  • so9q worked on a Python bot that follows the Wikimedia eventstream and lists the DOIs and ISBNs found in the Wikipedia articles mentioned in the stream. The next step is to import them if the bot request is approved.
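As a rough illustration of the DOI-listing idea described above, the following is a minimal sketch and not the bot's actual code. It assumes the article wikitext is already available as a string. The regular expression is a heuristic assumption rather than an official DOI grammar, and the name `extract_dois` is hypothetical. ISBN detection and the eventstream connection are left out.

```python
import re

# Heuristic pattern (an assumption): "10.", a 4-9 digit registrant code, a slash,
# then a suffix that stops at whitespace or common wikitext delimiters.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s|}\]<>"]+')


def extract_dois(wikitext: str) -> list[str]:
    """Return the distinct DOIs found in wikitext, in order of first appearance."""
    found = []
    for match in DOI_PATTERN.findall(wikitext):
        # Drop trailing punctuation and upper-case the value, matching the
        # upper-case DOI style shown in the reference examples on this page.
        doi = match.rstrip(".,;").upper()
        if doi not in found:
            found.append(doi)
    return found


sample = "{{cite journal |doi=10.3897/zookeys.66.683 |title=x}} {{doi|10.1371/journal.pone.0029797}}"
print(extract_dois(sample))
```

Any DOI found this way would still need to be checked against existing Wikidata items before import, to avoid creating duplicates.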

Ongoing imports

Properties

See this subpage for more details.

Projects

Examples

Timeline (from 1952 till early 2016) of Wikidata items with publication date (P577) and with main subject (P921) being set to Zika virus (Q202864) and/ or Zika fever (Q8071861), as per this Wikidata list
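
A timeline like the one described in the caption above could be fed by a query along the following lines. This is a sketch under stated assumptions, not the query behind the original list: the helper name `build_timeline_query` is hypothetical, and the query relies on the prefixes (`wd:`, `wdt:`, `wikibase:`, `bd:`) that the Wikidata Query Service predefines.

```python
def build_timeline_query(subject_qids: list[str]) -> str:
    """Build a SPARQL query for items with a publication date (P577)
    whose main subject (P921) is any of the given items."""
    values = " ".join(f"wd:{qid}" for qid in subject_qids)
    return f"""
SELECT ?item ?itemLabel ?date WHERE {{
  VALUES ?subject {{ {values} }}
  ?item wdt:P921 ?subject ;
        wdt:P577 ?date .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY ?date
""".strip()


# Zika virus (Q202864) and Zika fever (Q8071861), as in the caption above.
print(build_timeline_query(["Q202864", "Q8071861"]))
```

The printed query can be pasted into the Wikidata Query Service. An item with both main subjects, or with several publication dates, will appear more than once in the results.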
References in a Wikipedia article

Here is an example that creates a reference list for the articles identified by the Wikidata items named in the following code:

  • {{#invoke:Cite | reflist | Q14405740 Q13416617 Q20058533 Q15567682 }}

This results in:

  • Wulf D. Schleip and Mark O'Shea, "Annotated checklist of the recent and extinct pythons (Serpentes, Pythonidae), with notes on nomenclature, taxonomy, and distribution", ZooKeys, vol. 66, 66, , doi: 10.3897/ZOOKEYS.66.683, PubMed ID: 21594030 , PubMed Central ID: 3088416 , Creative Commons Attribution 3.0 Unported
  • Stefan Martin Schmid, Bernhard Fügenschuh and Eduard Kissling, "Tectonic map and overall architecture of the Alpine orogen", Swiss Journal of Geosciences, vol. 97, 1, , doi: 10.1007/S00015-004-1113-X
  • Rudolf Jung, "Uffenbach, Zacharias Konrad von", Allgemeine Deutsche Biographie, 39th volume, vol. 39,
  • Eric N. Rittmeyer, Allen Allison, Michael C. Gründler, Derrick K. Thompson and Christopher C. Austin, "Ecological guild evolution and the discovery of the world's smallest vertebrate", PLOS One, vol. 7, 1, , doi: 10.1371/JOURNAL.PONE.0029797, PubMed ID: 22253785 , PubMed Central ID: 3256195 , Creative Commons Attribution 2.5 Generic
  • «ГОСТ» examples:

    Tasks

    For a list of specific tasks and todos (missing data, missing properties, cleanup tasks) see /ToDo

    Workflow for profiling researchers

    How to create a scholarly profile for a researcher in Wikidata

    1. Consider the platform
      1. Visit Wikidata
        1. Wikidata is the database which anyone can edit
        2. The Wikidata community curates this data
      2. Consider WikiCite
        1. WikiCite is the community project within Wikidata which curates source metadata
        2. The WikiCite community is a subset of the Wikidata community
      3. Consider how anyone accesses data
        1. Scholia is the specialized Wikidata tool for viewing academic profiles of people, topics, universities, etc.
          1. If a profile looks good in Scholia, then the data is correctly formatted to be maximally open and accessible in Wikidata and the Semantic Web
          2. Making a profile look good in Scholia is the quickest and easiest way to format data once and for all
        2. The Wikidata Query Service is the general Wikidata tool for viewing groups of Wikidata content
        3. Everyone else, including big tech, big publishing, and big government, scrapes Wikidata and reuses this content, so what is in Wikidata goes everywhere else
    2. Identify or create the Wikidata item for the researcher to profile
      1. use basic Wikidata search by the person's name
        1. if the item for the person exists, then use it
        2. if the item does not exist, then create it
          1. follow the instructions for creating a profile for a human in Wikidata:WikiProject Biographies
          2. add enough information to uniquely identify this person by name and a few other characteristics
        3. If there is ambiguity because multiple people have the same name and characteristics, then create a new item. Items can be merged, and merging duplicates is easier to fix than separating mixed items.
    3. Try to add the ORCID, which is a unique identifier for researchers
      1. visit https://orcid.org/
      2. search for the researcher
      3. if there is an easy and obvious match, then grab the ORCID
        1. go back to Wikidata
        2. click "add statement", enter ORCID, paste the ORCID, publish
        3. run ORCIDator, a Wikidata tool to import ORCID data into Wikidata
          1. Access through the SourceMD tool - https://www.wikidata.org/wiki/Wikidata:SourceMD
          2. further documentation at https://www.wikidata.org/wiki/Wikidata:ORCIDator
      4. there is often no ORCID, or the ORCID record is blank, or the match is ambiguous; skip this step if so
    4. Use the "Wikidata Author Disambiguation tool" at https://tools.wmflabs.org/author-disambiguator/, which matches papers indexed in Wikidata to the target researcher
      1. Enter the target researcher's name
        1. as of 2019 the tool is clunky
        2. try name variations, including initials, or whatever is likely in an academic paper
      2. Identify name variations
        1. go back to the Wikidata item for the person
        2. add the variations to the "also known as" field at the top of the item
        3. noting the variations greatly assists ongoing maintenance and profile updates
    5. Wait
      1. Wikidata is a nonprofit project of the Wikimedia community
      2. technical infrastructure is modest; as of 2019, updates typically take 5-30 minutes
      3. like Wikipedia, Wikidata depends on volunteer contributors of content and donor funding
      4. thanks for editing; it is the most valuable contribution anyone can make
    6. View the incomplete profile on Scholia
      1. enter the person's name; it should autocomplete
      2. the profile is generated from the available data
    7. Use Scholia's "missing content" tool
      1. access is unintuitive: add "/missing" to the end of the Scholia URL
      2. the missing tool is actually a collection of tools which search for and suggest possible data to add to the profile
      3. building out the network of collaborators is easy from here
        1. consider building profiles for top co-authors
        2. consider building profiles for people who commonly cite the target researcher's papers
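
A few of the manual steps above can be sketched in code. The block below is a minimal illustration and not an official WikiCite tool. It builds a request URL for the Wikidata `wbsearchentities` API (finding the researcher's item), checks an ORCID's check digit before it is pasted into Wikidata, and forms the Scholia "/missing" URL. The function names are hypothetical, and the Scholia host and `/author/` path are assumptions. No network request is made.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
SCHOLIA_BASE = "https://scholia.toolforge.org"  # assumed Scholia host


def person_search_url(name: str, language: str = "en") -> str:
    """Build a wbsearchentities request URL for finding the person's item."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def is_valid_orcid(orcid: str) -> bool:
    """Verify the final check character of an ORCID (ISO 7064 MOD 11-2)."""
    chars = orcid.replace("-", "").upper()
    if len(chars) != 16 or not (chars[:15].isascii() and chars[:15].isdigit()):
        return False
    total = 0
    for digit in chars[:15]:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15] == expected


def scholia_missing_url(author_qid: str) -> str:
    """Append "/missing" to the (assumed) Scholia author URL for an item."""
    return f"{SCHOLIA_BASE}/author/{author_qid}/missing"


print(person_search_url("John Tuzo Wilson"))
print(is_valid_orcid("0000-0002-1825-0097"))
print(scholia_missing_url("Q12345"))  # Q12345 is a placeholder item ID
```

A valid check digit only catches typing errors. It does not confirm that the ORCID belongs to the right person, so the manual matching described above is still needed.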

    Possible Data Collaborators

    ContentMine presentation, Wikimania 2014. Wikiwish: "An Open Bibliography of science, updated daily" (the first bulletpoint at 27:30)
    Citing as a public service: presentation by User:DarTar at the 2015 Wikipedia Science Conference pitching Wikidata as an open bibliographic and citation data repository

    Some possible Data Collaborators have expressed interest in working on source metadata in Wikidata; others might usefully be approached.

    OCLC, which runs WorldCat, is very keen on collaborating with Wikidata; User:Maximiliankleinoclc wrote a letter about the possibilities.

    ContentMine has some excellent open software tools, which we could use to let Wikidata answer queries like "List all the review papers ever written on malaria vaccines", "List all the articles that mention Lygodactylus williamsii", "List every paper ever written by John Tuzo Wilson" and "List all the papers cited in Wikipedia articles that have been retracted". They listed "An Open Bibliography of science, updated daily" as a "wikiwish" at Wikimania 2014, apparently unaware that this project had already been started at a slightly earlier workshop.

    PLOS has an API for RichCitations, which contains metadata on all PLOS papers up through late 2014. Rich Citations is a novel structured format to express each citation as a data element, and it includes a set of useful, additional terms specific to scholarly literature that enable research about the knowledge web citations create. It also includes a display feature much like Reference Tooltips, but linked to a database (which is open licensed), so it can update metainformation. They presented at Wikimania 2014 and are keen to collaborate and share their results with us.

    Zotero is interested in the idea of a proofread metadata source. Some Zotero users currently upload to cloud storage; we might build tools to let them upload here instead. CiteSeerX has a large open-licensed database of article metadata and might want to set up an exchange, but has not responded to e-mails.

    The Cochrane Collaboration is developing an API to its metadata (they were contacted about this project in July 2014, so this use case may have helped shape the API). They produce large amounts of non-conventional metadata on works they review, and on works they produce, both of which Wikimedians quote.

    Institutional repositories are also increasingly interested in open APIs and linked databases, and seem generally receptive to this project. The university-run academic search engine BASE aggregates and normalizes these repositories and makes its data collection available for non-commercial purposes.

    Resources

    Statistics

    Of the 41,458,756 items which are instance of (P31) scholarly article (Q13442814):

    Property                                                      With property   Without property   Coverage
    DOI (P356)                                                    29,596,004      11,862,752         71.4%
    PubMed ID (P698)                                              32,036,547      9,422,209          77.3%
    main subject (P921)                                           17,457,870      24,000,886         42.1%
    language of work or name (P407)                               16,793,065      24,665,691         40.5%
    author (P50)                                                  11,606,828      29,851,928         28.0%
    author name string (P2093)                                    38,760,042      2,698,714          93.5%
    Only author (P50), no author name string (P2093)              1,267,160       40,191,596         3.1%
    Only author name string (P2093), no author (P50)              28,420,374      13,038,382         68.6%

    Updated: 11 March 2024.
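
The coverage column can be recomputed from the two counts in each row, since coverage is the share of items with the property out of all scholarly-article items. The short sketch below does this for the DOI row and also checks that the author rows are mutually consistent. The function name `coverage` is an illustrative choice, and the figures are the ones reported in the table above.

```python
def coverage(with_property: int, without_property: int) -> float:
    """Percentage of items carrying the property, rounded to one decimal place."""
    total = with_property + without_property
    return round(100 * with_property / total, 1)


# DOI (P356) row: both counts add up to the 41,458,756 scholarly articles.
print(coverage(29_596_004, 11_862_752))

# Items with both author (P50) and author name string (P2093) can be derived
# in two ways from the table; the two results should agree.
both_via_p50 = 11_606_828 - 1_267_160
both_via_p2093 = 38_760_042 - 28_420_374
print(both_via_p50 == both_via_p2093)
```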

    Subpages

    The following subpages belong to the project:

    Contact

    Participants

    The first list has now reached the maximum number possible for {{Ping project}}. Please therefore add your name to the second list below.

    To ping both lists, use {{Ping project|Source MetaData}} and {{Ping project|Source MetaData/More}} in two different posts.

    The participants listed below can be notified using the following template in discussions:
    {{Ping project|WikiCite}}


    Historical discussions

    There have been historical discussions about Wikidata hosting information about the sources of data.

    See also