From c6dd01e26b74e066437821575ca6afd3ab9b07fc Mon Sep 17 00:00:00 2001 From: Bryan Newbold Date: Sun, 17 Jun 2018 13:27:59 -0700 Subject: rename RFC document --- README.md | 4 +- fatcat-rfc.md | 363 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ rfc.md | 363 ---------------------------------------------------------- 3 files changed, 365 insertions(+), 365 deletions(-) create mode 100644 fatcat-rfc.md delete mode 100644 rfc.md diff --git a/README.md b/README.md index 9d2a754c..28664b86 100644 --- a/README.md +++ b/README.md @@ -8,8 +8,8 @@ ... catalog all the things! -The [RFC](./rfc) is the original design document, and the best place to start -for background. +The [RFC](./fatcat-rfc.md) is the original design document, and the best place +to start for background. There will be three main components: diff --git a/fatcat-rfc.md b/fatcat-rfc.md new file mode 100644 index 00000000..21495f6d --- /dev/null +++ b/fatcat-rfc.md @@ -0,0 +1,363 @@ +fatcat is a half-baked idea to build an open, independent, collaboratively +editable bibliographic database of most written works, with a focus on +published research outputs like journal articles, pre-prints, and conference +proceedings. + +## Technical Architecture + +The canonical backend datastore would be a very large transactional SQL server. +A relatively simple and stable back-end daemon would expose an API (could be +REST, GraphQL, gRPC, etc). As little "application logic" as possible would be +embedded in this back-end; as much as possible would be pushed to bots which +could be authored and operated by anybody. A separate web interface project +would talk to the API backend and could be developed more rapidly. + +A cronjob would make periodic database dumps, both in "full" form (all tables +and all edit history, removing only authentication credentials) and "flat" form +(with only the most recent version of each entity, using only persistent IDs +between entities).
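The "flat" dump described above could be pictured roughly as follows. This is only a sketch: the in-memory `revisions` and `current` mappings, the `flat_dump` helper, and the JSON-lines output are all assumptions made for illustration, not part of the RFC.

```python
import json

# Hypothetical stand-ins for database tables (not defined by the RFC).
# Every revision is kept; entities refer to each other by persistent ID.
revisions = {
    1: {"title": "Draft title", "container": "container_x"},
    2: {"title": "Final title", "container": "container_x"},
    3: {"name": "Journal of Examples"},
}
# Persistent identifier -> current revision number.
current = {"work_a": 2, "container_x": 3}


def flat_dump(current, revisions):
    """Yield one JSON line per entity, using only its current revision."""
    for ident in sorted(current):
        record = {"id": ident, **revisions[current[ident]]}
        yield json.dumps(record, sort_keys=True)


lines = list(flat_dump(current, revisions))
for line in lines:
    print(line)
```

A "full" dump would instead export every row of `revisions` along with the edit history; the flat form drops superseded revisions such as revision 1 here.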
+ +A goal is to be linked-data/RDF/JSON-LD/semantic-web "compatible", but not +necessarily "first". It should be possible to export the database in a +relatively clean RDF form, and to fetch data in a variety of formats, but +internally fatcat would not be backed by a triple-store, and would not be +bound to a specific third-party ontology or schema. + +Microservice daemons should be able to proxy between the primary API and +standard protocols like ResourceSync and OAI-PMH, and bots could consume +external databases in those formats. + +## Licensing + +The core fatcat database should only contain verifiable factual statements +(which isn't to say that all statements are "true"), not creative or derived +content. + +The goal is to have a very permissively licensed database: CC-0 (no rights +reserved) if possible. Under US law, it should be possible to scrape and pull +in factual data from other corpuses without adopting their licenses. The goal +here isn't to avoid all attribution (progeny information will be included, and a +large sources and acknowledgments statement should be maintained), but trying +to manage the intersection of all upstream source licenses seems untenable, and +creates burdens for downstream users. + +Special care will need to be taken around copyright and original works. I would +propose either not accepting abstracts at all, or including them in a +partitioned database to prevent copyright contamination. Likewise, even simple +user-created content like lists, reviews, ratings, comments, discussion, +documentation, etc., should go in separate services. + +## Basic Editing Workflow and Bots + +Both human editors and bots would have edits go through the same API, with +humans using either the default web interface, arbitrary integrations, or +client software. 
+ +The usual workflow would be to create edits (or creations, merges, deletions) +to individual entities one at a time, all under a single "edit group" of +related edits (eg, correcting authorship info for multiple works related to a +single author). When ready, the editor would "submit" the edit group for +review. During the review period, humans could vote (or veto/approve if they +have higher permissions), and bots can perform automated checks. During this +period the editor can make tweaks if necessary. After some fixed time period +(72 hours?) with no changes and no blocking issues, the edit group would be +auto-accepted, if no auto-resolvable merge-conflicts have arisen. This process +balances editing labor (reviews are easy, but optional) against quality +(cool-down period makes it easier to detect and prevent spam or out-of-control +bots). Advanced permissions could allow some trusted human and bot editors to +push through edits more rapidly. + +Bots would need to be tuned to have appropriate edit group sizes (eg, daily +batches, instead of millions of works in a single edit) to make human QA and +reverts possible. + +Data progeny and citation would be left to the edit history. In the case of +importing external databases, the expectation would be that special-purpose +bot accounts would be used. Human editors would leave edit messages to clarify +their sources. + +A style guide (wiki), chat room, and discussion forum would be hosted as +separate stand-alone services for editors to propose projects and debate +process or scope changes. It would be best if these could use federated account +authorization (oauth?) to have consistent account IDs across mediums. + +## Edit Log + +As part of the process of "accepting" an edit group, a row would be written to +an immutable, append-only log table (which internally could be a SQL table) +documenting each identifier change. 
This log establishes a monotonically +increasing version number for the entire corpus, and should make interaction +with other systems easier (eg, search engines, replicated databases, +alternative storage backends, notification frameworks, etc.). + +## Identifiers + +A fixed number of first-class "entities" would be defined, with common +behavior and schema layouts. These would all be semantic entities like "work", +"release", "container", and "person". + +fatcat identifiers would be semantically meaningless fixed-length random numbers, +usually represented in case-insensitive base32 format. Each entity type would +have its own identifier namespace. Eg, 96-bit identifiers would have 20 +characters and look like: + + fcwork_rzga5b9cd7efgh04iljk + https://fatcat.org/work/rzga5b9cd7efgh04iljk + +128-bit (UUID size) would have 26 characters: + + fcwork_rzga5b9cd7efgh04iljk8f3jvz + https://fatcat.org/work/rzga5b9cd7efgh04iljk8f3jvz + +A 64-bit namespace is probably plenty though, and would work with most database +Integer columns: + + fcwork_rzga5b9cd7efg + https://fatcat.org/work/rzga5b9cd7efg + +The idea would be to only have fatcat identifiers be used to interlink between +databases, *not* to supplant DOIs, ISBNs, handle, ARKs, and other "registered" +persistent identifiers. + +## Entities and Internal Schema + +Internally, identifiers would be lightweight pointers to actual metadata +objects, which can be thought of as "versions". The metadata objects themselves +would be immutable once committed; the edit process is one of creating new +objects and, if the edit is approved, pointing the identifier to the new +version. Entities would reference between themselves by identifier. + +Edit objects represent a change to a single entity; edits get batched together +into edit groups (like "commits" and "pull requests" in git parlance). 
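The pointer-and-revision model in the paragraphs above can be sketched in a few lines. Everything named here (`Revision`, `Store`, `propose`, `accept`, `lookup`) is hypothetical and chosen for illustration; the RFC does not define an API, and a real implementation would sit on SQL tables rather than Python dictionaries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Revision:
    """A metadata object ("version"); not reassigned once created."""
    rev_id: int
    metadata: dict
    previous: Optional[int] = None  # rev_id of the prior revision, if any


@dataclass
class Store:
    revisions: Dict[int, Revision] = field(default_factory=dict)
    current: Dict[str, int] = field(default_factory=dict)  # identifier -> rev_id

    def propose(self, ident: str, metadata: dict) -> Revision:
        """Create a new revision without moving the identifier pointer."""
        rev = Revision(len(self.revisions) + 1, dict(metadata),
                       self.current.get(ident))
        self.revisions[rev.rev_id] = rev
        return rev

    def accept(self, edit_group: List[Tuple[str, Revision]]) -> None:
        """Accept a batch of edits: point each identifier at its new revision."""
        for ident, rev in edit_group:
            self.current[ident] = rev.rev_id

    def lookup(self, ident: str) -> Revision:
        return self.revisions[self.current[ident]]


store = Store()
v1 = store.propose("work_a", {"title": "Old title"})
store.accept([("work_a", v1)])

v2 = store.propose("work_a", {"title": "New title"})
pending_title = store.lookup("work_a").metadata["title"]   # not yet accepted
store.accept([("work_a", v2)])
accepted_title = store.lookup("work_a").metadata["title"]
```

The point of the sketch is that proposing an edit only adds a row; readers keep seeing the old revision until the edit group is accepted, and the `previous` field preserves the history chain.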
+ +SQL tables would probably look something like the following, though be specific +to each entity type (eg, there would be an actual `work_revision` table, but +not an actual `entity_revision` table): + + entity_id + uuid + current_revision + + entity_revision + entity_id (bi-directional?) + previous: entity_revision or none + state: normal, redirect, deletion + redirect_entity_id: optional + extra: json blob + edit_id + + edit + mutable: boolean + edit_group + editor + + edit_group + +Additional type-specific columns would hold actual metadata. Additional tables +(which would reference both `entity_revision` and `entity_id` foreign keys as +appropriate) would represent things like external identifiers, ordered +author/work relationships, citations between works, etc. Every revision of an +entity would require duplicating all of these associated rows, which could end +up being a large source of inefficiency, but is necessary to represent the full +history of an object. + +## Scope + +Want the "scholarly web": the graph of works that cite other works. Certainly +every work that is cited more than once and every work that both cites and is +cited; "leaf nodes" and small islands might not be in scope. + +Focusing on written works, with some exceptions. Expect core media (for which we would pursue "completeness") to be: + + journal articles + books + conference proceedings + technical memos + dissertations + +Probably in scope: + + reports + magazine articles + published poetry + essays + government documents + conference + presentations (slides, video) + datasets + +Probably not: + + patents + court cases and legal documents + manuals + datasheets + courses + +Definitely not: + + audio recordings + tv show episodes + musical scores + advertisements + +Author, citation, and work disambiguation would be core tasks. Linking +pre-prints to final publication is in scope. 
+ +I'm much less interested in altmetrics, funding, and grant relationships than +most existing databases in this space. + +fatcat would not include any fulltext content itself, even for cleanly licensed +(open access) works, but would have "strong" (verified) links to fulltext +content, and would include file-level metadata (like hashes and fingerprints) +to help discovery and identify content from any source. Typed file-level links +should make fatcat more useful for both humans and machines to quickly access +fulltext content of a given mimetype than existing redirect or landing page +systems. + +## Ontology + +Loosely following FRBR, but removing the "manifestation" abstraction, and +favoring files (digital artifacts) over physical items, the primary entities +are: + + work + type + contributors + subject/category + release + + release (aka "edition", "variant") + title + volume/pages/issue/chapter + open-access status + date + work + publisher + container + contributors + citetext release + identifier + + file (aka "digital artifact") + release + hashes + URLs + institution accession + + contributor + name + aliases + affiliation date span + identifier + + container + name + open-access policy + peer-review policy + aliases, acronyms + subject/category + identifier + container + publisher + + publisher + name + aliases, acronyms + identifier + +## Controlled Vocabularies + +Some special namespace tables and enums would probably be helpful; these should +live in the database (not requiring a database migration to update), but should +have more controlled editing workflow... perhaps versioned in the codebase: + +- identifier namespaces (DOI, ISBN, ISSN, ORCID, etc) +- subject categorization +- license and open access status +- work "types" (article vs. book chapter vs. 
proceeding, etc) +- contributor types (author, translator, illustrator, etc) +- human languages +- file mimetypes + +## Unresolved Questions + +How to handle translations of, eg, titles and author names? To be clear, not +translations of works (which are just separate releases). + +Are bi-directional links a schema anti-pattern? Eg, should "work" point to a +primary "release" (which itself points back to the work), or should "release" +have a "is-primary" flag? + +Should `identifier` and `citation` be their own entities, referencing other +entities by UUID instead of by revision? This could save a ton of database +space and chunder. + +Should contributor/author contact information be retained? It could be very +useful for disambiguation, but we don't want to build a huge database for +spammers or "innovative" start-up marketing. + +Would general-purpose SQL databases like Postgres or MySQL scale well enough +to hold several tables with billions of entries? Right from the start there +are hundreds of millions of works and releases, many of which having dozens of +citations, many authors, and many identifiers, and then we'll have potentially +dozens of edits for each of these, which multiply out to `1e8 * 2e1 * 2e1 = +4e10`, or 40 billion rows in the citation table. If each row was 32 bytes on +average (uncompressed, not including index size), that would be 1.3 TByte on +its own, larger than common SSD disk. I think a transactional SQL datastore is +the right answer. In my experience locking and index rebuild times are usually +the biggest scaling challenges; the largely-immutable architecture here should +mitigate locking. Hopefully few indexes would be needed in the primary +database, as user interfaces could rely on secondary read-only search engines +for more complex queries and views. + +I see a tension between focus and scope creep. 
If a central database like +fatcat doesn't support enough fields and metadata, then it will not be possible +to completely import other corpuses, and this becomes "yet another" partial +bibliographic database. On the other hand, accepting arbitrary data leads to +other problems: sparseness increases (we have more "partial" data), potential +for redundancy is high, humans will start editing content that might be +bulk-replaced, etc. + +There might be a need to support "stub" references between entities. Eg, when +adding citations from PDF extraction, the cited works are likely to be +ambiguous. Could create "stub" works to be merged/resolved later, or could +leave the citation hanging. Same with authors, containers (journals), etc. + +## References and Previous Work + +The closest overall analog of fatcat is [MusicBrainz][mb], a collaboratively +edited music database. [Open Library][ol] is a very similar existing service, +which exclusively contains book metadata. + +[Wikidata][wd] seems to be the most successful and actively edited/developed +open bibliographic database at this time (early 2018), including the +[wikicite][wikicite] conference and related Wikimedia/Wikipedia projects. +Wikidata is a general purpose semantic database of entities, facts, and +relationships; bibliographic metadata has become a large fraction of all +content in recent years. The focus there seems to be linking knowledge +(statements) to specific sources unambiguously. Potential advantages fatcat +would have would be a focus on a specific scope (not a general-purpose database +of entities) and a goal of completeness (capturing as many works and +relationships as rapidly as possible). However, it might be better to just +pitch in to the wikidata efforts. + +The technical design of fatcat is loosely inspired by the git +branch/tag/commit/tree architecture, and specifically inspired by Oliver +Charles' "New Edit System" [blog posts][nes-blog] from 2012. 
+ +There are a whole bunch of proprietary, for-profit bibliographic databases, +including Web of Science, Google Scholar, Microsoft Academic Graph, aminer, +Scopus, and Dimensions. There are excellent field-limited databases like dblp, +MEDLINE, and Semantic Scholar. There are some large general-purpose databases +that are not directly user-editable, including the OpenCitation corpus, CORE, +BASE, and CrossRef. I don't know of any large (more than 60 million works), +open (bulk-downloadable with permissive or no license), field agnostic, +user-editable corpus of scholarly publication bibliographic metadata. + +[nes-blog]: https://ocharles.org.uk/blog/posts/2012-07-10-nes-does-it-better-1.html +[mb]: https://musicbrainz.org +[ol]: https://openlibrary.org +[wd]: https://wikidata.org +[wikicite]: https://meta.wikimedia.org/wiki/WikiCite_2017 + diff --git a/rfc.md b/rfc.md deleted file mode 100644 index 21495f6d..00000000 --- a/rfc.md +++ /dev/null @@ -1,363 +0,0 @@ -fatcat is a half-baked idea to build an open, independent, collaboratively -editable bibliographic database of most written works, with a focus on -published research outputs like journal articles, pre-prints, and conference -proceedings. - -## Technical Architecture - -The canonical backend datastore would be a very large transactional SQL server. -A relatively simple and stable back-end daemon would expose an API (could be -REST, GraphQL, gRPC, etc). As little "application logic" as possible would be -embedded in this back-end; as much as possible would be pushed to bots which -could be authored and operated by anybody. A separate web interface project -would talk to the API backend and could be developed more rapidly. - -A cronjob would make periodic database dumps, both in "full" form (all tables -and all edit history, removing only authentication credentials) and "flat" form -(with only the most recent version of each entity, using only persistent IDs -between entities). 
- -A goal is to be linked-data/RDF/JSON-LD/semantic-web "compatible", but not -necessarily "first". It should be possible to export the database in a -relatively clean RDF form, and to fetch data in a variety of formats, but -internally fatcat would not be backed by a triple-store, and would not be -bound to a specific third-party ontology or schema. - -Microservice daemons should be able to proxy between the primary API and -standard protocols like ResourceSync and OAI-PMH, and bots could consume -external databases in those formats. - -## Licensing - -The core fatcat database should only contain verifiable factual statements -(which isn't to say that all statements are "true"), not creative or derived -content. - -The goal is to have a very permissively licensed database: CC-0 (no rights -reserved) if possible. Under US law, it should be possible to scrape and pull -in factual data from other corpuses without adopting their licenses. The goal -here isn't to avoid all attribution (progeny information will be included, and a -large sources and acknowledgments statement should be maintained), but trying -to manage the intersection of all upstream source licenses seems untenable, and -creates burdens for downstream users. - -Special care will need to be taken around copyright and original works. I would -propose either not accepting abstracts at all, or including them in a -partitioned database to prevent copyright contamination. Likewise, even simple -user-created content like lists, reviews, ratings, comments, discussion, -documentation, etc., should go in separate services. - -## Basic Editing Workflow and Bots - -Both human editors and bots would have edits go through the same API, with -humans using either the default web interface, arbitrary integrations, or -client software. 
- -The usual workflow would be to create edits (or creations, merges, deletions) -to individual entities one at a time, all under a single "edit group" of -related edits (eg, correcting authorship info for multiple works related to a -single author). When ready, the editor would "submit" the edit group for -review. During the review period, humans could vote (or veto/approve if they -have higher permissions), and bots can perform automated checks. During this -period the editor can make tweaks if necessary. After some fixed time period -(72 hours?) with no changes and no blocking issues, the edit group would be -auto-accepted, if no auto-resolvable merge-conflicts have arisen. This process -balances editing labor (reviews are easy, but optional) against quality -(cool-down period makes it easier to detect and prevent spam or out-of-control -bots). Advanced permissions could allow some trusted human and bot editors to -push through edits more rapidly. - -Bots would need to be tuned to have appropriate edit group sizes (eg, daily -batches, instead of millions of works in a single edit) to make human QA and -reverts possible. - -Data progeny and citation would be left to the edit history. In the case of -importing external databases, the expectation would be that special-purpose -bot accounts would be used. Human editors would leave edit messages to clarify -their sources. - -A style guide (wiki), chat room, and discussion forum would be hosted as -separate stand-alone services for editors to propose projects and debate -process or scope changes. It would be best if these could use federated account -authorization (oauth?) to have consistent account IDs across mediums. - -## Edit Log - -As part of the process of "accepting" an edit group, a row would be written to -an immutable, append-only log table (which internally could be a SQL table) -documenting each identifier change. 
This log establishes a monotonically -increasing version number for the entire corpus, and should make interaction -with other systems easier (eg, search engines, replicated databases, -alternative storage backends, notification frameworks, etc.). - -## Identifiers - -A fixed number of first-class "entities" would be defined, with common -behavior and schema layouts. These would all be semantic entities like "work", -"release", "container", and "person". - -fatcat identifiers would be semantically meaningless fixed-length random numbers, -usually represented in case-insensitive base32 format. Each entity type would -have its own identifier namespace. Eg, 96-bit identifiers would have 20 -characters and look like: - - fcwork_rzga5b9cd7efgh04iljk - https://fatcat.org/work/rzga5b9cd7efgh04iljk - -128-bit (UUID size) would have 26 characters: - - fcwork_rzga5b9cd7efgh04iljk8f3jvz - https://fatcat.org/work/rzga5b9cd7efgh04iljk8f3jvz - -A 64-bit namespace is probably plenty though, and would work with most database -Integer columns: - - fcwork_rzga5b9cd7efg - https://fatcat.org/work/rzga5b9cd7efg - -The idea would be to only have fatcat identifiers be used to interlink between -databases, *not* to supplant DOIs, ISBNs, handle, ARKs, and other "registered" -persistent identifiers. - -## Entities and Internal Schema - -Internally, identifiers would be lightweight pointers to actual metadata -objects, which can be thought of as "versions". The metadata objects themselves -would be immutable once committed; the edit process is one of creating new -objects and, if the edit is approved, pointing the identifier to the new -version. Entities would reference between themselves by identifier. - -Edit objects represent a change to a single entity; edits get batched together -into edit groups (like "commits" and "pull requests" in git parlance). 
- -SQL tables would probably look something like the following, though be specific -to each entity type (eg, there would be an actual `work_revision` table, but -not an actual `entity_revision` table): - - entity_id - uuid - current_revision - - entity_revision - entity_id (bi-directional?) - previous: entity_revision or none - state: normal, redirect, deletion - redirect_entity_id: optional - extra: json blob - edit_id - - edit - mutable: boolean - edit_group - editor - - edit_group - -Additional type-specific columns would hold actual metadata. Additional tables -(which would reference both `entity_revision` and `entity_id` foreign keys as -appropriate) would represent things like external identifiers, ordered -author/work relationships, citations between works, etc. Every revision of an -entity would require duplicating all of these associated rows, which could end -up being a large source of inefficiency, but is necessary to represent the full -history of an object. - -## Scope - -Want the "scholarly web": the graph of works that cite other works. Certainly -every work that is cited more than once and every work that both cites and is -cited; "leaf nodes" and small islands might not be in scope. - -Focusing on written works, with some exceptions. Expect core media (for which we would pursue "completeness") to be: - - journal articles - books - conference proceedings - technical memos - dissertations - -Probably in scope: - - reports - magazine articles - published poetry - essays - government documents - conference - presentations (slides, video) - datasets - -Probably not: - - patents - court cases and legal documents - manuals - datasheets - courses - -Definitely not: - - audio recordings - tv show episodes - musical scores - advertisements - -Author, citation, and work disambiguation would be core tasks. Linking -pre-prints to final publication is in scope. 
- -I'm much less interested in altmetrics, funding, and grant relationships than -most existing databases in this space. - -fatcat would not include any fulltext content itself, even for cleanly licensed -(open access) works, but would have "strong" (verified) links to fulltext -content, and would include file-level metadata (like hashes and fingerprints) -to help discovery and identify content from any source. Typed file-level links -should make fatcat more useful for both humans and machines to quickly access -fulltext content of a given mimetype than existing redirect or landing page -systems. - -## Ontology - -Loosely following FRBR, but removing the "manifestation" abstraction, and -favoring files (digital artifacts) over physical items, the primary entities -are: - - work - type - contributors - subject/category - release - - release (aka "edition", "variant") - title - volume/pages/issue/chapter - open-access status - date - work - publisher - container - contributors - citetext release - identifier - - file (aka "digital artifact") - release - hashes - URLs - institution accession - - contributor - name - aliases - affiliation date span - identifier - - container - name - open-access policy - peer-review policy - aliases, acronyms - subject/category - identifier - container - publisher - - publisher - name - aliases, acronyms - identifier - -## Controlled Vocabularies - -Some special namespace tables and enums would probably be helpful; these should -live in the database (not requiring a database migration to update), but should -have more controlled editing workflow... perhaps versioned in the codebase: - -- identifier namespaces (DOI, ISBN, ISSN, ORCID, etc) -- subject categorization -- license and open access status -- work "types" (article vs. book chapter vs. 
proceeding, etc) -- contributor types (author, translator, illustrator, etc) -- human languages -- file mimetypes - -## Unresolved Questions - -How to handle translations of, eg, titles and author names? To be clear, not -translations of works (which are just separate releases). - -Are bi-directional links a schema anti-pattern? Eg, should "work" point to a -primary "release" (which itself points back to the work), or should "release" -have a "is-primary" flag? - -Should `identifier` and `citation` be their own entities, referencing other -entities by UUID instead of by revision? This could save a ton of database -space and chunder. - -Should contributor/author contact information be retained? It could be very -useful for disambiguation, but we don't want to build a huge database for -spammers or "innovative" start-up marketing. - -Would general-purpose SQL databases like Postgres or MySQL scale well enough -to hold several tables with billions of entries? Right from the start there -are hundreds of millions of works and releases, many of which having dozens of -citations, many authors, and many identifiers, and then we'll have potentially -dozens of edits for each of these, which multiply out to `1e8 * 2e1 * 2e1 = -4e10`, or 40 billion rows in the citation table. If each row was 32 bytes on -average (uncompressed, not including index size), that would be 1.3 TByte on -its own, larger than common SSD disk. I think a transactional SQL datastore is -the right answer. In my experience locking and index rebuild times are usually -the biggest scaling challenges; the largely-immutable architecture here should -mitigate locking. Hopefully few indexes would be needed in the primary -database, as user interfaces could rely on secondary read-only search engines -for more complex queries and views. - -I see a tension between focus and scope creep. 
If a central database like -fatcat doesn't support enough fields and metadata, then it will not be possible -to completely import other corpuses, and this becomes "yet another" partial -bibliographic database. On the other hand, accepting arbitrary data leads to -other problems: sparseness increases (we have more "partial" data), potential -for redundancy is high, humans will start editing content that might be -bulk-replaced, etc. - -There might be a need to support "stub" references between entities. Eg, when -adding citations from PDF extraction, the cited works are likely to be -ambiguous. Could create "stub" works to be merged/resolved later, or could -leave the citation hanging. Same with authors, containers (journals), etc. - -## References and Previous Work - -The closest overall analog of fatcat is [MusicBrainz][mb], a collaboratively -edited music database. [Open Library][ol] is a very similar existing service, -which exclusively contains book metadata. - -[Wikidata][wd] seems to be the most successful and actively edited/developed -open bibliographic database at this time (early 2018), including the -[wikicite][wikicite] conference and related Wikimedia/Wikipedia projects. -Wikidata is a general purpose semantic database of entities, facts, and -relationships; bibliographic metadata has become a large fraction of all -content in recent years. The focus there seems to be linking knowledge -(statements) to specific sources unambiguously. Potential advantages fatcat -would have would be a focus on a specific scope (not a general-purpose database -of entities) and a goal of completeness (capturing as many works and -relationships as rapidly as possible). However, it might be better to just -pitch in to the wikidata efforts. - -The technical design of fatcat is loosely inspired by the git -branch/tag/commit/tree architecture, and specifically inspired by Oliver -Charles' "New Edit System" [blog posts][nes-blog] from 2012. 
- -There are a whole bunch of proprietary, for-profit bibliographic databases, -including Web of Science, Google Scholar, Microsoft Academic Graph, aminer, -Scopus, and Dimensions. There are excellent field-limited databases like dblp, -MEDLINE, and Semantic Scholar. There are some large general-purpose databases -that are not directly user-editable, including the OpenCitation corpus, CORE, -BASE, and CrossRef. I don't know of any large (more than 60 million works), -open (bulk-downloadable with permissive or no license), field agnostic, -user-editable corpus of scholarly publication bibliographic metadata. - -[nes-blog]: https://ocharles.org.uk/blog/posts/2012-07-10-nes-does-it-better-1.html -[mb]: https://musicbrainz.org -[ol]: https://openlibrary.org -[wd]: https://wikidata.org -[wikicite]: https://meta.wikimedia.org/wiki/WikiCite_2017 - -- cgit v1.2.3