# V4

We have not released v3, but there have been many changes to `skate`, so we may continue with v4.

# Unstructured

```
{
  "biblio": {
    "unstructured": "J. Häger, W. Krieger, T. Rüegg, and H. Walther, J. Chem. Phys. 72, 4286 (1980).JCPSA60021-9606"
  },
  "index": 8,
  "key": "_r4",
  "ref_source": "crossref",
  "release_year": 1983,
  "release_ident": "tebzylkszzbyjggye5ssmebdcy",
  "work_ident": "aaaofyp6uzcdnbe7hvfahylyha"
}
```

We should be able to match "J. Chem. Phys.", and maybe also "72, 4286 (1980)"; w/ id, title we should be able to match to:

* https://fatcat.wiki/release/d2k7en7tzzdzzddwfo5xlqx4ce

If nothing else is defined and `unstructured` contains a URL, we may extract that.

```
{
  "biblio": {
    "unstructured": "Friedrich-Ebert-Stiftung (FES) 2008 FES in Nepal. FES http://www.fesnepal.org/about/fes_in_nepal.htm (accessed February 15, 2009)"
  },
  "index": 19,
  "key": "CIT0020",
  "ref_source": "crossref",
  "release_year": 2011,
  "release_ident": "xqgaanhpf5gotdxxxytgyxw2ty",
  "work_ident": "aaaq35j3angwzdpzcvzdil3v4y"
}
```

Also, these may say: "accessed at ..."

# URL

* url cleanup in place

# Partial Data Mapping

* how to map partial docs onto a key

# OL beyond ISBN

Example:

```
{
  "biblio": {
    "container_name": "The Debt: What America Owes to Blacks",
    "contrib_raw_names": [
      "R Robinson"
    ],
    "unstructured": "Randall Robinson, The Debt: What America Owes to Blacks. New York: Dutton Books, 2000, pp. 219–220.",
    "year": 2000
  },
  "index": 22,
  "key": "8_CR23",
  "ref_source": "crossref",
  "release_year": 2009,
  "release_ident": "2igycuiobvhxrcmmrzz6anufuq",
  "work_ident": "aaacj23jqbdxvajwj5kc6jpejq"
}
```

* https://openlibrary.org/works/OL488811W/The_debt?edition=debtwhatamerica000robi

However, there is no explicit "subtitle" field, and in this case the subtitle is buried in "text":

```
{
  "key": "/works/OL488811W",
  "text": [
    "/works/OL488811W",
    "The debt",
    "The Debt",
    "The Debt ",
    "what America owes to Blacks",
    "What America Owes to Blacks",
    "OL46591M",
    "OL7771042M",
    "OL7590904M",
    "OL3382710M",
    "Randall Robinson.",
    "2004556979",
    "99045728",
    "0452282101",
    "0525945245",
```

Subtitle in editions.

```
{
  "biblio": {
    "container_name": "BLACK AFRICA: The Economic and Cultural Basis for a Federated State",
    "unstructured": "For details on African Renaissance see Cheikh Anta Diop, BLACK AFRICA: The Economic and Cultural Basis for a Federated State, New Expanded Edition. Trenton, NJ: Africa World Press, 1987.",
    "year": 1987
  },
  "index": 28,
  "key": "8_CR29",
  "ref_source": "crossref",
  "release_year": 2009,
  "release_ident": "2igycuiobvhxrcmmrzz6anufuq",
  "work_ident": "aaacj23jqbdxvajwj5kc6jpejq"
}
```

## OL Loop

Some do not have an explicit "works" key, but still link to an edition.

* https://openlibrary.org/books/OL10000230M/Parliamentary_Debates_House_Of_Lords_2003-2004?edition=

> An edition of Parliamentary Debates, House Of Lords 2003-2004

Example edition:

```
{
  "publishers": [
    "Du Temps"
  ],
  "languages": [
    {
      "key": "/languages/fre"
    }
  ],
  "last_modified": {
    "type": "/type/datetime",
    "value": "2010-04-24T18:46:01.556464"
  },
  "weight": "5 ounces",
  "title": "Les Fleurs bleues de Raymond Queneau",
  "identifiers": {
    "goodreads": [
      "487215"
    ]
  },
  "isbn_13": [
    "9782842741013"
  ],
  "covers": [
    3140044
  ],
  "physical_format": "Paperback",
  "isbn_10": [
    "2842741013"
  ],
  "publish_date": "January 1, 2000",
  "key": "/books/OL12622734M",
  "authors": [
    {
      "key": "/authors/OL3964945A"
    }
  ],
  "latest_revision": 5,
  "works": [
    {
      "key": "/works/OL10000008W"
    }
  ],
  "type": {
    "key": "/type/edition"
  },
  "physical_dimensions": "8.4 x 5.7 x 0.3 inches",
  "revision": 5
}
```

Example Work:

```
{
  "title": "Les Fleurs bleues de Raymond Queneau",
  "created": {
    "type": "/type/datetime",
    "value": "2009-12-11T01:57:19.964652"
  },
  "covers": [
    3140044
  ],
  "last_modified": {
    "type": "/type/datetime",
    "value": "2010-04-28T06:54:19.472104"
  },
  "latest_revision": 3,
  "key": "/works/OL10000008W",
  "authors": [
    {
      "type": "/type/author_role",
      "author": {
        "key": "/authors/OL3964945A"
      }
    }
  ],
  "type": {
    "key": "/type/work"
  },
  "revision": 3
}
```

----

## Unmatched

If we exclude any id and title, we'll roughly have the following fields:

```
container_name|contrib_raw_names|year                      64064559
unstructured                                               61711602
container_name|contrib_raw_names|volume|year               49701699
container_name|contrib_raw_names|unstructured|volume|year  36401044
container_name|contrib_raw_names|unstructured|year         26663422
contrib_raw_names|unstructured                             16731608
container_name|contrib_raw_names|doi|unstructured|year     14207167
container_name|contrib_raw_names|doi|year                  13159340
```

Some examples:

```
{
  "biblio": {
    "container_name": "Intern. J. Comput. Math.",
    "contrib_raw_names": [
      "D. Levin"
    ],
    "volume": "B3",
    "year": 1973
  },
  "index": 19,
  "key": "PhysRevB.48.6913Cc15R1",
  "ref_source": "crossref",
  "release_year": 1993,
  "release_ident": "i6s6e64n55hh5oned32mdwrs2i",
  "work_ident": "aaaeuvgitzfafczctw3bseauri"
}
```

This refers to:

* https://www.tandfonline.com/doi/abs/10.1080/00207167308803075
* 1972, and not 1973, 1993
* https://fatcat.wiki/release/3cstmufhszalvpnppwxjohnnsa

It would help to go from "container name" to "issn", e.g. here: 0020-7160

* https://fatcat.wiki/release/search?q=levin+container_id%3A%22y4k3i2fvabgarkvywismzvy23a%22+year%3A1972

```
$ grep -i "Intern.*J.*Comput.*Math.*" jabbrev.json
{"name": "COMPEL-THE INTERNATIONAL JOURNAL FOR COMPUTATION AND MATHEMATICS IN ELECTRICAL AND ELECTRONIC ENGINEERING", "abbrev": "COMPEL"}
{"name": "INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE", "abbrev": "INT J APPL MATH COMP"}
{"name": "INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS", "abbrev": "INT J COMPUT MATH"}
```

Look up the name in the ISSN data:

```
$ zstdcat tmp/data.ndj.zst | grep -i "INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS" | jq .
  "@graph": [
    {
      "@id": "http://id.loc.gov/vocabulary/countries/enk",
      "label": "England"
    },
    {
      "@id": "organization/ISSNCenter#_1",
      "@type": "http://schema.org/Organization"
    },
    {
      "@id": "resource/ISSN-L/0020-7160",
      "identifiedBy": "resource/ISSN/0020-7160#ISSN-L"
    },
    {
      "@id": "resource/ISSN/0020-7160",
      "@type": [
        "http://id.loc.gov/ontologies/bibframe/Instance",
        "http://id.loc.gov/ontologies/bibframe/Work",
        "http://schema.org/Periodical"
      ],
      "format": "vocabularies/medium#Print",
      "http://purl.org/ontology/bibo/issn": "0020-7160",
      "identifiedBy": [
        "resource/ISSN/0020-7160#ISSN-L",
        "resource/ISSN/0020-7160#ISSN",
        "resource/ISSN/0020-7160#KeyTitle"
      ],
```

We would need:

* rough abbrev name -> full name (jabbrev) -> issn (issnlister) -> container id (fatcat)
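
The lookup chain could be sketched roughly as follows. This is a minimal illustration, not existing `skate` code: the function names, the two lookup dicts (`name_to_issn`, `issn_to_container`) and the word-prefix matching rule are all assumptions, and the container id in the sample data is the one from the search URL above, assumed to belong to ISSN 0020-7160. The prefix rule is deliberately stricter than the `grep` pattern, so that only one of the three jabbrev candidates survives.

```python
import re

# Assumption: small words that abbreviated journal names usually drop.
STOPWORDS = {"of", "for", "the", "and", "in"}


def tokens(s):
    """Lowercase alphabetic tokens of s, without stopwords."""
    return [t for t in re.findall(r"[a-z]+", s.lower()) if t not in STOPWORDS]


def abbrev_matches(abbrev, name):
    """True if every abbreviated token is a prefix of the corresponding
    word of the full name, and both have the same number of tokens."""
    a, n = tokens(abbrev), tokens(name)
    return bool(a) and len(a) == len(n) and all(
        word.startswith(tok) for tok, word in zip(a, n)
    )


def resolve_container(abbrev, jabbrev, name_to_issn, issn_to_container):
    """rough abbrev -> full name -> issn -> container id, or None.

    jabbrev: list of {"name": ..., "abbrev": ...} dicts (as in jabbrev.json).
    name_to_issn, issn_to_container: hypothetical lookup tables that would
    be derived from the ISSN data and from fatcat containers.
    """
    candidates = [e["name"] for e in jabbrev if abbrev_matches(abbrev, e["name"])]
    if len(candidates) != 1:
        return None  # no match, or ambiguous; do not guess
    issn = name_to_issn.get(candidates[0])
    return issn_to_container.get(issn)


# Sample data taken from the examples above.
JABBREV = [
    {"name": "COMPEL-THE INTERNATIONAL JOURNAL FOR COMPUTATION AND MATHEMATICS IN ELECTRICAL AND ELECTRONIC ENGINEERING", "abbrev": "COMPEL"},
    {"name": "INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE", "abbrev": "INT J APPL MATH COMP"},
    {"name": "INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS", "abbrev": "INT J COMPUT MATH"},
]
NAME_TO_ISSN = {"INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS": "0020-7160"}
# Assumption: container id as seen in the fatcat search URL above.
ISSN_TO_CONTAINER = {"0020-7160": "y4k3i2fvabgarkvywismzvy23a"}

if __name__ == "__main__":
    print(resolve_container("Intern. J. Comput. Math.", JABBREV,
                            NAME_TO_ISSN, ISSN_TO_CONTAINER))
```

Returning `None` on zero or multiple candidates keeps the step conservative; ambiguous abbreviations would need extra signals (year, volume, author) before committing to a container.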