author | Martin Czygan <martin.czygan@gmail.com> | 2021-03-29 22:47:46 +0200 |
---|---|---|
committer | Martin Czygan <martin.czygan@gmail.com> | 2021-03-29 22:47:46 +0200 |
commit | cefdde667a4169bbde6b8cf2bde8eec8cb589c98 (patch) | |
tree | f6cc44b475c219142b69549250a416f0a7ecdf46 | |
parent | c4d16fa9f7d27425de0bbe9e1a56ca0d3b3e297a (diff) | |
download | refcat-cefdde667a4169bbde6b8cf2bde8eec8cb589c98.tar.gz refcat-cefdde667a4169bbde6b8cf2bde8eec8cb589c98.zip |
update README
-rw-r--r-- | README.md | 4 |
1 files changed, 2 insertions, 2 deletions
@@ -15,7 +15,7 @@ Context: [fatcat](https://fatcat.wiki), "Mellon Grant" (20/21).
 * [ ] Link PID or DOI to archived versions
 * [ ] URLs in corpus linked to best possible timestamp (GWB)
-* [ ] Harvest all URLs in citation corpus
+* [ ] Harvest all URLs in citation corpus (maybe do a sample first)
 * [ ] Links between records w/o DOI (fuzzy matching)
 * [ ] Publication of augmented citation graph, explore data mining, etc.
 * [ ] Interlinkage with other source, monographs, commercial publications, etc.
@@ -31,7 +31,7 @@ $ refcat.pyz BiblioRefV2
 * schema: [https://git.archive.org/webgroup/fatcat/-/blob/10eb30251f89806cb7a0f147f427c5ea7e5f9941/proposals/2021-01-29_citation_api.md#schemas](https://git.archive.org/webgroup/fatcat/-/blob/10eb30251f89806cb7a0f147f427c5ea7e5f9941/proposals/2021-01-29_citation_api.md#schemas)
 * matches via: doi, arxiv, pmid, pmcid, fuzzy title matches
-* 785,569,011 edges (~103% of open citation/crossref), 39G compressed, ~260G uncompressed
+* 785,569,011 edges (~103% of 12/2020 OCI/crossref release), ~39G compressed, ~288G uncompressed
 # Rough Notes