-rw-r--r-- | README.md | 14 |
1 files changed, 9 insertions, 5 deletions
@@ -1,14 +1,18 @@
 # cgraph
 
-Scholarly citation graph related code; maintained by [martin@archive.org](mailto:martin@archive.org).
+Scholarly citation graph related code; maintained by
+[martin@archive.org](mailto:martin@archive.org); multiple subsproject to keep
+all relevant code close:
 
-* python: mostly luigi tasks
-* skate: various Go tools
+* python: mostly luigi tasks (using [shiv](https://github.com/linkedin/shiv) for single-file deployments)
+* skate: various Go command line tools (wrapped in a deb packaged)
 
-Context: [fatcat](https://fatcat.wiki), "Mellon Grant" (20/21)
+Context: [fatcat](https://fatcat.wiki), "Mellon Grant" (20/21).
 
 # Grant related tasks
 
+3/4 phases of the grant contain citation graph related tasks.
+
 * [ ] Link PID or DOI to archived versions
 * [ ] URLs in corpus linked to best possible timestamp (GWB)
 * [ ] Harvest all URLs in citation corpus
@@ -27,7 +31,7 @@ $ refcat.pyz BiblioRefV2
 
 * schema: [https://git.archive.org/webgroup/fatcat/-/blob/10eb30251f89806cb7a0f147f427c5ea7e5f9941/proposals/2021-01-29_citation_api.md#schemas](https://git.archive.org/webgroup/fatcat/-/blob/10eb30251f89806cb7a0f147f427c5ea7e5f9941/proposals/2021-01-29_citation_api.md#schemas)
 * matches via: doi, arxiv, pmid, pmcid, fuzzy title matches
-* 717,435,777 edges (94% of open citation/crossref), 37G compressed, ~260G uncompressed
+* 785,569,011 edges (~103% of open citation/crossref), 39G compressed, ~260G uncompressed
 
 # Rough Notes