author Martin Czygan <martin.czygan@gmail.com> 2021-03-29 20:54:24 +0200
committer Martin Czygan <martin.czygan@gmail.com> 2021-03-29 20:54:24 +0200
commit c4d16fa9f7d27425de0bbe9e1a56ca0d3b3e297a
tree 6004471c695cdd308d3ce7b74752cf977197b52e
parent f03e0f6493ee40d55641bf0f3f0bf95a3530d237
update README
-rw-r--r-- README.md 14
1 files changed, 9 insertions, 5 deletions
diff --git a/README.md b/README.md
index 046ce7a..7aefb35 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,18 @@
# cgraph
-Scholarly citation graph related code; maintained by [martin@archive.org](mailto:martin@archive.org).
+Scholarly citation graph related code; maintained by
+[martin@archive.org](mailto:martin@archive.org); multiple subprojects to keep
+all relevant code close:
-* python: mostly luigi tasks
-* skate: various Go tools
+* python: mostly luigi tasks (using [shiv](https://github.com/linkedin/shiv) for single-file deployments)
+* skate: various Go command line tools (wrapped in a deb package)
-Context: [fatcat](https://fatcat.wiki), "Mellon Grant" (20/21)
+Context: [fatcat](https://fatcat.wiki), "Mellon Grant" (20/21).
# Grant related tasks
+3/4 phases of the grant contain citation graph related tasks.
+
* [ ] Link PID or DOI to archived versions
* [ ] URLs in corpus linked to best possible timestamp (GWB)
* [ ] Harvest all URLs in citation corpus
@@ -27,7 +31,7 @@ $ refcat.pyz BiblioRefV2
* schema: [https://git.archive.org/webgroup/fatcat/-/blob/10eb30251f89806cb7a0f147f427c5ea7e5f9941/proposals/2021-01-29_citation_api.md#schemas](https://git.archive.org/webgroup/fatcat/-/blob/10eb30251f89806cb7a0f147f427c5ea7e5f9941/proposals/2021-01-29_citation_api.md#schemas)
* matches via: doi, arxiv, pmid, pmcid, fuzzy title matches
-* 717,435,777 edges (94% of open citation/crossref), 37G compressed, ~260G uncompressed
+* 785,569,011 edges (~103% of open citation/crossref), 39G compressed, ~260G uncompressed
# Rough Notes