author Martin Czygan <martin.czygan@gmail.com> 2021-07-03 02:06:48 +0200
committer Martin Czygan <martin.czygan@gmail.com> 2021-07-03 02:06:48 +0200
commit e443ddfd7240e9f78ae827726d08f263311cb40c
tree d501a4c703997b202586ebb437f83c23d6c03668
parent 88ed40c70df07bc1be662c8439c2a183f6c12449
update docs
-rw-r--r-- python/Makefile 5
-rw-r--r-- python/README.md 62
2 files changed, 44 insertions, 23 deletions
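
Note: the Makefile comment and the README caveat touched by this commit both require that the Python version on the development machine and on the target match up to the minor version. A minimal sketch of such a check follows. The helper names `parse_version` and `versions_match` are hypothetical and not part of refcat, and the `(3, 8)` target is taken only from the example in the Makefile comment.

```python
import sys


def parse_version(text):
    """Parse a version string such as '3.8' or '3.8.10' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def versions_match(dev, target):
    """Return True if two version tuples agree on major and minor version."""
    return tuple(dev[:2]) == tuple(target[:2])


# Hypothetical usage: compare the running interpreter against a 3.8 target.
# Prints True or False depending on the interpreter that runs this file.
print(versions_match(sys.version_info, parse_version("3.8")))
```

Running this with the interpreter that builds the zipapp, before packaging, would flag a mismatch early; patch-level differences (e.g. 3.8.5 vs 3.8.10) are deliberately ignored.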
diff --git a/python/Makefile b/python/Makefile
index 7b78d1f..817d366 100644
--- a/python/Makefile
+++ b/python/Makefile
@@ -14,8 +14,9 @@ PKGNAME := refcat
# https://shiv.readthedocs.io/
ZIPAPP := $(PKGNAME).pyz
-# IMPORTANT: Python version on dev and target *must match* (up to minor version)
-# e.g. for aitio (2021), you might want to use:
+# IMPORTANT: Python version on dev (e.g. use https://github.com/pyenv/pyenv)
+# and target *must match* (up to minor version) e.g. for aitio (2021), you
+# might want to use:
# make refcat.pyz PYTHON_INTERPRETER='"/usr/bin/env python3.8"'
PYTHON_INTERPRETER := "/usr/bin/env python3.8"
diff --git a/python/README.md b/python/README.md
index f66a517..15e39c1 100644
--- a/python/README.md
+++ b/python/README.md
@@ -19,13 +19,14 @@ various artifacts, e.g.:
* [notes/version_1.md](version 1) (id plus title)
* [notes/version_2.md](version 2) (v1, full schema)
* [notes/version_3.md](version 3) (v2, unstructured)
+* [notes/version_4.md](version 4) (v3, extra sources, qa)
## Deployment
We are testing a zipapp based deployment (20s for packaging into a 10MB zip
file, and copying to target).
-Caveat: The development machine needs the same python version (e.g. 3.7) as the
+Caveat: The development machine needs the same python version (e.g. 3.8) as the
target, e.g. for native dependencies. It is relatively easy to have multiple
versions of Python available with [pyenv](https://github.com/pyenv/pyenv).
@@ -48,24 +49,43 @@ $ refcat.pyz
Command line entry point for running various data tasks.
-General usage:
-
- $ refcat TASK
-
-BASE: /bigger/.cache
-
-BiblioRef          KeyDistribution          RefsFatcatSortedKeys
-BiblioRefFromJoin  RefCounter               RefsFatcatTitleLowerJoin
-BiblioRefFuzzy     Refcat                   RefsKeyStats
-CommonDOIs         RefsArxiv                RefsPMCID
-CommonTitles       RefsDOIs                 RefsPMID
-CommonTitlesLower  RefsDOIsLower            RefsReleasesMerged
-FatcatArxiv        RefsFatcatArxivJoin      RefsTitleFrequency
-FatcatDOIs         RefsFatcatClusterVerify  RefsTitles
-FatcatDOIsLower    RefsFatcatClusters       RefsTitlesLower
-FatcatPMCID        RefsFatcatDOIJoin        RefsToRelease
-FatcatPMID         RefsFatcatGroupJoin      ReleaseExportExpanded
-FatcatTitles       RefsFatcatPMCIDJoin      URLList
-FatcatTitlesLower  RefsFatcatPMIDJoin       URLTabs
-Input              RefsFatcatRanked
+ $ refcat.pyz [COMMAND | TASK] [OPTIONS]
+
+Commands: ls, ll, deps, tasks, files, config, cat, completion
+
+To install completion run:
+
+ $ source <(refcat.pyz completion)
+
+VERSION 0.1.3
+SETTINGS /home/martin/.config/refcat/settings.ini
+BASE /magna/refcat
+TMPDIR /sandcrawler-db/tmp-refcat
+SHIV_ROOT None
+
+Bref                          OpenLibraryWorksSorted
+BrefCombined                  Refcat
+BrefOpenLibraryZipISBN        Refs
+BrefSortedByWorkID            RefsArxiv
+BrefZipArxiv                  RefsByWorkID
+BrefZipDOI                    RefsDOI
+BrefZipFuzzy                  RefsMapped
+BrefZipOpenLibrary            RefsPMCID
+BrefZipPMCID                  RefsPMID
+BrefZipPMID                   RefsToRelease
+FatcatArxiv                   RefsWithUnstructured
+FatcatDOI                     RefsWithoutIdentifiers
+FatcatMapped                  ReleaseExportExpanded
+FatcatPMCID                   ReleaseExportReduced
+FatcatPMID                    URLList
+MAGPapers                     URLTabs
+OpenLibraryAuthorMapping      URLTabsCleaned
+OpenLibraryAuthors            UnmatchedMapped
+OpenLibraryDump               UnmatchedOpenLibraryMatchTable
+OpenLibraryEditions           UnmatchedRefs
+OpenLibraryEditionsByWork     UnmatchedRefsToRelease
+OpenLibraryEditionsMapped     UnmatchedResolveJournalNames
+OpenLibraryEditionsToRelease  UnmatchedResolveJournalNamesMapped
+OpenLibraryReleaseMapped      WikipediaCitationsMinimalDataset
+OpenLibraryWorks
```