From e443ddfd7240e9f78ae827726d08f263311cb40c Mon Sep 17 00:00:00 2001
From: Martin Czygan
Date: Sat, 3 Jul 2021 02:06:48 +0200
Subject: update docs

---
 python/README.md | 62 +++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 41 insertions(+), 21 deletions(-)

(limited to 'python/README.md')

diff --git a/python/README.md b/python/README.md
index f66a517..15e39c1 100644
--- a/python/README.md
+++ b/python/README.md
@@ -19,13 +19,14 @@ various artifacts, e.g.:
 * [notes/version_1.md](version 1) (id plus title)
 * [notes/version_2.md](version 2) (v1, full schema)
 * [notes/version_3.md](version 3) (v2, unstructured)
+* [notes/version_4.md](version 4) (v3, extra sources, qa)
 
 ## Deployment
 
 We are testing a zipapp based deployment (20s for packaging into a 10MB zip
 file, and copying to target).
 
-Caveat: The development machine needs the same python version (e.g. 3.7) as the
+Caveat: The development machine needs the same python version (e.g. 3.8) as the
 target, e.g. for native dependencies. It is relatively easy to have multiple
 versions of Python available with [pyenv](https://github.com/pyenv/pyenv).
 
@@ -48,24 +49,43 @@ $ refcat.pyz
 
 Command line entry point for running various data tasks.
 
-General usage:
-
-    $ refcat TASK
-
-BASE: /bigger/.cache
-
-BiblioRef           KeyDistribution           RefsFatcatSortedKeys
-BiblioRefFromJoin   RefCounter                RefsFatcatTitleLowerJoin
-BiblioRefFuzzy      Refcat                    RefsKeyStats
-CommonDOIs          RefsArxiv                 RefsPMCID
-CommonTitles        RefsDOIs                  RefsPMID
-CommonTitlesLower   RefsDOIsLower             RefsReleasesMerged
-FatcatArxiv         RefsFatcatArxivJoin       RefsTitleFrequency
-FatcatDOIs          RefsFatcatClusterVerify   RefsTitles
-FatcatDOIsLower     RefsFatcatClusters        RefsTitlesLower
-FatcatPMCID         RefsFatcatDOIJoin         RefsToRelease
-FatcatPMID          RefsFatcatGroupJoin       ReleaseExportExpanded
-FatcatTitles        RefsFatcatPMCIDJoin       URLList
-FatcatTitlesLower   RefsFatcatPMIDJoin        URLTabs
-Input               RefsFatcatRanked
+    $ refcat.pyz [COMMAND | TASK] [OPTIONS]
+
+Commands: ls, ll, deps, tasks, files, config, cat, completion
+
+To install completion run:
+
+    $ source <(refcat.pyz completion)
+
+VERSION   0.1.3
+SETTINGS  /home/martin/.config/refcat/settings.ini
+BASE      /magna/refcat
+TMPDIR    /sandcrawler-db/tmp-refcat
+SHIV_ROOT None
+
+Bref                           OpenLibraryWorksSorted
+BrefCombined                   Refcat
+BrefOpenLibraryZipISBN         Refs
+BrefSortedByWorkID             RefsArxiv
+BrefZipArxiv                   RefsByWorkID
+BrefZipDOI                     RefsDOI
+BrefZipFuzzy                   RefsMapped
+BrefZipOpenLibrary             RefsPMCID
+BrefZipPMCID                   RefsPMID
+BrefZipPMID                    RefsToRelease
+FatcatArxiv                    RefsWithUnstructured
+FatcatDOI                      RefsWithoutIdentifiers
+FatcatMapped                   ReleaseExportExpanded
+FatcatPMCID                    ReleaseExportReduced
+FatcatPMID                     URLList
+MAGPapers                      URLTabs
+OpenLibraryAuthorMapping       URLTabsCleaned
+OpenLibraryAuthors             UnmatchedMapped
+OpenLibraryDump                UnmatchedOpenLibraryMatchTable
+OpenLibraryEditions            UnmatchedRefs
+OpenLibraryEditionsByWork      UnmatchedRefsToRelease
+OpenLibraryEditionsMapped      UnmatchedResolveJournalNames
+OpenLibraryEditionsToRelease   UnmatchedResolveJournalNamesMapped
+OpenLibraryReleaseMapped       WikipediaCitationsMinimalDataset
+OpenLibraryWorks
 ```
-- 
cgit v1.2.3
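
Note on the deployment caveat touched by this patch: the README asks for the same Python version on the development machine and the target (e.g. 3.8). One way to catch a mismatch before packaging might look like the sketch below. The helper `same_minor_version` is hypothetical and not part of refcat, and it assumes the target version is given as a "major.minor" string.

```python
import sys


def same_minor_version(target, current=None):
    """Return True if `current` matches `target` on major and minor version.

    Hypothetical helper for illustration only; not part of refcat.
    `target` is a string such as "3.8" (extra components like "3.8.10"
    are ignored); `current` defaults to the running interpreter.
    """
    if current is None:
        current = sys.version_info[:2]
    # Compare only (major, minor); patch releases are assumed compatible.
    major, minor = (int(part) for part in target.split(".")[:2])
    return tuple(current[:2]) == (major, minor)
```

Calling `same_minor_version("3.8")` on the development machine before building the zip would report whether the local interpreter matches the version expected on the target. Comparing only major and minor is an assumption made here; it reflects how native extensions are usually built per minor release.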