# refcat (wip)

Citation graph related tasks.

* companion project: [skate](https://git.archive.org/martin/cgraph/-/tree/master/skate)

Objective: given data about [releases](https://guide.fatcat.wiki/entity_release.html) and references, derive various artifacts, e.g.:

* a citation graph; nodes are releases and an edge is a citation (currently, this graph has about 50M nodes and 870M edges)
* a list of referenced entities, like ISSN (container), ISBN (book), URL (webpage), datasets (by URL, DOI, name, ...)

## Ongoing Notes

* [version 0](notes/version_0.md) (id only)
* [version 1](notes/version_1.md) (id plus title)
* [version 2](notes/version_2.md) (v1, full schema)
* [version 3](notes/version_3.md) (v2, unstructured)

## Deployment

We are testing a zipapp-based deployment (about 20s for packaging into a 10MB zip file and copying it to the target).

Caveat: the development machine needs the same Python version (e.g. 3.7) as the target, e.g. for native dependencies. It is relatively easy to have multiple versions of Python available with [pyenv](https://github.com/pyenv/pyenv).

```
$ make refcat.pyz && rsync -avP refcat.pyz user@host:/usr/local/bin
```

On the target, you can then call the archive directly. The first run will be slower (e.g. 4s); subsequent runs have a startup time of around 1s.

```
$ refcat.pyz

              ____           __
   ________  / __/________ _/ /_
  / ___/ _ \/ /_/ ___/ __ `/ __/
 / /  /  __/ __/ /__/ /_/ / /_
/_/   \___/_/  \___/\__,_/\__/

Command line entry point for running various data tasks.

General usage:

    $ refcat TASK

BASE: /bigger/.cache

BiblioRef                 KeyDistribution           RefsFatcatSortedKeys
BiblioRefFromJoin         RefCounter                RefsFatcatTitleLowerJoin
BiblioRefFuzzy            Refcat                    RefsKeyStats
CommonDOIs                RefsArxiv                 RefsPMCID
CommonTitles              RefsDOIs                  RefsPMID
CommonTitlesLower         RefsDOIsLower             RefsReleasesMerged
FatcatArxiv               RefsFatcatArxivJoin       RefsTitleFrequency
FatcatDOIs                RefsFatcatClusterVerify   RefsTitles
FatcatDOIsLower           RefsFatcatClusters        RefsTitlesLower
FatcatPMCID               RefsFatcatDOIJoin         RefsToRelease
FatcatPMID                RefsFatcatGroupJoin       ReleaseExportExpanded
FatcatTitles              RefsFatcatPMCIDJoin       URLList
FatcatTitlesLower         RefsFatcatPMIDJoin        URLTabs
Input                     RefsFatcatRanked
```
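
For readers unfamiliar with zipapps, the following is a minimal sketch of the general idea behind the packaging step above, using only the standard library `zipapp` module. It is an illustration, not a description of what `make refcat.pyz` does; the actual Makefile target may use a different tool and also has to bundle dependencies. The names `build_pyz`, `demo` and `toyapp` are hypothetical, and the interpreter path is an assumption.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_pyz(source_dir: Path, target: Path) -> Path:
    """Package a directory containing a __main__.py into a single .pyz file."""
    zipapp.create_archive(
        source_dir,
        target=target,
        interpreter="/usr/bin/env python3",  # shebang line; assumed interpreter
        compressed=True,  # available since Python 3.7
    )
    return target


def demo() -> str:
    """Build a toy archive in a temporary directory, run it, return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        app = Path(tmp) / "toyapp"  # hypothetical application directory
        app.mkdir()
        (app / "__main__.py").write_text('print("hello from pyz")\n')
        pyz = build_pyz(app, Path(tmp) / "toyapp.pyz")
        # Run the archive with the current interpreter, as the target host would.
        result = subprocess.run(
            [sys.executable, str(pyz)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(demo())
```

The resulting file is a regular zip archive with a shebang line prepended, which is why a single `rsync` of one file is enough to deploy it. Pure-Python code inside the archive is portable, but any compiled extension modules are built for a specific interpreter version, hence the caveat about matching Python versions on both machines.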