author | Martin Czygan <martin.czygan@gmail.com> | 2021-06-01 15:49:17 +0200 |
---|---|---|
committer | Martin Czygan <martin.czygan@gmail.com> | 2021-06-01 15:49:17 +0200 |
commit | 6833ddc7ac15a17961264ccb8df433e8d4fa1f07 (patch) | |
tree | c7778f6dd6e65a267941519dece34d548c2a097a /python | |
parent | c03efb12d25429b51c68eff5c8c6d21b2d96e023 (diff) | |
note on journal name resolution
Diffstat (limited to 'python')
-rw-r--r-- | python/notes/version_4.md | 11 |
1 files changed, 11 insertions, 0 deletions
diff --git a/python/notes/version_4.md b/python/notes/version_4.md
index 0d3861e..c96b084 100644
--- a/python/notes/version_4.md
+++ b/python/notes/version_4.md
@@ -458,3 +458,14 @@ different year /works/OL13199655W ilga2kj4nnaqdh4rmogbsgbgbe different year /works/OL13199655W 5ujpef3vjzhkvmse6ovey2q2zi 1000000delinquents 1000000 delinquents 1,000,000 Delinquents
 ```
+## Journal name augmentation
+
+In ~160M unmatched refs (release format) we could resolve 14M container names, via `skate-resolve-journal-name`.
+
+```
+$ zstdcat date-2021-05-06.tsv.zst | skate-resolve-journal-name -B -A /magna/data/jabbrev.json | cut -f 2 | pv -l | LC_ALL=C grep -cF resolved_container_name
+2021/06/01 13:02:20 found 27178 abbreviation mappings
+ 160M 0:14:49 [ 180k/s] [ <=> ]
+14090677
+```
+
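
The note above resolves container names through an abbreviation mapping. The following is a minimal Python sketch of that idea, not the actual `skate-resolve-journal-name` implementation. It assumes that the mapping is a flat abbreviation-to-full-name dictionary (the real layout of `jabbrev.json` is not shown in the commit), that references carry a `container_name` field, and that matching ignores case, periods and extra whitespace. The helper names `normalize`, `build_lookup` and `resolve` and the sample data are hypothetical; only the `resolved_container_name` key is taken from the command above.

```python
import json
import re


def normalize(name: str) -> str:
    """Lowercase, drop periods and collapse whitespace (assumed matching rule)."""
    return re.sub(r"\s+", " ", name.replace(".", " ")).strip().lower()


def build_lookup(mapping: dict) -> dict:
    """Index the abbreviation mapping by normalized abbreviation."""
    return {normalize(abbrev): full for abbrev, full in mapping.items()}


def resolve(record: dict, lookup: dict) -> dict:
    """Return a copy with resolved_container_name set if the name is a known abbreviation."""
    name = record.get("container_name")
    if not name:
        return record
    full = lookup.get(normalize(name))
    if full is None:
        return record
    out = dict(record)
    out["resolved_container_name"] = full
    return out


# Hypothetical sample data; the real mapping reportedly has 27178 entries.
mapping = {
    "Phys. Rev. Lett.": "Physical Review Letters",
    "J. Chem. Phys.": "The Journal of Chemical Physics",
}
lookup = build_lookup(mapping)
refs = [
    {"container_name": "phys rev lett"},
    {"container_name": "Unknown Journal"},
    {},
]
resolved = [resolve(r, lookup) for r in refs]
resolved_count = sum("resolved_container_name" in r for r in resolved)
print(resolved_count)
print(json.dumps(resolved[0]))
```

Counting records that carry `resolved_container_name` mirrors the `grep -cF` step in the pipeline. With the reported 14090677 hits against roughly 160M lines, the resolved share comes to about 8.8%, given that the 160M figure is itself rounded.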