From 6833ddc7ac15a17961264ccb8df433e8d4fa1f07 Mon Sep 17 00:00:00 2001
From: Martin Czygan
Date: Tue, 1 Jun 2021 15:49:17 +0200
Subject: note on journal name resolution

---
 python/notes/version_4.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

(limited to 'python')

diff --git a/python/notes/version_4.md b/python/notes/version_4.md
index 0d3861e..c96b084 100644
--- a/python/notes/version_4.md
+++ b/python/notes/version_4.md
@@ -458,3 +458,14 @@ different year /works/OL13199655W ilga2kj4nnaqdh4rmogbsgbgbe
 different year /works/OL13199655W 5ujpef3vjzhkvmse6ovey2q2zi
 1000000delinquents 1000000 delinquents 1,000,000 Delinquents
 ```
+## Journal name augmentation
+
+In ~160M unmatched refs (release format) we could resolve 14M container names, via `skate-resolve-journal-name`.
+
+```
+$ zstdcat date-2021-05-06.tsv.zst | skate-resolve-journal-name -B -A /magna/data/jabbrev.json | cut -f 2 | pv -l | LC_ALL=C grep -cF resolved_container_name
+2021/06/01 13:02:20 found 27178 abbreviation mappings
+ 160M 0:14:49 [ 180k/s] [ <=> ]
+14090677
+```
+
--
cgit v1.2.3
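
The note added by this patch reports 14,090,677 of ~160M unmatched refs (roughly 9%) gaining a resolved container name. For readers unfamiliar with the idea, here is a minimal Python sketch of abbreviation-based journal name resolution. It is not the `skate-resolve-journal-name` implementation. The mapping layout (abbreviation to full name), the `normalize` rule, the placement of `resolved_container_name` under `extra`, and all function names are assumptions made for illustration. The only details taken from the patch are that the release document sits in the second TSV column and that resolved documents contain the string `resolved_container_name`.

```python
import json


def normalize(name):
    """Lowercase, treat periods as spaces, collapse whitespace (assumed rule)."""
    return " ".join(name.lower().replace(".", " ").split())


def load_abbreviations(mapping):
    """Build a lookup keyed by normalized abbreviation.

    Assumes `mapping` has the shape {abbreviation: full journal name};
    the real jabbrev.json layout may differ.
    """
    return {normalize(abbrev): full for abbrev, full in mapping.items()}


def resolve_container_name(release, lookup):
    """Return the release, with the full name attached if the abbreviation is known.

    Hypothetical placement: the resolved name goes into `extra`.
    The input dict is never mutated.
    """
    name = release.get("container_name")
    if not name:
        return release
    full = lookup.get(normalize(name))
    if full is None:
        return release
    out = dict(release)
    out["extra"] = dict(release.get("extra") or {}, resolved_container_name=full)
    return out


def process_lines(lines, lookup):
    """Rewrite column 2 (the release JSON) of each TSV line, keeping other columns."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip malformed lines
        release = json.loads(fields[1])
        fields[1] = json.dumps(resolve_container_name(release, lookup))
        yield "\t".join(fields)
```

Normalizing both the mapping keys and the incoming names lets variants such as `J. Biol. Chem.` and `J Biol Chem` hit the same entry, at the cost of occasionally merging abbreviations that differ only in punctuation.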