author | Bryan Newbold <bnewbold@robocracy.org> | 2021-11-29 15:02:27 -0800 |
---|---|---|
committer | Bryan Newbold <bnewbold@robocracy.org> | 2021-11-29 15:02:27 -0800 |
commit | 7c6afa0a21883dc8037f3d021246db24eef39b41 (patch) | |
tree | 3fa7c1e595248a46e88ea62c2f9f70106186b0fe /extra/cleanups/check_issnl.sh | |
parent | c32154f2875a7fb9aac727013e1475cdd811e180 (diff) | |
clean up extra/ folder a bit
Diffstat (limited to 'extra/cleanups/check_issnl.sh')
-rwxr-xr-x | extra/cleanups/check_issnl.sh | 15 |
1 files changed, 15 insertions, 0 deletions
diff --git a/extra/cleanups/check_issnl.sh b/extra/cleanups/check_issnl.sh
new file mode 100755
index 00000000..a28695e7
--- /dev/null
+++ b/extra/cleanups/check_issnl.sh
@@ -0,0 +1,15 @@
+#!/usr/bin/env bash
+
+set -e -u -o pipefail
+
+export LC_ALL=C
+
+CONTAINER_DUMP=$1
+
+zcat $CONTAINER_DUMP \
+    | jq '[.issnl, .ident] | @tsv' -r \
+    | sort -S 4G \
+    | uniq -D -w 9 \
+    > issnl_ident.dupes.tsv
+
+wc -l issnl_ident.dupes.tsv >> counts.txt
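
The duplicate check in this script relies on `sort` followed by `uniq -D -w 9`, which compares only the first 9 characters of each line (the length of an ISSN-L such as `1234-5678`) and prints every line in a group that shares them. A minimal sketch of that step on made-up rows follows; the `find_dupes` helper, the sample ISSN-Ls, and the `ident_*` values are hypothetical and not part of the commit, and `-D` and `-w` are assumed to come from GNU coreutils `uniq`.

```shell
#!/usr/bin/env bash
set -e -u -o pipefail
export LC_ALL=C

# Hypothetical helper (not in the commit): reads "issnl<TAB>ident" rows on
# stdin and prints every row whose first 9 characters match another row's.
find_dupes() {
    # sort makes equal ISSN-Ls adjacent; uniq -D prints all members of each
    # repeated group, and -w 9 limits the comparison to the ISSN-L prefix.
    sort | uniq -D -w 9
}

# Made-up sample rows: two idents share the ISSN-L 1234-5678.
sample=$(printf '%s\t%s\n' \
    '1234-5678' 'ident_a' \
    '2049-3630' 'ident_b' \
    '1234-5678' 'ident_c' \
    '0001-0002' 'ident_d')

dupes=$(printf '%s\n' "$sample" | find_dupes)
printf '%s\n' "$dupes"
```

Only the two rows for `1234-5678` should be printed, since the other ISSN-Ls appear once each.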