Diffstat (limited to 'python/notes/mag_notes.md')
-rw-r--r-- python/notes/mag_notes.md 21
1 files changed, 21 insertions, 0 deletions
diff --git a/python/notes/mag_notes.md b/python/notes/mag_notes.md
new file mode 100644
index 0000000..6341676
--- /dev/null
+++ b/python/notes/mag_notes.md
@@ -0,0 +1,21 @@
+# MAG 2020 Notes
+
+* /magna/data/mag-2020-06-25
+* 1637615789 rows in PaperReferences.txt.gz
+
+```
+$ unpigz -c PaperReferences.txt.gz | pv -l | wc -l
+1637615789
+```
+
+* 238M rows in the papers table (238938563)
+* only 3516356 rows with a non-empty DOI column (about 1.5% of papers)?
+
+```
+$ zstdcat -T0 Papers.txt.zst | pv -l | LC_ALL=C cut -f3 | LC_ALL=C grep -v ^$ > mag_doi_list.txt
+ 238M 0:06:12 [ 641k/s]
+
+$ wc -l mag_doi_list.txt
+3516356 mag_doi_list.txt
+```
+
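
The DOI count in the notes comes from `cut -f3 | grep -v ^$`, that is, keeping the third tab-separated field and dropping empty values. A minimal Python sketch of the same check might look like the following, assuming the DOI really is the third tab-separated column as the `cut -f3` call implies. The function name `count_nonempty_column` and the sample rows are hypothetical and used only for illustration.

```python
import io


def count_nonempty_column(lines, column=2, sep="\t"):
    """Return (total_rows, rows_with_value) for a zero-based column index.

    Rows that are too short to contain the column count as empty,
    which matches what `cut -f3 | grep -v ^$` would drop.
    """
    total = 0
    nonempty = 0
    for line in lines:
        total += 1
        fields = line.rstrip("\n").split(sep)
        if len(fields) > column and fields[column]:
            nonempty += 1
    return total, nonempty


# Tiny made-up sample (hypothetical rows, not real MAG data).
sample = io.StringIO(
    "1\t100\t10.1000/a\tJournal\n"
    "2\t200\t\tJournal\n"
    "3\t300\t10.1000/b\tConference\n"
)
total, with_doi = count_nonempty_column(sample)
print(total, with_doi, f"{with_doi / total:.1%}")
```

Because the function accepts any iterable of lines, it could also be fed `sys.stdin` at the end of a `zstdcat -T0 Papers.txt.zst | ...` pipeline, which would report the row count and the DOI count in a single pass.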