path: root/notes/bootstrap/import_timing_20190129.txt
author Bryan Newbold <bnewbold@robocracy.org> 2019-02-05 12:15:21 -0800
committer Bryan Newbold <bnewbold@robocracy.org> 2019-02-05 12:15:21 -0800
commit d08f8660637d410b3baa029acd3bab7eb6e8e71e
tree 6cd50144e0ff9be3da70be046bf85a62ab02ad02 /notes/bootstrap/import_timing_20190129.txt
parent 05942d40b29f397372b9abcd1057e12c72e93743
final bootstrap notes
Diffstat (limited to 'notes/bootstrap/import_timing_20190129.txt')
-rw-r--r-- notes/bootstrap/import_timing_20190129.txt | 10
1 files changed, 10 insertions, 0 deletions
diff --git a/notes/bootstrap/import_timing_20190129.txt b/notes/bootstrap/import_timing_20190129.txt
index 6d635f92..30b7bdbf 100644
--- a/notes/bootstrap/import_timing_20190129.txt
+++ b/notes/bootstrap/import_timing_20190129.txt
@@ -6,6 +6,8 @@ Made a number of changes since yesterday's import, so won't be surprised if run
in to problems. Plan is to make any fixes and push through to the end to turn
up any additional issues/bugs, then iterate yet again if needed.
+NOTE: this import ended up being abandoned (too slow) in lieu of 2019-01-30.
+
## Service up/down
sudo service fatcat-web stop
@@ -108,6 +110,14 @@ up any additional issues/bugs, then iterate yet again if needed.
would take... about an hour to restart, might save 20+ hours, might waste 14?
+ Counter({'total': 5005785, 'insert': 4319312, 'exists': 457819, 'skip': 228654, 'update': 0})
+ 531544.60user 13597.32system 60:38:43elapsed 249%CPU (0avgtext+0avgdata 448748maxresident)k
+ 124037840inputs+395235552outputs (140major+41973732minor)pagefaults 0swaps
+
+ real 3638m43.712s => 60 hours (!!!)
+ user 8944m37.944s
+ sys 232m25.200s
+
export FATCAT_AUTH_SANDCRAWLER="..."
export FATCAT_API_AUTH_TOKEN=$FATCAT_AUTH_SANDCRAWLER
time zcat /srv/fatcat/datasets/ia_papers_manifest_2018-01-25.matched.json.gz | pv -l | time parallel -j12 --round-robin --pipe ./fatcat_import.py --batch-size 50 matched --bezerk-mode -
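
The timing figures added in this commit can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the fatcat codebase: the helper name `parse_time_field` and the variable names are hypothetical, and the numbers are copied from the `Counter(...)` line and the `real` / `user` / `system` values in the notes above. It converts the wall-clock time to hours, confirms that the counter categories sum to the total, and derives an approximate overall throughput.

```python
import re

def parse_time_field(value):
    """Convert a bash `time` field such as '3638m43.712s' to seconds.

    Hypothetical helper for illustration only.
    """
    match = re.fullmatch(r"(\d+)m([\d.]+)s", value)
    if match is None:
        raise ValueError(f"unrecognized time field: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

# Values copied from the notes in this diff.
counts = {"total": 5005785, "insert": 4319312, "exists": 457819,
          "skip": 228654, "update": 0}
real_seconds = parse_time_field("3638m43.712s")

# Wall-clock duration in hours (the notes round this to "60 hours").
real_hours = real_seconds / 3600

# The per-outcome counts should account for every record seen.
accounted = sum(v for k, v in counts.items() if k != "total")

# Overall records processed per wall-clock second, across all workers.
records_per_second = counts["total"] / real_seconds

# CPU utilization as GNU time reports it: (user + system) / elapsed.
# 60:38:43 elapsed is 60 h 38 min 43 s.
elapsed_seconds = 60 * 3600 + 38 * 60 + 43
cpu_percent = int(100 * (531544.60 + 13597.32) / elapsed_seconds)

print(f"{real_hours:.2f} hours, {records_per_second:.1f} records/s, "
      f"{cpu_percent}% CPU")
```

Under these assumptions the run works out to roughly 23 records per second in aggregate, which is consistent with the note that this import was abandoned as too slow.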