diff --git a/python/notes/version_3.md b/python/notes/version_3.md
index f828ee8..f9b6928 100644
--- a/python/notes/version_3.md
+++ b/python/notes/version_3.md
@@ -316,3 +316,21 @@ A subtle bug: a doi in refs ends with tab:
```
10.1002/andp.19975090102\t
```
+
+----
+
+## URL lookup via pig
+
+* failed after almost a week (about 5.75 days wall-clock, per `real` below); map spill
+
+```
+2021-05-21 15:04:25,507 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 58% complete
+^C2021-05-24 15:22:57,073 [Thread-6] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at ia802401.us.archive.org/207.241.228.181:6932
+2021-05-24 15:22:58,245 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 64% complete
+2021-05-24 15:22:58,778 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 71% complete
+2021-05-24 15:23:02,763 [Thread-6] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Job job_pigexec_0 killed
+
+real 8276m35.071s
+user 425m6.748s
+sys 52m21.012s
+```
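
The `real`/`user`/`sys` figures above are easier to compare once converted out of minutes. A minimal sketch, assuming the lines follow the usual bash `time` layout (`<label> <minutes>m<seconds>s`); the helper name `parse_time_line` is hypothetical and not part of the notes:

```python
import re
from datetime import timedelta

# Assumed layout of one bash `time` line, e.g. "real 8276m35.071s".
TIME_LINE = re.compile(r"^(real|user|sys)\s+(\d+)m([\d.]+)s$")


def parse_time_line(line: str) -> tuple:
    """Return (label, timedelta) for one line of `time` output (hypothetical helper)."""
    m = TIME_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"not a time line: {line!r}")
    label, minutes, seconds = m.groups()
    return label, timedelta(minutes=int(minutes), seconds=float(seconds))


# The three timing lines copied from the log above.
timings = dict(
    parse_time_line(line)
    for line in ["real 8276m35.071s", "user 425m6.748s", "sys 52m21.012s"]
)

# Wall-clock duration in days, and the fraction of it spent on local CPU.
real_days = timings["real"] / timedelta(days=1)
cpu_share = (timings["user"] + timings["sys"]) / timings["real"]
print(f"wall clock: {real_days:.2f} days, local CPU share: {cpu_share:.1%}")
```

The low local CPU share is expected for a client process that mostly waits on the cluster, so it says little about where the job itself spent its time.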