author | Bryan Newbold <bnewbold@archive.org> | 2021-12-17 13:19:32 -0800 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2021-12-17 13:19:32 -0800 |
commit | fa557a90482cfed59564173e442d9375b959ee8b | |
tree | e5b1c2a3460e85be4dfd336453fc79b508b2f7a1 /extra/wikipedia/README.md | |
parent | 1d6589ed58879206c4507d08b25ab09e859d34ee | |
update stats from 2021-12-01 run
Diffstat (limited to 'extra/wikipedia/README.md')
-rw-r--r-- | extra/wikipedia/README.md | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/extra/wikipedia/README.md b/extra/wikipedia/README.md
index 8cfdfc0..59480a7 100644
--- a/extra/wikipedia/README.md
+++ b/extra/wikipedia/README.md
@@ -43,7 +43,8 @@ Within a virtualenv, use `parallel` to process like:
 
 This will output JSON lines, one line per article, with the article title,
 revision, site name, and any extracted references in a sub-array (of JSON
-objects).
+objects). As of December 2021, it takes about 17 hours on a large machine, with
+the above command.
 
 ## Prior Work
 