index: sandcrawler
path: root/python/scripts
| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| cdx_collection.py: minor lint issue | Bryan Newbold | 2021-10-04 | 1 | -1/+1 |
| another lowercase DOI in an (unused?) script | Bryan Newbold | 2021-07-13 | 1 | -1/+1 |
| add cdx_collection.py python script (from scratch repo) | Bryan Newbold | 2021-05-04 | 1 | -0/+80 |
| doaj ingest request updates (from prod) | Bryan Newbold | 2021-01-05 | 1 | -1/+5 |
| blacklist -> denylist | Bryan Newbold | 2020-11-10 | 1 | -9/+9 |
| DOAJ and HTML ingest tweaks from QA run | Bryan Newbold | 2020-11-10 | 1 | -1/+1 |
| basic DOAJ ingest request conversion script | Bryan Newbold | 2020-11-08 | 1 | -0/+139 |
| poppler: correct RGBA buffer endian-ness | Bryan Newbold | 2020-06-25 | 1 | -1/+1 |
| pdf_thumbnail script: demonstrate PDF thumbnail generation | Bryan Newbold | 2020-06-16 | 1 | -0/+35 |
| first iteration of oai2ingestrequest script | Bryan Newbold | 2020-05-05 | 1 | -0/+137 |
| COVID-19 chinese paper ingest | Bryan Newbold | 2020-04-15 | 1 | -0/+83 |
| unpaywall2ingestrequest: canonicalize URL | Bryan Newbold | 2020-04-07 | 1 | -1/+9 |
| use local env in python scripts. Without this correct/canonical shebang invocation, virtualenvs (pipenv) don't work. | Bryan Newbold | 2020-03-10 | 3 | -3/+3 |
| ingestrequest_row2json: skip on unicode errors | Bryan Newbold | 2020-03-05 | 1 | -1/+4 |
| unpaywall2ingestrequest transform script | Bryan Newbold | 2020-02-18 | 1 | -0/+103 |
| add ingestrequest_row2json.py | Bryan Newbold | 2020-02-05 | 1 | -0/+48 |
| arabesque2ingestrequest: ingest type flag | Bryan Newbold | 2020-01-14 | 1 | -1/+4 |
| basic arabesque2ingestrequest script | Bryan Newbold | 2019-12-24 | 1 | -0/+69 |
| grobid_affiliations fix from prod, and usage example | Bryan Newbold | 2019-10-02 | 1 | -0/+5 |
| deliver_dumpgrobid_to_s3: typo fix from old prod | Bryan Newbold | 2019-10-02 | 1 | -3/+4 |
| grobid affiliation extractor (script) | Bryan Newbold | 2019-10-02 | 1 | -0/+47 |
| move a bunch of random old scripts to subdir | Bryan Newbold | 2019-09-25 | 9 | -0/+1088 |
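
The 2020-03-10 commit ("use local env in python scripts") notes that virtualenvs (pipenv) do not work without the canonical shebang. The log does not show the exact line that was changed, so the sketch below assumes the usual `#!/usr/bin/env python3` form. The function name `interpreter_path` is hypothetical and is there only to make the effect visible.

```python
#!/usr/bin/env python3
# Assumed shebang form: `env` looks up `python3` on PATH, so an activated
# virtualenv (for example one entered through pipenv) supplies the interpreter.
# A hard-coded path such as /usr/bin/python3 would bypass the virtualenv.
import sys


def interpreter_path() -> str:
    """Return the path of the interpreter that is running this script."""
    return sys.executable


if __name__ == "__main__":
    # Inside a virtualenv this prints a path under the environment's
    # bin/ directory rather than the system interpreter.
    print(interpreter_path())
```

Running the file directly (after `chmod +x`) inside and outside the virtualenv should print two different interpreter paths, which is a quick way to check that the shebang respects the local environment.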