sandcrawler
path: root/python/ingest_file.py
Commit message                                    Author         Age         Files  Lines
ingest tool: more ingest control args             Bryan Newbold  2020-11-08  1      -1/+10
ingest tool: flag for HTML quick mode (CDX-only)  Bryan Newbold  2020-11-08  1      -0/+4
ingest tool: consistency about ingest-type arg    Bryan Newbold  2020-11-08  1      -2/+2
better default CLI output (show usage)            Bryan Newbold  2020-10-29  1      -1/+1
ingest_file: --no-spn2 flag for single command    Bryan Newbold  2020-03-10  1      -1/+6
ingest_tool: force-recrawl arg                    Bryan Newbold  2020-03-05  1      -0/+5
cli: allow multiple ingest single types           Bryan Newbold  2020-01-09  1      -3/+4
refactor: sort keys in JSON output                Bryan Newbold  2019-12-18  1      -3/+3
refactor: improve argparse usage                  Bryan Newbold  2019-12-18  1      -4/+8
update ingest proposal source/link naming         Bryan Newbold  2019-12-13  1      -1/+1
rename FileIngestWorker                           Bryan Newbold  2019-11-13  1      -5/+6
more progress on file ingest                      Bryan Newbold  2019-11-13  1      -1/+2
much progress on file ingest path                 Bryan Newbold  2019-10-22  1      -320/+4
we do actually want consolidateHeader=2, not 1    Bryan Newbold  2019-10-04  1      -1/+1
commit WIP on file ingest script                  Bryan Newbold  2019-09-23  1      -0/+386