index : sandcrawler
path: root/python/sandcrawler/persist.py
| Commit message | Author | Date | Files | Lines |
|---|---|---|---|---|
| persist grobid: actually, status_code is required | Bryan Newbold | 2020-01-21 | 1 | -2/+9 |
| persist: work around GROBID timeouts with no status_code | Bryan Newbold | 2020-01-21 | 1 | -2/+2 |
| persist worker: implement updated ingest result semantics | Bryan Newbold | 2020-01-15 | 1 | -11/+16 |
| ingest persist skips 'existing' ingest results | Bryan Newbold | 2020-01-14 | 1 | -0/+3 |
| handle grobid2json errors in calling code instead | Bryan Newbold | 2020-01-02 | 1 | -1/+7 |
| db: move duplicate row filtering into DB insert helpers | Bryan Newbold | 2020-01-02 | 1 | -15/+1 |
| remove unused filter in grobid worker | Bryan Newbold | 2020-01-02 | 1 | -1/+0 |
| fix dict typo | Bryan Newbold | 2020-01-02 | 1 | -1/+1 |
| improvements to grobid persist worker | Bryan Newbold | 2020-01-02 | 1 | -13/+16 |
| filter ingest results to not have key conflicts within batch | Bryan Newbold | 2020-01-02 | 1 | -1/+16 |
| db: fancy insert/update separation using postgres xmax | Bryan Newbold | 2020-01-02 | 1 | -9/+15 |
| add PersistGrobidDiskWorker | Bryan Newbold | 2020-01-02 | 1 | -0/+33 |
| flush out minio helper, add to grobid persist | Bryan Newbold | 2020-01-02 | 1 | -9/+29 |
| implement counts properly for persist workers | Bryan Newbold | 2020-01-02 | 1 | -15/+19 |
| start work on persist workers and tool | Bryan Newbold | 2020-01-02 | 1 | -0/+223 |