path: root/python/sandcrawler/db.py
Commit message | Author | Age | Files | Lines
* tweak html_meta SQL schema | Bryan Newbold | 2020-11-03 | 1 | -12/+19
* html: start on SQL table | Bryan Newbold | 2020-11-03 | 1 | -0/+44
* fixes and tweaks from testing locally | Bryan Newbold | 2020-06-17 | 1 | -0/+47
* pdf_trio persist fixes from prod | Bryan Newbold | 2020-02-19 | 1 | -4/+4
* include rel and oa_status in ingest request 'extra' | Bryan Newbold | 2020-02-18 | 1 | -1/+1
* pdftrio basic python code | Bryan Newbold | 2020-02-12 | 1 | -0/+57
* fix bug where ingest_request extra fields not persisted | Bryan Newbold | 2020-02-05 | 1 | -1/+2
* persist grobid: actually, status_code is required | Bryan Newbold | 2020-01-21 | 1 | -1/+1
* persist: work around GROBID timeouts with no status_code | Bryan Newbold | 2020-01-21 | 1 | -1/+1
* persist: fix dupe field copying | Bryan Newbold | 2020-01-15 | 1 | -1/+8
* persist worker: implement updated ingest result semantics | Bryan Newbold | 2020-01-15 | 1 | -1/+1
* small fixups to SandcrawlerPostgrestClient | Bryan Newbold | 2020-01-14 | 1 | -1/+10
* db: move duplicate row filtering into DB insert helpers | Bryan Newbold | 2020-01-02 | 1 | -0/+25
* fix DB import counting | Bryan Newbold | 2020-01-02 | 1 | -4/+5
* fix small errors found by pylint | Bryan Newbold | 2020-01-02 | 1 | -1/+1
* db: fancy insert/update separation using postgres xmax | Bryan Newbold | 2020-01-02 | 1 | -15/+30
* improve DB helpers | Bryan Newbold | 2020-01-02 | 1 | -26/+81
* start work on DB connector and minio client | Bryan Newbold | 2020-01-02 | 1 | -0/+141