path: root/python_hadoop/extraction_cdx_grobid.py
author: Bryan Newbold <bnewbold@archive.org> 2020-01-21 11:32:49 -0800
committer: Bryan Newbold <bnewbold@archive.org> 2020-01-21 11:32:51 -0800
commit: 20291471b34ea559d2ea5d45f3b05884e54d179a
tree: 772b58f3fe4091e30e9477e43351c1778f421e40 /python_hadoop/extraction_cdx_grobid.py
parent: 8b9acb1d31b4b8ae84a5133e947ca0a577cd98d8
persist grobid: actually, status_code is required
Instead of working around when missing, force it to exist but skip in database insert section. Disk mode still needs to check if blank.
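A minimal sketch of one possible reading of this message, not the actual sandcrawler code: the record must carry a `status_code` key, the database insert path skips records where it is blank, and the disk path makes its own blank check. The function names (`require_status_code`, `rows_for_database`, `should_write_to_disk`) and the record layout are hypothetical, chosen only for illustration.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def require_status_code(record: Record) -> Record:
    """Fail loudly when the key is absent instead of working around it."""
    if "status_code" not in record:
        raise KeyError("status_code is required")
    return record


def rows_for_database(batch: List[Record]) -> List[Record]:
    """Assumed database path: key must exist; blank values are skipped."""
    rows = []
    for record in batch:
        require_status_code(record)
        if record["status_code"] is None:
            continue  # present but blank: leave it out of the insert
        rows.append(record)
    return rows


def should_write_to_disk(record: Record) -> bool:
    """Assumed disk path: still checks for a blank value on its own."""
    require_status_code(record)
    return record["status_code"] not in (None, "")
```

Under these assumptions, a record with no `status_code` key raises immediately, while a record with the key present but blank is dropped quietly by both paths.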
Diffstat (limited to 'python_hadoop/extraction_cdx_grobid.py')
0 files changed, 0 insertions, 0 deletions