index : fatcat
path: root/python/fatcat_tools/importers/ingest.py
Commit message | Author | Age | Files | Lines
web ingest: terminal URL mismatch as skip, not assert | Bryan Newbold | 2020-12-30 | 1 | -1/+3
ingest: allow dblp imports | Bryan Newbold | 2020-12-23 | 1 | -1/+1
add dblp as an ingest source and identifier | Bryan Newbold | 2020-12-17 | 1 | -1/+2
ingest: allow doaj ingest responses | Bryan Newbold | 2020-12-17 | 1 | -1/+2
html ingest: small fixes to try_update() code path | Bryan Newbold | 2020-12-15 | 1 | -5/+5
html ingest: actual xhtml mimetype | Bryan Newbold | 2020-11-16 | 1 | -2/+2
html ingest: remaining implementation | Bryan Newbold | 2020-11-06 | 1 | -22/+19
ingest: progress on HTML ingest | Bryan Newbold | 2020-11-05 | 1 | -14/+30
ingest: initial 'web' worker implementation | Bryan Newbold | 2020-11-05 | 1 | -66/+258
ingest: whitelist -> allowlist | Bryan Newbold | 2020-11-05 | 1 | -3/+3
ingest: basic checks for ingest_type | Bryan Newbold | 2020-11-05 | 1 | -3/+29
lint (flake8) tool python files | Bryan Newbold | 2020-07-01 | 1 | -6/+1
ingest importer: check that stage is consistent with release | Bryan Newbold | 2020-05-26 | 1 | -0/+5
importers: clarify handling of ApiException | Bryan Newbold | 2020-05-22 | 1 | -0/+1
ingest importer: don't use glutton matches | Bryan Newbold | 2020-05-22 | 1 | -3/+3
ingest import: fix edit_extra path | Bryan Newbold | 2020-02-18 | 1 | -1/+1
ingest importer: edit_extra is a top-level key | Bryan Newbold | 2020-02-18 | 1 | -1/+1
ingest import: allow short version of corpus names | Bryan Newbold | 2020-02-18 | 1 | -0/+3
ingest importer: pass through link rel | Bryan Newbold | 2020-02-18 | 1 | -1/+6
check ingest_request_source existance for SPN as well as ingest | Bryan Newbold | 2020-02-06 | 1 | -0/+3
additional trusted link sources | Bryan Newbold | 2020-02-06 | 1 | -0/+3
add mag and s2 as trusted link sources | Bryan Newbold | 2020-02-06 | 1 | -1/+1
ingest worker: handle missing ingest_request_source | Bryan Newbold | 2020-02-06 | 1 | -0/+3
fix trivial typo in file importer | Bryan Newbold | 2020-01-20 | 1 | -1/+1
ingest: improve tests, support old ingest results | Bryan Newbold | 2020-01-15 | 1 | -3/+12
update ingest worker for schema tweaks | Bryan Newbold | 2020-01-15 | 1 | -8/+15
ingest: allow more sources to auto-import | Bryan Newbold | 2020-01-15 | 1 | -1/+2
importers: control update behavior with more-standard flag | Bryan Newbold | 2020-01-06 | 1 | -1/+1
allow arabesque backfill ingests for some source types | Bryan Newbold | 2019-12-24 | 1 | -0/+5
fix spn/ingest importer duplication check | Bryan Newbold | 2019-12-22 | 1 | -6/+8
add ingest import file collision protection | Bryan Newbold | 2019-12-13 | 1 | -0/+6
update ingest request schema | Bryan Newbold | 2019-12-13 | 1 | -2/+7
remove default mimetype from ingest-file importer | Bryan Newbold | 2019-12-13 | 1 | -2/+1
savepapernow result importer | Bryan Newbold | 2019-12-12 | 1 | -3/+64
add another ingest request source to whitelist | Bryan Newbold | 2019-12-10 | 1 | -2/+5
tweaks to file ingest importer | Bryan Newbold | 2019-12-03 | 1 | -3/+4
re-order ingest want() for better stats | Bryan Newbold | 2019-11-15 | 1 | -7/+10
project -> ingest_request_source | Bryan Newbold | 2019-11-15 | 1 | -6/+6
ingest importer fixes | Bryan Newbold | 2019-11-15 | 1 | -3/+4
more ingest importer comments and counts | Bryan Newbold | 2019-11-15 | 1 | -1/+28
ingest file result importer | Bryan Newbold | 2019-11-15 | 1 | -0/+134