index : fatcat
path: root/python/fatcat_tools/importers/ingest.py
Commit message | Author | Age | Files | Lines
lint: simple, safe inline lint fixes | Bryan Newbold | 2021-11-02 | 1 | -6/+6
fix missing variable in fileset ingest | Bryan Newbold | 2021-11-02 | 1 | -2/+1
WIP: more fileset ingest | Bryan Newbold | 2021-10-18 | 1 | -13/+21
WIP: rel fixes | Bryan Newbold | 2021-10-14 | 1 | -6/+6
fileset ingest small tweaks | Bryan Newbold | 2021-10-14 | 1 | -21/+36
initial implementation of fileset ingest importers | Bryan Newbold | 2021-10-14 | 1 | -2/+223
new SPN web (html) importer | Bryan Newbold | 2021-10-01 | 1 | -26/+80
ingest importer behavior tweaks | Bryan Newbold | 2021-10-01 | 1 | -8/+8
more consistent and defensive lower-casing of DOIs | Bryan Newbold | 2021-06-23 | 1 | -0/+4
ingest: swap ingest and file checks, to result in clearer stats/counts of ski... | Bryan Newbold | 2021-06-03 | 1 | -2/+2
ingest: don't accept mag and s2 URLs | Bryan Newbold | 2021-06-03 | 1 | -4/+4
web ingest: terminal URL mismatch as skip, not assert | Bryan Newbold | 2020-12-30 | 1 | -1/+3
ingest: allow dblp imports | Bryan Newbold | 2020-12-23 | 1 | -1/+1
add dblp as an ingest source and identifier | Bryan Newbold | 2020-12-17 | 1 | -1/+2
ingest: allow doaj ingest responses | Bryan Newbold | 2020-12-17 | 1 | -1/+2
html ingest: small fixes to try_update() code path | Bryan Newbold | 2020-12-15 | 1 | -5/+5
html ingest: actual xhtml mimetype | Bryan Newbold | 2020-11-16 | 1 | -2/+2
html ingest: remaining implementation | Bryan Newbold | 2020-11-06 | 1 | -22/+19
ingest: progress on HTML ingest | Bryan Newbold | 2020-11-05 | 1 | -14/+30
ingest: initial 'web' worker implementation | Bryan Newbold | 2020-11-05 | 1 | -66/+258
ingest: whitelist -> allowlist | Bryan Newbold | 2020-11-05 | 1 | -3/+3
ingest: basic checks for ingest_type | Bryan Newbold | 2020-11-05 | 1 | -3/+29
lint (flake8) tool python files | Bryan Newbold | 2020-07-01 | 1 | -6/+1
ingest importer: check that stage is consistent with release | Bryan Newbold | 2020-05-26 | 1 | -0/+5
importers: clarify handling of ApiException | Bryan Newbold | 2020-05-22 | 1 | -0/+1
ingest importer: don't use glutton matches | Bryan Newbold | 2020-05-22 | 1 | -3/+3
ingest import: fix edit_extra path | Bryan Newbold | 2020-02-18 | 1 | -1/+1
ingest importer: edit_extra is a top-level key | Bryan Newbold | 2020-02-18 | 1 | -1/+1
ingest import: allow short version of corpus names | Bryan Newbold | 2020-02-18 | 1 | -0/+3
ingest importer: pass through link rel | Bryan Newbold | 2020-02-18 | 1 | -1/+6
check ingest_request_source existance for SPN as well as ingest | Bryan Newbold | 2020-02-06 | 1 | -0/+3
additional trusted link sources | Bryan Newbold | 2020-02-06 | 1 | -0/+3
add mag and s2 as trusted link sources | Bryan Newbold | 2020-02-06 | 1 | -1/+1
ingest worker: handle missing ingest_request_source | Bryan Newbold | 2020-02-06 | 1 | -0/+3
fix trivial typo in file importer | Bryan Newbold | 2020-01-20 | 1 | -1/+1
ingest: improve tests, support old ingest results | Bryan Newbold | 2020-01-15 | 1 | -3/+12
update ingest worker for schema tweaks | Bryan Newbold | 2020-01-15 | 1 | -8/+15
ingest: allow more sources to auto-import | Bryan Newbold | 2020-01-15 | 1 | -1/+2
importers: control update behavior with more-standard flag | Bryan Newbold | 2020-01-06 | 1 | -1/+1
allow arabesque backfill ingests for some source types | Bryan Newbold | 2019-12-24 | 1 | -0/+5
fix spn/ingest importer duplication check | Bryan Newbold | 2019-12-22 | 1 | -6/+8
add ingest import file collision protection | Bryan Newbold | 2019-12-13 | 1 | -0/+6
update ingest request schema | Bryan Newbold | 2019-12-13 | 1 | -2/+7
remove default mimetype from ingest-file importer | Bryan Newbold | 2019-12-13 | 1 | -2/+1
savepapernow result importer | Bryan Newbold | 2019-12-12 | 1 | -3/+64
add another ingest request source to whitelist | Bryan Newbold | 2019-12-10 | 1 | -2/+5
tweaks to file ingest importer | Bryan Newbold | 2019-12-03 | 1 | -3/+4
re-order ingest want() for better stats | Bryan Newbold | 2019-11-15 | 1 | -7/+10
project -> ingest_request_source | Bryan Newbold | 2019-11-15 | 1 | -6/+6
ingest importer fixes | Bryan Newbold | 2019-11-15 | 1 | -3/+4