fatcat
path: root/python/fatcat_tools/importers
| Commit message | Author | Date | Files | Lines (-/+) |
|---|---|---|---|---|
| python impl | Bryan Newbold | 2019-05-14 | 6 | -16/+16 |
| python: impl size_bytes -> size | Bryan Newbold | 2019-05-13 | 1 | -1/+1 |
| importer code updates | Bryan Newbold | 2019-05-13 | 4 | -3/+18 |
| partial python impl of ext_id and release_stage refactors | Bryan Newbold | 2019-05-13 | 3 | -15/+20 |
| add limits to match importers | Bryan Newbold | 2019-04-23 | 3 | -2/+27 |
| archive.org isn't really a repository | Bryan Newbold | 2019-04-22 | 1 | -1/+3 |
| editgroup description override | Bryan Newbold | 2019-04-22 | 1 | -2/+2 |
| arabesque importer does require timestamp/wayback | Bryan Newbold | 2019-04-22 | 1 | -0/+3 |
| matched importer shouldn't require wayback | Bryan Newbold | 2019-04-22 | 1 | -5/+7 |
| handle API 400 in arabesque import (invalid extid) | Bryan Newbold | 2019-04-19 | 1 | -7/+14 |
| fix arabesque importer crawl_id None bug | Bryan Newbold | 2019-04-18 | 1 | -1/+1 |
| mechanism to not double-update entities | Bryan Newbold | 2019-04-18 | 2 | -1/+9 |
| minor arabesque tweaks | Bryan Newbold | 2019-04-18 | 1 | -0/+2 |
| update URL rel list | Bryan Newbold | 2019-04-18 | 1 | -1/+10 |
| arabesque importer does fewer updates | Bryan Newbold | 2019-04-18 | 1 | -1/+8 |
| arabesque importer | Bryan Newbold | 2019-04-18 | 1 | -0/+165 |
| early version of arabesque importer | Bryan Newbold | 2019-04-12 | 1 | -0/+1 |
| add SqlitePusher importer option | Bryan Newbold | 2019-04-12 | 2 | -1/+21 |
| fix cdl_dash_dat license_slug | Bryan Newbold | 2019-03-19 | 1 | -7/+3 |
| importer for CDL/DASH dat pilot dweb datasets | Bryan Newbold | 2019-03-19 | 2 | -0/+200 |
| new importer: wayback_static | Bryan Newbold | 2019-03-19 | 2 | -0/+237 |
| bunch of lint/whitespace cleanups | Bryan Newbold | 2019-02-22 | 3 | -5/+3 |
| better/additional crossref license lookups | Bryan Newbold | 2019-02-14 | 1 | -20/+58 |
| crossref: import subtitle as str, not list[str] | Bryan Newbold | 2019-02-14 | 1 | -0/+2 |
| don't print missing DOIs, just count | Bryan Newbold | 2019-02-05 | 1 | -1/+3 |
| add some missing LICENSE_SLUG_MAP | Bryan Newbold | 2019-02-05 | 1 | -1/+4 |
| yet another required field bug | Bryan Newbold | 2019-01-29 | 1 | -4/+5 |
| fix null name for container (required) | Bryan Newbold | 2019-01-29 | 1 | -1/+5 |
| tweaks to GROBID metadata import | Bryan Newbold | 2019-01-29 | 1 | -3/+2 |
| crossref import tweaks/fixes | Bryan Newbold | 2019-01-29 | 1 | -7/+9 |
| fix bug in clean() resulting in many consistency check fails | Bryan Newbold | 2019-01-29 | 2 | -12/+12 |
| fix refs extra ordering bug | Bryan Newbold | 2019-01-29 | 1 | -6/+6 |
| pass through kwargs (fixes bezerk imports) | Bryan Newbold | 2019-01-29 | 5 | -5/+10 |
| ensure raw_name is not stub | Bryan Newbold | 2019-01-29 | 1 | -1/+4 |
| ensure abstracts aren't stubs | Bryan Newbold | 2019-01-29 | 1 | -2/+3 |
| add stub parse_record() to make pylint happy | Bryan Newbold | 2019-01-28 | 1 | -0/+4 |
| fix title length checks in crossref | Bryan Newbold | 2019-01-28 | 1 | -2/+2 |
| fix rel/url order swap | Bryan Newbold | 2019-01-28 | 1 | -1/+1 |
| don't allow empty or single-character clean strings | Bryan Newbold | 2019-01-28 | 1 | -1/+1 |
| filter short/stub original_title | Bryan Newbold | 2019-01-28 | 1 | -3/+7 |
| many fixes in GROBID importer | Bryan Newbold | 2019-01-28 | 1 | -14/+10 |
| fix GROBID null/short abstract additions | Bryan Newbold | 2019-01-28 | 1 | -1/+2 |
| enforce title len>1 for release imports | Bryan Newbold | 2019-01-28 | 2 | -1/+8 |
| drop creators with no display name at all | Bryan Newbold | 2019-01-28 | 1 | -3/+3 |
| make ORCID importer skip no-names, not assert | Bryan Newbold | 2019-01-28 | 1 | -1/+2 |
| transform and import fixes/tweaks | Bryan Newbold | 2019-01-25 | 2 | -4/+10 |
| update journal meta import/transform | Bryan Newbold | 2019-01-25 | 1 | -104/+39 |
| grobid import extra metadata tweaks | Bryan Newbold | 2019-01-24 | 1 | -6/+7 |
| refactor _get_editgroup => get_editgroup_id | Bryan Newbold | 2019-01-24 | 2 | -5/+6 |
| refactor make_rel_url | Bryan Newbold | 2019-01-24 | 3 | -29/+66 |