fatcat
path: root/extra
| Commit message | Author | Date | Files | Lines |
|---|---|---|---|---|
| chocula: fix wikidata_qid inclusion | Bryan Newbold | 2019-07-30 | 2 | -1/+3 |
| chocula: better ISSN-L handling | Bryan Newbold | 2019-07-30 | 4 | -24/+41 |
| chocula: updated fetches, new ISSN-L and DOAJ files | Bryan Newbold | 2019-07-30 | 2 | -7/+10 |
| chocula: wikidata indexing | Bryan Newbold | 2019-07-30 | 1 | -4/+48 |
| chocula: crude publisher type bucketing; field cleanup | Bryan Newbold | 2019-07-30 | 2 | -40/+194 |
| shorter/simpler table names | Bryan Newbold | 2019-07-26 | 2 | -9/+17 |
| chocula: more host/domain fixes | Bryan Newbold | 2019-07-26 | 1 | -3/+8 |
| GOLD OA parsing | Bryan Newbold | 2019-07-26 | 1 | -40/+54 |
| chocula: fix domain parsing | Bryan Newbold | 2019-07-26 | 1 | -10/+47 |
| pipenv: pytest for journal_metadata | Bryan Newbold | 2019-07-26 | 2 | -4/+83 |
| chocula README | Bryan Newbold | 2019-07-14 | 1 | -0/+7 |
| chocula: fetch SZ json | Bryan Newbold | 2019-07-14 | 1 | -0/+2 |
| more chocula progress | Bryan Newbold | 2019-07-14 | 2 | -61/+183 |
| EZB and szczepanski indexers | Bryan Newbold | 2019-07-11 | 1 | -45/+146 |
| chocula early work | Bryan Newbold | 2019-07-10 | 4 | -0/+1009 |
| more fixup notes (from QA server) | Bryan Newbold | 2019-06-27 | 1 | -5/+46 |
| finish fixup_longtail_issnl_unique; but not going to run it | Bryan Newbold | 2019-06-27 | 1 | -4/+3 |
| initial work on longtail_issnl_unique.py | Bryan Newbold | 2019-06-24 | 1 | -0/+192 |
| stats.json update after releases v03 cut-over | Bryan Newbold | 2019-06-06 | 1 | -0/+1 |
| elasticsearch index alias howto | Bryan Newbold | 2019-06-06 | 1 | -1/+16 |
| QA checks (for hash, extid duplication) | Bryan Newbold | 2019-06-04 | 4 | -0/+82 |
| recent prod table sizes; 380 GBytes or so total | Bryan Newbold | 2019-06-04 | 1 | -0/+233 |
| dump_release_extid.sql changes for new schema | Bryan Newbold | 2019-06-03 | 1 | -1/+1 |
| move export README info to sql_dumps doc | Bryan Newbold | 2019-06-03 | 1 | -1/+29 |
| fix parse_merge_metadata.py merge_spans() | Bryan Newbold | 2019-05-30 | 1 | -4/+8 |
| better KBART merging | Bryan Newbold | 2019-05-30 | 1 | -4/+5 |
| initial code to handle multiple KBART spans better | Bryan Newbold | 2019-05-30 | 1 | -2/+64 |
| add work-in-progress elastic index notes | Bryan Newbold | 2019-05-30 | 1 | -0/+11 |
| add 'superceded' release extra flag to elastic schema | Bryan Newbold | 2019-05-23 | 1 | -0/+1 |
| also track work_id in release elasticsearch table | Bryan Newbold | 2019-05-22 | 1 | -0/+1 |
| count linked refs (not just raw refs) in elasticsearch | Bryan Newbold | 2019-05-22 | 1 | -0/+1 |
| commit SQL table stats scripts | Bryan Newbold | 2019-05-21 | 2 | -0/+36 |
| include creator_ids in release elastic schema | Bryan Newbold | 2019-05-20 | 1 | -0/+1 |
| elastic release schema update | Bryan Newbold | 2019-05-20 | 1 | -1/+6 |
| start tracking stats | Bryan Newbold | 2019-05-07 | 2 | -0/+2 |
| IA collection page embed example description | Bryan Newbold | 2019-05-07 | 1 | -0/+45 |
| old fileset and webcapture example entities | Bryan Newbold | 2019-04-30 | 2 | -0/+146 |
| no-derive metadata and SQL dump uploads (to petabox) | Bryan Newbold | 2019-04-30 | 1 | -0/+2 |
| faster elasticsearch imports | Bryan Newbold | 2019-04-30 | 1 | -1/+1 |
| more bots to bootstrap | Bryan Newbold | 2019-04-24 | 1 | -0/+15 |
| update sql dump README | Bryan Newbold | 2019-04-24 | 1 | -9/+12 |
| fix wild elastic schema typo | Bryan Newbold | 2019-04-12 | 1 | -1/+1 |
| record webcaptures added as demos | Bryan Newbold | 2019-03-19 | 1 | -0/+45 |
| new importer: wayback_static | Bryan Newbold | 2019-03-19 | 1 | -203/+0 |
| update enrich examples demo script | Bryan Newbold | 2019-03-19 | 1 | -49/+63 |
| initial wayback-to-webcapture helper | Bryan Newbold | 2019-03-19 | 1 | -0/+203 |
| more integration of transform refactor | Bryan Newbold | 2019-03-11 | 1 | -2/+2 |
| elastic schema indentation | Bryan Newbold | 2019-03-06 | 1 | -6/+6 |
| gitignore SQL identifier dumps | Bryan Newbold | 2019-02-22 | 1 | -0/+1 |
| include container_id in release ES schema | Bryan Newbold | 2019-02-22 | 1 | -0/+1 |