index : fatcat
path: root/extra
| Commit message | Author | Date | Files | Lines |
|---|---|---|---|---|
| bulk edits: docs on initial dataset/fileset ingest | Bryan Newbold | 2022-04-20 | 1 | -0/+22 |
| cleanups: isiarticles | Bryan Newbold | 2022-04-20 | 3 | -0/+49 |
| stats: just as unpaywall bulk ingest starting | Bryan Newbold | 2022-04-19 | 1 | -0/+1 |
| dump/export helper Makefile | Bryan Newbold | 2022-04-18 | 1 | -0/+93 |
| container status: add simple prod single-command script | Bryan Newbold | 2022-04-08 | 1 | -0/+20 |
| 2022-03-21 fatcat stats | Bryan Newbold | 2022-03-22 | 2 | -0/+48 |
| document recent bulk metadata edits/imports | Bryan Newbold | 2022-03-22 | 3 | -0/+62 |
| Merge branch 'bnewbold-container-web' into 'master' (container web interface improvements; see merge request webgroup/fatcat!140) | bnewbold | 2022-03-10 | 1 | -0/+6 |
| container ES schema: more aliases | Bryan Newbold | 2022-02-09 | 1 | -0/+6 |
| sql dumps: use 'custom' mode instead of 'tar' | Bryan Newbold | 2022-02-23 | 1 | -1/+5 |
| bulk cleanups: NCI chem entries; IRs with container_id; PLOS non-articles | Bryan Newbold | 2022-02-09 | 4 | -0/+330 |
| bulk metadata edit log | Bryan Newbold | 2022-02-04 | 3 | -0/+223 |
| commit updated stats | Bryan Newbold | 2022-01-26 | 2 | -0/+47 |
| docker focal: update base image for focal/py38 | Bryan Newbold | 2022-01-26 | 1 | -36/+11 |
| container counts update process README | Bryan Newbold | 2022-01-21 | 1 | -0/+41 |
| update stats | Bryan Newbold | 2022-01-12 | 3 | -0/+49 |
| ES: update README for v05-era indices | Bryan Newbold | 2022-01-12 | 1 | -15/+15 |
| ES schema: fix typo in container issns alias | Bryan Newbold | 2022-01-12 | 1 | -1/+1 |
| another file_meta update | Bryan Newbold | 2021-12-06 | 1 | -0/+60 |
| ES container schema: add 'sim_pubid' and `ia_sim_collection` fields | Bryan Newbold | 2021-12-03 | 1 | -0/+2 |
| SQL snashots/exports: updated prod commands | Bryan Newbold | 2021-12-03 | 1 | -13/+15 |
| file_meta cleanup update | Bryan Newbold | 2021-12-01 | 1 | -0/+75 |
| initial 'far-future' release date updates | Bryan Newbold | 2021-11-30 | 1 | -0/+212 |
| chocula update notes | Bryan Newbold | 2021-11-30 | 1 | -0/+61 |
| container ISSN-L dedupe notes | Bryan Newbold | 2021-11-30 | 1 | -0/+198 |
| add stats (before re-indexing), and rename old files for consistency | Bryan Newbold | 2021-11-30 | 6 | -0/+47 |
| cleanups: springer 'page-one' sample PDFs | Bryan Newbold | 2021-11-29 | 2 | -0/+129 |
| cleanups: truncated wayback PDFs from common crawl | Bryan Newbold | 2021-11-29 | 2 | -0/+292 |
| update to truncated wayback timestamp issue | Bryan Newbold | 2021-11-29 | 1 | -0/+24 |
| update to file short wayback timestamp cleanup | Bryan Newbold | 2021-11-29 | 2 | -1/+30 |
| commit old 2021-11-11 stats file | Bryan Newbold | 2021-11-29 | 1 | -0/+1 |
| clean up extra/ folder a bit | Bryan Newbold | 2021-11-29 | 11 | -24/+0 |
| move notes/bulk_edits/ to extra/bulk_edits/ | Bryan Newbold | 2021-11-29 | 23 | -0/+1743 |
| move 'cleanups' directory from notes to extra/ | Bryan Newbold | 2021-11-29 | 11 | -0/+1306 |
| codespell fixes to various other docs | Bryan Newbold | 2021-11-24 | 3 | -4/+4 |
| content_scope: include in file ES schema and transform | Bryan Newbold | 2021-11-17 | 1 | -0/+1 |
| ISSN-L dupes check: output all matches | Bryan Newbold | 2021-11-17 | 1 | -1/+1 |
| sitemap generation improvements | Bryan Newbold | 2021-11-10 | 2 | -1/+2 |
| elasticsearch schema changes | Bryan Newbold | 2021-10-13 | 2 | -3/+13 |
| update stats | Bryan Newbold | 2021-10-11 | 3 | -0/+48 |
| sql_dumps: set collection at upload time | Bryan Newbold | 2021-09-02 | 1 | -2/+5 |
| prod stats snapshot | Bryan Newbold | 2021-08-06 | 4 | -0/+47 |
| stats snapshot (2021-06-23) | Bryan Newbold | 2021-06-23 | 2 | -0/+47 |
| SQL dumps: more pigz (vs. gzip) for speed | Bryan Newbold | 2021-06-17 | 1 | -2/+2 |
| fatcat_ref ES schema: more doc_values; source_year not source_release_year | Bryan Newbold | 2021-06-17 | 1 | -5/+2 |
| update dblp pre-import notes and pipenv python version (3.8) | Bryan Newbold | 2021-06-03 | 2 | -6/+11 |
| elasticsearch ref schema: 6 shards, not 12 | Bryan Newbold | 2021-05-18 | 1 | -1/+1 |
| fix 'colected' typos (Thanks for the catch martin) | Bryan Newbold | 2021-04-13 | 1 | -1/+1 |
| update elasticsearch bootstrap indexing notes | Bryan Newbold | 2021-04-09 | 1 | -8/+16 |
| ES: rename fatcat_ref.json to ref_schema.json for consistency; add to README | Bryan Newbold | 2021-04-08 | 2 | -1/+4 |