Commit message | Author | Age | Files | Lines | ||
---|---|---|---|---|---|---|
... | ||||||
| * | file elasticsearch index worker | Bryan Newbold | 2021-12-15 | 3 | -1/+63 | |
| | | ||||||
* | | update stats | Bryan Newbold | 2022-01-12 | 3 | -0/+49 | |
| | | ||||||
* | | ES: update README for v05-era indices | Bryan Newbold | 2022-01-12 | 1 | -15/+15 | |
| | | ||||||
* | | ES schema: fix typo in container issns alias | Bryan Newbold | 2022-01-12 | 1 | -1/+1 | |
| | | ||||||
* | | elasticsearch: bump timeout to 40 seconds (from default of 10) | Bryan Newbold | 2022-01-10 | 1 | -1/+1 | |
| | | ||||||
* | | make fmt | Bryan Newbold | 2021-12-15 | 2 | -5/+6 | |
| | | ||||||
* | | Merge branch 'martin-sentry-sdk' into 'master' | bnewbold | 2021-12-16 | 10 | -344/+396 | |
|\ \ | move from raven to sentry_sdk. See merge request webgroup/fatcat!135 | |||||
| * | | move from raven to sentry_sdk | Martin Czygan | 2021-12-14 | 10 | -344/+396 | |
| |/ | related docs: https://docs.sentry.io/platforms/python/guides/flask/migration/ and https://docs.sentry.io/platforms/python/guides/asgi/configuration/integrations/flask/ ; `fetch_git_sha` is gone, see: https://forum.sentry.io/t/fetch-git-sha-equivalent-in-the-unified-python-sdk/5521 | |||||
* / | crossref importer: skip affiliations lacking 'name' | Bryan Newbold | 2021-12-15 | 1 | -0/+3 | |
|/ | Relatedly, we should start handling ROR affiliations in contribs soon. | |||||
* | updates to guide based on feedback | Bryan Newbold | 2021-12-08 | 3 | -9/+35 | |
| | ||||||
* | mergers: fix typo in env var name | Bryan Newbold | 2021-12-07 | 3 | -3/+3 | |
| | ||||||
* | another file_meta update | Bryan Newbold | 2021-12-06 | 1 | -0/+60 | |
| | ||||||
* | ES container schema: add 'sim_pubid' and `ia_sim_collection` fields | Bryan Newbold | 2021-12-03 | 2 | -0/+4 | |
| | ||||||
* | ES transform: remove prototype microfilm links | Bryan Newbold | 2021-12-03 | 1 | -20/+0 | |
| | This ended up being a feature in scholar.archive.org, not fatcat. | |||||
* | SQL snapshots/exports: updated prod commands | Bryan Newbold | 2021-12-03 | 1 | -13/+15 | |
| | ||||||
* | file_meta cleanup update | Bryan Newbold | 2021-12-01 | 1 | -0/+75 | |
| | ||||||
* | initial 'far-future' release date updates | Bryan Newbold | 2021-11-30 | 1 | -0/+212 | |
| | ||||||
* | chocula update notes | Bryan Newbold | 2021-11-30 | 1 | -0/+61 | |
| | ||||||
* | container ISSN-L dedupe notes | Bryan Newbold | 2021-11-30 | 1 | -0/+198 | |
| | ||||||
* | chocula importer: handle not-upper-case ISSNs | Bryan Newbold | 2021-11-30 | 1 | -2/+6 | |
| | ||||||
* | chocula importer: handle broken ISSNs in extra metadata | Bryan Newbold | 2021-11-30 | 1 | -2/+7 | |
| | ||||||
* | chocula importer: tweak counting, conditions for doing updates | Bryan Newbold | 2021-11-30 | 1 | -15/+7 | |
| | ||||||
* | chocula importer: move issne/issnp 'extra' to top-level fields if doing updates | Bryan Newbold | 2021-11-30 | 1 | -0/+6 | |
| | ||||||
* | chocula: don't do name cleanups in importer | Bryan Newbold | 2021-11-30 | 1 | -8/+2 | |
| | This kind of cleanup should be done in 'chocula' instead. | |||||
* | container merger: fix bug with filtering by release count | Bryan Newbold | 2021-11-30 | 1 | -13/+15 | |
| | Also apply the "human edit" and "release count" checks only to the dupe (to-be-redirected) idents. | |||||
* | add stats (before re-indexing), and rename old files for consistency | Bryan Newbold | 2021-11-30 | 6 | -0/+47 | |
| | ||||||
* | cleanups: springer 'page-one' sample PDFs | Bryan Newbold | 2021-11-29 | 2 | -0/+129 | |
| | ||||||
* | cleanups: truncated wayback PDFs from common crawl | Bryan Newbold | 2021-11-29 | 2 | -0/+292 | |
| | ||||||
* | update to truncated wayback timestamp issue | Bryan Newbold | 2021-11-29 | 1 | -0/+24 | |
| | ||||||
* | update to file short wayback timestamp cleanup | Bryan Newbold | 2021-11-29 | 2 | -1/+30 | |
| | ||||||
* | commit old 2021-11-11 stats file | Bryan Newbold | 2021-11-29 | 1 | -0/+1 | |
| | ||||||
* | clean up extra/ folder a bit | Bryan Newbold | 2021-11-29 | 11 | -24/+0 | |
| | ||||||
* | move notes/bulk_edits/ to extra/bulk_edits/ | Bryan Newbold | 2021-11-29 | 23 | -0/+0 | |
| | ||||||
* | move 'cleanups' directory from notes to extra/ | Bryan Newbold | 2021-11-29 | 11 | -0/+0 | |
| | ||||||
* | Merge branch 'bnewbold-container-merger' | Bryan Newbold | 2021-11-29 | 7 | -4/+532 | |
|\ | ||||||
| * | notes on container ISSN-L merging, tested in QA | Bryan Newbold | 2021-11-24 | 2 | -0/+160 | |
| | | ||||||
| * | release merger: same editgroup_id fixes as for file and container mergers | Bryan Newbold | 2021-11-24 | 1 | -1/+5 | |
| | | ||||||
| * | container merger: fixes from QA testing | Bryan Newbold | 2021-11-24 | 1 | -8/+13 | |
| | | ||||||
| * | mergers: don't try to accept empty editgroups in dry-run-mode | Bryan Newbold | 2021-11-24 | 1 | -2/+4 | |
| | | ||||||
| * | ES release transform: handle redirected containers better | Bryan Newbold | 2021-11-24 | 1 | -1/+1 | |
| | | Despite the inline comment, we were not actually grabbing the "redirected" ident correctly, meaning some counts would not be accurate. | |||||
| * | container merger: defer allocation of editgroup_id; and dummy code path | Bryan Newbold | 2021-11-24 | 1 | -1/+5 | |
| | | ||||||
| * | initial implementation of container merger | Bryan Newbold | 2021-11-24 | 2 | -0/+353 | |
| | | ||||||
* | | notes on file_meta partial cleanup | Bryan Newbold | 2021-11-24 | 4 | -0/+239 | |
|/ | ||||||
* | notes from prod run of file de-dupe | Bryan Newbold | 2021-11-24 | 2 | -0/+36 | |
| | ||||||
* | file merger: allocate editgroup id later in 'merge' process | Bryan Newbold | 2021-11-24 | 1 | -1/+5 | |
| | The motivation is to avoid creating empty editgroups in dry-run mode, and when all entities are "skipped". | |||||
* | Merge branch 'bnewbold-mergers' into 'master' | bnewbold | 2021-11-25 | 8 | -0/+1046 | |
|\ | entity mergers framework. See merge request webgroup/fatcat!133 | |||||
| * | mergers common: remove inaccurate comment | Bryan Newbold | 2021-11-24 | 1 | -2/+0 | |
| | | Caught in review, thanks miku | |||||
| * | merger proposal typos | Bryan Newbold | 2021-11-24 | 1 | -2/+2 | |
| | | ||||||
| * | file merger: add content_scope to list of merged fields | Bryan Newbold | 2021-11-24 | 1 | -1/+1 | |
| | | ||||||
| * | release merger: some progress, but also disable (not complete) | Bryan Newbold | 2021-11-23 | 1 | -12/+72 | |
| | |