Commit message [Author, Age, Files, Lines]
* move from raven to sentry_sdk [Martin Czygan, 2021-12-14, 10, -344/+396]
* updates to guide based on feedback [Bryan Newbold, 2021-12-08, 3, -9/+35]
* mergers: fix typo in env var name [Bryan Newbold, 2021-12-07, 3, -3/+3]
* another file_meta update [Bryan Newbold, 2021-12-06, 1, -0/+60]
* ES container schema: add 'sim_pubid' and `ia_sim_collection` fields [Bryan Newbold, 2021-12-03, 2, -0/+4]
* ES transform: remove prototype microfilm links [Bryan Newbold, 2021-12-03, 1, -20/+0]
* SQL snashots/exports: updated prod commands [Bryan Newbold, 2021-12-03, 1, -13/+15]
* file_meta cleanup update [Bryan Newbold, 2021-12-01, 1, -0/+75]
* initial 'far-future' release date updates [Bryan Newbold, 2021-11-30, 1, -0/+212]
* chocula update notes [Bryan Newbold, 2021-11-30, 1, -0/+61]
* container ISSN-L dedupe notes [Bryan Newbold, 2021-11-30, 1, -0/+198]
* chocula importer: handle not-upper-case ISSNs [Bryan Newbold, 2021-11-30, 1, -2/+6]
* chocula importer: handle broken ISSNs in extra metadata [Bryan Newbold, 2021-11-30, 1, -2/+7]
* chocula importer: tweak counting, conditions for doing updates [Bryan Newbold, 2021-11-30, 1, -15/+7]
* chocula importer: move issne/issnp 'extra' to top-level fields if doing updates [Bryan Newbold, 2021-11-30, 1, -0/+6]
* chocula: don't do name cleanups in importer [Bryan Newbold, 2021-11-30, 1, -8/+2]
* container merger: fix bug with filtering by release count [Bryan Newbold, 2021-11-30, 1, -13/+15]
* add stats (before re-indexing), and rename old files for consistency [Bryan Newbold, 2021-11-30, 6, -0/+47]
* cleanups: springer 'page-one' sample PDFs [Bryan Newbold, 2021-11-29, 2, -0/+129]
* cleanups: truncated wayback PDFs from common crawl [Bryan Newbold, 2021-11-29, 2, -0/+292]
* update to truncated wayback timestamp issue [Bryan Newbold, 2021-11-29, 1, -0/+24]
* update to file short wayback timestamp cleanup [Bryan Newbold, 2021-11-29, 2, -1/+30]
* commit old 2021-11-11 stats file [Bryan Newbold, 2021-11-29, 1, -0/+1]
* clean up extra/ folder a bit [Bryan Newbold, 2021-11-29, 11, -24/+0]
* move notes/bulk_edits/ to extra/bulk_edits/ [Bryan Newbold, 2021-11-29, 23, -0/+0]
* move 'cleanups' directory from notes to extra/ [Bryan Newbold, 2021-11-29, 11, -0/+0]
* Merge branch 'bnewbold-container-merger' [Bryan Newbold, 2021-11-29, 7, -4/+532]
|\
| * notes on container ISSN-L merging, tested in QA [Bryan Newbold, 2021-11-24, 2, -0/+160]
| * release merger: same editgroup_id fixes as for file and container mergers [Bryan Newbold, 2021-11-24, 1, -1/+5]
| * container merger: fixes from QA testing [Bryan Newbold, 2021-11-24, 1, -8/+13]
| * mergers: don't try to accept empty editgroups in dry-run-mode [Bryan Newbold, 2021-11-24, 1, -2/+4]
| * ES release transform: handle redirected containers better [Bryan Newbold, 2021-11-24, 1, -1/+1]
| * container merger: defer allocation of editgroup_id; and dummy code path [Bryan Newbold, 2021-11-24, 1, -1/+5]
| * initial implementation of container merger [Bryan Newbold, 2021-11-24, 2, -0/+353]
* | notes on file_meta partial cleanup [Bryan Newbold, 2021-11-24, 4, -0/+239]
|/
* notes from prod run of file de-dupe [Bryan Newbold, 2021-11-24, 2, -0/+36]
* file merger: allocate editgroup id later in 'merge' process [Bryan Newbold, 2021-11-24, 1, -1/+5]
* Merge branch 'bnewbold-mergers' into 'master' [bnewbold, 2021-11-25, 8, -0/+1046]
|\
| * mergers common: remove inaccurate comment [Bryan Newbold, 2021-11-24, 1, -2/+0]
| * merger proposal typos [Bryan Newbold, 2021-11-24, 1, -2/+2]
| * file merger: add content_scope to list of merged fields [Bryan Newbold, 2021-11-24, 1, -1/+1]
| * release merger: some progress, but also disable (not complete) [Bryan Newbold, 2021-11-23, 1, -12/+72]
| * file merges: fixes from testing in QA [Bryan Newbold, 2021-11-23, 1, -14/+23]
| * file de-dupe: notes on prep and QA testing [Bryan Newbold, 2021-11-23, 2, -0/+136]
| * mergers: small tweaks [Bryan Newbold, 2021-11-23, 2, -3/+3]
| * mergers: remove entity mergers from __init__ (to work around warning) [Bryan Newbold, 2021-11-23, 1, -2/+0]
| * add proposal for entity mergers [Bryan Newbold, 2021-11-23, 1, -0/+110]
| * initial file merger, with tests [Bryan Newbold, 2021-11-23, 2, -0/+388]
| * mergers: fmt, lint, refactors [Bryan Newbold, 2021-11-23, 3, -96/+200]
| * remove top-level fatcat_merge.py; going to call module __main__ going forward [Bryan Newbold, 2021-11-23, 1, -112/+0]