path: root/notes/ingest
Commit message | Author | Date | Files | Lines (-/+)
unpaywall crawl wrap-up notes | Bryan Newbold | 2022-07-14 | 1 | -2/+145
ingest: targeted 2022-04 notes | Bryan Newbold | 2022-07-07 | 1 | -1/+3
more dataset crawl notes | Bryan Newbold | 2022-04-26 | 1 | -0/+53
start notes on unpaywall and targeted/patch crawls | Bryan Newbold | 2022-04-20 | 2 | -0/+277
various ingest/task notes | Bryan Newbold | 2022-03-22 | 3 | -1/+93
DOAJ ingest/crawl notes | Bryan Newbold | 2022-03-11 | 1 | -0/+266
2022 patch crawl bulk ingest notes | Bryan Newbold | 2022-03-02 | 1 | -0/+106
update old OAI-PMH patch crawl notes | Bryan Newbold | 2022-02-28 | 1 | -1/+36
more patch crawling | Bryan Newbold | 2022-02-08 | 2 | -9/+209
OAI-PMH patch crawl more updates | Bryan Newbold | 2022-02-08 | 1 | -2/+71
ingest notes: various in-progress projects | Bryan Newbold | 2022-01-27 | 4 | -3/+800
commit old patch crawl notes | Bryan Newbold | 2021-12-01 | 1 | -0/+488
daily OA crawl improvements/notes | Bryan Newbold | 2021-09-08 | 1 | -0/+1021
OAI-PMH patch and ingest improvement notes | Bryan Newbold | 2021-09-03 | 2 | -204/+1578
commit old patch crawl notes (dec 2020) | Bryan Newbold | 2021-09-03 | 1 | -0/+1
commit old arxiv ingest notes | Bryan Newbold | 2021-09-03 | 1 | -0/+12
commit old patch notes (will rework) | Bryan Newbold | 2021-09-03 | 1 | -0/+110
MAG post-crawl stats (5m+ new PDFs crawled successfully) | Bryan Newbold | 2021-09-02 | 1 | -0/+124
MAG and OAI-PMH crawl/processing notes | Bryan Newbold | 2021-08-13 | 2 | -0/+480
2021-07 unpaywall crawl wrap-up notes | Bryan Newbold | 2021-07-30 | 1 | -12/+108
unpaywall 2021-07 crawl partial notes | Bryan Newbold | 2021-07-14 | 1 | -0/+224
notes on large-domain ingest tweaks | Bryan Newbold | 2021-05-27 | 1 | -0/+480
2021-04 unpaywall crawl notes | Bryan Newbold | 2021-05-27 | 1 | -0/+368
late-2020 OA DOI crawl ingest notes | Bryan Newbold | 2021-01-04 | 1 | -3/+46
DOAJ crawl ingest stats | Bryan Newbold | 2020-12-31 | 1 | -0/+295
progress notes on OA DOI ingest (still running) | Bryan Newbold | 2020-12-28 | 1 | -11/+102
unpaywall crawl/ingest update (from Oct 2020) | Bryan Newbold | 2020-12-08 | 1 | -0/+134
commit sept 2020 scielo ingest notes | Bryan Newbold | 2020-12-08 | 1 | -0/+21
unpaywall oct 2020 crawl notes | Bryan Newbold | 2020-11-02 | 1 | -45/+82
more notes on unpaywall ingest from last week | Bryan Newbold | 2020-10-27 | 1 | -0/+73
notes on 2020-09 re-ingest passes | Bryan Newbold | 2020-10-17 | 1 | -0/+197
OA DOIs: partial notes | Bryan Newbold | 2020-10-17 | 1 | -0/+218
notes/status on daily ingest | Bryan Newbold | 2020-10-17 | 1 | -0/+193
start 2020-10 ingest notes | Bryan Newbold | 2020-10-11 | 1 | -0/+42
update unpaywall 2020-04 notes | Bryan Newbold | 2020-10-11 | 1 | -0/+32
OAI-PMH ingest progress timestamps | Bryan Newbold | 2020-10-11 | 1 | -0/+13
OAI-PMH ingest notes | Bryan Newbold | 2020-09-03 | 1 | -0/+232
daily ingest notes | Bryan Newbold | 2020-09-02 | 1 | -0/+202
unpaywall ingest follow-up | Bryan Newbold | 2020-09-02 | 1 | -0/+115
MAG ingest follow-up notes | Bryan Newbold | 2020-08-05 | 1 | -0/+194
MAG 2020-07 ingest notes | Bryan Newbold | 2020-07-08 | 1 | -0/+159
2020-05_pubmed ingest notes (short) | Bryan Newbold | 2020-06-25 | 1 | -0/+10
ingest: OAI-PMH count table | Bryan Newbold | 2020-05-28 | 1 | -0/+24
ingest notes | Bryan Newbold | 2020-05-26 | 2 | -6/+76
potential future backfill ingests | Bryan Newbold | 2020-05-26 | 1 | -0/+52
ingests: normalize file names; commit updates | Bryan Newbold | 2020-05-26 | 10 | -63/+279
summarize datacite and MAG 2020 crawls | Bryan Newbold | 2020-05-05 | 2 | -0/+200
update MAG crawl notes | Bryan Newbold | 2020-04-28 | 1 | -0/+71
COVID-19 chinese paper ingest | Bryan Newbold | 2020-04-15 | 1 | -0/+73
2020-04 unpaywall ingest (in progress) | Bryan Newbold | 2020-04-15 | 1 | -0/+63