Commit message [Author, Age, Files, Lines]
...
* | better parsing of year as integer in refs pipeline [Bryan Newbold, 2021-07-26, 2 files, -4/+8]
| |
* | make fmt [Bryan Newbold, 2021-07-26, 3 files, -10/+24]
| |
* | fix failing test after clean_doi() [Bryan Newbold, 2021-07-26, 1 file, -1/+1]
| |
* | ref_key: hotfix for some corner cases [Bryan Newbold, 2021-07-26, 1 file, -8/+25]
| |
* | notes on bundle/refs dump iteration [Bryan Newbold, 2021-07-26, 1 file, -0/+63]
| |
* | transform: more clean_doi() calls [Bryan Newbold, 2021-07-26, 1 file, -3/+3]
| |
* | refs transform: consolidate clean_ref_key() hacks [Bryan Newbold, 2021-07-25, 1 file, -17/+35]
| |
* | refs transform: many fixes [Bryan Newbold, 2021-07-25, 3 files, -10/+308]
| |
| |     - include year correctly (many cases)
| |     - test coverage for Crossref transform
| |     - pass-through 'edition' as 'version'
| |     - series-title parsed in to title or container as appropriate
| |     - missing release stage
| |     - fix 0-index vs. 1-index ref index field
| |
* | bibref: add version field; isbn13 -> isbn [Bryan Newbold, 2021-07-25, 1 file, -1/+2]
| |
* | refs transform: 1-index refs.index, not 0-index [Bryan Newbold, 2021-07-25, 3 files, -5/+13]
| |
| |     This was not matching expectations/schema of downstream refs pipeline
| |     (cgraph), and wasn't matching documented schema.
| |
| |     Note care required when checking if the index is set, to distinguish
| |     between '0' and 'None' values.
| |
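The caution in the refs.index commit above, about telling '0' apart from 'None', can be illustrated with a minimal sketch. This is not the project's actual code: the function name to_one_indexed is hypothetical, and the only assumption is that the upstream position is either a 0-based integer or unset.

```python
from typing import Optional


def to_one_indexed(zero_based: Optional[int]) -> Optional[int]:
    """Convert a 0-based reference position to a 1-based one (hypothetical helper)."""
    # Compare against None explicitly: a plain truthiness check such as
    # `if zero_based:` would treat a valid position of 0 (the first
    # reference) as "not set" and silently drop it.
    if zero_based is None:
        return None
    return zero_based + 1


if __name__ == "__main__":
    print(to_one_indexed(0))     # first reference, position is set
    print(to_one_indexed(None))  # position is not set
```

The explicit `is None` comparison is the whole point: the first reference keeps its position, and only a genuinely missing value passes through as unset.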
* | web: fix paper.fulltext is None error [Bryan Newbold, 2021-07-24, 1 file, -1/+1]
|/
* Translated using Weblate (Greek) [Eugenia Russell, 2021-07-07, 2 files, -4/+4]
|
|     Currently translated at 39.3% (70 of 178 strings)
|
|     Translation: Internet Archive/Archive Scholar (web interface)
|     Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/el/
|
* Translated using Weblate (Croatian) [Milo Ivir, 2021-07-07, 2 files, -5/+6]
|
|     Currently translated at 100.0% (178 of 178 strings)
|
|     Translation: Internet Archive/Archive Scholar (web interface)
|     Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/hr/
|
* Translated using Weblate (Greek) [Eugenia Russell, 2021-07-07, 2 files, -8/+8]
|
|     Currently translated at 38.2% (68 of 178 strings)
|
|     Translation: Internet Archive/Archive Scholar (web interface)
|     Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/el/
|
* refs: clean up GROBID DOIs and PMCIDs [Bryan Newbold, 2021-07-01, 2 files, -12/+9]
|
* HACK: don't parse TEI-XML for a specific paper/file [Bryan Newbold, 2021-06-30, 1 file, -2/+4]
|
|     GROBID v0.5.5 returns TEI-XML for this one PDF which is not valid XML, due to
|     a text encoding issue.
|
* refs: include (source) release_stage in output [Bryan Newbold, 2021-06-30, 3 files, -9/+20]
|
* pipenv: use correct/full beautifulsoup package name [Bryan Newbold, 2021-06-30, 2 files, -96/+97]
|
* robots: add commented out reference to sitemap-index-access.xml [Bryan Newbold, 2021-06-30, 1 file, -0/+3]
|
* sitemaps: change filters; only primary release fulltext (via jq); scp to replica [Bryan Newbold, 2021-06-22, 3 files, -5/+12]
|
* fix jinja2/babel deprecations [Bryan Newbold, 2021-06-11, 3 files, -6/+6]
|
* bump jinja2 major version, and other dep version updates [Bryan Newbold, 2021-06-11, 2 files, -202/+209]
|
* commit missing elastic get example JSON files [Bryan Newbold, 2021-06-11, 2 files, -0/+174]
|
* sitemap: new access URL format [Bryan Newbold, 2021-06-11, 2 files, -7/+6]
|
* update citation_pdf_url HTML meta tag to new access URL style [Bryan Newbold, 2021-06-11, 3 files, -13/+21]
|
* update access redirect URL endpoints [Bryan Newbold, 2021-06-11, 3 files, -75/+72]
|
* update indexability proposal based on feedback [Bryan Newbold, 2021-06-11, 1 file, -22/+19]
|
* catch/ignore ChunkedEncoding errors in fetches [Bryan Newbold, 2021-06-11, 2 files, -0/+6]
|
* bugfix: pass full crossref obj, not just 'record' [Bryan Newbold, 2021-06-02, 1 file, -1/+1]
|
* refs: use fatcat prefix for some sources [Bryan Newbold, 2021-06-02, 1 file, -5/+5]
|
|     This makes debugging what is going on much easier
|
* web: add goatcounter tracking to more download buttons [Bryan Newbold, 2021-06-02, 1 file, -4/+4]
|
* integrate crossref references, and iterate on refs output logic [Bryan Newbold, 2021-06-02, 1 file, -7/+115]
|
|     Needs test coverage!
|
* lint fixes, and run fmt [Bryan Newbold, 2021-06-02, 4 files, -22/+11]
|
* add 'crossref' hydration to work pipeline [Bryan Newbold, 2021-06-02, 3 files, -0/+62]
|
|     The immediate motivation is to include recent crossref refs in citation
|     graph transforms.
|
|     May also be valuable for researchers to have authoritative/publisher
|     metadata in the bundle dumps.
|
* schema: add 'crossref' to bundle schema, and add from_json() helper [Bryan Newbold, 2021-06-02, 5 files, -40/+28]
|
|     from_json() refactor was an earlier TODO, to reduce duplication when
|     updating fields on this class
|
* lint: ignore mypy on some template imports [Bryan Newbold, 2021-06-02, 1 file, -2/+2]
|
* web: fix fatcat.wiki search option link [Bryan Newbold, 2021-05-24, 1 file, -1/+1]
|
* sitemaps: filter out 12-digit wayback timestamp access URLs, for now [Bryan Newbold, 2021-05-19, 2 files, -0/+4]
|
* web: fixes to access redirect endpoints [Bryan Newbold, 2021-05-19, 3 files, -1/+30]
|
* sitemaps: find SCRIPT_DIR [Bryan Newbold, 2021-05-18, 1 file, -1/+2]
|
* sitemaps: pdfs -> access [Bryan Newbold, 2021-05-18, 3 files, -4/+4]
|
* enable Greek (el) translation (still partial) [Bryan Newbold, 2021-05-18, 2 files, -0/+2]
|
* Translated using Weblate (Korean) [YiReun, 2021-05-18, 2 files, -3/+3]
|
|     Currently translated at 100.0% (178 of 178 strings)
|
|     Translation: Internet Archive/Archive Scholar (web interface)
|     Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/ko/
|
* Translated using Weblate (Portuguese) [ssantos, 2021-05-18, 2 files, -12/+13]
|
|     Currently translated at 100.0% (178 of 178 strings)
|
|     Translation: Internet Archive/Archive Scholar (web interface)
|     Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/pt/
|
* Translated using Weblate (Greek) [Eugenia Russell, 2021-05-18, 2 files, -27/+43]
|
|     Currently translated at 34.2% (61 of 178 strings)
|
|     Translation: Internet Archive/Archive Scholar (web interface)
|     Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/el/
|
* web: add mellon grant acknowledgement to about page [Bryan Newbold, 2021-05-18, 1 file, -0/+11]
|
* sitemaps: PDF sitemaps [Bryan Newbold, 2021-05-18, 5 files, -0/+42]
|
* sitemaps: remove date from sitmap URLs [Bryan Newbold, 2021-05-17, 2 files, -7/+2]
|
* make fmt [Bryan Newbold, 2021-05-17, 1 file, -1/+4]
|
* iterate on PDF redirect links [Bryan Newbold, 2021-05-17, 5 files, -39/+149]
|