Commit message [Author, Age, Files, Lines]
* Merge branch 'master' of github.com:internetarchive/fatcat-scholar [Bryan Newbold, 2021-08-09, 6 files, -22/+27]
|\
| * Translated using Weblate (Spanish) [Adolfo Jayme Barrientos, 2021-08-09, 2 files, -15/+16]
| |   Currently translated at 100.0% (178 of 178 strings)
| |   Translation: Internet Archive/Archive Scholar (web interface)
| |   Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/es/
| * Translated using Weblate (Greek) [Georgios Pitsiladis, 2021-08-09, 2 files, -2/+5]
| |   Currently translated at 39.8% (71 of 178 strings)
| |   Translation: Internet Archive/Archive Scholar (web interface)
| |   Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/el/
| * Translated using Weblate (Dutch) [privacysimp, 2021-07-16, 2 files, -5/+6]
| |   Currently translated at 100.0% (178 of 178 strings)
| |   Translation: Internet Archive/Archive Scholar (web interface)
| |   Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/nl/
* | ES: add 'preference' query param; default to '_local' in prod [Bryan Newbold, 2021-08-03, 2 files, -0/+5]
| |
* | config: remove 'scholar-svc500' name from QA config [Bryan Newbold, 2021-08-03, 2 files, -6/+5]
| |
* | web: access_redirect_fallback mechanism [Bryan Newbold, 2021-07-26, 3 files, -67/+297]
| |   This adds a helper code path that "tries harder" to find an access link,
| |   by querying the fatcat API directly to look for any file from any release
| |   associated with the work.
| |
| |   If it finds a match, it does the redirect as usual (but does log the
| |   incident). If no match can be found, there is now a more helpful
| |   access-specific 404 error page.
| |
| |   If the *work* is a 404, the generic error page is shown.
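The decision order described in the commit message above can be sketched roughly as follows. This is an illustration only, not the code from the commit: the function name `access_redirect`, its parameters, the page names `generic_404` and `access_404`, and the use of HTTP 302 are all assumptions made for the example.

```python
import logging
from typing import Callable, List, Optional, Tuple

logger = logging.getLogger("access_redirect")


def access_redirect(
    work_exists: bool,
    indexed_url: Optional[str],
    fetch_fallback_urls: Callable[[], List[str]],
) -> Tuple[int, str]:
    """Return (HTTP status, redirect target or error page name).

    All names here are hypothetical; `fetch_fallback_urls` stands in for
    a direct catalog API query over every release of the work.
    """
    if not work_exists:
        # The work itself is missing: show the generic error page.
        return 404, "generic_404"
    if indexed_url:
        # Normal path: the access link is already known.
        return 302, indexed_url
    # "Try harder": only query the API when the normal path fails.
    candidates = fetch_fallback_urls()
    if candidates:
        # Redirect as usual, but log that the fallback was needed.
        logger.warning("access redirect fallback used: %s", candidates[0])
        return 302, candidates[0]
    # The work exists but has no access link: access-specific 404 page.
    return 404, "access_404"
```

Passing the lookup as a callable keeps the extra API query lazy, so it only runs when no access link is already known.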
* | better parsing of year as integer in refs pipeline [Bryan Newbold, 2021-07-26, 2 files, -4/+8]
| |
* | make fmt [Bryan Newbold, 2021-07-26, 3 files, -10/+24]
| |
* | fix failing test after clean_doi() [Bryan Newbold, 2021-07-26, 1 file, -1/+1]
| |
* | ref_key: hotfix for some corner cases [Bryan Newbold, 2021-07-26, 1 file, -8/+25]
| |
* | notes on bundle/refs dump iteration [Bryan Newbold, 2021-07-26, 1 file, -0/+63]
| |
* | transform: more clean_doi() calls [Bryan Newbold, 2021-07-26, 1 file, -3/+3]
| |
* | refs transform: consolidate clean_ref_key() hacks [Bryan Newbold, 2021-07-25, 1 file, -17/+35]
| |
* | refs transform: many fixes [Bryan Newbold, 2021-07-25, 3 files, -10/+308]
| |   - include year correctly (many cases)
| |   - test coverage for Crossref transform
| |   - pass-through 'edition' as 'version'
| |   - series-title parsed in to title or container as appropriate
| |   - missing release stage
| |   - fix 0-index vs. 1-index ref index field
* | bibref: add version field; isbn13 -> isbn [Bryan Newbold, 2021-07-25, 1 file, -1/+2]
| |
* | refs transform: 1-index refs.index, not 0-index [Bryan Newbold, 2021-07-25, 3 files, -5/+13]
| |   This was not matching expectations/schema of downstream refs pipeline
| |   (cgraph), and wasn't matching documented schema.
| |
| |   Note care required when checking if the index is set, to distinguish
| |   between '0' and 'None' values.
* | web: fix paper.fulltext is None error [Bryan Newbold, 2021-07-24, 1 file, -1/+1]
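The '0' versus 'None' caveat in the commit message above is the usual Python truthiness pitfall. A minimal sketch follows, assuming a hypothetical helper named `to_one_indexed` (not a function from the repository):

```python
from typing import Optional


def to_one_indexed(index: Optional[int]) -> Optional[int]:
    """Convert a 0-based reference index to 1-based, keeping None as 'unset'."""
    # A truthiness check (`if index:`) would treat a valid index of 0 as
    # unset, so compare against None explicitly.
    if index is None:
        return None
    return index + 1
```

With this check, the first reference in a list (0-based index 0) maps to 1 instead of being silently dropped.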
|/
* Translated using Weblate (Greek) [Eugenia Russell, 2021-07-07, 2 files, -4/+4]
|   Currently translated at 39.3% (70 of 178 strings)
|   Translation: Internet Archive/Archive Scholar (web interface)
|   Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/el/
* Translated using Weblate (Croatian) [Milo Ivir, 2021-07-07, 2 files, -5/+6]
|   Currently translated at 100.0% (178 of 178 strings)
|   Translation: Internet Archive/Archive Scholar (web interface)
|   Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/hr/
* Translated using Weblate (Greek) [Eugenia Russell, 2021-07-07, 2 files, -8/+8]
|   Currently translated at 38.2% (68 of 178 strings)
|   Translation: Internet Archive/Archive Scholar (web interface)
|   Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/el/
* refs: clean up GROBID DOIs and PMCIDs [Bryan Newbold, 2021-07-01, 2 files, -12/+9]
|
* HACK: don't parse TEI-XML for a specific paper/file [Bryan Newbold, 2021-06-30, 1 file, -2/+4]
|   GROBID v0.5.5 returns TEI-XML for this one PDF which is not valid XML,
|   due to a text encoding issue.
* refs: include (source) release_stage in output [Bryan Newbold, 2021-06-30, 3 files, -9/+20]
|
* pipenv: use correct/full beautifulsoup package name [Bryan Newbold, 2021-06-30, 2 files, -96/+97]
|
* robots: add commented out reference to sitemap-index-access.xml [Bryan Newbold, 2021-06-30, 1 file, -0/+3]
|
* sitemaps: change filters; only primary release fulltext (via jq); scp to replica [Bryan Newbold, 2021-06-22, 3 files, -5/+12]
|
* fix jinja2/babel deprecations [Bryan Newbold, 2021-06-11, 3 files, -6/+6]
|
* bump jinja2 major version, and other dep version updates [Bryan Newbold, 2021-06-11, 2 files, -202/+209]
|
* commit missing elastic get example JSON files [Bryan Newbold, 2021-06-11, 2 files, -0/+174]
|
* sitemap: new access URL format [Bryan Newbold, 2021-06-11, 2 files, -7/+6]
|
* update citation_pdf_url HTML meta tag to new access URL style [Bryan Newbold, 2021-06-11, 3 files, -13/+21]
|
* update access redirect URL endpoints [Bryan Newbold, 2021-06-11, 3 files, -75/+72]
|
* update indexability proposal based on feedback [Bryan Newbold, 2021-06-11, 1 file, -22/+19]
|
* catch/ignore ChunkedEncoding errors in fetches [Bryan Newbold, 2021-06-11, 2 files, -0/+6]
|
* bugfix: pass full crossref obj, not just 'record' [Bryan Newbold, 2021-06-02, 1 file, -1/+1]
|
* refs: use fatcat prefix for some sources [Bryan Newbold, 2021-06-02, 1 file, -5/+5]
|   This makes debugging what is going on much easier
* web: add goatcounter tracking to more download buttons [Bryan Newbold, 2021-06-02, 1 file, -4/+4]
|
* integrate crossref references, and iterate on refs output logic [Bryan Newbold, 2021-06-02, 1 file, -7/+115]
|   Needs test coverage!
* lint fixes, and run fmt [Bryan Newbold, 2021-06-02, 4 files, -22/+11]
|
* add 'crossref' hydration to work pipeline [Bryan Newbold, 2021-06-02, 3 files, -0/+62]
|   The immediate motivation is to include recent crossref refs in citation
|   graph transforms. May also be valuable for researchers to have
|   authoritative/publisher metadata in the bundle dumps.
* schema: add 'crossref' to bundle schema, and add from_json() helper [Bryan Newbold, 2021-06-02, 5 files, -40/+28]
|   from_json() refactor was an earlier TODO, to reduce duplication when
|   updating fields on this class
* lint: ignore mypy on some template imports [Bryan Newbold, 2021-06-02, 1 file, -2/+2]
|
* web: fix fatcat.wiki search option link [Bryan Newbold, 2021-05-24, 1 file, -1/+1]
|
* sitemaps: filter out 12-digit wayback timestamp access URLs, for now [Bryan Newbold, 2021-05-19, 2 files, -0/+4]
|
* web: fixes to access redirect endpoints [Bryan Newbold, 2021-05-19, 3 files, -1/+30]
|
* sitemaps: find SCRIPT_DIR [Bryan Newbold, 2021-05-18, 1 file, -1/+2]
|
* sitemaps: pdfs -> access [Bryan Newbold, 2021-05-18, 3 files, -4/+4]
|
* enable Greek (el) translation (still partial) [Bryan Newbold, 2021-05-18, 2 files, -0/+2]
|
* Translated using Weblate (Korean) [YiReun, 2021-05-18, 2 files, -3/+3]
|   Currently translated at 100.0% (178 of 178 strings)
|   Translation: Internet Archive/Archive Scholar (web interface)
|   Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/ko/