| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | config: remove 'scholar-svc500' name from QA config | Bryan Newbold | 2021-08-03 | 2 | -6/+5 |
* | web: access_redirect_fallback mechanism | Bryan Newbold | 2021-07-26 | 3 | -67/+297 |
| | This adds a helper code path that "tries harder" to find an access link, by querying the fatcat API directly to look for any file from any release associated with the work. If it finds a match, it does the redirect as usual (but does log the incident). If no match can be found, there is now a more helpful access-specific 404 error page. If the *work* is a 404, the generic error page is shown. | | | | |
* | better parsing of year as integer in refs pipeline | Bryan Newbold | 2021-07-26 | 2 | -4/+8 |
* | make fmt | Bryan Newbold | 2021-07-26 | 3 | -10/+24 |
* | fix failing test after clean_doi() | Bryan Newbold | 2021-07-26 | 1 | -1/+1 |
* | ref_key: hotfix for some corner cases | Bryan Newbold | 2021-07-26 | 1 | -8/+25 |
* | notes on bundle/refs dump iteration | Bryan Newbold | 2021-07-26 | 1 | -0/+63 |
* | transform: more clean_doi() calls | Bryan Newbold | 2021-07-26 | 1 | -3/+3 |
* | refs transform: consolidate clean_ref_key() hacks | Bryan Newbold | 2021-07-25 | 1 | -17/+35 |
* | refs transform: many fixes | Bryan Newbold | 2021-07-25 | 3 | -10/+308 |
| | include year correctly (many cases); test coverage for Crossref transform; pass-through 'edition' as 'version'; series-title parsed into title or container as appropriate; missing release stage; fix 0-index vs. 1-index ref index field | | | | |
* | bibref: add version field; isbn13 -> isbn | Bryan Newbold | 2021-07-25 | 1 | -1/+2 |
* | refs transform: 1-index refs.index, not 0-index | Bryan Newbold | 2021-07-25 | 3 | -5/+13 |
| | This was not matching expectations/schema of downstream refs pipeline (cgraph), and wasn't matching documented schema. Note care required when checking if the index is set, to distinguish between '0' and 'None' values. | | | | |
* | web: fix paper.fulltext is None error | Bryan Newbold | 2021-07-24 | 1 | -1/+1 |
* | Translated using Weblate (Greek) | Eugenia Russell | 2021-07-07 | 2 | -4/+4 |
| | Currently translated at 39.3% (70 of 178 strings). Translation: Internet Archive/Archive Scholar (web interface). Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/el/ | | | | |
* | Translated using Weblate (Croatian) | Milo Ivir | 2021-07-07 | 2 | -5/+6 |
| | Currently translated at 100.0% (178 of 178 strings). Translation: Internet Archive/Archive Scholar (web interface). Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/hr/ | | | | |
* | Translated using Weblate (Greek) | Eugenia Russell | 2021-07-07 | 2 | -8/+8 |
| | Currently translated at 38.2% (68 of 178 strings). Translation: Internet Archive/Archive Scholar (web interface). Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/el/ | | | | |
* | refs: clean up GROBID DOIs and PMCIDs | Bryan Newbold | 2021-07-01 | 2 | -12/+9 |
* | HACK: don't parse TEI-XML for a specific paper/file | Bryan Newbold | 2021-06-30 | 1 | -2/+4 |
| | GROBID v0.5.5 returns TEI-XML for this one PDF which is not valid XML, due to a text encoding issue. | | | | |
* | refs: include (source) release_stage in output | Bryan Newbold | 2021-06-30 | 3 | -9/+20 |
* | pipenv: use correct/full beautifulsoup package name | Bryan Newbold | 2021-06-30 | 2 | -96/+97 |
* | robots: add commented out reference to sitemap-index-access.xml | Bryan Newbold | 2021-06-30 | 1 | -0/+3 |
* | sitemaps: change filters; only primary release fulltext (via jq); scp to replica | Bryan Newbold | 2021-06-22 | 3 | -5/+12 |
* | fix jinja2/babel deprecations | Bryan Newbold | 2021-06-11 | 3 | -6/+6 |
* | bump jinja2 major version, and other dep version updates | Bryan Newbold | 2021-06-11 | 2 | -202/+209 |
* | commit missing elastic get example JSON files | Bryan Newbold | 2021-06-11 | 2 | -0/+174 |
* | sitemap: new access URL format | Bryan Newbold | 2021-06-11 | 2 | -7/+6 |
* | update citation_pdf_url HTML meta tag to new access URL style | Bryan Newbold | 2021-06-11 | 3 | -13/+21 |
* | update access redirect URL endpoints | Bryan Newbold | 2021-06-11 | 3 | -75/+72 |
* | update indexability proposal based on feedback | Bryan Newbold | 2021-06-11 | 1 | -22/+19 |
* | catch/ignore ChunkedEncoding errors in fetches | Bryan Newbold | 2021-06-11 | 2 | -0/+6 |
* | bugfix: pass full crossref obj, not just 'record' | Bryan Newbold | 2021-06-02 | 1 | -1/+1 |
* | refs: use fatcat prefix for some sources | Bryan Newbold | 2021-06-02 | 1 | -5/+5 |
| | This makes debugging what is going on much easier | | | | |
* | web: add goatcounter tracking to more download buttons | Bryan Newbold | 2021-06-02 | 1 | -4/+4 |
* | integrate crossref references, and iterate on refs output logic | Bryan Newbold | 2021-06-02 | 1 | -7/+115 |
| | Needs test coverage! | | | | |
* | lint fixes, and run fmt | Bryan Newbold | 2021-06-02 | 4 | -22/+11 |
* | add 'crossref' hydration to work pipeline | Bryan Newbold | 2021-06-02 | 3 | -0/+62 |
| | The immediate motivation is to include recent crossref refs in citation graph transforms. May also be valuable for researchers to have authoritative/publisher metadata in the bundle dumps. | | | | |
* | schema: add 'crossref' to bundle schema, and add from_json() helper | Bryan Newbold | 2021-06-02 | 5 | -40/+28 |
| | from_json() refactor was an earlier TODO, to reduce duplication when updating fields on this class | | | | |
* | lint: ignore mypy on some template imports | Bryan Newbold | 2021-06-02 | 1 | -2/+2 |
* | web: fix fatcat.wiki search option link | Bryan Newbold | 2021-05-24 | 1 | -1/+1 |
* | sitemaps: filter out 12-digit wayback timestamp access URLs, for now | Bryan Newbold | 2021-05-19 | 2 | -0/+4 |
* | web: fixes to access redirect endpoints | Bryan Newbold | 2021-05-19 | 3 | -1/+30 |
* | sitemaps: find SCRIPT_DIR | Bryan Newbold | 2021-05-18 | 1 | -1/+2 |
* | sitemaps: pdfs -> access | Bryan Newbold | 2021-05-18 | 3 | -4/+4 |
* | enable Greek (el) translation (still partial) | Bryan Newbold | 2021-05-18 | 2 | -0/+2 |
* | Translated using Weblate (Korean) | YiReun | 2021-05-18 | 2 | -3/+3 |
| | Currently translated at 100.0% (178 of 178 strings). Translation: Internet Archive/Archive Scholar (web interface). Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/ko/ | | | | |
* | Translated using Weblate (Portuguese) | ssantos | 2021-05-18 | 2 | -12/+13 |
| | Currently translated at 100.0% (178 of 178 strings). Translation: Internet Archive/Archive Scholar (web interface). Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/pt/ | | | | |
* | Translated using Weblate (Greek) | Eugenia Russell | 2021-05-18 | 2 | -27/+43 |
| | Currently translated at 34.2% (61 of 178 strings). Translation: Internet Archive/Archive Scholar (web interface). Translate-URL: https://hosted.weblate.org/projects/internetarchive/fatcat-scholar/el/ | | | | |
* | web: add mellon grant acknowledgement to about page | Bryan Newbold | 2021-05-18 | 1 | -0/+11 |
* | sitemaps: PDF sitemaps | Bryan Newbold | 2021-05-18 | 5 | -0/+42 |
* | sitemaps: remove date from sitmap URLs | Bryan Newbold | 2021-05-17 | 2 | -7/+2 |
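
The note on the "refs transform: 1-index refs.index, not 0-index" commit (2021-07-25) warns that checking whether an index is set must distinguish `0` from `None`. The sketch below illustrates that pitfall in isolation. The function name `to_one_indexed` is hypothetical and is not taken from the fatcat-scholar code; it only assumes the incoming index is either a 0-based integer or `None`.

```python
from typing import Optional


def to_one_indexed(index: Optional[int]) -> Optional[int]:
    """Convert a 0-based reference index to 1-based, keeping None as 'unset'.

    Hypothetical helper for illustration; not the project's actual code.
    """
    # A truthiness check such as `if not index:` would treat 0 (the first
    # reference) the same as None (no index at all), so compare to None
    # explicitly.
    if index is None:
        return None
    return index + 1


if __name__ == "__main__":
    for value in (None, 0, 4):
        print(repr(value), "->", repr(to_one_indexed(value)))
```

The explicit `is None` comparison is the design point: the first reference in a list has index `0`, which is falsy in Python, so a plain truthiness test would silently drop it.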