Commit message (Author, Age, Files, Lines)
* gitignore codegen rust Cargo.toml codegen (Bryan Newbold, 2020-05-10, 1 file, -0/+1)
|
* codegen patches (using script) (Bryan Newbold, 2020-05-10, 1 file, -12/+10)
|
* updated rust codegen script (Bryan Newbold, 2020-05-10, 1 file, -53/+23)
|
* WIP: update rust codegen script (Bryan Newbold, 2020-05-10, 56 files, -23973/+50422)
|     Only Cargo.toml project metadata updated.
* Merge branch 'martin-fix-container-empty-search' into 'master' (Martin Czygan, 2020-04-29, 1 file, -0/+4)
|\
| |     search: assume * when q is not set or empty
| |
| |     See merge request webgroup/fatcat!51
| * search: assume * when q is not set or empty (Martin Czygan, 2020-04-29, 1 file, -0/+4)
| |     An example would be a blank search from a container details page.
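The fallback described in this commit can be sketched as below. This is a minimal illustration, not the fatcat code: the function name `normalize_query` is hypothetical, and treating a whitespace-only query as empty is an assumption beyond what the commit message states.

```python
def normalize_query(q):
    """Return the match-all query '*' when q is not set or empty.

    Hypothetical helper: whitespace-only input is also treated as empty,
    which is an assumption, not something the commit message confirms.
    """
    if q is None or not q.strip():
        return "*"
    return q.strip()
```

A blank search from a container details page would then behave like a search for `*` rather than failing or returning nothing.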
* | Merge branch 'bnewbold-search-tweaks' into 'master' (bnewbold, 2020-04-27, 3 files, -92/+132)
|\ \
| |/
|/|
| |     tweaks to search result pages
| |
| |     See merge request webgroup/fatcat!50
| * web search: tweak release search result style (Bryan Newbold, 2020-04-23, 1 file, -25/+51)
| |     This is also back-ported from covid19.fatcat.wiki, though with some
| |     more tweaks on top. The changes are:
| |
| |     - show original title if available (usually non-English)
| |     - move release_type label to title line suffix, and only show if
| |       not a "paper"
| |     - show publication status and withdrawal as text after the journal
| |       title, not as a label
| * web search: improve indentation, fix missing div tags (Bryan Newbold, 2020-04-23, 2 files, -67/+81)
| |     These are back-ported fixes from covid19.fatcat.wiki
* | Merge branch 'bnewbold-non-ident-fix' into 'master' (Martin Czygan, 2020-04-24, 3 files, -6/+10)
|\ \
| | |     fix ident=None broken links
| | |
| | |     Closes #3
| | |
| | |     See merge request webgroup/fatcat!49
| * | web: fix ident=None broken links (Bryan Newbold, 2020-04-23, 3 files, -6/+10)
| |/
| |     On web interface views for revisions, we had a bunch of broken links
| |     because the ident is "None". This commit fixes these by removing
| |     the links.
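One way to avoid emitting such links is to build the URL only when an ident exists. The sketch below is an assumption-laden illustration: `entity_link` and the `/{entity_type}/{ident}` URL shape are hypothetical and are not taken from the fatcat templates.

```python
def entity_link(entity_type, ident):
    """Return a relative URL for an entity, or None if there is no ident.

    Hypothetical helper. Revision views have no ident, so the caller
    should render plain text instead of a link when None is returned.
    """
    if ident is None or ident == "None":
        return None
    return f"/{entity_type}/{ident}"
```

A template would then check the return value and omit the anchor tag when it is `None`.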
* | Merge branch 'martin-datacite-fix-parse-record-int' into 'master' (bnewbold, 2020-04-24, 4 files, -2/+80)
|\ \
| | |     datacite: fix type error
| | |
| | |     See merge request webgroup/fatcat!48
| * | datacite: fix type error (Martin Czygan, 2020-04-22, 4 files, -2/+80)
|/ /
| |     Up to now, we expected the description to be a string or list. Add
| |     handling for int as well.
| |
| |     First appeared: Apr 22 19:58:39.
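The type handling described here might look roughly like the following. This is a sketch under assumptions: the function name `description_to_text` is hypothetical, and joining list items with a space is an illustrative choice, not the importer's confirmed behavior.

```python
def description_to_text(value):
    """Coerce a description value (str, list, or int) to a string.

    Hypothetical helper: returns None for unsupported types or when
    nothing is left after stripping whitespace.
    """
    if isinstance(value, str):
        text = value
    elif isinstance(value, int):
        # The case added by the fix: a bare number as description.
        text = str(value)
    elif isinstance(value, list):
        text = " ".join(str(item) for item in value if item is not None)
    else:
        return None
    text = text.strip()
    return text or None
```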
* | Merge branch 'martin-datacite-fix-release-contrib-raw-name-check-violation' into 'master' (bnewbold, 2020-04-20, 4 files, -1/+86)
|\ \
| | |     datacite: fix a raw name constraint violation
| | |
| | |     See merge request webgroup/fatcat!47
| * | datacite: fix a raw name constraint violation (Martin Czygan, 2020-04-20, 4 files, -1/+86)
| |/
| |     It was possible that contribs got added which had no raw name. One
| |     example would be a name consisting of whitespace only. This fix adds
| |     a final check for this case.
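A final check of this kind can be sketched as a filter over the contrib list. The dict layout and the name `drop_unnamed_contribs` are assumptions for illustration; the actual importer works with fatcat entity objects.

```python
def drop_unnamed_contribs(contribs):
    """Keep only contribs whose raw_name has non-whitespace content.

    Hypothetical helper operating on plain dicts with a 'raw_name' key.
    """
    kept = []
    for contrib in contribs:
        raw_name = contrib.get("raw_name")
        if isinstance(raw_name, str) and raw_name.strip():
            kept.append(contrib)
    return kept
```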
* | Merge branch 'bnewbold-fix-changelog-es' into 'master' (Martin Czygan, 2020-04-20, 1 file, -3/+12)
|\ \
| |/
|/|
| |     fixes for changelog elasticsearch worker
| |
| |     See merge request webgroup/fatcat!46
| * more changelog ES fixes (Bryan Newbold, 2020-04-17, 1 file, -4/+6)
| |
| * ES changelog worker: fixes for ident; fetch update from API if needed (Bryan Newbold, 2020-04-17, 1 file, -2/+9)
|/
|     The API fetch update may be needed for old changelog entries in the
|     kafka feed.
* Merge branch 'bnewbold-py37-cleanups' into 'master' (bnewbold, 2020-04-17, 7 files, -64/+39)
|\
| |     py37 cleanups
| |
| |     See merge request webgroup/fatcat!44
| * example of starting to use format strings (Bryan Newbold, 2020-04-17, 1 file, -12/+12)
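For readers unfamiliar with the feature, the kind of change this commit refers to presumably looks like the following, assuming "format strings" means Python f-strings (available since Python 3.6). The variable names and message are made up for illustration.

```python
ident = "abc123"   # hypothetical values, not from the fatcat codebase
file_count = 3

# Older style, compatible with Python 3.5:
old_style = "release {} has {} files".format(ident, file_count)

# f-string style, Python 3.6 and later:
new_style = f"release {ident} has {file_count} files"
```

Both expressions produce the same string; the f-string keeps each value next to the place it is used.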
| |
| * pytest: ignore remaining deprecation warnings in 3rd party libraries (Bryan Newbold, 2020-04-17, 1 file, -0/+2)
| |
| * consistently use raw string prefix for regex (Bryan Newbold, 2020-04-17, 3 files, -7/+7)
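The point of the raw string prefix is that backslashes reach the regex engine unchanged, so escapes like `\d` do not depend on Python's string-literal escaping (unrecognized escapes in normal strings raise a DeprecationWarning in recent Python versions). A small sketch, using a made-up DOI-like pattern that is not taken from the fatcat code:

```python
import re

# Without the raw prefix every backslash has to be doubled:
escaped_pattern = "^10\\.\\d{3,6}/\\S+$"

# With the raw prefix the pattern reads as the regex engine sees it:
raw_pattern = r"^10\.\d{3,6}/\S+$"

# Hypothetical, simplified DOI matcher for illustration only.
DOI_RE = re.compile(raw_pattern)
```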
| |
| * pipenv: update deps for python 3.7 (Bryan Newbold, 2020-04-17, 2 files, -45/+18)
| |     We had some pre-3.6 workarounds. Also seems like a reasonable time to
| |     update all dependencies to most recent versions.
* | Merge branch 'martin-changelog-to-es' into 'master' (bnewbold, 2020-04-17, 3 files, -3/+41)
|\ \
| | |     derive changelog worker from release worker
| | |
| | |     See merge request webgroup/fatcat!43
| * | derive changelog worker from release worker (Martin Czygan, 2020-04-17, 3 files, -3/+41)
| | |     Early versions of changelog entries may not have all the fields
| | |     required for the current transform.
* | | Merge branch 'martin-changelog-release-types' into 'master' (bnewbold, 2020-04-17, 1 file, -12/+17)
|\ \ \
| | | |     changelog: extend release_types considered documents
| | | |
| | | |     See merge request webgroup/fatcat!42
| * | | changelog: limit types (Martin Czygan, 2020-04-16, 1 file, -5/+1)
| | | |     Exclude partial documents (e.g. abstract), overly generic types
| | | |     (components and entries), and HTML blogs.
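A filter along these lines could be expressed as a set lookup. The sketch below is illustrative only: the excluded values are a guess based on this commit message (in particular, mapping "HTML blogs" to `post-weblog` is an assumption), and the names `EXCLUDED_RELEASE_TYPES` and `is_document_type` are hypothetical.

```python
# Assumed exclusions, inferred from the commit message; the real list
# in the changelog worker may differ.
EXCLUDED_RELEASE_TYPES = {"abstract", "component", "entry", "post-weblog"}


def is_document_type(release_type):
    """Return True if a release_type should count as a document.

    Hypothetical helper: a missing or empty release_type is treated as
    not being a document, which is also an assumption.
    """
    if not release_type:
        return False
    return release_type not in EXCLUDED_RELEASE_TYPES
```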
| * | | changelog: extend release_types considered documents (Martin Czygan, 2020-04-16, 1 file, -10/+19)
| | | |     According to release_rev.release_type, we have 29 values:
| | | |
| | | |     fatcat_prod=# select release_type, count(release_type) from release_rev group by release_type;
| | | |        release_type    |   count
| | | |     -------------------+-----------
| | | |      abstract          |      2264
| | | |      article           |   6371076
| | | |      article-journal   | 101083841
| | | |      article-newspaper |     17062
| | | |      book              |   1676941
| | | |      chapter           |  13914854
| | | |      component         |     58990
| | | |      dataset           |   6860325
| | | |      editorial         |    133573
| | | |      entry             |   1628487
| | | |      graphic           |   1809471
| | | |      interview         |     19898
| | | |      legal_case        |      3581
| | | |      legislation       |      1626
| | | |      letter            |    275119
| | | |      paper-conference  |   6074669
| | | |      peer_review       |     30581
| | | |      post              |    245807
| | | |      post-weblog       |       135
| | | |      report            |   1010699
| | | |      retraction        |      1292
| | | |      review-book       |     96219
| | | |      software          |       316
| | | |      song              |     24027
| | | |      speech            |      4263
| | | |      standard          |    312364
| | | |      stub              |   1036813
| | | |      thesis            |    414397
| | | |                        |         0
| | | |     (29 rows)
* | | | update prod stats (Bryan Newbold, 2020-04-17, 7 files, -0/+149)
| |_|/
|/| |
* | | CHANGELOG entry for python 3.7 (Bryan Newbold, 2020-04-17, 1 file, -0/+4)
| | |
* | | retro-active v0.3.2 changelog updates (Bryan Newbold, 2020-04-17, 2 files, -2/+41)
| | |
* | | Add missing packages to Dockerfile and CI file (Bryan Newbold, 2020-04-16, 2 files, -3/+3)
| | |
* | | ci: don't re-build/install commands if existing (Bryan Newbold, 2020-04-16, 1 file, -2/+2)
| | |
* | | ci: only build postgres feature for diesel (Bryan Newbold, 2020-04-16, 1 file, -1/+1)
| | |
* | | test-base Dockerfile (Bryan Newbold, 2020-04-16, 2 files, -0/+51)
| | |     Used to create bnewbold/fatcat-test-base image
* | | ci: switch to fatcat-test-base Docker image (Bryan Newbold, 2020-04-16, 1 file, -1/+1)
| |/
|/|
| |     Goal is to speed up CI runs.
* | CI: add libpq-dev (for diesel build) (Bryan Newbold, 2020-04-16, 1 file, -1/+1)
|/
|     Not sure why things build without this.
* get gitlab-ci working with python3.7 (Bryan Newbold, 2020-04-13, 1 file, -2/+2)
|     Required updating to newer 'buster' Debian distro, and a newer rust
|     release to work around a Docker/OCI containerization issue with older
|     docker images.
* update README and coveragerc for python3.7 (Bryan Newbold, 2020-04-13, 2 files, -5/+5)
|
* pipenv: switch from python3.5 to python3.7 (Bryan Newbold, 2020-04-13, 2 files, -210/+204)
|     Also updates dependencies.
* ingest: configurable ES index [tag: v0.3.2] (Bryan Newbold, 2020-04-08, 1 file, -1/+4)
|
* update bulk export instructions (Bryan Newbold, 2020-04-07, 1 file, -4/+2)
|     - don't do expanded and regular release dumps
|     - default to sqldump_public for item name (as that is common-case)
* Merge branch 'bnewbold-pubmed-get_text' into 'master' (bnewbold, 2020-04-01, 4 files, -39/+47)
|\
| |     beautifulsoup XML parsing: .string vs. .get_text()
| |
| |     See merge request webgroup/fatcat!40
| * pubmed: use untranslated title if translated not available (Bryan Newbold, 2020-04-01, 1 file, -0/+6)
| |     The primary motivation for this change is that fatcat *requires* a
| |     non-empty title for each release entity. Pubmed/Medline occasionally
| |     indexes just a VernacularTitle with no ArticleTitle for foreign
| |     publications, and currently those records don't end up in fatcat at
| |     all.
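The fallback logic can be sketched independently of the XML parsing. This is an illustration under assumptions: `pick_title` is a hypothetical name, and the inputs are taken to be already-extracted strings (or None) for the ArticleTitle and VernacularTitle elements.

```python
def pick_title(article_title, vernacular_title):
    """Prefer the translated ArticleTitle, else fall back to VernacularTitle.

    Hypothetical helper: returns None when neither title has content,
    in which case the record still cannot become a release entity.
    """
    for candidate in (article_title, vernacular_title):
        if candidate and candidate.strip():
            return candidate.strip()
    return None
```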
| * importers: replace newlines in get_text() strings (Bryan Newbold, 2020-04-01, 4 files, -23/+25)
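Text gathered from several child nodes often carries the line breaks of the source XML. A minimal sketch of the cleanup, assuming a hypothetical helper `clean_text` that receives the already-extracted string (the importers' actual approach may differ):

```python
def clean_text(raw):
    """Replace newlines with spaces and trim the ends.

    Hypothetical helper: None is passed through unchanged so callers
    can apply it to optional fields.
    """
    if raw is None:
        return None
    return raw.replace("\n", " ").strip()
```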
| |
| * importers: more string/get_text swaps (Bryan Newbold, 2020-03-28, 3 files, -27/+27)
| |     See previous pubmed commit for details.
| * pubmed: bunch of .get_text() instead of .string (Bryan Newbold, 2020-03-28, 1 file, -12/+12)
| |     Yikes! Apparently when a tag has multiple children (e.g. text mixed
| |     with child tags), .string will return None instead of all the
| |     strings. .get_text() returns all of it:
| |
| |     https://www.crummy.com/software/BeautifulSoup/bs4/doc/#get-text
| |     https://www.crummy.com/software/BeautifulSoup/bs4/doc/#string
| |
| |     I've left things like identifiers as .string, when we expect only a
| |     single string inside.
* | Merge branch 'bnewbold-match-proposal' into 'master' (Martin Czygan, 2020-04-01, 1 file, -0/+430)
|\ \
| | |     proposal: fuzzy matching
| | |
| | |     See merge request webgroup/fatcat!39
| * | proposal: fuzzy matching (bnewbold, 2020-04-01, 1 file, -0/+430)
|/ /
* | sql_dumps: stop doing redundant release dumps (Bryan Newbold, 2020-04-01, 1 file, -1/+3)
| |