| Commit message | Author | Age | Files | Lines |
|
Currently using two external libraries:
* dateparser
* langcodes
Note: This commit includes many WIP docs and field statistics in comments,
which should be removed.
|
* contributors, title, date, publisher, container, license
Field and value analysis via https://github.com/miku/indigo.
|
pytest had been pinned to the 4.x series to work around a package import
mangling problem with citeproc_styles in tests. Now that pytest.ini
explicitly lists test files, this no longer seems to be a problem and
pytest can be updated to the most recent version.
Also re-locked Pipfile.lock with updated dependencies (only minor
changes).
|
The purpose of this change is to avoid test errors that occur when pytest
tries to recursively rewrite assertion statements in all dependent packages.
pytest does this to add pretty printing, which is nice, but
probably shouldn't be done in all dependency libraries.
This fixes test problems with both CSL (citeproc_styles) and dateparser
(when actually imported in code, which currently on master does not
happen).
|
Produced messages should match:
jq '.data|length' tests/files/datacite_api.json
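
For readers without jq at hand, the same count can be sketched in Python. The function name `count_data_records` is hypothetical and not part of the commit; it only assumes the file holds a JSON object with a top-level `data` list, as the jq expression implies.

```python
import json


def count_data_records(path):
    """Return the length of the top-level "data" list in a JSON file.

    Hypothetical helper: a Python equivalent of `jq '.data|length' <path>`.
    """
    with open(path) as f:
        doc = json.load(f)
    return len(doc["data"])
```

A test could then compare this number against the count of messages the harvester produced for the same file.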
|
The bracket syntax is inclusive. See also:
https://www.elastic.co/guide/en/elasticsearch/reference/7.5/query-dsl-query-string-query.html#_ranges
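
As a small illustration of the inclusive bracket syntax, the sketch below builds a query-string range clause. The helper name `date_range_clause` and the field name `updated` are assumptions made for this example, not taken from the commit.

```python
import datetime


def date_range_clause(field, start, end):
    """Build a query-string range clause with inclusive bounds.

    Square brackets include both endpoints; curly braces would
    exclude them.
    """
    return "{}:[{} TO {}]".format(field, start.isoformat(), end.isoformat())


# A single-day window: both bounds are the same date and both are included.
day = datetime.date(2020, 1, 1)
clause = date_range_clause("updated", day, day)
```

Because both bounds are included, two adjacent windows that share a boundary value would both match records at exactly that value.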
|
As a first iteration, just mark the daily batch complete and continue.
The occasional HTTP 400 issue has been reported as
https://github.com/datacite/datacite/issues/897.
A possible improvement would be to shrink the window, so losses will be
smaller.
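
One way the window could be shrunk is sketched below, assuming the harvest interval is expressed as a pair of datetimes. The function `split_window` is hypothetical and is not what the commit implements; it only shows how a daily batch might be divided so that a failed request loses a smaller slice.

```python
import datetime


def split_window(start, end, parts):
    """Split the interval [start, end] into `parts` equal sub-windows.

    Hypothetical helper. Adjacent sub-windows share their boundary
    instant, so a caller using inclusive range bounds may see records
    at a boundary twice.
    """
    step = (end - start) / parts
    return [(start + i * step, start + (i + 1) * step) for i in range(parts)]


# Example: one day split into 24 hourly windows.
day_start = datetime.datetime(2020, 1, 1)
windows = split_window(day_start, day_start + datetime.timedelta(days=1), 24)
```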
|
Update parameters for datacite API v2. This works fine, but there are
occasional HTTP 400 responses when using the cursor API (daily updates
can exceed the 10000 record limit for search queries).
The HTTP 400 issue is not solved yet, but reported to datacite as
https://github.com/datacite/datacite/issues/897.
|
Replace emdash with regular dash.
Replace double slash after partner ID with single slash. This conversion
seems to be done by Crossref automatically on lookup. I tried several
examples, using the doi.org resolver and Crossref API lookup.
Note that there are a number of fatcat entities with '//' in the DOI.
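
The two replacements can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name `clean_doi_sketch` is hypothetical, and it assumes the "partner ID" is the DOI prefix before the first slash.

```python
def clean_doi_sketch(raw):
    """Apply the two DOI cleanups described above (hypothetical helper)."""
    doi = raw.strip()
    # Replace an em dash (U+2014) with a regular hyphen-minus.
    doi = doi.replace("\u2014", "-")
    # Split at the first slash: prefix (assumed partner ID) and suffix.
    prefix, sep, suffix = doi.partition("/")
    # If the suffix itself starts with a slash, the DOI contained "//"
    # directly after the prefix; drop one of the two slashes.
    if sep and suffix.startswith("/"):
        doi = prefix + "/" + suffix[1:]
    return doi
```

Only a double slash directly after the prefix is touched; a `//` later in the suffix is left alone in this sketch.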
|
The check was happening after the `return True` by mistake, allowing
duplicates in SPN editgroups, and potentially in ingest request
editgroups as well.
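
The class of bug is easy to reproduce in isolation. The sketch below is not the fatcat importer; `DedupeSketch` and its `want` method are hypothetical names chosen to show where the duplicate check has to sit relative to the return.

```python
class DedupeSketch:
    """Hypothetical stand-in for an importer that skips repeated URLs."""

    def __init__(self):
        # URLs already accepted in the current (assumed) editgroup.
        self._seen = set()

    def want(self, url):
        # The duplicate check must run before any unconditional return;
        # statements placed after `return True` are never reached.
        if url in self._seen:
            return False
        self._seen.add(url)
        return True
```

With the check placed after the return, every URL would be accepted, which matches the duplicate behavior described above.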
|
Small ergonomic changes for datacite releases:
- add a link to live/current datacite metadata (like we do for Crossref)
- expand "extra" metadata fields under 'datacite' dict in metadata view
|
loginpass patches were accepted upstream a while back, so there is no
need to pin to a git version.
ipython 7.10 seems to have problems installing, so restrict to
earlier 6.x versions.
|
This prevents a test exception that presents like:
tests/transform_csl.py:46:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
fatcat_tools/transforms/csl.py:204: in citeproc_csl
style_path = get_style_filepath(style)
.venv/lib/python3.5/site-packages/citeproc_styles/__init__.py:74: in get_style_filepath
if resource_exists(__name__, independent_style):
.venv/lib/python3.5/site-packages/pkg_resources/__init__.py:1134: in resource_exists
return get_provider(package_or_requirement).has_resource(resource_name)
.venv/lib/python3.5/site-packages/pkg_resources/__init__.py:1404: in has_resource
return self._has(self._fn(self.module_path, resource_name))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <pkg_resources.NullProvider object at 0x7f4f38c0bb00>
path = '/home/bnewbold/code/fatcat/python/.venv/lib/python3.5/site-packages/citeproc_styles/styles/bibtex.csl'
def _has(self, path):
raise NotImplementedError(
> "Can't perform this operation for unregistered loader type"
)
E NotImplementedError: Can't perform this operation for unregistered loader type
|
This is still manually tweaked. I believe I've narrowed the source of
the CSL/citeproc_style import error down to the upgrade of the 'pytest'
module. This commit upgrades all packages except pytest.
|
During debugging, it can be helpful to keep stdout (e.g. processing
results) and diagnostic messages separate.
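
A minimal sketch of this separation in Python follows. The function name `report` and its arguments are illustrative assumptions, not code from the commit.

```python
import sys


def report(result, note):
    """Write a result to stdout and a diagnostic note to stderr."""
    # Processing results go to stdout so they can be piped or redirected.
    print(result)
    # Diagnostic messages go to stderr and stay out of the result stream.
    print(note, file=sys.stderr)
```

Run from a shell, redirecting stdout to a file then captures only the results, while diagnostics remain visible on the terminal.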
|
Update EntityImporter docstring.
See merge request webgroup/fatcat!9
|
I believe the required method is `parse_record`, not `parse`.
|
The common case is the same URL being submitted repeatedly during
testing.
This is only within-editgroup, and per importer (e.g., it won't work across
spn importer "submitted" editgroups), but is better than nothing.
|
This is mostly changing ingest_type from 'file' to 'pdf', and adding
'link_source'/'link_source_id', plus some small cleanups.
|
We really should just use file_meta result or nothing.
|
Also fix a spurious typo.
|
As a form of documentation
|
Based on ingest-file-results importer
|
For use with bots that don't have admin privileges, or where human
follow-up review is desired.
|