<feed xmlns='http://www.w3.org/2005/Atom'>
<title>fatcat/python/tests/files, branch bnewbold-rust-gen-v5</title>
<subtitle>[no description]</subtitle>
<id>https://git.bnewbold.net/fatcat/atom?h=bnewbold-rust-gen-v5</id>
<link rel='self' href='https://git.bnewbold.net/fatcat/atom?h=bnewbold-rust-gen-v5'/>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/'/>
<updated>2020-04-22T20:25:36+00:00</updated>
<entry>
<title>datacite: fix type error</title>
<updated>2020-04-22T20:25:36+00:00</updated>
<author>
<name>Martin Czygan</name>
<email>martin.czygan@gmail.com</email>
</author>
<published>2020-04-22T20:25:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=e0baeade7924019c5bbd27d9a7c116a1e26854fc'/>
<id>urn:sha1:e0baeade7924019c5bbd27d9a7c116a1e26854fc</id>
<content type='text'>
Up to now, we expected the description to be a string or list. Add
handling for int as well.

First appeared: Apr 22 19:58:39.
</content>
</entry>
<entry>
<title>datacite: fix a raw name constraint violation</title>
<updated>2020-04-20T18:52:10+00:00</updated>
<author>
<name>Martin Czygan</name>
<email>martin.czygan@gmail.com</email>
</author>
<published>2020-04-20T18:52:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=7c6febf20c84dd4f5778e1fb02369456f7dad344'/>
<id>urn:sha1:7c6febf20c84dd4f5778e1fb02369456f7dad344</id>
<content type='text'>
Contribs without a raw name could still be added. One example is a
name consisting of whitespace only.

This fix adds a final check for this case.
</content>
</entry>
<entry>
<title>pubmed: handle multiple ReferenceList</title>
<updated>2020-03-20T20:00:52+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-03-20T20:00:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=a6f74183dd1cf1eaa44f7edeb98dbc5dc737dabb'/>
<id>urn:sha1:a6f74183dd1cf1eaa44f7edeb98dbc5dc737dabb</id>
<content type='text'>
This resolves a situation noticed in prod where we were only
importing/updating a single reference per article.

Includes a regression test.
</content>
</entry>
<entry>
<title>Merge branch 'martin-kafka-bs4-import' into 'master'</title>
<updated>2020-03-10T15:33:17+00:00</updated>
<author>
<name>Martin Czygan</name>
<email>martin@archive.org</email>
</author>
<published>2020-03-10T15:33:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=336630e1d445fb9d233447f9af4bac94473a12bf'/>
<id>urn:sha1:336630e1d445fb9d233447f9af4bac94473a12bf</id>
<content type='text'>
pubmed and arxiv harvest preparations

See merge request webgroup/fatcat!28</content>
</entry>
<entry>
<title>Merge branch 'bnewbold-elastic-v03b'</title>
<updated>2020-02-27T06:05:43+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-02-27T06:05:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=ae50ee2274031ddc178fa4a10b59280e8440a24c'/>
<id>urn:sha1:ae50ee2274031ddc178fa4a10b59280e8440a24c</id>
<content type='text'>
</content>
</entry>
<entry>
<title>more pubmed adjustments</title>
<updated>2020-02-22T16:44:38+00:00</updated>
<author>
<name>Martin Czygan</name>
<email>martin.czygan@gmail.com</email>
</author>
<published>2020-02-19T01:28:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=376053a479a8d683fc5e099d0b0b3cb76c026d16'/>
<id>urn:sha1:376053a479a8d683fc5e099d0b0b3cb76c026d16</id>
<content type='text'>
* regenerate map in continuous mode
* add tests
</content>
</entry>
<entry>
<title>shadow import: more filtering of file_meta fields</title>
<updated>2020-02-14T06:24:20+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-01-30T20:15:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=87029cb13d244381f915fe66e40760477edb5675'/>
<id>urn:sha1:87029cb13d244381f915fe66e40760477edb5675</id>
<content type='text'>
</content>
</entry>
<entry>
<title>basic shadow importer</title>
<updated>2020-02-14T06:24:20+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2019-12-24T01:59:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=e59d1b617d4abd5f002d9e59b6bbaebc9ff30993'/>
<id>urn:sha1:e59d1b617d4abd5f002d9e59b6bbaebc9ff30993</id>
<content type='text'>
</content>
</entry>
<entry>
<title>datacite: add exception for https://www.micropublication.org/</title>
<updated>2020-01-31T00:44:46+00:00</updated>
<author>
<name>Martin Czygan</name>
<email>martin.czygan@gmail.com</email>
</author>
<published>2020-01-31T00:44:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=a42206d2603e28f1311ac3873dc168c78eabffee'/>
<id>urn:sha1:a42206d2603e28f1311ac3873dc168c78eabffee</id>
<content type='text'>
</content>
</entry>
<entry>
<title>datacite: improve date handling and minor tweak</title>
<updated>2020-01-30T12:36:01+00:00</updated>
<author>
<name>Martin Czygan</name>
<email>martin.czygan@gmail.com</email>
</author>
<published>2020-01-30T12:36:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=7dec2d1560ebf5ca6d0d337eb246fe345f6ec0bb'/>
<id>urn:sha1:7dec2d1560ebf5ca6d0d337eb246fe345f6ec0bb</id>
<content type='text'>
Records from https://www.micropublication.org/ did not have a date in
fatcat, although the raw data contained date strings. They were not using
the finer-grained "attributes.date" but "attributes.published" and/or
"attributes.publicationYear".

Support for those fields has been added, including a test case.

During this test (#30), a processing gap for names became clear: an
author may have "given_name" and "surname", but no "name". This bug has
been fixed, too.
</content>
</entry>
</feed>
