<feed xmlns='http://www.w3.org/2005/Atom'>
<title>fatcat/python/fatcat_tools/workers, branch v0.3.3</title>
<subtitle>[no description]</subtitle>
<id>https://git.bnewbold.net/fatcat/atom?h=v0.3.3</id>
<link rel='self' href='https://git.bnewbold.net/fatcat/atom?h=v0.3.3'/>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/'/>
<updated>2020-12-16T23:00:18+00:00</updated>
<entry>
<title>entity update worker: treat fileset and webcapture updates like file updates</title>
<updated>2020-12-16T23:00:18+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-12-16T22:58:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=f60ba0ea04081ac0095c12d8ecbaa48b3da74aee'/>
<id>urn:sha1:f60ba0ea04081ac0095c12d8ecbaa48b3da74aee</id>
<content type='text'>
When webcapture or fileset entities are updated, the release entities
associated with them also need to be updated (and, recursively, the
work entities for those releases).

Remaining TODO: handle the case where a release_id is *removed*, not
just *added*, and reprocess the affected releases in that case too.
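
As a rough illustration only (this is not the worker's actual code), the
set of releases to reprocess could be taken as the union of the release
ids on the previous and current revisions, which would also cover the
removal case. The helper name and the dict shape with a 'release_ids'
list are assumptions made for this sketch.

```python
def release_ids_to_reprocess(old_entity, new_entity):
    """Collect release ids touched by an update to a file-like entity.

    Hypothetical helper: entities are plain dicts with an optional
    'release_ids' list; old_entity may be None for a newly created entity.
    """
    old_ids = set((old_entity or {}).get("release_ids") or [])
    new_ids = set((new_entity or {}).get("release_ids") or [])
    # The union covers release ids that were added as well as removed.
    return sorted(old_ids | new_ids)
```

Each returned release would then be re-fetched and re-indexed, along with
its work entity.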
</content>
</entry>
<entry>
<title>entity updates: don't ingest JSTOR DOI prefixes</title>
<updated>2020-10-23T20:17:34+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-10-23T20:17:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=bf77adc854022213951daec14bd904f483f21202'/>
<id>urn:sha1:bf77adc854022213951daec14bd904f483f21202</id>
<content type='text'>
</content>
</entry>
<entry>
<title>entity updater: new work update feed (ident and changelog metadata only)</title>
<updated>2020-10-16T23:41:16+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-10-16T03:53:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=99c54764f0cbbb05409001b4af83182842f4e52d'/>
<id>urn:sha1:99c54764f0cbbb05409001b4af83182842f4e52d</id>
<content type='text'>
</content>
</entry>
<entry>
<title>ingest: default to crawl protocols.io DOIs</title>
<updated>2020-09-11T01:49:35+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-09-11T01:49:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=3bac4de14219b5d8a3ef8f5919c4f004663424c0'/>
<id>urn:sha1:3bac4de14219b5d8a3ef8f5919c4f004663424c0</id>
<content type='text'>
</content>
</entry>
<entry>
<title>entity updater: handle doi=None case better</title>
<updated>2020-08-14T23:09:32+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-08-14T23:09:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=e024a38bd29686eaa687e54cde1dc255ba6af52b'/>
<id>urn:sha1:e024a38bd29686eaa687e54cde1dc255ba6af52b</id>
<content type='text'>
</content>
</entry>
<entry>
<title>entity updater: es['publisher_type'] not always set</title>
<updated>2020-08-14T23:05:57+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-08-14T23:05:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=67c2dd909de3c5dada6efe8db2f59ed09e76d439'/>
<id>urn:sha1:67c2dd909de3c5dada6efe8db2f59ed09e76d439</id>
<content type='text'>
This is a small bugfix for a production issue.
</content>
</entry>
<entry>
<title>entity update: change big5 ingest behavior</title>
<updated>2020-08-11T22:45:39+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-08-11T22:45:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=03d2004717d36962aef1bd373d59ce799d7db9ab'/>
<id>urn:sha1:03d2004717d36962aef1bd373d59ce799d7db9ab</id>
<content type='text'>
In addition to changing the OA default, this was the main intended
behavior change in this group of commits: we want to make fewer ingest
attempts that we *expect* to fail, but default to an ingest/crawl
attempt when we are uncertain. This is because there is a long tail of
journals that register DOIs and are de facto OA (fulltext is available),
but we don't have metadata indicating them as such.
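
A minimal sketch of that policy, under stated assumptions: the function
name, the is_oa flag, and the DOI prefix list are all hypothetical and
are not taken from the worker code. Letting known-OA releases bypass the
prefix check is also an assumption of this sketch.

```python
# Hypothetical DOI prefixes where crawl attempts are expected to fail.
EXPECTED_FAIL_DOI_PREFIXES = ("10.9999/",)


def should_attempt_ingest(doi, is_oa=None):
    """Return True unless the ingest attempt is expected to fail.

    is_oa may be True, False, or None (unknown); doi may be None.
    """
    if is_oa:
        # Known OA: always worth an attempt (assumption of this sketch).
        return True
    if doi and doi.lower().startswith(EXPECTED_FAIL_DOI_PREFIXES):
        # Expected to fail: skip the attempt.
        return False
    # Uncertain (no OA metadata, unlisted prefix, or no DOI): attempt anyway.
    return True
```

The fall-through case returns True, so a release with no OA metadata is
still crawled unless it matches a listed prefix.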
</content>
</entry>
<entry>
<title>entity update: default to ingest non-OA works</title>
<updated>2020-08-11T22:32:28+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-08-11T22:23:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=5eddc9b9aefbd7ae197d441b8a7af1fded940e2d'/>
<id>urn:sha1:5eddc9b9aefbd7ae197d441b8a7af1fded940e2d</id>
<content type='text'>
</content>
</entry>
<entry>
<title>entity update: skip ingest of figshare+zenodo 'group' DOIs</title>
<updated>2020-08-11T22:32:28+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-08-11T21:52:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=2a492914082444690f853a55ab1394fc0cf50108'/>
<id>urn:sha1:2a492914082444690f853a55ab1394fc0cf50108</id>
<content type='text'>
</content>
</entry>
<entry>
<title>update crawl blocklist for SPNv2 requests which mostly fail</title>
<updated>2020-08-10T22:07:19+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@robocracy.org</email>
</author>
<published>2020-08-10T22:07:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/fatcat/commit/?id=e9dd3c73f036d3fba2680eeaff8e62ecf2dbf9a1'/>
<id>urn:sha1:e9dd3c73f036d3fba2680eeaff8e62ecf2dbf9a1</id>
<content type='text'>
</content>
</entry>
</feed>
