author | Bryan Newbold <bnewbold@robocracy.org> | 2019-05-20 18:35:02 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@robocracy.org> | 2019-05-20 18:35:02 -0700 |
commit | ad7363b3a6dd9e595250698ce78177ebdeb12cd2 | |
tree | 3e106a7e380017c17540ea8729a9f82384eabc36 | |
parent | b9fb1c01e17133d7a8934dea542c3885204e6a0f | |
more release schema changes
-rw-r--r-- | guide/src/entity_release.md | 56 |
1 files changed, 56 insertions, 0 deletions
diff --git a/guide/src/entity_release.md b/guide/src/entity_release.md
index 709a020c..d9943d2a 100644
--- a/guide/src/entity_release.md
+++ b/guide/src/entity_release.md
@@ -24,6 +24,11 @@
   publicly available. Blank if only year is known.
 - `release_year` (integer): year when this release was first made
   publicly available; should match `release_date` if both are known.
+- `withdrawn_status` (string, controlled set):
+- `release_date` (string, ISO date format): when this release was first made
+  publicly available. Blank if only year is known.
+- `release_year` (integer): year when this release was first made
+  publicly available; should match `release_date` if both are known.
 - `ext_ids` (key/value object of string-to-string mappings): external
   identifiers. At least an empty `ext_ids` object is always required for
   release entities, so individual identifiers can be accessed directly.
@@ -36,6 +41,16 @@
   this release can be looked up under. This is a free-form string, and could
   represent the first page, a range of pages, or even prefix pages (like
   "xii-xxx").
+- `version` (string): optionally, distinguishes this release version
+  from others. Generally a number, software-style version, or other short/slug
+  string, not a freeform description. Book "edition" descriptions can also go
+  in an `edition` extra field. Often used in conjunction with external
+  identifiers. If you're not certain, don't use this field!
+- `number` (string): an inherent identifier for this release (or work), often
+  part of the title. For example, standards numbers, technical memo numbers,
+  book series number, etc. Not a book `chapter` number however (which can be
+  stored in `extra`). Depending on field or series-specific norms, the number
+  may be stored here, in the title, or in both fields.
 - `publisher` (string): name of the publishing entity. This does not need to be
   populated if the associated `container` entity has the publisher field set,
   though it is acceptable to duplicate, as the publishing entity of a container
@@ -106,6 +121,10 @@
 The `ext_ids` object name-spaces external identifiers and makes it easier to
 add new identifiers to the schema in the future.
 
+Many identifier fields must match an internal regex (string syntax constraint)
+to ensure they are properly formatted, though these checks aren't always
+complete or correct in more obscure cases.
+
 - `doi` (string): full DOI number, lower-case. Example: "10.1234/abcde.789".
   See the "External Identifiers" section of style guide for more notes about
   DOIs specifically.
@@ -241,6 +260,7 @@ All other CSL types are also allowed, though they are mostly out of scope:
 For the purpose of statistics, the following release types are considered
 "papers":
 
+- `article`
 - `article-journal`
 - `chapter`
 - `paper-conference`
@@ -278,6 +298,42 @@ does get a `withdrawn_status` metadata field set.
 When blank, indicates status isn't known, and wasn't inferred at creation time.
 Can often be interpreted as `published`, but be careful!
 
+#### `withdrawn_status` Vocabulary
+
+Don't know of an existing controlled vocabulary for things like retractions or
+other reasons for marking papers as removed from publication, so invented my
+own. These labels should be considered experimental and subject to change.
+
+Note that some of these will apply more to pre-print servers or publishing
+accidents, and don't necessarily make sense as a formal change of status for a
+print journal publication.
+
+Any value at all indicates that the release should be considered "no longer
+published by the publisher or primary host", which could mean different things
+in different contexts. As some concrete examples, works are often accidentally
+assigned a duplicate DOI; physics papers have been taken down in response to
+government order under national security justifications; papers have been
+withdrawn for public health reasons (above and beyond any academic-style
+retraction); entire journals may be found to be predatory and pulled from
+circulation; individual papers may be retracted by authors if a serious mistake
+or error is found; an author's entire publication history may be retracted in
+cases of serious academic misconduct or fraud.
+
+- `withdrawn` is generic: the work is no longer available from the original
+  publisher. There may be no reason, or the reason may not be known yet.
+- `retracted` for when a work is formally retracted, usually accompanied by a
+  retraction notice (a separate release under the same work). Note that the
+  retraction itself should not have a `withdrawn_status`.
+- `concern` for when publishers release an "expression of concern", often
+  indicating that the work is not reliable in some way, but not yet formally
+  retracted. In this case the original work is probably still available, but
+  should be marked as suspect. This is not the same as presence of errata.
+- `safety` for works pulled for public health or human safety concerns.
+- `national-security` for works pulled over national security concerns.
+- `spam` for content that is considered spam (e.g., bogus pre-print or repository
+  submissions). Not to be confused with advertisements or product reviews in
+  journals.
+
 #### `contribs.role` Vocabulary
 
 - `author`
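
The field rules touched by this commit can be illustrated with a small check. This is only a sketch, not fatcat's actual validation code: the helper name `check_release` and the constant `WITHDRAWN_STATUS_VALUES` are hypothetical, the release is assumed to be a plain dict with string dates, and the status labels are copied from the vocabulary in the diff, which the text itself calls experimental.

```python
from datetime import date
from typing import Any, Dict, List

# Labels copied from the `withdrawn_status` vocabulary added in this commit.
# The guide text marks them as experimental, so this set is only a snapshot.
WITHDRAWN_STATUS_VALUES = frozenset(
    {"withdrawn", "retracted", "concern", "safety", "national-security", "spam"}
)


def check_release(release: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a release-like dict (hypothetical helper)."""
    problems: List[str] = []

    # A blank status means "not known"; only non-blank values are checked.
    status = release.get("withdrawn_status")
    if status and status not in WITHDRAWN_STATUS_VALUES:
        problems.append(f"unknown withdrawn_status: {status!r}")

    # `release_year` should match `release_date` when both are known.
    release_date = release.get("release_date")
    release_year = release.get("release_year")
    if release_date:
        try:
            parsed = date.fromisoformat(release_date)
        except ValueError:
            problems.append(f"release_date is not an ISO date: {release_date!r}")
        else:
            if release_year is not None and parsed.year != release_year:
                problems.append("release_year does not match release_date")

    # At least an empty `ext_ids` object is always required.
    if "ext_ids" not in release:
        problems.append("ext_ids object is required (may be empty)")

    return problems
```

For example, a dict with `withdrawn_status` set to `"retracted"`, a `release_date` of `"2019-05-20"`, a `release_year` of `2019`, and an empty `ext_ids` passes with no problems reported.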