author | Bryan Newbold <bnewbold@robocracy.org> | 2019-02-14 12:24:55 -0800 |
---|---|---|
committer | Bryan Newbold <bnewbold@robocracy.org> | 2019-02-14 12:24:55 -0800 |
commit | 76ac2a96a6bd3910f8f4af18f79b539b1d29edf9 (patch) | |
tree | 0320d32ff6b51acdf5d27b10fb846e357fcea36a /guide/src/container_extra.md | |
parent | 22574f18e59bbed73ab1d76906a5ad5fb1d0f5f8 (diff) | |
provenance, not progeny
Diffstat (limited to 'guide/src/container_extra.md')
-rw-r--r-- | guide/src/container_extra.md | 78 |
1 files changed, 78 insertions, 0 deletions
diff --git a/guide/src/container_extra.md b/guide/src/container_extra.md
new file mode 100644
index 00000000..224b7e8a
--- /dev/null
+++ b/guide/src/container_extra.md
@@ -0,0 +1,78 @@
+
+'extra' fields:
+
+    doaj
+        as_of: datetime of most recent check; if not set, not actually in DOAJ
+        seal: bool
+        work_level: bool (are work-level publications deposited with DOAJ?)
+        archiving: array, can include 'library' or 'other'
+    road
+        as_of: datetime of most recent check; if not set, not actually in ROAD
+    pubmed (TODO: delete?)
+        as_of: datetime of most recent check; if not set, not actually indexed in pubmed
+    norwegian (TODO: drop this?)
+        as_of: datetime of most recent check; if not set, not actually indexed in pubmed
+        id (integer)
+        level (integer; 0-2)
+    kbart
+        lockss
+            year_rle
+            volume_rle
+        portico
+            ...
+        clockss
+            ...
+    sherpa_romeo
+        color
+    jstor
+        year_rle
+        volume_rle
+    scopus
+        id
+        TODO: print/electronic distinction?
+    wos
+        id
+    doi
+        crossref_doi: DOI of the title in crossref (if exists)
+        prefixes: array of strings (DOI prefixes, up to the '/'; any registrar, not just Crossref)
+    ia
+        sim
+            nap_id
+            year_rle
+            volume_rle
+        longtail: boolean
+        homepage
+            as_of: datetime of last attempt
+            url
+            status: HTTP/heritrix status of homepage crawl
+
+    issnp: string
+    issne: string
+    coden: string
+    abbrev: string
+    oclc_id: string (TODO: lookup?)
+    lccn_id: string (TODO: lookup?)
+    dblb_id: string
+    default_license: slug
+    original_name: native name (if name is translated)
+    platform: hosting platform: OJS, wordpress, scielo, etc
+    mimetypes: array of strings (eg, 'application/pdf', 'text/html')
+    first_year: year (integer)
+    last_year: if publishing has stopped
+    primary_language: single ISO code, or 'mixed'
+    languages: array of ISO codes
+    region: TODO: continent/world-region
+    nation: shortcode of nation
+    discipline: TODO: highest-level subject; "life science", "humanities", etc
+    field: TODO: narrower description of field
+    subjects: TODO?
+    url: homepage
+    is_oa: boolean. If true, can assume all releases under this container are "Open Access"
+    TODO: domains, if exclusive?
+    TODO: fulltext_regex, if a known pattern?
+
+For KBART, etc:
+    We "over-count" on the assumption that "in-progress" status works will soon actually be preserved.
+    year and volume spans are run-length-encoded arrays, using integers:
+      - if an integer, means that year is preserved
+      - if an array of length 2, means everything between the two numbers (inclusive) is preserved
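
Reviewer note: the run-length encoding described under "For KBART, etc" can be read as a list whose items are either a bare integer or an inclusive two-element range. The Python sketch below shows one way such a list might be expanded. It is an illustration, not code from this commit: the `expand_rle` helper and the sample values in `example_extra` are hypothetical, and only the key names (`kbart`, `lockss`, `year_rle`) come from the file above.

```python
from typing import Iterable, List, Union

# One RLE element: a single integer, or an inclusive [start, end] pair.
Span = Union[int, List[int]]


def expand_rle(spans: Iterable[Span]) -> List[int]:
    """Expand a run-length-encoded list into a sorted list of unique integers."""
    covered = set()
    for span in spans:
        if isinstance(span, int):
            # A bare integer means that single year (or volume) is preserved.
            covered.add(span)
        elif len(span) == 2:
            # A two-element array covers everything between the two numbers, inclusive.
            start, end = span
            covered.update(range(start, end + 1))
        else:
            raise ValueError(f"unexpected span: {span!r}")
    return sorted(covered)


# Hypothetical 'extra' fragment; the values are made up for illustration.
example_extra = {
    "kbart": {
        "lockss": {"year_rle": [1995, [1998, 2001], 2005]},
    },
}

years = expand_rle(example_extra["kbart"]["lockss"]["year_rle"])
print(years)  # [1995, 1998, 1999, 2000, 2001, 2005]
```

Returning a sorted, de-duplicated list keeps the result stable when spans overlap, which seems plausible given the "over-count" remark, though the guide does not say whether overlapping spans are expected.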