
'extra' fields:

    doaj
        as_of: datetime of most recent check; if not set, not actually in DOAJ
        seal: bool
        work_level: bool (are work-level publications deposited with DOAJ?)
        archiving: array, can include 'library' or 'other'
    road
        as_of: datetime of most recent check; if not set, not actually in ROAD
    pubmed (TODO: delete?)
        as_of: datetime of most recent check; if not set, not actually indexed in pubmed
    norwegian (TODO: drop this?)
        as_of: datetime of most recent check; if not set, not actually in the Norwegian register
        id (integer)
        level (integer; 0-2)
    kbart
        lockss
            year_rle
            volume_rle
        portico
            ...
        clockss
            ...
    sherpa_romeo
        color
    jstor
        year_rle
        volume_rle
    scopus
        id
        TODO: print/electronic distinction?
    wos
        id
    doi
        crossref_doi: DOI of the title in crossref (if exists)
        prefixes: array of strings (DOI prefixes, up to the '/'; any registrar, not just Crossref)
    ia
        sim
            nap_id
            year_rle
            volume_rle
        longtail: boolean
        homepage
            as_of: datetime of last attempt
            url
            status: HTTP/heritrix status of homepage crawl

    issnp: string
    issne: string
    coden: string
    abbrev: string
    oclc_id: string (TODO: lookup?)
    lccn_id: string (TODO: lookup?)
    dblp_id: string
    default_license: slug
    original_name: native name (if name is translated)
    platform: hosting platform: OJS, wordpress, scielo, etc
    mimetypes: array of strings (eg, 'application/pdf', 'text/html')
    first_year: year (integer)
    last_year: if publishing has stopped
    primary_language: single ISO code, or 'mixed'
    languages: array of ISO codes
    region: TODO: continent/world-region
    nation: shortcode of nation
    discipline: TODO: highest-level subject; "life science", "humanities", etc
    field: TODO: narrower description of field
    subjects: TODO?
    url: homepage
    is_oa: boolean. If true, can assume all releases under this container are "Open Access"
    TODO: domains, if exclusive?
    TODO: fulltext_regex, if a known pattern?
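
As a rough illustration of how these fields nest, the sketch below builds a hypothetical 'extra' object as a Python dict and checks index membership using the 'as_of' convention described above. Every value (ISSN, dates, spans) is an invented placeholder, and `in_index` is a hypothetical helper written for this example, not part of any existing API.

```python
# Hypothetical 'extra' object; every value here is a made-up placeholder.
extra = {
    "doaj": {"as_of": "2019-01-01T00:00:00Z", "seal": False, "work_level": True},
    "road": {},  # no 'as_of' set, so treated as not actually in ROAD
    "kbart": {
        "lockss": {"year_rle": [[1995, 2001], 2004]},
    },
    "issnp": "1234-5679",
    "languages": ["en", "fr"],
    "primary_language": "mixed",
    "first_year": 1995,
    "is_oa": True,
}


def in_index(extra, index):
    """Return True if the named index block has an 'as_of' timestamp set."""
    block = extra.get(index) or {}
    return bool(block.get("as_of"))
```

With this object, `in_index(extra, "doaj")` is true, while `in_index(extra, "road")` and `in_index(extra, "pubmed")` are both false, since a missing block and a block without 'as_of' are treated the same way.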

For KBART, etc (the 'year_rle' and 'volume_rle' fields):
    We "over-count" on the assumption that works with "in-progress" status will soon actually be preserved.
    Year and volume spans are run-length-encoded arrays of integers. Each element is either:
        - a single integer, meaning that year (or volume) is preserved
        - an array of length 2, meaning everything between the two numbers (inclusive) is preserved
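
The run-length encoding described above can be decoded with a few lines of Python. This is a minimal sketch: it assumes the spans arrive as a list whose elements are either plain integers or two-element lists, and the names `expand_rle` and `is_preserved` are hypothetical helpers chosen for this example.

```python
def expand_rle(spans):
    """Expand a run-length-encoded span list into a sorted list of integers."""
    values = set()
    for item in spans:
        if isinstance(item, int):
            # A bare integer marks a single preserved year or volume.
            values.add(item)
        elif isinstance(item, (list, tuple)) and len(item) == 2:
            # A pair marks an inclusive range.
            start, end = item
            values.update(range(start, end + 1))
        else:
            raise ValueError(f"unexpected span element: {item!r}")
    return sorted(values)


def is_preserved(spans, value):
    """Check membership without expanding the whole list."""
    for item in spans:
        if isinstance(item, int):
            if item == value:
                return True
        else:
            start, end = item
            if start <= value <= end:
                return True
    return False
```

For example, `expand_rle([1990, [1995, 1998], 2003])` yields `[1990, 1995, 1996, 1997, 1998, 2003]`. Because of the "over-count" note above, a positive answer from `is_preserved` may include works that are still in progress rather than fully preserved.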