aboutsummaryrefslogtreecommitdiffstats
path: root/notes/examples/dataset_examples.txt
blob: 3a0475015ee66e6dc7a4e8dc977cb7b5c7c08d5b (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52

### ArchiveOrg: CAT dataset

<https://archive.org/details/CAT_DATASET>

`release_36vy7s5gtba67fmyxlmijpsaui`

###

<https://archive.org/details/academictorrents_70e0794e2292fc051a13f05ea6f5b6c16f3d3635>

doi:10.1371/journal.pone.0120448

Single .rar file

### Dataverse

<https://dataverse.rsu.lv/dataset.xhtml?persistentId=doi:10.48510/FK2/IJO02B>

Single excel file

### Dataverse

<https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/CLSFKX&version=1.1>

doi:10.7910/DVN/CLSFKX

Mulitple files; multiple versions?

API fetch: <https://dataverse.harvard.edu/api/datasets/:persistentId/?persistentId=doi:10.7910/DVN/CLSFKX&version=1.1>

    .data.id
    .data.latestVersion.datasetPersistentId
    .data.latestVersion.versionNumber, .versionMinorNumber
    .data.latestVersion.files[]
        .dataFile
            .contentType (mimetype)
            .filename
            .filesize (int, bytes)
            .md5
            .persistendId
            .description
        .label (filename?)
        .version

Single file inside: <https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/CLSFKX/XWEHBB>

Download single file: <https://dataverse.harvard.edu/api/access/datafile/:persistentId/?persistentId=doi:10.7910/DVN/CLSFKX/XWEHBB> (redirects to AWS S3)

Dataverse refs:
- 'doi' and 'hdl' are the two persistentId styles
- file-level persistentIds are optional, on a per-instance basis: https://guides.dataverse.org/en/latest/installation/config.html#filepidsenabled