fatcat-cli(1) "Fatcat API Tool Manual Page"

# NAME

fatcat-cli - client for fatcat.wiki API

# SYNOPSIS

fatcat-cli [OPTIONS] <COMMAND> <ARGS>

# DESCRIPTION

This is a simple command-line interface to the fatcat catalog API. Fatcat (https://fatcat.wiki) is an open bibliographic catalog of scholarly works, with a focus on access and preservation.

Many commands work out of the box, but all editing actions require authentication. Create an account on https://fatcat.wiki, generate an API token (a long string of random characters) from the account page, and export it to your shell environment (see OPTIONS below for the environment variable to use).
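A minimal sketch of that setup. The token value is a placeholder, and the `command -v` guard is only there so the snippet is harmless on a machine where fatcat-cli is not installed; the `FATCAT_API_AUTH_TOKEN` variable and the `status` command are the ones documented later in this page.

```shell
# Placeholder only: paste the token generated on your fatcat.wiki account page.
export FATCAT_API_AUTH_TOKEN="paste-your-api-token-here"

# Check that the client can reach the API and that the token is picked up.
if command -v fatcat-cli >/dev/null 2>&1; then
	fatcat-cli status || echo "status check failed" >&2
fi
```

Putting the `export` line in a shell profile avoids passing `--api-token` on every invocation.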

# COMMANDS

## Search Commands

	*search* <ENTITY-TYPE> <QUERY>...
		Query the search index for entities of a specified type. Currently `release`, `container`, `fulltext`, `refs`, and `file` indexes are searchable. By default prints a table with a subset of metadata, but `--index-json` will output the search engine JSON document, or `--entity-json` will do an API fetch for each result and print the full entity JSON.

## Single Entity Commands

Most commands for interacting with individual catalog entities take a "specifier" which implies an entity type. These can be fatcat-specific "idents", which are an entity type followed by an underscore, then a 26-character hash, such as "release_hsmo6p4smrganpb3fndaj2lon4". Or they can be an external identifier type, followed by a colon and the identifier, such as "doi:10.1002/spe.659".
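To make the two specifier forms concrete, here is a rough shell sketch that tells them apart. `specifier_kind` is a hypothetical helper written for this illustration, not part of fatcat-cli, and it only checks the shape described above (a colon, or an underscore followed by 26 characters); the tool's own parsing may be stricter.

```shell
# Hypothetical helper (not part of fatcat-cli): guess the form of a specifier.
specifier_kind() {
	case "$1" in
	*:*)
		# <type>:<identifier>, an external identifier
		echo "external"
		;;
	*_*)
		# <entity-type>_<hash>: strip everything up to the first underscore
		suffix="${1#*_}"
		if [ "${#suffix}" -eq 26 ]; then
			echo "ident"
		else
			echo "unknown"
		fi
		;;
	*)
		echo "unknown"
		;;
	esac
}

specifier_kind "release_hsmo6p4smrganpb3fndaj2lon4"
specifier_kind "doi:10.1002/spe.659"
```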

	*get* <SPECIFIER> [--expand <FIELDS>] [--hide <FIELDS>] [--json/--toml]
		Simply fetches the specified entity or other object from the API and prints to stdout. Currently pretty-prints JSON, but this behavior may change.

	*create* [-i/--input-file <PATH>] [-e/--editgroup-id <id>]
		Reads entity from file (or stdin), and adds to the editgroup specified by argument or environment variable.

	*update* <SPECIFIER> [<FIELD>=<VALUE> ...] [-i/--input-file <PATH>] [-e/--editgroup-id <id>]
		Can operate in two ways. If no input file is given, will fetch the specified entity, apply the given mutations (updating field values), and push the update to the specified editgroup. If an input file is given, that will be used instead of fetching from the API.
		If the current editgroup already contains an edit to the same entity, that edit is deleted and replaced ("update the edit"). Note that the existing edit could be lost if the update then fails.

	*delete* <SPECIFIER> [-e/--editgroup-id <id>]
		Deletes the specified entity, as part of the specified editgroup.

	*edit* <SPECIFIER> [-e/--editgroup-id <id>] [--toml] [--editing-command <EDITOR>]
		Helper command to edit the given entity using a local text editor. Fetches the entity, opens `$EDITOR` to modify it, then pushes the saved version as part of the given editgroup.

	*download* <SPECIFIER> [-o/--output-dir <path>]
		Downloads a publicly accessible full-text version of the given entity to disk, if one exists. Currently works with file and release entities. Most files are PDF.

	*history* <SPECIFIER> [-n/--limit <count>] [--json]
		Displays the (accepted) edit history for the given entity.

## Batch Commands

Batch editing commands operate on a stream of entities by automatically creating new editgroups of a fixed batch size. Please be careful with these commands! Start small, and test against the QA API environment (api.qa.fatcat.wiki).

	*batch update* [<FIELD>=<VALUE> ...]
		Same as the `update` command, but operates on a stream of JSON entities (one per line).

	*batch create*
		Same as the `create` command, but operates on a stream of JSON entities (one per line).

	*batch download* [-j/--jobs=N]
		Same as `download`, but operates on a stream of entities. A tab-separated log of {entity, status, path} will be printed to stdout. The jobs argument can be used to download multiple files in parallel, up to a reasonable limit.
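Because the `batch download` log is tab-separated, it can be summarized with standard tools. The sketch below counts rows per status value. `summarize_download_log` is a hypothetical helper, and the sample rows are made up purely to show the shape of the input; real entity, status, and path values will differ.

```shell
# Hypothetical helper: count log rows per status (second tab-separated column).
summarize_download_log() {
	awk -F '\t' '{ counts[$2]++ } END { for (s in counts) print counts[s] "\t" s }' | sort -k2
}

# Made-up sample rows in {entity, status, path} order, for illustration only.
printf 'release_aaa\tstatus-one\t./a.pdf\nrelease_bbb\tstatus-two\t\nrelease_ccc\tstatus-one\t./c.pdf\n' |
	summarize_download_log
```

In real use the function would sit at the end of a pipeline, reading the stdout of `fatcat-cli batch download`.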

## Editgroup Commands

	*editgroups list* [-n/--limit <count>] [-e/--editor-id <ident>] [--json]
		Prints a simple table of editgroups created by the current user (requires authentication).

	*editgroups reviewable* [--json]
		Prints a table of "submitted" but not "accepted" editgroups, from all editors, which need review

	*editgroups submit* <EDITGROUP-ID>
		Submit the given editgroup for review (requires authentication)

	*editgroups unsubmit* <EDITGROUP-ID>
		Withdraws submission for review, so the editgroup can be further edited (requires authentication)

	*editgroups accept* <EDITGROUP-ID>
		Accepts the editgroup changes into the catalog (requires authentication and admin permissions).

## Other Commands

	*changelog* [--json]
		Prints a table of recent changelog entries (accepted editgroups)

	*status* [--json]
		Summarizes connection and authentication to the API server. Useful for debugging

# OPTIONS

*-h, --help*
	Prints help information

*-V, --version*
	Prints version information

*-v, --verbose*
	Pass multiple times for more log output. By default, only errors are reported. Passing `-v` once also prints warnings, `-vv` enables info logging, `-vvv` debug, and `-vvvv` trace.

*--api-host <api-host>* [env: FATCAT_API_HOST] [default: https://api.fatcat.wiki]

*--api-token <api-token>* [env: FATCAT_API_AUTH_TOKEN]

*--search-host <search-host>* [env: FATCAT_SEARCH_HOST] [default: https://search.fatcat.wiki]

## Search Options

*--count*
	Just print the number of search results matching the query, instead of displaying the results themselves.

*-n, --limit <count>*
	Maximum number of search rows to be printed. Set to 0 to print all results (this is not the default behavior).

*--expand <fields>*
	When the output is full entity JSON objects (`--entity-json`), this argument is forwarded as the 'expand' parameter in API fetches. Multiple expansions can be separated by commas, with no space. For example, `--expand files,filesets`.

*--hide <fields>*
	Same as `--expand`, but for hiding fields/sub-entities.

*--entity-json*
	For each search result row, do an API fetch for the entity and print the entity as JSON. Because there is an API call for each row, this is much slower than the default table output, or the `--index-json` output.

*--index-json*
	For each search result row, print the search engine (Elasticsearch) indexed "document", as JSON.

## Batch Options

*-i, --input-file*
	JSON lines file to read entities from. Defaults to stdin; "-" can also be passed to explicitly use stdin.

*-n, --limit <count>*
	Only operate on the given number of entities. By default, no limit. Good to use defensively to prevent large accidental edits.

*--batch-size <count>*
	For editing batch commands, how many entity edits should be bundled into each editgroup.

*--auto-accept*
	For editing batch commands, this argument will result in each editgroup being accepted without review. Requires admin permissions.
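The batch options above can be combined into a cautious first run against the QA environment. This is a sketch under assumptions: the `https://` scheme for the QA host mirrors the production default but is not stated in this page, `entities.json` is a hypothetical JSON lines file, and `example_field=example_value` is a placeholder mutation rather than a real field. An API token must also be set, since batch editing requires authentication.

```shell
# Point the client at the QA environment instead of production.
# The https:// scheme is an assumption, mirroring the default production host.
export FATCAT_API_HOST="https://api.qa.fatcat.wiki"

# entities.json is a hypothetical JSON lines file (one entity per line);
# the field name and value are placeholders for illustration.
if command -v fatcat-cli >/dev/null 2>&1 && [ -f entities.json ]; then
	fatcat-cli batch update example_field=example_value \
		--input-file entities.json \
		--limit 10 \
		--batch-size 5
fi
```

With `--limit 10` and `--batch-size 5`, at most ten entities are touched, bundled five edits per editgroup, which keeps a mistaken mutation easy to review and discard.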

# EDITING

Every change to the catalog (an "edit") is made as part of an "editgroup". In some cases the CLI tool will create or guess the editgroup you are currently working on, but you can also create editgroups explicitly and pass the editgroup identifier on every subsequent edit. It is best to combine small groups of related changes into the same editgroup (so they can be reviewed together), but to split up larger batches into editgroups of 50-100 changes at a time.

Create a new editgroup:

	fatcat-cli editgroups create --description "demonstration edit"

	# grab the editgroup_id from the output, eg "uy7qzonuwbcitdhhyuk5vjtsdy"

Individual entities can be edited from the convenience of your text editor, in either JSON or TOML format:

	fatcat-cli get release_hsmo6p4smrganpb3fndaj2lon4 --json > release_hsmo6p4smrganpb3fndaj2lon4.json

	# whatever editor you prefer
	$EDITOR release_hsmo6p4smrganpb3fndaj2lon4.json

	fatcat-cli update release_hsmo6p4smrganpb3fndaj2lon4 -e <editgroup_id> < release_hsmo6p4smrganpb3fndaj2lon4.json

Or, with a single command:

	fatcat-cli edit release_hsmo6p4smrganpb3fndaj2lon4 --toml -e <editgroup_id>

To check in on the status of recent editgroups, or to "submit" them for review:

	fatcat-cli editgroups list
	fatcat-cli editgroups submit <editgroup_id>


# EXAMPLES

Query the catalog:

	fatcat-cli search releases author:phillips metadata year:2014

Fetch metadata for a specific work:

	fatcat-cli get doi:10.1002/spe.659

Download 100 papers from a specific journal, as PDF, to current folder:

	fatcat-cli search releases journal:"first monday" --entity-json --expand files -n0 | fatcat-cli batch download --limit 100