Diffstat (limited to 'minio/README.md')
-rw-r--r-- minio/README.md 24
1 files changed, 24 insertions, 0 deletions
diff --git a/minio/README.md b/minio/README.md
new file mode 100644
index 0000000..8e8e29f
--- /dev/null
+++ b/minio/README.md
@@ -0,0 +1,24 @@
+
+minio is used as an S3-compatible blob store. The initial use case is storing GROBID
+XML documents, addressed by the SHA1 of the PDF file the XML was extracted from.
+
+Note that on the backend, minio simply stores objects as files on disk.
+
+## Buckets
+
+Notable buckets, and structure/naming convention:
+
+    grobid/
+        2c/0d/2c0daa9307887a27054d4d1f137514b0fa6c6b2d.tei.xml
+        SHA1 (lower-case hex) of the PDF that the XML was extracted from
+
+Create new buckets like:
+
+    mc mb sandcrawler/grobid
+
+## Users
+
+Create a new readonly user like:
+
+    mc admin user add sandcrawler unpaywall $RANDOM_SECRET_KEY readonly
+
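
The naming convention in the Buckets section of the README can be sketched in code. This is a minimal illustration, not something the README defines: the function names `grobid_key_for_sha1` and `grobid_key_for_pdf` are hypothetical, and the rule that the two directory levels are the first two pairs of hex characters of the digest is an assumption inferred from the single example path.

```python
import hashlib


def grobid_key_for_sha1(sha1_hex: str) -> str:
    """Build the object key inside the grobid bucket from a PDF's SHA1 hex digest.

    Assumed layout (inferred from the README example):
    <hex chars 0-1>/<hex chars 2-3>/<full lower-case sha1>.tei.xml
    """
    sha1_hex = sha1_hex.lower()
    # A SHA1 digest is 20 bytes, i.e. 40 hex characters.
    if len(sha1_hex) != 40 or any(c not in "0123456789abcdef" for c in sha1_hex):
        raise ValueError("expected a 40-character hex SHA1 digest")
    return f"{sha1_hex[0:2]}/{sha1_hex[2:4]}/{sha1_hex}.tei.xml"


def grobid_key_for_pdf(pdf_bytes: bytes) -> str:
    """Hash the raw PDF bytes and return the matching object key."""
    return grobid_key_for_sha1(hashlib.sha1(pdf_bytes).hexdigest())
```

Calling `grobid_key_for_sha1` with the digest from the README example reproduces the example path, so the resulting key could be appended to `sandcrawler/grobid/` when copying a file with `mc`.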