From 51e73fa019577bb3b5443274767252c748d5773a Mon Sep 17 00:00:00 2001
From: Bryan Newbold
Date: Thu, 8 Aug 2019 16:43:20 -0700
Subject: minio README

---
 minio/README.md | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
 create mode 100644 minio/README.md

diff --git a/minio/README.md b/minio/README.md
new file mode 100644
index 0000000..8e8e29f
--- /dev/null
+++ b/minio/README.md
@@ -0,0 +1,24 @@
+
+minio is used as an S3-compatible blob store. Initial use case is GROBID XML
+documents, addressed by the sha1 of the PDF file the XML was extracted from.
+
+Note that on the backend minio is just storing objects as files on disk.
+
+## Buckets
+
+Notable buckets, and structure/naming convention:
+
+    grobid/
+        2c/0d/2c0daa9307887a27054d4d1f137514b0fa6c6b2d.tei.xml
+        SHA1 (lower-case hex) of PDF that XML was extracted from
+
+Create new buckets like:
+
+    mc mb sandcrawler/grobid
+
+## Users
+
+Create a new readonly user like:
+
+    mc admin user add sandcrawler unpaywall $RANDOM_SECRET_KEY readonly
+
-- 
cgit v1.2.3
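
Note on the naming convention in the `grobid/` bucket: the README gives only one example key, so the following is a minimal sketch rather than code from this patch. It assumes the two directory levels are the first two and the next two hex characters of the SHA-1, which matches the example `2c/0d/2c0daa93...`. The function names `key_from_sha1` and `grobid_object_key` are hypothetical.

```python
import hashlib


def key_from_sha1(sha1hex: str) -> str:
    """Map a PDF's SHA-1 hex digest to an object key inside the grobid bucket.

    Assumed layout (inferred from the README example, not confirmed):
    <first 2 hex chars>/<next 2 hex chars>/<full sha1>.tei.xml
    """
    sha1hex = sha1hex.lower()  # the README specifies lower-case hex
    if len(sha1hex) != 40 or any(c not in "0123456789abcdef" for c in sha1hex):
        raise ValueError(f"not a SHA-1 hex digest: {sha1hex!r}")
    return f"{sha1hex[0:2]}/{sha1hex[2:4]}/{sha1hex}.tei.xml"


def grobid_object_key(pdf_bytes: bytes) -> str:
    """Hash the raw PDF bytes and build the key for its extracted TEI-XML."""
    return key_from_sha1(hashlib.sha1(pdf_bytes).hexdigest())


if __name__ == "__main__":
    # Reproduces the example key shown in the README.
    print(key_from_sha1("2c0daa9307887a27054d4d1f137514b0fa6c6b2d"))
```

The resulting key would be appended to the bucket path from the README, for example as the target `sandcrawler/grobid/<key>`.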