-rw-r--r-- | minio/README.md | 24 |
1 files changed, 24 insertions, 0 deletions
diff --git a/minio/README.md b/minio/README.md
new file mode 100644
index 0000000..8e8e29f
--- /dev/null
+++ b/minio/README.md
@@ -0,0 +1,24 @@
+
+minio is used as an S3-compatible blob store. Initial use case is GROBID XML
+documents, addressed by the sha1 of the PDF file the XML was extracted from.
+
+Note that on the backend minio is just storing objects as files on disk.
+
+## Buckets
+
+Notable buckets, and structure/naming convention:
+
+    grobid/
+        2c/0d/2c0daa9307887a27054d4d1f137514b0fa6c6b2d.tei.xml
+        SHA1 (lower-case hex) of PDF that XML was extracted from
+
+Create new buckets like:
+
+    mc mb sandcrawler/grobid
+
+## Users
+
+Create a new readonly user like:
+
+    mc admin user add sandcrawler unpaywall $RANDOM_SECRET_KEY readonly
+
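
The naming convention the README describes for the `grobid` bucket (two levels of two-character prefix directories, then the full SHA1 with a `.tei.xml` suffix) can be sketched as a small helper. This is only an illustration inferred from the single example path in the README, not code from the repository: the function names `grobid_object_key` and `sha1_of_pdf_bytes` are hypothetical, and treating the returned string as the key inside the bucket (without the `grobid/` prefix) is an assumption.

```python
import hashlib
import re

# Lower-case hex SHA1, as the README specifies for object names.
_SHA1_RE = re.compile(r"[0-9a-f]{40}")


def sha1_of_pdf_bytes(pdf_bytes: bytes) -> str:
    """Return the lower-case hex SHA1 of a PDF's raw bytes."""
    return hashlib.sha1(pdf_bytes).hexdigest()


def grobid_object_key(pdf_sha1: str) -> str:
    """Build the object key for a GROBID TEI-XML document.

    Hypothetical helper: the layout (first two hex characters, next two,
    then the full hash plus ".tei.xml") is inferred from the README's
    example path. The bucket name is assumed not to be part of the key.
    """
    if not _SHA1_RE.fullmatch(pdf_sha1):
        raise ValueError("expected a 40-character lower-case hex SHA1")
    return f"{pdf_sha1[0:2]}/{pdf_sha1[2:4]}/{pdf_sha1}.tei.xml"


if __name__ == "__main__":
    # Reproduces the example path from the README.
    print(grobid_object_key("2c0daa9307887a27054d4d1f137514b0fa6c6b2d"))
```

Rejecting upper-case input rather than silently lower-casing it keeps callers from writing two different keys for the same PDF; whether the real pipeline normalizes or rejects is not stated in the README.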