author | Bryan Newbold <bnewbold@archive.org> | 2019-08-08 16:43:20 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2019-08-08 16:43:20 -0700 |
commit | 51e73fa019577bb3b5443274767252c748d5773a (patch) | |
tree | 847431e66ff7250bbaca02d862823dd520c46b3b /minio | |
parent | 48a802d42cff309543466a9f23245aa93c6d84ea (diff) | |
minio README
Diffstat (limited to 'minio')
-rw-r--r-- | minio/README.md | 24 |
1 files changed, 24 insertions, 0 deletions
diff --git a/minio/README.md b/minio/README.md
new file mode 100644
index 0000000..8e8e29f
--- /dev/null
+++ b/minio/README.md
@@ -0,0 +1,24 @@
+
+minio is used as an S3-compatible blob store. Initial use case is GROBID XML
+documents, addressed by the sha1 of the PDF file the XML was extracted from.
+
+Note that on the backend minio is just storing objects as files on disk.
+
+## Buckets
+
+Notable buckets, and structure/naming convention:
+
+    grobid/
+        2c/0d/2c0daa9307887a27054d4d1f137514b0fa6c6b2d.tei.xml
+        SHA1 (lower-case hex) of PDF that XML was extracted from
+
+Create new buckets like:
+
+    mc mb sandcrawler/grobid
+
+## Users
+
+Create a new readonly user like:
+
+    mc admin user add sandcrawler unpaywall $RANDOM_SECRET_KEY readonly
+
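
The naming convention in the README above (two levels of two-character prefix directories, then the full SHA1 with a `.tei.xml` suffix) can be sketched in Python. This is only an illustration under stated assumptions: the function names `grobid_object_key` and `sha1_of_pdf` are hypothetical and do not come from the sandcrawler code, and normalizing the input to lower case is an assumption based on the README's "lower-case hex" note.

```python
import hashlib
import re


def sha1_of_pdf(pdf_bytes: bytes) -> str:
    """Return the lower-case hex SHA1 of a PDF's raw bytes."""
    return hashlib.sha1(pdf_bytes).hexdigest()


def grobid_object_key(sha1hex: str) -> str:
    """Build the object key inside the `grobid` bucket for a PDF's SHA1.

    Hypothetical helper: mirrors the layout shown in the README,
    e.g. 2c/0d/2c0daa93...6b2d.tei.xml
    """
    # Assumption: accept upper-case input and normalize it to lower case.
    sha1hex = sha1hex.strip().lower()
    if not re.fullmatch(r"[0-9a-f]{40}", sha1hex):
        raise ValueError("expected a 40-character hex SHA1")
    # First two hex characters, next two hex characters, then the full hash.
    return f"{sha1hex[0:2]}/{sha1hex[2:4]}/{sha1hex}.tei.xml"


if __name__ == "__main__":
    print(grobid_object_key("2c0daa9307887a27054d4d1f137514b0fa6c6b2d"))
```

Called with the SHA1 from the README example, the helper returns the same key that the README lists under `grobid/`.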