- small set of example files, along with their extracted text, tokenized text, token hashes, and complete hashes (in this repo)