Commit message | Author | Age | Files | Lines |
---|---|---|---|---|
ES schema: add best_url to file schema | Bryan Newbold | 2020-06-04 | 1 | -0/+1 |
This will increase index size (URLs are often long in our corpus, and we have many file entities), but seems worth it. Initially added `ia_url` as a second field, guaranteed to always be an *.archive.org URL, but `best_url` defaults to that anyways so didn't seem worthwhile. | | | | |
ES schemas: make keywords case-insensitive by default | Bryan Newbold | 2020-01-30 | 1 | -11/+23 |
But not applying asciifolding; don't see any need to do so? | | | | |
tweak file ES archive.org domain tracking | Bryan Newbold | 2020-01-30 | 1 | -0/+1 |
elastic schema fixes | Bryan Newbold | 2020-01-29 | 1 | -6/+6 |
first implementation of ES file schema | Bryan Newbold | 2020-01-29 | 1 | -0/+46 |
Includes a trivial test and transform, but not any workers or doc updates. | | | | |
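
For readers unfamiliar with the "case-insensitive keywords, no asciifolding" change, the following is a minimal sketch of how such a setup is commonly expressed in an Elasticsearch index definition, written here as a Python dict. It is not copied from the repository: the normalizer name `lowercase_only`, the field `example_keyword`, the `keyword` type for `best_url`, and the typeless `mappings` layout are all assumptions for illustration. The `normalize_keyword` helper is a hypothetical stand-in that only approximates the effect of a lowercase token filter.

```python
# Hypothetical sketch of an index definition; names are illustrative only.
FILE_SCHEMA_SKETCH = {
    "settings": {
        "index": {
            "analysis": {
                "normalizer": {
                    # Custom normalizer: lowercase only, deliberately
                    # without an "asciifolding" filter.
                    "lowercase_only": {
                        "type": "custom",
                        "char_filter": [],
                        "filter": ["lowercase"],
                    }
                }
            }
        }
    },
    "mappings": {
        "properties": {
            # Placeholder keyword field using the normalizer.
            "example_keyword": {
                "type": "keyword",
                "normalizer": "lowercase_only",
            },
            # Field named in the log; its type and normalizer are
            # assumptions, since the log does not state them.
            "best_url": {"type": "keyword"},
        }
    },
}


def normalize_keyword(value: str) -> str:
    """Approximate a lowercase-only normalizer: case is folded, accents are kept."""
    return value.lower()


if __name__ == "__main__":
    for raw in ("PDF", "Émile"):
        print(raw, "->", normalize_keyword(raw))
```

With a normalizer like this, values that differ only in case are indexed and matched as the same keyword, while accented characters stay distinct because no ASCII folding is applied.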