| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | content_scope: include in file ES schema and transform | Bryan Newbold | 2021-11-17 | 1 | -0/+1 |
* | ES schemas: add doc_index_ts to all mappings | Bryan Newbold | 2021-04-06 | 1 | -0/+1 |
* | elasticsearch schema, docs, docker: update from ES 6.x to ES 7.x | Bryan Newbold | 2021-04-06 | 1 | -1/+3 |
| | Including removing index document names (use '_doc' instead during transition) | | | | |
* | ES schema: add best_url to file schema | Bryan Newbold | 2020-06-04 | 1 | -0/+1 |
| | This will increase index size (URLs are often long in our corpus, and we have many file entities), but seems worth it. Initially added `ia_url` as a second field, guaranteed to always be an `*.archive.org` URL, but `best_url` defaults to that anyways so didn't seem worthwhile. | | | | |
* | ES schemas: make keywords case-insensitive by default | Bryan Newbold | 2020-01-30 | 1 | -11/+23 |
| | But not applying asciifolding; don't see any need to do so? | | | | |
* | tweak file ES archive.org domain tracking | Bryan Newbold | 2020-01-30 | 1 | -0/+1 |
* | elastic schema fixes | Bryan Newbold | 2020-01-29 | 1 | -6/+6 |
* | first implementation of ES file schema | Bryan Newbold | 2020-01-29 | 1 | -0/+46 |
| | Includes a trivial test and transform, but not any workers or doc updates. | | | | |