[Elasticsearch] Create documentation and tooling to update the index mapping with zero downtime
We often need to update an index mapping to improve search, fix a bug, add a new field, add a new analyzer, and so on. For relatively small instances this can be done by simply removing the index and creating a new one, but that is only acceptable if reindexing finishes in a few minutes or if there is no high-availability requirement. For those who need to do it with no downtime we have to prepare documentation on how to do it. We also need to prepare an example pipeline for Logstash.
The common practice is to use the following strategy:
- Update the application
- Create a new empty index with the new mapping
- Use Logstash to copy all data from the old index to the new one (it's much faster than reading the data from the database/repository again). Copying should be limited to documents older than some cut-off date (by a date condition), since new data continues to go to the old index. We need to use the Scroll API and bulk indexing here to make it fast. All of this can be configured in Logstash.
- Remove the old index and create an alias with its name pointing to the new one. From this point GitLab will work with the new index.
- Run indexing again for dates starting from the last reindexing run (so we don't lose data).
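A minimal Logstash pipeline for the copy step above could look like the sketch below. The index names, the `updated_at` field, and the cut-off date are assumptions for illustration, not GitLab's actual schema:

```
input {
  elasticsearch {
    hosts   => ["localhost:9200"]
    index   => "gitlab-v1"        # assumed name of the old index
    scroll  => "5m"               # uses the Scroll API under the hood
    size    => 1000               # documents per scroll page
    docinfo => true               # expose _index/_type/_id in @metadata
    # Assumed date condition: only copy documents older than the cut-off,
    # since new documents keep arriving in the old index.
    query   => '{"query":{"range":{"updated_at":{"lt":"2015-11-01T00:00:00"}}}}'
  }
}
output {
  elasticsearch {
    hosts       => ["localhost:9200"]
    index       => "gitlab-v2"                # assumed name of the new index
    document_id => "%{[@metadata][_id]}"      # keep the same document ids
  }
}
```

The `elasticsearch` output performs bulk indexing by default, so the scroll + bulk combination mentioned above comes for free.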
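The alias switch can be done through the `_aliases` API. A small sketch of building the request body (the names `gitlab` and `gitlab-v2` are placeholders, not the real index names):

```python
import json

def alias_actions(new_index, alias, old_index=None):
    """Build the body for POST /_aliases. If old_index is given, the
    remove and add happen in one atomic request, so there is no window
    where the alias resolves to nothing."""
    actions = []
    if old_index is not None:
        actions.append({"remove": {"index": old_index, "alias": alias}})
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}

# Point the alias "gitlab" at the new index "gitlab-v2":
print(json.dumps(alias_actions("gitlab-v2", "gitlab")))
```

Note that an alias cannot have the same name as an existing index, so the old index has to be deleted (or created under a versioned name from the start) before the alias is added.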
One more alternative is to temporarily disable Elasticsearch, prepare the new index with either the GitLab rake tasks or Logstash, and enable it again.
cc @dzaporozhets @jacobvosmaer @jnijhof @DouweM
@sytses Not sure if I should mention you in such issues....