Proposal for configurable search engines
Hi,
I'm starting to look at how I can integrate my indexing technology with GitLab to improve the browsing experience for Enterprise users. One area where I think my technology can help is search. For example, if you search GitLab for the keyword "lab", you'll get the following:
[screenshot: GitLab search results for "lab"]
which shows close to 4000 code matches. If you run the same search using my search engine, you'll get:
[screenshot: results for "lab" from my search engine]
If I have read things correctly, GitLab runs "git grep" when searching code, which would explain the high number of matches: without a boundary option, the keyword also matches as part of longer words. If this is correct, running grep with the word-boundary option "-w" should give the same number of matches as my search engine.
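To illustrate the difference, here is a minimal Python sketch that counts matching lines with and without a word boundary. It only approximates what "-w" does, using a regular expression; the function name count_matches and the sample lines are made up for illustration and are not taken from GitLab or from my search engine.

```python
import re


def count_matches(lines, keyword, whole_word=False):
    """Count lines containing keyword; whole_word approximates grep's -w option."""
    pattern = re.escape(keyword)
    if whole_word:
        # \b only matches at a boundary between word and non-word characters.
        pattern = r"\b" + pattern + r"\b"
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line))


# Made-up sample lines, for illustration only.
sample = [
    "gitlab_url = config.url",
    "label = issue.labels.first",
    "lab = Lab.new",
    "# collaborate with the lab team",
]

print(count_matches(sample, "lab"))                   # substring search
print(count_matches(sample, "lab", whole_word=True))  # whole-word search
```

The substring search counts all four lines (it also hits "gitlab_url", "label" and "collaborate"), while the whole-word search counts only the two lines where "lab" stands on its own.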
However, as the attached image shows, my search engine can also cross-reference each matching file to the commit that last touched it. This is useful because it lets you filter the results further. For example, if you know for a fact that things only started to go wrong after a certain point in time, you can apply a date filter to reduce the number of matches. Basically, any metadata associated with the commit (author, committer, bug id, etc.) can be used to filter results, which you can't do with "git grep" alone.
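As a rough sketch of what that filtering could look like, assume each match carries the author and date of the commit that last touched the file. The CodeMatch type, the filter_matches function and the sample data below are all hypothetical; they do not describe how my search engine stores things internally.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CodeMatch:
    """One code match plus metadata of the commit that last touched the file."""
    path: str
    line_number: int
    author: str
    commit_date: date


def filter_matches(matches: Iterable[CodeMatch],
                   since: Optional[date] = None,
                   author: Optional[str] = None) -> List[CodeMatch]:
    """Keep only matches whose last commit satisfies every given filter."""
    kept = []
    for match in matches:
        if since is not None and match.commit_date < since:
            continue  # last touched before the cut-off date
        if author is not None and match.author != author:
            continue  # last touched by somebody else
        kept.append(match)
    return kept


# Made-up sample data, for illustration only.
matches = [
    CodeMatch("app/models/lab.rb", 10, "alice", date(2015, 1, 5)),
    CodeMatch("lib/search.rb", 42, "bob", date(2015, 3, 20)),
    CodeMatch("spec/lab_spec.rb", 7, "alice", date(2015, 4, 2)),
]

recent = filter_matches(matches, since=date(2015, 3, 1))
recent_by_alice = filter_matches(matches, since=date(2015, 3, 1), author="alice")
```

Other commit metadata, such as committer or bug id, could be added as extra fields and filters in the same way.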
My search engine also works at the branch level, which is handy for verifying bug fixes across release branches at the source code level. For example, if a bug fix involved changing the value of a variable, you can run a code search across all the release branches you care about to verify that the value has indeed been updated. Searching at the branch level would also help day-to-day verification testers, who need to extract certain setting values from a file without having to crawl through the trees of different branches.
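For comparison, a search across several branches can be approximated with plain "git grep", which accepts branch names after the pattern and prefixes each hit with the branch it was found in. The sketch below is not my search engine; the names BranchMatch, parse_grep_line and search_branches are hypothetical, and the parser assumes that file paths contain no colon.

```python
import subprocess
from typing import List, NamedTuple


class BranchMatch(NamedTuple):
    branch: str
    path: str
    line_number: int
    text: str


def parse_grep_line(line: str) -> BranchMatch:
    """Parse one 'branch:path:line:text' line as printed by git grep -n."""
    # Split at most three times so colons inside the matched text survive.
    branch, path, line_number, text = line.split(":", 3)
    return BranchMatch(branch, path, int(line_number), text)


def search_branches(repo_dir: str, pattern: str,
                    branches: List[str]) -> List[BranchMatch]:
    """Run a whole-word git grep for pattern in each of the given branches."""
    # -n prints line numbers, -w matches whole words, -I skips binary files.
    cmd = ["git", "-C", repo_dir, "grep", "-n", "-w", "-I",
           "-e", pattern, *branches]
    result = subprocess.run(cmd, capture_output=True, text=True)
    # git grep exits with 1 when nothing matched; treat that as an empty result.
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.strip())
    return [parse_grep_line(line)
            for line in result.stdout.splitlines() if line]
```

A tester could then call something like search_branches(".", "TIMEOUT", ["release-1.0", "release-1.1"]) (the branch names are placeholders) and compare the reported values side by side, instead of checking out each branch in turn.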
Now for the reason for this issue. I'm not familiar with Ruby, and I wouldn't even know where to start looking to change the search engine. Having worked in Enterprise for some time, I also know that an internal tools group will more than likely have its own homebrew search engine. I think it would be nice if GitLab could support custom search engines, to reduce the need for context switching (leaving GitLab to search).
From a design point of view, I think the biggest thing is agreeing on a messaging/communication protocol. For example, GitLab could define a rule saying that a custom search engine must be able to handle a given set of JSON requests or expose a given set of REST endpoints.
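To make the idea concrete, here is a minimal sketch of what a JSON request/response exchange might look like from the search engine's side. Nothing here is an existing GitLab API: the field names ("query", "branches", "total", "results"), the handle_search_request function and the demo_engine stand-in are all assumptions made up for discussion.

```python
import json
from typing import Callable, Dict, List

# A search engine is modelled as any callable taking (query, branches).
SearchEngine = Callable[[str, List[str]], List[Dict[str, object]]]


def handle_search_request(raw_request: str, engine: SearchEngine) -> str:
    """Decode a JSON search request, run the engine, encode a JSON response."""
    request = json.loads(raw_request)
    query = request["query"]               # required field
    branches = request.get("branches", [])  # optional; empty means all branches
    results = engine(query, branches)
    return json.dumps({"query": query, "total": len(results), "results": results})


def demo_engine(query: str, branches: List[str]) -> List[Dict[str, object]]:
    """Stand-in engine searching a tiny hard-coded index (made-up data)."""
    index = [
        {"branch": "master", "path": "config/settings.yml",
         "line": 3, "text": "lab: true"},
        {"branch": "release-1.0", "path": "config/settings.yml",
         "line": 3, "text": "lab: false"},
    ]
    return [hit for hit in index
            if query in hit["text"]
            and (not branches or hit["branch"] in branches)]


raw = json.dumps({"query": "lab", "branches": ["release-1.0"]})
response = json.loads(handle_search_request(raw, demo_engine))
print(response["total"], response["results"][0]["text"])
```

If GitLab fixed the request and response shapes along these lines, any engine that implements the handler side could be plugged in, whatever language it is written in.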
I really don't expect this to be implemented anytime soon, but I figure it would be good to start talking about it, as it would make third-party integration easier, which in turn can help improve the GitLab browsing experience.