For now, I just tried to open two tabs for the files I want to review on GitLab.com, both for CE and EE master. This is useful when I am reviewing a particular file, but not so much when I want an overview.
#2902 (closed) would certainly make this easier, but before we reach there completely, is there an easy way to do this for now?
@godfat This is an interesting issue, and it's made me think about doing a blog post with GitLab EE and CE as a case study. From time to time, I look at how GitLab CE and EE are evolving with GitSense, and I'm curious to know what your current pain points are, since this is a rather trivial task with GitSense.
For example, if I want to see what unique changes were made to both EE and CE within the last 30 days, I'll just do a unique search, which returns something like this:
Here I can see EE contains 722 unique commits and CE contains 64. And if I switch to the "File History" tab, I can see what areas were the most active for both editions, within the last 30 days.
And to see how they differ, I'll simulate merging from EE to CE and CE to EE like so:
Merge EE to CE
Merge CE to EE
And since you sign off on diffs at the file level, GitSense always knows what revisions you last signed off on, like so.
Which should make reviewing 1000s of diffs over time quite acceptable.
You can also do merge conflict analysis with GitSense, but this is a bit more involved to explain, so I'll leave this explanation for a blog post.
If you have the time, I wouldn't mind getting a better understanding of your current pain points, as this should be a trivial task with GitSense, provided I understand your problem correctly.
@terrchen My pain point is that when I am trying to resolve the conflict, I can't really tell how to resolve it at times. Yes I have <<<<< and >>>>> for the conflicting parts, but the rest code was already merged which was neither EE nor CE. In order to check the intention, I'll need to check the full CE/EE code, then I could decide how to resolve it.
For a single file it's probably not too troublesome: I just need to open two tabs, one for CE and another for EE, and open the same file. But what if I need to read through the entire code base in order to understand the context better? I can't, unless I have 3 copies locally.
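One way to avoid keeping three checkouts is to read a file straight out of Git's object store with `git show <ref>:<path>`. The following is only a minimal sketch, assuming a single clone that has both CE and EE fetched; the ref names `ce/master` and `ee/master` and the helper names are hypothetical. A ref only exposes committed content, so the half-merged file itself would still have to be read from the working tree.

```python
import subprocess

# Hypothetical ref names; substitute whatever your remotes and branches are called.
REFS = ["ce/master", "ee/master"]


def git_show_args(ref, path):
    """Build the `git show <ref>:<path>` command that prints a file as of a ref."""
    return ["git", "show", f"{ref}:{path}"]


def read_at(ref, path, repo="."):
    """Return the file's content at the given ref, without checking it out."""
    result = subprocess.run(
        git_show_args(ref, path),
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def versions(path, refs=REFS, repo="."):
    """Map each ref to the file's content at that ref."""
    return {ref: read_at(ref, path, repo) for ref in refs}
```

Calling `versions("app/assets/javascripts/dispatcher.js")` inside such a clone would return both editions of the file side by side, ready to compare against the working-tree copy.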
but the rest code was already merged which was neither EE nor CE.
I assume you mean code that is not EE- or CE-specific, and not code from another repository?
This is how I do merge conflict analysis with GitSense; let me know if this makes sense. I'll use the app/assets/javascripts/dispatcher.js file as an example, since it has a merge conflict.
The first thing that I do is search for the file across both the EE and CE master branches and then switch to the "File history" tab like so:
With the file history tree, I can view every commit and change that has modified the dispatcher.js file in both EE and CE.
If you click on dispatcher.js, you can see its history
And if you click on the commit title, you can see what was changed by that commit
So with the history tree, I have access to enough information to know how the latest versions of dispatcher.js came about in both EE and CE.
The basic idea is, if there is a merge conflict, I'll just do a GitSense search for the conflicting file or files and iterate through their history.
Does this make sense or am I misunderstanding the situation?
@terrchen Sorry that I wasn't explaining clearly. So here's what I am doing.
We merge from CE to EE daily. When we do this, we create a new branch on top of current EE master, and merge from CE, and we end up with conflicts. By this:
but the rest code was already merged which was neither EE nor CE.
I meant the current branch, the one in the middle of the merge. It's neither EE master nor CE master; it's in a state of merging. We don't yet know whether the code was merged properly. Most of the time, we just need to resolve the conflicts by picking either side, but this might not be enough, because Git might merge the code wrongly, causing issues.
In order to understand whether the current code is correct, we need to look at EE master and CE master at the same time, understand the original intention, and make changes accordingly. Those changes might not always be around the conflicts.
For example, suppose we rename a method in CE, and we didn't do so in EE, while EE has other call sites using this method. We could merge this cleanly without conflicts, but we still need to rename the EE call sites, which hasn't happened in EE master. This is something Git can't help us with, and we need to look at both CE master and EE master to understand why.
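The rename case can be checked mechanically, at least roughly. The following is only a sketch, assuming Ruby-style `def name` definitions and plain-text matching rather than real parsing; the function names, file path and sample sources are all made up for illustration.

```python
import re

# Matches Ruby-style method definitions such as `def foo`, `def self.foo`, `def foo?`.
DEF_RE = re.compile(r"^\s*def\s+(?:self\.)?([A-Za-z_]\w*[?!]?)", re.MULTILINE)


def defined_methods(source):
    """Return the set of method names defined in a source string."""
    return set(DEF_RE.findall(source))


def removed_methods(ce_before, ce_after):
    """Methods defined in the old CE file but not in the new one (renamed or deleted)."""
    return defined_methods(ce_before) - defined_methods(ce_after)


def stale_call_sites(removed, merged_files):
    """Find lines in the merged tree that still mention a removed method.

    `merged_files` maps a path to that file's content after the merge.
    Returns {method_name: [(path, line_number), ...]}.
    """
    hits = {}
    for name in sorted(removed):
        # Whole-identifier match, so `find_user` does not match `find_user?`.
        pattern = re.compile(r"(?<!\w)" + re.escape(name) + r"(?![\w?!])")
        for path, source in merged_files.items():
            for line_no, line in enumerate(source.splitlines(), start=1):
                if pattern.search(line):
                    hits.setdefault(name, []).append((path, line_no))
    return hits


# Made-up example: CE renamed find_user to lookup_user, EE still calls the old name.
ce_before = "def find_user(id)\n  User.find(id)\nend\n"
ce_after = "def lookup_user(id)\n  User.find(id)\nend\n"
merged = {"ee/lib/audit.rb": "def audit(id)\n  user = find_user(id)\n  log(user)\nend\n"}

removed = removed_methods(ce_before, ce_after)
print(stale_call_sites(removed, merged))
# prints {'find_user': [('ee/lib/audit.rb', 2)]}
```

Text matching like this will produce false positives (comments, unrelated methods with the same name) and cannot tell a rename from a deletion, so it is a prompt for where to look rather than a verdict.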
According to your screenshot, yes, it's nice to read the history easily, but I care more about the current status, and only if I can't figure it out would I need to look at the detailed history.
Does GitSense have an easy way to tell me that a method in CE was renamed, and we need to do that for EE as well? We should rename it before we could merge to EE master, to make sure that EE master is working correctly.
Yes, the approach you guys are using makes sense, and it's how I see things being done in most enterprise environments that I've worked in.
Right now, GitSense is optimized for drilling into Git. What you are looking for is a solution that can generate semantic diffs.
Semantic diffs are on the GitSense roadmap, and everything that has been done so far is to help support this. This is why GitSense doesn't use Elasticsearch and why I ended up writing my own indexing engine, which is designed specifically to index Git's history.
I was actually in San Francisco last month to talk to some companies about semantic diffs. And I can tell you, this is an area of great interest, since semantic diffs are a requirement for intelligent code reviews.
In the near future (~2 years), I can see us being able to pick any two points in time and execute a diff that returns a human-readable CHANGELOG. So in your case, the diff would be able to produce a CHANGELOG that can tell you function FOO was changed to BAR and so on. However, this is at least 2 years away, in my opinion.
So until then, GitSense can make things easier, but not as easy as you would like.
I forgot to mention that GitSense supports "merged by" analytics, which can help you grok what has changed. So in your case, if you execute
mergedby:<the ce to ee merge commit>
on the branch that was used to merge the ce changes, you can quickly see what commits were merged from ce, what files were changed and who authored the commits. And if you execute
mergedby:<the ce to ee merge commit> likecm:<the ce to ee merge commit>
you'll only see the commits that contain at least one file that had to be merged. And if you execute:
mergedby:<the ce to ee merge commit> path:<conflict file> path:<conflict file> ...
you'll only see the commits with one or more merge conflict files. This search is particularly useful, since you can use it to determine who to ask to double-check the merges.
And if you only care about merge conflicts to JavaScript files, you can add lang:javascript to the previous search.
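For readers without GitSense, plain Git can approximate the path-filtered search. This is only a sketch and not how `mergedby:` is implemented: it assumes the merge commit's first parent is the EE side and its second parent is the CE side, and the helper names are hypothetical.

```python
import subprocess


def merged_commits_args(merge_commit, paths=()):
    """Build a `git log` command listing the commits a merge brought in.

    `M^1..M^2` selects commits reachable from the second parent (the
    merged-in side) but not from the first parent (the branch merged into).
    Passing paths limits the output to commits that touch those paths.
    """
    args = [
        "git",
        "log",
        "--format=%h%x09%an%x09%s",  # short hash, author name, subject, tab-separated
        f"{merge_commit}^1..{merge_commit}^2",
    ]
    if paths:
        args += ["--", *paths]
    return args


def merged_commits(merge_commit, paths=(), repo="."):
    """Run the command and return a list of (hash, author, subject) tuples."""
    out = subprocess.run(
        merged_commits_args(merge_commit, paths),
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [tuple(line.split("\t", 2)) for line in out.splitlines() if line]
```

Passing the conflicted files as `paths` gives the authors to ask about each conflict, and a pathspec such as `*.js` is a rough stand-in for the language filter.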