We now have two codebases: CE and EE.
Every release, the release managers (RMs) go through CE-to-EE merge hell.
Every developer, while developing something in CE, has to create additional MRs to sync or adopt changes into EE.
We don't obfuscate the EE codebase.
So from a technical perspective, nothing stops malicious users from grepping the codebase for license checks and modifying the code to enable all the EE features.
Thus, can we consider merging the two codebases into one (let it be EE, with features enabled or disabled depending on the purchased license)?
There are a few open source products with dual licenses that let users use them for free as long as it's not for commercial use:
job
[Jun 28th at 13:40]
in #thanks
Thanks @axil and @marcia for making the EE docs the default, reducing confusion on our docs and teaching people about all the great stuff we have in EE! https://docs.gitlab.com/
remy
[19 days ago]
@job Soon the single codebase!?
job
[19 days ago]
@Remy nope, we can't do that because we want everyone to be able to contribute documentation at the same time as the rest of the code.
job
[19 days ago]
If we don't do that, we'll start seeing that documentation is treated as second-order and that is not acceptable.
remy
[19 days ago]
Hmm, I think I was not clear, I meant to have a single codebase for both CE and EE.
job
[19 days ago]
ah, that's also not possible @Remy
job
[19 days ago]
Because we can't make all of EE open source
job
[19 days ago]
And we can't include proprietary code in an OSS project
job
[19 days ago]
it'd be easier, but it's not possible. So what we're doing now is making clear in the docs which version of GitLab a particular feature is in.
job
[19 days ago]
And we encourage downloading EE and will allow you to run EE in 'CE mode' before you buy a license
job
[19 days ago]
Good questions though
remy
[19 days ago]
What about dual-licensing? I think there are some OSS projects that do that?
job
[19 days ago]
Hm I'm not sure if that would work for us..
remy
[19 days ago]
The fact that we'll allow running EE in "CE mode" sparks the question of a single codebase, since it'd be so much better from a developer/technical point of view (no CE->EE merges anymore)…
remy
[19 days ago]
I will open an issue since this is a complex licensing question.
job
[19 days ago]
I think dual licensing only works if the consumption has some restriction for particular customers, e.g. if we'd only allow non-open-source use through a license. I don't think that's a sustainable situation for us.
remy
[19 days ago]
Hmm I see
axil
[19 days ago]
I don't think open sourcing the docs in EE should be a problem
axil
[19 days ago]
@Remy I'm in favor of a single docs codebase as well. I'm not sold on the idea that docs will be mistreated if we have one codebase.
@blackst0ne yep, it would be much easier for development to have one codebase. We have raised this question many times already. I don't know how dual licensing works, but the main problem was and is the ability to get clean CE code without risk of storing/modifying license protected code.
ability to get clean CE code without risk of storing/modifying license protected code.
Why do we need that? Couldn't we say that the code that is behind a license flag (i.e. License.feature_available?(:feature_name)) is under a proprietary license? I'm not a license expert, but we can probably be creative with how we license a codebase (at least I wish!)?
@rymai Some people refuse to run any proprietary code on their system, even if those code paths are not enabled, so for these people EE in "CE mode" will not cut it—they will want a fully MIT code base, i.e. CE. So even if we have 1 code base, we'd need a way to remove all proprietary code from it and have that be a fully working CE.
We could say that everything inside ee/ is proprietary, and everything outside is MIT, but we also have a lot of stuff outside ee/ that is proprietary and only gated behind License.feature_available?(:feature_name) checks, which is far harder to remove with some preprocessor.
We could say that everything inside ee/ is proprietary, and everything outside is MIT
That would be awesome!
but we also have a lot of stuff outside ee/ that is proprietary and only gated behind License.feature_available?(:feature_name) checks, which is far harder to remove with some preprocessor.
Yeah, it won't be tomorrow, but we're slowly going toward that (mostly to avoid conflicts, not for licensing purpose, but if we can kill two birds with one stone, that's great)...
Yeah, it won't be tomorrow, but we're slowly going toward that (mostly to avoid conflicts, not for licensing purpose, but if we can kill two birds with one stone, that's great)...
@rymai I agree. It would be interesting to see a complete diff between CE and EE and start moving old stuff into prepended modules and view partials.
Could you take a look at this problem, please?
We can technically merge the two codebases into one, but can we limit our users through our license like the products I mentioned in the description above do?
A single codebase would make developers' lives a lot easier.
If we can't do that right now, what should we do to achieve that?
@blackst0ne if the entire codebase is not open source then we would not be able to license the proprietary code under MIT for non-commercial purposes, we would then lose our proprietary rights to the code that is not open source. Dual licensing works for a codebase that is the same, but can be used for different purposes (personal use vs. commercial use). This idea would not work for codebases that not only have different uses, but have different functionality, and that additional functionality is not open.
@blackst0ne if it was the same product (codebase) that was available to be licensed just for different uses (personal vs. commercial), then yes we could issue a license that states if you use for personal it is licensed under one license, but if used for commercial, then it is licensed under another.
@dzaporozhets @JobV @sytses so the question is, can we (do we want to) make a single product with all the features enabled, but with different licenses from the business perspective?
@jhurewitz Sorry, I want to clarify one more thing.
if it was the same product (codebase) that was available to be licensed just for different uses (personal vs. commercial)
What if for personal use, without a commercial license, some features would be disabled? Does this also work? I think we don't want to enable everything for personal use. (and from the product design perspective, they won't be useful for personal use, too)
And if this does work, then I think we could merge CE and EE, and just use the license to enable/disable features.
If there's any way we can have a single codebase I think we need to pursue it. From a development standpoint, separate codebases waste hours every week just for the daily merges. It takes time from every developer opening a merge request to see if they need an EE counterpart. If the CE MR requires changes after a review they have to carefully mirror them to the EE version.
It's a huge source of stress during the release process when manually resolving often-complex merge conflicts. It wastes time again when those conflict resolutions create new test failures that need to be resolved and re-run. It's a huge source of regressions because conflicts are resolved by humans, humans make mistakes, and not everything is covered by tests. This wastes time again when a developer has to spend time finding out why something's broken only in the stable EE branch.
We've spent a lot of time in the last year trying to reduce all of these things with seemingly little impact, and it will never be fully resolved simply because the codebases are large and there are a lot of differences. These problems will always exist with separate codebases.
@godfat From a technical and operational point of view, how do you see this working? We have one codebase that is openly accessible. When it is retrieved by the end user for use, the EE functions would, by default, be disabled? What is required to enable them for legitimate subscription EE use? What would prevent non-subscription holders from using those EE functions?
@jhurewitz Technically we have a license file, which only we can generate, and the license file contains information about the features they subscribed to, the number of seats, and so on. The software checks the license file, enabling the features they have access to and disabling the features they don't. This is already the case, and we're also moving toward making the EE codebase work as if it were CE when there's no license. So basically it's all controlled by the license file.
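To make the mechanism concrete, here is a minimal sketch of license-file gating, not our real License model. The class name SketchLicense, the JSON layout and the feature names are all made up for illustration, and signature verification (the part that makes sure only we can generate a license) is left out entirely.

```ruby
require "json"
require "set"

# Hypothetical stand-in for a parsed license file (not GitLab's License class).
class SketchLicense
  # Features that are always available, with or without a license
  # (the "CE feature set"). The names are illustrative only.
  CORE_FEATURES = Set[:issues, :merge_requests].freeze

  attr_reader :seats

  # json is the (assumed) license file content, or nil when no license
  # has been uploaded, in which case only the core features are enabled.
  def initialize(json = nil)
    data = json ? JSON.parse(json) : {}
    @features = Set.new(data.fetch("features", []).map(&:to_sym))
    @seats = data.fetch("seats", 0)
  end

  def feature_available?(name)
    CORE_FEATURES.include?(name) || @features.include?(name)
  end
end

no_license = SketchLicense.new
licensed = SketchLicense.new('{"features": ["file_locks"], "seats": 25}')
```

With no license, only the core features answer true, which is the "EE running as CE" behaviour described above.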
However, I don't know if we could license it this way. I am not aware of any existing license like this. It might also be tricky for contributors: people might not want to contribute to a dual-licensed codebase, and it might not be clear which part should be open source and which part should be proprietary.
Subsonic is an example of a FOSS codebase that includes a license check. Because the codebase is FOSS, several forks of subsonic exist that remove the license check. These are all, of course, completely legal (otherwise it wouldn't be FOSS).
(Just to clarify: I'm strongly in favour of a single codebase. Relicensing all EE code to MIT carries predictable IP / company property risks which I'd be happy to take, personally)
@nick.thomas Oh, good point. That's probably the reason why not many companies are using this. I guess this is also the case with CentOS, which, as far as I know, is just Red Hat.
Right. It comes down to business models, really - if most of your value is in the support contract for the product, copy protection is less of a strategic concern than if it's in technical innovation, etc.
I recall discussing the subsonic approach in a meeting a month or two ago. It was not keenly taken up ^^
@rymai @stanhu @DouweM I talked with @sytses and we agreed that we should end up with one code base. In the end, we can just place license text in every file to make a distinct separation. Since we are going to promote EE by default (with CE functionality if there is no license) it makes total sense to have just one repository.
@dzaporozhets great news, but how does that work for files that differ between CE and EE?
In most classes where we have additional EE functionality, we include a line like prepend ::EE::.... If we remove the EE-licensed file that defines that module, then the application breaks.
# This file is MIT licensed
class Foo
  prepend ::EE::Foo

  def foo
  end
end
ee/foo.rb (same in both cases):
# This file is EE licensed
module EE
  module Foo
    # ...
  end
end
If we delete ee/foo.rb, per step 4 ("Add a rake task that will remove EE-licensed files for those who can't keep proprietary code on their servers"), the application breaks - it can't find the constant EE::Foo.
Also, even if we go down to a single repository, I think we still need to have two source and binary releases - one containing the full codebase, and one containing just the MIT-licensed parts. Expecting people to install from source or maintain their own builds if they want something that is only MIT-licensed would be a huge retrograde step.
I overall think the idea is great. Couple questions:
How can we absolutely ensure devs don't write EE code in a CE class? We already battle this in the EE codebase. The current issue is only that merges are hard. In a combined codebase we would now also violate the license if we put something in the wrong class. Maybe we can have CI check for any code wrapped in a feature/license check?
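A rough sketch of what such a CI check could look like, assuming the rule is "no License.feature_available? calls outside ee/". The method name license_checks_outside_ee and the regex are my own invention; a plain text match like this would certainly have false positives and negatives, so treat it as a starting point only.

```ruby
# Pattern for the license check mentioned in this thread.
LICENSE_CHECK = /License\.feature_available\?/

# Returns the Ruby files under root (as relative paths) that contain a
# license check but do not live inside the ee/ directory.
def license_checks_outside_ee(root)
  Dir.glob("**/*.rb", base: root).sort.select do |relative_path|
    next false if relative_path.start_with?("ee/")

    File.read(File.join(root, relative_path)).match?(LICENSE_CHECK)
  end
end
```

A CI job could call license_checks_outside_ee(".") and fail when the list is non-empty, or compare it against a whitelist that shrinks over time.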
I like the idea of the command to strip an install of proprietary files if needed. We may still draw criticism from open-source purists. Would it be beneficial for us to provide a purely open-source Omnibus package that has EE files pre-stripped? We don't need to advertise it specifically, but provide it if requested.
@nick.thomas's comment is definitely an issue. Is there a way to include/prepend only IF a class is defined? We would need some good CI around this to prevent breakage when EE classes are removed.
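In plain Ruby a guarded prepend is possible with defined?, as in the sketch below. The classes and the label method are made up for illustration, and I haven't checked how this interacts with Rails autoloading, where the EE constant might simply not be loaded yet when the guard runs.

```ruby
# Only EE::Bar exists here; EE::Foo plays the role of a removed EE file.
module EE
  module Bar
    def label
      "EE " + super
    end
  end
end

class Foo
  # defined? returns nil for a missing constant instead of raising NameError,
  # so this line is a no-op when the EE module has been stripped.
  prepend ::EE::Foo if defined?(::EE::Foo)

  def label
    "Foo"
  end
end

class Bar
  prepend ::EE::Bar if defined?(::EE::Bar)

  def label
    "Bar"
  end
end
```

Foo keeps its CE behaviour, while Bar picks up the EE override.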
Note that these prepend lines aren't the only type of intra-file differences between CE and EE (although they do throw up the thorniest problem) - there are lots of cases where there is substantial divergence between CE and EE within a file.
Most of them can be fixed over time, but we'll need to be careful not to accidentally relicense EE code as MIT during the transition.
Contributions should continue to be OK as the license header at the top of each file will allow people to work out whether the code they're giving us will be MIT or EE licensed, and moving to a single repository shouldn't cause any more of the total codebase to become EE licensed than previously.
This means we still have to maintain two repositories. But we won't have the ugly merge conflicts we have now. And people that don't want to download the proprietary files have an option to do so without having to build it themselves.
At some point in the future we might decide that the overhead of two releases is not worth it. But we can make that a separate decision.
What I meant with my question is: will people interpret this as "we're moving away from the open source solution"? Will people who want to contribute to the open-source project find it problematic to see and touch a codebase with both licenses? And last question: will it make people fork to keep a "100% open source" tree intact (like the Debian project, etc.)?
I think there are some risks in that regard, otherwise physically separating the files is always a win.
Great thoughts @brodock. Does it make sense to draft our plan (already doing it here) and then solicit community feedback as a check to ensure we're not alienating?
What if some CE and EE features are mixed in the same files? Like clicking the lock button might be mixed in a JS file that has other non EE features. How will this work?
Frontend actually gets a boon in that the JS is already MIT-licensed, so moving to a single codebase simplifies a lot without adding any extra accounting :)
Files with diffs between CE and EE are really the hard part. To try to get an order-of-magnitude idea of the scale of the problem, I've done some basic diffing of CE at 598b1a17 with EE: diff -qr gitlab-ce gitlab-ee > gitlab.diff
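For anyone who wants to slice that output, here is a small sketch that groups the diff -qr lines by top-level directory. It assumes the usual "Files A and B differ" and "Only in DIR: NAME" line shapes and paths without spaces; the method name summarize_diff and the counter keys are made up.

```ruby
# Line shapes assumed for `diff -qr gitlab-ce gitlab-ee` output.
DIFFER  = %r{\AFiles ([^/]+)/(\S+) and \S+ differ\z}
ONLY_IN = %r{\AOnly in ([^/:]+)(?:/([^:]*))?: (.+)\z}

# Counts differing and one-sided entries per top-level directory.
def summarize_diff(lines, ee_root: "gitlab-ee")
  counts = Hash.new { |hash, dir| hash[dir] = { differ: 0, ce_only: 0, ee_only: 0 } }

  lines.each do |raw|
    line = raw.chomp
    if (match = DIFFER.match(line))
      counts[match[2].split("/").first][:differ] += 1
    elsif (match = ONLY_IN.match(line))
      # For entries at the repository root, the entry name itself is the key.
      top = match[2] ? match[2].split("/").first : match[3]
      side = match[1] == ee_root ? :ee_only : :ce_only
      counts[top][side] += 1
    end
  end

  counts
end
```

It could be fed with summarize_diff(File.readlines("gitlab.diff")) to see which directories carry most of the divergence.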
Assuming we overcome the technical obstacles that stand in the way of allowing all these files to take their CE forms, we're still looking at some fairly concentrated effort to implement it. Not insurmountable, but not a one-day thing either.
Regarding the prepend (::)?EE ... lines (of which I count 80 in gitlab-ee), I offered one possibility while we were considering the introduction of the ee/ top-level directory: https://gitlab.com/gitlab-org/gitlab-ee/issues/2902
by following this rule, we could eventually do away with the manual prepend ::EE::... lines we currently have to add to CE files. Instead, we could traverse the ee/{app,lib} directory tree, require each file and its CE equivalent, then run ce_class.prepend(ee_module) at startup, or cleverly hooked into the autoloading mechanism somehow.
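A minimal sketch of the idea, under simplifying assumptions: it walks constants that are already loaded rather than the ee/{app,lib} directory tree, it only handles top-level classes, and the helper name prepend_ee_modules is hypothetical.

```ruby
# CE class, with no reference to EE anywhere in the file.
class Foo
  def greet
    "ce"
  end
end

# EE module that mirrors the CE class by name.
module EE
  module Foo
    def greet
      "ee+" + super
    end
  end
end

# Prepends every module in ee_namespace to the class of the same name in
# ce_namespace. Returns the list of classes that were extended.
def prepend_ee_modules(ee_namespace, ce_namespace = Object)
  ee_namespace.constants(false).filter_map do |name|
    ee_module = ee_namespace.const_get(name, false)
    next unless ee_module.instance_of?(Module)
    next unless ce_namespace.const_defined?(name, false)

    ce_class = ce_namespace.const_get(name, false)
    next unless ce_class.is_a?(Class)

    ce_class.prepend(ee_module)
    ce_class
  end
end

patched = prepend_ee_modules(EE)
```

If the EE namespace is absent or empty, nothing is prepended and the CE files never mention an EE constant, which is what makes file removal safe.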
@DouweM preferred the explicit prepend so we didn't go ahead, but it's just not compatible with the idea of removing files from the codebase so, perhaps we can revisit it. That will fix up to 9.6% of the files at a stroke (depending on how many of them also contain other modifications), while also resolving the tricky technical issue of removing files containing constants referenced in files not being removed.
@nick.thomas The other possible approach is that we could provide some no-op empty modules whenever we're removing the EE files. So instead of just removing them, we actually replace them with MIT-licensed empty EE files... Actually, maybe just one giant file with all the empty modules would be good enough.
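Generating those empty modules could be as simple as the sketch below. The helper name empty_module_stub is made up, and where the list of module names comes from (for example, the files being removed) is left open.

```ruby
# Builds the source of an empty, nested module definition for a constant
# path such as "EE::Foo". The result is meant to be written to a stub file.
def empty_module_stub(const_name)
  parts = const_name.split("::")
  opening = parts.each_with_index.map { |part, depth| "#{'  ' * depth}module #{part}" }
  closing = parts.each_index.map { |depth| "#{'  ' * depth}end" }.reverse
  (opening + closing).join("\n") + "\n"
end

puts empty_module_stub("EE::Foo")
```

Concatenating the output for every removed module would give the "one giant file" variant, and the existing prepend ::EE::Foo lines would keep working as no-ops.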
Not that I am in favour of this approach, but if we really hate the magic, this could work.
As a free software freak myself, if we pursue this, I'd really like there to be a package that contains only the free components, and I'm very concerned about how the community would react if there wasn't.
This means we still have to maintain two repositories.
How would that work? If we made EE the default, would that mean we would also move all CE issues in the EE tracker? And this is just needed for the first step, meaning we aim to also have a single project in the end, right?
Does it make sense to draft our plan (already doing it here) and then solicit community feedback as a check to ensure we're not alienating?
We should also run all the tests twice: before and after the physical removal of the EE features.
We can also write something like prepend_ee/include_ee to just check the licence and eventually do a no-op instead of prepending or including the files.
In any case, we still have to define all those modules otherwise autoloader will get mad looking for them.
@rymai removing files is easy, modifying them is much harder. I'd put that at the bottom of the list of options!
HAML views also deserve a special mention - we've been working on moving EE code into their own partials, but that means we have a bunch of diffs between CE and EE in files that boil down to = render partial: "..."
= ee_render partial: ... - which would be a silent no-op if the file doesn't exist - could be an option.
@nick.thomas I think we can have ee_render, and ee_prepend that takes a string rather than an actual constant. We can then relatively easily swap these out for no-ops if we build CE.
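Here is a sketch of what a string-based ee_prepend might look like; ee_render would need the view layer, so it is not covered. The EEHooks module and the label method are hypothetical, and constants are resolved by hand without triggering any autoloading.

```ruby
module EEHooks
  # Prepends the module named by const_name (e.g. "EE::Foo") if every part
  # of the path is defined; otherwise does nothing. Returns true or false.
  def ee_prepend(const_name)
    ee_module = const_name.split("::").reduce(Object) do |scope, name|
      # inherit=false so "EE::Foo" never falls back to a top-level Foo.
      return false unless scope.const_defined?(name, false)

      scope.const_get(name, false)
    end

    prepend(ee_module)
    true
  end
end

module EE
  module Bar
    def label
      "EE " + super
    end
  end
end

class Foo
  extend EEHooks
  ee_prepend "EE::Foo" # EE::Foo is missing, so this is a no-op

  def label
    "Foo"
  end
end

class Bar
  extend EEHooks
  ee_prepend "EE::Bar"

  def label
    "Bar"
  end
end
```

Because the argument is a string, a CE file never references an EE constant directly, so deleting the EE file cannot raise a NameError at load time.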
Since reducing the number of files modified between CE and EE is an essential prerequisite of this issue, and benefits us even if we don't go ahead and keep doing the CE->EE merge, should we start putting some focus into it? The diffs in spec/, particularly, worry me as a large time sink.
If we could relicense all our CSS as being MIT in the same way that the JS is, we'd save plenty of time there. I understand from discussion in https://gitlab.com/gitlab-org/gitlab-ee/issues/2902 that these files aren't easy to separate out, and I'd struggle to argue that a significant amount of the value in EE is contained there.
It'd be just adding one more line there. If it is a rake task, it'd be more complicated, but manageable I think. I might be able to automate that (haven't tried anything more complicated than just removing some files), but manually I can do it for sure.
@pravi it's been a while since I did any Debian packaging, but would you prefer a gitlab-ce_x.y.z.tar.gz file that only contains the MIT-licensed code?
@nick.thomas if it is removing just a folder (or even a list of files), then a single tarball would be fine. If it is more complicated, like a rake task, I'd prefer a separate tarball, but I can manage to generate it manually too.
As for updating the package, since 9.x moved to node modules directly for the front end, I'm still working on packaging those modules before I can update gitlab. You can see the list of modules yet to package at https://wiki.debian.org/Javascript/Nodejs/Tasks/gitlab (not just these modules, I have to package build tools as well, webpack, babel, rollup, browserify etc which were neglected in debian for a long time).
Wow, there are a lot of comments here. I am glad it got so much attention. I believe we can overcome all the technical issues with some effort. As for the open source part, I think the following compromise is reasonable:
CE repo is read-only and generated by running rake task that removes EE code
People are recommended to create issues and merge requests to EE repo.
If you don't want to clone EE repo under any circumstances - you can either use web editor or send a patch. There are workarounds. But I believe those will be rare cases.
I think the hardest part is to properly explain our intentions to the community. Basically, we want the community to experience both high-quality GitLab.com (which is EE codebase) and self-hosted version (which can be EE or CE). To make this happen we are going to use a single codebase for development and issue tracking.
Before:
CE repo with merge requests and issues
EE repo with merge requests and issues
CE code being developed without seeing EE code which causes bugs in EE (and GitLab.com as result)
Merge from CE to EE generates conflicts all the time
After:
CE repo read-only
EE repo with merge requests and issues
CE repo updated from EE repo by running rake task to remove proprietary code
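The core of such a rake task could look roughly like this sketch. It assumes all proprietary code lives under a single top-level ee/ directory, which is not true today given the license-gated code outside ee/ discussed earlier; the method name export_open_source_tree is made up.

```ruby
require "fileutils"

# Copies the source tree into destination, leaving out the proprietary
# directory. Returns the destination path.
def export_open_source_tree(source, destination, proprietary_dir: "ee")
  FileUtils.mkdir_p(destination)

  Dir.children(source).each do |entry|
    next if entry == proprietary_dir

    FileUtils.cp_r(File.join(source, entry), destination)
  end

  destination
end
```

A rake task would wrap this call and then commit the result to the read-only repository; running the test suite against the exported tree would be the obvious safety net.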
P.S. As a person who uses GitLab.com for my personal open source projects I want both CE and EE to be good.
@dzaporozhets I think we should stop thinking about the repos/codebases/packages as "the CE repo/codebase/package" and "the EE repo/codebase/package".
Instead, we have "the main GitLab repo/codebase/package", called gitlab, and "the open-source GitLab repo/codebase/package", called something like gitlab-os or gitlab-foss.
CE and EE are names for feature sets, not repos, codebases or packages; it is a separate "dimension". Specific code is also not CE or EE, but open-source or proprietary.
Installing gitlab will get you the CE, EES, or EEP feature set, depending on the provided license file. It consists of both open-source and proprietary code, that is selectively and dynamically activated based on the provided license file.
Installing gitlab-os will always get you the CE feature set, with no way to provide a license file. It consists of only open-source code.
The main GitLab project gitlab also has the issue and MR trackers for all of GitLab, covering the CE, EES, and EEP feature sets, since they are all part of the GitLab product and project.
gitlab-os is not a product or project, it's just a repository. All issues and MRs belong in the main GitLab project.
This will prevent the confusing situation where someone who only uses the CE feature set, who installed the gitlab-ce package or the gitlab-ee package without providing a license, is expected to submit issues and MRs to the "GitLab EE" repository.
It would also prevent people who don't know the ins and outs of the situation from thinking or posting that "GitLab has stopped working on gitlab-ce and is now only working on gitlab-ee".
gitlab-os is not a product or project, it's just a repository.
Not entirely true, because there will be omnibus packages for this FOSS edition. So we should properly consider the name. Maybe open-gitlab or gitlab-free.
Not entirely true, because there will be omnibus packages for this FOSS edition. So we should properly consider the name. Maybe open-gitlab or gitlab-free.
@to1ne Right, there will be a repository, codebase and package. What I tried to communicate is that it is not a product or project in the sense that it is something we specifically market or specifically put development effort into. It is "just" a repository (and package) extracted from the main GitLab product/project.
As for the name, gitlab-free is confusing because "free" will likely be interpreted as "free as in beer", while both the main and the open-source package can be used without payment, giving you the CE feature set in both cases. Per https://gitlab.com/gitlab-org/gitlab-ee/issues/2417, we actually would like users of the CE feature set to install the main package, not the open-source one.
My initial suggestion gitlab-os probably sounds too much like "GitLab Operating System".
I like open-gitlab ("Open GitLab"); it reminds me of openSUSE and OpenOffice. Although people may think it's an unofficial fork. Putting "open" or "FOSS" behind the name may be better; I think gitlab-open or gitlab-foss is more obviously an official GitLab thing.
Talking about names, it may be confusing that "GitLab Community Edition" is no longer actually an edition of GitLab that is built by the community with a separate package, but rather a feature set available in both the main and open-source GitLab packages, with us preferring people install the main package even if they're only interested in the CE feature set. If we're going to be renaming things anyway, maybe we should rename Community Edition to something that more clearly describes the feature set/use case, rather than the community-built aspect, to make it clear that deciding what feature set you need (CE, EES, EEP) is more important to most people than deciding what package you need (main or open-source), and that choosing the CE feature set does not automatically mean choosing the open-source package, even though the "community" name currently does seem to imply that.
@DouweM I would keep the gitlab-ce name for the entirely FOSS repository, and rename the feature set to something else.
This simplifies communication - "GitLab CE is still GitLab CE and 100% FOSS". Anything else could be construed as "GitLab CE is no longer FOSS", scaring existing CE users.
In increasing order of features, we have:
GitLab CE (100% FOSS, no option to add a license file)
GitLab ??? (more features than CE, not 100% FOSS)
GitLab EES/EEP (same code as ???, plus a license file)
Maybe call it GitLab Flex or something ridiculous like that.
@nick.thomas I don't think there would be something in between the current CE and EES feature sets (since the CE feature set can be defined as "all features that are FOSS"), so I'm not sure where your "???"/"Flex" would fit.
Community Edition, Enterprise Edition Starter and Enterprise Edition Premium are feature sets. gitlab and gitlab-foss are repos/packages.
I think renaming the "feature sets" to get rid of "Edition" makes a lot of sense, since "edition" implies that these are separate packages, which is no longer true, since the main gitlab package can take all different forms. "Community Edition" is in that sense more appropriate for the package, than the feature set. "Enterprise Edition" also doesn't make sense if it's not a separate package. That would make your suggestion something like this:
| Flex (formerly known as Community Edition) | Enterprise Starter | Enterprise Premium |
| --- | --- | --- |
| FREE | $3.25 per user per month | $16.59 per user per month |
| Install gitlab | Install gitlab and buy now | Install gitlab and buy now |
| Install gitlab-ce | | |
Flex, Enterprise Starter and Enterprise Premium are feature sets. gitlab and gitlab-ce are repos/packages.
Realistically, I think that most current users of GitLab Community Edition, if they came across https://about.gitlab.com and came across that table, with a choice between the free-as-in-beer gitlab package and the free-as-in-speech gitlab-ce package, with both behaving exactly the same when no license file is provided, they'd be equally fine using either, because they care more about the free-as-in-beer aspect than the free-as-in-speech aspect. So us being a company, we want to lead these people to gitlab rather than gitlab-ce, to make it easier for them to eventually upgrade to a paid feature set. For those people who care about free-as-in-beer and free-as-in-speech, there would be a smaller link to gitlab-ce. (See https://gitlab.com/gitlab-org/gitlab-ee/issues/2417)
However, since gitlab-ce, CE, and "Community Edition" currently refer to both the feature set and the package, I think that people who heard about CE from a friend or some random website will be likely to find "Flex (formerly known as Community Edition)" and choose gitlab-ce, even if they would have been perfectly happy with gitlab, which we would've preferred they use. That's why I'm suggesting we stop using gitlab-ce as a package name: we want fewer people to use it, not more.
I don't want the name of the 100% FOSS codebase and packages to change. I think that would lead to misunderstandings among the minority of our users who are concerned with, and vocal about, FOSS software.
"Flex" is the name I proposed (I don't like it at all, by the way :p) for the current CE featureset PLUS the ability to upload a license to unlock EES/EEP functionality. Plus any small bits and pieces we've forgotten to add a feature check around - I think there are a few.
Sticking with the table format, I'd be quite happy with us presenting something like this:
| Flex | Enterprise Starter | Enterprise Premium |
| --- | --- | --- |
| FREE | $3.25 per user per month | $16.59 per user per month |
| Install gitlab* | Install gitlab and buy now | Install gitlab and buy now |
* The open-source [Community Edition](...) is also available
I understand the commercial and marketing incentives to push "flex" as the default installation option for people who aren't FOSS-oriented, and I support it wholeheartedly - it makes support a lot easier! My concerns (now we're committed to keeping a binary release of what is currently gitlab-ce going, anyway) are simply about how best to communicate what's happening with as few misunderstandings as possible.
Would a Rails Engine help us separate EE from CE code? Every piece of EE-specific code would go into an engine, and we could remove it from the CE version. It helps that an engine handles routes, migrations, etc. I know we have some custom code in all that, but perhaps to fully isolate things we will re-implement part of Engines.
My motivation here is that, if we don't separate things that way, we may just be trading merge issues for broken code dependency issues (like removing a critical part that breaks CE), and it will take us a similar or greater amount of time, every time, to make sure a change is self-contained and can be safely removed.
At least for conflicts we have tooling... A plus is that by adopting something like that we could even reconsider not splitting the codebase (if EE is self-contained, then merging from CE to EE should not generate any code conflicts).