As part of our effort on GitLab HA, there is the ability to load balance across multiple database servers. Since this is part of our HA feature, this is an EEP feature.
One key difference between this feature and most of the rest of Postgres HA is that it is part of the GitLab app code, whereas much of the rest of Postgres HA is part of the Omnibus code. Since this feature is inside GitLab, it is aware of the license applied to the instance.
We should add an EEP license check to the database load balancing code. If there is no license, we shouldn't enable this feature. Some additional details:
If there is no license and this has been configured, we should put a banner warning up on the site to warn people the feature is not active. (Unless there is a better idea on how to convey this?)
Will need to keep in mind that if the database is having issues we may not be able to query the database to check the license state. Will need to ensure we can validate the license state without having to check back with the database. (This applies to both steady state and startup, since a GitLab app node could be starting up after a failure while the primary node is down.)
@mydigitalself, this is the first step in GitLab.com showing status of HA. Essentially, whether or not you are licensed to use it. From here I think we can look at adding additional monitoring and potentially configuration aspects, but we need to deliver the license check first.
Would this be something your team could help with, perhaps in 9.5? We could then explore larger status options with 9.6 or later.
Joshua Lambert changed title from Database load balancing code is EEP to Database load balancing check for EEP
If there is no license and this has been configured, we should put a banner warning up on the site to warn people the feature is not active. (Unless there is a better idea on how to convey this?)
@mydigitalself How do you envision this? We are not (yet) doing this for any other features that may be configured but are actually inactive.
Will need to keep in mind that if the database is having issues we may not be able to query the database to check the license state. Will need to ensure we can validate the license state without having to check back with the database. (This applies to both steady state and startup, since a GitLab app node could be starting up after a failure while the primary node is down.)
@DouweM yes, the goal is the same. The challenge with doing a check in Omnibus is that it is not aware of the license, and trying to enforce one to be present at install time is onerous and results in a poor UX.
Given that the DB load balancing code is in the platform, we have an opportunity to perform the check here. I think long term we should have a "Geo style" status page where this could be less intrusive, but I was thinking the banner would be a good minimal first step. Certainly open to other ideas / UX input here as well!
@joshlambert Is it a requirement that we can validate the license state without having to check back with the database? Wouldn't GitLab be completely broken in that case anyway, so that it doesn't really matter whether the license check result is correct?
@DouweM More of a technical concern than product requirement. The product requirement would be that the GitLab service remain operational during various outage scenarios, when an EEP license is present.
My concern was something like the following flow:
License is EEP, load balancing is active.
GitLab currently hitting node X.
Node X goes down, and becomes unresponsive.
GitLab now needs to fail over and select a new DB node.
GitLab attempts to perform license check, can't reach node X and fails.
Or alternatively, something like:
An outage brings down part of the fleet: some DB nodes, some app nodes.
GitLab brings up a few new app nodes to recover, and each attempts to connect to the first DB in the list (node X).
Node X is unresponsive.
GitLab performs the license check for database load balancing, cannot read the license, and fails.
It could perhaps be simplified to "fail open" in the event of database errors, but I would certainly defer to your technical expertise on better approaches.
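The "fail open" idea above can be sketched as follows. This is only an illustration in Python, not GitLab's code: `LicenseGate`, `fetch_plan`, `DatabaseUnavailable` and the plan string `"EEP"` are all hypothetical names. It assumes the license lookup raises a distinct error when no database node answers, and that the last successful answer can be kept in memory on the app node.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class DatabaseUnavailable(Exception):
    """Raised by the (hypothetical) license lookup when no DB node answers."""


@dataclass
class LicenseGate:
    # Hypothetical callable that reads the license plan from the database
    # and raises DatabaseUnavailable if the database cannot be reached.
    fetch_plan: Callable[[], Optional[str]]
    required_plan: str = "EEP"
    last_known_plan: Optional[str] = None
    has_cached_result: bool = False

    def load_balancing_allowed(self) -> bool:
        try:
            plan = self.fetch_plan()
        except DatabaseUnavailable:
            # Reuse the last answer we saw. With no cached answer (a node
            # that has just booted during an outage), fail open so the
            # license check never blocks failover.
            if self.has_cached_result:
                return self.last_known_plan == self.required_plan
            return True
        self.last_known_plan = plan
        self.has_cached_result = True
        return plan == self.required_plan
```

With this shape, both flows above keep working: an existing node reuses its cached answer when node X disappears, and a freshly started node with no cached answer is allowed through until the license can be read.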
If there is no license and this has been configured, we should put a banner warning up on the site to warn people the feature is not active. (Unless there is a better idea on how to convey this?)
@DouweM @joshlambert if we're going to be promoting features to users, then I think having some UI/banner is a good thing, although I'm not sure how often it would be seen.
If multiple databases are configured, but there is no EEP license, then we could:
Put a permanent banner in the Admin Area (Overview?) saying so.
Put a notification banner in the app itself, similar to how we notify for expiring trial licenses (see below). This would only be displayed to admins and could be dismissible.
@mydigitalself agree, I think how visible it should be depends on if we can gate the feature and still support the failure modes. If we can gate it, then we can just display to admins.
If we instead opt to not gate the feature because of technical challenges (the license is in the database which you are trying to reach), then we could put up a banner to all users saying there is a license violation.
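The two banner outcomes described here could be expressed as a small decision function. The sketch below is illustrative Python, not GitLab code; `banner_for`, its parameters and the banner wording are assumptions made for the example.

```python
from typing import Optional


def banner_for(
    configured: bool,       # multiple databases are configured
    licensed: bool,         # an EEP license is present
    gated: bool,            # the feature is switched off without a license
    viewer_is_admin: bool,
) -> Optional[str]:
    """Return the banner text for this viewer, or None for no banner."""
    if not configured or licensed:
        return None
    if gated:
        # Feature is inactive: only admins need to know why.
        if viewer_is_admin:
            return "Database load balancing is configured but inactive: no EEP license."
        return None
    # Feature could not be gated, so it is running unlicensed: tell everyone.
    return "Database load balancing is in use without an EEP license."
```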
@rdavila I would think that would help existing nodes, although we'd still need to think through how to bring on new nodes in the various outage scenarios that can occur. (It may not have the license on disk, yet.)
An MVP here could be to simply provide a banner warning if GitLab provided PG HA is on, and license type is not EEP.
Will need to keep in mind that if the database is having issues we may not be able to query the database to check the license state. Will need to ensure we can validate the license state without having to check back with the database. (This applies to both steady state and startup, since a GitLab app node could be starting up after a failure while the primary node is down.)
@joshlambert given that DB load balancing is enabled at startup time, I'm wondering whether the above comment still applies. I've added the license check there, so load balancing will not be enabled without a Premium license.
One last question: if a DB server is having connectivity issues, I think the app server will not be able to start anyway, will it?
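A startup-time check of the kind described in this comment might look roughly like the sketch below. It is illustrative Python rather than the code in the merge request; `LoadBalancingSetting`, `read_hosts` and the plan string `"EEP"` are hypothetical. The point is that the decision is made once at boot and then reused, so later requests do not need to consult the license again.

```python
from typing import Iterable, List


class LoadBalancingSetting:
    """Decide once, at boot, whether load balancing is active."""

    def __init__(
        self,
        configured_hosts: Iterable[str],
        license_plan: str,
        required_plan: str = "EEP",  # hypothetical plan identifier
    ) -> None:
        self.hosts: List[str] = list(configured_hosts)
        # Active only if hosts are configured AND the license allows it.
        self.enabled: bool = bool(self.hosts) and license_plan == required_plan

    def read_hosts(self, primary: str) -> List[str]:
        # Without a license, all reads go to the primary as before.
        return self.hosts if self.enabled else [primary]
```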
If there is no license and this has been configured, we should put a banner warning up on the site to warn people the feature is not active. (Unless there is a better idea on how to convey this?)
@mydigitalself before sending the MR for review, can you please confirm if I should also implement the above feature? Right now I've added a simple check to enable/disable this feature based on the license key.
@rdavila if we can prevent the feature from being enabled without an EEP license then we don't need the banner.
On the topic of whether or not we can start, I will leave that up to you folks who understand the technical components and order of operations better. My concern was just to ensure that GitLab can start under all of these conditions.
For example if the first database node in the list is down, GitLab should still be able to start up and connect to the next one in the list.
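The startup behaviour asked for here, trying the next node when the first one is down, can be sketched as below. This is a generic Python illustration, not GitLab's connection code; `connect_to_first_available`, `NoDatabaseAvailable` and the `connect` callable are hypothetical, and the sketch assumes an unreachable host surfaces as `ConnectionError`.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


class NoDatabaseAvailable(Exception):
    """Raised when every configured host refused the connection."""


def connect_to_first_available(
    hosts: Iterable[str],
    connect: Callable[[str], T],  # hypothetical: opens a connection to one host
) -> T:
    errors: List[str] = []
    for host in hosts:
        try:
            return connect(host)
        except ConnectionError as exc:
            # Remember why this host failed and move on to the next one.
            errors.append(f"{host}: {exc}")
    raise NoDatabaseAvailable("; ".join(errors) or "no hosts configured")
```

Startup only fails when every host in the list is unreachable, and the raised error lists the reason for each host.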