Commit e2827329 authored by Pablo Carranza

Merge branch 'infra-overview-doc' into 'master'

Updated the docs with some recent information

Closes #787

See merge request !9
parents dcaf8959 9b2fbfc7
@@ -2,7 +2,9 @@
 
[These are not the runbooks you are looking for](https://gitlab.com/gitlab-com/runbooks)
 
[Infrastructure Overview](overview.md)
## I'm here to see how the GitLab.com infrastructure is made
You should read the [production architecture](production_architecture.md) then.
 
## Where and how to look for data
 
@@ -31,9 +33,3 @@ There are some metrics that are not visible in this public site because we do no
* [Host Stats](http://performance.gitlab.net/dashboard/db/host-stats): useful to dive deep into a specific host to understand what is going on with it. Select a host from the dropdown on the top.
* [Business Stats](http://performance.gitlab.net/dashboard/db/business-stats): shows many pushes, new repos and CI builds.
* [Daily overview](http://performance.gitlab.net/dashboard/db/daily-overview): shows endpoints with amount of calls and performance metrics. Useful to understand what is slow generally.
## Production Architecture
![Architecture](img/GitLab Infrastructure Architecture.png)
[Source](https://docs.google.com/drawings/d/1MqoemFRdoLm3_p5aKBhzblZM872F1R-tWdoOR5xMQpE/edit), GitLab internal use only
# Infrastructure Overview
# Production Architecture
 
Our core infrastructure is currently hosted on several cloud providers,
all with different functions. This document does not cover servers that
are not integral to the public facing operations of GitLab.com.
 
This is what it looks like:
![Architecture](img/GitLab Infrastructure Architecture.png)
[Source](https://docs.google.com/drawings/d/1MqoemFRdoLm3_p5aKBhzblZM872F1R-tWdoOR5xMQpE/edit), GitLab internal use only
## Azure
 
The main portion of GitLab.com is hosted on Microsoft Azure. We have
@@ -12,23 +17,38 @@ the following servers there.
* 5 HAProxy load balancers for GitLab.com
* 2 HAProxy load balancers for GitLab Pages
* 2 HAProxy nodes for altssh.GitLab.com
* 20 worker nodes (Nginx, Workhorse, Unicorn + Rails, Redis + Sidekiq)
* 2 PostgreSQL servers
* 22 front-end nodes of which:
* 4 are Web nodes
* 8 are API nodes
* 10 are Git nodes
* 10 Sidekiq nodes
* 4 PostgreSQL servers
* 5 Redis servers
* 3 Prometheus monitoring servers
* 11 NFS servers
* 3 Prometheus servers
* 5 NFS servers
Note that these numbers can fluctuate to adapt to the platform needs.
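
As a quick sanity check on the updated front-end figures, the following minimal Python sketch confirms that the per-role counts add up to the stated total of 22. The counts come from the list above; the dictionary layout and the names `FRONT_END_NODES` and `front_end_total` are purely illustrative and are not part of any GitLab tooling.

```python
# Front-end node counts as listed in the updated document.
# The structure and names here are hypothetical, for illustration only.
FRONT_END_NODES = {"web": 4, "api": 8, "git": 10}
STATED_FRONT_END_TOTAL = 22


def front_end_total(pools):
    """Sum the node counts across all front-end pools."""
    return sum(pools.values())


if __name__ == "__main__":
    total = front_end_total(FRONT_END_NODES)
    print(f"front-end nodes: {total}")
    print(f"matches stated total: {total == STATED_FRONT_END_TOTAL}")
```

Since the document notes that these numbers fluctuate, a check like this is only meaningful against a single snapshot of the list.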
 
We also use availability sets to ensure that a minimum number of servers in each
group are available at any given time. This ensures that Azure will not reboot
all instances in the same availability set at the same time for anything that
is planned.
 
Additionally, we utilize an Azure load balancer to manage PostgreSQL failovers
and to front-end the HA-Proxy servers with a single virtual IP address.
All our servers run the latest Ubuntu LTS unless there is a specific need to do otherwise. Every server is configured with a fully fledged set of firewall rules for increased security.
 
![Architecture](img/GitLab Infrastructure Architecture.png)
### Load Balancers
We utilize Azure load balancers in front of our HAProxy nodes. This allows us to leverage on the Azure infrastructure for HA as well as [taking advantage of the power of HAProxy](https://gitlab.com/gitlab-cookbooks/gitlab-haproxy).
Additionally, we utilize an Azure load balancer to manage PostgreSQL failovers.
* The GitLab.com load balancer pool serves git over ssh, git over https, http and https traffic.
* The GitLab Pages load balancer serves http and https.
* The AltSSH load balancer serves [git on port 443](https://about.gitlab.com/2016/02/18/gitlab-dot-com-now-supports-an-alternate-git-plus-ssh-port/) and translates it to port 22 on the back-end.
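
The traffic split described in the three bullets above can be modelled roughly as follows. This is a sketch only: the pool names, the protocol labels, and the `backend_port` helper are hypothetical, and the pass-through behaviour for ports other than 443 is an assumption made for illustration. The only facts taken from the document are which traffic each pool serves and that AltSSH translates port 443 to port 22 on the back-end.

```python
# Traffic types served by each load balancer pool, per the list above.
# Pool names and protocol labels are illustrative, not real configuration keys.
POOL_TRAFFIC = {
    "gitlab.com": {"git-over-ssh", "git-over-https", "http", "https"},
    "pages": {"http", "https"},
    "altssh": {"git-over-ssh"},
}

# AltSSH accepts git+ssh on port 443 and forwards it to port 22.
ALTSSH_PORT_MAP = {443: 22}


def backend_port(frontend_port):
    """Return the back-end port for an AltSSH connection.

    Assumption: ports without an explicit mapping pass through unchanged.
    """
    return ALTSSH_PORT_MAP.get(frontend_port, frontend_port)


def pools_serving(traffic_type):
    """List the pools that serve a given traffic type, sorted by name."""
    return sorted(name for name, kinds in POOL_TRAFFIC.items() if traffic_type in kinds)


if __name__ == "__main__":
    print(backend_port(443))
    print(pools_serving("https"))
```

In practice this mapping lives in the HAProxy configuration rather than in application code; the sketch is only meant to make the routing rules easy to read at a glance.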
### Service nodes
 
This constitutes our core infrastructure.
Different services have different resource utilization patterns so we use a variety of instance types across our service nodes that are consistent for each group. We have recently isolated traffic by type on dedicated pools of nodes. We hope you noticed the performance improvement.
 
## Digital Ocean
 
@@ -59,7 +79,7 @@ We are currently investigating Google Cloud.
 
# Technology at GitLab
 
We use a lot of cool ([but boring](https://about.gitlab.com/handbook/#values)])
We use a lot of cool ([but boring](https://about.gitlab.com/handbook/#values))
technologies here at GitLab. Below is a non-exhaustive list of tech we use here.
 
* [Ruby](https://www.ruby-lang.org/) (probably goes without saying)
@@ -68,3 +88,5 @@ technologies here at GitLab. Below is a non-exhaustive list of tech we use here.
* [PostgreSQL](https://www.postgresql.org/)
* [Redis](https://redis.io/)
* [ELK Stack](https://www.elastic.co/products)
* [Terraform](https://www.terraform.io)
* [Consul](https://www.consul.io)