Commit 5ef422a9 authored by Ernst van Nierop

Merge branch 'evn-prio-labels' into 'master'

Proposal for performance priority labels

Closes infrastructure#1943

See merge request !6019
parents 3f96dfb1 2a14b518
1 merge request: !6019 Proposal for performance priority labels
Pipeline #12594552 failed
@@ -160,6 +160,51 @@ revision, new revision, and ref (e.g. tag or branch) name.
1. Sidekiq updates PostgreSQL
1. Unicorn can now query PostgreSQL.
 
## Availability and Performance Priority Labels
{: #performance-labels}
To clarify the priority of issues that relate to GitLab.com's availability and
performance, consider adding an _Availability and Performance Priority Label_,
`~AP1` through `~AP3`. This is similar to what the Support and Security teams
do: they use `~SE` and `~SL` labels, respectively, to indicate priority.

Use the following as a guideline to determine which Availability and Performance
Priority label to use for bugs and feature proposals. Consider the _urgency_
and _impact_ of the "scenario" that could result from this issue (not) being
resolved.

- **Urgency:** _Examples_
  - U1
    - Outage likely within a month.
    - Affects many team members and/or many GitLab.com users.
  - U2
    - Outage likely within three months.
    - Affects some team members and/or a few GitLab.com users.
  - U3
    - Outage can happen, but not likely in next three months.
    - Affects some team members but no GitLab.com users.
- **Impact:** _Examples_
  - I1
    - Outage of >= 25 minutes.
    - Performance improvement (or avoiding degradation) of >= 100 ms expected.
  - I2
    - Outage of 5 - 25 minutes.
    - Performance improvement (or avoiding degradation) of 10 - 100 ms expected.
  - I3
    - Outage of 0 - 5 minutes.
    - Performance improvement (or avoiding degradation) of <= 10 ms expected.

| **Urgency \ Impact** | **I1 - High** | **I2 - Medium** | **I3 - Low** |
|----------------------------|---------------|------------------|----------------|
| **U1 - High** | `AP1` | `AP1` | `AP2` |
| **U2 - Medium** | `AP1` | `AP2` | `AP3` |
| **U3 - Low** | `AP2` | `AP3` | `AP3` |
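
The matrix above can also be read as a simple lookup. The following is a minimal sketch in Python, not part of any GitLab tooling: the names `AP_MATRIX` and `priority_label` are hypothetical and exist only to illustrate how an urgency and impact pair maps to a label.

```python
# Hypothetical illustration of the urgency/impact matrix above.
# AP_MATRIX and priority_label are made-up names, not GitLab APIs.
AP_MATRIX = {
    ("U1", "I1"): "AP1", ("U1", "I2"): "AP1", ("U1", "I3"): "AP2",
    ("U2", "I1"): "AP1", ("U2", "I2"): "AP2", ("U2", "I3"): "AP3",
    ("U3", "I1"): "AP2", ("U3", "I2"): "AP3", ("U3", "I3"): "AP3",
}


def priority_label(urgency: str, impact: str) -> str:
    """Return the label (for example "~AP2") for an urgency/impact pair."""
    key = (urgency.upper(), impact.upper())
    if key not in AP_MATRIX:
        raise ValueError(f"unknown urgency/impact combination: {key}")
    # Labels are written with a leading "~" in issue text.
    return "~" + AP_MATRIX[key]
```

For example, an issue judged to be high urgency (`U1`) but low impact (`I3`) would get `priority_label("U1", "I3")`, which is `~AP2`, matching the top-right cell of the table.
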
## Database Performance
 
Some general notes about parameters that affect database performance, at a very
@@ -28,6 +28,7 @@ title: "Infrastructure"
 
- [GitLab.com architecture](production-architecture/)
- [Monitoring GitLab.com](monitoring/)
- [Performance of GitLab.com](/handbook/engineering/performance)
- [Database team handbook](database/)
- [Gitaly team handbook](gitaly/)
- [Production team handbook](production/)
@@ -46,7 +47,7 @@ infrastructure team works on
is in fact the first issue in the gitlab-ce issue tracker; for more on
pingdom see the [monitoring page](/handbook/infrastructure/monitoring/)),
measured per calendar month, and as recorded on
[pingdom](http://stats.pingdom.com/81vpf8jyr1h9/1902794/history).
1. GitLab.com's performance.
- Current goal: [99% of user requests < 1 second](https://performance.gitlab.net/dashboard/db/transaction-overview?panelId=2&fullscreen&orgId=1)
- Latency here is _currently_ measured via the "Transaction Timings"
@@ -270,6 +271,7 @@ in the invite, or get in touch in the production chat channel to ask.
Any team or individual can initiate a change to GitLab.com by following this checklist. Create an issue in the infrastructure [issue
tracker](https://gitlab.com/gitlab-com/infrastructure/issues) and select the `change_checklist` template.
 
## Make GitLab.com settings the default
 
As said in the [production engineer job description](jobs/production-engineer/index.html)
@@ -41,25 +41,26 @@ own time as the main scarce resource.
1. Transparency, clarity and directness: public and explicit by default, we work in the open, we strive to get signal over noise.
1. Efficiency: smart resource usage, we should not fix scalability problems by throwing more resources at them but by understanding where the waste is happening and then working to make it disappear. We should work hard to reduce toil to a minimum by automating all the boring work out of our way.
 
## Prioritizing Issues
## Workflow
 
Given the variety of responsibilities and number of "interfaces" between the Production
team and all the other teams at GitLab, here is a guideline on how to prioritize
the issues we work on. Basing this on the [goals of the Infrastructure team](../#infragoals) as
well as our [values](/handbook/values/) and [workflows](/handbook/engineering/workflow)
as a company as a whole, the priority should be:
### Workout of the Week (WoW) Milestone
 
1. keeping GitLab.com available - and secure
1. unblocking others
1. automating tasks to reduce toil and increase _team_ availability (but be
explicit about the [costs](https://xkcd.com/1319/) and [benefits](https://xkcd.com/1205/))
1. improving performance of GitLab.com while being conscious of cost
1. reducing costs of running GitLab.com
Issues in the tracker are organized into [milestones](https://gitlab.com/gitlab-com/infrastructure/milestones)
to define the "workout of the week" (WoW) from one week to the next. The "week"
runs from Wednesday to end of Tuesday. The other milestone in use is "Next WoW"
to track items scheduled for the next week. Every week, the Production Lead
renames the WoW to "WoW ending yyyy-mm-dd", and closes it; then renames "Next
WoW" to "WoW". By doing this, the closed milestones provide a history of what
the team has worked on, while the team only needs to be concerned with two open
milestones. If issues are added to the "WoW" after the week has already
started, add the `~unscheduled` label (not needed if the issue is `~outage`
since those are by definition unscheduled).
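
The date arithmetic behind the "WoW ending yyyy-mm-dd" name can be sketched as follows. This is only an illustration in Python: the function names are hypothetical, and it assumes the "ending" date is the Tuesday that closes the Wednesday-to-Tuesday week.

```python
from datetime import date, timedelta

# date.weekday() counts Monday as 0, so Tuesday is 1.
TUESDAY = 1


def wow_end_date(day: date) -> date:
    """Return the Tuesday closing the Wednesday-to-Tuesday week containing `day`."""
    days_until_tuesday = (TUESDAY - day.weekday()) % 7
    return day + timedelta(days=days_until_tuesday)


def closed_wow_title(day: date) -> str:
    """Build the title used when the current WoW milestone is renamed and closed."""
    # Assumption: yyyy-mm-dd is the ISO date of the closing Tuesday.
    return f"WoW ending {wow_end_date(day).isoformat()}"
```

For any day in the week that starts on Wednesday 2017-10-04, `closed_wow_title` gives `WoW ending 2017-10-10`.
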
 
### Labeling Issues
 
We use [issue labels](https://gitlab.com/gitlab-com/infrastructure/labels) to
assist in organizing issues within the Infrastructure issue tracker. Prioritized labels are
We use [issue labels](https://gitlab.com/gitlab-com/infrastructure/labels)
within the Infrastructure issue tracker to assist in prioritizing and organizing
work. Prioritized labels are:
 
- `~(perceived) data loss`
- `~critical`
@@ -68,13 +69,17 @@ assist in organizing issues within the Infrastructure issue tracker. Prioritized
- `~outage`
- `~blocked`
 
### Workout of the Week (WoW) Milestone
Issues in this tracker are organized into [milestones](https://gitlab.com/gitlab-com/infrastructure/milestones) to define the "workout of the week" (WoW) from one week to the next. The "week" runs from Wednesday to end of Tuesday. The other milestone in use is "Next WoW" to track items scheduled for the next week. Every week, the Production Lead renames the WoW to "WoW ending yyyy-mm-dd", and closes it; then renames "Next WoW" to "WoW". By doing this, the closed milestones provide a history of what the team has worked on, while the team only needs to be concerned with two open milestones. If issues are added to the "WoW" after the week has already started, add the `~unscheduled` label (not needed if the issue is `~outage` since those are by definition unscheduled).
We also use the `~AP1`, `~AP2`, `~AP3` labels as described in [availability &
performance priority labels](/handbook/engineering/performance/#performance-labels).
Those are mainly used to communicate priority of issues to Product Managers, for
scheduling purposes.
 
### Issue or outage hand off
 
Ongoing outages, as well as issues that have the `~(perceived) data loss` label and are (therefore) actively being worked on need a hand off to happen as team members cycle in and out of their timezones and availability. The on call log can be used to assist with this. (See link at top to on-call log).
Ongoing outages, as well as issues that have the `~(perceived) data loss` label
and are (therefore) actively being worked on, need a hand off as team members
cycle in and out of their time zones and availability. The on-call log can be
used to assist with this (see the link to the on-call log at the top).
 
## Production events logging
 