Commit 2de34071 authored by Eric Johnson

Merge branch 'db-okr-scoring' into 'master'

Added DB OKR retrospective for Q4 of 2017

Closes #1880

See merge request gitlab-com/www-gitlab-com!9264
parents 15ee7a49 6faedd80
@@ -167,10 +167,10 @@ title: "2017 Q4 OKRs"
* Generate a project plan for the GCP migration and get approved by EJ and Sid
* Execute milestone 1 of the GCP migration plan by Dec 15
* Database Lead
* Demo restore time < 1 hour
* Solve 30% of the schema issues identified by Crunchy
* Database Uptime 99.99% measured in Prometheus
* SQL timing under 100ms for Issue, MR, project dashboard, and CI pages measured in Prometheus
* Demo restore time < 1 hour => postponed until the GCP migration has been completed
* Solve 30% of the schema issues identified by Crunchy => 6.7% done (2 out of 30)
* Database Uptime 99.99% measured in Prometheus => Done!
* SQL timing under 100ms for Issue, MR, project dashboard, and CI pages measured in Prometheus => Improving the 99th percentile has proven to be very difficult, but progress is being made.
* Director of Security
* Strong security for SaaS and on-premise product. Top 10 actions from risk assessment done and actions for top 10 risks started.
* HackerOne bug bounty program. Implemented and bounties awarded.
@@ -240,7 +240,7 @@ title: "2017 Q4 OKRs"
* Hire 2 senior developers
* Discussion: Hire 2 developers
* Gitaly: Hire a developer
* Database: Hire a database specialist
* Database: Hire a database specialist => Done! Starting end of January 2018
* Director of Quality
* Hire a test automation lead
* Hire 3 test automation engineers
@@ -293,7 +293,24 @@ title: "2017 Q4 OKRs"
* Continue collaborating with FE on UI Repository and UX Backlog cleanup
* Push harder for significant iterative UX improvements in each release
* Anticipate holiday season's influence on ability to deliver
* Database Lead
* Staff Developer, Database
* GOOD
* Despite the large OKR we managed to solve a lot of performance issues.
* We improved the team workflow by using issue boards more actively and by having a weekly database meeting.
* We managed to add health / uptime monitoring to Prometheus / Grafana, allowing us to see how the database health changes over time. This is based on the number of alerts sent out, not the uptime of the database.
* We managed to hire a 3rd database specialist.
* We rewrote the GitHub importer from scratch, resulting in _much_ better performance.
* We wrote a (popular) blog post about scaling the database: <https://about.gitlab.com/2017/10/02/scaling-the-gitlab-database/>
* We managed to optimise retrieving CI pipeline statuses, which used to execute _very_ slow SQL queries.
* BAD
* We added far too much work to the Q4 OKR, resulting in us only being able to complete a small portion of the planned work.
* We didn't take the summit into account when planning the OKR.
* One database specialist was unavailable for a few weeks due to having to move to a different apartment. This led to a reduction in productivity of the team as a whole.
* There were too many issues that required the help of others; some of these were not worked on for several weeks.
* We estimated we'd be able to complete 30 schema issues, but only ended up completing two of them.
* 10.3 had a few bad migrations causing trouble.
* TRY
* Schedule more well-defined issues for an OKR so we can actually solve them.
* Make it harder to introduce performance problems (planned for Q1 of 2018).
* Assign database specialists to specific areas instead of having them take care of everything (<https://gitlab.com/gitlab-com/infrastructure/issues/3139>).
* Delegate more work to the other teams so database specialists don't have to do so much on their own.