Proposal for performance priority labels

Merged Ernst van Nierop requested to merge evn-prio-labels into master
All threads resolved!

Current description

To help clarify priority of issues, from the perspective of their impact on availability and performance of GitLab.com.

Closes https://gitlab.com/gitlab-com/infrastructure/issues/1943

Original description

I do not think this is a priority right now, but it stems from https://gitlab.com/gitlab-com/www-gitlab-com/merge_requests/5373/diffs which I broke into smaller merge requests.

Edited by Ernst van Nierop

Activity

  • mentioned in commit 5ef422a9

  • Douwe Maan
  • yorickpeterse-staging
  • re the following "thread" @ernstvn @pcarranza

    AP1 = drop everything and do it now, AP2 = schedule for next release, AP3 = schedule in next 3-6 months?

    Ah no.... that's where your judgment comes in ;-) after all, that goes back to the business decision of how important it is to prevent the outage, or how important it is to improve the performance.

    I think it's fair, but on that scale I would rarely use AP3, I would use AP1 for little things, and most things would end up in AP2.

    I'm more than happy to use my judgement, but the levels should give me some good guidance and, to be fair, there's a lot of guidance in the U portion regarding the time horizon on when an outage may occur.

    Let's see how it goes over the next couple of releases. I'm somewhat concerned about everything getting AP1 and AP2. For instance, there are a lot of performance improvements that may deliver a > 100ms improvement to a particular call (i.e. I1) that is seldom used, say an .atom resource or a less frequently used UI resource that gets only a few hundred requests a day, which still makes it U2. By our matrix, that would get an AP1, and I really don't think that's a FIX IT NOW situation; for me it is probably a fix on a 3-6 month horizon.

    The best thing to do is to stress test tags against this and see how the guidance is helping that decision making.
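
    One rough way to run that stress test is to tally how a batch of already-labelled issues spreads across the AP levels. The sketch below is only an illustration: the `ap_distribution` helper, the `ap` field, and the sample list are hypothetical, and the sample labels are taken from examples mentioned in this thread rather than from real issue data.

```python
from collections import Counter

AP_LABELS = ("AP1", "AP2", "AP3")


def ap_distribution(issues):
    """Return the share of issues carrying each AP label.

    `issues` is a list of dicts with a hypothetical "ap" key; this is
    not a GitLab API structure, just an assumption for the sketch.
    """
    counts = Counter(issue["ap"] for issue in issues)
    total = sum(counts[label] for label in AP_LABELS)
    if total == 0:
        return {label: 0.0 for label in AP_LABELS}
    return {label: counts[label] / total for label in AP_LABELS}


# Hypothetical sample, loosely based on the examples in this thread.
sample = [
    {"title": "Circuit breaker for NFS mounts", "ap": "AP1"},
    {"title": "Seldom used .atom resource, > 100ms gain (I1/U2)", "ap": "AP1"},
    {"title": "General object storage", "ap": "AP2"},
    {"title": "Endpoint breaking the 1s latency SLA", "ap": "AP3"},
]

if __name__ == "__main__":
    for label, share in ap_distribution(sample).items():
        print(f"{label}: {share:.0%}")
```

    If AP3 stays close to empty over a couple of releases, that would suggest the guidance is pushing too many issues into AP1 and AP2.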

  • @mydigitalself I don't think that I would label something that would save us 100ms as AP1.

    To give you some idea:

    • AP1 -> circuit breaker for NFS mounts - please give me this ASAP because when one NFS server goes down it all crashes and creates a 20m outage.
    • AP2 -> general object storage, I may tag artifacts in particular as AP1 simply because they are a problem on fire right now, but the whole thing can take a bit longer and I'm reasonable on that.
    • AP3 -> improving an endpoint that is breaking the 1s latency SLA - I would like to see this done, but I can live with a slower page for now.

    At least this is how I would label things, which I think follows common sense, granted that common sense is usually not that common.

    Does this reasoning make sense to you?

  • @pcarranza that's perfectly reasonable and pretty much fits how I think about the prioritisation. I was just pointing out that, as per those definitions, we could overemphasize some things that should really be AP3.

  • @mydigitalself Oh, no, the last thing I want is to use up all the AP1 and AP2 bullets way too soon.

  • Ernst van Nierop resolved all discussions

  • Ernst van Nierop mentioned in merge request !6580 (merged)

  • Ernst van Nierop resolved all discussions
