
WIP: add backup monitoring

Closed username-removed-475751 requested to merge mk-add-backup-monitoring into master
5 unresolved threads

Merge request reports


Activity

    module GitLab
      module Monitor

        # Probes a process for info then writes metrics to a target
        class BackupProber

    +     Aws.config.update({
    +       region: 'eu-central-1',
    +       credentials: Aws::Credentials.new('AKIAI56UH4XKC4TMFL4A', 'DoV6g31mU11LAokd6yEyj2ta/6ujOhj/hObqL3TZ')
    +     })
          @metrics = metrics
    +     @wal_bucket = options[:wal_bucket]
    +     @opts = options
        end

    -   def probe_backup_size
    -     @metrics.add("gitlab_backup_size", 124)
    -     self
    -   end
    +   # Assumes that under the given prefix there are JSON files whose first 57 characters name the corresponding backup directory
    +   def probe_backup_info
    +     region = @opts[:backup_region]
    +     key = @opts[:backup_key]
    +     password = @opts[:backup_password]
    +     bucket = @opts[:backup_bucket]
    +     prefix = @opts[:backup_prefix]
    +
    +     credentials = Aws::Credentials.new(key, password)
    +     s3 = Aws::S3::Resource.new(region: region, credentials: credentials)
    +
    +     # reference an existing bucket by name
    +     bucket = s3.bucket(bucket)
    +
    +     obj = bucket.objects(prefix: prefix)
    +
    +     # enumerate every object in the bucket
    +     last_backup_dir = nil
    +     last_backup_time = nil
    +     dir_name = nil
    +     bucket.objects(prefix: prefix).each do |obj|
    +       if obj.key.end_with?('.json')
    +         if last_backup_time == nil || last_backup_time < obj.last_modified
    +           last_backup_time = obj.last_modified
    +           dir_name = "#{obj.key[0..57]}/"
    +         end
    +       end
    • Maintainer

      Where are the tests? Please use TDD

    • @maratkalibek probe_backup_info and probe_wal_info are the same thing with small differences. Extract what is exactly the same so you only implement what is different. As it stands, the duplication reads like pure noise.

    • @pcarranza I can make them mostly similar, but they actually work in different ways.

      1. WAL: I just take the last modified date and the total size of the bucket directory.
      2. Backup: I first look at the last modified date of the JSON file (which is not in the directory I need). Only then do I calculate the size of the resolved directory. I can reuse the method from the WAL version for that part.

      These are time-consuming operations, so I tried not to do expensive or redundant calculations in a single method: each probe gathers only the information it needs.
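      For illustration, the shared part @pcarranza points at could be extracted into a single fold over the listing. This is a hypothetical sketch (scan_objects does not exist in the MR), duck-typed over anything that responds to #content_length and #last_modified, such as Aws::S3::ObjectSummary:

```ruby
# Hypothetical shared helper: one pass over an object listing, returning
# the total size and the newest last-modified value. Both probes could
# call this and differ only in which metrics they emit afterwards.
def scan_objects(objects)
  size = 0
  latest = nil
  objects.each do |obj|
    size += obj.content_length
    latest = obj.last_modified if latest.nil? || latest < obj.last_modified
  end
  [size, latest]
end
```

      The backup probe would still need its extra .json pass to pick the directory, but the size/time accumulation would live in one place.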

  •     s3 = Aws::S3::Resource.new(region: region, credentials: credentials)

        # reference an existing bucket by name
        bucket = s3.bucket(bucket)

        obj = bucket.objects(prefix: prefix)

        # enumerate every object in the bucket
        size = 0
        last_backup_time = nil
        bucket.objects(prefix: prefix).each do |obj|
          if last_backup_time == nil || last_backup_time < obj.last_modified
            last_backup_time = obj.last_modified
          end
          size += obj.content_length
        end
  • +     size += entry.content_length
    +   end
    +
    +   @metrics.add('gitlab_backup_size', size)
    +   @metrics.add('gitlab_backup_time', last_backup_time.to_i)
        end

    -   def write_to(target)
    -     target.write(@metrics.to_s)
    +   # Assumes that under the given prefix there are JSON files whose first 57 characters name the corresponding backup directory
    +   def probe_wal_info
    +     region = @opts[:wal_region]
    +     key = @opts[:wal_key]
    +     password = @opts[:wal_password]
    +     bucket = @opts[:wal_bucket]
    +     prefix = @opts[:wal_prefix]
  • added 1 commit

    • 58547548 - Refactored backup monitoring part


  • username-removed-475751 unmarked as a Work In Progress


  • I pushed metrics export for:

    1. backup monitoring: reports the creation time and size of the last backup
    2. wal monitoring: reports the last creation time and total size of the WAL directory

    PS: Array#select is not applicable to the bucket objects here, since the collection is an Enumerable, not an Array.
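    On the select point: since the collection does include Enumerable, a single-pass inject can still pick the newest .json without materializing an intermediate Array. A hypothetical sketch (newest_json is an assumed name, not code from this MR):

```ruby
# Single-pass fold over a lazy listing: return the object with the
# newest last_modified among keys ending in ".json", or nil if none.
def newest_json(objects)
  objects.inject(nil) do |newest, obj|
    next newest unless obj.key.end_with?(".json")
    newest && newest.last_modified >= obj.last_modified ? newest : obj
  end
end
```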

  • added 2 commits

    • 5e3fd873 - backups - added specs and wal fixed
    • 1018c42f - bump version to 1.1.1


  • @pcarranza please review

  •     end
      end

      # Probes a backup bucket/path for info then writes metrics to a target
      class BackupProber < S3Prober
        def initialize(options, metrics: PrometheusMetrics.new)
          @metrics = metrics

          @region = options[:aws_region]
          @key = options[:aws_key]
          @secret = options[:aws_secret]
          @bucket_name = options[:bucket]
          @prefix = options[:prefix]

          @s3 = obtain_s3(@key, @secret, @region)
        end
  • 44
    45 wal = Wal.new(0)
    46 size = 0
    47
    48 bucket = @s3.bucket(@bucket_name)
    49 bucket.objects(prefix: @prefix).each do |obj|
    50 size += obj.content_length
    51 current_wal = Wal.new(obj.last_modified.to_i)
    52 next unless wal < current_wal
    53 wal = current_wal
    54 end
    55
    56 @metrics.add("gitlab_wal_size", size)
    57 @metrics.add("gitlab_wal_time", wal.time)
    58 end
    59 end
  • 39 end
    40
    41 def probe_wal_info
    42 # current_backup = Backup.new(obj.last_modified.to_i, "#{obj.key[0..57]}/")
    43 # next unless backup < current_backup
    44
    45 wal = Wal.new(0)
    46 size = 0
    47
    48 bucket = @s3.bucket(@bucket_name)
    49 bucket.objects(prefix: @prefix).each do |obj|
    50 size += obj.content_length
    51 current_wal = Wal.new(obj.last_modified.to_i)
    52 next unless wal < current_wal
    53 wal = current_wal
    54 end
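    The Wal value object compared above is not shown in this hunk; a minimal sketch of what it presumably looks like, assuming it only wraps the last-modified epoch time and orders by it:

```ruby
# Hypothetical reconstruction of the Wal value object used by
# probe_wal_info: wraps an epoch timestamp and compares by it,
# so `wal < current_wal` picks the newer of the two.
class Wal
  include Comparable

  attr_reader :time

  def initialize(time)
    @time = time
  end

  def <=>(other)
    time <=> other.time
  end
end
```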
  •   require "spec_helper"
      require "gitlab_monitor/backup"
      require "aws-sdk"

      describe GitLab::Monitor do
        describe GitLab::Monitor::BackupProber do
          let(:writer) { StringIO.new }

          let(:options) { { aws_key: "key", aws_secret: "secret", aws_region: "region", bucket: "bucket", prefix: "prefix" } }
          let(:s3) { double(Aws::S3::Client.new) }
          let(:bucket) { double(Aws::S3::Bucket) }
          let(:objects) { double(Aws::Resources::Collection) }
          let(:files) { double(Aws::Resources::Collection) }
          let(:object) { double(Aws::S3::ObjectSummary) }
          let(:file1) { double(Aws::S3::ObjectSummary) }
          let(:file2) { double(Aws::S3::ObjectSummary) }
  •       allow(file1).to receive(:content_length).and_return(100)
          allow(file2).to receive(:last_modified).and_return(Time.at(123_123_122))
          allow(file2).to receive(:content_length).and_return(200)
        end

        it "backup prober responds with valid data" do
          prober = GitLab::Monitor::WalProber.new(options)
          prober.probe_wal_info

          prober.write_to(writer)

          expect(writer.string).to match(/gitlab_wal_size 300/)
          expect(writer.string).to match(/gitlab_wal_time 123123123/)
        end
      end
    end
  • @maratkalibek the tests are failing and the old conversations are not resolved. Please fix all of that before further review.

  • username-removed-475751 marked as a Work In Progress


  • @maratkalibek why was this closed?

  • @ernstvn this one was closed because we tried a different approach, but it's not responsive enough.

    @maratkalibek could you point at the right MR?

  • Yes, as @pcarranza mentioned. I installed CloudWatch, but it did not work for us because it returns metrics only once every 24 hours.

    Even the approach I am trying here is not optimized for WAL backups: it iterates over 80-90k files to calculate their total size, which results in a 5-6 minute run, and that is for less than one month of data. The size of, and time needed for, the WAL backups will grow linearly over time.
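    Since WAL archives are append-only, one way to avoid rescanning all 80-90k keys on every probe run would be to cache the running total and the last key seen, then list only newer keys via list_objects_v2's :start_after. A hypothetical sketch, not part of this MR (the first run still pays for a full scan, and the cache assumes keys are only ever appended):

```ruby
# Hypothetical incremental size cache for an append-only prefix.
# The low-level client is injected so it can be stubbed in tests.
class WalSizeCache
  def initialize(client, bucket, prefix)
    @client = client # e.g. Aws::S3::Client.new(region: "eu-central-1")
    @bucket = bucket
    @prefix = prefix
    @total = 0
    @last_key = nil
  end

  # Adds the sizes of keys newer than the last one seen, then
  # returns the cumulative total.
  def refresh
    token = nil
    loop do
      params = { bucket: @bucket, prefix: @prefix }
      params[:start_after] = @last_key if @last_key
      params[:continuation_token] = token if token
      resp = @client.list_objects_v2(params)
      resp.contents.each do |o|
        @total += o.size
        @last_key = o.key
      end
      break unless resp.is_truncated
      token = resp.next_continuation_token
    end
    @total
  end
end
```

    Subsequent probe runs then cost one list request per thousand new WAL segments instead of a walk over the whole prefix.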
