
RAKE TASK: Added rake task to cleanup/maintain (git gc) repositories

Closed gitlab-qa-bot requested to merge github/fork/elvanja/git-cleanup-rake-task into master

Created by: elvanja

Performs git gc on repositories for all projects.
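
The merge request's Ruby code is not shown on this page, so the following is only a rough Python sketch of the same idea: find every repository under a root directory and run `git gc` in each. It assumes repositories are directories whose names end in `.git`; the function names and the injectable `run` parameter are hypothetical and are not taken from the rake task.

```python
import subprocess
from pathlib import Path


def find_bare_repos(root):
    """Return directories under root whose names end in '.git', sorted by path.

    Assumption: every such directory is a repository we want to maintain.
    """
    return sorted(p for p in Path(root).rglob("*.git") if p.is_dir())


def gc_all(root, run=subprocess.run):
    """Run `git gc --quiet` inside every repository found under root.

    `run` is injectable so the loop can be exercised without invoking git.
    Returns the list of repositories that were processed.
    """
    repos = find_bare_repos(root)
    for repo in repos:
        # check=True makes a failing gc raise instead of being silently ignored.
        run(["git", "gc", "--quiet"], cwd=repo, check=True)
    return repos
```

Calling `gc_all("/path/to/repositories")` with the default `run` requires `git` on the `PATH`; the path here is a placeholder.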


Activity

  • Created by: riyad

    On a more general note: does this have any (real) effect on the repo size (on the server)?

    I thought Git already minimizes/compresses/packs objects (i.e. commits, trees, blobs) before pushing/pulling them to/from a remote. So from GitLab's point of view it should only have received packs of compressed data, i.e. there should actually be no loose objects in the repos (on the server).

    So my question is: what is the benefit? Compression over larger chunks (i.e. multiple packs) of data? Fewer .idx/.pack files? Is it measurable?

    By Administrator on 2012-09-12T14:41:35 (imported from GitLab project)
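
One way to approach the "is it measurable?" question is to snapshot `git count-objects -v` before and after a gc run and compare the numbers. The sketch below is not part of the merge request; the helper names (`parse_count_objects`, `repo_stats`, `gc_savings`) are hypothetical, and the sample values in the usage note are made up for illustration.

```python
import subprocess


def parse_count_objects(text):
    """Turn `git count-objects -v` output ("key: value" lines) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value.strip())
    return stats


def repo_stats(repo):
    """Run `git count-objects -v` inside repo and return the parsed numbers."""
    out = subprocess.run(
        ["git", "count-objects", "-v"],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    return parse_count_objects(out)


def gc_savings(before, after):
    """Difference between two snapshots (before minus after).

    `count` and `packs` are object/pack counts; `size` and `size-pack`
    are reported by git in KiB.
    """
    keys = ("count", "size", "packs", "size-pack")
    return {k: before.get(k, 0) - after.get(k, 0) for k in keys}
```

A measurement would then be `before = repo_stats(repo)`, run `git gc`, `after = repo_stats(repo)`, and inspect `gc_savings(before, after)`. A positive `packs` value means fewer packfiles after the run.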


  • Created by: riyad

    From http://git-scm.com/book/en/Git-Internals-Packfiles :

    The initial format in which Git saves objects on disk is called a loose object format. However, occasionally Git packs up several of these objects into a single binary file called a packfile in order to save space and be more efficient. Git does this if you have too many loose objects around, if you run the git gc command manually, or if you push to a remote server.

    By Administrator on 2012-09-12T14:46:53 (imported from GitLab project)
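
To see the loose-versus-packed distinction from the quote on disk, one can count files in a repository's `objects` directory. This is a minimal sketch that assumes the standard layout: loose objects live in subdirectories named with two hex characters, and packfiles are `*.pack` files under `objects/pack`. The function name `object_layout` is hypothetical.

```python
import re
from pathlib import Path

# Loose objects are fanned out into directories named by the first two
# hex characters of the object id.
_FANOUT = re.compile(r"[0-9a-f]{2}")


def object_layout(objects_dir):
    """Count loose objects and packfiles under a repository's objects directory.

    For a bare repository this is `<repo>/objects`; for a working copy it is
    `<repo>/.git/objects`.
    """
    objects_dir = Path(objects_dir)
    loose = sum(
        1
        for d in objects_dir.iterdir()
        if d.is_dir() and _FANOUT.fullmatch(d.name)
        for f in d.iterdir()
        if f.is_file()
    )
    pack_dir = objects_dir / "pack"
    packs = len(list(pack_dir.glob("*.pack"))) if pack_dir.is_dir() else 0
    return {"loose": loose, "packs": packs}
```

A repository that has only ever received pushes would be expected to show few or no loose objects here, which is the point raised in the comment above; running this on a real server repository is the way to check.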


  • Created by: elvanja

    Regarding cleanup, the idea was more along the lines of removing stale branches from the repositories. Some material on the subject:

      http://devblog.springest.com/refreshing-stale-remote-branches-with-git-prune
      http://alblue.bandlem.com/2011/11/git-tip-of-week-gc-and-pruning-this.html
      http://cosicimiento.blogspot.com/2010/09/git-prune-to-remove-old-remote-tracking.html

    The git gc documentation (http://www.kernel.org/pub/software/scm/git/docs/git-gc.html) states that prune is on by default, but maybe this could be conveyed more directly in the rake task itself. At my workplace we are still experimenting with this, but the rake task is working, if anybody wants to use it :-)

    By Administrator on 2012-09-12T17:56:25 (imported from GitLab project)


  • Created by: riyad

    I've been reading up on the topic and it seems you might be able to clean some stuff up on the server side, if you happen to use a lot of branches that get pushed but don't get merged. So at some point you would want to clean up your branches and also delete unnecessary objects. But I'm not sure how GitLab would handle e.g. merge requests where not only the (unmerged) branch might get deleted but also the underlying commits are GCed. Probably @randx would know best.

    By Administrator on 2012-09-27T22:03:35 (imported from GitLab project)


  • Created by: dzaporozhets

    Thank you, but this one is not going into core.

    By Administrator on 2013-01-16T12:47:24 (imported from GitLab project)


  • Created by: elvanja

    no problem :-)

    By Administrator on 2013-01-16T13:59:33 (imported from GitLab project)

