blob.rb

  • Forked from GitLab.org / GitLab FOSS
    Source project has a limited visibility.
      Handle encoding in non-binary Blob instances · 0bc443e3
      Yorick Peterse authored
      gitlab_git 10.6.4 relies on Rugged, instead of Linguist, to mark blobs
      as binary or not. Linguist, in turn, would mark text blobs as binary
      whenever they contained byte sequences that could not be encoded
      using UTF-8.
      
      However, marking such blobs as binary is not correct. If one pushes a
      Markdown document with invalid character sequences, it is still a
      text-based Markdown document and not some random binary blob.
      
      This commit overrides Blob#data so it automatically converts text-based
      content to UTF-8 (the encoding we use everywhere else) while taking care
      of replacing any invalid sequences with the UTF-8 replacement character.
      The data of binary blobs is left as-is.
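
The behaviour described above can be illustrated with a minimal sketch. This is not the code from this commit: `SketchBlob` and its `binary:` keyword are hypothetical names made up for illustration, and the use of `String#force_encoding` plus `String#scrub` is an assumption about one way to get the described result with the Ruby standard library.

```ruby
# Hypothetical stand-in for a blob object; NOT the actual Blob class
# touched by this commit.
class SketchBlob
  # U+FFFD, the Unicode replacement character.
  REPLACEMENT = "\uFFFD"

  # raw    - the blob content as a String (any encoding).
  # binary - whether the blob was marked as binary (assumed to be
  #          decided elsewhere, e.g. by the Git library).
  def initialize(raw, binary:)
    @raw = raw
    @binary = binary
  end

  def binary?
    @binary
  end

  # Returns binary content untouched. Text content is returned as a new
  # UTF-8 String in which every invalid byte sequence has been replaced
  # by the replacement character.
  def data
    return @raw if binary?

    # dup first so the caller's string is never mutated.
    @raw.dup.force_encoding(Encoding::UTF_8).scrub(REPLACEMENT)
  end
end

# A Latin-1 encoded "café": the trailing 0xE9 byte is not valid UTF-8.
text = SketchBlob.new("caf\xE9".b, binary: false)
puts text.data.valid_encoding?

# Binary content (here the first bytes of a JPEG) is handed back as-is.
image = SketchBlob.new("\xFF\xD8\xFF\xE0".b, binary: true)
puts image.data.encoding
```

Scrubbing, rather than transcoding from a guessed source encoding, keeps the sketch short; it loses the original character but always yields valid UTF-8 that can be rendered safely.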