Don't redact Markdown documents that don't contain private information
Currently our Markdown pipeline more or less works as follows:
- Render the Markdown to HTML, store the result in a database column
- The next time we want to get this data, just get the HTML instead of the Markdown
- Parse the HTML string into an HTML document
- Redact the document
- Serialize the document back to a String
In most cases a document won't contain any information that has to be redacted (e.g. a link to a private issue). If we were to store some kind of boolean (e.g. `has_private_references`) for every field, we could change the setup to the following:
- Render Markdown to HTML, cache it
- On the next request, if `has_private_references` is false, we just display the HTML string as-is, without parsing
- If it's true, parse and redact
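
As a rough illustration of the proposed flow, here is a minimal Python sketch. It is not the actual pipeline code: `CachedField`, `cache_rendered`, `redact`, `html_for_display`, and the `data-private` attribute are all hypothetical names made up for this example, and a regex stands in for the real parse/redact/serialize step.

```python
import re
from dataclasses import dataclass

# Assumption: links to private resources are marked with a hypothetical
# data-private="true" attribute at render time.
PRIVATE_LINK = re.compile(r'<a [^>]*data-private="true"[^>]*>(.*?)</a>', re.S)


@dataclass
class CachedField:
    html: str
    has_private_references: bool


def cache_rendered(html: str) -> CachedField:
    """Store the rendered HTML together with a flag computed once, at render time."""
    return CachedField(html=html, has_private_references=bool(PRIVATE_LINK.search(html)))


def redact(html: str, can_read_private: bool) -> str:
    """Stand-in for the expensive step: parse, redact, serialize."""
    if can_read_private:
        return html
    # Replace each private link with its plain text.
    return PRIVATE_LINK.sub(lambda m: m.group(1), html)


def html_for_display(field: CachedField, can_read_private: bool) -> str:
    if not field.has_private_references:
        # Fast path: nothing to redact, so return the cached string untouched.
        return field.html
    return redact(field.html, can_read_private)
```

The point of the sketch is only the branch in `html_for_display`: fields without private references never reach the redaction step at all.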
This could cut down response times for most issues/merge requests by quite a bit. One caveat is that if you refer to a public resource (e.g. an issue) which is then made private, the link will stay visible, because the stored flag is not updated. However, since the resource was public to begin with, I don't think this is a very big issue.