Banzai::Filter::ExternalLinkFilter is slow
This filter is used for processing external links. Basically all it does is:
- Iterate over all `a` tags
- For every `a` tag containing an HTTP URL, add `rel="nofollow noreferrer"` and `target="_blank"`
Because of this it's rather odd that as per http://performance.gitlab.net/dashboard/db/markdown-filter-methods?var-process_type=rails&var-method=Banzai%3A%3AFilter%3A%3AExternalLinkFilter%23call it spends between 200 and 800 milliseconds doing this, with a peak going as far as 5.5 seconds (for the 99th percentile).
Looking at the code (`lib/banzai/filter/external_link_filter.rb`) I can think of two things to start with:
- Use XPath instead of CSS to get `a` tags and use a predicate in this XPath query to only match links starting with `http` (something like `a[starts-with(@href, 'http')]` if I'm not mistaken)
- Make sure this is cached in Redis (just like the other HTML) so we don't do this work on every request to the same page
Using the XPath query basically means we can remove the `next unless link` and `next unless link.start_with?('http')` lines.