correct User-agent placement in robots.txt
What does this MR do?
fixes #26807 (closed)
Are there points in the code the reviewer needs to double check?
I've added a User-agent field to the top of each record in the file, per the specification:
The Format
The format and semantics of the "/robots.txt" file are as follows:
The file consists of one or more records separated by one or more blank lines (terminated by CR,CR/NL, or NL). Each record contains lines of the form "<field>:<optionalspace><value><optionalspace>". The field name is case insensitive.
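To make the quoted format concrete, here is a minimal sketch of a record splitter that follows it. The function name `parse_records` and the sample paths are hypothetical and not part of this MR; real crawlers may parse more leniently.

```python
def parse_records(text):
    """Split robots.txt text into records, each a list of (field, value) pairs."""
    records, current = [], []
    # str.splitlines() splits on CR, CR/NL and NL (among other separators).
    for raw in text.splitlines():
        if not raw.strip():
            # A blank line terminates the current record.
            if current:
                records.append(current)
                current = []
            continue
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue  # comment-only or malformed line
        field, value = line.split(":", 1)
        # Field names are case insensitive, so normalise them to lower case.
        current.append((field.strip().lower(), value.strip()))
    if current:
        records.append(current)
    return records


# Hypothetical sample: two records, each starting with its own User-agent line.
sample = (
    "User-Agent: *\n"
    "Disallow: /search\n"
    "\n"
    "user-agent: ExampleBot\n"
    "Disallow: /\n"
)
records = parse_records(sample)
for record in records:
    print(record)
```

Each record only makes sense to a crawler if it opens with a User-agent field, which is why this MR adds one to every record.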
Why was this MR needed?
Robots can currently crawl GitLab freely, even when they honor robots.txt.
The file is currently parsed as
User-agent: *
and nothing else is processed, because no instructions are attached to that user agent.
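The effect can be reproduced with Python's standard `urllib.robotparser`. This is a sketch under assumptions: the two file layouts, the `ExampleBot` agent, and the `gitlab.example.com` URL are hypothetical stand-ins, not the actual GitLab robots.txt, and other crawlers may behave differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, agent="ExampleBot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical layout: a blank line ends the record right after User-agent,
# so the Disallow rule belongs to no user agent and is ignored.
BROKEN = """\
User-agent: *

Disallow: /search
"""

# The User-agent line sits at the top of the same record as its rule.
FIXED = """\
User-agent: *
Disallow: /search
"""

URL = "https://gitlab.example.com/search"  # hypothetical host

broken_allowed = can_crawl(BROKEN, URL)
fixed_allowed = can_crawl(FIXED, URL)
print(broken_allowed, fixed_allowed)  # True False
```

With the broken layout the parser sees a user agent with no rules and allows everything; once the User-agent field heads the record, the Disallow rule is applied.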
Screenshots (if relevant)
Does this MR meet the acceptance criteria?
- Changelog entry added
- Documentation created/updated
- [ ] API support added
- Tests
  - [ ] Added for this feature/bug
  - [ ] All builds are passing
- [ ] Conform by the merge request performance guides
- [ ] Conform by the style guides
- Branch has no merge conflicts with master (if it does - rebase it please)
- Squashed related commits together