We have quite a big project on our GitLab instance and tried the new codeclimate feature.
After running codeclimate for the first time it generated a 25MB codeclimate.json file.
If we now open a new merge request, the browser becomes unresponsive after a few seconds,
presumably while trying to diff both files.
As a customer, I would expect the codeclimate feature to also handle/analyze bigger files.
Is it possible to generate a stripped codeclimate.json that contains only metrics and offenses?
@ayufan Looking at the JSON, it actually contains only metrics and offenses. I believe the content -> body section eats a lot of space. Given the huge number of issues in the project, this repeated text can easily be 50% of the file size. See example:
[{"type":"Issue","check_name":"file_access","description":"Model attribute used in file name","fingerprint":"87526ba3cf2b7e2ee9f0f0ea555c84ae4031b353fd8ced591d7d7ff907a4e1eb","categories":["Security"],"severity":"normal","remediation_points":300000,"location":{"path":"lib/backup/repository.rb","lines":{"begin":127,"end":127}},"content":{"body":"Using user input when accessing files (local or remote) will raise a warning in Brakeman.\n\nFor example\n\n File.open(\"/tmp/#{cookie[:file]}\")\n\nwill raise an error like\n\n Cookie value used in file name near line 4: File.open(\"/tmp/#{cookie[:file]}\")\n\nThis type of vulnerability can be used to access arbitrary files on a server (including `/etc/passwd`.\n"},"engine_name":"brakeman"},{"type":"Issue","check_name":"regex_dos","description":"Model attribute used in regex","fingerprint":"b0f801806cb42e892d9dd36783914d6d1b2f4d1faf4a3df3e5688186e655f91c","categories":["Security"],"severity":"normal","remediation_points":300000,"location":{"path":"app/models/commit.rb","lines":{"begin":91,"end":91}},"content":{"body":"Denial of Service (DoS) is any attack which causes a service to become unavailable for legitimate clients.\n\nFor issues that Brakeman detects, this typically arises in the form of memory leaks.\n\n### Symbol DoS\n\nSince Symbols are not garbage collected in Ruby versions prior to 2.2.0, creation of large numbers of Symbols could lead to a server running out of memory.\n\nBrakeman checks for instances of user input which is converted to a Symbol. When this is not restricted, an attacker could create an unlimited number of Symbols.\n\nThe best approach is to simply never convert user-controlled input to a Symbol. If this cannot be avoided, use a whitelist of acceptable values.\n\nFor example:\n\n valid_values = [\"valid\", \"values\", \"here\"]\n\n if valid_values.include? 
params[:value]\n symbolized = params[:value].to_sym\n end\n\n\n### Regex DoS\n\nRegular expressions can be used for DoS if the pattern and input requires exponential time to process.\n\nBrakeman will warn about dynamic regular expressions which may be controlled by an attacker. The attacker can create an \"[evil regex](https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS)\" and then supply input which causes the server to use a large amount of resources.\n\nIt is recommended to avoid interpolating user input into regular expressions.\n"},"engine_name":"brakeman"},{"type":"Issue","check_name":"regex_dos","description":"Model attribute used in regex","fingerprint":"3ea396be233ceab55e796919e67b801c058713ba69d8d24151df13b14ab2d5fc","categories":["Security"],"severity":"normal","remediation_points":300000,"location":{"path":"app/models/commit_range.rb","lines":{"begin":45,"end":45}},"content":{"body":"Denial of Service (DoS) is any attack which causes a service to become unavailable for legitimate clients.\n\nFor issues that Brakeman detects, this typically arises in the form of memory leaks.\n\n### Symbol DoS\n\nSince Symbols are not garbage collected in Ruby versions prior to 2.2.0, creation of large numbers of Symbols could lead to a server running out of memory.\n\nBrakeman checks for instances of user input which is converted to a Symbol. When this is not restricted, an attacker could create an unlimited number of Symbols.\n\nThe best approach is to simply never convert user-controlled input to a Symbol. If this cannot be avoided, use a whitelist of acceptable values.\n\nFor example:\n\n valid_values = [\"valid\", \"values\", \"here\"]\n\n if valid_values.include? params[:value]\n symbolized = params[:value].to_sym\n end\n\n\n### Regex DoS\n\nRegular expressions can be used for DoS if the pattern and input requires exponential time to process.\n\nBrakeman will warn about dynamic regular expressions which may be controlled by an attacker. 
The attacker can create an \"[evil regex](https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS)\" and then supply input which causes the server to use a large amount of resources.\n\nIt is recommended to avoid interpolating user input into regular expressions.\n"},"engine_name":"brakeman"}]
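A minimal sketch of the stripping idea, assuming the report is a JSON array of issue objects shaped like the example above. The helper name `strip_issue_bodies` and the placeholder body text are hypothetical, and this is not how GitLab or the codeclimate CLI necessarily does it; it only drops the `content` key from each issue and compares the serialized sizes.

```python
import json


def strip_issue_bodies(issues):
    """Return a copy of the issue list with each issue's "content" key removed.

    Hypothetical helper: assumes every element is a dict like the ones in
    the codeclimate.json example (type, check_name, fingerprint, ...).
    """
    return [
        {key: value for key, value in issue.items() if key != "content"}
        for issue in issues
    ]


# Sample issue modeled on the example above; the body is placeholder text
# standing in for the long, repeated Brakeman explanation.
sample = [
    {
        "type": "Issue",
        "check_name": "regex_dos",
        "description": "Model attribute used in regex",
        "fingerprint": "b0f801806cb42e892d9dd36783914d6d1b2f4d1faf4a3df3e5688186e655f91c",
        "categories": ["Security"],
        "severity": "normal",
        "location": {"path": "app/models/commit.rb", "lines": {"begin": 91, "end": 91}},
        "content": {"body": "placeholder explanatory text " * 50},
        "engine_name": "brakeman",
    }
]

stripped = strip_issue_bodies(sample)
before = len(json.dumps(sample))
after = len(json.dumps(stripped))
print(f"serialized size: {before} -> {after} characters")
```

Everything needed to identify an issue (fingerprint, location, description) is kept, so a comparison between two reports would still work on the stripped file.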
As an alternative to the current implementation, where the JSON is compared on the client side, we can do it on the server side by downloading the artifacts, running a diff command and storing the result somewhere. Then we send it straight to the FE.
It will be way faster but more complex, since we need to download the artifacts on the server side, run the diff command and save the diff result somewhere for further use by the frontend.
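The comparison step of such a server-side approach could look roughly like the sketch below. It assumes both reports are already downloaded and parsed into lists of issue objects, and that the `fingerprint` field seen in the example is a stable identity for an issue. The function name `diff_reports` and the short fingerprints are hypothetical, and downloading artifacts and storing the result are left out.

```python
def diff_reports(base_issues, head_issues):
    """Compare two parsed codeclimate reports by issue fingerprint.

    Assumption: "fingerprint" uniquely identifies an issue across runs.
    Returns the issues only present in head ("new") and the issues only
    present in base ("resolved").
    """
    base = {issue["fingerprint"]: issue for issue in base_issues}
    head = {issue["fingerprint"]: issue for issue in head_issues}
    return {
        "new": [head[fp] for fp in head if fp not in base],
        "resolved": [base[fp] for fp in base if fp not in head],
    }


# Hypothetical, shortened fingerprints for illustration only.
base_report = [
    {"fingerprint": "aaa", "check_name": "file_access"},
    {"fingerprint": "bbb", "check_name": "regex_dos"},
]
head_report = [
    {"fingerprint": "bbb", "check_name": "regex_dos"},
    {"fingerprint": "ccc", "check_name": "regex_dos"},
]

result = diff_reports(base_report, head_report)
print([issue["fingerprint"] for issue in result["new"]])
print([issue["fingerprint"] for issue in result["resolved"]])
```

Only the (usually small) `new` and `resolved` lists would then need to reach the frontend, instead of two full report files.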
@dzaporozhets It went from 25.5 MB raw to 6.2 MB. The browser (Chrome) still becomes unresponsive for about 15-20 seconds, but it's much better than before (obviously).
The customer has shared a network profile of the page request. Strangely, it appears to load the codeclimate.json file twice in sequence, similar to what is described in https://gitlab.com/gitlab-org/gitlab-ee/issues/2910.