I like the idea of Akismet. Will we enable it globally on GitLab.com or give projects the option to opt in or out? I like the idea of giving projects an option. Private projects don't need any spam checking, but internal and public projects would benefit. We could automatically opt in internal/public projects and opt out private ones? Having a choice could make it cheaper for us since the number of checks against Akismet will be lower.
Do we need to amend terms/conditions to disclose if we start sending comments to Akismet? Some may be concerned about privacy.
Users of CE/EE will have to register with Akismet and enable it themselves.
I will copy @stanhu's comment here as I think it's an important consideration. At some point we may need a way for a project master/owner to review things so we aren't discarding valid comments.
From @stanhu in #5573 (moved) - One quick thing we could do, however, would be to integrate Akismet and discard any messages that return as blatant spam before allowing the issue/comment/etc. to be saved.
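To make the "discard blatant spam before saving" idea concrete, here is a minimal Python sketch. It assumes Akismet's `comment-check` call with the key-as-subdomain URL form and the `X-akismet-pro-tip: discard` response header that Akismet documents for blatant spam; check the current Akismet docs before relying on either. The `transport` callable, the `comment` dict keys, and the function names are hypothetical, and treating an unexpected response as ham is an assumption, not a recommendation.

```python
from typing import Callable, Mapping, Tuple

# Hypothetical transport: takes (url, form_fields), returns (body, headers).
# In a real integration this would perform an HTTP POST.
Transport = Callable[[str, Mapping[str, str]], Tuple[str, Mapping[str, str]]]


def akismet_verdict(body: str, headers: Mapping[str, str]) -> str:
    """Map a comment-check response to 'ham', 'spam', or 'discard'."""
    normalized = {key.lower(): value for key, value in headers.items()}
    if body.strip() != "true":
        # Assumption: anything other than "true" is treated as not spam.
        return "ham"
    if normalized.get("x-akismet-pro-tip", "").lower() == "discard":
        return "discard"  # blatant spam, safe to drop without review
    return "spam"  # ordinary spam, could be kept for a master/owner to review


def should_save(comment: Mapping[str, str], api_key: str, site_url: str,
                transport: Transport) -> bool:
    """Return False only when Akismet marks the comment as blatant spam."""
    url = f"https://{api_key}.rest.akismet.com/1.1/comment-check"
    fields = {
        "blog": site_url,
        "user_ip": comment["user_ip"],
        "user_agent": comment["user_agent"],
        "comment_type": "comment",
        "comment_content": comment["content"],
    }
    body, headers = transport(url, fields)
    return akismet_verdict(body, headers) != "discard"
```

Only the "discard" verdict blocks the save here, so ordinary spam verdicts remain available for a later review step rather than being silently dropped.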
Letting projects opt out means we could still be abused by spammers who create clusters of keyword-ranked content that attract clicks from Google users and then redirect them to the desired website.
If Akismet raises privacy concerns, we could use this database either to reject content outright or to send only content from "suspected" origins to Akismet: http://www.stopforumspam.com/
Could the spam filtering be built as a sort of API, maybe using a hook?
If a user creates a spam issue / merge request, maybe your GitLab host posts to a hook, which can then decide if the issue / merge request is spam and delete it / remove the user?
I think this would be cool, as it opens the possibility of people integrating a variety of spam fighting solutions.
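A rough Python sketch of what the receiving end of such a hook might look like, purely as an illustration. The payload fields (`object_kind`, `id`, `description`), the `decide` function, and the `too_many_links` checker are all hypothetical; nothing here reflects an existing GitLab hook format.

```python
import json
from typing import Callable, Iterable

# A checker takes the decoded hook event and returns True if it looks like spam.
Checker = Callable[[dict], bool]


def too_many_links(event: dict, limit: int = 3) -> bool:
    """Hypothetical example checker: flag descriptions stuffed with links."""
    text = event.get("description", "")
    return text.count("http://") + text.count("https://") > limit


def decide(payload_json: str, checkers: Iterable[Checker]) -> dict:
    """Run every pluggable checker and tell the GitLab host what to do."""
    event = json.loads(payload_json)
    is_spam = any(check(event) for check in checkers)
    return {
        "object_kind": event.get("object_kind"),
        "object_id": event.get("id"),
        "action": "delete" if is_spam else "allow",
    }
```

Because the checkers are just a list of callables, an Akismet lookup, a Stop Forum Spam lookup, or a custom rule could each be plugged in without changing the hook itself.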
@matthew.wilkinson That's definitely an interesting idea. It would be nice to have some flexibility. In the short term I think we should integrate Akismet quickly because the problem is so bad on GitLab.com. We may be able to introduce more flexibility later. I suspect spam is not a huge issue for most people running CE/EE themselves because most of those instances are not open for public sign-up.
Makes sense @dblessing! Though maybe the hooks could be used to help GitLab.com, as it will be attacked from lots of directions as it becomes more popular (maybe through spam merge requests / code snippets / wikis / projects...), not just issues? Sorry for butting in, it's interesting to read how you guys are tackling this!
@sytses
I agree that it should be a server-wide setting, but I think it would make sense to limit it to public and internal projects.
Spammers can't access private projects, and this way no information about those gets sent to an external server.
It also reduces the number of Akismet lookups :)
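As a small illustration of that rule, a Python sketch of the gating check. The function name and the visibility strings are assumptions made for the example, not existing GitLab identifiers.

```python
# Visibility levels whose content would be sent to the external spam service.
CHECKED_VISIBILITIES = {"public", "internal"}


def should_check_spam(akismet_enabled: bool, visibility: str) -> bool:
    """Server-wide switch first, then skip private projects entirely."""
    return akismet_enabled and visibility in CHECKED_VISIBILITIES
```

With this shape, content from private projects never reaches the external server, and each skipped check is one fewer Akismet lookup.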