Profanity Filter Test Report

The state of profanity filtering

It is no stretch to say that effective profanity filtering is a problem programmers and researchers have struggled with for decades. Many different approaches have been taken over the years, ranging from the basic use of a blacklist for detection all the way up to, more recently, machine learning. Lacking the resources and experience for a machine learning approach, we have turned our focus to a creative language-processing design.

That is the nature of this report, and it shall be most instructive to ascertain who is the victor in our little competition.

The need for a new profanity filter

As many BCD members will be aware, the profanity filter for our own homegrown moderation bot Sentry has been in development for some time. This profanity filter is meant to replace Filtob, our current (also homegrown) profanity filter bot.

Of course, an important consideration for this endeavour is efficacy. Filtob is arguably extremely effective, but it often produces false positives (clean messages wrongly flagged), to the annoyance of many of our members. Other filter bots on the Discord market typically suffer from fewer false positives and more false negatives (profane messages that slip through); that is, they are more easily bypassed. The trade-off between false positives and false negatives becomes an important one when considering the implementation of a profanity filter.
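To make the trade-off concrete, here is a minimal sketch of how the two error rates could be measured against a labelled set of messages. Everything in it is an assumption for illustration only: the function names, the mild stand-in word "darn", and the four sample messages are hypothetical, and neither toy filter represents Filtob, Sentry, or any contest entry.

```python
from typing import Callable, Iterable, Tuple

BLOCKED_WORD = "darn"  # hypothetical stand-in for a blacklisted word


def substring_filter(text: str) -> bool:
    """Strict toy filter: flags any message containing the blocked letters."""
    return BLOCKED_WORD in text.lower()


def whole_word_filter(text: str) -> bool:
    """Lenient toy filter: flags only when the blocked word stands alone."""
    return BLOCKED_WORD in text.lower().split()


def error_rates(
    filter_fn: Callable[[str], bool],
    samples: Iterable[Tuple[str, bool]],
) -> Tuple[float, float]:
    """Return (false positive rate, false negative rate) for a filter.

    Each sample is (message, is_profane). The false positive rate is the
    share of clean messages that were flagged; the false negative rate is
    the share of profane messages that were missed.
    """
    false_pos = false_neg = clean = profane = 0
    for text, is_profane in samples:
        flagged = filter_fn(text)
        if is_profane:
            profane += 1
            if not flagged:
                false_neg += 1
        else:
            clean += 1
            if flagged:
                false_pos += 1
    fp_rate = false_pos / clean if clean else 0.0
    fn_rate = false_neg / profane if profane else 0.0
    return fp_rate, fn_rate


# Hypothetical labelled messages: (text, is_profane)
SAMPLES = [
    ("darn it", True),
    ("d a r n", True),          # spaced-out bypass attempt
    ("darning socks", False),   # innocent word containing the blocked letters
    ("hello there", False),
]

print(error_rates(substring_filter, SAMPLES))   # (0.5, 0.5)
print(error_rates(whole_word_filter, SAMPLES))  # (0.0, 0.5)
```

On this toy data the stricter substring match wrongly flags "darning socks", while the whole-word match avoids that false positive; both miss the spaced-out bypass. A real comparison would of course need a far larger and more varied dataset.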

So the challenge for the new profanity filter is to filter profanity effectively, but in a way that also optimises the member experience. Perhaps this must inevitably result in a more lenient filter (approaching the norm for other filters in existence), but we are hopeful that we can still retain a much higher efficacy than these other filters.

The contest

Two programming-minded members of BCD, King181 and SneezingCactus, recently became interested in the development of the new profanity filter. It was suggested in correspondence between King181, SneezingCactus and myself that we should each design our own attempt at the profanity filter and compare the results. I formalised this challenge by inviting each of them to submit their code to me, so that I could test its efficacy using a dataset (described below).

Dataset

The dataset