uk.ac.cam.juliet.twitter.analysis
Class ProfanityFilter

java.lang.Object
  extended by uk.ac.cam.juliet.twitter.analysis.ProfanityFilter

public class ProfanityFilter
extends java.lang.Object

This class helps filter strings that contain offensive content.

Author:
Ahmad Akra

Field Summary
(package private)  java.lang.String badRegex
          This regular expression is constructed by OR-ing together all the bad regular expressions in the database, in the form ".*(w1|w2|w3...).*", and is matched against a string to determine whether it contains offensive language.
(package private)  IDatabase db
          The database from which to load the bad regular expressions.
 
Constructor Summary
ProfanityFilter(IDatabase db)
          Constructs a filter that loads its bad regular expressions from the given database.
 
Method Summary
 boolean isOffensive(java.lang.String str)
          Attempts to check whether a string contains offensive language, by checking against a library of bad words.
static void main(java.lang.String[] args)
          main method for quick testing
 void reloadBadwords()
          Reloads the bad words from the database into the in-memory cache. Because loading them on every call to isOffensive(String) would be expensive, the caller is responsible for invoking this method whenever the bad words need refreshing.
Note: this method is thread-safe.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

badRegex

java.lang.String badRegex
This regular expression is constructed by OR-ing together all the bad regular expressions in the database, in the form ".*(w1|w2|w3...).*", and is matched against a string to determine whether it contains offensive language.
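
The construction described above can be sketched as follows. This is an illustration only, not the class's actual code: the class name BadRegexSketch, the method buildBadRegex, and the sample patterns are hypothetical, and the handling of an empty pattern list is an assumption (the source does not say what happens in that case).

```java
import java.util.List;
import java.util.regex.Pattern;

public class BadRegexSketch {

    // Joins the individual patterns into the form ".*(w1|w2|w3...).*".
    // The name and signature are hypothetical.
    public static String buildBadRegex(List<String> badPatterns) {
        if (badPatterns.isEmpty()) {
            // Assumption: with no patterns, nothing should be flagged.
            // ".*().*" would match every string, so return "(?!)",
            // an empty negative lookahead that never matches.
            return "(?!)";
        }
        return ".*(" + String.join("|", badPatterns) + ").*";
    }

    public static void main(String[] args) {
        // Placeholder patterns standing in for rows loaded from the database.
        String regex = buildBadRegex(List.of("darn", "heck+"));
        System.out.println(regex);                                 // .*(darn|heck+).*
        System.out.println(Pattern.matches(regex, "oh heckk no")); // true
        System.out.println(Pattern.matches(regex, "hello"));       // false
    }
}
```

The leading and trailing ".*" are needed because Pattern.matches requires the whole input to match. By default "." does not match line terminators, so a multi-line input would need Pattern.DOTALL; whether the real class sets that flag is not stated here.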


db

IDatabase db
The database from which to load the bad regular expressions.

Constructor Detail

ProfanityFilter

public ProfanityFilter(IDatabase db)
Constructs a filter that loads its bad regular expressions from the given database.

Parameters:
db - the database from which to load the bad regular expressions
Method Detail

reloadBadwords

public void reloadBadwords()
Reloads the bad words from the database into the in-memory cache. Because loading them on every call to isOffensive(String) would be expensive, the caller is responsible for invoking this method whenever the bad words need refreshing.
Note: this method is thread-safe.
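
One way such a cache with a thread-safe reload could look is sketched below. It is not the actual implementation: the class ReloadSketch is hypothetical, a Supplier stands in for IDatabase (whose methods are not shown on this page), and the choice of a volatile field plus a synchronized reload is an assumption about how thread safety might be achieved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.regex.Pattern;

public final class ReloadSketch {

    // Hypothetical stand-in for the database lookup.
    private final Supplier<List<String>> badWordSource;

    // volatile: a pattern published by a reloading thread is visible to all readers.
    private volatile Pattern cached;

    public ReloadSketch(Supplier<List<String>> badWordSource) {
        this.badWordSource = badWordSource;
        reloadBadwords(); // assumption: the cache is filled once at construction
    }

    // synchronized: concurrent reloads run one at a time; readers never block.
    public synchronized void reloadBadwords() {
        List<String> words = badWordSource.get();
        cached = words.isEmpty()
                ? Pattern.compile("(?!)") // never matches
                : Pattern.compile(".*(" + String.join("|", words) + ").*");
    }

    // Uses only the cached pattern, so no database access happens here.
    public boolean isOffensive(String str) {
        return cached.matcher(str).matches();
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(List.of("darn"));
        ReloadSketch filter = new ReloadSketch(() -> List.copyOf(words));

        System.out.println(filter.isOffensive("well heck")); // false
        words.add("heck");                                   // the "database" changes
        System.out.println(filter.isOffensive("well heck")); // false, cache is stale
        filter.reloadBadwords();                             // caller triggers the reload
        System.out.println(filter.isOffensive("well heck")); // true
    }
}
```

The usage in main shows the trade-off the description implies: checks stay cheap, but a change in the database takes effect only after the caller invokes reloadBadwords().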


isOffensive

public boolean isOffensive(java.lang.String str)
Attempts to check whether a string contains offensive language, by checking against a library of bad words.

Parameters:
str - input string to test for offensive content
Returns:
true if content is offensive, false otherwise

main

public static void main(java.lang.String[] args)
                 throws java.sql.SQLException,
                        java.lang.ClassNotFoundException
main method for quick testing

Parameters:
args - command-line arguments
Throws:
java.sql.SQLException
java.lang.ClassNotFoundException