Profanity filter for MMO chat

32

We are developing an MMO using Smartfox Server. The target audience is children aged 7 to 12.

This MMO has a global chat option.
Whatever a user types into the text box is displayed next to their avatar after they press Enter.

We would like to filter abusive language / profanities from this chat.
We could capture the chat and read the text. The problem is getting the list of profanities itself.

Our questions are:

Where can one get an exhaustive list of profanities?
What methods are used in similar scenarios to filter them out?

mmo

— naveen

17

Good luck with the Scunthorpe Problem.

— Cyclops

7

@yetanothercoder, my point is, filtering is a hard problem. For instance, will your game have any events on Saturday? Will players be able to type the word "Saturday" (note the middle four letters) into their chatboxes? (And I don't know why the downvote either. It's not a bad question, but there may not be a simple answer.)

— Cyclops

6

And it gets even more complicated when more languages come into play. For example: Starcraft 2 removes "weniger" from chat, which is just the German word for "less"...

— bummzack

4

Another problem I encountered frequently when I was young and playing filtered MMOs was that they're based on the English language. So if I spoke French, some decent French words would get censored because they looked like English curses, and in any case, I could still curse in French all I wanted.

— Xeon06

2

From what I've seen, the most important part of making a good filter is having an option to turn it off. If you have no option, and players know they have no choice but to be censored, they WILL circumvent the censor. If you make it easy for them to turn it off, chances are they will cease to circumvent it, and those who do not wish to experience harsh language will not have to deal with the people who are trying to circumvent the filter.

— Michael Zehnich

46

Don't.

Filters don't work. At least, filters alone don't work. Whitelists, blacklists, it doesn't matter. Neither will ever prevent kids from harassing each other. The only way to make this work would be to not filter the chat at all, but to provide large building blocks for sentences. For example, a kid might select "Do you want to..." and the options for "go to..." and "trade..." would be pulled up. Selecting "go to..." would bring up a list of places in the game.
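As a rough illustration of the building-block idea, here is a minimal Python sketch. The phrase tree, the place names, and the function names `options` and `build_message` are all made up for the example; a real game would load its own phrases and validate the chosen path on the server.

```python
# Hypothetical phrase tree: each key is a fragment a player may pick,
# each value holds the fragments allowed to follow it (empty dict = end).
PHRASE_TREE = {
    "Do you want to...": {
        "go to...": {"the Castle": {}, "the Market": {}},
        "trade...": {"coins": {}, "hats": {}},
    },
    "Hello!": {},
}


def options(path):
    """Return the fragments a player may pick after the given path."""
    node = PHRASE_TREE
    for fragment in path:
        node = node[fragment]  # KeyError means the path is not allowed
    return sorted(node)


def build_message(path):
    """Validate a full path and join it into a single chat line."""
    node = PHRASE_TREE
    for fragment in path:
        if fragment not in node:
            raise ValueError(f"fragment not allowed here: {fragment!r}")
        node = node[fragment]
    if node:
        raise ValueError("sentence is not finished")
    # Drop the trailing "..." from each fragment before joining.
    return " ".join(f.rstrip(".").rstrip() for f in path)
```

For example, `options(["Do you want to..."])` offers "go to..." and "trade...", and `build_message(["Do you want to...", "go to...", "the Castle"])` yields "Do you want to go to the Castle". Since players never type free text, there is nothing left to filter.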

Disney settled on this method for their MMO "Toontown", after their 14-year-old whitelist test subject decided to "stick [his] long-necked Giraffe up [their] fluffy white bunny." Simply put, you cannot blacklist or whitelist enough words to prevent abuse.

That all being said, if I were designing a kids' MMO, I would actually implement a stringent blacklist filter, but only as a second line of defense. Your first line of defense should always be moderators and the ability to report abuse. I would weight blacklisted words, with each user getting a secret score of how profane they are trying to be.

Chances are, any user who tries to circumvent your filter will trigger it first. The more obvious the profanities (as opposed to obscure or outdated ones), or the more repeated the attempts, the sooner the user lands on a watch list for moderators, or some sort of ban list. This way, moderators can focus on users who seem to be trying to harass others instead of wasting their time reading the comments of still-innocent kids.
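A minimal sketch of the weighted-score idea in Python, under assumptions of my own: the weights, the threshold of 10, and the placeholder words are invented for illustration and would need tuning against real chat logs.

```python
import re
from collections import defaultdict

# Hypothetical weights: obvious profanity scores higher than mild words.
# Placeholder tokens stand in for a real word list.
WEIGHTS = {"badword": 5, "mildword": 1}
WATCH_THRESHOLD = 10  # assumed cut-off for flagging a user to moderators

scores = defaultdict(int)  # user id -> hidden running score


def record_message(user_id, text):
    """Add the weight of each blacklisted word to the user's hidden score.

    Returns True once the user has reached the watch threshold.
    """
    for word in re.findall(r"[a-z']+", text.lower()):
        scores[user_id] += WEIGHTS.get(word, 0)
    return scores[user_id] >= WATCH_THRESHOLD
```

The score is never shown to the player; a `True` result would only add the user to the moderators' watch list. A real system would probably also let scores decay over time, which this sketch leaves out.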

— dlras2

6

+1 just for the Toontown link - I especially like the players' use of covert channels for people to exchange their secret code, so they could bypass the filter.

— Cyclops

1

It was a really interesting read I thought I'd dig up and share. If you don't read the rest of my answer, at least read that. =P

— dlras2

2

I believe Blizzard uses this technique (a secret score counting the curses posted to general chat) in World of Warcraft; at least I know they used to.

— Nate

2

@Dan Personal experience only. I was auto-banned (which was a different experience from being banned by a GM). Some douche was verbally assaulting some chicks in my guild, and I went off on him. I was not banned from the game, just from /General for some period of time.

— Nate

2

+1 for the first word "Don't." Circumvention is what happens and is why you'll just feel like you've wasted valuable programming resources to create a big steaming pile of meecrob! ;-D

— Randolf Richardson

10

In response to people saying to not provide the filter, I would argue that you have to provide a filter, for no other reason than to cover your own butt with respect to the parents of your intended audience. Just make sure it can be disabled by the user. By implementing a profanity filter (albeit an imperfect and totally optional one), you can say that you've done everything expected of you to protect the sensibilities of your younger audience.

By making it possible to disable, you discourage users from trying to circumvent it using clever punctuation or substitution, since people who favor that sort of language will immediately disable the filter on their own computers, and will have long since forgotten that a filter even exists.

With that understanding, don't worry so much about the implementation. It doesn't need to be foolproof (which is good, because it can't be foolproof), but it should be relatively complete and as unintrusive as possible. That is, you want to make sure you don't make the "clbuttic mistake".

The implementation can be extremely simple -- get a word list, and replace any words found in the list with asterisks or something similar. Best to search for whole words only, as well.
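Something along these lines would do, as a sketch in Python. The two-word list is a mild placeholder, and `censor` is just a name picked for the example.

```python
import re

# Placeholder entries; a real list would be loaded from a file.
WORD_LIST = ["darn", "heck"]

# \b anchors keep the match to whole words, so "heckle" is left alone.
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, WORD_LIST)) + r")\b",
    re.IGNORECASE,
)


def censor(text):
    """Replace each listed whole word with asterisks of the same length."""
    return PATTERN.sub(lambda m: "*" * len(m.group()), text)
```

With this, `censor("Heck, that heckle was darn loud")` gives `"****, that heckle was **** loud"`. Matching whole words only is what keeps a listed word from being replaced inside a longer, innocent one.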

As for a word list, that's easy: http://www.google.com/search?q=profanity+word+list

Remember, it doesn't have to be all-inclusive, it just has to be representative of a valiant effort on your part to protect the children.

— tylerl

1

+1 would be my approach as well, after researching in detail what you actually need to do for a specific age rating.

— Oskar Duveborn

5

I would try to implement a solution allowing for a blacklist and a whitelist, where you could add 'cunt' to the blacklist, and 'scunthorpe' to the whitelist for example.
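A rough Python sketch of how the two lists could work together, using mild stand-in words rather than the real thing. The list contents and the function names are assumptions for the example only.

```python
import re

BLACKLIST = {"hell"}            # matched as a substring of any word
WHITELIST = {"hello", "shell"}  # known-innocent words that contain it


def is_blocked(word):
    """True if the word contains a blacklisted string and is not whitelisted."""
    w = word.lower()
    if w in WHITELIST:
        return False
    return any(bad in w for bad in BLACKLIST)


def censor(text):
    """Star out blocked words, leaving whitelisted ones untouched."""
    return re.sub(
        r"[A-Za-z]+",
        lambda m: "*" * len(m.group()) if is_blocked(m.group()) else m.group(),
        text,
    )
```

Here `censor("Hello from hellish Shell")` returns `"Hello from ******* Shell"`: the substring match catches variations of the blacklisted word, while the whitelist rescues the innocent words that happen to contain it.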

I don't believe that you could ever implement a failsafe solution, so I'd try to get the most "popular" words in your dictionary, and make it as easy as possible to add new words to the lists.

The reason for this is that languages, especially English, constantly evolve and something that has been inoffensive for decades could become offensive in the right context.

Try to get as many words as possible and go from there. React quickly when people complain, show that this is a concern for you, and I doubt you'll have any problems.

It would be a good idea to know exactly what the guidelines are for censorship in the US: MBNL! (me be no lawyer!)

— Jonathan Connell

3

The solution to evolving language is to filter by prefanity.

— Cyclops

@Cyclops Win! xD

— Jonathan Connell

4

As I commented, filtering all offensive words is really hard, but you could turn it around and use a whitelist of allowed words. Doing a Google search, it seems fairly common for children's games to limit what players can type to a list. For instance, Lego Universe uses a whitelist.

Also see: Whitelisting for game chat. And note that whitelists can be circumvented. There is no guaranteed solution.

Considering that it's for young children, and mis-spelling could be a problem - depending on the client interface, you might consider word auto-completion. As the players start typing letters, offer a list of possible words and let them select the correct one.
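To make the whitelist-plus-auto-completion idea concrete, here is a small Python sketch. The vocabulary is a made-up sample, and `suggestions` and `rejected_words` are hypothetical helper names; a real client would hold a much larger sorted list.

```python
from bisect import bisect_left

# Hypothetical allowed vocabulary, kept sorted for prefix lookups.
ALLOWED = sorted(["hello", "help", "hat", "trade", "treasure", "friend"])


def suggestions(prefix, limit=5):
    """Return up to `limit` allowed words starting with `prefix`."""
    prefix = prefix.lower()
    start = bisect_left(ALLOWED, prefix)  # first word >= prefix
    matches = []
    for word in ALLOWED[start:]:
        if not word.startswith(prefix) or len(matches) == limit:
            break
        matches.append(word)
    return matches


def rejected_words(message):
    """Return the words in a message that are not on the whitelist."""
    allowed = set(ALLOWED)
    return [w for w in message.lower().split() if w not in allowed]
```

Typing "he" would offer `["hello", "help"]`, and a message such as "hello frend" would be held back because `rejected_words` reports `["frend"]`. The binary search keeps the lookup cheap even when the list is large.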

— Cyclops

Good idea, though it would seem strange to me in a game for younger children, who may get spelling wrong. It could also hinder their personal development beyond what is available on the whitelist.

— Jonathan Connell

@3nixios, I agree it has problems, but so does every possible solution. :) One fix to the spelling problem would be - wait, I should add that to my post. :)

— Cyclops

+1: this will be a lot safer, but as @3nixios says, it would either hinder development or need a very big list, which would increase the execution time, right?

— naveen

@yetanothercoder, depending on the client type (I'm assuming html/javascript), you could pre-download a list of valid words and check them in the client. This wouldn't slow down the server (it could theoretically be bypassed by a smart programmer, though). Yes, this is more work - again, there are no easy solutions, sorry. It all depends on how much risk is acceptable.

— Cyclops

1

@Cyclops For a kids game this could be an acceptable solution if you consider only kids playing. Unfortunately client-side checking would mean a 'bad-man' could easily say what he liked to the other players.

— Jonathan Connell

4

There's an answer from Programmers describing one system for building a profanity filter. He doesn't explain how he actually built it in great detail, but it should be enough to get an idea for implementation.

— thegrinner

4

This is a problem best solved by humans and social design rather than code.

Your best source for an exhaustive list is a live human who is present in the game and monitoring the chat stream. Put people in your game and let them be your ultimate filter.

Spend some time looking into Lane Merrifield's ideas and philosophies behind Club Penguin and about providing service. Here are two writeups from his presentation at the Austin GDC in 2008. I saw it and remember being very impressed with his style of solving human problems with humans and not code.

http://gamasutra.com/php-bin/news_index.php?story=20234

http://www.raphkoster.com/2008/09/15/agdc08-lane-merrifield-at-their-service/

Specifically because your game is aimed at kids, it's more than just swear filters you'll need to think about. You'll need to worry about people posing as kids who may or may not have bad motives. You'll need to assure parents that their kids are safe. You'll need to assure kids that they are safe too for that matter.

Another plus for humans is that they will understand context. You don't want some kid saying, "My Mom has breast cancer" and getting kicked.

— Tim Holt

We sure do have moderators who could ban potential manipulators. I am more concerned about profanity. It will be a tedious task for moderators when most of the words used in a bad context are repetitive.

— naveen

I'd say certainly you can have profanity filters active to detect what you might call the common stuff, and flag it to the moderators. It's not that hard to come up with a "top 100" list of words, then do some quick pattern matching on all strings. Remove all spaces and punctuation first so people don't C_H_E_A_T or M A N I P U L A T E the algorithm. Ultimately though it's humans that will do it right.
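Roughly like this, as a Python sketch. The two-entry list just reuses the example words from above as stand-ins for a real "top 100", and the function names are invented for illustration.

```python
import re

# Placeholder "top 100" list; real entries would come from moderators.
TOP_WORDS = ["cheat", "manipulate"]


def normalize(text):
    """Lowercase the text and drop everything that is not a letter or digit."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def flagged_words(text):
    """Return the listed words found anywhere in the normalized text."""
    squashed = normalize(text)
    return [w for w in TOP_WORDS if w in squashed]
```

`flagged_words("C_H_E_A_T")` returns `["cheat"]`, but so does `flagged_words("such eating")`, because stripping the spaces lets matches span word boundaries. That false positive is the reason to flag hits to a moderator rather than block them automatically.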

— Tim Holt

3

Simple solution to the problem:

1. Remove all spaces and punctuation from your input.
2. Blacklist everything in the Urban Dictionary.
3. Blacklist all homophones etc.
4. Blacklist everything that could be used as a euphemism.
5. Write your software to understand the content, intention and tone of what is left.
6. Throw away game and go to market with sentient and omniscient creation from step 5.

— Colin Pickard

6

homo phones lolololol

— Jonathan Connell

3

This is the end result of the spammers captcha solvers and spam filters: sentient AI that battles for control of Earth: one side trying to sell Viagra and the other trying to protect Humanity. Very Transformers. :-)

— Zan Lynx

3

Some MMOs for children replace chat with a predefined list of emotes and phrases and simply don't allow free-form chat. Perhaps the game could be designed to accommodate that.

— Oskar Duveborn