[RFC] Use of Automated Moderation Tools

Crashdoom (he/him) · 1 year ago


Raven Luni@furry.engineer · 1 year ago

@crashdoom I’m generally against automated moderation, having been shadowbanned on other platforms for no reason I can identify. These scripts are never infallible, no matter how well intentioned. A computer can be trained to recognise keywords, but it can never understand context.

Having said that, I do appreciate the urgency to do something. If you do go ahead with it, I would ask the following:

- Make sure the user is informed of any action, never use shadowbans.

- Make sure there is easy access to human review in the event mistakes do occur.
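To make those two requests concrete, here is a rough sketch of what "no shadowbans, always a route to a human" could look like in code. Everything in it is hypothetical: `TemporaryLimit`, `apply_limit`, the `notify` callback and the 24-hour default are made-up names and values for illustration, not anything the instance actually runs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


@dataclass
class TemporaryLimit:
    """Hypothetical record of an automated, time-boxed limit on an account."""
    account: str
    reason: str
    expires_at: datetime
    appealed: bool = False  # set once the user asks for human review


def apply_limit(account: str, reason: str,
                notify: Callable[[str, str], None],
                hours: int = 24,
                now: Optional[datetime] = None) -> TemporaryLimit:
    """Create a limit and tell the user about it in the same step."""
    now = now or datetime.now(timezone.utc)
    limit = TemporaryLimit(account, reason, now + timedelta(hours=hours))
    # The notification is not optional: there is no code path that
    # limits an account without also messaging its owner.
    notify(account,
           f"Your account was temporarily limited ({reason}) until "
           f"{limit.expires_at:%Y-%m-%d %H:%M} UTC. Reply to request human review.")
    return limit


def request_review(limit: TemporaryLimit, review_queue: list) -> None:
    """Put the limit in front of a human moderator."""
    limit.appealed = True
    review_queue.append(limit)
```

The point of the design is only that the limit and the notification cannot be separated, and that an appeal always lands in a queue a person reads.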

Draconic NEO@pawb.fun · 1 year ago

@RavenLuni @crashdoom Yeah I agree. Automated moderation systems can cause a lot of problems when they ban or limit without human interaction.

If they do act automatically, though, they need to inform the user of the actions taken, and there needs to be an easy way to appeal them, so they aren’t just baseless automated bans like on every mainstream service.

cosmo · 1 year ago

I’ve upvoted this but I’d just like to chuck in that I think Raven makes a lot of sense here. I’ve had posts deleted or hidden by automod bots on other sites and even when they’re restored they don’t get as much traction as the posts which were left alone. So there’s an effect even if the action can be “reversed” - and I say that in quotes because it’s not like you can turn the clock back.

Hard agree on the no use of shadowbans and keeping users informed, and the easy escalation to a human.

My ideal would be some kind of system which looks at the public feed for keywords and raises anything of concern to an admin, and maybe the admin’s response goes back in as ‘training’. Something more like SpamAssassin’s Bayesian ham/spam classifier perhaps.
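As a rough illustration of that idea, here is a minimal sketch of a naive Bayes ham/spam scorer that only flags posts for an admin and learns from the admin's verdict. It is in the spirit of SpamAssassin's Bayesian classifier but is not its actual implementation; the class name, the tokenizer and the `threshold` value are assumptions made up for the example.

```python
import math
import re
from collections import Counter


class ReviewQueueClassifier:
    """Tiny naive Bayes scorer: it flags posts for a human, it never acts."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # hypothetical cut-off for flagging
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.posts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Record an admin's verdict ('spam' or 'ham') on a reviewed post."""
        self.tokens[label].update(self.tokenize(text))
        self.posts[label] += 1

    def spam_probability(self, text):
        vocab = len(set(self.tokens["spam"]) | set(self.tokens["ham"])) or 1
        total_posts = self.posts["spam"] + self.posts["ham"]
        log_score = {}
        for label in ("spam", "ham"):
            # Laplace-smoothed prior and per-token likelihoods, in log space.
            log_score[label] = math.log((self.posts[label] + 1) / (total_posts + 2))
            total_tokens = sum(self.tokens[label].values())
            for tok in self.tokenize(text):
                log_score[label] += math.log(
                    (self.tokens[label][tok] + 1) / (total_tokens + vocab)
                )
        # Turn the two log scores into P(spam | text); clamp to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))

    def needs_review(self, text):
        """True if the post should be raised to an admin."""
        return self.spam_probability(text) >= self.threshold


# The admin's decisions are the only training signal.
clf = ReviewQueueClassifier(threshold=0.8)
clf.train("buy cheap pills now", "spam")
clf.train("posted new art of my dragon", "ham")

for post in ["cheap pills", "new dragon art"]:
    if clf.needs_review(post):
        print("raise to admin:", post)
```

Nothing here limits or hides anything on its own. The output is a queue for a person, and whatever that person decides goes back in through `train`.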

I don’t think automated actions without a human in the loop are the right way to go, and I have grave concerns about biases creeping into the model over time. The poster child for this is pretty much Amazon’s HR résumé review system, which reportedly ended up biased against women. Bias along racial lines could creep in the same way. There’s been a lot of good progress improving PoC/BIPOC/BAME/non-white acceptance and it’d be a shame if something like this accidentally ended up scarring or undoing some of that.

[RFC] Use of Automated Moderation Tools


1. Monitoring of Public Streaming Feed

2. Building of a local AI spam-detection model

3. Use of local posts for non-spam training

4. Temporarily limiting suspected spam accounts