[RFC] Use of Automated Moderation Tools

Crashdoom (he/him) · 1 year ago

cosmo · 1 year ago

I’ve upvoted this but I’d just like to chuck in that I think Raven makes a lot of sense here. I’ve had posts deleted or hidden by automod bots on other sites and even when they’re restored they don’t get as much traction as the posts which were left alone. So there’s an effect even if the action can be “reversed” - and I say that in quotes because it’s not like you can turn the clock back.

Hard agree on not using shadowbans, on keeping users informed, and on easy escalation to a human.

My ideal would be some kind of system which looks at the public feed for keywords and raises anything of concern to an admin, and maybe the admin’s response goes back in as ‘training’. Something more like SpamAssassin’s Bayesian ham/spam classifier perhaps.
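To make that idea a bit more concrete, here is a rough sketch of what a flag-for-review scorer could look like. It is not SpamAssassin's actual implementation, just a tiny naive-Bayes-style word scorer. The class name `ReviewFlagger`, the 0.9 threshold, and the tokenizer are all made up for illustration. The important part is that it only ever raises a post to an admin, and the admin's verdict is what gets fed back in as training.

```python
import math
import re
from collections import Counter


class ReviewFlagger:
    """Hypothetical ham/spam scorer that flags posts for a human, never acts itself."""

    def __init__(self, threshold=0.9):
        # threshold is an assumed value; a real instance would tune it
        self.threshold = threshold
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.post_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Very naive tokenizer: unique lowercase words in the post
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def train(self, text, label):
        """Record an admin's verdict ('spam' or 'ham') on a post."""
        self.post_counts[label] += 1
        self.word_counts[label].update(self.tokens(text))

    def spam_probability(self, text):
        if sum(self.post_counts.values()) == 0:
            return 0.5  # nothing learned yet, so no opinion
        # Prior odds of spam vs ham, with add-one smoothing
        log_odds = math.log(
            (self.post_counts["spam"] + 1) / (self.post_counts["ham"] + 1)
        )
        for word in self.tokens(text):
            # Smoothed share of spam/ham posts that contained this word
            p_spam = (self.word_counts["spam"][word] + 1) / (self.post_counts["spam"] + 2)
            p_ham = (self.word_counts["ham"][word] + 1) / (self.post_counts["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long posts
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))

    def needs_review(self, text):
        """True means 'show this to an admin', not 'take action'."""
        return self.spam_probability(text) >= self.threshold


flagger = ReviewFlagger()
flagger.train("buy cheap followers now", "spam")
flagger.train("cheap followers click here", "spam")
flagger.train("lovely sunset photo from my walk", "ham")
flagger.train("anyone going to the convention", "ham")
print(flagger.spam_probability("cheap followers"))
print(flagger.spam_probability("sunset photo"))
```

In a setup like this, anything where `needs_review` returns true would go into a queue for an admin, and whatever the admin decides is passed back through `train`, so the model only learns from human decisions.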

I don’t think automated actions without a human in the loop are the right way to go - and I have grave concerns about biases creeping into the model over time. The poster child for this is pretty much Amazon’s HR résumé review system, which was reported to have ended up biased against women’s résumés; the same thing could just as easily happen along racial lines. There’s been a lot of good progress improving PoC/BIPOC/BAME/non-white acceptance and it’d be a shame if something like this accidentally ended up scarring or undoing some of that.

1. Monitoring of Public Streaming Feed

2. Building of a local AI spam-detection model

3. Use of local posts for non-spam training

4. Temporarily limiting suspected spam accounts