Recommendations for a context-aware text classifier

Bluetreefrog@lemmy.world to Machine Learning | Artificial Intelligence@lemmy.world · English · 1 year ago

I’ve got a bot in development that detects and flags toxic content on Lemmy, but I’d like to improve it because I’m getting quite a few false positives. I think part of the reason is that whether a comment counts as toxic often depends on the parent comment or post.

During a recent postgrad assignment I was taught (and saw for myself) that a bag-of-words model usually outperforms LSTM or transformer models for toxic text classification, so I’ve run with that, but I’m wondering whether it was the right choice.

Does anyone have ideas on what kind of model would be most suitable for including the parent as context, without explicitly considering whether the parent itself is toxic? I’m guessing some sort of transformer model, but I’m not quite sure how it would look or work.
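
To make the question concrete, here is a minimal sketch of two ways the parent could enter the input while the label still describes only the reply. It is an illustration under assumptions, not the bot's actual code: the function names `context_features` and `pair_segments`, the `reply=` / `parent=` prefixes, and the 64-word parent budget are all hypothetical. The first function extends a bag-of-words setup by keeping parent tokens in their own namespace; the second prepares the two text segments that a sentence-pair transformer encoder would typically receive.

```python
import re
from collections import Counter

# Simple lowercase word tokenizer (an assumption; any tokenizer would do).
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


def context_features(reply, parent=None):
    """Bag-of-words counts with reply and parent tokens in separate namespaces.

    The training label would describe the reply only; parent tokens are
    extra context features and are never labeled themselves.
    """
    features = Counter()
    for tok in tokenize(reply):
        features["reply=" + tok] += 1
    if parent:
        for tok in tokenize(parent):
            features["parent=" + tok] += 1
    else:
        # Top-level posts have no parent; flag that explicitly.
        features["parent=<none>"] = 1
    return dict(features)


def pair_segments(reply, parent, max_parent_words=64):
    """Return (parent, reply) as two segments for a sentence-pair encoder.

    The parent is cut to its last `max_parent_words` whitespace-separated
    words (an arbitrary choice) so the reply keeps most of the input budget.
    """
    words = parent.split()
    kept = words[-max_parent_words:] if max_parent_words > 0 else []
    return " ".join(kept), reply


if __name__ == "__main__":
    print(context_features("You are wrong", parent="Is the earth flat?"))
    print(pair_segments("You are wrong", "Is the earth flat?", max_parent_words=2))
```

The feature dicts could be fed to any linear model that accepts sparse named features, for example via scikit-learn's `DictVectorizer`. One caveat: a linear model over these features can only add parent evidence to reply evidence. For the parent to change how a reply word is read, the model needs interactions between the two, which is what a transformer encoding both segments together would provide through attention across them.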
