Reddit introduces an AI-powered tool that will detect online harassment

The new "harassment filter" is trained on previously flagged content.
By Chase DiBenedetto
The reddit logo reflected on an iPhone screen and glowing red backdrop.
Credit: Jonathan Raa / NurPhoto via Getty Images

Reddit has introduced an AI-powered safety filter that will help sift out posts that contain harassing or other objectionable content.

The "harassment filter," quietly added to the platform's support page last week and detected by Android Authority, uses a large language model (LLM) "trained on moderator actions and content removed by Reddit’s internal tools and enforcement teams," Reddit explains. The tool is intended to support the already tenuous work of Reddit moderators tasked with supervising the online communities they're a part of.

Just last month, Bloomberg reported that Reddit had signed a content licensing deal with a major "AI player," which would provide site and user data for training AI models.

When a community and its moderators turn on the filter, a new flag will appear in the site's mod queue marking content (posts and comments) that has been identified as "potential harassment." Moderators can then approve or remove the content, and report back to Reddit on whether it was accurately detected.

The platform has introduced a slew of new features and updated experiences in recent months, ahead of its stock market debut this month. Last year, Reddit announced the Modmail Harassment Filter, which acts like a "spam" folder for moderator messages containing potentially abusive content.

How to set up Reddit's harassment filter

  1. On desktop, go to the About Community tab on the right sidebar and select Mod Tools. On iOS and Android, tap the Mod Tools button below your community's banner.

  2. Go to Moderation. Click on Safety.

  3. Select the Harassment filter option, and toggle it on.

  4. Choose between the Low and High filter options. The Low setting blocks the least content, but is more accurate in spotting harassment. The High setting does a broader sweep, and thus will block more content. Reddit recommends using the High option if your community encounters a "significant amount of harassing content."
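
The trade-off between the Low and High settings can be pictured as a confidence threshold. The sketch below is purely illustrative: Reddit has not published how its filter scores content, so the scores, the threshold values, and the names `Item`, `THRESHOLDS`, and `filter_queue` are all assumptions, not Reddit's implementation.

```python
from dataclasses import dataclass

# Hypothetical thresholds: Reddit does not publish how the Low and High
# settings map to model scores. These numbers are for illustration only.
THRESHOLDS = {"low": 0.9, "high": 0.6}


@dataclass
class Item:
    text: str
    score: float  # assumed model confidence that the item is harassment (0.0 to 1.0)


def filter_queue(items, setting):
    """Return the items that would be flagged as potential harassment."""
    threshold = THRESHOLDS[setting]
    return [item for item in items if item.score >= threshold]


sample = [
    Item("clearly abusive reply", 0.97),
    Item("heated but borderline comment", 0.72),
    Item("friendly question", 0.05),
]

low_flagged = filter_queue(sample, "low")    # stricter cutoff, fewer items flagged
high_flagged = filter_queue(sample, "high")  # looser cutoff, broader sweep

print(len(low_flagged), len(high_flagged))
```

Under these assumed numbers, the Low setting flags only the highest-confidence item, while the High setting also sends the borderline comment to the mod queue for review.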

While Reddit says administrators will continue to automatically remove posts that directly violate Reddit’s Content Policy, the harassment filter gives communities oversight of objectionable but still "policy-complying" content that might otherwise slip through the cracks.

Chase sits in front of a green framed window, wearing a cheetah print shirt and looking to her right. On the window's glass pane reads "Ricas's Tostadas" in red lettering.
Chase DiBenedetto
Social Good Reporter

Chase joined Mashable's Social Good team in 2020, covering online stories about digital activism, climate justice, accessibility, and media representation. Her work also touches on how these conversations manifest in politics, popular culture, and fandom. Sometimes she's very funny.

