Tumblr users, here's what to know about Tumblr selling your data to OpenAI and Midjourney

Parent company Automattic will reportedly sell Tumblr content to OpenAI and Midjourney for training data. Here's how you can opt out.
By Elizabeth de Luna
Tumblr logo seen displayed on a smartphone.
Credit: Mateusz Slodkowski/SOPA Images/LightRocket via Getty Images

OpenAI and image generator Midjourney will soon pay to train their AI models using public Tumblr content, according to internal documents reviewed by the site 404 Media.

404 Media has reported that a deal is "imminent" between Tumblr parent company Automattic and the two AI giants but could not specify what types of data would be sold to each company. The deal also reportedly includes the sale of data from WordPress.com, another Automattic property.

Posts detailing how user content is used for AI training were published on Feb. 27 on the staff blogs of both Tumblr and WordPress.com. However, the posts did not tell users that Automattic was in talks to sell that data.

Here's what you need to know about how the sale may affect your Tumblr content.

Which content will Automattic reportedly sell?

404 Media has reported that the documents it reviewed did not specify the types of data that would be sold to each company. It is also unclear whether this deal will affect only future posts to Tumblr or encompass past content as well. AI companies have been criticized for their rampant use of "publicly available" content to train their models, since much of what is publicly available online is still protected by copyright.

According to a support article on OpenAI's website, "ChatGPT and our other services are developed using information that is publicly available on the internet" among other sources. Presumably, OpenAI has already scraped and used any and all content that was once publicly available on Tumblr. Given that, the current deal could serve as a sort of mea culpa on the part of OpenAI and Midjourney as they offer to pay for the use of all future Tumblr content as well.

Automattic did not respond to requests for comment from 404 Media regarding the deal but posted a statement called "Protecting User Choice" in which the company wrote, "We currently block, by default, major AI platform crawlers—including ones from the biggest tech companies—and update our lists as new ones launch." It is unclear when the site began blocking the crawlers, which is important considering that OpenAI has been training its algorithm on public content for years.

How do I opt out?

To opt out of sharing your public Tumblr content with third parties, you'll need to toggle on a new "Prevent third-party sharing" option in the settings of each individual blog you run. This needs to be done on a web browser, not through the Tumblr app. These updates have been added to Tumblr's support article about user privacy.

If you have previously chosen to discourage searching of your blog, the new "Prevent third-party sharing" option will be toggled on by default.

But what if you decide to forgo toggling on the setting now, opting instead to do it in three months? 404 Media reported that, in a document it accessed from Feb. 23, a Tumblr staff member asked a question addressing this issue. "Do we have assurances," they wrote, "that if a user opts out of their data being shared with third parties that our existing data partners will be notified of such a change and remove their data?"

Automattic’s head of AI, Andrew Spittle, replied, "We will notify existing partners on a regular basis about anyone who's opted out... I want this to be an ongoing process where we regularly advocate for past content to be excluded based on current preferences. We will ask that content be deleted and removed from any future training runs. I believe partners will honor this based on our conversations with them to this point."

Is this normal?

It certainly seems to be, at the very least, the new normal. OpenAI is licensing news stories from the Associated Press and is reportedly in talks to do the same with CNN, Time, and Fox. Reddit is working with Google to monetize its database of content.

It was just a matter of time before Automattic started selling its own data, especially considering how much money it's losing on Tumblr. In its entire 17-year history, the site has never been profitable, and Automattic has failed to turn it around. In November, TechCrunch reported that resources had been diverted from the struggling site to support projects elsewhere within Automattic.

Elizabeth de Luna
Culture Reporter

Elizabeth is a digital culture reporter covering the internet's influence on self-expression, fashion, and fandom. Her work explores how technology shapes our identities, communities, and emotions. Before joining Mashable, Elizabeth spent six years in tech. Her reporting can be found in Rolling Stone, The Guardian, TIME, and Teen Vogue.

