OpenAI Messed With The Wrong Mega-Popular Parenting Forum

OpenAI Messed With the Wrong Mega-Popular Parenting Forum

"Are You a CEO, Director, or Founder interested in a Feature Interview?"

All Interviews are 100% FREE of Charge

If you can think of a topic related to parenting, chances are there’s a post about it on Mumsnet, the long-standing and popular UK-based parenting forum for mothers. Over the course of more than 20 years, Mumsnet has amassed an archive of more than 6 billion words written by its dedicated user base on topics like dirty nappies and lazy husbands. A crazy rant about dolphins.

This spring, Mumsnet said it had decided to seek licensing deals with several of the sector’s biggest companies after discovering that AI companies were scraping its data, including OpenAI, which had indicated it was open to exploring a deal when Mumsnet first contacted it. After talks with OpenAI fell apart, Mumsnet said in July: Take legal action.

During early discussions, Mumsnet said, OpenAI’s head of strategic partnerships told the company that a dataset of over 1 billion words would interest the AI giant. Mumsnet executives were excited. “We spent a fair amount of time going back and forth with them,” Mumsnet founder and CEO Justin Roberts told WIRED. “We had to sign some NDAs, and they asked us for a lot of information.”

But more than a month later, OpenAI told Mumsnet that it was no longer interested in partnering, according to email exchanges reviewed by WIRED. Asked why, OpenAI staff said Mumsnet’s 6 billion-word dataset was too small to justify a licensing agreement, Roberts says. They also said OpenAI was primarily interested in large datasets that the public couldn’t access online, and that it wanted datasets that captured a broad range of human experiences.

Reached for comment by WIRED, the company echoed this sentiment: “We pursue partnerships on large-scale datasets that reflect human society, not just on public information,” said OpenAI spokesperson Kayla Wood. “We support publishers and creators’ choices, offering ways to express preferences for how their sites and content interact with AI in search results, and how we train our generative AI-based models.”

Roberts says she found the development “frustrating.” She recalls that OpenAI was initially particularly interested in Mumsnet because the platform contained a lot of content written by women. “It’s really high-quality conversation data,” she says. “90% of the conversations are by women, which is really unusual.”

OpenAI has signed various data license agreements with media and platforms over the past year, Vox Media, of Atlantic OceanAxel Springer, time(Automattic, owner of WordPress.com and Tumblr, was also said to be in licensing talks earlier this year.) Details of those deals have not been disclosed, so it’s unclear how big the stakes in each are.

When WIRED asked OpenAI about the size of the datasets covered by the commercial license, the company declined to provide that information, but spokesperson Kayla Wood stressed that the company’s partnerships with publishers are “focused on showcasing our products as publishers’ content and driving traffic to those publishers.”