Startup using blockchain to prevent copyright theft by AI is valued over $2 billion after fresh funding

San Francisco-based startup Story said Wednesday that it raised $80 million in funding for a blockchain designed to prevent artificial intelligence makers like OpenAI from taking creators’ intellectual property without permission.

The round values the two-year-old company at $2.25 billion, sources familiar with the matter told CNBC. The sources preferred not to be named as the information has not been made public.

Story said that it raised the funds in a Series B round (typically the third major round of funding in a private startup’s growth journey, after seed and Series A) led by Andreessen Horowitz, also known as a16z. Crypto-focused venture capital firm Polychain also invested in the round.

Story, which is unprofitable and doesn’t disclose revenue, is riding high on hopes from backers that its solution will give creators the ability to be attributed and compensated for work that gets fed into popular artificial intelligence platforms such as OpenAI’s ChatGPT and Perplexity’s AI-powered search engine.

Building an ‘IP legoland’

A blockchain is a distributed database that maintains an immutable record of activity. It is the technology that underpins cryptocurrencies, such as bitcoin and ether.
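To make the idea of an immutable record concrete, here is a minimal Python sketch of a hash-linked ledger. It is an illustration only, not Story's implementation: the names `append_block`, `is_valid` and the sample records are hypothetical, and the sketch runs in a single process, so it leaves out the distribution and consensus that a real blockchain relies on.

```python
import hashlib
import json


def block_hash(index, prev_hash, payload):
    """Hash a block's contents together with the previous block's hash."""
    body = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a record whose hash commits to every earlier record."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })
    return chain


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    prev_hash = "0" * 64
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev"], block["payload"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical records, for illustration only.
ledger = []
append_block(ledger, {"creator": "alice", "work": "sketch-001"})
append_block(ledger, {"creator": "bob", "work": "remix-of-sketch-001"})
print(is_valid(ledger))  # True

# Rewriting history changes the recomputed hash, so the tampering is detected.
ledger[0]["payload"]["creator"] = "mallory"
print(is_valid(ledger))  # False
```

Because each block's hash covers the previous block's hash, changing any earlier record invalidates every later link. That property is what lets such a ledger serve as evidence of who registered what, and in what order.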

Story operates a blockchain network on which creators store their IP, allowing them to prove that they made a piece of content and own the intellectual property.

The firm’s tech works to protect individuals’ and entities’ IP by embedding terms associated with it, such as licensing fees and royalty-sharing arrangements, into smart contracts.

Smart contracts are digital contracts stored on a blockchain that automatically execute once a certain set of terms is met.

This makes copyright holders’ IP “programmable,” SY Lee, Story’s co-founder and CEO, explained to CNBC, as it sets up rules for how their content can be used and the price to pay for reproducing or remixing their works.
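As a rough illustration of what "programmable" terms could look like, the Python sketch below attaches a license fee and a royalty share to a piece of IP and enforces them in code. It is a simplified model under stated assumptions, not Story's actual contract logic: the classes `LicenseTerms` and `IPAsset`, the method names, and the figures are all hypothetical, and a real smart contract would run on-chain rather than in ordinary Python.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class LicenseTerms:
    license_fee: Decimal    # flat fee owed before the work may be reused
    royalty_share: Decimal  # fraction of downstream revenue owed to the owner


@dataclass
class IPAsset:
    owner: str
    title: str
    terms: LicenseTerms
    licensees: set = field(default_factory=set)

    def license_to(self, licensee, payment):
        """Grant a license only if the payment covers the embedded fee."""
        if payment < self.terms.license_fee:
            raise ValueError("payment does not meet the license fee")
        self.licensees.add(licensee)

    def split_revenue(self, licensee, revenue):
        """Divide a licensee's revenue according to the embedded royalty share."""
        if licensee not in self.licensees:
            raise PermissionError(f"{licensee} holds no license for {self.title}")
        owed = revenue * self.terms.royalty_share
        return {self.owner: owed, licensee: revenue - owed}


# Hypothetical example: a 50-unit license fee and a 10% royalty share.
design = IPAsset(
    owner="brand",
    title="jacket-pattern",
    terms=LicenseTerms(Decimal("50"), Decimal("0.10")),
)
design.license_to("remixer", Decimal("50"))
payout = design.split_revenue("remixer", Decimal("200"))
print(payout["brand"], payout["remixer"])
```

In this sketch, the remixer's 200 units of revenue are split so that 20 go to the owner and 180 stay with the remixer, with no negotiation step in between. The terms travel with the asset, which is the sense in which the rules are set up in advance.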

The benefit of this, Lee said, is that it effectively cuts out the middlemen typically involved in disputes over copyright theft in the media landscape.

“Now it’s turned from IP into IP Lego,” Lee told CNBC. “Now, you don’t need to go through lawyers. You don’t need to go through the agents. You don’t need to do this very lengthy business development negotiation. You just embed your licensing, royalty-sharing terms into smart contracts.”

Story makes money by charging a network fee for any action that takes place on its network.

One example of a firm using Story is Ablo, an AI tool that allows users to make their own tailored items of fashion using designs from household brands including French designer clothing firm Balmain and Italian luxury fashion house Dolce and Gabbana.

The brands are compensated for the use of their designers’ IP through licensing and revenue-sharing agreements.

Fighting AI copyright theft

Story is now trying to tackle a timely problem with its tech: the theft of copyrighted media on the internet by powerful generative AI models, such as those behind OpenAI’s ChatGPT.

These models, which power many AI chatbots that are increasingly being used as an alternative to search, require huge amounts of training data to enable their systems to produce advanced and informative answers to user queries.

But the data that goes into fueling these AI models often comes from sources where copyright restrictions are in place.

In December 2023, The New York Times hit Microsoft and OpenAI with a copyright lawsuit seeking damages over abuse of the newspaper’s intellectual property.

In the suit, the Times included several instances where GPT-4 produced altered versions of material originally published by the newspaper.

Big tech companies like Microsoft, which has invested $13 billion into OpenAI and is reportedly entitled to a 49% stake in the firm, are “essentially stealing your IP for training purposes and actually capturing all the upside,” Lee said.

In a motion to dismiss part of the Times’ suit in March, Microsoft said that such claims were “unsubstantiated,” and that the lawsuit presented a false narrative of “doomsday futurology.”

Content used to train these models, Microsoft’s lawyers argued, “does not supplant the market for the works, it teaches the models language.”

Microsoft was not immediately available for comment when contacted by CNBC about Lee’s comments.

Good IP is needed to train such AI models, Story’s Lee told CNBC, but he added that AI firms stand to lose long-term if they don’t adequately compensate the publishers and creators they’re sourcing those vast troves of IP data from.

“You need great IP going into AI to have a sustainable growth in AI. Without great human-created data, AI models are not going to be able to train themselves and improve themselves,” Lee said.

Not many startups are building tech designed specifically to combat IP theft by AI.

One project from the University of Chicago, called Glaze, offers artists a free app to combat the theft of their IP by AI tools. It makes subtle changes to artworks that are designed to disrupt AI models’ ability to read the works and mimic the artist’s style.

Story, which was founded in 2022, plans to use the fresh cash to build out its IP network infrastructure and onboard more developer partners. The company already has over 200 developers using its platform to enable content creation using programmable IP.

Lee added: “There’s a huge, amazing digital renaissance making everyone a creator or a studio, but at the same time, if no one’s actually compensating and actually getting the IP monetized right, it’s a suicidal action for AI in the long term.”