By Jacinto Caballero-George
A few weeks ago, over one million works on the popular fanfiction website Archive of Our Own, often referred to as AO3, were stolen by AI companies. Most of these were unpublished drafts, but some were published fanfictions. At first, many users were confused, but AO3 was quick to explain the situation: AI companies had scraped this data to train their AI models on writing and language.
The AI company Sudowrite and OpenAI’s GPT-3 were proven to be two of the AI tools that did this. AO3, however, is not the only fanfiction website that this happened to. Some have said that works from less popular websites, like FanFiction.net and Wattpad, were also stolen. However, there is less evidence that this happened.
One of the main reasons these companies stole from AO3 was to train large language models, which are used to generate large amounts of text. Sudowrite used these works for its new “book writing” feature, in which the AI can write a full-length book based on a prompt. Many AO3 fanfictions are well over 100,000 words, meaning that an AI trained on these works could write a response of similar length.
AO3 was quick to take action. As some may remember, on July 10, 2023, a Russian hacking team tried to take down the entire AO3 website. However, in only 25 hours, AO3 had its website back up and fully running, so its speed in this debacle was not a surprise.
AO3 was able to completely delete the original data for these works because they were all stored on a backup server that was not breached. With that data deleted, OpenAI and Sudowrite were no longer able to use these fanfictions in their data to produce long-form text and responses. After AO3 was able to fully block these companies from scraping its data, it put the published works back up and returned the unpublished ones.