Meta's AI Model 'Memorized' Huge Chunks of Books, Including 'Harry Potter' and '1984'

A recent analysis by research teams from Stanford, Cornell and West Virginia University (WVU), three prominent U.S. universities, raises a number of thorny questions about the relationship between artificial intelligence models and copyright. The unauthorized use of copyrighted books and novels to train this technology has been widely discussed of late, with critics alleging that the tech companies behind these products are infringing copyright.

According to the research, the models in question are not simply being “trained” on text material such as current novels and historical texts; rather, it appears that these artificial intelligences are storing entire copies of that content, which they can draw on when needed.

The researchers examined several large language models (LLMs) available online and subjected each one to the same test. They fed the model a short snippet of text from a novel and asked it to complete the lines that followed, thereby testing how much of the book it could reproduce from memory.
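
For readers curious about what such a test might look like in practice, here is a minimal Python sketch. It is a toy illustration of the prompt-and-complete idea described above, not the researchers' actual code. Every name in it (`memorization_rate`, `make_parrot`, `forgetful`, `SAMPLE`) is hypothetical, the "models" are simple stand-in functions rather than real LLMs, and the exact-match scoring is an assumption made for simplicity.

```python
from typing import Callable

# A stand-in "model": takes a prompt and a word count, returns a continuation.
Completer = Callable[[str, int], str]


def memorization_rate(text: str, complete: Completer,
                      prefix_len: int = 3, target_len: int = 3,
                      stride: int = 3) -> float:
    """Fraction of snippets whose true continuation the model reproduces exactly."""
    words = text.split()
    hits = 0
    trials = 0
    last_start = len(words) - prefix_len - target_len
    for start in range(0, last_start + 1, stride):
        prefix = words[start:start + prefix_len]
        target = words[start + prefix_len:start + prefix_len + target_len]
        # Ask the model to continue the snippet, then compare word by word.
        output = complete(" ".join(prefix), target_len).split()
        trials += 1
        if output[:target_len] == target:
            hits += 1
    return hits / trials if trials else 0.0


def make_parrot(memorized: str) -> Completer:
    """Toy model that has stored `memorized` verbatim (hypothetical stand-in)."""
    stored = memorized.split()

    def complete(prompt: str, n_words: int) -> str:
        p = prompt.split()
        for i in range(len(stored) - len(p) + 1):
            if stored[i:i + len(p)] == p:
                return " ".join(stored[i + len(p):i + len(p) + n_words])
        return ""  # prompt not found in stored text

    return complete


def forgetful(prompt: str, n_words: int) -> str:
    """Toy model that has memorized nothing and returns no continuation."""
    return ""


# Placeholder text written for this example (not taken from any book).
SAMPLE = ("the quick brown fox jumps over the lazy dog while the sleepy cat "
          "watches from a warm windowsill nearby")

parrot_score = memorization_rate(SAMPLE, make_parrot(SAMPLE))
forgetful_score = memorization_rate(SAMPLE, forgetful)
print(parrot_score, forgetful_score)
```

In this sketch, a model that has stored the text verbatim scores far higher than one that has not. A real evaluation would replace the stand-in functions with calls to an actual language model and would likely use a more forgiving measure than exact word-for-word matching.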