Why Harry Potter is the copyright timebomb under generative AI models

  • 2023-05-30 08:00:00
  • Sifted

In early May, proposals for new copyright rules for generative AI models were given the green light. If the bill is passed, it would be the first legislation of its kind in the world.

This new legislation would put all large language model (LLM) companies in trouble, since many of their services are built on copyrighted texts. Needless to say, the proposal would have a global impact on companies in the field.

But why Harry Potter?

Stability AI, a renowned British start-up, said that its new StableLM service was trained on a new experimental dataset, itself built on an open-source dataset containing the text of over 190,000 pirated books, including J.K. Rowling's Harry Potter and George R.R. Martin's Game of Thrones.