Judge Denies Nvidia’s Request to Dismiss AI Copyright Lawsuit

U.S. District Judge Jon Tigar has rejected Nvidia’s request to dismiss a copyright infringement lawsuit filed against it. Nvidia argued it wasn’t responsible for how its clients use its AI-powered NeMo Megatron Framework. According to TorrentFreak, Nvidia wanted the court to throw out claims of direct copyright infringement related to its use of the Bibliotik eBook torrent tracker, the Books3 dataset, and ‘The Pile’ dataset for training language models. Nvidia then referred to the Cox vs. Sony ruling, where the U.S. Supreme Court decided that a service provider isn’t liable for piracy carried out by its users.

Nvidia stated that its NeMo Megatron Framework has many “non-infringing uses” and that it didn’t promote it as a tool for piracy. The company felt this should fall under Justice Clarence Thomas’s opinion, which says, “Under our precedents, a company is not liable as a copyright infringer for merely providing a service to the general public with knowledge that it will be used by some to infringe copyrights.” Unfortunately for Nvidia, Judge Tigar disagreed with its argument, saying that the problem wasn’t the framework itself, but specific scripts within it that allegedly facilitated copyright infringement.

He explained that these scripts were designed to make it easier for users to automatically download and prepare ‘The Pile’ dataset, which the complainants claim contained copyrighted work. “The scripts are alleged to have no other purpose than to speed up the process of infringement, unlike the digital video recorder systems at issue in Sony Corp. or the internet service provided in Cox,” Judge Tigar wrote.

Bibliotik is a private eBook torrent tracker, which reportedly contains over 197,000 books. Its contents were then included in the Books3 dataset, which itself was part of ‘The Pile’, a dataset of over 800 gigabytes. ‘The Pile’ was then used to train Nvidia’s large language models (LLMs), leading several authors to file a class-action lawsuit against the company for copyright infringement.

There have been similar copyright infringement cases involving AI companies using data for training their models. Besides this case against Nvidia, Meta has also been facing a similar lawsuit since last year. Meta even defended itself by arguing that using pirated material is legal if you don’t actively share the content.

Google has also been pushing for AI data scraping to be considered fair use, stating that it wants “copyright systems that allow appropriate and fair use of copyrighted content to enable the training of AI models in Australia on a broad and diverse range of data while supporting workable opt-outs for entities that prefer their data not to be trained using AI systems.”

With this decision, the authors’ class-action lawsuit against Nvidia will move forward, and we will likely learn more details as the case continues. We don’t have a date yet for the next hearing. Still, we expect this to be a multi-year battle as the AI giant fights the writers whose work was allegedly infringed.
