OpenAI's Legal Quagmire: Latest Class Action Lawsuits
OpenAI, the creator of the popular ChatGPT, is facing a new class action lawsuit that accuses the company of illicitly scraping data from the internet and using the stolen data to create its automated products. This lawsuit, filed by the Clarkson Law Firm in a Northern California court, is the latest in a series of legal challenges that question the very foundation of OpenAI’s business model.
OpenAI’s Journey: From Research Organization to Tech Titan
OpenAI’s journey began as a humble research organization, but it pivoted to a for-profit business in 2019. Since then, the company has been on a meteoric rise to the top of the tech industry. The launch of ChatGPT last November catapulted the company into the public eye, making it a household name.
However, as OpenAI lays the groundwork for future expansion, the controversial nature of its technology may jeopardize its ambitions. The AI industry is radical and new, and it’s only natural that legal and regulatory issues would arise. If legal challenges like the one filed this week prevail, they could undermine the existence of OpenAI’s most popular products and potentially threaten the budding AI industry that revolves around them.
The Clarkson Lawsuit: Allegations and Implications
A Theft-Based Business Model, Jointly with Microsoft
The Clarkson lawsuit alleges that OpenAI’s entire business model is based on theft. The company is accused of creating its products using stolen private information, including personally identifiable information, from hundreds of millions of internet users, including children of all ages, without their informed consent or knowledge.
Microsoft’s role in this is also scrutinized. OpenAI, initially an open nonprofit, has evolved into a commercial partner of the tech giant, with an estimated valuation of $29 billion. The lawsuit suggests that Microsoft has pushed OpenAI further toward this commercially dependent model, entangling the two entities and raising questions about their shared responsibility for the alleged practices.
OpenAI’s large language models, which power platforms like ChatGPT and DALL-E, are trained on massive amounts of data. Much of this data, the company has openly admitted, was scraped from the open internet. While most web scraping is legal, there are some exceptions. OpenAI has claimed that everything it does is above board, but it has also been repeatedly criticized for a lack of transparency regarding the sources of some of its data.
The lawsuit accuses OpenAI of violating multiple platforms’ terms of service agreements and various state and federal regulations, including privacy laws. The company is alleged to have systematically scraped 300 billion words from the internet, including personal information obtained without consent, in secret, and without registering as a data broker as required by law.
The lawsuit also highlights that OpenAI used the data it freely exploited to build commercial products that it is now attempting to sell back to the public for exorbitant sums of money. The lawsuit argues that without this unprecedented theft of private and copyrighted information, the OpenAI products would not be the multi-billion-dollar business they are today.
Whether the U.S. justice system agrees with the lawsuit’s definition of theft is yet to be determined. OpenAI has not yet commented on the new lawsuit.
Computer Fraud
- Violation of Electronic Communications Privacy Act, 18 U.S.C. §§ 2510, et seq.: This law is meant to protect people’s private communications from being intercepted or accessed without permission. The lawsuit alleges that OpenAI and Microsoft intentionally accessed and used private information from users’ computers without their consent. This includes any data that was sent or received, like messages or other types of communication. The plaintiffs are asking for compensation for this alleged violation, which could include money for damages, legal fees, and other related costs.
- Violation of the Computer Fraud and Abuse Act, 18 U.S.C. § 1030: This law is designed to protect computers from hacking and unauthorized access. The lawsuit alleges that OpenAI and Microsoft intentionally accessed users’ computers and obtained information without proper authorization. This could include any data stored on the computer or any activity that was done on the computer. Because of this alleged violation, the plaintiffs are saying they have the right to sue for damages and losses they may have suffered.
Illinois Biometric Information Privacy & Deceptive Trade Practices
Here’s a summary of the three counts alleging violations of Illinois law:
- Violation of Illinois’s Biometric Information Privacy Act, 740 ILCS 14/1, et seq.: The complaint alleges that the defendants systematically collected biometric identifiers and information, such as facial geometry, from Illinois residents without their consent, which is a violation of the Biometric Information Privacy Act (BIPA). The defendants are also accused of profiting from this biometric information, which is prohibited under BIPA. The plaintiffs are seeking damages and injunctive relief to enforce compliance with BIPA.
- Illinois Consumer Fraud and Deceptive Business Practices Act, 815 Ill. Comp. Stat. §§ 505, et seq.: The complaint alleges that the defendants engaged in deceptive and unfair trade practices. This includes misrepresenting the characteristics of their services, advertising services with the intent not to sell them as advertised, and other conduct that creates confusion or misunderstanding. The plaintiffs claim that these practices caused substantial injury to consumers, outweighing any benefits. They are seeking monetary and non-monetary relief.
- Illinois Consumer Fraud and Deceptive Business Practices Act 815 Ill. Comp. Stat. §§ 510/2, et seq.: This count repeats the allegations from the previous count, but specifically refers to the section of the Act that deals with deceptive trade practices. The defendants are accused of misrepresenting the quality of their services and creating confusion or misunderstanding among consumers. The plaintiffs are seeking damages for the injury and losses they claim to have suffered as a result of these alleged deceptive practices.
California Privacy Laws Violations
The complaint alleges several violations of California privacy laws:
- California Consumer Privacy Act (CCPA): The defendants are accused of failing to notify consumers about the collection of their personal information, as the CCPA requires. They allegedly used web-scraping technology to collect information from webpages across the internet, including consumers’ personal information, and allegedly intercepted and wiretapped users’ communications on various platforms, using the intercepted communications and gathered data to train their AI products. The defendants did not notify affected consumers of this extensive wiretapping, or that the information would be used for commercial purposes and the development of their products.
- California Online Privacy Protection Act (CalOPPA): The defendants are accused of violating CalOPPA by knowingly collecting information from minors under the age of thirteen without appropriate measures to ensure parental consent, and without ensuring that information about minors can be fully deleted from their products.
- California’s Data Broker Laws: The defendants are accused of failing to register as data brokers under California law as required. The “sale” of information includes “making it available” to others for consideration, which the defendants have allegedly done by commercializing the stolen data into ChatGPT and building a billion-dollar business from it.
- California Invasion of Privacy Act (CIPA): The defendants are accused of violating CIPA by intercepting communications and accessing, collecting, and tracking private information from platforms that integrated ChatGPT, from Microsoft platforms, and from ChatGPT itself, using the intercepted communications and gathered data to train their products.
The complaint seeks various forms of relief, including injunctions requiring the defendants to revise their privacy policies, fully disclose all information required under these laws, and delete all information previously collected in violation of these laws. The plaintiffs also seek restitution for the alleged unlawful business practices.
California Unfair Competition Law Violation
Count Four of the complaint alleges a violation of the California Unfair Competition Law (UCL), Business and Professions Code §§ 17200, et seq. The key points of this count are:
- Unfairness: The defendants’ conduct that breached California’s privacy laws is alleged to be unfair within the meaning of the UCL. The unfair prong of the UCL prohibits business practices that either offend an established public policy or are immoral, unethical, oppressive, unscrupulous, or substantially injurious to consumers.
- Unfair Business Acts: The defendants allegedly failed to disclose that they scraped information belonging to millions of internet users without the users’ consent. They also failed to disclose that they used the stolen information to train their products, again without the users’ consent. Furthermore, they failed to disclose that they were intercepting and tracking private information belonging to millions of ChatGPT users and the users of other platforms that integrated ChatGPT.
- Unfair Acts Tests: Unfair acts under the UCL have been interpreted using three different tests:
- Whether the public policy which is a predicate to a consumer unfair competition action under the unfair prong of the UCL is tethered to specific constitutional, statutory, or regulatory provisions.
- Whether the gravity of the harm to the consumer caused by the challenged business practice outweighs the utility of the defendant’s conduct.
- Whether the consumer injury is substantial, not outweighed by any countervailing benefits to consumers or competition, and is an injury that consumers themselves could not reasonably have avoided.
Failure to Warn
Count Fourteen of the complaint is about “Failure to Warn.” Here’s a summary of the key points:
Plaintiffs point out that there is a duty to warn consumers about the hazards inherent in the products. This allows consumers to either refrain from using the product altogether or use it carefully to avoid the danger.
Defendants in this case created AI technology and released it to the public. Plaintiffs claim that the AI products were defective due to inadequate warnings and insufficient testing before being made available to the public. Defendants knew that their technology was novel and that consumers did not fully understand its capabilities or how AI technology works in general.
Despite this knowledge, Defendants released their AI technology without adequately warning consumers about the dangers associated with it. Plaintiffs further allege that Defendants disclosed private information belonging to them and other class members without their consent. Defendants monitored, collected, and tracked users’ habits, preferences, thoughts, online activity, and geolocation data, including those of young children.
As a result of these unauthorized disclosures and the failure to provide adequate warnings, Plaintiffs’ and class members’ reasonable expectations of privacy in their private information were frustrated, leading to damages. Defendants collected and continue to collect personal data and information without consent, integrating it into their AI products and claiming the right to sell this data without notice.
Additionally, Defendants train their AI products on consumer input data, which remains in the system indefinitely without consumers being adequately informed. The lack of vetting and accuracy checking of this data leads to the spread of inaccurate information, invading consumers’ privacy and potentially disrupting their lives.
Plaintiffs seek injunctive relief, restitution, and any other available legal or equitable remedies. They argue that monetary damages alone are not sufficient to address the invasion of privacy and ongoing harm caused by Defendants’ actions unless they are restrained by court order.
New York Business Law Violations
Count Fifteen of the complaint is based on the New York General Business Law, specifically N.Y. Gen. Bus. Law §§ 349 et seq. Here’s a summary of the key points:
The alleged deceptive acts and practices include: a) Exploiting non-users and users of their products by stealing their data from web crawler caches without permission, including the data of minors. b) Knowing that they were collecting and profiting from individuals’ personal information, which posed a high risk. Defendants’ actions were negligent, knowing, willful, wanton, and reckless regarding the rights of the New York Plaintiff and New York Subclasses. c) Misrepresenting compliance with common law and statutory duties regarding the security and privacy of the plaintiff’s and subclass members’ data. d) Omitting and concealing the fact that they were stealing and profiting from the mass collection and analysis of the plaintiff’s and subclass members’ data without adequate consent. e) Omitting and concealing the fact that they did not comply with duties pertaining to the security and privacy of the data, including the inability to delete the data once it is incorporated into their large language models (LLMs) as training data.
Defendants’ representations and omissions were material because they were likely to deceive reasonable consumers about the terms of use of their products and the control mechanisms over the plaintiff’s and subclass members’ data. Defendants’ actions were intentional, knowing, and malicious, and they recklessly disregarded the rights of the New York Plaintiff and New York Subclasses.
As a result of Defendants’ deceptive and unlawful acts, the New York Plaintiff and New York Subclasses have suffered and will continue to suffer injury, monetary losses, and other damages. These acts affected the public interest and consumers at large, including millions of New York user and non-user subclass members.
The New York Plaintiff and New York Subclasses seek various forms of relief, including actual or statutory damages, treble damages, injunctive relief, and attorney’s fees and costs.
Risks from Unchecked AI Proliferation
The lawsuit also shifts focus to the broader risks associated with unchecked AI proliferation, arguing that the international community agrees that unchecked and lawless AI proliferation poses an existential threat. It catalogs these risks, including massive privacy violations, AI-fueled misinformation campaigns, targeted attacks, sex crimes, bias, hypercharged malware creation, and autonomous weapons.
The lawsuit suggests that these risks are not merely hypothetical but are already manifesting. For instance, it alleges that OpenAI’s practices have produced massive privacy violations, with personal data scraped and used without consent, and it warns that AI could likewise fuel the remaining harms on that list, from misinformation campaigns and targeted attacks to hypercharged malware and autonomous weapons.
Despite these risks, the lawsuit suggests that there is an opportunity on the other side. It implies that with proper regulation and oversight, the potential harms of AI can be mitigated, and the technology can be used in a way that respects privacy, consent, and other ethical considerations. However, it argues that this requires a significant shift in how companies like OpenAI operate and how AI is regulated more broadly.
Plaintiffs’ Demands
Plaintiffs are not only suing for money. The plaintiffs’ demands for relief are as follows:
A. Injunctive relief: The plaintiffs request a temporary freeze on commercial access to and development of the products until the defendants can demonstrate satisfactory completion of certain requirements. These include the establishment of an independent body responsible for approving product uses, implementation of accountability protocols, cybersecurity safeguards, transparency protocols, opt-out options for data collection, technological safety measures, a threat management program, and the creation of a monetary fund to compensate for past and ongoing misconduct.
B. Actual damages: The plaintiffs seek compensation for economic and non-economic harm, the specific amount of which will be determined at trial.
C. Attorneys’ fees and costs.
D. Treble damages: Treble damages refer to the awarding of damages that are triple the amount of the actual damages suffered by the plaintiff. In this case, the plaintiffs are requesting treble damages as allowable under applicable laws. Treble damages are often awarded in cases where the defendant’s conduct is deemed particularly egregious, and they serve as a way to deter similar behavior in the future. By tripling the damages, the court aims to provide a more substantial financial penalty to the defendant and compensate the plaintiff for the harm suffered.
E. Punitive damages: Punitive damages are additional damages that may be awarded to the plaintiff in certain cases to punish the defendant for their wrongful conduct and deter others from engaging in similar behavior. Punitive damages go beyond compensating the plaintiff for their losses and are intended to send a message that the defendant’s actions were highly reprehensible. The plaintiffs in this case are seeking punitive damages as allowable under applicable laws, indicating that they believe the defendants’ conduct warrants such additional punishment.
F. Exemplary damages: Exemplary damages are similar to punitive damages and are awarded to the plaintiff as a means of punishing the defendant for their actions. Exemplary damages are typically awarded in cases where the defendant’s behavior is considered willful, wanton, or malicious. These damages serve as an example or “exemplar” to others, highlighting the consequences of engaging in similar misconduct. The plaintiffs are requesting exemplary damages as allowable under applicable laws, suggesting that they believe the defendants’ actions warrant this additional form of punishment.
The plaintiffs also demand a jury trial on all triable issues.
OpenAI’s Mounting Legal Troubles
The Clarkson lawsuit is not the only legal challenge OpenAI is currently facing. The company has been subjected to an ever-growing list of legal attacks, many of which make similar arguments.
Another lawsuit was filed in California on behalf of numerous authors who claim their copyrighted works were scraped by OpenAI in its effort to train its algorithms. The suit accuses the company of stealing data to fuel its business and creating its products by “harvesting mass quantities” of copyrighted works without “consent, without credit, and without compensation.” It characterizes platforms like ChatGPT as being “infringing derivative works” made without the plaintiffs’ permission and in violation of their exclusive rights under the Copyright Act.
Both the Clarkson suit and the authors’ suit bear some resemblance to another lawsuit filed shortly after ChatGPT’s release last November. That suit, filed as a class action by the offices of Joseph Saveri in San Francisco, accuses OpenAI and its funder and partner Microsoft of having ripped off coders in an effort to train GitHub Copilot — an AI-driven virtual assistant. The lawsuit specifically accuses the companies of failing to adhere to the open-source licensing agreements that underpin much of the development world, claiming that they instead lifted and ingested the code without attribution, while also failing to meet other legal requirements. In May, a federal judge in California denied OpenAI’s motion to dismiss the case, allowing the legal challenge to move forward.
In Europe, OpenAI has faced similar inquiries from government regulators over its lack of privacy protections for users’ data. And in Australia, law firm Gordon Legal is suing OpenAI, the maker of ChatGPT, on behalf of a Victorian mayor.
The Bigger Picture: OpenAI’s Position and the Future of AI
All of this legal turmoil takes place against the backdrop of OpenAI’s meteoric ascent to Silicon Valley stardom — a precarious new position that the company is clearly fighting to maintain. As the company fends off legal assaults, OpenAI’s CEO, Sam Altman, has been attempting to influence how new laws will be built around his axis-shifting technology. Indeed, Altman has been courting governments all over the globe in an effort to lay the groundwork for a friendly regulatory environment. The company is clearly positioned to be the de facto leader in the AI industry — if it can fend off the ongoing challenges to its very existence, that is.
Conclusion
The legal challenges faced by OpenAI are not just about one company’s practices. They raise broader questions about the ethical and legal boundaries of AI development. As AI technology continues to evolve and become more integrated into our daily lives, it is crucial that we establish clear guidelines and regulations to ensure that these powerful tools are used responsibly and ethically. The outcome of these lawsuits could set important precedents for the future of AI, influencing how companies approach data collection, privacy, and intellectual property rights in the AI industry.
The legal landscape of AI is still being shaped, and the decisions made now will have far-reaching implications. As we watch the drama unfold, we must remember that at the heart of these lawsuits are fundamental questions about privacy, consent, and the responsible use of data. How we answer these questions will define the future of AI.