Figma and Salesforce Sued Over AI Training Data
Why Your “Friendly SaaS Vendor” Is Suddenly a Data-Rights Adversary 🤖
Figma and Salesforce are now both defendants in high-profile lawsuits over how they trained their AI models—and more importantly, whose data they used to do it.
- In Figma’s case, the allegation is that it quietly turned customer design files into AI training fuel, despite earlier assurances that it wouldn’t.
- In Salesforce’s case, authors say the company used thousands of pirated books in datasets like RedPajama and The Pile to train its xGen models, while its CEO publicly criticized other companies for using “stolen data.”
If you are a B2B buyer, SaaS founder, or anyone drafting DPAs and AI addenda, these cases are not just tech gossip. They are the roadmap for your next dispute if you don’t lock down data rights in your contracts.
Two Lawsuits, Two Types of Data, One Theme: “You Used Our Stuff”
Here’s the 10,000-foot comparison:
| 🏢 Defendant | 📍 Court & Filing | 📚 What Data Is at Issue | ⚖️ Core Legal Theories | 💣 Why It’s Interesting |
|---|---|---|---|---|
| Figma 📐 | N.D. Cal., proposed class action filed Nov. 21, 2025 | Customer design files, layers, text, images built inside Figma | Misappropriation of trade secrets, unlawful access, misrepresentation, data privacy violations | Focuses on enterprise customer IP, not public web data; framed heavily as a broken promise / contract story |
| Salesforce ☁️ | N.D. Cal., class action filed Oct. 15–16, 2025 | Thousands of allegedly pirated books in training datasets (RedPajama, The Pile) | Copyright infringement, DMCA-type theories, unjust enrichment | Classic “pirated books” fact pattern; juxtaposed with CEO’s public stance against “stolen” AI data |
Both cases sit in the shadow of the $1.5B Anthropic settlement with authors whose books were pulled from pirate sites for training data.
The message from the plaintiffs’ bar is pretty clear:
“If Anthropic paid, why not you?”
Figma: When “Your Files Stay Yours” Meets AI Training 📐🤖
The Figma case (Khan v. Figma Inc.) is notable because it is not about scraping public web content. It’s about what your paid, logged-in SaaS product does with the work you create inside it.
According to the complaint and coverage:
- Figma allegedly used customers’ proprietary design files—layouts, components, text, images, layer metadata—to train its generative AI tools without clear, informed consent.
- Plaintiffs say Figma auto-opted users into AI training, despite previous statements and marketing suggesting that customer content would not be used that way without permission.
- They claim this helped Figma boost its valuation around its 2025 IPO: Reuters mentions a $1.2B raise, other reports peg the implied valuation much higher, and plaintiffs argue the customer IP used for training could be worth “tens or hundreds of billions.” (Reuters)
Figma publicly denies that it trains on customer content without permission and says its policies are being misunderstood. But the complaint is structured to look less like a pure privacy case and more like a mix of trade secret theft and broken commercial assurances. (Longbridge SG)
The key allegation is simple and dangerous:
You told us our designs were safe and under our control. Then you quietly turned them into training data.
For enterprise design teams shipping confidential product plans through Figma, that’s not a theoretical issue—it’s a trade-secret threat.
Salesforce: Pirated Books, Datasets, and Public Hypocrisy ☁️📚
The Salesforce suit is much closer to the Anthropic / OpenAI line of cases, but with its own twist.
Authors Molly Tanzer and Jennifer Gilmore allege that:
- Salesforce used thousands of pirated books, including theirs, by ingesting datasets like RedPajama and The Pile to train its xGen language models. (Reuters)
- These datasets were assembled from known pirate sources, so the company allegedly knew or should have known that they contained unlicensed works.
- The suit emphasizes Salesforce CEO Marc Benioff’s prior criticism of AI companies using “stolen” training data and his statements that paying creators would be “easy to do,” painting a picture of corporate hypocrisy. (Business Insurance)
Legally, it’s a straight copyright case with a strong narrative hook:
| 📚 Element | 🔍 Plaintiffs’ Framing |
|---|---|
| Source of works | “Pirated books” from datasets whose origins are widely discussed in the AI world |
| Use | Downloading, storing, and using full copies to train xGen models |
| Ongoing conduct | Continued storage and processing, not just historical ingestion |
| CEO statements | Benioff’s public statements read as quasi-admissions of what should have been done |
The factual overlap with the Anthropic case (pirate-sourced datasets, authors seeking compensation) makes it easier for plaintiffs to say: “Courts and companies already treat this as wrongful. We just haven’t litigated it against Salesforce yet.” (People.com)
Why These Cases Should Terrify (or Focus) B2B SaaS Users 🧩
From the customer side, Figma and Salesforce illustrate two different but related problems:
- Figma-type risk: your own proprietary content inside a vendor’s SaaS product is quietly treated as model fuel.
- Salesforce-type risk: your vendor’s model is trained on other people’s allegedly infringing content, and you’re now re-using that model inside your workflows.
For a typical enterprise buyer, the real questions are:
- Can my vendor train on our data?
- If yes, for what purposes (maintenance, product improvement, general model training, resale)?
- If their model turns out to be trained on infringing content, who carries the IP risk downstream—them or us?
Right now, many DPAs and SOWs answer those questions with a vague “we comply with all applicable laws” and a buried reference to the vendor’s online privacy policy. That’s not going to survive this litigation wave.
The Contract Problem: Data Rights Clauses Stuck in the Pre-AI Era 📜⚙️
Most legacy SaaS contracts were never written with “model training” in mind. They distinguish between:
- Customer Data (belongs to you, used to provide the service), and
- Service Data / Aggregated Data (usage metrics, logs, etc., used to improve the product).
AI training sits awkwardly between those. Vendors increasingly treat training as “improvement of the service.” Customers often view training on their actual content as a separate, licensable use.
Figma and Salesforce are the two archetypes of what happens when that gap is not negotiated clearly.
Here’s how a more AI-aware contract architecture looks:
| 🔐 Clause Type | 🧠 What It Should Address | 🧷 Why It Matters in Light of Figma / Salesforce |
|---|---|---|
| Data-Use Grant | Spell out whether the vendor may use Customer Content (not just logs) for (a) providing the service, (b) improving it, (c) training generalized models. | Makes it much harder for a vendor to argue “improvement” includes training on your raw creative files or full text. |
| Training Opt-In / Opt-Out | Separate, express consent for model training, ideally with project-level or tenant-level controls. | Prevents auto-opt-in situations like those alleged in Figma, and gives you a paper trail of choices (see the consent sketch below the table). (Reuters) |
| Confidential / Trade Secret Protection | Treat design files, source code, unpublished product plans, etc., as confidential/trade secrets with narrow exceptions. | Supports misappropriation claims if the vendor re-uses that content in ways never negotiated. (Prism Media) |
| Third-Party Content Warranties & Indemnity | Vendor warrants it has rights to training data; indemnifies you against IP claims tied to its models. | Directly targets Salesforce-type risk: if their model is built on allegedly pirated books, you have a contractual backstop. (Saveri Law Firm) |
| Audit / Transparency | Right to high-level disclosure of training sources (categories, not every file) and to specific info if a claim is asserted. | Makes it harder for vendors to hide behind “proprietary” training pipelines when you’re on the hook in a third-party suit. |
| Sunset / Deletion of Training Data | Mechanism for demanding deletion or exclusion of your content from future model versions when the contract ends or upon breach. | Lines up with remedies seen in Anthropic settlement (destroying copies / excluding works from future use). (People.com) |
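To make the Training Opt-In / Opt-Out row concrete, here is a minimal sketch of what tenant-level training consent could look like in code. Everything here is hypothetical (the type names, fields, and functions are illustrative, not any real vendor’s API), but it shows the structural point: consent for providing the service, improving it, and training generalized models are separate, recorded choices, and training defaults to off.

```typescript
// Hypothetical tenant-level training-consent model. All names are
// illustrative, not any real vendor's API. The core idea: consent for
// (a) providing the service, (b) improving it, and (c) training
// generalized models are separate, recorded choices.

type DataUsePurpose = "provide_service" | "improve_service" | "train_general_models";

interface TrainingConsent {
  tenantId: string;
  purpose: DataUsePurpose;
  granted: boolean;            // an explicit choice, never defaulted to true
  decidedBy: string;           // who made the call (your paper trail)
  decidedAt: Date;             // when it was made
  scope: "tenant" | "project";
  projectId?: string;          // set when scope === "project"
}

// Default posture at signing: only "provide_service" is implied by the
// contract; every other purpose requires an affirmative opt-in.
function defaultConsents(tenantId: string, signedBy: string): TrainingConsent[] {
  const base = { tenantId, decidedBy: signedBy, decidedAt: new Date(), scope: "tenant" as const };
  return [
    { ...base, purpose: "provide_service", granted: true },
    { ...base, purpose: "improve_service", granted: false },
    { ...base, purpose: "train_general_models", granted: false },
  ];
}

// A training pipeline should refuse content unless an explicit grant
// covers this tenant (and, for project-scoped grants, this project).
function mayTrainOn(consents: TrainingConsent[], tenantId: string, projectId?: string): boolean {
  return consents.some(
    (c) =>
      c.tenantId === tenantId &&
      c.purpose === "train_general_models" &&
      c.granted &&
      (c.scope === "tenant" || c.projectId === projectId)
  );
}
```

The legally useful part is the audit trail: `decidedBy` and `decidedAt` turn each opt-in or opt-out into evidence, which is exactly the paper trail the table above calls for.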
Most customers don’t negotiate all of this. But after Figma and Salesforce, not negotiating it has a clear cost.
What If You Suspect Your Vendor Trained on Your Data? Demand Letters 101 ✉️
There are two common fact patterns where your next move is a demand letter rather than quiet acceptance:
- You discover, or strongly suspect, that a SaaS vendor used your proprietary content (designs, code snippets, documents) to train its model without an explicit grant.
- You learn that the vendor’s model itself is under fire (Salesforce-style), and you worry about downstream liability for using it in your product or workflow.
A well-structured demand letter in this context usually has three jobs:
- Information – pin the vendor down on what they did and under what theory.
- Preservation – demand preservation of logs, training data, and contractual history.
- Positioning – reserve your rights and frame this as a breach of contract / misrepresentation / confidentiality issue, not just “hurt feelings.”
In a Figma-style scenario, you’re typically focusing on:
- what exactly the contract, ToS, and DPA said about data use;
- any marketing or sales assurances (“your designs are safe,” “we never train on your files without consent”); (Reuters)
- what you want fixed: opt-out of training, separation of your tenant from shared models, deletion of existing training copies where feasible, and potentially monetary compensation if trade secrets were exposed.
In a Salesforce-style scenario, the demand may be more defensive:
- whether you have AI IP indemnity from the vendor;
- what representations the vendor made about training sources;
- whether the vendor will defend and indemnify you if third-party claims land.
Either way, these are not “support tickets.” They’re pre-litigation documents that will be read later by a judge if things go sideways.
For Vendors: Don’t Be Figma or Salesforce by Accident 🧱
From the vendor side, the instinct is often to keep policies as broad as possible: “we can use your content to improve the service.” That’s understandable. But after these cases, the “don’t ask, don’t tell” approach to training data is starting to look like malpractice.
A more sustainable posture:
- Segment training rights. Treat bug-fixing and maintaining the service differently from training generalized models. Get separate consent.
- Align marketing with contracts. Don’t promise “we never train on your content” if your DPA says you can. That’s essentially the Figma theory in one sentence. (Reuters)
- Clean your upstream datasets. If your model relies on datasets like RedPajama or The Pile, you need a real theory of rights, not wishful thinking (see the provenance sketch after this list). Salesforce is being sued precisely because plaintiffs say those datasets were sourced from pirated material. (Salesforce Ben)
- Offer real opt-outs for enterprises. Many large customers will live with some training use if they can meaningfully opt out where it matters (sensitive projects, regulated data).
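As referenced above, here is a minimal provenance-gate sketch for the “clean your upstream datasets” point. All names are hypothetical; the structural idea is that every source entering a training corpus carries a documented rights basis, and anything with disputed origin is excluded by default rather than waved through.

```typescript
// Hypothetical provenance gate for a training-data pipeline. Names are
// illustrative. The idea: every source entering the corpus carries a
// documented rights basis, and anything with disputed origin is
// excluded by default rather than waved through.

type RightsBasis =
  | { kind: "licensed"; licenseId: string }  // negotiated license on file
  | { kind: "owned" }                        // first-party content with training consent
  | { kind: "public_domain" }
  | { kind: "unknown" };                     // e.g., bulk datasets of disputed origin

interface CorpusSource {
  sourceId: string;
  description: string;
  rights: RightsBasis;
}

function admissibleForTraining(source: CorpusSource): boolean {
  // "unknown" is the RedPajama / The Pile problem: if you cannot
  // document where the works came from, they do not clear the gate.
  return source.rights.kind !== "unknown";
}

const sources: CorpusSource[] = [
  { sourceId: "ds-001", description: "Licensed news archive", rights: { kind: "licensed", licenseId: "LIC-2025-014" } },
  { sourceId: "ds-002", description: "Bulk books dataset, origin disputed", rights: { kind: "unknown" } },
];

const corpus = sources.filter(admissibleForTraining);
console.log(corpus.map((s) => s.sourceId)); // ["ds-001"]
```

A filter this simple obviously does not resolve the legal questions, but it forces the “real theory of rights” to exist as recorded data before ingestion, which is exactly what plaintiffs say was missing.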
The alternative is to let plaintiffs’ lawyers, not your product team, decide how your AI stack evolves.
The Bigger Picture: Training Data Litigation as the New Normal ⚖️
Taken together with Anthropic’s settlement and ongoing suits against Cohere, OpenAI, and others, the Figma and Salesforce cases mark a shift:
- Public web scraping is no longer the only flashpoint; inside-the-app data use is now front and center. (Prism Media)
- The plaintiffs’ bar has a growing playbook—pirated datasets for some cases, misrepresented SaaS data practices for others.
- Contract lawyers and in-house counsel are suddenly very important in AI risk management, because data-use rights are now a negotiated economic term, not boilerplate.
If your company builds or buys SaaS and AI, the practical takeaway is straightforward:
- Treat data-use clauses, training rights, and AI indemnities as core deal terms, not fine print.
- Assume that any mismatch between what you say in marketing and what you do in product will be quoted back to you in a complaint.
- Have a demand-letter playbook ready—on both offense and defense—before you find your name next to Figma and Salesforce in the next wave of AI-training cases.