Private members-only forum

SOS: sB 942 compliance for startups

Started by RealtorJim_14 · Jun 12, 2025 · 9 replies
For informational purposes only. AI regulation is a rapidly evolving area of law. Consult counsel for compliance advice specific to your product and business.
RE
RealtorJim_14 OP

California's SB 942 (the California AI Transparency Act) was signed in September 2024 and takes effect January 1, 2026, and I'm trying to figure out what it will actually mean for startups building on top of foundation models.

We're a Series A company with an AI-powered legal document analysis tool. We use GPT-4 and Claude under the hood for summarization and clause extraction. Our product processes contracts and outputs structured summaries, risk flags, and suggested edits.

SB 942 requires "generative AI systems" to disclose when content is AI-generated and provide certain transparency information. But the implementation details are murky at best:

  • Do we need to watermark every output our tool generates?
  • Do we need to disclose which specific model we're using under the hood?
  • What counts as a "generative AI system" versus a tool that happens to use AI internally?
  • Our outputs are a mix of AI-generated text and template-based content — how do we handle hybrid outputs?

I've read the bill text three times and I'm still confused. Anyone dealt with this yet?

JN
404_justice_not_found_3

From an engineering perspective, the "manifest disclosure" requirement is the most concrete obligation and also the most implementable. The standard the law points toward is C2PA (Coalition for Content Provenance and Authenticity) — an open standard for embedding provenance metadata in digital content.

The good news: the major model providers are moving toward C2PA — OpenAI and Google have joined the coalition, and OpenAI already embeds C2PA metadata in DALL·E 3 image outputs. So if you're consuming those APIs for image generation, the metadata should be included in responses by default (or will be soon).

The challenge: C2PA was designed primarily for images and video. For text outputs (like OP's legal doc summaries), the standard is less mature. There's no universally adopted method for "watermarking" text in a way that persists through copy-paste. The "latent disclosure" requirement for text content is, frankly, an unsolved technical problem.

My prediction: for text-based AI applications, regulators will initially focus on the manifest disclosure (metadata) and public-facing transparency requirements, not the latent watermark. The technology just isn't there yet for robust text watermarking.
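Since there's no robust in-band watermark for text, one pragmatic approach to manifest disclosure is to carry provenance alongside the text in your own data model and never drop it as outputs move through the pipeline. A minimal sketch — the field names and the wrapper type are my own invention, not part of C2PA or any provider API:

```python
from dataclasses import dataclass, field, replace
from datetime import datetime, timezone

@dataclass(frozen=True)
class ProvenancedText:
    """Generated text plus the provenance metadata we must not strip."""
    text: str
    provider: str          # e.g. "openai" or "anthropic" (illustrative labels)
    model: str             # model identifier returned by the API
    generated_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    ai_generated: bool = True

def summarize_transform(doc: ProvenancedText) -> ProvenancedText:
    """Example pipeline step: transform the text, preserve provenance."""
    # replace() copies every provenance field unchanged; only `text` is new.
    return replace(doc, text=doc.text.upper())  # stand-in transformation

raw = ProvenancedText(
    text="Clause 4.2 limits liability...", provider="openai", model="gpt-4"
)
out = summarize_transform(raw)
assert out.ai_generated and out.model == "gpt-4"  # provenance survived
```

The point of the frozen dataclass is that downstream code can't silently mutate the text without going through a step that decides what happens to the provenance fields.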

RE
RealtorJim_14 OP

Thanks everyone — this is incredibly helpful. So if I'm understanding correctly, our main obligations as a "deployer" are:

  1. Don't strip C2PA metadata from model outputs (easy — we weren't doing this anyway)
  2. Disclose to users that our product uses AI (we already do this in our Terms and in-product)
  3. Maintain transparency documentation about how we use AI (need to create this)

That's much more manageable than I feared. The watermarking/latent disclosure stuff is on OpenAI and Anthropic as the "covered providers," not on us.
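Obligation #2 above, combined with the hybrid-output question from the original post, can be sketched in code: tag each output segment by source, and attach a disclosure whenever any segment is AI-generated. The segment model and disclosure wording here are illustrative, not statutory text:

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class Segment:
    """One piece of a hybrid output, tagged by where it came from."""
    text: str
    source: Literal["ai", "template"]

def render_with_disclosure(segments: list[Segment]) -> str:
    """Join segments; append a disclosure if any part is AI-generated."""
    body = "".join(s.text for s in segments)
    if any(s.source == "ai" for s in segments):
        # Illustrative wording only — actual disclosure language is a
        # question for counsel, not this sketch.
        body += "\n\n[Disclosure] Parts of this document were generated using AI."
    return body

doc = [
    Segment("RISK SUMMARY\n", "template"),
    Segment("Clause 7 shifts indemnification to the buyer.", "ai"),
]
print(render_with_disclosure(doc))
```

Tracking source at the segment level also answers the hybrid-output question: you disclose at the document level, but you retain the per-segment record if anyone ever asks which parts were AI-generated.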

Follow-up question: does anyone know how this interacts with trade secret protection? We have proprietary prompt engineering and fine-tuning that gives us a competitive advantage. If a competitor could use the SB 942 detection tools to reverse-engineer which model we're using and how we're using it, that's a real business concern.

SA
stressed_and_confused_11

From a business operations standpoint, I want to flag the compliance cost angle. For my startup clients, I'm seeing three categories of SB 942 compliance expense:

  • Low cost (deployers under 1M users): Mostly documentation and disclosure updates. $5-15K in legal fees plus minor engineering time. This is where most startups fall.
  • Medium cost (deployers approaching 1M users): Need to plan for "covered provider" obligations as you scale. $25-50K for proactive compliance architecture.
  • High cost (covered providers 1M+ users): C2PA implementation, detection tool development, ongoing monitoring. $100K+ easily, potentially much more for image/video AI companies.

The threshold matters a lot. If you're at 800K monthly users and growing, you need to be building toward compliance NOW, not scrambling when you cross 1M.
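The back-of-envelope threshold math can be sketched as a quick projection — pure compound-growth arithmetic, with no statutory meaning to the function or its names:

```python
import math

def months_to_threshold(current_users: int, monthly_growth: float,
                        threshold: int = 1_000_000) -> int:
    """Months until monthly users cross `threshold` at compound growth.

    monthly_growth is a fraction, e.g. 0.05 for 5% month-over-month.
    Returns 0 if already at or above the threshold.
    """
    if current_users >= threshold:
        return 0
    if monthly_growth <= 0:
        raise ValueError("monthly_growth must be positive")
    return math.ceil(
        math.log(threshold / current_users) / math.log(1 + monthly_growth)
    )

# 800K users growing 5% month-over-month cross 1M in 5 months.
months_to_threshold(800_000, 0.05)  # → 5
```

In other words, at healthy growth rates the runway between "comfortably a deployer" and "covered provider" is measured in months, which is the point being made above.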

Also worth noting: other states are watching California closely. Colorado's AI Act (SB 24-205) takes effect February 1, 2026, with different but overlapping requirements. This isn't going to be a California-only obligation for long.

TA
taxconfused_3

One more thing worth mentioning for the privacy-minded: SB 942 requires covered providers to make a free AI detection tool publicly available, so users can check whether a piece of content was created or altered by that provider's generative AI system. This creates something like a "right to know" for AI-generated content, loosely parallel to CCPA's right to know for personal data.

The practical implication: if a user receives content (say, a customer service email or a marketing message) and suspects it's AI-generated, they can run it through the provider's detection tool. That means covered providers need to maintain the disclosures and records that make such verification possible.

This is going to get interesting when people start using it to challenge AI-generated legal documents, medical summaries, or financial analyses. "Was this advice generated by AI?" is about to become a very common question — and companies will need to answer honestly.
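One way a provider could answer "was this generated by your system?" for text is an exact-match registry of content hashes captured at generation time. Note the limitation: any substantive edit to the text defeats the match, which is exactly the latent-disclosure gap discussed upthread. A sketch, with every name invented for illustration:

```python
import hashlib

class GenerationRegistry:
    """Record a SHA-256 digest of each output at generation time;
    answer was-this-ours lookups later by exact (normalized) match."""

    def __init__(self) -> None:
        self._records: dict[str, dict] = {}

    @staticmethod
    def _digest(text: str) -> str:
        # Normalize whitespace so trivial reflowing doesn't break the match.
        normalized = " ".join(text.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def record(self, text: str, model: str) -> None:
        self._records[self._digest(text)] = {"model": model}

    def was_generated_here(self, text: str) -> bool:
        return self._digest(text) in self._records

reg = GenerationRegistry()
reg.record("This contract contains an unusual indemnification clause.",
           model="gpt-4")
reg.was_generated_here("This contract contains  an unusual "
                       "indemnification clause.")   # True (whitespace-insensitive)
reg.was_generated_here("Totally different text")    # False
```

Storing digests rather than the text itself also sidesteps retaining client documents — relevant for a legal-tech product like OP's, where the outputs themselves may be confidential.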