Journalist here with practical experience on this issue. My outlet discovered that several AI companies had scraped our entire archive of investigative articles going back to 2015. We documented this by using specific prompts designed to elicit verbatim or near-verbatim reproduction of our content, then compared the outputs against our published articles.
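For anyone who wants to run the same kind of comparison, here is a minimal sketch of one way to score how much of a model's output overlaps with a published article. It is not our exact tooling. The function names, the 8-word window, and the choice of word n-gram overlap as the metric are all my assumptions for illustration.

```python
import re

def word_ngrams(text, n=8):
    """Return the set of n-word sequences in text, lowercased, punctuation ignored."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(model_output, article, n=8):
    """Fraction of the output's n-grams that also appear in the article.

    Returns 0.0 when the output is shorter than n words.
    The window size n=8 is an assumed default, not a legal threshold.
    """
    out_grams = word_ngrams(model_output, n)
    if not out_grams:
        return 0.0
    shared = out_grams & word_ngrams(article, n)
    return len(shared) / len(out_grams)
```

A ratio near 1.0 means almost every 8-word run in the output also appears in the article, which is the kind of result worth saving alongside the prompt, the date, and the model version. Long exact runs are unlikely to occur by chance, but the score is only a screening aid, so keep the raw outputs as the actual evidence.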
Our legal team pursued two parallel tracks. First, we sent DMCA takedown notices to companies where we could identify specific outputs that reproduced our content. Second, we joined a coordinated legal action with other news organizations challenging the use of our content for AI training without licensing.
For individual content creators: document everything. Use the Wayback Machine to establish publication dates. Register your most valuable content with the Copyright Office: under 17 U.S.C. Section 412, statutory damages and attorney's fees are generally unavailable for infringement that began before registration, with a three-month grace period after first publication. The standard filing fee is $65 per work, and some categories of related works can be registered together as a group.
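On the Wayback Machine step, a rough sketch of looking up a capture programmatically, assuming the Internet Archive's public availability endpoint and its documented JSON shape. The function names and the example.com URL are placeholders of mine. Note that the endpoint returns the capture closest to the timestamp you pass, which is not necessarily the earliest one, and a capture only shows the page existed by that date.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"

def build_query(page_url, timestamp):
    """Build the lookup URL; timestamp is YYYYMMDD (or longer, up to YYYYMMDDhhmmss)."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url, "timestamp": timestamp})

def closest_snapshot(response):
    """Extract (timestamp, snapshot_url) from a parsed response, or None if no capture."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]

def lookup(page_url, timestamp):
    """Query the endpoint over the network and return the closest capture, if any."""
    with urlopen(build_query(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))

# Example call (needs network access; the URL is a placeholder):
# lookup("https://example.com/story", "20150101")
```

Save the returned snapshot URL with your records. If no capture exists yet, submitting the page to the archive now at least fixes a date going forward.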