Using AI To Reproduce Content: What’s Legal, What’s Not
AI makes it insanely easy to do things that used to be slow and painful:
- Turn an article into a new “blog post” in your own voice.
- Have a model spit out code that looks a lot like what’s on GitHub.
- Ask for “a version of this chapter that’s not plagiarized.”
- Generate an image that looks suspiciously like something from Getty or a famous artist.
The problem: putting AI in between you and someone else’s work doesn’t magically wipe away copyright or contract law.
This guide walks through:
- When AI-assisted reproduction is low-risk vs dangerous,
- How courts and the Copyright Office are thinking about AI outputs,
- Practical rules of thumb for using AI without stepping on landmines.
🧩 “Reproducing Content” With AI: What Are You Actually Doing?
Different uses look similar from the user’s perspective, but they’re very different legally.
| 💡 What You Ask AI To Do | What’s Happening Under The Hood | Why It Might Be A Problem |
|---|---|---|
| “Summarize this article” | Model digests the original and outputs a shorter version in its own words. | Usually fine if you’re not copying structure/phrasing; still risky if summary is too close or you publish it as a substitute. |
| “Rewrite this paragraph so it’s not plagiarism” | Model keeps key ideas and often the same structure, just swaps words. | Can still be a derivative work very close to the original; AI paraphrasing doesn’t make infringement disappear. |
| “Translate this book” | Model outputs a translation of protected content. | Translation is a derivative work; you usually need the original rightsholder’s permission to publish or sell it. |
| “Generate an image in the style of [Artist]” | Model pulls from training to mimic stylistic patterns. | Style alone is murky; if the output copies protected expression or distinctive elements, you can hit copyright/right-of-publicity issues. (jipel.law.nyu.edu) |
| “Give me code that works like this proprietary library” | Model reproduces or closely tracks code in its training data. | If it regurgitates protected code from someone else’s repo or product, that’s classic infringement. |
| “Rewrite Westlaw headnotes in your own words” | Model shadows Westlaw’s editorial content and structure. | Exactly the kind of use a court already found infringing and not fair use. (Reuters) |
The key question is not “did AI touch it?” but “does the output copy protected expression from someone else?”
⚖️ Copyright 101 For AI Reproduction
At the core, copyright law gives rightsholders exclusive rights to:
- Reproduce their work,
- Prepare derivative works (translations, adaptations, rephrasings),
- Distribute, display, and perform the work.
Using AI as a tool doesn’t change those basic rights.
Training vs Output: Courts Are Treating Them Differently
Recent cases and commentary increasingly separate:
- Using works in training (copying into a model’s training pipeline), and
- What the model outputs (does it spit back something too close?).
Some courts have suggested that reproducing works inside a training process can be fair use in certain circumstances (for example, reproducing books internally to teach a system language patterns), but they leave the door wide open to infringement claims when outputs compete with or closely mimic the original works. (fenwick.com)
At the same time, in the Thomson Reuters v. Ross Intelligence decision, a court held that using Westlaw’s editorial headnotes to power a competing AI legal research tool was not fair use:
- The headnotes were creative, copyright-protected material.
- Ross’s use was commercial and aimed to replace Westlaw’s product.
- The copying was substantial and not transformative enough. (Reuters)
Takeaway: even if training might be defensible in some contexts, using AI to reproduce someone else’s value-added content for a competing product is very much in the danger zone.
AI Outputs And Human Authorship
The US Copyright Office has been explicit:
- Purely AI-generated content without meaningful human creative input is not protected by copyright. (copyright.gov)
- Humans can have copyright in:
  - The selection, coordination, and arrangement of AI outputs,
  - Creative editing or modification of those outputs,
  - Combined works where human expression is meaningfully added.
That cuts both ways:
- You don’t automatically own full rights in raw AI output.
- But if your AI output copies someone else, you can still be liable for infringement, even if your own rights in the output are limited.
🎨 Images, Styles, And “That Looks Like My Work”
AI image tools raise a different flavor of “reproduction” question.
- In Andersen v. Stability AI, artists challenged the use of their work to train image generators and alleged that outputs could reproduce key elements of their art. A court let core copyright claims proceed against some defendants, signaling real legal risk when outputs are too close to particular works. (jipel.law.nyu.edu)
- In a UK case between Getty Images and Stability AI, the court ultimately ruled that the AI model itself was not an “infringing copy” of Getty’s images under UK law, and Getty’s main copyright training claim was dropped for lack of UK-based training evidence. However, the court did find trademark infringement when generated images included a fake Getty watermark. (Latham & Watkins)
Practical translation for users:
- Asking for “something in the style of impressionism” is different from “recreate this specific Getty photo” or “give me something that looks exactly like this artist’s painting.”
- When outputs contain distinctive logos, watermarks, or signature motifs, you’re stepping into obvious infringement and trademark trouble.
📚 Fair Use: Helpful Doctrine, Not A Free Pass
People sometimes assume that because AI “changed” the content a bit, it’s automatically fair use. That’s not how courts see it.
Fair use is a multi-factor test, looking at:
- Purpose and character of your use (commercial? transformative?),
- Nature of the original work,
- Amount and substantiality used,
- Market impact on the original.
Courts have been clear in recent AI cases:
- Using someone else’s editorial content to build a direct competitor is unlikely to be fair use, even if an AI model is involved. (Reuters)
- Internal, non-public reproduction for certain kinds of training may be more defensible, but that doesn’t bless downstream outputs that replace the original in the market. (fenwick.com)
For AI users, the simplest heuristic:
If the output is close enough that a reader, viewer, or court would see it as a substitute for the original, fair use is a weak shield.
🧪 Common AI Reproduction Use Cases, Ranked By Risk
Here’s a pragmatic “heat map” for typical ways people use AI to reproduce content.
| 🔍 Use Case | Risk Level | Why |
|---|---|---|
| Ask AI to summarize a publicly available article just for your own understanding | 🟢 Low | Private summary, no distribution; still wise to credit the source if you end up using it. |
| Use AI to summarize a paywalled/licensed article and publish the summary as your own SEO post | 🟡/🔴 Medium–High | Can substitute for the original; copying structure and key expression; can breach the site’s ToS. |
| Paste someone’s blog post and ask AI to “rewrite in different words so Google can’t see it’s the same” | 🔴 High | Classic attempt to disguise copying; AI paraphrase doesn’t change underlying derivative nature. |
| Ask AI to generate generic marketing copy on a topic without feeding in specific third-party text | 🟢 Low | Risk mostly around coincidence; still review for accidental close matches (a quick overlap check is sketched below this table). |
| Generate images “in the style of [famous illustrator]” for commercial use | 🟡 Medium | Style per se is murky, but if outputs echo recognizable composition/characters, risk increases. |
| Ask AI to recreate or “clean up” specific commercial photos, logos, or product shots | 🔴 High | Directly targets protected works; creates close derivatives or confusingly similar images. |
| Ask AI to produce code “similar to” proprietary code you don’t have rights to | 🔴 High | If the output mirrors structure/logic/textual expression from protected code, that’s textbook infringement. |
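“Review for accidental close matches” is easy to say and vague to do. One concrete, if crude, pre-publication check is to measure how many multi-word phrases an AI output shares verbatim with its source. Below is a minimal Python sketch; the 5-word window, the file names, and the 15% threshold are illustrative assumptions, not legal standards, and passing this check is not a fair use opinion.

```python
import re

def ngrams(text: str, n: int = 5) -> set:
    """Lowercase word n-grams, punctuation stripped."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(source: str, output: str, n: int = 5) -> float:
    """Fraction of the output's n-grams that appear verbatim in the source."""
    src, out = ngrams(source, n), ngrams(output, n)
    return len(out & src) / len(out) if out else 0.0

# Hypothetical file names; substitute your own texts.
source_text = open("original_article.txt", encoding="utf-8").read()
ai_output = open("ai_summary.txt", encoding="utf-8").read()

ratio = overlap_ratio(source_text, ai_output)
print(f"{ratio:.0%} of 5-word phrases match the source verbatim")
if ratio > 0.15:  # illustrative threshold, not a legal line
    print("Too close: restructure and rewrite in your own words.")
```

Mind the limits: verbatim n-gram overlap catches word-for-word copying but not structural paraphrase, which can still make a derivative work. Treat a clean score as a floor, not clearance.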
🧾 Contracts, ToS, And Academic / Professional Rules
Even when copyright risk feels low, you can still get hit via contract or professional rules.
- Terms of service for research databases, SaaS tools, and course platforms often forbid:
  - Bulk download,
  - Automated copying,
  - Using their content to train or test competing AI tools.
- Universities and schools are issuing explicit AI policies:
  - Using AI to reproduce or closely paraphrase sources without citation is still plagiarism, regardless of how “different” the words look.
  - “The AI wrote it for me” is not an acceptable defense.
- Clients, employers, and regulators increasingly expect you to:
  - Disclose AI use when it materially affects work product,
  - Not feed confidential or licensed data into tools in violation of NDAs or license agreements.
The FTC has been reminding companies there is no AI exemption from existing laws against deception, fraud, or IP misuse. Misrepresenting AI-reproduced work as entirely original or misusing others’ IP through AI tools can trigger enforcement just like any other tech-enabled scheme. (Federal Trade Commission)
🧭 Practical Rules Of Thumb For Using AI Without Getting Burned
You don’t need to become an IP scholar. A handful of rules will get you most of the way there.
| ✅ Safer Habits | 🚫 Risky Habits |
|---|---|
| Use AI to ideate (headlines, angles, outlines) rather than to mechanically rewrite someone else’s article. | Feeding full articles or books into AI with the goal of publishing a “different” version that competes with the original. |
| Treat AI like a research assistant, not a photocopier: read inputs, then write in your own structure and voice. | Treating AI as a “spin bot” whose job is to hide the fact you copied another creator’s work. |
| When you must use specific sources, credit them and keep AI’s role to summarizing or helping you understand. | Stripping attribution, presenting close paraphrases as if they’re entirely your original work. |
| For visuals, favor prompts that describe concepts, not “make this exact Getty image without the watermark.” | Asking for clones or near-clones of specific photos, logos, characters, or UI screens you don’t own. |
| For code, use AI on problems you understand, and compare output with known open-source patterns you have rights to use (a quick scan is sketched after this table). | Asking AI to “recreate the source code of [commercial product]” or dropping decompiled code in and asking for a “clean” variant. |
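To make the “compare output” habit concrete, here is a minimal sketch using Python’s standard difflib to scan generated code for long runs of lines copied verbatim from a reference codebase. The paths and the 10-line threshold are hypothetical illustrations; a clean scan does not prove originality, it just catches the most blatant regurgitation.

```python
import difflib
from pathlib import Path

def longest_shared_run(generated: str, reference: str) -> tuple:
    """Longest run of identical consecutive lines between two files."""
    gen, ref = generated.splitlines(), reference.splitlines()
    matcher = difflib.SequenceMatcher(None, gen, ref, autojunk=False)
    m = matcher.find_longest_match(0, len(gen), 0, len(ref))
    return m.size, "\n".join(gen[m.a:m.a + m.size])

# Hypothetical paths: your AI output vs. code you do NOT have rights to copy.
generated = Path("ai_output.py").read_text(encoding="utf-8")
for ref_file in Path("reference_repo").rglob("*.py"):
    size, snippet = longest_shared_run(generated, ref_file.read_text(encoding="utf-8"))
    if size >= 10:  # illustrative threshold: 10+ identical lines deserves a human look
        print(f"{ref_file}: {size} identical consecutive lines\n{snippet}\n")
```

Line-level matching misses renamed variables and reordered functions, so treat this as a smoke test and pair it with human review before shipping anything commercial.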
A decent mental test:
If you removed AI from the picture and did the same thing manually, would it obviously look like infringement, plagiarism, or a ToS breach?
If yes, adding AI to the process doesn’t make it better. It just makes it faster – which can make the legal consequences arrive faster too.