Can AI Outperform Humans in Literary Style and Copyright?

Dive into the fascinating intersection of artificial intelligence and copyright law with Rupert Marais, a renowned expert in intellectual property and AI ethics. With a deep understanding of the legal and ethical challenges surrounding AI training on copyrighted materials, Rupert offers unparalleled insights into a groundbreaking study that reveals readers’ surprising preference for AI-generated texts over human-written works after fine-tuning. In this interview, we explore the implications of these findings for the literary world, the publishing industry, and the ongoing debates about fair use in copyright law, as well as the broader impact on creative fields. Join us for a thought-provoking conversation about the future of creativity in the age of AI.

Can you share what sparked your interest in exploring how AI-generated texts compare to human-written works in mimicking famous authors’ styles?

My interest stemmed from the growing tension between technological innovation and intellectual property rights. AI’s ability to replicate human creativity is advancing at an incredible pace, and I wanted to understand if it could truly capture the nuanced styles of celebrated authors. With numerous lawsuits emerging over the use of copyrighted works to train AI models, it felt urgent to investigate not just the technical capabilities, but also the ethical and legal ramifications of this technology.

How did you approach selecting the styles of famous authors as the foundation for this research?

We wanted to test AI’s potential against the best of human creativity, so focusing on famous, award-winning authors made sense. Their styles are distinct, widely recognized, and often deeply personal, which provided a rigorous benchmark. It also tied directly into the copyright debate, as many of these authors’ works are protected, raising questions about whether AI training on such material respects or violates their rights.

What was your process for choosing the specific group of award-winning authors for the study?

We aimed for diversity in genre, cultural background, and writing style to ensure a comprehensive test. We selected 50 authors who had won major literary awards, as their works represent a gold standard in creativity. This selection wasn’t just about prestige—it was about challenging the AI to emulate a wide range of voices, from minimalist to ornate, across different eras and perspectives.

Can you describe how you went about recruiting participants from top writing programs to contribute human-written excerpts?

We reached out to 28 candidates from leading Master of Fine Arts (MFA) programs across the US. These are individuals trained in the craft of writing, often with a deep understanding of literary styles. We invited them to create excerpts mimicking the chosen authors, ensuring they had the skills to produce high-quality imitations. It was a meticulous process to balance talent and availability, but their contributions were vital for a fair comparison.

How did you design the evaluation to ensure fairness when judging AI-generated texts against human-written ones?

Fairness was paramount, so we implemented blind pairwise evaluations. Neither the expert readers—also MFA candidates—nor the lay readers knew whether they were reading AI or human work. We presented the excerpts without identifiers and randomized the order to eliminate bias. This way, judgments were based purely on the text’s quality and stylistic fidelity to the original author.
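For readers who want to picture the mechanics, here is a minimal Python sketch of how blind pairwise presentation could be set up. The function name, neutral labels, and pairing scheme are illustrative assumptions, not the study's actual protocol.

```python
import random

def build_blind_pairs(ai_excerpts, human_excerpts, seed=0):
    """Pair AI and human excerpts written in the same author's style,
    strip source labels, and randomize which one appears first."""
    rng = random.Random(seed)
    pairs = []
    for ai_text, human_text in zip(ai_excerpts, human_excerpts):
        items = [("ai", ai_text), ("human", human_text)]
        rng.shuffle(items)  # presentation order carries no signal about the source
        pairs.append({
            "first": items[0][1],
            "second": items[1][1],
            # Ground-truth key kept separately for later analysis; readers never see it.
            "_key": {"first": items[0][0], "second": items[1][0]},
        })
    return pairs

# Toy usage: one AI and one human excerpt in the same author's style.
pairs = build_blind_pairs(["AI excerpt..."], ["Human excerpt..."])
print(list(pairs[0]))  # ['first', 'second', '_key']
```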

What stood out to you about the initial reactions to AI-generated texts before any adjustments were made to the models?

Before fine-tuning, the reactions were quite telling. Expert readers were quick to dismiss the AI texts, often pointing out clichés and a lack of depth in stylistic fidelity. Lay readers had more mixed responses—some couldn’t tell the difference, while others sensed something off. It highlighted how raw AI output, without refinement, struggles to match the subtlety of human writing, especially under expert scrutiny.

Can you explain what fine-tuning entails for AI models and why it shifted reader preferences so dramatically?

Fine-tuning is the process of taking a general AI model and training it further on a specific dataset—in this case, an author’s complete body of work. This allows the AI to better grasp unique stylistic elements, like sentence structure or thematic tendencies. After fine-tuning, the AI texts lost many of the generic quirks that readers disliked, resulting in outputs that felt more authentic. Both expert and lay readers began favoring these refined AI texts, which was a game-changer in our findings.
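As a rough illustration of what that process can look like, here is a minimal fine-tuning sketch using the open-source Hugging Face transformers and datasets libraries. The base model, file paths, and hyperparameters are placeholder assumptions; the study's actual models and settings are not reproduced here.

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

model_name = "gpt2"  # placeholder base model, not the one used in the study
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assume the author's body of work has been collected into plain-text files.
corpus = load_dataset("text", data_files={"train": "author_corpus/*.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="style-finetuned",
    num_train_epochs=3,            # illustrative hyperparameters
    per_device_train_batch_size=4,
    learning_rate=5e-5,
)

trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"], data_collator=collator)
trainer.train()
```

The key point the sketch shows is that further training on a single author's corpus nudges the model's probability distribution toward that author's sentence structures and thematic habits, rather than teaching it anything fundamentally new.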

How did the opinions of expert readers evolve regarding the quality and authenticity of AI texts after fine-tuning?

Post fine-tuning, expert readers did a complete turnaround. Initially critical, they started to praise the AI-generated texts for both stylistic fidelity and overall writing quality. They noted how the AI captured nuances they hadn’t expected, often mistaking it for human work. This shift underscored how powerful fine-tuning is in bridging the gap between artificial and human creativity.

What do you think it signifies for the literary community that even lay readers preferred AI texts after fine-tuning?

It’s a wake-up call. Lay readers, who may not have the trained eye of experts, still form the bulk of the reading public. Their preference for AI-generated texts suggests a potential shift in market demand. It raises questions about the value placed on human authorship and whether readers will care about the source of a story if the quality feels comparable or even superior.

Can you walk us through the cost disparity between producing a novel with a fine-tuned AI versus hiring a human writer?

The numbers are stark. Our research estimated that fine-tuning an AI model and generating a 100,000-word novel costs around $81. Compare that to hiring a professional writer, which can easily run $25,000 or more. That’s a 99.7% cost reduction. While AI doesn’t replicate the full creative process of a human, the financial difference is hard to ignore, especially for publishers looking at bottom lines.
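That percentage follows directly from the two figures quoted above; a quick back-of-the-envelope check in Python, taking the human fee at its low end:

```python
ai_cost = 81          # study's estimate: fine-tuning plus generating a 100,000-word novel
human_cost = 25_000   # low-end professional fee cited for a comparable manuscript

reduction = 1 - ai_cost / human_cost
print(f"Cost reduction: {reduction:.1%}")  # Cost reduction: 99.7%
```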

How do you envision this cost difference reshaping the publishing industry over time?

It could be transformative—or disruptive, depending on your perspective. Publishers might lean toward AI for certain genres or mass-market content to cut costs, potentially reducing opportunities for human writers. On the flip side, it could democratize storytelling, allowing more voices to be heard through accessible tools. But without regulation or ethical guidelines, there’s a risk of flooding the market with low-cost AI content at the expense of original human work.

Your research suggests AI could displace human-authored works. Can you elaborate on the potential impact on writers’ livelihoods?

Absolutely. If AI-generated content becomes a viable substitute in the eyes of readers and publishers, it could shrink the demand for human writers, especially in commercial genres like romance or thrillers where formulaic structures are common. Many authors already struggle to make a living, and this could exacerbate that, pushing them into niche markets or forcing them to adapt by using AI tools themselves. It’s a complex issue with no easy answers.

How do your findings challenge traditional notions of fair use in copyright law when it comes to AI training?

Fair use in the US hinges on factors like the purpose of use and the effect on the market value of the original work. Our study shows that AI, when fine-tuned on copyrighted material, can produce outputs that emulate an author’s style so well that it might compete with their work. This market substitution effect challenges the assumption that training on copyrighted content is always fair use, as it could directly impact an author’s earning potential.

What are your thoughts on the argument that training AI on copyrighted works should be considered fair use since it doesn’t create exact copies?

It’s a compelling argument, but it misses a crucial point. While AI may not replicate works verbatim, our research shows it can still substitute for the original in a market sense by capturing an author’s unique style. Copyright law, especially through the lens of the Copyright Office, often interprets market harm broadly. If AI outputs diminish the value of an author’s work, even without direct copying, the fair use defense becomes shaky.

How do you address concerns from the AI industry that requiring permission for training on copyrighted works could stifle innovation?

I understand the concern—innovation shouldn’t be unnecessarily hampered. But there’s a balance to strike. Authors and creators deserve protection for their intellectual property, just as AI companies seek to protect their investments. A licensing model or a framework for compensated use could be a middle ground, ensuring AI development continues while respecting creators’ rights. Ignoring copyright risks undermining the very creativity AI seeks to emulate.

With over 50 copyright lawsuits against AI companies in the US, how do you think your research might shape the outcomes of these cases?

Our findings provide concrete evidence of market impact, which is a key factor in fair use analysis. Courts may look at our data showing reader preference for AI-generated texts and the potential displacement of human works as proof of economic harm to copyright holders. This could influence rulings by emphasizing that training on copyrighted material isn’t just a technical process—it has real-world effects on creators’ livelihoods.

Can you discuss the wider implications of your research for other creative industries, like music or video, where AI is also trained on copyrighted content?

The parallels are striking. Just as in literature, AI in music or video can replicate styles or elements so closely that it competes with original works. Our findings suggest that fair use arguments in these fields will face similar scrutiny—courts might question whether AI outputs substitute for human creations in the market. This could lead to broader legal and ethical debates about how we value human creativity across all artistic domains in the age of AI.

What is your forecast for the future of copyright law in relation to AI development?

I believe we’re heading toward a pivotal moment where copyright law will need to evolve significantly to address AI’s unique challenges. We might see new categories of protection or exceptions tailored to AI training, alongside international efforts to harmonize rules. The tension between fostering innovation and protecting creators will likely result in hybrid solutions, like mandatory licensing or revenue-sharing models. But without proactive legislation, we risk a patchwork of court decisions that could create uncertainty for both AI developers and artists.
