If ChatGPT produces AI-generated code for your app, who does it really belong to?

In one of my earlier AI and coding articles, where I looked at how ChatGPT can rewrite and improve your existing code, one of the commenters, @pbug5612, had an interesting question:

Who owns the resultant code? What if it contains business secrets – have you shared it all with Google or MS, etc.?

It’s a good question and one that doesn’t have an easy answer. Over the past two weeks, I’ve reached out to attorneys and experts to try to get a definitive answer.

There’s a lot to unpack here, but a good starting point is the overall theme of this discussion. As attorney Collen Clark of law firm Schmidt & Clark states:

Ultimately, until more definitive legal precedents are established, the legal implications of using AI-generated code remain complex and uncertain.

That’s not to say there is a shortage of opinions. In this article, I’ll discuss the copyright implications of using ChatGPT to write your code. In a related article, I discuss issues of liability pertaining to AI-generated code.

Who owns the code?

Here’s a probable scenario. You’re working on an application. Most of that application is your direct work. You’ve defined the UI, crafted the business logic, and written most of the code. However, you’ve used ChatGPT to write a few modules and linked that resulting code into your app.

Continue to Part 2: If you use AI-generated code, what’s your liability exposure?

Who owns the code written by ChatGPT? Does the inclusion of that code invalidate any ownership claims you have on the overall application?

Attorney Richard Santalesa, a founding member of the SmartEdgeLaw Group based in Westport, Conn., focuses on technology transactions, data security, and intellectual property matters. He points out that there are issues of contract law as well as copyright law — and they’re treated differently.

From a contractual point of view, Santalesa contends that most companies producing AI-generated code will, “as with all of their other IP, deem their provided materials — including AI-generated code — as their property.”

OpenAI (the company behind ChatGPT) does not claim ownership of generated content. According to their terms of service, “OpenAI hereby assigns to you all its right, title, and interest in and to Output.”

Clearly, though, if you’re creating an application that uses code written by an AI, you’ll need to carefully investigate who owns (or claims to own) what.

For a view of code ownership outside the US, ZDNET turned to Robert Piasentin, a Vancouver-based partner in the Technology Group at McMillan LLP, a Canadian business law firm. He says that ownership, as it pertains to AI-generated works, is still an “unsettled area of the law.”

That said, there has been work done to try to clarify the issue. In 2021, the Canadian agency ISED (Innovation, Science and Economic Development Canada) recommended three approaches to the question:

Ownership belongs to the person who arranged for the work to be created.
Ownership and copyright are only applicable to works produced by humans, and thus, the resultant code would not be eligible for copyright protection.
A new “authorless” set of rights should be created for AI-generated works.

Piasentin, who was also called to the bar in England and Wales, says: “Much like Canada, there is no English legislation that directly regulates the design, development, and use of AI systems. However, the UK is among the first countries in the world to expressly define who can be the author of a computer-generated work.”

“Under the UK Copyright Designs and Patents Act, with respect to computer-generated work, the author of the work is the person who undertook the arrangements necessary to create the work and is the first owner of any copyright in it,” he explains.

Piasentin says there may already be some UK case law precedent, based not on AI but on video game litigation. A case before the High Court of England and Wales determined that images produced in a video game were the property of the game developer, not the player, even though the player manipulated the game to produce a unique arrangement of game assets on the screen.

Because the player had not “undertaken the necessary arrangements for the creation of those images,” the court ruled in favor of the developer.

Ownership of AI-generated code may be similar in that, “the person who undertook the necessary arrangements for the AI-generated work (that is, the developer of the generative AI) may be the author of the work,” Piasentin notes. That doesn’t necessarily rule out the prompt-writer as the author.

Notably, it also doesn’t rule out the unspecified (and possibly unknowable) author who sourced the training data as an author of AI-generated code.

Fundamentally, until there’s a lot more case law, the issue is murky.

What about copyright?

Let’s touch on the difference between ownership and copyright. Ownership is a practical power that determines who has control over the source code of a program and who has the authority to modify, distribute, and control the codebase. Copyright is a broader legal right granted to creators of original works, and is essential to controlling who can use or copy the work.

If you look at litigation as something of a battle, Santalesa describes copyright as “one arrow in the legal quiver.” The idea is that copyright claims provide an additional claim, “above and beyond any other claims, such as breach of contract, breach of confidentiality, misappropriation of IP rights, etc.”

He adds that the strength of the claim hinges on willful infringement, which can be a challenge even to define when it comes to AI-based code.

Then there’s the issue of what can qualify as a work of authorship — in other words, something that can be copyrighted. According to the Compendium of the U.S. Copyright Office Practices, Third Edition, to qualify as “a work of ‘authorship,’ a work must be created by a human being…Works that do not satisfy this requirement are not copyrightable.”

Additionally, the Compendium notes that the U.S. Copyright Office “will not register works produced by nature, animals, or plants. Likewise, the Office cannot register a work purportedly created by divine or supernatural beings.”

While the Copyright Office doesn’t specifically say whether AI-created work is copyrightable, it’s probable that the block of code you had ChatGPT write for you isn’t.

Piasentin says this applies in Canada, too. Provisions that point to “the life of the author” and the requirement that the author be a resident of a certain country imply a living human.

Piasentin notes that, in CCH Canadian Ltd. v. Law Society of Upper Canada, the Supreme Court of Canada found that original work is derived from “an exercise of skill and judgment” and cannot be a “purely mechanical exercise.”

Messy for coders

Let’s wrap up this part of our discussion with some thoughts from Sean O’Brien, lecturer in cybersecurity at Yale Law School and founder of the Yale Privacy Lab. Taking us from analogies and speculation to actual rulings, O’Brien points to some US Copyright Office actions on AI generation.

“The U.S. Copyright Office concluded this year that a graphic novel with images generated by the AI software, Midjourney, constituted a copyrightable work because the work as a whole contained significant contributions by a human author, such as human-authored text and layout,” O’Brien says. “However, the isolated images themselves are not subject to copyright.”

If this ruling were applied to software, the overall application would be copyrighted, but the routines generated by the AI would not be subject to copyright. Among other things, this requires programmers to label what code is generated by an AI to be able to copyright the rest of the work.
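
What might that labeling look like in practice? Here is a minimal sketch, not an established convention and certainly not legal advice. It assumes a team agrees to wrap AI-generated routines in a pair of marker comments. The marker names (`# AI-GENERATED-BEGIN`, `# AI-GENERATED-END`) and the helper `find_ai_regions` are hypothetical, invented here purely for illustration. The script reports which line ranges were flagged, so there is at least a record of what a human did and did not write.

```python
from typing import List, Tuple

# Hypothetical marker comments, assumed for this sketch only.
# They are not an industry standard or a legally recognized convention.
BEGIN_MARKER = "# AI-GENERATED-BEGIN"
END_MARKER = "# AI-GENERATED-END"


def find_ai_regions(source: str) -> List[Tuple[int, int]]:
    """Return 1-indexed (first_line, last_line) pairs for code between markers."""
    regions: List[Tuple[int, int]] = []
    start = None  # first code line of the region currently open, if any
    for number, line in enumerate(source.splitlines(), start=1):
        text = line.strip()
        if text.startswith(BEGIN_MARKER):
            if start is not None:
                raise ValueError(f"nested begin marker at line {number}")
            start = number + 1
        elif text.startswith(END_MARKER):
            if start is None:
                raise ValueError(f"end marker without begin at line {number}")
            regions.append((start, number - 1))
            start = None
    if start is not None:
        raise ValueError("begin marker never closed")
    return regions


# Illustrative source file: one human-written function, one labeled AI routine.
SAMPLE = """\
def total(prices):
    return sum(prices)

# AI-GENERATED-BEGIN tool=ChatGPT
def slugify(title):
    return "-".join(title.lower().split())
# AI-GENERATED-END

print(total([1, 2]))
"""

if __name__ == "__main__":
    print(find_ai_regions(SAMPLE))
```

Whether a record like this would carry any weight in a copyright dispute is an open question, like nearly everything else in this area. It simply makes the human/AI boundary visible in the codebase.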

There are also some messy licensing issues. O’Brien points out that ChatGPT “can’t properly provide the copyright information, specifically refusing to place free and open source licenses, like the GNU General Public License, on code.”

Yet, he says: “It’s already been proven that GPL’d code can be verbatim repeated by ChatGPT, creating a license infringement mess. Microsoft and GitHub continue to integrate such OpenAI-based systems into code authoring platforms used by millions, and that could muddy the waters beyond recognition.”

What does it all mean?

We haven’t even touched on liability and other legal issues, which you’ll want to read about in Part 2. There are some clear conclusions here, though.

First, this is somewhat uncharted territory. Even the attorneys say there’s not enough precedent to be sure what’s what. I should point out that in my discussions with the various attorneys, they all strongly recommended seeking an attorney for advice on these matters, but in the same breath, acknowledged there wasn’t enough case law for anyone to have more than a rough clue how it was all going to shake out.

Second, it’s likely the code written by an AI can’t be owned or copyrighted in a way that provides legal protections.

This opens a huge can of worms because unless code is rigorously documented, it will be very difficult to defend what is subject to copyright and what’s not.

Let’s wrap this up with some more thoughts from Yale’s O’Brien, who believes that ChatGPT and similar software are leaning on the concept of fair use. However, he says:

There have been no conclusive decisions around this affirmation of fair use, and a 2022 class action called it “pure speculation” because no court has yet considered whether usage of AI training sets arising from public data constitutes fair use.

Pure speculation. When considering whether you own and can copyright your code, you don’t want a legal analysis to end with the words “pure speculation.” And yet here we are.

You can follow my day-to-day project updates on social media. Be sure to follow me on Twitter at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, and on YouTube at YouTube.com/DavidGewirtzTV.
