Written by admin





As artificial intelligence (AI) advances rapidly in a globalising world, it raises important questions for copyright law. How should AI be regulated, both with respect to the output it creates and with respect to the potential infringement that arises when AI models are trained on the works of original authors? The sine qua non of copyright is originality: originality is a precondition of copyright protection. AI, by contrast, combines human intelligence and machines to produce output that draws on the existing works of authors, which may be infringed in the process. Because AI has such wide application, training it on a large scale carries a real risk of copyright infringement, and AI and copyright law are therefore closely interconnected. With AI now used on a massive scale, keeping the use of copyrighted works within fair use is becoming difficult, and a number of cases and issues concerning copyright infringement in the training of generative AI have arisen.

This paper discusses how training generative AI may lead to copyright infringement, a question determined on a case-by-case basis, with reference to landmark judgments from around the world. It also analyses how AI and copyright are interconnected and what further steps can be taken to stop the training of generative AI from infringing copyrighted works. Copyright offices around the world need to develop strict guidelines, so that prominent companies across the Global South and North can implement them effectively when training AI on works that belong to copyright owners.

Looking at the recent history of AI, tools such as ChatGPT, Midjourney and other bots have emerged and are using the works of authors and artists who hold copyright in a range of domains. These tools have attracted lawsuits around the world alleging that generative AI is trained on the data of protected authors in order to produce derivative works. Because the technology is so novel, AI in general has been trained on copyrighted works sourced from various online intermediaries and repositories. This is seen as a threat to the first owners, since there is no established standard for deciding whether such use falls under the fair use principle in copyright law. The question remains unsettled, and courts decide it on a case-by-case basis. If courts rule in favour of the authors and hold that copyrighted material cannot be used to train AI models, the training process for these models will be significantly affected. For copyright protection, a work must meet the criterion of originality. In the case of AI, the question is whether AI possesses originality at all, given that its developers rely on existing data and on algorithms created by humans. The prime question, therefore, is whether training generative AI leads to infringement of copyright. For example, DALL-E reportedly used approximately 650 million images from publicly available and licensed online sources for its training. This one example shows how infringement can arise when the copyrighted works of owners are used without prior permission or royalty payments.

Is Training Generative AI an Infringement of Copyright?

With the growing use of generative AI and continued technological advances, the gap between original work and work created by AI has been narrowing. This makes it difficult for original authors to establish their claim to their work and creates multiple complexities for intellectual property law. These include ambiguity about ownership of the work, questions of infringement and rights of use, and ethical concerns about the use of unauthorised and unlicensed data to train generative AI[1].

To deal with such cases, the fair use doctrine, the consent of the original owner and licences granted for authorised use are among the preconditions to be examined in deciding whether training generative AI amounts to copyright infringement. Fair use has long been a cornerstone of modern societies, ensuring that publicly available information can be used for the welfare of society and to create new designs and statements based on it. The use must, however, be reasonable and justifiable. Before a use is treated as fair, four criteria need to be considered. The first is the purpose and character of the use, for example whether it is for education, business or research. The second is the nature of the protected work used. The third is the amount and significance of the portion used in proportion to the copyrighted work as a whole. The fourth is the effect of the use on the potential market for, and value of, the copyrighted work. Multiple pending cases discuss these points.

These claims have been litigated in various cases. In Andersen v. Stability AI et al.[2], three artists came together to sue multiple AI platforms that had used the artists' works without a licence to train generative AI in their style. According to the claim, this allowed users to generate works that are insufficiently transformative of the existing protected works and that would therefore be unauthorised derivative works. Scholars accordingly take the view that such blatant unauthorised use would amount to copyright infringement and that the original owners have the right to claim compensation for it.

Similarly, in Getty Images Inc. v. Stability AI Inc.[3], Getty Images, a photography company, filed a suit against Stability AI, an open-source AI platform, for using a large number of Getty's photographs to train its model without the creators' consent. Getty characterised this as "brazen infringement" of the company's intellectual property. The case now depends on whether the US courts apply the fair use doctrine and treat the AI company's conduct as "transformative use". A further question concerns the output produced by Stability AI, which is alleged to violate the copyright and trademark rights that Getty holds in its watermarked photograph collection.


In both of these cases, the main question for the court is what falls within the bounds of a "derivative work" under intellectual property law, which depends on the jurisdiction concerned. The outcome of these cases will therefore be of great importance in determining what constitutes fair use, what weight is given to the intention of AI developers and platforms, and to what extent it is justified to use an owner's original work without consent to train generative AI platforms. Some scholars argue that, because AI will be revolutionary in assisting humankind and research, the question of whether training generative models amounts to copyright infringement should be approached broadly, and that generative platforms should be given enough liberty to use such content to generate new work or to help existing users. The other side, as contended in the cases above, stresses the direct impact on the market for the original copyrighted work.

Similarly, in Andy Warhol Foundation v Goldsmith[4], a dispute over Andy Warhol's use of a photograph rather than over AI, the US Supreme Court upheld the creator's rights after a long battle over the unlicensed commercial use of a copyrighted image. By extension, the use of an author's original work by AI companies for training models could be a "showstopper" and could infringe the work of the owner. The Warhol ruling has sent a message to internet aggregators and platforms that a claim of transformative use of a creative work will not readily constitute fair use and will be determined on a case-by-case basis. "Commercialism over creativity" would carry less weight for any AI platform that uses copyrighted work without permission, and new work created by software after training is likely to be given less weight than the interest of the original author. On this reading, the fair use doctrine would remain available chiefly where AI companies have established the creator's consent. According to the Court, the original and the copied works had identical or closely related purposes. Furthermore, distribution of the secondary work may cause the original or any licensed works to be substituted without proper credit.

There are also cases outside the AI context in which similar points have been raised, and the decisions in those cases could help determine how generative AI should be treated. In The Authors Guild v Google Inc.[5], the plaintiff sued Google for using the plaintiff's published books and reproducing excerpts from them in its Google Books database without permission. The US court ruled that this was fair use under section 107 of the US Copyright Act, reasoning from the social welfare purpose of the use, such as education and research for students and other professionals. The use was considered transformative, and such limited disclosure of protected work was held not to constitute infringement, since it did not directly affect the potential market for the original copyrighted work. Such use can therefore be termed unintentional and fair[6].

Generative AI is developing dynamically and amid considerable uncertainty, which presents a slew of challenges for AI developers: intellectual property law is far from clear on copyright infringement, and different jurisdictions take different views. In light of the cases discussed, it is reasonable to decide such questions on a case-by-case basis. Intellectual property law aims to uphold the originality of a work, and its prime interest is to protect the owner of the copyrighted work. In my opinion, in this ambiguous situation the immediate aim should be to protect original owners and their copyrighted works, and unauthorised use of a work without consent should attract penalties under copyright infringement law. Generative AI has been developing at great pace, but copyright legislation has not kept up, and original owners should not suffer because the law develops only gradually.

In the long term this stance might seem restrictive, but in the meantime the focus must be on discussion, deliberation, new guidelines and amendments to copyright law that take account of the changes brought about by generative AI. This is a long debate, and the collaboration of multiple stakeholders is needed to address the issue.


As artificial intelligence is considered significant for every country, many state actors are easing the restrictions that regulate the use of data to train AI models, so that these models can provide insights in response to the information requested. However, intellectual property law needs to be aligned with AI systems. An IP right confers monopoly protection on its owner, and this can sometimes be seen as working against the interests of society. Policy makers in India and across the globalising world therefore need to take a cautious approach that guards against diluting the human-centricity of copyright law.

One suggestion is for AI developers and intermediaries to appoint a compliance officer, as provided under the Information Technology (Intermediary Guidelines and Digital Media Ethics Code) Rules, 2021, to enforce the applicable rules, oversee copyright protection and conduct frequent assessments. Companies can also seek prior permission from the owners of copyrighted works, which can support the continuous growth and overall development of the sector. In a similar manner, an AI tool could be established to detect any such copyright infringement.

[1] Pamela Samuelson, ‘Generative AI Meets Copyright’ (2023) 2023 Jotwell: J Things We Like 1, last accessed on 5 Nov 2023

[2] 3:23-cv-00201, (N.D. Cal.)

[3] 1:23-cv-00135, (D. Del.)

[4] No. 21–869.

[5] No. 13-4829-cv (2d Cir. Oct. 16, 2015)

[6] Jessica L. Gillotte, ‘Copyright Infringement in AI-Generated Artworks’ (2020) 53 UC Davis L Rev 2655, last accessed on 5 Nov 2023