There is great potential for using Artificial Intelligence (AI) in research projects on audiovisual works. With new developments in natural language processing and computer vision, AI is becoming more capable of tasks like transcription, translation, or genre classification, without help from humans. AI can also detect faces and flag certain kinds of content (feelings, violence, injuries, sex, bad language). Such developments allow researchers to uncover new knowledge from existing audiovisual works.
But what happens if you want to use this technology on works that are still in copyright or works with unclear copyright status? Even if the collected works are not going to be shared and will only be analysed by machines, it is useful to know that two rights might still be affected: reproduction (CDPA s.17) and adaptation (s.21).
Unsurprisingly, copying is at the centre of copyright law. Under UK copyright law, any form of analogue or digital copying requires the authorisation of the rightsholder. Reproduction might occur in such research projects in the following situations:
• When a first copy is made, for example when a broadcast is stored on a DVD
• When previously analogue material is digitised, for example when an archive converts its analogue collections to digital format
• When further copies are made for a specific AI project, for example taking some works, in whole or in part, to gather a small dataset on a specific theme
• When the technological process requires numerous copies to be made, which could be allowed under the temporary copying exception (s.28A)
The adaptation right is more difficult to recognise. Unlike United States copyright law, where all derivative works are subject to authorisation, UK copyright law defines adaptation narrowly and according to the type of work. For literary and dramatic works, it includes translating the work (including translating to machine-readable languages), dramatising a non-dramatic work (and vice versa) or conveying the story by images. For computer programs, databases and musical works, the adaptation right covers alterations or arrangements. Most of the time, an adaptation activity will already require copying the work and so will already potentially infringe the reproduction right.
It should be noted that this right is only for literary, dramatic, or musical works, computer programs and databases and does not apply to artistic works, sound recordings or films. While this excludes many of the works held in audiovisual archives, it could still be relevant in some situations. To give some examples:
• When your research includes generating transcripts from film and TV and then data mining those
• When you have literary works in your collections and you translate them for AI analysis
• When you alter a database to make the data easier to read for AI.
If the audiovisual works you need for your research are already in the public domain, then there are no copyright concerns. But limiting your AI analysis to only older or freely licensed works might not be enough for your research, or it could lead to biased findings.
But (1) if the works you are using are still protected by copyright or if their copyright status is unclear and (2) if these actions are not authorised by the rightsholder, then these actions might infringe the reproduction and adaptation rights. Depending on how much of the content will be included in the project outcomes, other rights might be infringed too. Whether you have taken the work in full or in substantial part becomes important. Your purpose also becomes important, as it might be covered by one of the defences under UK copyright law.
This part will introduce two of the relevant copyright defences that could support such AI projects for academic research: fair dealing for research and private study (s.29) and copying for text and data mining (s.29A). There are other useful defences (education, preservation, quotation and/or criticism) that will not be discussed here.
Research and private study
Fair dealing for research and private study is for non-commercial purposes. Similar to the other UK fair dealing defences, the dealing has to be “fair”. When determining if the dealing was “fair”, certain factors should be assessed, such as the amount taken, the use made of the work, consequences of the dealing, whether the work is unpublished and whether the work was obtained legitimately. In the example of AI data analysis in the archives, it could be difficult for the parties to determine what is fair, with the risk of being overly cautious and excluding too much from the research projects.
Text and data mining
The second and more relevant exception here is for ‘copying for text and data analysis for non-commercial research’ (s.29A). This section was introduced in 2014 and allows copying for computational analysis, but requires sufficient acknowledgement and lawful access to the copied work. It does not require an assessment of “fairness” and can help with analysing large amounts of material. It is for copying only, so it will not cover adaptation activities or sharing outputs that contain the works.
Both of these exceptions require sufficient acknowledgement unless it is “impossible for reasons of practicality or otherwise”, which would be the case when analysing many works at once with AI. Neither of these copyright defences can be overridden by contract, which means that they can still help despite restrictive contracts. Most importantly, both of these exceptions are for non-commercial purposes only. Although it makes sense that non-commercial research is supported, it can be challenging to determine the status of research groups with commercial partners.
It should also be mentioned that the new Directive on Copyright in the Digital Single Market has two provisions on text and data mining (Articles 3 and 4). The second one even allows reproductions and extractions for all purposes by all stakeholders (including commercial), unless the rightsholder has expressly reserved their right. This article could be useful for the text and data mining activities of public-private partnerships. However, the UK will not implement this Directive.
Overall, there are very positive developments in technology and in the use of AI for analysing audiovisual archives. It is also great that UK copyright law is not hindering these developments, especially with the text and data mining defence. But it can still be useful to consider the copyright implications in this area and to be aware of where copyright law is unclear or insufficient.
Dr Pınar Oruç is a Research Associate at the UK Copyright and Creative Economy Centre (CREATe), University of Glasgow. This piece is based on research done within the ambit of ‘reCreating Europe: Rethinking digital copyright law for a culturally diverse, accessible, creative Europe’ (Horizon 2020). More on this project can be found at www.recreating.eu.