Definition of AI — Primer

Jonathan Poritz

Introduction

As a Working Group on AI and Copyright, one of the key factors in our work is reaching a shared understanding of what is meant when we use the term Artificial Intelligence (AI).

Existing CC position papers on AI and copyright such as

Submission by Creative Commons to the UK Consultation on Artificial Intelligence and Intellectual Property Policy (November 2020)
CC Statement at WIPO Conversation on IP and AI (2nd session) (July 2020)
Creative Commons submission to the European Commission White Paper on Artificial Intelligence (June 2020)
CC Submission to WIPO Consultation on AI and IP Policy (February 2020)

are thoughtful and beautiful, and what this working group might do can build closely on those documents … perhaps merely to make versions of their content for other audiences or purposes.

Another thing which might be worth doing is some effort to future-proof those documents, because they make the argument that the landscape of (so-called) AI is changing rapidly. It could be worth clarifying what properties and features future computer systems would need to have for CC’s arguments and positions on this topic to need to change.

One possibility is that there is nothing today, or in the relatively near future, which reasonably could be construed as Artificial Intelligence, at least in any sense which matters for copyright issues. This might be called the “There’s no there, there1” position (shorthand: the TNTT position).

Even asserting this thesis, there are definitional aspects of the term AI which bear examination, such as:

What are the specifics of the actors implicated by copyright law which some computer system in the future would have to match in order for that system to then be a subject of that law?
Why would someone assert that no current system is anywhere near to possessing these specific properties? I.e., what are the systems which today are described by the term AI?

The Turing Test

The classic definition of AI, due to Alan Turing (in Turing, Alan. "Computing machinery and intelligence." Mind, 59.236 (1950): 433.) is via the famous Turing Test, in which a computer “tries” to win an imitation game: an interrogator is able to communicate with two remote parties through a text-only interface and must by conversation decide which of the two is an actual human and which is a computer “pretending” to be a human.2

Turing asserted that a computer which could win the imitation game — by convincing the interrogator of their humanity — could be said truly to be intelligent. Sometimes this kind of intelligence is called, in modern computer science, general-purpose artificial intelligence. This is to be held in contrast to complex tasks which sophisticated computer systems might be able to perform but which are nevertheless limited to a specific domain such as reading handwriting or turning an audio signal of speech into a transcript of the words spoken.

The interrogator in Turing’s imitation game seems to be an oddly subjective element to introduce into a hopefully precise definition: the definition relies upon the clever questioning of the interrogator and their insightful analysis of the responses. But while this is perhaps an uncomfortable aspect of the definition for computer scientists, it could be argued that it fits well with legal systems where judges are called upon to determine if a work has enough creativity to be worthy of a copyright, etc.3

No current computer system comes anywhere close to passing the Turing Test.

It is not clear whether passing a Turing Test is a reasonable criterion for an entity to be legally recognized as one of the actors in copyright law.

Actors in copyright law:

creator
rightsholder of the rights over a new work, i.e., owner of a new piece of property
rightsholder of the rights over prior works which may be used in the creation of a new work
user, exercising user rights (exceptions and limitations)

[Do there need to be distinctions drawn between property rights and moral rights?]

A problem is that many legal systems already recognize entities which can function as persons in some contexts (particularly as regards entering into contracts and owning property) but which are not natural persons. Instead, unnatural persons include:

for-profit corporations
various other associations of natural persons
sovereign states
the Whanganui River in New Zealand
the European Union

This is only a “problem” in that it presumably isn’t instantly clear that AI could not act as a creator or rightsholder of some kind simply because an AI is not a natural person. Instead, some flaw in the character of an AI must be found which disqualifies it from acting in a way that is the subject of copyright law.

A modified TNTT position could tease out required properties of the actors subject to copyright law, which could be applied to some (distant) future computer system.

Creativity seems to be a key point. But it, too, is notoriously difficult to define. However, creativity seems to imply choice and agency. So it might be possible to exclude some entities from the role of creator because they are entirely deterministic.

A terminology change can drive home the TNTT position. Computer systems today (and in the foreseeable future) which are called by (the marketing teams of) their supporters

AI,
machine learning (ML), or
deep learning

are nothing more (or less) than large, complex statistical models.

Statistical models do amazing work. Undergraduate computer science students can easily write code which implements a very simple statistical model of written language and use it, for example, to build software that automatically breaks some simple cryptographic systems. Both the statistics behind, and the source code of, this software would likely be largely incomprehensible to lawmakers and judges. Nevertheless, it is quite simple for experts to create and use.

Statistical models are tools. Another terminology change which would support the TNTT position would be to substitute “[complicated data analysis] tool” for “AI” wherever the latter occurs. Thinking of AI as merely a tool makes it seem absurd to give the AI itself any formal role as a creator or rightsholder. Metaphors abound: a car is a complex tool for transportation, and even if a driver does not know how the carburetor of their car works, when they are stopped by a police officer for speeding, that ignorance doesn’t mean the officer will issue a speeding ticket to the car.

A tool-user’s ignorance of how a complex tool works is not the same thing as the tool itself possessing agency.

Thinking of all current (and currently foreseeable) so-called AI as merely complex tools also has implications for users’ rights. For example, consider this image:

Two of Arts - 2000 Visual Mashups by qthomasbower licensed CC BY 2.0

It seems to have been made out of a large number of smaller (openly licensed) images, where each contributes, presumably via some algorithm, only its color and perhaps texture. The combining algorithm and its code are presumably fairly complex … or perhaps qthomasbower simply used a tool like the GNU Image Manipulation Program (known as “the GIMP,” a worthy competitor of Photoshop), a complex piece of FLOSS consisting of probably many thousands of lines of code and implementing some quite complex algorithms (e.g., “Gaussian blur”). Should that algorithm or the GIMP be the rightsholder of this new work? (Would it matter if qthomasbower had contributed to the code in the GIMP and understood the mathematics of Gaussian blur?)

What about this image:

Modified Two of Arts by Jonathan Poritz is licensed CC BY 4.0 International and is a modification of Two of Arts - 2000 Visual Mashups by qthomasbower which was released under CC BY 2.0.
The modifications consisted of several image manipulations such as Gaussian blur and edge detection using the GNU Image Manipulation Program.

Even if the original small images used as pixels in the first version of this image were under all-rights-reserved copyright, it could be argued that here they were being used merely for some gross overall features like color and texture, and that their original identity is removed by the image manipulations which were performed … so the use is fair. This might be analogous to the data mining of images or text for the large ML models in use today.

Discuss quote from the CC position paper Creative Commons Submission to the European Commission Consultation on Artificial Intelligence

“AI” itself is an evolving concept. AI is an umbrella term that encompasses, for the most part at the current time, different types of machine learning algorithms. There is a lot of confusion around related and different concepts such as machine learning, natural language processing, predictive models, neural networks as well as algorithms. AI is generally understood as “something that can be done by a computer that until then, could only be exclusively done with human intelligence.” However, that new capability is only “AI” until it becomes normalized as simply “software.” What is AI today may not be so tomorrow. As technology advances over time, what is considered “AI” as opposed to "normal software" may continually evolve. That means that whatever copyright framework is put into place, if any, it has to remain flexible enough and technology-neutral to account for and adapt to the moving-target nature of AI. There is also danger in categorizing all manner of algorithms as “AI” and in adopting rules or measures where these categories are arbitrarily determined.

Maybe Clarke’s comment about “Any sufficiently advanced technology is indistinguishable from magic.”

Large Statistical Models

(… and other things which should not be called AI.)

A warm-up example:

Let’s start with a baby example which has many of the features of modern, so-called AI, although the internals are very different.

Suppose your organization has been hit with a ransomware attack. You have thousands or even millions of files on your network which were originally in English4 but have been encrypted, each with a separate key. You are a cryptanalytic5 genius, so you have written a program which can generate a reasonably small number — perhaps a few thousand — of possible decryptions of each document, one of which you are certain will be the correct, decrypted version. (This may sound far-fetched, but it is actually fairly similar to things which do happen in real life.)

On the face of it, this is a very simple task: sort through a few thousand versions of each document, almost all of which will be obviously gibberish, and pick out the one version which is just in plain English. For any human who knows how to read and knows English, it is a trivial — but maybe very tedious — task. For a computer system, it is not at all obvious — early in the information age, this would have been a typical example of what would have been called an “application of artificial intelligence.”

Here’s an approach to solving this problem (actually due to the great Abu Yūsuf Yaʻqūb ibn ʼIsḥāq aṣ-Ṣabbāḥ al-Kindī, أبو يوسف يعقوب بن إسحاق الصبّاح الكندي‎, one of the thinkers at the House of Wisdom in the Abbasid-period Baghdad).

You used to solve puzzles in the newspaper when you were a child, and you remember that the letter E is the most frequent letter in English. So you write a program which takes the thousands of possible decryptions of one of the ransomed documents and picks out the one where E is the most frequent character, as the true (English!) decryption.

This doesn’t work: in fact, roughly 1/26th of the gibberish failed decryptions are also selected by this simple program as the one true decryption.
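The naive rule, and the way it fails, can be sketched in a few lines of Python. This is only an illustration: the function names `most_frequent_letter` and `naive_pick` are hypothetical, and the sketch assumes the candidate decryptions are already available as plain strings.

```python
from collections import Counter
import string


def most_frequent_letter(text: str) -> str:
    """Return the most common letter a-z in text (ignoring case), or '' if there is none."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    return counts.most_common(1)[0][0] if counts else ""


def naive_pick(candidates: list[str]) -> list[str]:
    """Return every candidate whose most frequent letter is 'e'."""
    return [c for c in candidates if most_frequent_letter(c) == "e"]


if __name__ == "__main__":
    candidates = ["the end of the scene", "zzqx zzpq", "eeee zq xj"]
    # Both the real English text and one piece of gibberish pass the test.
    print(naive_pick(candidates))
```

The gibberish string "eeee zq xj" is accepted alongside the genuine English, which is exactly the weakness described above: a single most-frequent letter is too coarse a test.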

So you significantly up your game: you find a large, openly licensed corpus of English texts on the ’net and compute not just the most frequent letter (which is, you already knew, E), but in fact how often, on average out of this entire corpus, each letter in the English alphabet occurs. You will end up with something called the relative letter frequency chart for English (see Wikipedia’s article Letter frequency for actual values for a number of languages).
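As a rough sketch of this "training" step, the following Python computes a relative letter frequency chart from a collection of texts. The function name `relative_letter_frequencies` is hypothetical, and the sketch assumes the corpus is small enough to be passed in as a list of strings and that only the 26 letters a-z are counted.

```python
from collections import Counter
import string


def relative_letter_frequencies(texts: list[str]) -> dict[str, float]:
    """Map each letter a-z to its share of all letters found in the given texts."""
    counts = Counter()
    for text in texts:
        # Count only the 26 English letters, ignoring case, digits and punctuation.
        counts.update(ch for ch in text.lower() if ch in string.ascii_lowercase)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("corpus contains no letters a-z")
    return {letter: counts[letter] / total for letter in string.ascii_lowercase}


if __name__ == "__main__":
    chart = relative_letter_frequencies(["Hello", "World"])
    print(chart["l"], chart["z"])
```

With a genuinely large corpus the resulting numbers should approach the published values for English; with a two-word corpus, as in the usage lines, they are of course meaningless except as a check that the code runs.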

Now you write a more sophisticated program which goes through the thousands of possible decryptions of your ransomed document, computes the relative letter frequency chart for that version, and proposes as the correct decryption the one whose frequencies are closest6 to what you computed from your openly licensed English corpus. I.e., you are using the relative frequency table of a piece of text as a fingerprint which should distinguish between actual English and gibberish.
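A minimal sketch of this selection step follows. The notion of "closest" used here, a sum of squared differences between the two tables, is an assumption made for illustration and may differ from the formula the author has in mind; the names `frequency_table`, `distance` and `pick_decryption` are likewise hypothetical.

```python
from collections import Counter
import string

ALPHABET = string.ascii_lowercase


def frequency_table(text: str) -> list[float]:
    """Relative frequency of each letter a-z in text, as a list of 26 numbers."""
    counts = Counter(ch for ch in text.lower() if ch in ALPHABET)
    total = sum(counts.values()) or 1  # avoid dividing by zero for letterless text
    return [counts[ch] / total for ch in ALPHABET]


def distance(table_a: list[float], table_b: list[float]) -> float:
    """Sum of squared differences between two tables (an assumed choice of distance)."""
    return sum((a - b) ** 2 for a, b in zip(table_a, table_b))


def pick_decryption(candidates: list[str], reference: list[float]) -> str:
    """Return the candidate whose letter-frequency fingerprint is closest to the reference."""
    return min(candidates, key=lambda c: distance(frequency_table(c), reference))


if __name__ == "__main__":
    # A tiny stand-in for the large openly licensed corpus.
    corpus = "the quick brown fox jumps over the lazy dog and then the dog sleeps in the sun"
    reference = frequency_table(corpus)
    candidates = ["xqzj kvwq zzxq jjkq", "the cat sat on the mat in the sun"]
    print(pick_decryption(candidates, reference))
```

Nothing in this code "knows" English: it compares two lists of 26 numbers and reports which pair is nearer.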

This approach is surprisingly effective — relative frequency tables are a good linguistic fingerprint. In fact, strangely enough, it even works when you try to use it on very strange texts, such as a chunk of the book A Void, which is an English translation by Gilbert Adair, made entirely without the letter E, of Georges Perec’s novel — written entirely without the letter E! — La Disparition.

Some high-level features to notice about this simple example:

it’s not intelligent, it’s just a (quite modest) statistical model
to train the model, you needed a large corpus of human-made materials which had been evaluated by humans (here the evaluation was merely in putting these files together and calling it a “collection of English texts”)
the actual code of the model had some internal data structures (those relative frequency tables), a little bit of math (that formula of distance between relative frequency tables), and a little bit of code (“run through all the decryptions and pick out the one with relative frequency table closest to the model table from your large corpus”).

There is one other feature which is not so clear at the level of description we have given here, but is important as a contrast with the nearly magical7 power of modern so-called AI systems:

There is good theoretical understanding of the very simple mathematics of relative frequency tables and so it is well understood when and how this approach should be successful.

Neural Networks

In just the last decade, a new algorithmic approach to many specific problems in simple (not general-purpose) artificial intelligence (so-called) has been remarkably successful. This approach is based on a different data structure than the simple example given above, a modestly more sophisticated bit of mathematics driving the training process, and, crucially, is embedded in much less satisfyingly explanatory theory.

The data structure (well, one of them: there are variants) looks something like

This image, Schematische Darstellung eines neuronalen Netzwerks is by Bennani-Baiti, B., Baltzer, P.A.T. and is licensed CC BY 4.0 International. The labels at the top mean:
**Input Layer**, **Hidden Intermediate Layers**, and **Output Layer.**

In this diagram, the round “nodes” represent variables which can contain a numerical value, and the arrows indicate that the value in the node at the start of the arrow is propagated by a small and simple bit of code to contribute to the value in the node at the end of the arrow.

In practice, that “small and simple bit of code” which describes how values propagate through the network has numbers in it, which were calculated during the training period for this model, based upon a (preferably large and unbiased) corpus of human-verified data.
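To make the propagation concrete, here is a hedged sketch of how values might flow through such a network. It assumes one common formulation, in which each node takes a weighted sum of the previous layer's values, adds a bias, and passes the result through a sigmoid function; the text above does not specify the exact rule. The function names and the numbers in the usage lines are hypothetical, and in a real system the weights would come out of the training process rather than being typed in by hand.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs: list[float], layers: list[tuple[list[list[float]], list[float]]]) -> list[float]:
    """Propagate input values through the network, one layer at a time.

    Each layer is a pair (weights, biases). weights[j][i] is the number attached
    to the arrow from node i of the previous layer to node j of this layer.
    """
    values = inputs
    for weights, biases in layers:
        values = [
            sigmoid(sum(w * v for w, v in zip(row, values)) + b)
            for row, b in zip(weights, biases)
        ]
    return values


if __name__ == "__main__":
    # Two inputs, one hidden layer of two nodes, one output node (made-up numbers).
    hidden = ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0])
    output = ([[1.0, -1.0]], [0.0])
    print(forward([2.0, 2.0], [hidden, output]))
```

The whole "model" is the list of numbers in `layers`; the code that uses them is only a handful of lines of arithmetic.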

The code and formulae which implement these models are slightly more complicated than they would have been in the toy model above, but the difference in difficulty is very small to practitioners. What is striking, though, is that neural network-based models are fantastically more successful at a host of tasks including:

speech recognition
optical character recognition
facial recognition
interpolating data in space and time to make old, jerky, low-res films appear to be modern, smooth, high-resolution video
creating “new works” of visual art or music based upon some existing style or composer (through the training dataset) and using some new starting image or melody
etc., etc.

Note that this wonderful new tool (tool, not intelligent agent) has exactly the same high-level features as were noted for the toy model above. But in contrast to the other general feature with which the toy discussion concluded, for neural network-based tools:

There is very little theoretical understanding of why these models can be so successful.

It seems silly to endow a tool with agency and legal status simply because it is sufficiently complex that (like the carburetor example above) tool-users do not understand how it achieves its (admittedly wonderful) results.

There is a very nice series of four videos by 3Blue1Brown called Neural Networks, pitched at about an undergraduate mathematics level. There are also very many written articles and books, many even with open licenses, which explain these models at any desired level of detail.