AI artwork is going mainstream, but it’s doing so off the back of human artists.
What’s happening?
In the last few months, the creative artificial intelligence systems that I wrote about back in 2021 have traded the comfort and anonymity of universities, labs and Big Tech data centres for the bright lights of social media and the scrutiny of the mainstream press.
By now you will have seen friends, celebrities and “influencers” swap their traditional profile photos for what look like custom, contemporary portraits, possibly of them wearing a spacesuit. Don’t be fooled. These people have not commissioned an artist to capture their essence on canvas (or in pixels). They have instead bought an app, handed an AI a dozen or so images of themselves, selected some themes and waited a minute or two, with some fairly spectacular results:
This image was created with Lensa, which rapidly climbed to the top of the Apple App Store within days of its release late last year.
The systems that create these images, like Lensa or Midjourney, are built on what are known as “diffusion models” (Lensa, for example, runs on the openly released Stable Diffusion). In a nutshell, the AI is fed massive amounts of data consisting of images paired with captions explaining what’s in each image. During training, those images are progressively drowned in random visual noise, and the AI teaches itself to reverse the damage, reconstructing the image step by step with the caption as its guide. Once trained, it can run that same process on demand: starting from nothing but noise, it “denoises” its way to a brand-new image matching whatever text prompt it is given.
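For the technically curious, here is a heavily simplified toy sketch of that idea in Python. Nothing here is a real model: “model” stands in for the trained neural network, and the update rule is stripped down to the bare intuition that each generation step peels away a little of the predicted noise.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(image, t, steps):
    """Training-time corruption: blend the image with random static.
    Later steps (higher t) are noisier; the network's job is to learn
    to predict exactly which noise was added."""
    mix = (t + 1) / steps
    noise = rng.normal(size=image.shape)
    return (1 - mix) * image + mix * noise, noise

def generate(model, caption, shape, steps=50):
    """Generation runs the process in reverse: start from pure static and
    repeatedly subtract the noise the (hypothetical) trained model predicts,
    with the caption steering what the emerging image should contain."""
    x = rng.normal(size=shape)
    for t in reversed(range(steps)):
        predicted_noise = model(x, caption, t)
        x = x - predicted_noise / steps  # peel away a little noise each step
    return x
```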
Why is it important?
Every single part of this is worthy of a deep dive, but our concern (as technology and IP lawyers) is with the datasets used to train these models. Machine learning (the concept underpinning all modern AI) relies on vast amounts of data, frequently served up in the form of datasets comprising information “scraped” from the internet by “bots” (automated programs designed to carry out a specific task).
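To make “scraping” concrete, here is a minimal, illustrative sketch of what such a bot might do: fetch a web page and harvest each image’s address alongside its alt text (which often doubles as the caption). Real crawlers are far more sophisticated and, among other things, honour robots.txt and rate limits; the address used below is just a placeholder.

```python
import requests
from bs4 import BeautifulSoup

def scrape_image_pairs(page_url):
    """Collect (image URL, caption) pairs from a single web page."""
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for img in soup.find_all("img"):
        if img.get("src"):
            pairs.append({"url": img["src"], "caption": img.get("alt", "")})
    return pairs

# Placeholder address -- point this at any page that contains images.
print(scrape_image_pairs("https://example.com"))
```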
Stable Diffusion, for instance, was trained on a dataset called LAION-5B, and Midjourney is widely reported to have used it too. This is an openly accessible image/text database that contains URLs pointing to approximately 5.85 billion images, along with their associated captions/descriptions. At no point has LAION, the dataset’s compiler (a non-profit organisation whose aim is to make machine-learning research accessible to the general public), asked the creators or copyright holders of the scraped images for permission. And herein lies the problem.
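To illustrate the point about URLs: a LAION-style record holds only metadata, roughly as sketched below. The record shown is invented for illustration (the real dataset is distributed as parquet files, with columns including URL and TEXT); anyone who wants the actual pixels has to go and fetch them from the original host.

```python
import io
import requests
from PIL import Image

# An invented record in the shape of a LAION entry -- the dataset stores
# the web address and caption, not the image itself.
record = {
    "URL": "https://example.com/some-artwork.jpg",  # placeholder; substitute a real record to run
    "TEXT": "a watercolour painting of a lighthouse at dusk",
}

# Training pipelines must download the image from the original host:
response = requests.get(record["URL"], timeout=10)
image = Image.open(io.BytesIO(response.content))
print(record["TEXT"], image.size)
```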
From a pure copyright law perspective, LAION has an argument that they haven’t done anything wrong (although, as we’ll come to, this is going to be tested soon). Almost all copyright regimes allow a certain amount of “fair use” (or “fair dealing”), and it is arguable that compiling a dataset for educational and research purposes is “fair”. In LAION’s case, it is unclear whether they would even be considered to have used the images at all, as the dataset they compiled contains only the URLs (the web addresses where the images can be found), not the images themselves.
Furthermore, Midjourney and the like no longer need to rely on the dataset. The AIs have been trained, and don’t need to directly reference the images. Much like how I don’t need to reference an annotated picture of a 747 to know when I’m looking at a 747, Midjourney no longer needs to reference existing images to create new ones. If I ask it to create a “mid-century home in a lush valley with gravel driveway and a BMW E9”, it can create an entirely original image in a few seconds without copying an existing one:
Car nerds among you will know that that is most definitely not an E9, but it captures the essence of one, much like if someone tried to draw one from memory.
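For anyone who wants to try this themselves: Midjourney has no public API, but the same class of model is available through open-source tooling. Here is a minimal sketch using Hugging Face’s diffusers library and the openly released Stable Diffusion weights (it needs a machine with a suitable GPU):

```python
import torch
from diffusers import StableDiffusionPipeline

# Download the openly released Stable Diffusion weights and move them to the GPU.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "mid-century home in a lush valley with gravel driveway and a BMW E9"
image = pipe(prompt).images[0]  # generated from noise; no source image is consulted
image.save("mid_century_home.png")
```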
However, leaving aside technical arguments, these systems represent a major challenge to the fundamental purpose of copyright. Put simply, copyright laws recognise the value to society of creative output, and grant a creator a monopoly over their work, allowing them to monetise it and protect it from theft, derogatory treatment and misattribution. But Midjourney can take an artist’s style (provided their work was included in the training dataset) and create a brand-new image in seconds, possibly one that the artist would not want to be associated with. I asked Midjourney to draw Joseph Stalin in the style of the legendary Jack Kirby (I intentionally chose an artist who is no longer with us) and it made these in about 15 seconds:
Those will look pretty familiar if you have looked at a comic book in the last 50 years, but Stalin is not exactly Mr Kirby’s usual subject matter (unlike, say, Captain America)!
Web cartoonist Sarah Andersen (of Sarah’s Scribbles) recently wrote in the New York Times about her struggles with her work being appropriated, and the ease with which AI image generators allow this to happen. It started with crudely Photoshopped images that swapped out the speech bubbles, but with these new services, entirely new frames can be generated in Sarah’s distinctive black-and-white style. Her work was included in the LAION-5B dataset without her knowledge or permission.
The implications of this are profound, not only from an artistic integrity standpoint, but from a commercial one too. Where an artist with a reasonable online profile might previously have been commissioned to create a work, the potential customer can now, for a few bucks (or even a free trial), have a machine create a specific work “in the style of” the artist of their choosing. This has the potential to seriously dent the earnings of an entire industry. Add to this the implications of a world in which the quintessentially human practice of creating and consuming art is diluted by a glut of soulless, AI-created works, and the fun new apps start to look a little dystopian.