The week’s most talked-about new song is voiced by Drake and The Weeknd without their input or consent. Duncan Greive looks at the strange new AI era dawning for music, and asks what it means for our culture.
Over a keening four-note piano loop and shuddering sub-bass comes a very familiar voice, rapping entirely plausible lyrics: “I came in with my ex like Selena / The flex, bumping Justin Bieber…” It’s Heart on My Sleeve, a new single from Drake, instantly recognisable through his brooding, nasal tone and confidence in celebrity name drops. Only, it’s not Drake at all.
Instead it’s a song debuted by TikTok user @ghostwriter977, who apparently wrote and voiced the lyrics, before using an AI trained on Drake’s vocals to create what is a pretty uncanny representation of the Canadian megastar. Later a hook comes in, this time voiced by a fake version of The Weeknd, though to these ears it’s not quite so impressive as the verse before it.
While the song’s origins are mysterious, and it’s still possible that Drake will pop up to say it was him messing with us the whole time, it certainly doesn’t look that way. Even the beat, attributed to Metro Boomin on the track, seems to have come from somewhere else. As of right now, the original post has been taken down from Spotify and YouTube (though dozens of re-posts still exist), due to copyright claims from Universal Music Group, Drake’s label.
It is the most prominent example of a wave of AI music which has emerged in recent weeks, from Kanye West singing the likes of Old Town Road and Somebody That I Used to Know to Drake (again) voicing an Ice Spice single – that one he has actively disavowed on social media. What’s extraordinary is the pace at which all this is occurring. Mere months ago I played around with a copyright-free AI music generator which created what I would describe as tinny early-2000s versions of generic genre sounds. To be here so quickly is confronting, to say the least. I’ll break down what’s going on, and the complex, charged issues this era seems certain to bring forward.
Music has always been patient zero for tech
Due to a combination of its global cultural ubiquity and the simplicity and popularity of the pop song, music has been at the forefront of technological change for over a century. As soon as the concept of recorded sound had crystallised, over 160 years ago, it was used to record music (the recording is pretty amazing). From there music was central to mass cultural tech, from radio, to the compact disc, to the Walkman, to peer-to-peer downloading and streaming, and most recently it became a huge part of the growth of TikTok.
In some ways pop music is the original user-generated content, with huge artists emerging from suburban garages and high school drama departments, and with songs sold out of car boots capable of becoming global hits. What has changed through that period is that the music industry has gone from an imperious and highly profitable master of entertainment media to a much more financialised product, one which largely accepts the rules laid down by tech platforms. Those platforms have in some bleak way succeeded in taking their place at the heart of culture, without creating any original work themselves.
In the internet era in particular, music was impacted first. Largely because songs are small files, they were the most easily shared in the dial-up era. A song might be three megabytes, where a movie could be 700 megabytes – an essentially arbitrary data distinction which meant that music was impacted by streaming long before the likes of movies and gaming.
Music is about to be disrupted again
Talk to artists about this era, and many are somewhat glum. Being forced to try and cut through on TikTok can feel undignified, and because streaming hosts 100,000 new songs a day, good luck trying to find a moment in the torrent. While industry revenues have largely recovered to the levels of the glory era, that recovery is mostly working for the labels rather than the artists, the vast majority of whom now rely almost entirely on live or merch revenues over recorded royalties to survive.
Unfortunately, the history of technology largely says that once a particular door is opened, it’s very hard to close it again. The flurry of takedown notices issued by Universal has prompted Dan Runcie, author of the excellent music industry newsletter Trapital, to comment that “this feels like Napster in 1999. New technology is here and the industry’s protocol is to resist.”
What’s not entirely clear is whether Universal has legal standing to claim copyright over the song. If, as appears likely, the lyrics and beat are original, then it’s solely the vocal style which could plausibly be claimed. And it’s not at all clear whether such a case would be successful, were it to reach a courtroom.
Mark Mulligan is one of the sharpest analysts of the modern music industry, and wrote earlier this week that “copyright law was not designed for AI. Music rightsholders will do their best to apply existing law, but they will face challenges in doing so. Meanwhile, there will simply be too much output to effectively pursue plagiarism cases, which take time and ultimately depend on the personal interpretation of non-expert judges and juries.”
What culture will be next?
The point he’s making here is that we are at the start of an incredible explosion of AI-generated content. Given the speed at which large language models and generative AI are evolving, it feels inevitable that there will be the technological capacity to produce near-infinite songs involving different stylistic elements of major artists’ sound in the near future. Plug that into the powerful distribution modes of social media, and you have a recipe for enormous disruption.
It’s easy to imagine that with other less complex forms of cultural output, too. I do a weekly media podcast called The Fold, and have no doubt that were you to train AI on my voice and content, along with my written output, you could convincingly create an AI facsimile of me that would just need the right prompts to create something I might plausibly have said or written.
That’s really confronting! We are not ready for the waves of deepfakes which will flow out of this technology, nor the impossibility of verifying what is real at scale. AI video is still halting and basic, but evolving incredibly swiftly. Once the technology arrives, instead of the deeply weird and soon inevitably cancelled AI Seinfeld streaming on Twitch, people might be able to whistle up season five of Succession.
How is the media industry responding – and what might the future hold?
The traditional media companies are reeling, basically. And calling their lawyers. As the Washington Post reports, of the sites most commonly used to train large language models like ChatGPT, a large number are news sites – including half of the top 10 overall. Getty Images has already sued Stability AI, the company behind Stable Diffusion, one of the main image generation services, citing copyright infringement. Universal has asked Apple Music and Spotify to update their terms of service to explicitly bar AI services from being trained on their artists’ work.
As with most situations involving creativity and billions of dollars, it seems likely to end in yet more lawsuits. The outcome is far from clear, though – it’s easy to imagine a judge viewing the machines’ output as no different from that of a human who has listened to or read a bunch, and come up with their own original composition.
There is also an argument that this is hopelessly overblown. Just as the remix mp3 boom of the late 2000s didn’t stop people listening to the original songs, many believe that the concept of authorship is meaningful to an audience, and that the novelty of hearing Kanye West sing George Michael will wear off before you make it through the first play. This is setting aside the fact that pop music, like all cultural forms, thrives on the creation of something new – something inherently impossible with the current generation of AI, trained solely on history.
There’s also a version of the future in which deals are struck allowing for generative AI to pay to be trained on these vast datasets, and for its output to pay royalties based on how popular that output proves. This has a form of precedent, as Stratechery’s Ben Thompson points out. “On YouTube, TikTok, and most social platforms today, the user just needs to tag the song associated with the artist and it’s cleared. The rights holders will be compensated… Similarly, a user could tag the artist whose voice or likeness was used to create an AI song. Ideally, the record labels and streaming services would have agreements in place to count song’s plays as a ‘stream’ or some other measurable unit that can compensate the underlying owners.”
This could plausibly apply to text, imagery and video too, potentially creating new revenue streams which encourage the creation of new, original material, which can then be endlessly remixed by robots. It’s a somewhat optimistic idea, in a space which understandably fills many in creative communities with fear and dread. But it’s not a stretch to say that the hype here is real. Even President Biden, now 80, seems to have his head around just how consequential this moment is for society and culture.
“Look what’s happening with artificial intelligence right now,” he said recently. “It poses enormous promise and enormous concern. Our world stands at an inflection point. The choices we make today are literally going to determine the future of this world.”