Tim Cohen's💥 Loose Canon 💥

Loose Canon💥Stand up under one unbroken algorithm: an amateur meets AI music

Tracking the process of making a virtual song and accompanying video, blow-by-blow. Literally.
Tim Cohen 8 min read
AI image by ChatGPT

Where do we stand currently with AI? I thought I would have some fun testing this question and, along the way, track the process of making a virtual song and accompanying video. The intention is really to provide a blow-by-blow of what AI actually does in the creative process - and does not do. I'll loosely describe the AI models I used, how long it took, the failures and successes. And at the end of it, you can be the judge of whether this process resulted in something worthwhile or not.

More importantly, if a rank amateur can do this, the question is, how long will it take for someone with real skills to produce absolutely worthwhile creative content - and start putting musicians out of work?!   

There are a lot of AI songs out there now, but until recently they have been notably unsuccessful. David Bowie used a programme called Verbasizer to help write lyrics for his 1995 album "Outside". Bowie has written some fabulous lyrics (I love the line, "We could be heroes, just for one day"), but he has also penned some shockers. "Ha-ha-ha, hee-hee-hee, I'm a laughing gnome, and you can't catch me," springs to mind. So it's understandable that even a songwriter as fabulous as Bowie might appreciate some gentle guidance.

Now AI music is wildly viral. Everything changed in April 2023 with "Heart on My Sleeve" by Ghostwriter977, which used AI to mimic the voices of Drake and The Weeknd. The song sparked massive controversy ... and also excitement, accumulating over 9 million views on TikTok, according to Variety, before being taken down by Universal Music Group. But not by YouTube.

The rise of platforms like Suno AI (launched December 2023) and Udio (April 2024) has democratised AI music creation. Today, Deezer, a music streaming app, reports that 50,000 AI-generated songs are uploaded daily to its service alone. These songs have fooled people spectacularly. "Heart on My Sleeve" was so convincing that many fans initially believed it was an actual Drake-Weeknd collaboration.

From there, other genres followed. "Walk My Walk" by Breaking Rust topped the Billboard Country Digital Song Sales chart in November last year, and "Verknallt in einen Talahon" by Butterbro peaked at No. 3 in Germany as the first fully AI-generated track to enter a national chart. Xania Monet (AI gospel/R&B) earned a multi-million dollar record deal and generated 44.4 million US streams.

The technology has become so sophisticated that Recording Academy CEO Harvey Mason, Jr. recently said that every songwriter and producer he knows has now used an AI tool. Obviously, this is becoming a moral question: do you acknowledge AI assistance with the lyrics, the melody or the composition? I mean, musicians have been using Auto-Tune for years now without acknowledgement. How different is that to AI?

For the streaming companies, it's a question too: Xania Monet is available now on Spotify and YouTube, so for the time being, they are going along for the ride. There is also no acknowledgement on Spotify that "she" is AI generated. And she is now getting over a million listens a month! That's earnings of around $1,200 a month.
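For the curious, the arithmetic behind that last figure can be sketched in a few lines of Python. This simply takes the two numbers quoted above (a million listens and around $1,200 a month) at face value and works out the payout per listen they imply. The function name is my own, and real streaming payouts vary by platform and deal, so treat this as a back-of-the-envelope illustration only.

```python
def implied_rate_per_listen(monthly_earnings_usd, monthly_listens):
    """Return the payout per listen implied by a monthly total.

    Both inputs are the rough figures quoted in the text, not
    official payout data from any streaming service.
    """
    if monthly_listens <= 0:
        raise ValueError("monthly_listens must be positive")
    return monthly_earnings_usd / monthly_listens


if __name__ == "__main__":
    rate = implied_rate_per_listen(1200, 1_000_000)
    # Show the rate in dollars and in cents per listen.
    print(f"${rate:.4f} per listen ({rate * 100:.2f} cents)")
```

In other words, on these figures each listen is worth a bit more than a tenth of a cent.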

But having said that, there is often something obvious about AI songs. "Walk My Walk" includes the line "I keep moving forward, never looking back/With a worn-out hat and a six-string strap". What is a "six-string strap"? I feel as though Breaking Rust AI is just not putting "his" back into it at this point, frankly. And the video is beyond amateurish.

If it's that bad and still gets a following, what does that say about the quality of a lot of popular music? Maybe we just don't care. We are not looking for Brahms; we are looking for some rousing, self-affirming fun.

Anyway, here is the quick-and-dirty version of how I created a song and its video. (Actually, I created several, as we will see). The thing that AI really can't do for you is to choose what you want to create, which, in some ways, is the first and most important lesson. I have been writing about soccer World Cup songs, so I chose that as my theme - it's a World Cup year after all. I also chose this concept: teamwork, participation, enjoyment, excitement, and the kind of buoyancy you get from participating in a team sport, particularly when it represents the nation.

I had a few ideas; one was about an "unbroken sky" to suggest global unity; and the other was "no one wins alone" to emphasise teamwork and togetherness. As hackneyed as it might seem, I like the idea of "standing tall", just to underline the pride element. I then asked ChatGPT and Google Gemini for verses in different formats with those phrases in them. And I got a real mixed bag. Some were interesting, some were embarrassing, some were just nuts.

One suggestion was:

Stand tall where the shadows fall

Answer the call, give it all

Every voice, every tone

No one ever wins alone

This is too pedantic and direct. In some ways, I had to pare the LLMs back a bit to try to make them less obvious. I did ask for no specific reference to soccer; it's not about the tournament, it's about the idea. Eventually, by a process of weeding and adding, and asking for other options, I developed something more or less usable. The "standing tall" ended up as "stand up", which is more direct and imperative.

I put the lyrics into Mureka.ai, an AI music generator in the same vein as Suno. It seems very reliant on the lyrics. The instruction I gave was "Uplifting, chanting, gospel-like", and the genre I chose was "Texas blues" with a male singer.

Oddly enough, this was the easy part. I tried slightly different instructions, and it gave me some stomping stuff. Is it good? Honestly, I was both stunned by its facility and appalled by the result. But it's not, I don't think, entirely unlistenable - it's catchy and fun actually, in a very ordinary kind of way.

The hard fact is that modern music uses a very defined palette. Songs with more than five chords are now a rarity. Tonic, subdominant, and dominant chordal patterns are now de rigueur, with a submediant minor often thrown in for colour. Notes within those patterns are ruthlessly diatonic. I know, you're now saying, eh? Bear with me. Timing changes within a song are very rare, as are chromatic notes. In a sense, when you think about it, it's hardly a surprise that the AI finds it pretty easy to replicate something that hits the middle of the popular taste spectrum.
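If the jargon is a bit much, here is a minimal Python sketch of what that palette looks like in practice. It builds a major scale and then stacks every other note of the scale to get the tonic (I), subdominant (IV), dominant (V) and submediant minor (vi) chords. The function names are my own invention, and for simplicity the sketch spells every note with sharps, so flat keys such as F major will come out with unconventional note names.

```python
# Twelve pitch classes, spelled with sharps only (a simplifying assumption).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets of the seven notes of a major scale from its tonic.
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]


def major_scale(tonic):
    """Return the seven note names of the major scale starting on tonic."""
    root = NOTE_NAMES.index(tonic)
    return [NOTE_NAMES[(root + step) % 12] for step in MAJOR_SCALE_STEPS]


def diatonic_triad(scale, degree):
    """Build a triad on a 1-based scale degree using only notes of the scale."""
    start = degree - 1
    # Take the 1st, 3rd and 5th scale notes above the chosen degree.
    return [scale[(start + offset) % 7] for offset in (0, 2, 4)]


def pop_palette(tonic):
    """Return the four chords described in the text for a given major key."""
    scale = major_scale(tonic)
    return {
        "tonic (I)": diatonic_triad(scale, 1),
        "subdominant (IV)": diatonic_triad(scale, 4),
        "dominant (V)": diatonic_triad(scale, 5),
        "submediant minor (vi)": diatonic_triad(scale, 6),
    }


if __name__ == "__main__":
    for name, notes in pop_palette("C").items():
        print(f"{name}: {'-'.join(notes)}")
```

In C major that gives C-E-G, F-A-C, G-B-D and A-C-E: four chords, every note drawn from the same seven-note scale, which is what "ruthlessly diatonic" means.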

And then I was off. I created five new songs in quick succession. And to make it more challenging, I decided to create a video for a song. This was actually very difficult, maybe because I'm just not very good at it. The big problem is lip-synching. Eventually, I worked out how to do it. I used ChatGPT and Nano Banana to create a set of static images. I then sliced the song into around 15 bits, or sections, using the Apple music software called Logic. Then I loaded those slices into an AI called OpenArt. I also loaded, one by one, an appropriate image, and then asked it to create a lip-synched video. By now, you are saying - he has too much time on his hands!

This took a lot of time, yes, and was very expensive. There is lots of "signing up" and "plans". And the result was really poor - the lip-synch was really bad, presumably because there is so much sound in a song that the AI struggles to know what to synch against. But it occurred to me that if I split the soundtrack into different instruments and just synched against the vocal track, I would get a better result.

I put all the pieces of video into iMovie, loaded the original soundtrack, and aligned the music on each of the video tracks with the sound on the music track. That was time-consuming but kinda worked. 
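The slicing and re-aligning steps above boil down to simple bookkeeping, which can be sketched in Python. This is not what Logic or iMovie do internally; it is only an illustration of the idea, assuming equal-length slices (mine were cut by ear), a hypothetical three-minute song, and made-up clip file names.

```python
def slice_song(duration_ms, n_slices):
    """Split a song into n_slices contiguous (start_ms, end_ms) spans.

    Equal-length slices are an assumption made for illustration.
    """
    if duration_ms <= 0 or n_slices <= 0:
        raise ValueError("duration and slice count must be positive")
    # Compute every boundary from the total length so rounding never drifts.
    bounds = [round(i * duration_ms / n_slices) for i in range(n_slices + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def build_timeline(slices):
    """Say where each lip-synched clip belongs against the original soundtrack."""
    return [
        {
            "clip": f"clip_{number:02d}.mp4",  # hypothetical file name
            "place_at_ms": start,              # offset on the master timeline
            "length_ms": end - start,
        }
        for number, (start, end) in enumerate(slices, start=1)
    ]


if __name__ == "__main__":
    # A three-minute song cut into 15 sections (both figures are assumptions).
    slices = slice_song(duration_ms=180_000, n_slices=15)
    for entry in build_timeline(slices):
        print(entry)
```

The point is that each clip has to start at exactly the offset where its slice was cut from the original song; if one clip lands a fraction late, every mouth movement in it is out of step with the music.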

Anyway, after all that, this is the result - loaded very easily to YouTube, under the pseudonym Ryan Alexis Cross, for reasons I cannot really explain.

So what does this tell us? First, as astounding and wonderful as it is, the AI is not quite there yet. There are good novelty singles out there, but they are not terribly sophisticated. They press toward an amalgamated character. So, when it comes to popular music, the Turing Test remains as yet failed IMHO.

But for idea generation, or something to use for experimentation and for initial ideas, something to spark your imagination, it's just fabulous. Its ability to adopt a persona is enormously powerful. The AI won't give us great songs, but it may just lead us down untrodden paths.

Or it could lead us down tracks that have been visited a zillion times before.

Honestly, I'm still not sure.

Let me know what you think.

For paying members, I'll put up some of my other experiments and also the track with the poor lip-synch, so you can see what happens.

I'm a South African journalist - formerly editor of FM, Business Day & Business Maverick. I'm currently Senior Editor on Currencynews.co.za. Commentary and reflections on business, economics.
