
Since the Middle Ages, classical music composition has evolved from oral traditions to complex written scores. Its canon includes undisputed masterpieces such as Beethoven’s Symphony No. 5 and Mozart’s Requiem.
What makes certain compositions more laudable than others? That is the age-old question. But a contemporary question, one that could only arise in the era of generative artificial intelligence (GenAI), is this: will the 21st century yield a musical composer capable of stirring human emotions without being human itself?
Musicians and computer scientists at the University of California San Diego are considering these and related questions in their pursuit of ear-bending, AI-generated artistic expression. Led by Shlomo Dubnov, a professor in the departments of music and computer science and engineering, and an affiliate of the UC San Diego Qualcomm Institute (QI), a collaborative team is devising strategies to quantify what constitutes good music. From there, they are prompting AI to not only replicate music but to create a composition of its own.
“We are exploring ways to make GenAI creative, going beyond text-to-music prompts, which was one of the more influential contributions from our early research, to push the boundaries of how AI engages non-traditional scores,” said Dubnov.
Consequently, AI is positioned as both a tool and a collaborator.
Artificial Improvisation with AI
Hit play on an improv session created with AI. For roughly eleven minutes, listeners are transported through a sound portal into another dimension of musical otherworldliness. Ethereal wind-blown pitches are interspersed with jazzy riffs, robotic beeps, haunting echoes of organ pipes, and lighthearted melodies reminiscent of bird calls and whale songs.
The recording from Dubnov’s team demonstrates a novel method of interpreting and performing graphic scores with the help of OpenAI’s ChatGPT-4o and their own Music Latent Diffusion Model (MusicLDM). Inspired by Cornelius Cardew’s Treatise, the contemporary improvisation audibly illustrates how AI can transform visual stimuli into sound and expand the creative possibilities in experimental music composition.
Briefly, Cornelius Cardew’s Treatise is a landmark in the history of experimental music and graphic notation. Composed between 1963 and 1967, it consists of 193 pages filled with abstract shapes, lines, and symbols that defy traditional musical interpretation. Lacking any conventional notation, the score offers performers freedom, allowing each realization to be a unique artistic event.
Imagine jazz musicians riffing at a nightclub, a cacophony that finds cohesion in the ear of each listener. Now, picture AI playing all the instruments.
“We leveraged OpenAI’s ChatGPT to interpret the abstract visual elements of Treatise, converting graphical images into descriptive textual prompts for MusicLDM, a pre-trained latent diffusion model designed for music generation,” said Dubnov.
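In outline, that image-to-prompt-to-audio pipeline can be sketched in a few lines of Python. The sketch below is illustrative, not the team’s actual code: it assumes OpenAI’s vision-capable chat API and the publicly released MusicLDM checkpoint on Hugging Face ("ucsd-reach/musicldm"); the file path, prompt wording, and generation settings are stand-ins.

```python
# Illustrative sketch: graphic score image -> text prompt -> audio.
# Assumes OpenAI's Python SDK and the public MusicLDM checkpoint in
# Hugging Face diffusers; paths and prompt wording are stand-ins.
import base64

import scipy.io.wavfile
import torch
from diffusers import MusicLDMPipeline
from openai import OpenAI

# 1) Ask a vision-capable model to describe one page of the graphic score.
client = OpenAI()
with open("treatise_page.png", "rb") as f:  # hypothetical score image
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Interpret this graphic score as a short, concrete "
                     "description of music (instruments, texture, mood)."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
prompt = response.choices[0].message.content

# 2) Feed the description to MusicLDM, a latent diffusion model for music.
pipe = MusicLDMPipeline.from_pretrained(
    "ucsd-reach/musicldm", torch_dtype=torch.float16
).to("cuda")
audio = pipe(prompt, num_inference_steps=200, audio_length_in_s=10.0).audios[0]

# MusicLDM generates 16 kHz audio.
scipy.io.wavfile.write("segment.wav", 16000, audio)
```

Each page of the score thus yields one short audio segment; stringing pages together is where the outpainting technique described below comes in.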
Together with his PhD student Ke Chen, now a UC San Diego alumnus, and in collaboration with the Institute for Research and Coordination in Acoustics/Music (IRCAM) in Paris, France, Dubnov created OuchAI, the first composition to use text-to-music generation to create improvisations; it debuted in a live performance at the Improtech 2023 festival.
The new work on Treatise builds on this earlier research but introduces a technique called “outpainting,” which overlaps sections of AI-generated music and blends them into a seamless composition of rich soundscapes.
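In the published system, outpainting happens inside the diffusion model itself, with each new segment conditioned on the tail of the previous one. The short sketch below illustrates only the final overlap-and-blend idea at the waveform level; the `stitch` helper, the two-second overlap, and the 16 kHz sample rate are illustrative assumptions, not the paper’s exact procedure.

```python
# Illustrative stitch: overlap consecutive generated segments and
# crossfade the shared region so the joins sound seamless. The real
# outpainting conditions the diffusion model on the previous segment;
# this shows only the overlap-and-blend idea on raw waveforms.
import numpy as np

SR = 16000          # sample rate (MusicLDM outputs 16 kHz audio)
OVERLAP = 2 * SR    # 2-second overlap between segments (assumption)

def stitch(segments: list[np.ndarray], overlap: int = OVERLAP) -> np.ndarray:
    """Concatenate segments, linearly crossfading each overlapping region."""
    out = segments[0].astype(np.float32)
    fade_in = np.linspace(0.0, 1.0, overlap, dtype=np.float32)
    fade_out = 1.0 - fade_in
    for seg in segments[1:]:
        seg = seg.astype(np.float32)
        blended = out[-overlap:] * fade_out + seg[:overlap] * fade_in
        out = np.concatenate([out[:-overlap], blended, seg[overlap:]])
    return out

# e.g., piece = stitch([audio_a, audio_b, audio_c]) for three 10 s segments
```

Because each segment’s opening seconds are generated to continue the previous segment’s closing seconds, the crossfade hides the seams rather than papering over unrelated material.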
Dubnov and co-authors Tornike Karchkhadze and Keren Shao, both PhD candidates at UC San Diego, won runner-up in the AI Music Competition at the 2nd Workshop on AI Music Generation, held at the 2024 IEEE International Conference on Big Data, for their music composition and paper, "Interpreting Graphic Notation with MusicLDM: An AI Improvisation of Cornelius Cardew’s Treatise."
Advocating for AI in Music
On February 20-21, UC San Diego hosted the 2025 GenAI Summit, where researchers, industry leaders and thought leaders shared their research and insights on where generative artificial intelligence will take us next. As a committee chair, Shlomo Dubnov advocated for “AI for Music.”
Anna Huang, an assistant professor at MIT, gave a lecture on her research using AI models to interact with musicians. She began by considering different ways to create melodies, framing machine learning as a “puzzle-building process.”
During the Q&A portion of the talk, someone raised one of the more controversial questions facing artists in every field: Is AI a tool for artists, or a competitor?
Zachary Novack, a PhD student in computer science, had some insight into this very idea:
“I think that, as researchers, we need to include musicians and artists in the whole pipeline of development on AI systems to bring perspective on what is actively useful for creatives. I think there’s a lot of full stack text-to-song models out there, which may be incredibly divorced from what musicians desire as tools, and we can do a lot better on our end in designing useful creative tools rather than ones that automate the creative pipeline fully.”
At the GenAI Summit, Novack gave a talk as co-creator of Presto, an innovative model for accelerating music generation. As both a musician and an engineer, he strongly believes that artists should be compensated for their work and the data they provide for these models, reinforcing the relevance of artists.
Currently, Dubnov and his fellow researchers, computer science professors Julian McAuley and Taylor Berg-Kirkpatrick, along with students from the Musaic lab, are building systems to capture so-called "tacit" knowledge in accompaniment and in interaction among multiple musical tracks. They are also exploring robotic conducting of experimental musical scores.
Dubnov’s research was supported by the Qualcomm Institute and funded by Project REACH: Raising Co-creativity in Cyber-Human Musicianship, a European Research Council Advanced Grant. REACH promotes the study of “shared musicality” at the intersection of the physical, human and digital spheres; it is designed to produce models and tools that better explain and encourage human creativity in a context where it is increasingly intertwined with computation.
By Kimberley Clementi
Natalie Calderon-Hansen contributed