Composing Structured Music Generation Processes with Creative Agents

A branch of research on generative musical systems today aims at developing new creative tools and practices through the composition of high-level abstract specifications and behaviors, as opposed to designing autonomous music-generating agents. Our research on interactive real-time music generation has contributed to this field with models and architectures of generative agents specialized in a wide range of musical situations, from instant reactivity to long-term planning. The resulting models combine machine learning and generative processes with reactive listening modules to offer free, planned, and reactive approaches to corpus-based generation.

The corresponding generative strategies are all built upon a musical memory: a model learned from a segmented and labelled musical sequence, providing a graph that connects repeated sub-patterns in the sequence of labels. This graph offers a map of the memory’s structure, motives and variations, and can be traversed following different strategies to generate new sequences that reproduce its hidden internal logic.
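As a deliberately simplified illustration of such a memory (the actual systems rely on richer, automaton-like structures), the Python sketch below indexes a label sequence so that every position is linked to the other positions carrying the same label; jumping along these links while otherwise moving forward is what produces recombinations consistent with the memory’s internal logic. All function and variable names here are hypothetical.

from collections import defaultdict

def build_memory_graph(labels):
    """Hypothetical, simplified memory index.

    For each position i, list the other positions whose label equals
    labels[i]: a navigation can either continue forward (i -> i + 1)
    or jump to an equivalent position, recombining the memory while
    staying consistent with its sequence of labels.
    """
    occurrences = defaultdict(list)
    for i, label in enumerate(labels):
        occurrences[label].append(i)
    return {i: [j for j in occurrences[label] if j != i]
            for i, label in enumerate(labels)}

# Example: a segmented solo annotated with chord labels.
memory_labels = ["Dm7", "G7", "Cmaj7", "Dm7", "G7", "Em7"]
print(build_memory_graph(memory_labels))
# {0: [3], 1: [4], 2: [], 3: [0], 4: [1], 5: []}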

The notion of a scenario guiding the generation process was introduced in Jérôme Nika’s PhD thesis, making it possible to bring anticipation and forward motion into improvised human-computer performances. This symbolic sequence is defined on the same alphabet as the labels annotating the memory (chord labels, chunked audio descriptor values, or any user-defined labelled items). As summarized in the figure below, the concatenative generation process outputs an optimal sequence made up of subsequences of the memory whose concatenated labels match the scenario.

Architecture of a generative agent articulating a “memory” and a “scenario”.
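The matching step can be sketched as a minimal greedy procedure: given a memory of labelled segments and a scenario (a sequence of labels), copy at each step the longest contiguous run of memory segments whose labels match the upcoming scenario labels. The published algorithm computes an optimal concatenation with anticipation of future scenario events; the function below, with its hypothetical names, only illustrates the principle.

def generate_from_scenario(memory_labels, memory_contents, scenario):
    """Greedy sketch of scenario-guided concatenative generation.

    Walks the scenario left to right and, at each step, copies the
    longest contiguous memory subsequence whose labels match the
    upcoming scenario labels.
    """
    output, t = [], 0
    while t < len(scenario):
        best_start, best_len = None, 0
        for i, label in enumerate(memory_labels):
            if label != scenario[t]:
                continue
            length = 0
            while (t + length < len(scenario)
                   and i + length < len(memory_labels)
                   and memory_labels[i + length] == scenario[t + length]):
                length += 1
            if length > best_len:
                best_start, best_len = i, length
        if best_start is None:
            raise ValueError(f"No memory event labelled {scenario[t]!r}")
        output.extend(memory_contents[best_start:best_start + best_len])
        t += best_len
    return output

contents = [f"segment_{i}" for i in range(6)]
labels = ["Dm7", "G7", "Cmaj7", "Dm7", "G7", "Em7"]
print(generate_from_scenario(labels, contents, ["Dm7", "G7", "Cmaj7", "Em7"]))
# ['segment_0', 'segment_1', 'segment_2', 'segment_5']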

The Python library implementing these models was interfaced with a library of Max objects dedicated to real-time interaction. In this research axis we focus on higher-level compositional applications taking advantage of this scenario object. This work is associated with the development of a client interface implemented as a library for the OpenMusic and OM# computer-assisted composition environments. In order to increase the level of abstraction in the meta-composition process, a second agent can be introduced beforehand to generate the scenario itself (figure below).

Navigation through a “Memory Corpus” guided by (1) explicit scenarios entered manually or (2) scenarios generated by another agent.

The training data (memory) of this second agent is a set of label sequences, from which new symbolic sequences are produced through “free” generation runs and then used as scenarios to query the first agent. These scenarios are thus unknown to the user, and the meta-composition paradigm changes radically: the user no longer explicitly defines a temporal evolution, but provides a corpus of sequences that serve as “inspirations” articulating the narrative of the generated music.
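A minimal sketch of this second, “meta” agent, assuming a plain first-order Markov walk over the corpus of label sequences as a stand-in for the actual free navigation of the memory model; its output can then serve as the scenario handed to the first agent (all names are hypothetical).

import random
from collections import defaultdict

def free_generation(label_sequences, length, seed=None):
    """Hypothetical free generation over a corpus of label sequences,
    approximated here by a first-order Markov walk."""
    rng = random.Random(seed)
    transitions = defaultdict(list)
    for seq in label_sequences:
        for a, b in zip(seq, seq[1:]):
            transitions[a].append(b)
    openings = [seq[0] for seq in label_sequences]
    current = rng.choice(openings)
    scenario = [current]
    while len(scenario) < length:
        nexts = transitions.get(current)
        # Dead end: restart from one of the corpus openings.
        current = rng.choice(nexts) if nexts else rng.choice(openings)
        scenario.append(current)
    return scenario

# A corpus of "inspiration" sequences defining the underlying narrative.
corpus = [["Dm7", "G7", "Cmaj7", "Em7"],
          ["Dm7", "G7", "Em7", "G7", "Cmaj7"]]
print(free_generation(corpus, length=8, seed=1))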

This framework makes it easy to create large numbers of variations around the same explicit or underlying structures, which can be used individually or simultaneously as polyphonic masses. The compositional practices enabled by this association of generative models and computer-assisted composition could be described metaphorically as the generation of musical material “composed at the scale of the narrative”, where the compositional gesture remains fundamental while operating at a high level of abstraction.
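Building directly on the two sketches above (and reusing their hypothetical corpus, labels and contents variables), re-running the meta-agent with different seeds and realising each resulting scenario against the same memory gives such a family of variations around one underlying structure:

# Several variations around the same underlying structure: each seed
# yields a new scenario, realised with the same memory.
for seed in range(3):
    scenario = free_generation(corpus, length=8, seed=seed)
    print(seed, generate_from_scenario(labels, contents, scenario))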

Read more

Jérôme Nika, Jean Bresson. Composing Structured Music Generation Processes with Creative Agents. 2nd Joint Conference on AI Music Creativity (AIMC), 2021, Graz, Austria. ⟨hal-03325451⟩