The ACIDS team at IRCAM (Institut de Recherche et Coordination Acoustique/Musique), a French institute dedicated to research in music and sound, has introduced Neurorack, an AI-based real-time synthesizer that's available as an open-source DIY project.
Neurorack is based on IRCAM research into Diet Deep Generative Audio Models With Structured Lottery (pdf).
The research explores the idea that deep models are highly over-parameterized, and the hypothesis that extremely efficient small sub-networks exist within them which, when trained in isolation, can reach higher accuracy than the larger models they are drawn from. The approach tested removes up to 95% of the model's weights without significant degradation in accuracy, which makes it possible to run deep generative audio models on embedded platforms.
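The core idea can be sketched with simple global magnitude pruning: rank all weights by absolute value and zero out the smallest 95%. This is a simplified, unstructured stand-in for illustration only; the paper's structured lottery method is more involved than the threshold masking shown here.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.95):
    """Zero out the smallest-magnitude entries across all weight arrays.

    Illustrative unstructured pruning: keep only the top (1 - sparsity)
    fraction of weights by absolute value, using one global threshold.
    """
    flat = np.concatenate([w.ravel() for w in weights])
    threshold = np.quantile(np.abs(flat), sparsity)
    # Keep weights whose magnitude exceeds the global threshold; zero the rest.
    return [np.where(np.abs(w) > threshold, w, 0.0) for w in weights]

# Toy "model": two random weight matrices standing in for network layers.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 64)), rng.standard_normal((64, 32))]
pruned = prune_by_magnitude(layers, sparsity=0.95)

kept = sum(int(np.count_nonzero(w)) for w in pruned)
total = sum(w.size for w in layers)
print(f"kept {kept}/{total} weights ({kept / total:.1%})")
```

In the lottery-ticket setting, the surviving 5% of weights would then be reset and retrained in isolation; the sparse network is what makes an embedded deployment feasible.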
The developers note:
“Deep learning models have provided extremely successful methods in most application fields, by enabling unprecedented accuracy in various tasks, including audio generation. However, the consistently overlooked downside of deep models is their massive complexity and tremendous computation cost.
In the context of music creation and composition, model reduction becomes eminently important to provide these systems to users in real-time settings and on dedicated lightweight embedded hardware, which are particularly pervasive in the audio generation domain. Hence, in order to design a stand alone and real time instrument, we first need to craft an extremely lightweight model in terms of computation and memory footprint.”
The Neurorack is a hardware demonstration of this approach, based on the NVIDIA Jetson Nano, a compact computer that can run multiple neural networks in parallel. The prototype, shown in the demo embedded above, is designed to be compatible with Eurorack modular synthesizers.
The developers note four goals for Neurorack design:
- Musical: The generative model chosen is particularly interesting, as it produces sounds that are impossible to synthesize without using samples.
- Controllable: The interface was chosen to be easy to manipulate.
- Real-time: The hardware behaves like a traditional instrument, with comparable responsiveness.
- Stand-alone: It can be played without a computer.
The initial model is designed to generate impact sounds, and can produce a wide range of them via seven adjustable 'descriptors'.
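Descriptor-style control can be pictured as a fixed-length conditioning vector, one entry per knob, fed to the generative network. The sketch below is a placeholder, not Neurorack's actual model: the "synthesis" is just noise with a decay driven by the first descriptor, and the descriptor semantics are invented for illustration.

```python
import numpy as np

NUM_DESCRIPTORS = 7  # matches the seven adjustable descriptors

def make_conditioning(knob_values):
    """Clamp raw knob readings to [0, 1] and pack them into a vector."""
    assert len(knob_values) == NUM_DESCRIPTORS
    return np.clip(np.asarray(knob_values, dtype=np.float32), 0.0, 1.0)

def generate_impact(conditioning, length=1024, seed=0):
    """Placeholder 'model': shapes noise with an exponential decay whose
    rate follows the first descriptor. A real deep generative model would
    map the full conditioning vector to audio."""
    rng = np.random.default_rng(seed)
    decay = 5.0 + 50.0 * float(conditioning[0])
    t = np.linspace(0.0, 1.0, length)
    return rng.standard_normal(length) * np.exp(-decay * t)

cond = make_conditioning([0.5, 0.2, 0.8, 0.1, 0.9, 0.3, 0.6])
audio = generate_impact(cond)
print(audio.shape)
```

The point of the design is that each descriptor is a continuous, musically meaningful handle, so a Eurorack knob or CV input can drive it directly.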
It’s not obvious from the project’s “cryptic demonstration video”, but all of the impact sounds are generated by the Neurorack module.
The Neurorack design is open source and available now via GitHub.