Sound Escapes


The Sound Escapes concert is a collaboration with Spectrum Music, released on July 30, 2021 as an online concert.

The concert streamed on Twitch on July 30th at 8pm ET and is now available on YouTube.

Watch the full concert here.

Inspired by the wild and uninhibited beauty of nature, it is a collaboration of 6 emerging composers, an amazing jazz trio and myself. For each piece I created accompanying visuals: a music video crafted from found footage, local landscapes and imagery generated by machine learning / artificial intelligence, all made with open source software. The concert speaks to a love of nature and the technological revolution that is ahead for humanity. While seemingly at odds, I hope that the alien lens of machine learning provides a new perspective on the wonders of everything non-human and a tool to empower more people to make art. Art that can help us see the beauty that is rapidly disintegrating around us.

Every piece integrates different ML techniques, some of which are only a few months old. I had fun working with the composers, trying to find a vision for their concepts and then translating or matching that to various ML research projects. The pieces feel quite unique, although the alien-ness of the machine makes for a fairly psychedelic experience that I leaned into heavily for some of the pieces. After about 7 months of day trips (for video footage), reading more than a hundred papers, experimenting with countless projects and then about 60 hours of video editing on each piece, I have some new perspectives on ML and digital art tools. You can find out more below and in the artist talk Mason and I recorded.

Spectrum composers included: Mason Victoria, Suzy Wilde, and Jackson Welchner with guest composers Dhaivat Jani, Mac Rogers and Sarah Thawer. Zoe Brown, the Managing Artistic Director of Spectrum Music, coordinated and managed the concert.

Musicians were Caleb Klager on bass, Sarah Thawer on drums, and Chris Pruden on piano.

Credits

Composers

Bird Brained – Jackson Welchner

Robbie’s Hobbies – Suzy Wilde

Fade to Black – Dhaivat Jani

i am a strange loop – Mac Rogers

The Night is Young – Mason Victoria

Hot Bod – Sarah Thawer, arr. Mason Victoria

Ensemble

Sarah Thawer (drums)

Chris Pruden (piano, synths)

Caleb Klager (bass)

Audio Visual Team

Lead Audio Artist – Mason Victoria

Lead Visual Artist – Ryan Kelln

Additional Visual Artist – Sol Sarratea

Mixing Artists – Pablo Del Pozo, Dhaivat Jani, Jackson Welchner

Studio Engineers – Pablo Del Pozo, Mason Victoria

Videographers – Ryan Kelln, Laura Soch

Video Editing – Ryan Kelln, Mason Victoria, Zoe Chronis Brown

Graphic Design – Juyeon Kang

Spectrum Music team

Managing Artistic Director – Zoe Chronis Brown

Artistic Producer, Sound Escapes – Mason Victoria

Artistic Producers – Jackson Welchner, Suzy Wilde

Artistic Associate – Dhaivat Jani

Piece details and visual credits

The entire project was created using only open source tools. Using Ubuntu Linux as a base, I was excited to try out kdenlive, which, after a bit of experimentation, seemed both easy to use and full-featured enough for the project. It was my first time using the tool, and I had a mixed experience: I ran into a number of frustrating bugs but was happy with the final results. Linux has been missing a good quality video editor, and kdenlive is getting close to the point where I would recommend it.

For each of the pieces I used different machine learning (ML) techniques based on open source projects to generate images and video. The main technique involves GANs (Generative Adversarial Networks). This technique, introduced in 2014 and subsequently heavily researched, has a fascinating premise that mimics the natural phenomenon of competition and detection/identification seen in camouflage. A generator and a critic/discriminator both start the training process completely naive, then learn their tasks in tandem. The training data consists of real examples of the subject (say, images of cats); the generator tries to create convincing “fake” cat images while the critic is given either a fake or a real cat image and guesses its authenticity. The training software then adjusts the critic based on the correctness of its answer, and adjusts the generator based on whether it was able to fool the critic. In a sense they evolve together, but once training is complete (whenever the artist likes the output) the generator can be used on its own to generate/hallucinate/mimic the training data. Helena Sarin has written a good article about using GANs for art.
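For readers who want to see the mechanics, here is a minimal sketch of that training loop in PyTorch. The tiny fully-connected networks, sizes and learning rates are my own placeholder choices for illustration, not the StyleGAN2/SWAGAN-scale models actually used for the concert visuals.

```python
# Minimal GAN training step (illustrative sketch only; the concert pieces used
# much larger StyleGAN2/SWAGAN models).
import torch
import torch.nn as nn

latent_dim, image_dim = 64, 28 * 28  # toy sizes chosen for illustration

generator = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                          nn.Linear(256, image_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(image_dim, 256), nn.LeakyReLU(0.2),
                       nn.Linear(256, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(critic.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_images):
    """real_images: (batch, image_dim) tensor, e.g. flattened photos of cats."""
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1) The critic guesses which images are real and which are the generator's fakes.
    fakes = generator(torch.randn(batch, latent_dim)).detach()
    d_loss = bce(critic(real_images), real_labels) + bce(critic(fakes), fake_labels)
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) The generator is rewarded when its fakes fool the critic.
    g_loss = bce(critic(generator(torch.randn(batch, latent_dim))), real_labels)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```

After enough of these steps, and whenever the artist likes the output, the critic is discarded and the generator alone is sampled for new images.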

Ashbridges Bay sunrise video shoot

This project is a part of a love story for me. I met my partner, Laura, during the Covid summer of 2020 and in the fall we started doing day trips each weekend to film landscapes. Together we learned how to shoot video and frame landscape scenes while falling in love with each other and the Southern Ontario and local Toronto parks, resplendent in fall colours. I hoped that I could use that video to train an AI that would learn the motion in the video and then be able to generate new video but re-time the motion to match the music for the concert.

Boxing Day video shoot in High Park

In January 2021 I started researching how to make this happen. For three months, every day I would read multiple papers, try out various projects, and contact researchers. Despite the large amount of research being done in the area, what I was looking for hadn’t been done yet, and video generation was still very limited by the large amount of data in video compared to still images. It was unlikely I’d have enough time to develop a completely new technique, or that it was even feasible, especially on the consumer-level hardware I possessed. Disappointed, I had to rethink my approach, especially after meeting with the composers and getting a better idea of the concepts they wanted to pursue. Instead, throughout April and May, I needed to imagine visuals that reflected each concept, then find techniques that could create those visuals and experiment until I had something that seemed usable for each piece. These experiments were then used to start building a video storyboard for each piece.

I filmed our rehearsals, held at the end of May, and used this footage to finish the video storyboards that were used to convey my plan to the composers and musicians and get their feedback. For the recording session at the start of June, we had 7 active cameras and Laura was a critical second camera operator. June and July were spent constructing each piece (approximately a 60-hour week each), after one week of cutting and syncing video tracks and finishing with a week and a half of fixes, polish and final video renders. Throughout this time Laura was instrumental in giving me feedback on the work so far and providing extra support so that I could spend the time needed. The art here is very much a result of the teamwork and love that grew from the challenges and successes of this project.

Technical details

Each piece used a variety of machine learning techniques to generate the visuals, and many pieces also included machine learning generated audio created by Mason Victoria.

Audio details

The music AI was built on Google Magenta and a custom-trained LSTM (long short-term memory) neural network. The AI was given over 1000 jazz melodies, jazz “licks” and classical etudes and used Yotam Mann’s “AI Duet” interface to “call and respond” to musical inputs provided by live performers.
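The concert’s model isn’t reproduced here, but the call-and-respond idea of an LSTM melody model looks roughly like the sketch below. It is a PyTorch stand-in of my own for illustration; the actual system was built on Google Magenta and Yotam Mann’s AI Duet interface.

```python
# Toy sketch of an LSTM "call and respond" melody model (illustration only;
# the actual system was built on Google Magenta, not this code).
import torch
import torch.nn as nn

NUM_PITCHES = 128  # MIDI pitch vocabulary

class MelodyLSTM(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(NUM_PITCHES, 64)
        self.lstm = nn.LSTM(64, hidden, batch_first=True)
        self.head = nn.Linear(hidden, NUM_PITCHES)

    def forward(self, notes, state=None):
        x, state = self.lstm(self.embed(notes), state)
        return self.head(x), state

def respond(model, call_notes, length=16, temperature=1.0):
    """Feed the performer's 'call' through the LSTM, then sample a 'response'."""
    notes = torch.tensor([call_notes])
    _, state = model(notes)            # absorb the call
    out, last = [], notes[:, -1:]
    for _ in range(length):
        logits, state = model(last, state)
        probs = torch.softmax(logits[:, -1] / temperature, dim=-1)
        last = torch.multinomial(probs, 1)
        out.append(last.item())
    return out
```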

We realized early on that the AI was not performing to the level of our performers (at the time, we were stuck with fixed velocity and straight 8th quantization) so we decided to use the AI to trigger a custom sample library of synthesized environmental sounds in addition to musical notes.

In addition, we developed a binaural, object-based audio system that allowed listeners to hear these generated sounds in an immersive audio environment (similar to Dolby Atmos). Sounds produced by the AI are randomly placed in an ambisonic soundstage, creating an immersive and dynamic soundscape that surrounds the listener.
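As a much-simplified illustration of that random placement (the real system rendered binaurally within a full ambisonic soundstage), here is a sketch that pans each AI-triggered sample to a random azimuth with an equal-power stereo law; the function and parameter names are mine.

```python
# Simplified stand-in for random spatial placement: pan each AI-triggered
# mono sample to a random azimuth using equal-power stereo panning.
# (The real system used ambisonics and binaural rendering.)
import numpy as np

def place_randomly(mono_sample, rng=np.random.default_rng()):
    azimuth = rng.uniform(-np.pi / 2, np.pi / 2)   # -90 deg (left) .. +90 deg (right)
    pan = (azimuth / np.pi) + 0.5                  # map azimuth to 0..1
    left = np.cos(pan * np.pi / 2) * mono_sample   # equal-power panning law
    right = np.sin(pan * np.pi / 2) * mono_sample
    return np.stack([left, right], axis=-1), azimuth
```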

Visual details

See the descriptions of each piece below for more information and links to the specific projects used. In addition to the machine learning software, every piece also used a number of open source tools, listed below.

Software:

Code:


Bird Brained

Bird Brained boids still

Bird Brained on Youtube

Composer: Jackson Welchner

Jackson’s pieces often have a cheekiness to them; they challenge the musicians and leave them whooping in delight once the groove lands. Bird Brained fits the mould and is themed visually around bird flocking.

At the outset I knew I wanted to get a Craig Reynolds-inspired ‘boid’ simulation going, but wasn’t sure how to tie that to the music. Fortunately for me, Stephen Young had already done some experiments with boids using my favourite game engine: Godot. From there I experimented with making the simulation look even better from the perspective of a bird watcher and added an interface that allowed someone to “play” the bird flock in time to music. The final control scheme allowed for adding and removing boids, changing their speed and all of the parameters controlling flocking behaviour, triggering some special effects and, perhaps most beautifully, a painting mode where you could use the computer keyboard or a MIDI keyboard to colorize the boids based on the note pressed; they would then paint the screen with their color, creating a cloud of color reminiscent of fireworks or slow-motion Holi festival powder.
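For anyone curious what the underlying flocking rules look like, this is a small Python stand-in for the classic Reynolds update (the piece itself runs a Godot/GDScript simulation built on Stephen Young’s work, with the playing interface layered on top); the weights and radii here are arbitrary illustrative values.

```python
# Classic Reynolds boid update: separation, alignment and cohesion.
# Python stand-in for illustration; the piece uses a Godot/GDScript simulation.
import numpy as np

def step_boids(pos, vel, dt=1 / 60, radius=50.0, max_speed=120.0,
               w_sep=1.5, w_align=1.0, w_coh=0.8):  # weights are arbitrary illustrative values
    new_vel = vel.copy()
    for i in range(len(pos)):
        offsets = pos - pos[i]
        dists = np.linalg.norm(offsets, axis=1)
        near = (dists > 0) & (dists < radius)         # neighbours within the perception radius
        if near.any():
            separation = -(offsets[near] / dists[near, None] ** 2).sum(axis=0)  # steer away from crowding
            alignment = vel[near].mean(axis=0) - vel[i]                          # match neighbours' heading
            cohesion = pos[near].mean(axis=0) - pos[i]                           # drift toward the local centre
            new_vel[i] += (w_sep * separation + w_align * alignment + w_coh * cohesion) * dt
        speed = np.linalg.norm(new_vel[i])
        if speed > max_speed:                          # clamp so the flock stays watchable
            new_vel[i] *= max_speed / speed
    return pos + new_vel * dt, new_vel

# Example usage:
# pos, vel = np.random.rand(100, 2) * 500, np.random.randn(100, 2) * 40
# pos, vel = step_boids(pos, vel)
```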

You can try this for yourself by downloading my software.

A lot of the piece didn’t fit musically with bird flocking, however, so I had to get creative and started to draw on a lot of inspiration, mainly from The Books, whose handcrafted music videos first inspired me to want to make video to music. Laura and I searched for additional Creative Commons footage and discovered a few pieces of gold too: Caleb Wood’s bird shit and some fantastic footage of ducks being ducky.

To really explore the bird theme I used the Big Sleep project to imagine life from a bird’s point of view and even a world where the composer, musicians and I were all birds.
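Driving Big Sleep is mostly a matter of feeding it a phrase and letting it iterate. The snippet below follows the lucidrains/big-sleep package’s Imagine interface as I understand it; exact parameter names can differ between versions, and the prompt is only an example.

```python
# Hedged example of driving Big Sleep (BigGAN + CLIP) from a text prompt.
# Interface follows the lucidrains/big-sleep package; parameter names may
# differ between versions, and the prompt text is only an example.
from big_sleep import Imagine

dream = Imagine(
    text="the world seen through the eyes of a bird",
    lr=0.07,
    save_every=25,        # write an intermediate image every 25 iterations
    save_progress=True,   # keep the intermediates for assembling into video
)
dream()
```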

Some of the sections of this piece I really struggled with, but everything came together in the end. For example, Laura and I spent most of a day learning how to play the bird sim together - her on mouse and me on keyboards, trying to match the birds to the piano solo. Despite the frustrations of software bugs and learning a new instrument, it was a great experience to be (the first?) flocking instrumentalists.

Building little art instruments like that really makes me happy, and Jackson always provides a great piece to jam to.

Footage:

Code:

Tools:


Robbie’s Hobbies

Robbie's Hobbies still

Robbie’s Hobbies on Youtube

Composer: Suzy Wilde

Full score

This piece is dedicated to Robert Wilde, Suzy’s grandfather, and features his landscape paintings and a cameo of him playing the piano for her as a child. This was really a delight to work on, as I’m thrilled any time I can work on art that acts as a love letter to friends, family or community. It was also one of the more frustrating and challenging pieces (technically).

Supposedly machine learning is good at style transfer (copying the style of one image to the content of another), but my video results, with a set of fewer than 20 paintings as training data, weren’t what I was looking for. These techniques still struggle with consistent video results (often suffering from flicker or wildly different stylization each frame). I tried many different approaches, including a number of non-neural-net image processing algorithms, but when the stylization was good it was temporally unstable, and when the video was stable the stylization was weak. The end result was a mix of effects built into kdenlive and ReReVST.
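The crudest fix for that flicker is simply blending each stylized frame with the previous output, which trades flicker for ghosting; ReReVST and similar projects learn temporal consistency properly, which is why one of them ended up in the final piece. A minimal sketch of the naive approach (function names are mine):

```python
# Naive temporal smoothing for per-frame style transfer: blend each stylized
# frame with the previous output. Reduces flicker at the cost of ghosting.
import numpy as np

def smooth_stylized(frames, stylize, alpha=0.6):
    """frames: iterable of HxWx3 float arrays; stylize: any per-frame style-transfer function."""
    prev = None
    for frame in frames:
        styled = stylize(frame)
        prev = styled if prev is None else alpha * styled + (1 - alpha) * prev
        yield prev
```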

I also found a few projects that I thought would be perfect for animating landscape paintings, but they had worse results than reported in the papers or were limited in resolution (or both). In the end it required a lot of video editing to integrate these techniques, but I was happy that I got to use them, likely for the first time, in an art context.

I’m looking forward to this video being updated when techniques improve and high resolution style-aware animation is possible. My ultimate vision is to make the paintings move realistically in the slower sections in the beginning and then dance in the uptempo sections with the musician footage looking like it was part of the painting.

It is always a pleasure working with Suzy, as she is such an emotive and lyrical artist, something I deeply appreciate but isn’t how I create art on my own.

Footage:

Code:


Fade To Black

Fade to Black still

Fade To Black on Youtube

Composer: Dhaivat Jani

Full score

This piece went through a bunch of changes during its conceptualization, but had the fewest changes in implementation. Dhaivat was originally inspired by the beautiful tree outside his window and its transformation throughout the year. We had a good time biking around his neighbourhood looking for magnolia trees (which had also inspired him and which we thought would make a good representation of the spring season).

The piece took a dramatic turn while it was being composed, and became more centered around our feelings regarding the climate emergency and how technology has shaped the world. So, we combined everything together. Despite a sort of overwhelming visual and thematic density, I felt that the complexity and layering represented our feelings when thinking about the combination of problems facing humanity.

The (many) layers in this piece are:

  1. A history of human technology, as dreamt by a machine. Using the aphantasia project, I fed it a list of 146 technologies from a few thousand years ago to today, basing it on the research done for my Creo Animam project. These images form an almost invisible backbone to the piece.
  2. A sequence of videos representing humanity and our relationship with the environment and human industry. Pieced together from public domain and Creative Commons videos, I was particularly delighted by the Prelinger Archives at the Internet Archive, and some government sources.
  3. The seasons of spring, fall and winter as imagined by StyleGAN2. Using images gathered from seasonal search terms, these models were deliberately under-trained on far too little data to create only a vague sense of seasonal time - an impression of color and abstract form. These models were then animated using the maua-stylegan2 project to react to the audio of the piece.
  4. The tree and musicians: a filter or lens which, in this case, literally provides a glimpse of layer 2, the relationship between humanity, industry and nature.

I wanted to illustrate two relationships: those that see a landscape and think, “it’s so beautiful and untouched” and those that think, “it’s so empty and useless”. Both consider the lack of human intervention a great resource or opportunity, but for very different reasons. I’d like to believe there is a perspective of mutuality that dissolves the separation of humanity from their environment.

This piece purposely feels unnerving to me, building to a sense of overwhelming terror or madness, but the final sequence fills me with renewed hope. Václav Havel said, “Hope is not the conviction that something will turn out well but the certainty that something makes sense, regardless of how it turns out”. Fighting for justice and peace for all, including the non-human, will always make sense to me.

Dhaivat really made this piece personal, writing it amidst adversity, and I did my best to respect and honor that, so I was grateful when he thought I had captured the feelings well. For me it was an attempt to bottle the feeling of being changed, with the final afterimage echoing the initial inspiration for this piece and a reaffirmed love of nature, including us.

Footage:

Code:


i am a strange loop

i am a strange loop still

i am a strange loop on YouTube

Composer: Mac Rogers

Visual FX Shader Artist: Sol Sarratea

Full score

Mac really dived into the concept and design, heavily inspired by Hofstadter’s I Am a Strange Loop book. I knew the perfect artist to help achieve this vision: Sol Sarratea, a brilliant visual artist and huge fan of Hofstadter, who has the math and shader skills to pull it off.

To better integrate the concert theme we explored the concept of people defining themselves by their favourite places in nature. Where Hofstadter explored how living emerges from inanimate matter, we wanted to loop back to how a self can define itself by its environment. Mac collected photos, text and audio recordings of friends’ and family’s beloved places. This treasure trove then drove the visual concept.

Mac also integrated a story arc through the piece, a transition from inanimate, moving through self-replicating patterns, achieving life and/or/then digital life, and then transcending or deconstructing back to inanimate matter. Plus, visually, we all liked the idea of highlighting the Fibonacci sequence, fractals and other patterns found in nature. Like Fade to Black, there were many themes, so the challenge was to manage the complexity, interactions and transitions. To complicate matters, integrating the audio recordings required rearranging the piece thematically so that the voices could really shine.

Sol’s research uncovered the Droste effect, which had a Fibonacci-like feel but with beautiful curves, and we were able to build on her existing work with self-organizing textures. Using textures derived from the photos Mac supplied in both effects, I was able to start gluing together the nature, nature-defining-self, and self-replicating-pattern themes.

Droste effect example
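The simplest flavour of the Droste effect is just an image that contains ever-smaller copies of itself; Sol’s shaders use the conformal log-spiral mapping that gives the curved, Escher-like version seen in the piece. A toy sketch of the recursive compositing (file names are hypothetical):

```python
# Simplest flavour of the Droste effect: paste ever-smaller copies of the image
# into itself. Sol's shaders use the conformal log-spiral mapping for the
# curved version seen in the piece; this only shows the core idea.
from PIL import Image

def droste(img, scale=0.5, copies=6):
    out = img.copy()
    w, h = img.size
    for _ in range(copies):
        w, h = int(w * scale), int(h * scale)
        small = out.resize((w, h))
        # centre each smaller copy inside the previous one
        out.paste(small, ((out.width - w) // 2, (out.height - h) // 2))
    return out

# droste(Image.open("favourite_place.jpg")).save("droste.png")  # hypothetical file names
```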

To pair with the descriptions of favourite places I used aphantasia to dream up visuals based on the text. It turned out that the text actually required some translation to help guide the dreaming. This art of the ML text interface is well described by Charlie Snell - it is a new and intriguing way to control machine-generated art. Using machine learning to generate the images provides a nice way to re-emphasize the underlying theme of patterns defining the self and the importance of converting/translating patterns as a form of genesis. People’s brains dream of/remember their favourite places (now just a pattern of neural firing), translated to symbols by their brain, translated by me into other symbols that mean something similar but are better interpreted by a machine, which translates those symbols into visuals - based on the training it has received from a collection of text and image associations from the internet (i.e. how others have made associations between the symbols and images).
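Stripped to its core, that text-steers-image loop is an optimization against CLIP. The aphantasia project parameterizes the image far more cleverly (in the frequency domain, with augmentations and scheduling), but a bare-bones sketch of the idea, using the openai/CLIP package and a made-up prompt, looks like this:

```python
# Bare-bones CLIP-guided image generation: nudge raw pixels toward a text prompt.
# The aphantasia project parameterizes the image far more cleverly; this only
# shows the core text-steers-image loop. The prompt is a made-up example and
# CLIP's usual input normalization is omitted for brevity.
import torch
import clip  # openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model = model.float()  # keep everything in fp32 so gradients flow to the image

text = clip.tokenize(["a quiet lake at dawn, my favourite place"]).to(device)
with torch.no_grad():
    text_feat = model.encode_text(text)

# The "canvas" starts as noise and is optimised directly.
image = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
opt = torch.optim.Adam([image], lr=0.05)

for step in range(300):
    img_feat = model.encode_image(image.clamp(0, 1))
    loss = -torch.cosine_similarity(img_feat, text_feat).mean()  # maximise similarity
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The translation work described above happens in the prompt: rewording someone’s memory of a place into phrases the model associates with the right imagery.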

As Snell discusses, this feels like a conversation or collaboration with an alien mind; a gestalt mind, and in some ways, an ancient mind. The techniques for machine learning are still in their infancy so in some sense they mirror the early evolution of vision and symbolic processing. The low hanging fruit (at least for digital minds) of information processing will likely be the first “discovered” both by researchers and evolutionary processes.

It is this connection to, and greater understanding of, the fundamental information processing inherent in our universe that excites me about machine learning. This piece tries to represent a bit of that thrill.

Everything is tied together by the music – itself a translation to symbols by Mac, interpreted by the musicians, converted to air pressure pulses by their instruments, recorded digitally, and then translated again into visuals to match the rhythms and feel of the music. This long path of translation, encoding and decoding, has correspondence to the processing done in the machine learning algorithms (and in all intelligence). It is in the loss and regeneration of information that the magic happens. More poetically, sharing with each other leaves the imprint of your soul on one another and the things you create.

Laura and I had taken around 100 videos of local landscapes, and now I had an opportunity to use some of our footage to represent the inanimate and the beginning and ending of life. A bit of magic that we discovered at Ashbridge’s Bay early one fall morning and a trip to High Park on Boxing Day with close friends. Visual magic then made to dance, making my own neurons dance each time the motion in the video felt intrinsically tied to the music. I spent a few months researching techniques to create a machine learning approach to generate this effect but in the end I had to do it manually. (If anyone wants to continue to pursue an ML approach please contact me!)

The beginning and ending sections showcase this footage and technique and have 3 overlapping layers each representing one of the musicians. Here I am trying to create a cohesive whole, more than a single scene visually, trying to find a path between stillness and chaos, where life emerges and souls combine.

In the end this piece speaks to me because, like Robbie’s Hobbies, it too is a love letter to both people and places. Thanks to Mac and Sol for their genuine love for this piece and thoughtful, diligent effort bringing it alive.

Footage:

Code:


The Night is Young

The Night is Young still

The Night is Young on YouTube

Composer: Mason Victoria

Mason was inspired by his first camp experience when he was a child. After growing up in the city he saw the stars clearly for the first time. Visually he was inspired by Memo Akten’s fantastic Learning to see: We are made of star dust (#2).

I had taken some nice footage of High Park cherry trees at night, so, together with found footage of aurora borealis and stars, I put together something that tries to match the wonder of seeing the night sky at its most beautiful and colorful. The end of the song is really dynamic, so I was able to switch to something really colourful using GAN-generated nebulae. This also provided a nice opportunity to connect the manipulated colors of astrophotography and Mason’s interest in synesthesia.

I made the most mistakes filming the musicians for this piece, so I had a lot less material to work with than I hoped. It acted as a good constraint, forcing me to really find great Creative Commons footage.

The SWAGAN-based nebula generator was hooked up to Hans Brouwer’s great maua-stylegan2 project (modified to accept SWAGAN models) that allows audio to shape the images generated by a GAN. There was a lot to learn about how to control it, and I just got the basics working in the time available, but the results were quite interesting and this is an area ripe for exploration.
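The essence of the audio-reactive hookup is using a feature of the track to push the generator’s latent vector around. maua-stylegan2 offers far richer controls (chroma, onsets, network bending); the sketch below only modulates a single interpolation with the loudness envelope, the audio file name is hypothetical, and `generator` stands in for any trained StyleGAN2/SWAGAN model.

```python
# Sketch of the audio-reactive idea: use the track's loudness envelope to push a
# GAN latent vector around, one latent per video frame. maua-stylegan2 does this
# far more richly. The file name is hypothetical and `generator` stands in for
# any trained SWAGAN/StyleGAN2 model.
import librosa
import torch

audio, sr = librosa.load("the_night_is_young.wav", sr=22050)  # hypothetical file
hop = sr // 24                                                 # one value per frame at 24 fps
envelope = librosa.feature.rms(y=audio, hop_length=hop)[0]
envelope = envelope / envelope.max()

base = torch.randn(512)    # a calm "home" latent
accent = torch.randn(512)  # a latent the image leans toward when the music gets loud

latents = [(1 - float(e)) * base + float(e) * accent for e in envelope]
# frames = [generator(z.unsqueeze(0)) for z in latents]  # decode each latent to an image
```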

I share Mason’s nostalgia and awe of the night sky. Here the eye imagery (generated by a GAN like the nebulae) represents this wonder. A gaining of perspective through the sense of being on a giant but tiny planet rotating around a star in space, gazing out and ending the journey of billion-year-old photons from the deep vastness of spacetime. A perspective at once so diminishing and so encompassing that it makes your own life, and every life, part of this culmination of continued existence together, as the most fortunate beings to ever live. This theme is a staple in science fiction, such as in Alan Moore’s and Larry Niven’s work.

The Night is Young nebula still

Originally I had intended to use footage of stars projected onto Caleb while he played the bass solo. It had a really nice feel in our tests, but for a variety of reasons this didn’t work out, so instead I used the ReReVST project to turn him and then Sarah into stars and nebulae. Similar to Robbie’s Hobbies I was quite frustrated with the video stylization results, but here they were more interesting and usable after a bit of video editing.

Mason and I are usually on the same wavelength, so this piece came together relatively easily, without a lot of discussion, as is often the case when working alongside an old friend.

Footage:

Code:


Hot Bod

Hot Bod still

Hot Bod on YouTube

Composer: Sarah Thawer

Arranger: Mason Victoria

When I asked Sarah what sort of visuals she was interested in and showed her some examples of what I had done before, she gravitated to the bold, bright colors and intense visuals of Mason’s piece Source Activate in the Coding Chaos concert.

That set the tone, and when Sarah said she wanted her piece to investigate the interplay between people’s multiple personas, their roles and how others perceive them, I hoped to explore that through the interplay of animal, insect and flower colouring and the interacting dynamics of sexual display, camouflage and identity.

I was really fortunate that my good friend and photographer Andrew Drown has been working on a series about rhododendrons and shared his in-progress photographs with me. I’m a huge fan of his work; both of us love interesting textures.

The piece was strongly inspired by Sarah’s dynamic playing. More than anything the bold dynamism I hoped to achieve with the piece is a reflection of her personal playing style. As a fan of the look of Boccioni’s Dynamist paintings I hoped to evoke the movement and color that Sarah incorporates into her playing.

Underlying the fun and beauty of the textures is a question about identity. If the environment determines your identity, who or what is the self when your environment changes around you? For social animals like humans, your identity may move more or less fluidly in reaction to those around you. I question the idea of a singular true self, seeing instead an amalgam and continuum of the communities one belongs to, mediated by one’s degree of identity flexibility. In this piece, the instruments take on the shifting selves, each still evoking the persona of the musician while assuming a variety of textures.

The challenge for the animal textures was that Optex, the best project for combining the musician images with animal textures, didn’t support video, and I lacked the time to fix that. Plus, I wanted to be able to smoothly interpolate between textures and mix them, so instead I took video stills of the musicians, texturized them with Optex, then trained a SWAGAN generator on those. Finally I brought the GAN into maua-stylegan2 to make it audio-reactive. Like in The Night is Young, I was only able to scratch the surface of what is possible, but I liked the combination of mirror effects and distortion done in kdenlive along with the reactive GAN output. At a certain point it doesn’t make sense to get the code to do something you can do faster manually, especially when the goal is a finished piece and not a “basic research” style investigation of the technology or tool building.

This piece required the most exploration and curation of the machine learning tools. While Robbie’s Hobbies required a lot of experimentation, most of those experiments were obvious failures; in this case the results were far more subjective and sensitive to parameter tuning. The artist as curator, I think, is going to become a primary mode of art-making as machine learning tools develop, where much of the labour, as in collage, will be observation and exploration of the output from the machine; finding what is subjectively interesting and exciting to the artist and then organizing it in relation to the other generated art. For example, each animal texture looked better to me after adjusting the size and orientation of the texture and the strength of the mix between the texture and the original image. Then I coordinated, with only partial control, the timing of the animal sequences to create a relationship between them and/or the music. The design of the generator meant that I could only control the textural “destinations”, but the “path” between them may actually pass through other identifiable textures, as sketched below. An interesting concept on many levels - if you’re a tiger, do you go through cheetah to get to octopus? The “map” of textures is created as the GAN learns, and recent research shows that it is more organized than we expected.
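That “destinations versus path” behaviour is easiest to see with a plain latent interpolation: you pick the endpoints, but the decoder chooses what lives in between. A sketch using spherical interpolation, with `generator` standing in for the trained SWAGAN texture model and the endpoint latents as stand-ins for curated ones:

```python
# Spherical interpolation (slerp) between two latent "destinations". The frames
# decoded along this path can pass through other recognisable textures, which is
# the tiger-through-cheetah-to-octopus effect described above. `generator` is a
# stand-in for the trained SWAGAN texture model.
import numpy as np

def slerp(z0, z1, t):
    z0n, z1n = z0 / np.linalg.norm(z0), z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(z0n, z1n), -1.0, 1.0))
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

tiger, octopus = np.random.randn(512), np.random.randn(512)   # stand-ins for curated latents
path = [slerp(tiger, octopus, t) for t in np.linspace(0, 1, 48)]
# frames = [generator(z) for z in path]  # intermediate frames may decode to "something else"
```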

Hot Bod animal textures still: tiger and bee

I ended up taking the videos of Sarah at the recording session, so I saw her playing up close. Her enormous talent was obvious, but what stuck with me was her ability and need to feel the music, to not just know what to play but to embody it. Hot Bod’s concept and theme reflect the way she plays and thinks about music, and that let me show you the soul of the music through her playing, not just in this piece but all of them. Thanks Sarah!

Footage:

Code: