What if you could listen to a piece of music and isolate the sound of a single instrument? The Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT has developed PixelPlayer, an artificial intelligence system that can extract the sound of a specific instrument from a performance.
The system does more than pick out sounds by listening alone. PixelPlayer identifies the instruments visible in a video at the pixel level and then extracts each instrument's sound, synchronizing video and audio without any human intervention. Once each sound is associated with the player on screen, you can simply click on the video to remove the sound of a particular instrument, or adjust the volume of each instrument separately.
In a sample video from CSAIL, a duo plays guitar and violin. If you want to hear only the guitar, you click on the guitarist, and only the guitar sound is extracted. The same holds for footage of a trumpet and a tuba: the other instruments can be reduced or removed altogether while only the trumpet is turned up.
(Video: PixelPlayer demo)
PixelPlayer was trained on more than 60 hours of performance video using deep-learning techniques and can identify the sounds of more than 20 instruments. As described above, the volume of each of these instruments can be adjusted freely. The quality of the extracted sound still varies by instrument, but with more training data, both the number of recognizable instruments and the separation quality can improve. CSAIL says it will soon release the PixelPlayer dataset and code.
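The separation and per-instrument volume control described above can be sketched as mask-based spectrogram separation, the general idea behind systems of this kind: a network predicts a mask over the mixture spectrogram for each source, and multiplying the mixture by that mask isolates the source. The code below is a toy illustration with made-up data, not PixelPlayer's actual implementation.

```python
import numpy as np

def apply_mask(mixture_spec, mask):
    """Isolate one source by masking the mixture spectrogram.

    mixture_spec: (freq_bins, time_frames) magnitude spectrogram
    mask:         same shape, values in [0, 1], predicted per source
    """
    return mixture_spec * mask

def adjust_volume(source_spec, gain):
    # Per-instrument volume control is just a gain on the separated source.
    return source_spec * gain

# Toy mixture of two "instruments" occupying different frequency bands.
rng = np.random.default_rng(0)
guitar = np.zeros((8, 4)); guitar[:4] = rng.random((4, 4))
violin = np.zeros((8, 4)); violin[4:] = rng.random((4, 4))
mixture = guitar + violin

# An ideal binary mask selecting the guitar's frequency band.
mask = np.zeros_like(mixture); mask[:4] = 1.0

separated = apply_mask(mixture, mask)
assert np.allclose(separated, guitar)   # the guitar is recovered exactly
quieter = adjust_volume(separated, 0.5)
```

In practice the mask is soft and predicted by the network from both audio and video; the toy binary mask only shows why masking isolates a source.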
Where could PixelPlayer be used? One application is remastering old performance videos: the sound quality or volume of individual instruments could easily be tuned, and ambient noise could be separated out.
No field is beyond the reach of artificial intelligence, and attempts to bring it into artistic fields such as music have continued for years.
In 2016, the Sony Computer Science Laboratories (Sony CSL) released two songs on YouTube composed with artificial intelligence. Its software, called Flow Machines, learns musical styles from a vast amount of music data and then combines them into compositions of its own.
The songs were composed by drawing on a database called LSDB, which holds more than 10,000 songs. A user picks a style, for example the Beatles, and the system composes in that style. The tracks the researchers actually released were arranged, and given lyrics, by a human composer, but they still attracted considerable attention.
Deepjazz is an automatic jazz generator that a programmer built in 36 hours at a hackathon in 2016. Written in Python with the deep-learning libraries Keras and Theano, it trains a long short-term memory (LSTM) network on MIDI files to compose jazz: artificial intelligence that writes its own jazz.
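The LSTM at the heart of deepjazz can be illustrated by its single time-step update. Deepjazz itself builds the network with Keras and Theano; the numpy version below only sketches the standard gating equations, with made-up sizes and random weights, to show what the recurrent cell computes at each step of a note sequence.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step with the four gates' weights stacked row-wise.

    x: input vector (e.g. an encoded note); h_prev, c_prev: previous
    hidden and cell state; W, U, b: input, recurrent, and bias
    parameters for the input, forget, output, and candidate gates.
    """
    H = h_prev.size
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[:H])          # input gate: how much new content to write
    f = sigmoid(z[H:2*H])       # forget gate: how much old state to keep
    o = sigmoid(z[2*H:3*H])     # output gate: how much state to expose
    g = np.tanh(z[3*H:])        # candidate cell content
    c = f * c_prev + i * g      # updated cell state (long-term memory)
    h = o * np.tanh(c)          # new hidden state (short-term output)
    return h, c

# Tiny illustrative dimensions and random parameters.
rng = np.random.default_rng(0)
X_DIM, H_DIM = 3, 4
W = rng.standard_normal((4 * H_DIM, X_DIM))
U = rng.standard_normal((4 * H_DIM, H_DIM))
b = np.zeros(4 * H_DIM)

h, c = lstm_step(rng.standard_normal(X_DIM),
                 np.zeros(H_DIM), np.zeros(H_DIM), W, U, b)
assert h.shape == (H_DIM,) and np.all(np.abs(h) < 1.0)
```

It is this cell state, carried across time steps, that lets the model learn longer-range structure in a melody than a plain feed-forward network could.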
In 2016, researchers at the University of Washington analyzed classical music and released a vast dataset called MusicNet. It covers 330 freely available recordings, labeling the pitch, instrument, and timing of every note played, more than one million labels in all, each grounded in the actual audio. This dataset, too, is part of the effort to bring machine learning into music. Such analysis can of course help in understanding music that has already been composed, and the way we understand music itself may change if work that until now depended on a trained ear and hand can be done automatically from data.
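The kind of label MusicNet provides, a note's start time, end time, instrument, and pitch, can be pictured with a small sketch. The entries below are invented for illustration and do not come from the dataset itself.

```python
# Hypothetical miniature of MusicNet-style note labels:
# (start_sec, end_sec, instrument, midi_pitch) per note.
labels = [
    (0.00, 1.50, "violin", 76),
    (0.50, 2.00, "piano",  48),
    (1.80, 2.50, "piano",  52),
]

def notes_at(t, labels):
    """Return every note sounding at time t.

    A linear scan is fine for a toy list; a dataset with a million
    labels would want an interval tree for this kind of query.
    """
    return [note for note in labels if note[0] <= t < note[1]]

# At t = 1.0 the violin note and the first piano note overlap.
playing = notes_at(1.0, labels)
assert {note[2] for note in playing} == {"violin", "piano"}
```

With labels of this shape, questions like "which instruments sound at this moment" or "what pitch does the piano play here" become simple queries, which is what makes the dataset useful as supervised training data.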
As the PixelPlayer example showed, the data being recognized is not limited to music itself. In the same year, researchers at the University of Toronto announced an artificial intelligence that, given a single photo such as a Christmas tree, recognizes the image and generates music to match it. Trained on more than 100 hours of online music, the neural network produced a melody at 120 beats per minute and added chords and drum sounds. It drew attention for recognizing an image and producing music that fits its atmosphere.
AI can do more than recognize images or video, or analyze, classify, and generate data about music; it can also create entirely new sounds. Google's NSynth Super, released in March, is a synthesizer that uses machine learning to produce sounds that did not exist before. The device measures only about 20 cm on each side, yet it holds a large amount of sound data and can, for example, combine the characteristics of a flute and a snare drum into a new sound. Artificial intelligence has well and truly met music.
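The "combining a flute and a snare drum" idea can be sketched at a high level: a neural network encodes each sound into a latent vector, the vectors are blended, and a decoder turns the blend back into audio with a genuinely new timbre. The code below shows only the interpolation step, with made-up stand-in embeddings rather than real NSynth encodings.

```python
import numpy as np

# Invented stand-ins for learned timbre embeddings; real NSynth
# embeddings are produced by a trained neural encoder.
flute_z = np.array([0.9, 0.1, 0.4])
snare_z = np.array([0.1, 0.8, 0.2])

def interpolate(z_a, z_b, alpha):
    """Blend two latent timbre vectors.

    alpha = 0 reproduces z_a, alpha = 1 reproduces z_b, and values
    in between yield a sound partway between the two instruments.
    """
    return (1 - alpha) * z_a + alpha * z_b

mixed = interpolate(flute_z, snare_z, 0.5)
assert np.allclose(mixed, [0.5, 0.45, 0.3])
```

The point of working in the latent space is that the blend is not a simple audio crossfade of the two recordings: decoding the mixed vector yields a single coherent new timbre.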