Ever had a tune stuck in your head but couldn’t quite place the name? It’s a common frustration for music lovers. Trying to describe a melody to someone else can be like trying to catch smoke. But what if you could simply hum a few bars and instantly discover what that song is? Thanks to advancements in machine learning, this is now a reality.
Think of a song’s melody as its unique fingerprint. Just like no two fingerprints are the same, each melody possesses a distinct identity. At Google, we’ve developed sophisticated machine learning models that can match your hums, whistles, or singing to the correct musical “fingerprint,” allowing you to easily identify what that song is.
So, how does this work behind the scenes? When you hum a melody into Search, our machine learning models spring into action, transforming the audio you provide into a number-based sequence. This sequence acts as a digital representation of the song’s core melody. Our models are trained to recognize songs from a diverse range of inputs, whether it’s someone singing, whistling a catchy tune, humming a remembered chorus, or a studio recording of a track.
Crucially, the algorithms are designed to filter out extraneous details that can obscure the melody. Elements like accompanying instruments, the specific timbre of a voice, and vocal tone are all stripped away. The process isolates the essential melodic contour of the song, leaving us with that pure, number-based sequence – the melodic fingerprint we talked about earlier.
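To make the idea of a “number-based sequence” concrete, here is a toy sketch in Python. The real system uses learned neural-network embeddings; this illustration instead represents a melody as semitone intervals relative to its first note, which (like the real fingerprint) discards timbre and key and keeps only the melodic contour. All names and the interval method are illustrative assumptions, not Google’s actual implementation.

```python
import math

def melody_fingerprint(freqs_hz):
    """Convert a list of pitch estimates (Hz) into rounded semitone
    steps relative to the first note, making the sequence key-invariant."""
    base = freqs_hz[0]
    return [round(12 * math.log2(f / base)) for f in freqs_hz]

# Opening of "Twinkle Twinkle Little Star": C4 C4 G4 G4 A4 A4 G4
twinkle = [261.63, 261.63, 392.00, 392.00, 440.00, 440.00, 392.00]
print(melody_fingerprint(twinkle))  # → [0, 0, 7, 7, 9, 9, 7]
```

Because the sequence is relative to the first note, humming the same tune in a different key (say, a fifth higher) produces the identical fingerprint, which is exactly the invariance the paragraph above describes.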
This melodic fingerprint is then compared, in real time, against a vast database encompassing thousands of songs from across the globe. The system rapidly identifies potential matches, bringing you closer to discovering the song you’ve been searching for. Consider a popular song like Tones and I’s “Dance Monkey.” You can instantly recognize it whether it’s played in its original studio version, sung with different vocals, whistled casually, or even just hummed. Similarly, our machine learning models are trained to recognize the underlying melody of the studio recording, enabling them to accurately match it with a person’s hummed input and tell you definitively what that song is.
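The matching step can be sketched the same way. This minimal example scores a query fingerprint (semitone intervals, as above) against a tiny hypothetical database using average absolute pitch difference; the production system instead searches millions of learned embeddings with fast approximate nearest-neighbor techniques. The song entries, function names, and distance metric here are illustrative assumptions only.

```python
def match_distance(query, reference):
    """Average absolute semitone difference over the overlapping prefix."""
    n = min(len(query), len(reference))
    return sum(abs(q - r) for q, r in zip(query, reference)) / n

# A toy "database" of melodic fingerprints (intervals from the first note).
database = {
    "Twinkle Twinkle Little Star": [0, 0, 7, 7, 9, 9, 7],
    "Ode to Joy":                  [0, 0, 1, 3, 3, 1, 0],
}

def best_match(query):
    """Return the database entry whose fingerprint is closest to the query."""
    return min(database, key=lambda name: match_distance(query, database[name]))

# A slightly off-key hum (one note a semitone sharp) still finds the song.
hummed = [0, 0, 7, 8, 9, 9, 7]
print(best_match(hummed))  # → Twinkle Twinkle Little Star
```

The key design point the example illustrates: because the comparison tolerates small pitch errors, an imperfect hum can still land closest to the correct fingerprint.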
This feature is built upon the foundation of our Research team’s music recognition technology. We first introduced Now Playing on the Pixel 2 back in 2017, leveraging deep neural networks to bring low-power, on-device music recognition to mobile. In 2018, we extended the same technology to the SoundSearch feature within the Google app, expanding its reach to a catalog of millions of songs. This new humming-to-search experience represents a significant leap forward: now, identifying a song is possible even without lyrics or an original recording – all it takes is a hum.