How to Develop a Shazam-Like Application
Music lovers born before the era of information technology had a hard time knowing that they might never again hear a tune that got stuck in their head. When there were no smartphones, you would probably have to write the song lyrics down to Google them later. Before the times of computers, you could either trouble people by singing a half-remembered chorus or just come to terms with your fate of never finding out. Now we live in a time of incredible inventions, and if you happen to hear an amazing melody, wherever you are, you can just pick up your phone and find it using Shazam.
Launched in 2002 in London, England, Shazam has worked its way up from a service called 2580 to a free or low-cost mobile application running on most mobile platforms. Since September 2014, Shazam has been a pre-installed service in Apple's iOS. Moreover, the company went even further and integrated Shazam with Apple's intelligent personal assistant, Siri :) Now an iPhone user can ask what song is playing at the moment and get an answer immediately.
Smartphone users have seen a lot of music recognition applications in recent years, such as SoundHound, Spotsearch, musiXmatch, etc., but all of them have been trying to follow in Shazam's footsteps, sometimes adding their own unique features, sometimes copying existing ones. Today we will try to figure out how to build a song identification application that has all the best features of its most popular peers. Still, Shazam will serve as our role model in this research. So let's finally find out how to develop an application like Shazam!
Technology behind Shazam
Have you ever wondered how Shazam works? Of course you have, that's why you're here! I will try to explain its operating principle in a few words so as not to waste your time or overload you with a pile of technical terms. However, if you are interested in a nerdier explanation, you can read the article written by one of Shazam's developers, Avery Li-Chun Wang. But for now, here is a description of Shazam's sound recognition procedure in plain English.
The whole process of searching and tagging a song is very similar to ordinary fingerprinting. Yep, it's like checking whether a criminal's fingerprints match any of the ones stored in the FBI's fingerprint database. But unlike a criminal who is guilty of terrible crimes, a song is merely guilty of being really cool :) See for yourself how the identification is performed:
Step #1: Shazam fingerprints a massive catalog of music (beforehand) and then stores these fingerprints in a database.
Step #2: A user records a 10-second sample of a song, which is fingerprinted automatically.
Step #3: The application uploads the fingerprint to Shazam's service, which runs a search for a matching fingerprint in the database.
Step #4: If a match is found, the user gets the info on the song they are interested in; otherwise, they are informed that the song wasn't found.
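To make Step #3 a bit more concrete, here is a minimal Python sketch of how a server might match a sample's fingerprints against its database. The hash values, the hash_to_songs index, and the vote threshold are illustrative assumptions, not Shazam's actual data structures:

```python
from collections import Counter

# Hypothetical inverted index built offline in Step #1:
# fingerprint hash -> list of (song_id, offset of that hash within the song)
hash_to_songs = {
    0xA1B2: [("song_42", 12.5), ("song_7", 3.0)],
    0xC3D4: [("song_42", 13.1)],
}

def identify(sample_hashes, min_votes=2):
    """Return the song whose fingerprints best match a short sample.

    sample_hashes: list of (hash, offset within the sample) pairs from Step #2.
    """
    votes = Counter()
    for h, sample_offset in sample_hashes:
        for song_id, song_offset in hash_to_songs.get(h, []):
            # Hashes from the true match should all share the same time shift,
            # so vote for (song, shift) pairs rather than for songs alone.
            shift = round(song_offset - sample_offset, 1)
            votes[(song_id, shift)] += 1

    if not votes:
        return None  # Step #4: the song wasn't found
    (song_id, _), score = votes.most_common(1)[0]
    return song_id if score >= min_votes else None

print(identify([(0xA1B2, 0.4), (0xC3D4, 1.0)]))  # -> song_42
```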
Yet another question: how does audio fingerprinting work? Not to go into too much detail (you can still find it in the article mentioned above), we can say that this technology is based on building a time-frequency graph called a spectrogram. The basic points you should know about this mysterious process are:
- Shazam's algorithm fingerprints each song by creating a spectrogram, which plots three dimensions: frequency, amplitude, and time.
- The same algorithm filters this spectrogram by identifying peak intensities, i.e. their frequency and the time they occurred. Thus, the spectrogram can be reduced to a two-column table, where the first column corresponds to a peak's frequency and the second one to the time it occurred (see the sketch after this list).
- When the application creates a quick graph of a song being tagged, it compares it with those already stored in the fingerprint database; in this process, frequency plays the main role. Shazam finds as many songs with matching frequencies as possible and then checks whether those frequencies also line up in time with the requested song (there can be several frequency matches, but the timing may differ). If there is a match, the user gets the information on the song.
- Don't try this at home :) Building this process from scratch requires solid programming skills and years of refinement and testing.
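For the curious, here is a minimal sketch of the spectrogram-and-peaks idea described above, using NumPy and SciPy. The window size, neighborhood size, and amplitude threshold are illustrative choices, not Shazam's actual parameters:

```python
import numpy as np
from scipy.ndimage import maximum_filter
from scipy.signal import spectrogram

def peak_fingerprint(samples, sample_rate, neighborhood=20, min_amp_db=-40.0):
    """Reduce an audio signal to a list of (frequency, time) spectrogram peaks."""
    # Time-frequency graph: rows are frequencies, columns are time frames.
    freqs, times, sxx = spectrogram(samples, fs=sample_rate, nperseg=1024)
    sxx_db = 10 * np.log10(sxx + 1e-12)  # convert to dB, avoiding log(0)

    # A point is a peak if it is the local maximum of its neighborhood
    # and loud enough to stand out from the background noise.
    local_max = maximum_filter(sxx_db, size=neighborhood) == sxx_db
    peaks = np.argwhere(local_max & (sxx_db > min_amp_db))

    # The two-column table: the frequency of each peak and the time it occurred.
    return [(freqs[f_idx], times[t_idx]) for f_idx, t_idx in peaks]

if __name__ == "__main__":
    sr = 11025
    t = np.arange(0, 3.0, 1 / sr)
    tone = np.sin(2 * np.pi * 440 * t)     # a 440 Hz test tone
    print(peak_fingerprint(tone, sr)[:5])  # peaks cluster around 440 Hz
```

In a real system these (frequency, time) pairs are further combined into compact hashes so that the lookup described earlier stays fast even across tens of millions of tracks.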
Library for Shazam
I've been wondering how quickly and precisely Shazam identifies a requested song. I mean, how do they manage to get such an amount of music to process and add to their library? It turns out that their team is always looking for new opportunities to partner with the most famous labels, music, television, and advertising companies, and even movie makers. Moreover, they work directly with artists to upload their tracks before their official release. They even have teams of people who scour music stores for new and unusual songs! Can you imagine the scale?! Now the Shazam library totals more than 35 million tracks. The number is impressive...
Now that the hardest part is behind us, it's time to discuss the interface of our future music recognizer.
What does your music recognition app need?
As usual, we've picked only the most important features a Shazam-like application should have at the start. So let me introduce the set of functions your future Shazam cannot exist without.
#1 Music recognition
You bet! Since we're talking about the UI of our application, it's quite clear that there should be a label like the well-known "Touch to Shazam" and a corresponding button to launch the recognition mechanism. But what do you do once the desired song has finally been captured? Obviously, there should be some way to store the information found. Shazam used to have a menu called My Tags, which let you see the songs you had found previously. Now this menu looks a bit different and is called "My Shazam". What's more, it offers a few additional options. Which ones exactly? Well, keep reading and you will find out.
So what happens upon entering the My Shazam menu? There should be several things here that make using the app more convenient. When a user selects a song from her list of tags, she should be able to:
- play the track to make sure it is exactly what she was looking for
- check the lyrics to sing along with the artist
- watch the YouTube video to fall in love with the song even more
- buy the track so she can listen to it over and over again
Technical details
As we've already mentioned, creating this kind of application is laborious, and this part is by far the hardest one. Luckily, there are existing libraries and services that demonstrate how this solution can be carried out.
Along with an SDK that lets a mobile device capture music, Gracenote provides developers with a set of APIs to perform music identification. All of them can be found by following the link above.
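As a rough illustration of the client side, here is how an app might hand a recorded sample to a recognition service over HTTP. The endpoint URL, request fields, and response format below are placeholders for whichever service you integrate; this is not Gracenote's actual API:

```python
import requests  # third-party HTTP client: pip install requests

RECOGNITION_URL = "https://api.example.com/v1/identify"  # hypothetical endpoint

def recognize_sample(wav_path, api_key):
    """Upload a short recorded sample and return matched track metadata, if any."""
    with open(wav_path, "rb") as f:
        response = requests.post(
            RECOGNITION_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            files={"sample": ("sample.wav", f, "audio/wav")},
            timeout=15,
        )
    response.raise_for_status()
    result = response.json()
    if not result.get("match"):
        return None  # nothing found for this sample
    return {"title": result["match"]["title"], "artist": result["match"]["artist"]}
```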
#2 Audio Visualization
We need to show the user that the application is recording and processing the song being played, that's why we need a recognizable visualization. Everybody knows Shazam's rotating, pulsing sphere, but we suggest spicing audio visualization up with a pinch of fun. In case you've missed our previous posts, here is a lovely library that Cleveroad developed for exactly such purposes. You can find more details on our GitHub page. Meanwhile, enjoy the animation!
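If you build your own visualization, all it really needs from the recorder is a stream of loudness values. Here is a minimal sketch, assuming you already have the raw sample buffer; splitting it into "bars" is just one simple way to drive an animation:

```python
import numpy as np

def amplitude_envelope(samples, bars=32):
    """Split a recorded buffer into equal chunks and return one loudness
    value (RMS) per chunk -- the numbers a visualizer would animate."""
    chunks = np.array_split(np.asarray(samples, dtype=np.float64), bars)
    return [float(np.sqrt(np.mean(chunk ** 2))) for chunk in chunks]

# Example: a fading 440 Hz tone produces a shrinking row of bars.
sr = 8000
t = np.arange(0, 1.0, 1 / sr)
signal = np.sin(2 * np.pi * 440 * t) * np.linspace(1.0, 0.0, t.size)
print([round(v, 2) for v in amplitude_envelope(signal, bars=8)])
```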
#3 Redirection to outside services
Today, you can barely find an application that isn't connected to social networks. Twitter and Facebook buttons are must-haves. That way, users can share songs that express their current mood, see what their friends have Shazamed, add those tracks to their tags, listen to them, and buy them. Moreover, linking to other services makes your application more visible.
#4 Search
Everything is quite obvious here: you should give your users a chance to search for songs, artists, videos, and albums.
Moving forward, let's now see how your music identification app can develop further. Below are some features you can add later.
What your application may need in the future
#5 Personal account
Those of you who use Shazam regularly may have noticed that the application has refreshed its UI design and added some appetizing features. The developers strive to make the application more personalized, so now we can sign up for Shazam to keep our tags from getting mixed up with those of other people (friends, family members, etc.).
And again, you can offer sign-up via any social network (stick to the most popular ones) or through email.
#6 Visual recognition capabilities
About a year ago, Shazam presented its new visual recognition feature. Now users can scan various QR codes, print ads, and other products to dive into augmented-reality content. What does that mean? Since the app collaborates with giants like Disney, Levi's, Guerlain, Time, HarperCollins, and more, users can find out more about recent events and promos, get coupons, watch exclusive behind-the-scenes videos and photos, and so on.
If you want to go further than advertising, you can implement this feature to let your users search for things they noticed in movies or series, as TheTake does, or for other items such as labels, artworks, billboards, etc.
The procedure of photo/video recognition is also based on fingerprinting, but it requires a huge library of frames and pictures. It seems that Clarifai can help you with this: the company has solid SDKs and APIs for integrating image recognition into your app.
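To give a flavor of how image fingerprinting works, here is a toy "average hash" sketch using Pillow. The reference images are generated on the fly just for the demo; a production system (or a service like Clarifai) would use far more robust features than this:

```python
from PIL import Image, ImageDraw  # pip install pillow

def average_hash(img, hash_size=8):
    """A tiny perceptual hash: shrink the image to hash_size x hash_size,
    grayscale it, and mark which pixels are brighter than the average."""
    small = img.convert("L").resize((hash_size, hash_size))
    pixels = list(small.getdata())
    avg = sum(pixels) / len(pixels)
    return tuple(p > avg for p in pixels)

def hamming(h1, h2):
    """Number of differing bits -- lower means more similar images."""
    return sum(a != b for a, b in zip(h1, h2))

def make_test_image(size, white_box):
    """Build a toy image: a white rectangle on a black background."""
    img = Image.new("L", size, 0)
    ImageDraw.Draw(img).rectangle(white_box, fill=255)
    return img

# Toy stand-ins for a library of fingerprinted posters and a captured frame.
poster = make_test_image((200, 200), (0, 0, 99, 199))     # bright left half
unrelated = make_test_image((200, 200), (0, 0, 199, 99))  # bright top half
frame = make_test_image((640, 640), (0, 0, 319, 639))     # resembles the poster

library = {"left_poster": average_hash(poster), "top_ad": average_hash(unrelated)}
query = average_hash(frame)
best = min(library, key=lambda name: hamming(library[name], query))
print(best)  # -> "left_poster"
```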
#7 Social element
Which means an improved newsfeed. Now, besides seeing what their Facebook friends Shazam, users can find out the preferences of the most famous artists. In this way, Shazam brings people closer to their favorite musicians. Furthermore, the upgraded newsfeed keeps users up to date with the latest TV and music news by broadcasting fresh stories and videos.
Verified accounts are part of Shazam's socialization program. So if your application becomes so enormously popular that famous artists create accounts on it, you will obviously need a verification procedure for them.
You can gather the latest data on the most popular tracks Shazamed around the world and present this data as charts or mark the top songs on a map. These features are called Pulse and Explore on Shazam: the first lets users see top Shazams in real time, while the second shows the most requested tracks on a map. To implement an Explore-like feature, you might need Apple's and Google's APIs to make your application location-aware; a rough aggregation sketch follows below.
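Here is a small sketch of the backend aggregation an Explore-style map could rely on. The tag_events data model is a made-up placeholder for whatever your app actually records:

```python
from collections import Counter, defaultdict

# Hypothetical tag events recorded by the backend: (track title, user's city).
tag_events = [
    ("Track A", "London"), ("Track B", "London"),
    ("Track A", "London"), ("Track A", "Kyiv"),
]

def top_tracks_by_city(events, limit=3):
    """Group tag counts per city -- the data an Explore-style map would plot."""
    per_city = defaultdict(Counter)
    for track, city in events:
        per_city[city][track] += 1
    return {city: counts.most_common(limit) for city, counts in per_city.items()}

print(top_tracks_by_city(tag_events))
# {'London': [('Track A', 2), ('Track B', 1)], 'Kyiv': [('Track A', 1)]}
```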
What do you need to earn money?
This is where Shazam's approach to making a profit differs from other applications'. There are two versions of the same product: Shazam and Shazam Encore. The second one is completely free of in-app advertisements but carries a $6.99 price tag. So let's see what you can do to make money on your song recognizer.
First of all, remember that Shazam Encore was launched long after the release of the free version of the app. So, as usual, before asking users to buy your product, you need to let them try it for free and see all of its incredible capabilities. (It goes without saying that the application has to work seamlessly.) It's never too late to hang a price tag on your product.
In-app advertisements are what keeps the free version of Shazam afloat when it comes to earning money. Users see such ads while browsing the newsfeed or charts. That's a viable path for you too: start by deploying a free application lightly touched by advertising that still offers a great service. If you later want to offer an ad-free experience, you can sell a subscription or, as Shazam did, launch a separate app with more capabilities and none of the annoying promos, sales, coupons, etc.
Well, we hope that you are a bit closer to the secret of great music identification development now. To wrap all the information above into something memorable, let's make a few final statements.
The music recognition feature is the core of our song identifier, so its algorithm should be polished to perfection. Shazam uses audio fingerprinting based on spectrogram building. There is a common belief (mostly among seasoned programmers) that this algorithm can be further enhanced and elaborated, while some think entirely different methods can be applied. You can check the link provided at the beginning to see the developer's results if this topic has piqued your interest.
The main priority is to make your application work well, so again we'd advise going the MVP route. Later, when people have figured out what your application does and how, you can add extra features such as video or photo recognition and upgrade its design.
Last but not least: don't try to make your recognizer paid from the start. Wait for a while, see what happens, and only then try that approach. You can always resort to in-app advertisements and freemium monetization models.
I guess that's all for today. It was nice to see you here, and it would be nice if you came back next time. If you have any questions or just want to chat with experienced professionals about your business, please contact our managers. They would be glad to hear from you.
There are several features that should be included in a Shazam-like app:
- Music recognition
- Audio visualization
- Redirection to external services
- Search
- Personal account
- Visual recognition
Shazam is used to identify songs based on a short audio sample. It helps users find songs they hear on the go.
Shazam generates the greatest part of its revenue from ads shown to users inside the app. On top of that, Shazam is owned by Apple, so it can redirect traffic to Apple Music and other music streaming platforms.