Introduction
Librezam supports multiple backends (Shazam, Audd, ACRCloud, Tencent, NetEase) for music recognition.
By default, only the Shazam backend is used; the other backends stay disabled until you explicitly enable them in the Backends order settings.
This document explains the characteristics of each backend and how user data is processed.
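The ordering behaviour can be pictured as a loop over the configured backends. The sketch below assumes that backends are tried in order until one returns a match; that is a guess at the design, not Librezam's actual code. `Match`, `BackendEntry`, `recognizeInOrder`, and the stub backends are hypothetical names introduced only for illustration.

```typescript
// Hypothetical types; Librezam's real internals may differ.
type Match = { title: string; artist: string };
type Recognizer = (audio: Float32Array) => Promise<Match | null>;

interface BackendEntry {
  name: string;
  enabled: boolean;
  recognize: Recognizer;
}

// Try each enabled backend in the configured order and stop at the first match.
async function recognizeInOrder(
  order: BackendEntry[],
  audio: Float32Array
): Promise<{ backend: string; match: Match } | null> {
  for (const entry of order) {
    if (!entry.enabled) continue; // disabled backends never receive audio
    try {
      const match = await entry.recognize(audio);
      if (match !== null) return { backend: entry.name, match };
    } catch (err) {
      // A failing backend is skipped so the next one can be tried.
    }
  }
  return null;
}

// Stub backends standing in for real network calls (assumed, for illustration).
const defaultOrder: BackendEntry[] = [
  { name: "Shazam", enabled: true, recognize: async () => null },
  { name: "Audd", enabled: false, recognize: async () => ({ title: "A", artist: "B" }) },
  { name: "ACRCloud", enabled: true, recognize: async () => ({ title: "Song", artist: "Artist" }) },
];
```

With these stubs, Shazam finds nothing, Audd is skipped because it is disabled, and the result comes from ACRCloud. The privacy-relevant point is that a disabled backend never receives any audio.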
Shazam
* User data handling
Audio data recorded from the page is converted to an acoustic hash (fingerprint) before it is sent to Shazam's servers; the original audio cannot be reconstructed from this hash.
Like a hash value such as SHA-256, the acoustic hash can be used for matching but not for restoring the original content, which benefits privacy.
Fingerprint creation is performed using the node-shazam-api implementation of the reverse-engineered Shazam algorithm.
The browser's language setting is also sent so that the server knows which language to prioritize in recognition results.
The service is operated by a company in the United States.
The source code of node-shazam-api can be found here.
https://github.com/FoxRefire/node-shazam-api/tree/webpack
Apple's privacy policy can be found here.
https://www.apple.com/legal/privacy/en-ww/
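To give a feel for why an acoustic hash cannot be turned back into audio, here is a toy fingerprint. It is emphatically not the Shazam algorithm and not what node-shazam-api does; `dominantBin` and `toyFingerprint` are hypothetical names. The sketch keeps only the index of the strongest frequency bin in each frame and discards everything else.

```typescript
// Toy illustration only: this is NOT the Shazam algorithm.
// Returns the index of the strongest frequency bin in one frame.
function dominantBin(frame: Float32Array): number {
  const n = frame.length;
  let bestBin = 0;
  let bestPower = -1;
  // Naive DFT over bins 1 .. n/2 - 1 (DC is skipped); fine for small frames.
  for (let k = 1; k < n / 2; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * k * t) / n;
      re += frame[t] * Math.cos(angle);
      im -= frame[t] * Math.sin(angle);
    }
    const power = re * re + im * im;
    if (power > bestPower) {
      bestPower = power;
      bestBin = k;
    }
  }
  return bestBin;
}

// Reduce the signal to one small integer per frame.
function toyFingerprint(samples: Float32Array, frameSize: number = 64): number[] {
  const peaks: number[] = [];
  for (let start = 0; start + frameSize <= samples.length; start += frameSize) {
    peaks.push(dominantBin(samples.subarray(start, start + frameSize)));
  }
  return peaks;
}
```

A 256-sample input yields just four integers here, so most of the signal is discarded and many different recordings map to the same fingerprint. A real fingerprint keeps more structure than this, but the same idea applies: it is enough to match against a database, not enough to play back.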
- Backend characteristics
Available without registration or rate limits.
Can detect a wide range of songs.
Does not support humming or cover song recognition.
Some songs require longer recording times, but the server responds very quickly.
Audd
* User data handling
Audio data recorded from the page is sent directly to Audd's servers without conversion.
If the user specifies an API key, the key string is also sent.
The service is operated by a company in the United States.
Audd's privacy policy can be found here.
https://audd.io/privacy/
- Backend characteristics
Up to 10 requests per day are available for free without an API key; beyond that, a paid API key must be registered.
API keys can be obtained here.
https://audd.io/
Recognition accuracy is lower than that of the other backends, but some songs may be recognized from shorter recordings.
Does not support humming or cover song recognition.
ACRCloud
* User data handling
Audio data recorded from the page is sent directly to ACRCloud's servers without conversion.
Also, the API key string specified by the user is sent.
The service is operated by a Chinese company registered in Singapore.
ACRCloud's privacy policy can be found here.
https://www.acrcloud.com/privacy/
- Backend characteristics
API key registration is required, but up to 100 requests per day are available for free.
Boasts high recognition accuracy and can recognize many songs.
Humming and cover song recognition is also supported. (A recording time of 7.2 seconds or more is recommended when recognizing humming or cover songs.)
Tencent
* User data handling
Audio data recorded from the page is converted to raw PCM data (8000 Hz, mono, s16le), from which the original content can still be played back, and then sent to Tencent's servers.
This backend is implemented based on the music recognition function reverse-engineered from the QQ Music app.
The service is operated by a Chinese company.
Tencent's privacy policy can be found here.
https://privacy.qq.com/document/priview/0b0dc16a0f004a35b77b7fd48a0b125b
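For readers unfamiliar with the format, the conversion to 8000 Hz mono s16le can be sketched as follows. This is a minimal illustration, not Librezam's actual conversion code: `resampleLinear` and `floatToS16le` are hypothetical helpers, the input is assumed to be mono float samples in the range [-1, 1], and a production resampler would low-pass filter before downsampling.

```typescript
// Linear-interpolation resampler (assumed for illustration). A real
// implementation would low-pass filter first to avoid aliasing.
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    out[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return out;
}

// Encode samples in [-1, 1] as signed 16-bit little-endian PCM (s16le).
function floatToS16le(samples: Float32Array): Uint8Array {
  const bytes = new Uint8Array(samples.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    const value = clamped < 0 ? Math.round(clamped * 32768) : Math.round(clamped * 32767);
    view.setInt16(i * 2, value, true); // true = little-endian
  }
  return bytes;
}
```

At 8000 samples per second and 2 bytes per sample, a 12-second mono recording comes to 8000 × 2 × 12 = 192,000 bytes. Unlike an acoustic hash, this payload is ordinary audio and can be played back by anyone who receives it.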
- Backend characteristics
Available without registration or rate limits.
Shows very high accuracy, especially for ACG (anime, comic, and game) music, and can detect songs that even Shazam or ACRCloud cannot recognize.
Humming and cover song recognition is also supported. (A recording time of 12 seconds or more is recommended when recognizing humming or cover songs.)
Server response speed may be slow depending on the time of day.
NetEase
* User data handling
Audio data recorded from the page is first converted to raw PCM data (48000 Hz, mono, f32le), from which the original content can still be played back, and then sent to a preprocessing proxy I host on Deno Deploy.
The preprocessing proxy uses proprietary modules to convert the audio into an acoustic hash, similar to Shazam's, from which the original content cannot be restored, and then sends that hash to NetEase's servers.
The proxy is needed because the modules required for the conversion are proprietary and cannot be integrated directly into Librezam.
I do not collect any logs in the preprocessing proxy.
This backend is reverse-engineered from the Chrome extension "云音乐听歌".
The service is operated by a Chinese company.
The source code of the preprocessing proxy can be found here.
https://github.com/FoxRefire/ncm-recognizer-proxy
NetEase's privacy policy can be found here.
https://st.music.163.com/official-terms#
Deno Deploy's privacy policy can be found here.
https://docs.deno.com/deploy/privacy_policy/
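The f32le format mentioned above is simply each sample stored as a 32-bit little-endian float. The sketch below shows the encoding and a round trip back to samples, which makes concrete why the data sent to the proxy is still readable audio. `encodeF32le` and `decodeF32le` are hypothetical helper names; how Librezam actually builds and transmits the request to the proxy is not shown here.

```typescript
// Serialize samples as 32-bit little-endian floats (f32le), independent
// of the host machine's byte order.
function encodeF32le(samples: Float32Array): Uint8Array {
  const bytes = new Uint8Array(samples.length * 4);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    view.setFloat32(i * 4, samples[i], true); // true = little-endian
  }
  return bytes;
}

// Reverse of encodeF32le: anyone holding the bytes can recover the samples.
function decodeF32le(bytes: Uint8Array): Float32Array {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const samples = new Float32Array(Math.floor(bytes.byteLength / 4));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getFloat32(i * 4, true);
  }
  return samples;
}
```

At 48000 samples per second and 4 bytes per sample, one second of mono audio is 192,000 bytes. The round trip is lossless, so the privacy of this step rests on the proxy's handling of the audio rather than on the format.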
- Backend characteristics
Available without registration or rate limits.
Shows very high accuracy, especially for ACG (anime, comic, and game) music, and can detect songs that even Shazam or ACRCloud cannot recognize.
Does not support humming or cover song recognition.
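The characteristics described in this document can be summarized as data. The record below restates only what is written above; the `BackendInfo` shape, its field names, and the two helper functions are hypothetical and are not part of Librezam.

```typescript
// Hypothetical summary type; values restate the text of this document.
interface BackendInfo {
  name: string;
  apiKey: "none" | "optional" | "required";
  freeRequestsPerDay: number | null; // null = no limit stated
  humming: boolean;                  // humming / cover song recognition
  minHummingSeconds: number | null;  // recommended recording time, if any
  rawAudioLeavesBrowser: boolean;    // true if playable audio is sent anywhere
}

const BACKENDS: BackendInfo[] = [
  { name: "Shazam", apiKey: "none", freeRequestsPerDay: null, humming: false,
    minHummingSeconds: null, rawAudioLeavesBrowser: false },
  { name: "Audd", apiKey: "optional", freeRequestsPerDay: 10, humming: false,
    minHummingSeconds: null, rawAudioLeavesBrowser: true },
  { name: "ACRCloud", apiKey: "required", freeRequestsPerDay: 100, humming: true,
    minHummingSeconds: 7.2, rawAudioLeavesBrowser: true },
  { name: "Tencent", apiKey: "none", freeRequestsPerDay: null, humming: true,
    minHummingSeconds: 12, rawAudioLeavesBrowser: true },
  // NetEase itself receives only a hash, but raw PCM is sent to the proxy.
  { name: "NetEase", apiKey: "none", freeRequestsPerDay: null, humming: false,
    minHummingSeconds: null, rawAudioLeavesBrowser: true },
];

// Backends that can recognize humming or cover songs.
function hummingBackends(): string[] {
  return BACKENDS.filter((b) => b.humming).map((b) => b.name);
}

// Backends where only an acoustic hash ever leaves the browser.
function hashOnlyBackends(): string[] {
  return BACKENDS.filter((b) => !b.rawAudioLeavesBrowser).map((b) => b.name);
}
```

Here `hummingBackends()` returns ACRCloud and Tencent, and `hashOnlyBackends()` returns only Shazam, which matches the default configuration described in the introduction.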