Speech Recognition Polyfill (STT) by apersongithub
Enables setup-free speech recognition (speech-to-text) on websites such as Google Translate and Duolingo, and is highly configurable. Choose between running OpenAI's Whisper locally and, optionally, AssemblyAI's API in the cloud.
About this extension
On first install, this extension opens the options page. The default model language is English, but this is easy to change. The extension supports per-site customization and offers a range of models for recognizing speech. Keep in mind that this is not a complete solution and that the API is not fully supported. Speech detection is not as instantaneous as Google Chrome's cloud API, but the added AssemblyAI cloud API integration allows for decently fast transcription. The extension icon's color/indicator changes depending on the current process, so pin it to your toolbar to verify that the extension is working as intended. A red mic/error icon does not necessarily mean your mic isn't working; the speech may instead have been cancelled by user input, the cloud API key may be missing, or the speech may have been unintelligible (usually it's the last of these).
Make sure you are using the correct mic and speak loudly, slowly, and clearly; otherwise your voice may not be detected or may come out unintelligible. If you have problems with voice recognition, change the default model to the cloud one or to a slightly larger local one (this may impair performance). You can also try enabling "boost microphone gain" if you are a soft speaker. Unfortunately, this extension will never support continuous (let alone instantaneous word-by-word) voice recording, for maintainability and headache reasons. There is no speaking time limit unless you enable the debug option.
If you're using Duolingo or a similar site and want to do speaking practice in the language you are learning, it is recommended to set the extension's language to that language (navigate to the site -> click the extension icon -> set the language, then click "save for site"). This isn't required, but it will significantly improve recognition accuracy, since the model then knows exactly which language you are trying to speak. (It isn't necessary for every site: Google Translate, for example, exposes the language in use through the input box's data, so auto-detect works fine.) Look at the images for more help.
The extension will take ~1GB of RAM with the normal/cloud models and up to ~7GB if you use the biggest model (you don't need to use the biggest model lol). I've implemented decent memory management to compensate.
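For the curious, the "save for site" behaviour can be pictured roughly like this. It is only a sketch of the idea, not the extension's actual code: the `LanguageSettings` shape, the `resolveLanguage` name, and the language codes are all made up for illustration.

```typescript
// Hypothetical settings shape; the extension's real storage format is not documented here.
interface LanguageSettings {
  defaultLanguage: string;      // e.g. "en", the language used when nothing is saved for a site
  perSite: Map<string, string>; // hostname -> language saved via "save for site"
}

// A language saved for the site wins; otherwise fall back to the default.
function resolveLanguage(hostname: string, settings: LanguageSettings): string {
  return settings.perSite.get(hostname.toLowerCase()) ?? settings.defaultLanguage;
}

// Example: Spanish saved for Duolingo, English everywhere else.
const exampleSettings: LanguageSettings = {
  defaultLanguage: "en",
  perSite: new Map([["www.duolingo.com", "es"]]),
};
const duolingoLanguage = resolveLanguage("www.duolingo.com", exampleSettings);
const otherSiteLanguage = resolveLanguage("example.org", exampleSettings);
```

Here `duolingoLanguage` picks up the saved per-site value, while `otherSiteLanguage` falls back to the default.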
~~~~~~~~~~~~~~~~~~~~
❗ General Recommendations:
• 8GB of RAM is the minimum, since the extension can easily take up a decent chunk of it when using the larger models.
• A modern CPU is recommended.
• An internet connection. Even though the model runs locally, the extension re-downloads it either when idle or after you close the tab or open a new one that uses the extension (to preserve memory). For the time being, this is better than packaging the large models inside the extension, and for most models the download will be near instant for most users. There is also an option in settings to keep the default model cached so it isn't re-downloaded every time. Besides running locally, you can use the cloud-based model, which is less hardware intensive. Sorry, offline users: I will try to see whether you can download the model manually and link that folder for offline use without any external stuff. Basically it'd be a speechfire replacement. For now, the only workaround for offline use is enabling the cached option and not closing the browser or changing the model, since either would clear the storage.
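As a rough mental model of the download and caching behaviour described in the last point, here is a small sketch. It is one reading of the description above, not the extension's real logic: the event names, function names, and the model name "base" are hypothetical.

```typescript
// Hypothetical events that can affect the locally stored model.
type CacheEvent = "tabClosed" | "idle" | "browserClosed" | "modelChanged";

// Returns which model (if any) is still stored after an event.
// Assumption: closing the browser or changing the model always clears storage,
// while going idle or closing the tab only clears it when the
// "keep cached" option is off.
function cachedModelAfter(
  cachedModel: string | null,
  event: CacheEvent,
  keepCached: boolean,
): string | null {
  switch (event) {
    case "browserClosed":
    case "modelChanged":
      return null;
    case "tabClosed":
    case "idle":
      return keepCached ? cachedModel : null;
  }
}

// A download (and so an internet connection) is needed whenever the
// requested model is not the one currently stored.
function needsDownload(cachedModel: string | null, requestedModel: string): boolean {
  return cachedModel !== requestedModel;
}
```

Under these assumptions, offline use only keeps working while `cachedModelAfter` keeps returning the model you need, which matches the workaround of enabling the cached option and leaving the browser and model alone.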
Rated 0 by 0 reviewers
Permissions and data
Required permissions:
- Access browser tabs
- Access your data for all websites
Data collection:
- The developer says this extension doesn't require data collection.
More information
- Version: 1.3.2
- Size: 95.44 kB
- Last updated: 18 hours ago (Feb 9, 2026)
The developer of this extension asks that you help support its continued development by making a small contribution.