OCR + Translate by Crivella

3.4 / 5 stars

8 reviews

Rated 1 / 5
by tjdnjsrmsdidgkrID, 1 month ago
✅ I checked the HOMEPAGE.
✅ I checked some demos for the extension and the server github page with explanations.
✅ I read the README and the dedicated documentation page.
But it still doesn't work.
I sincerely appreciate the creator's effort and kindness in sharing this (no sarcasm),
but this is a complete disaster and beyond saving.
Developer response
posted 1 month ago
I appreciate you taking the time to try this tool and going over the docs.
If you have any specific problem, I would appreciate you reporting it as an issue on the related GitHub repository, so we can try to fix it (and possibly make it easier for future users).
I've not been actively developing the tool for a while, but I do want to keep it in working condition.
Rated 2 / 5
by Firefox user 14878395, 7 months ago
Issues with translating Korean. Forgive me for not being too knowledgeable, but even using the Google Translate model it's not as good as just putting the text into Google Translate manually, for some reason. Changing the OCR and BOX models changes the quality of the translation.
Will come back to this for improvements.
Developer response
posted 7 months ago
By putting it manually, do you mean typing the text yourself or copying the output of the OCR step? If the latter, you can open an issue on the GitHub page and I'll try to have a look at it (also note that you are using a ~free API there).

The final translation is the result of a 3-step process: detect text position (BOX) -> extract text from image (OCR) -> translate (TSL).
Changing the model or its parameters for any of these steps will yield different results.
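
As an illustration only (this is not the tool's actual code), the three steps can be sketched as a chain of swappable callables. The `Pipeline` class, its field names, and the `fake_*` stand-in stages are all hypothetical, chosen so the sketch runs without any model installed.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (left, top, right, bottom) pixel coordinates of a detected text region
Box = Tuple[int, int, int, int]


@dataclass
class Pipeline:
    """Hypothetical three-stage pipeline; each stage is a swappable callable."""
    detect_boxes: Callable[[object], List[Box]]  # BOX: image -> text regions
    read_text: Callable[[object, Box], str]      # OCR: image + region -> source text
    translate: Callable[[str], str]              # TSL: source text -> translated text

    def run(self, image: object) -> List[Tuple[Box, str, str]]:
        results = []
        for box in self.detect_boxes(image):
            source = self.read_text(image, box)
            # The translator only ever sees what the OCR stage produced.
            results.append((box, source, self.translate(source)))
        return results


# Stand-in stages (assumptions for the demo, not real models).
def fake_box(image):
    return [(0, 0, 10, 10)]


def fake_ocr(image, box):
    return "こんにちは"


def fake_tsl(text):
    return {"こんにちは": "hello"}.get(text, text)


pipeline = Pipeline(fake_box, fake_ocr, fake_tsl)
print(pipeline.run(image=None))
```

Because each stage feeds the next, a misread in the OCR stage reaches the translator unchanged, which is one way the same translation model can do worse here than on text typed in by hand.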

Regarding Korean, the only OCR I've found that supports it is Tesseract, which does not give great results. If you have or know of a better model, you can either open an issue on the GH repo so I can add it to the default ones, or add it for yourself only by following the procedure in the discussion "Guide on using new models?" on the GH page for ocr_translate.
Rated 4 / 5
by GERFY, 10 months ago
Hard to install for now, but works well!
Rated 5 / 5
by Firefox user 18707959, 1 year ago
Rated 1 / 5
by Khai, 1 year ago
Lacks a usage guide; I don't know how to use it.
Developer response
posted 1 year ago
From the addon page there are links to the HOMEPAGE (the addon's GitHub page), which has some demos for the extension and links to the server's GitHub page, with explanations both in the README and on a dedicated documentation page.
I've added those links in a more visible way on the addon page.
Rated 4 / 5
by Devil304, 1 year ago
No OCR model and no BOX model to be selected.
Translation models can be installed from the plugins panel.
Windows 11, Python 3.11.9
Thank you for clearing this up.
Yes, please add information about models to the documentation. I tried to install google trans but it showed an error, then I installed ollama, and I didn't know I needed to install easyocr and huggingface.
Developer response
posted 1 year ago
Thanks for trying the tool.
Would you mind opening an issue on the GitHub repo (see the homepage for the addon, as I cannot add links here) to see if we can get it fixed? The reviews for the addon are hardly the proper venue for it.

Off the top of my head, the only thing that comes to mind is which plugins have been installed. Keep in mind that specific plugins add specific types of models (e.g. to have OCR and BOX models you might want to install huggingface, and possibly also easyocr). I will try to make it clearer in the docs which plugin adds which kind of model.
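
To make the idea concrete, here is a minimal sketch of checking which pipeline stages have no model available for a given set of installed plugins. The mapping below is an assumption pieced together from what is mentioned in these reviews, not taken from the tool's documentation, and the plugin labels are illustrative rather than real package names.

```python
# ASSUMPTION: illustrative mapping inferred from this review thread only.
# Check the project's documentation for the real plugin names and model kinds.
PLUGIN_MODEL_KINDS = {
    "easyocr": {"BOX"},
    "huggingface": {"OCR", "TSL"},
    "tesseract": {"OCR"},
    "google": {"TSL"},
    "ollama": {"TSL"},
}

# Every translation needs one model of each kind.
REQUIRED_KINDS = ("BOX", "OCR", "TSL")


def missing_kinds(installed):
    """Return the model kinds that no installed plugin provides."""
    available = set()
    for name in installed:
        available |= PLUGIN_MODEL_KINDS.get(name, set())
    return [kind for kind in REQUIRED_KINDS if kind not in available]


print(missing_kinds(["ollama"]))
print(missing_kinds(["ollama", "easyocr", "huggingface"]))
```

Under this assumed mapping, installing only a translation plugin leaves the BOX and OCR selectors empty, which matches the symptom described in the review.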

There is also the option of using the older v0.5.1 release of the server, which does not have the plugin manager but ships with all the deps pre-installed/packaged (some features will be missing, though).
Rated 5 / 5
by greenlime, 2 years ago
This works great for Japanese to English.
But I can't make it work with Chinese to English. Is there any specific combination? I tried tesseract and Helsinki NLP; it failed when running the OCR. I will try other combinations tomorrow.
Developer response
posted 2 years ago
Thanks for the review!

PS: I see that you report tesseract failing when running the OCR. Have you installed tesseract itself? I do not ship it with my tool (not even sure if I can).
If you have, feel free to open an issue on GitHub (on either the main tool or the tesseract plugin) and I will try to work it out.
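
One quick way to check whether tesseract itself is installed is to look for its executable on the PATH. This is a minimal sketch using only the Python standard library; it assumes the executable is named `tesseract` and is independent of how the plugin actually locates it.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the path of the tesseract executable on PATH, or None."""
    return shutil.which("tesseract")


def tesseract_version(path):
    """Return the first line printed by `tesseract --version`."""
    proc = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some builds print the version banner on stderr instead of stdout.
    output = proc.stdout or proc.stderr
    return output.splitlines()[0] if output else ""


path = find_tesseract()
if path is None:
    print("tesseract not found on PATH; it has to be installed separately")
else:
    print(path, "->", tesseract_version(path))
```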

As of now, the freely available models that I was able to find are not great at OCRing Chinese.
I started developing a plugin for PaddleOCR, which gives much better results, but I was getting inconsistent behavior between Linux and Windows (crashes inside PaddleOCR's code), with different versions working on one but not the other. I have yet to decide whether to let users install it themselves (using a version that works for them) or to try to fix this or wait for an update.

Contributions, even in the form of good models that people find on HuggingFace, are welcome ;P
I haven't had much time to work on it lately.

- PaddleOCR main repo: https://github.com/PaddlePaddle/PaddleOCR
- WIP plugin for those who want to try installing it manually: https://github.com/Crivella/ocr_translate-paddle
Rated 5 / 5
by Firefox user 18271254, 2 years ago
Excellent. It could be improved by providing an option to change the font color; red is not legible when night light mode is ON in Windows.

EDIT: I tried the render options as suggested and it works beautifully, thank you. I didn't realize until today that there was a pre-compiled .exe available in the Releases section on GitHub; you've actually made it super easy to set up for non-tech-savvy folks as well, kudos! For users who have come this far, these are the settings I use for Japanese to English translations, and it's fast.

Box model - easyocr
OCR Model - kha-white/manga-ocr-base
Translation model - staka/fugumt-ja-en
Developer response
posted 2 years ago
Thanks for the review!
The font size and color (plus other rendering options) can be controlled by clicking on the "Render options" menu at the bottom of the popup window and changing the values.

I left it as numeric RGB values (this could be improved) because unfortunately I could not make the default HTML color picker work with the popup menu (opening the color picker would close the popup), and I did not want to introduce another third-party JS dependency.