
MacBH928

macrumors G3
Original poster
I am having difficulty finding such an app. I want to drop in a PDF file, have its text OCRed and translated, and get back a new PDF document similar to the original but translated.

I found apps that will OCR a PDF in English, and another that will output the OCRed text to a TXT file. Then I can upload the file to Google Translate and that will translate the PDF file, but nothing is a one-stop shop for the whole process.

Does anyone know of such a solution? I got UPDF, but it didn't have OCR for Intel Macs. PDF Gear will only output the file as TXT. Textify will not translate.
 
There are several solutions to this via the terminal. A few questions: which languages would you like to translate from and to? And since you mention OCR: do you work with PDFs that contain text as images and therefore require the OCR step, or do they contain (e.g. selectable) text which you want to extract (and translate)?
a new translated PDF document similar to the original but translated
You want something which holds the translated text but preserves the original page/document layout?
 
DeepL has a macOS app that can translate documents (pdf, docx, txt), but I couldn't get it to work in Monterey 12.7.2.
I think it fails to properly ask for Accessibility permissions.
DeepL for Mac https://www.deepl.com/en/macos-app/
Translating whole documents with the app https://support.deepl.com/hc/en-us/articles/360020613199-Translating-whole-documents-with-the-app
Free user “PDF (.pdf) 5 MB 100,000 characters”
You can try the online version to get an idea (requires free account registration) https://www.deepl.com/translator/files
 
There are several solutions to this via the terminal. A few questions: which languages would you like to translate from and to? And since you mention OCR: do you work with PDFs that contain text as images and therefore require the OCR step, or do they contain (e.g. selectable) text which you want to extract (and translate)?

You want something which holds the translated text but preserves the original page/document layout?

Thanks for the reply,

1) English -> Arabic

2) Text can be selectable OR can come from a picture converted to PDF, which needs OCR

3) I was 90% successful by using the Textify app and then uploading the PDF to Google Translate.

To demonstrate what I want here is an image:
bbbb.jpeg
 
DeepL has a macOS app that can translate documents (pdf, docx, txt), but I couldn't get it to work in Monterey 12.7.2.
I think it fails to properly ask for Accessibility permissions.
DeepL for Mac https://www.deepl.com/en/macos-app/
Translating whole documents with the app https://support.deepl.com/hc/en-us/articles/360020613199-Translating-whole-documents-with-the-app
Free user “PDF (.pdf) 5 MB 100,000 characters”
You can try the online version to get an idea (requires free account registration) https://www.deepl.com/translator/files

Thanks, the Google Translate site can do it too, albeit without OCR.
 
The translation to Spanish in the example is quite awful 😂

Otherwise: if you have a Google account, you can upload the PDF to Google Drive and turn it into editable text via 'Open with > Google Docs'. From there, 'Tools > Translate Document' translates the document in place. There is support for OCR on images, as @bogdanw already indicated.

Another option would be pdftotext (part of poppler-utils) with the -layout and/or -table flag, or converting to HTML via ebook-convert from Calibre, then running the result through a translator.
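
For anyone who wants to script the pdftotext route, here is a rough Python sketch. It assumes poppler's `pdftotext` is installed and on the PATH. The function names and the 20-character threshold are made up for illustration; only the `-layout` flag comes from the post above. The `needs_ocr` check is just a heuristic for telling a searchable PDF from a scanned one.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, layout=True):
    """Build the pdftotext command line; "-" sends the text to stdout."""
    cmd = ["pdftotext"]
    if layout:
        cmd.append("-layout")  # keep the physical page layout as far as possible
    cmd += [pdf_path, "-"]
    return cmd


def needs_ocr(extracted_text, min_chars=20):
    """Heuristic (an assumption): almost no extractable text suggests scanned pages."""
    return len(extracted_text.strip()) < min_chars


def extract_text(pdf_path):
    """Run pdftotext and return the extracted text, or raise if the tool is missing."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler first")
    result = subprocess.run(
        build_pdftotext_cmd(pdf_path),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

If `needs_ocr(extract_text("scan.pdf"))` comes back true, the file most likely has to go through an OCR step before anything can be translated. The extracted text can then be pasted or uploaded into whichever translator you prefer.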

Preserving the layout of a page while changing the language as well as the writing direction, as you intend to do, probably requires (quite) some human intervention. 🙃
 
Google Translate can translate images directly https://translate.google.com/?sl=en&tl=ar&op=images

My man, that's what I was looking for. It's not perfect but close enough. Thanks!

The translation to Spanish in the example is quite awful 😂

Otherwise: if you have a Google account, you can upload the PDF to Google Drive and turn it into editable text via 'Open with > Google Docs'. From there, 'Tools > Translate Document' translates the document in place. There is support for OCR on images, as @bogdanw already indicated.

Another option would be pdftotext (part of poppler-utils) with the -layout and/or -table flag, or converting to HTML via ebook-convert from Calibre, then running the result through a translator.

Preserving the layout of a page while changing the language as well as the writing direction, as you intend to do, probably requires (quite) some human intervention. 🙃

-I did some editing to the Spanish text to get my idea across. I have no idea what the Spanish text says.

-The Google Docs trick works with searchable PDFs, with non-searchable PDFs not so much. I am surprised you figured out this Google Docs workaround.

-I am not a terminal guy. I kind of do not like to install CLI apps because I do not know where the files are installed on my computer to uninstall them later on.

-I downloaded Calibre for macOS but I got an icon that has a 🚳 . I think it doesn't work on Intel Macs.

Thanks for the tips and the help. I appreciate it!
 
The Google Docs trick works with searchable PDFs, with non-searchable PDFs not so much. I am surprised you figured out this Google Docs workaround.

🙃 I read the manual 😁

-I am not a terminal guy. I kind of do not like to install CLI apps because I do not know where the files are installed on my computer to uninstall them later on.

Most of these tools come with a clean uninstaller. Tesseract (HP, then Google, now free) lets you comfortably OCR PDFs with hundreds of pages composed of image scans of text. Outside the terminal you don't have to look far: Adobe provides another free tool to do the same, and Acrobat Professional does this too, of course.
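
In case it helps, here is a rough Python sketch of how the Tesseract route could look. It assumes both `pdftoppm` (from poppler) and `tesseract` are installed and on the PATH; Tesseract reads images rather than PDFs, so the pages are rendered to PNG first. The function names, the 300 dpi setting and the `page` file prefix are my own choices for illustration, not anything from the posts above.

```python
import subprocess
from pathlib import Path


def build_pdftoppm_cmd(pdf_path, image_prefix, dpi=300):
    """Render each PDF page to a PNG named <image_prefix>-<page>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", pdf_path, image_prefix]


def build_tesseract_cmd(image_path, out_base, lang="eng"):
    """OCR one page image into a searchable PDF written to <out_base>.pdf."""
    return ["tesseract", image_path, out_base, "-l", lang, "pdf"]


def ocr_pdf(pdf_path, work_dir):
    """Render the pages, OCR each one, and return the per-page PDF paths."""
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_pdftoppm_cmd(pdf_path, str(work / "page")), check=True)
    outputs = []
    for image in sorted(work.glob("page-*.png")):
        out_base = str(image.with_suffix(""))  # tesseract appends ".pdf" itself
        subprocess.run(build_tesseract_cmd(str(image), out_base), check=True)
        outputs.append(out_base + ".pdf")
    return outputs
```

This produces one searchable PDF per page, which still need to be merged and translated afterwards, so it covers only the OCR part of the workflow.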
 