
Abulia

macrumors 68000
Original poster
Jun 22, 2004
1,786
1
Kushiel's Scion
Anyone have any recommendations for some basic OCR software for the Mac? I did a forum search but didn't come up with much.

OmniPage Pro comes up (of course), as do ReadIris and ABBYY FineReader. I need something preferably cheap, functional, and easy to use. I'm not doing anything incredibly complex -- just scanning in plain manuscript pages for editing.

Thanks!
 

MisterMe

macrumors G4
Jul 17, 2002
10,709
69
USA
Don M. said:
Anyone have any recommendations for some basic OCR software for the Mac? I did a forum search but didn't come up with much.

OmniPage Pro comes up (of course), as do ReadIris and ABBYY FineReader. I need something preferably cheap, functional, and easy to use. I'm not doing anything incredibly complex -- just scanning in plain manuscript pages for editing.

Thanks!
VersionTracker is your friend.
 

mkrishnan

Moderator emeritus
Jan 9, 2004
29,776
15
Grand Rapids, MI, USA
VersionTracker is actually my homey, but.... :rolleyes:

Just to back up the original poster... there are only three things that I see in that search that are actual OCRs. One is the software the OP mentioned was too expensive. A second is another package that's more expensive than buying a cheap scanner that comes bundled with OCR software. The third is DigitEye, which is widely reputed to be non-functioning garbage. Which, if memory serves me, I can also verify as being of little redeeming value.

So, since I'm interested in the question too...is the best option to find a refurb Epson scanner that's on Apple's Image Capture Architecture list, and buy it for $50? Or is there a better option if you really just need the OCR software and not the scanner?
 

erickkoch

macrumors 6502a
Jan 13, 2003
676
0
Kalifornia
I use ReadIris. It works OK when I need to convert and edit PDFs to Word. I would suggest you download the demo and try converting a few documents to see if you like it. I'm using version 9 for the Mac; they now have version 11 out.

It's not perfect: docs with lots of columns and complex formatting, like you might see in newsletters, can get a bit scrambled, but overall I'm happy with it.

Try and download the 30 day demo from http://www.irislink.com. After I did that they sent me an e-mail offering the full version for half price. This was about a year ago and I'm not sure if they still do that or not but it's worth a shot.

Also, there is one nagging limitation to the software that may or may not be present in the latest version: The software limits the size of the document you are copying to 50 pages. I had to download a program that chops a PDF into a smaller number of pages to copy large documents, then convert the smaller ones and put the completed doc back together in Word. Bogus.
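
For anyone hitting the same cap, working out where to cut is simple arithmetic. Here is a minimal sketch in Python, assuming the 50-page limit described above. `chunk_ranges` is a made-up helper name, and it only computes which pages each piece should cover; the actual PDF splitting is left to whatever tool you use.

```python
def chunk_ranges(total_pages, max_pages=50):
    """Return 1-based, inclusive (first, last) page ranges,
    each covering at most max_pages pages."""
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages >= 1")
    ranges = []
    first = 1
    while first <= total_pages:
        # The last chunk may be shorter than max_pages.
        last = min(first + max_pages - 1, total_pages)
        ranges.append((first, last))
        first = last + 1
    return ranges


if __name__ == "__main__":
    # Hypothetical 120-page document, split for a 50-page limit.
    for first, last in chunk_ranges(120):
        print(f"pages {first}-{last}")
```

Each range then becomes one smaller PDF to convert, and the converted pieces get stitched back together in order.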
 

eRondeau

macrumors 65816
Mar 3, 2004
1,159
382
Canada's South Coast
Try VueScan...

The newer versions of VueScan incorporate basic OCR as well. I downloaded the most recent version and used it to scan a simple text-only printed page. The result was damn near bang-on, except for the "crippling" on the non-registered version. I think the fee is around $50 if memory serves. The other thing is, VueScan was faster and a dream to run compared to the clunky, awful, PC-based scanning software that came with my HP PSC1350.

I'm new to the OCR market too, as I'm thinking about offering to digitize 20-years' worth of documents for a non-profit organization that I'm involved with. (This will probably take all winter!) If I decide to bite the bullet and buy, I'll let you know my review.
 

webraider

macrumors member
Feb 22, 2005
46
0
Presto Page Manager

You guys should try the latest version of Presto PageManager Professional (version 8) for the Mac. It supports TWAIN scanners and has deskew, sorting, stacking, saving as PDF and various other formats, as well as OCR. It lets you line up all the apps you might want to use, and you can just drag the file over to the appropriate app. If it's Word, it will OCR the document and open it in Word. If it's Photoshop, it will just open Photoshop. I like it, but I also use OmniPage Pro X, which came with my CanoScan.
 

mkrishnan

Moderator emeritus
Jan 9, 2004
29,776
15
Grand Rapids, MI, USA
Please read post dates before replying...

FWIW, in the intervening three years since this thread, I myself bought an inexpensive Canon scanner (LiDE-50) for $50 this time last year that does OCR in the process of scanning just fine, using its basic "Canoscan" software.
 

JohnAlexander

macrumors newbie
Feb 18, 2009
2
0
Canoscan help.

Please read post dates before replying...

FWIW, in the intervening three years since this thread, I myself bought an inexpensive Canon scanner (LiDE-50) for $50 this time last year that does OCR in the process of scanning just fine, using its basic "Canoscan" software.

Can I ask, how? I have Canoscan 5.0.1.2 but I don't know it well enough to figure it out. Any help you can provide me would be most appreciated.

Thanks in advance. : )
 

ChrisA

macrumors G5
Jan 5, 2006
12,541
1,653
Redondo Beach, California
I use Adobe Acrobat, I love it. It scans all my OCR documents with excellent accuracy. :D

Funny on one hand are those who answer every request for help with "why can't you learn to use the SEARCH function?" and then there are those who say "Don't resurrect old threads." I think answering old threads helps those who do bother to use search.

Yes. I've used this also. It does OK even on unreadable, poor-quality, low-res scans. I downloaded a scanned user manual for an old, out-of-production product, and the scan was hard on the eyes and hard to read. Adobe Acrobat's OCR turned it into a crisp word processing document. Some words it could not figure out, so it put in a JPG image of the word taken from the scan. I had to puzzle those out from context.

That said, if you are looking for high quality and "free", look here. These are open source and most are OS-independent, meaning they can be built on almost any OS.
http://freshmeat.net/search/?q=OCR&section=projects&Go.x=0&Go.y=0

When they list the OSes it runs on, "posix" is a class of Unix-like OSes that includes Mac OS X, BSD, Linux, and so on.
 

MisterMe

macrumors G4
Jul 17, 2002
10,709
69
USA
Funny on one hand are those who answer every request for help with "why can't you learn to use the SEARCH function?" and then there are those who say "Don't resurrect old threads." ...
These are not conflicting positions. The old threads usually have the solution to the problem. If this is the case, then there is no need to post new questions.
 

mkrishnan

Moderator emeritus
Jan 9, 2004
29,776
15
Grand Rapids, MI, USA
Funny on one hand are those who answer every request for help with "why can't you learn to use the SEARCH function?" and then there are those who say "Don't resurrect old threads." I think answering old threads helps those who do bother to use search.

It's fine ... it's just that... if you go to some thread written in 2004 where someone was asking how to use that one application from Windows on their Mac and say, "LOL I use Bootcampz ROFL!" then that isn't particularly helpful.... Likewise, the problem with bumping up a thread about OCR from 2005 is that people aren't so likely to be using the scanners from four years ago and asking now how to get OCR, etc, and a lot of the old software in this thread might not be in use anymore. For that matter, did Acrobat even have OCR in 2005?

Can I ask, how? I have Canoscan 5.0.1.2 but I don't know it well enough to figure it out. Any help you can provide me would be most appreciated.

As for this... if you scan documents to PDF they should automatically be OCR'd -- you don't have to do anything special to make it happen. You can just initiate the scan with the document button on the scanner or the document button in the Canoscan toolbox and then save it as a PDF file, and that should be it. If it doesn't work, post back details on what you get and we can help more....
 

JohnAlexander

macrumors newbie
Feb 18, 2009
2
0
It's fine ... it's just that... if you go to some thread written in 2004 where someone was asking how to use that one application from Windows on their Mac and say, "LOL I use Bootcampz ROFL!" then that isn't particularly helpful.... Likewise, the problem with bumping up a thread about OCR from 2005 is that people aren't so likely to be using the scanners from four years ago and asking now how to get OCR, etc, and a lot of the old software in this thread might not be in use anymore. For that matter, did Acrobat even have OCR in 2005?



As for this... if you scan documents to PDF they should automatically be OCR'd -- you don't have to do anything special to make it happen. You can just initiate the scan with the document button on the scanner or the document button in the Canoscan toolbox and then save it as a PDF file, and that should be it. If it doesn't work, post back details on what you get and we can help more....

Thanks for the quick reply! What I'm trying to do is to scan a typed document, like a contract or a letter, into Word so that I can edit it. I don't know if that's possible without add-on OCR software, but your post (to which I replied initially) made me think that it is, unless I misunderstood it.

Thanks again.
 

mkrishnan

Moderator emeritus
Jan 9, 2004
29,776
15
Grand Rapids, MI, USA
Ohh, hmmm, you actually want an editable word file, and not just the ability to copy / paste / search the text? Hmmm, that I'm not sure about. The only OCR'ing I've done via Canoscan is to create PDF files where the words are searchable. I'm not sure what software offers the ability to scan documents directly to editable Word files.
 

Pyrrh

macrumors newbie
Feb 20, 2009
4
0
A Similar, but Different Question

(I was considering making a new topic, but decided to post here instead.)

I originally had a 12-page journal article, separated into 12 gif images. Using Automator's "New PDF from Images" command, I turned them into a single, 12-page pdf file. (I mention this, in case it's better to OCR on the original GIF pages.)

I've found a few OCR programs, but what they do is take the words from the article, and output them into a text file - but that's not what I want.

Specifically, my problem is that Preview will not let me highlight, underline, or otherwise "markup" any of the lines of text in the article. I want Preview to somehow recognize the words and lines of text in this article, so that I can highlight them. (mkrishnan, I gather that this is the same thing as "the ability to copy / paste / search the text," as you mention.)

Here's a big limitation: I would like to spend no money to do this. This means that I won't buy a Canoscan scanner just to do this...unless it's possible to download the software by itself for free, and have it perform OCR on the GIF or PDF files I give it. But if the only software you can think of costs money, then please tell me anyway; if it's not too expensive, I just might buy it someday.

Granted, this problem is by no means essential to my livelihood: I realize that I could just markup the text using 'Notes' and 'Ovals or Rectangles' instead of Highlighting. Highlighting is just really convenient. :eek:

So, any ideas would be appreciated.
 

MisterMe

macrumors G4
Jul 17, 2002
10,709
69
USA
...

So, any ideas would be appreciated.
Software that meets your specifications has been available for decades. Either OmniPage Pro X for Macintosh or Readiris Pro 11 for Mac OCR software will convert scanned pages into editable documents that preserve formatting and layout.

These applications rely on sophisticated proprietary recognition algorithms. Nobody provides this technology cheap. If you want it, then you will have to pay for it.
 

dmcg123

macrumors newbie
Apr 3, 2009
1
0
Please forgive the shameless plug, but I too looked in vain for low-cost (or even usable) Mac OCR software, and when I couldn't find it started a project to bring OCRopus to the Mac.

VelOCRaptor has a very simple drag and drop GUI to read images and write PDF files with the image and its text. It's free until it's good enough to sell!

Please drop by www.VelOCRaptor.com, try it out, and let me know what you think.

Thanks

Duncan McGregor
 

Sharewaredemon

macrumors 68020
May 31, 2004
2,014
273
Cape Breton Island
It's rather fitting that this thread has been resurrected, as I'm looking for a scanner that can be purchased new with OCR built in to the bundled scanner software. Is anyone aware of any scanners out there that do this?

(as would be obvious for a mac forum, this needs to work on a mac).

thanks
 

akialoa

macrumors newbie
Apr 14, 2009
2
0
As I'm looking for a scanner that can be purchased new with OCR built in to the bundled scanner software. Is anyone aware of any scanners out there that do this?

My CanoScan LiDE 700F came with OCR software for creating text-searchable PDF files. I believe all (new) Canon scanners have this capability.
 

julian

macrumors newbie
Jun 10, 2004
1
0
FWIW, last time I messed with OCR software for scanning to Word, our Windows versions of OmniPage and ReadIris were significantly (10x?) quicker than the Mac versions, so we run 'em in Parallels/Boot Camp. Random, but significant if time is a factor.
 

cordee

macrumors newbie
Jul 10, 2009
1
0
ocr software and save as PDF

Hi, I'm converting about 30 bankers' boxes of paper into PDF. What I want to do is save each file as a PDF that is keyword-searchable. I DON'T want to convert everything to MS word.

I'm new to this and started the job yesterday with ReadIris 11, but I'm not happy with the results. I can't figure out how to get ReadIris to keep the original formatting and to save as a PDF.

I don't want the formatting or the look of the docs to change - I just want them to be searchable by keyword. I am addicted to using the "search" feature in the Finder to locate files by keyword, and if I could search ALL my PAPER files as well, I would be in heaven.

Any suggestions on how to do this? Thanks.
 

MisterMe

macrumors G4
Jul 17, 2002
10,709
69
USA
What you are asking for is simply not possible on a production basis. You need to scan to a raster format, and then OCR the resulting images. At this point, the resulting editable documents need to be proofed because even the best OCR has a non-trivial error rate. For banking documents, the error rate is unacceptably large.

You are going to need a document editor--Word, InDesign, something--to correct the OCR mistakes. Only after the errors have been corrected can you print to PDF.
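
To put a rough number on "non-trivial", here is a back-of-the-envelope sketch in Python. The 99% per-character accuracy and 2,000 characters per page are assumed figures for illustration only, not measurements of any product mentioned in this thread, and `expected_errors` is a hypothetical helper.

```python
def expected_errors(chars_per_page, char_accuracy, pages=1):
    """Estimate how many characters OCR gets wrong, given a
    per-character accuracy between 0.0 and 1.0."""
    if not 0.0 <= char_accuracy <= 1.0:
        raise ValueError("char_accuracy must be between 0.0 and 1.0")
    if chars_per_page < 0 or pages < 0:
        raise ValueError("chars_per_page and pages must be non-negative")
    return chars_per_page * (1.0 - char_accuracy) * pages


if __name__ == "__main__":
    # Assumed figures: 2,000 characters per page, 99% accuracy.
    per_page = expected_errors(2000, 0.99)
    print(f"about {round(per_page)} wrong characters per page")
```

Under those assumptions that works out to roughly 20 wrong characters on every page, which is why the proofing step matters.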
 