Scanner Help Needed

Discussion in 'Buying Tips, Advice and Discussion (archive)' started by Nawlins, Aug 3, 2003.

  1. macrumors member

    Joined:
    Jun 9, 2003
    Location:
    Chicago
    #1
    I'm looking for a scanner I can use to scan chapters of books, or entire books, onto my computer to mark them in my word processing software with underlining, bold, italics, etc. Scanning pictures would not be an issue. I've been told OCR is unreliable and often is unable to convert the .jpeg files into .txt files effectively. I'm using a 12" Powerbook with OS X and AppleWorks is my word processing software.

    Any ideas?

    Alex
     
  2. macrumors 68000

    Eniregnat

    Joined:
    Jan 22, 2003
    Location:
    In your head.
    #2
    Search this site for advice about scanners and OCR software.

    I can help you, in general terms, with the use of OCR and rendering software.

    Working backwards, the final output from an OCR program can be simple (a text file with line breaks) or complex (i.e. a formatted file that includes information about columns). Where the file ends up is generally not important; unless you need a very specific kind of formatting, just have the output saved as a TXT or RTF file. (You can open both with AppleWorks.)
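
    To give a feel for the difference between the two formats, here is a minimal sketch in Python of wrapping plain OCR text in RTF so that bold, italic and underline survive. The function names (escape_rtf, to_rtf) and the "Times" font entry are my own assumptions for illustration, not what any OCR program actually emits, and the sketch only handles plain ASCII text.

```python
def escape_rtf(text):
    # Backslash and braces are control characters in RTF, so escape them.
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def to_rtf(paragraphs):
    """Build a small RTF document.

    paragraphs: list of paragraphs, each a list of (text, style) runs,
    where style is "" (plain), "b" (bold), "i" (italic) or "ul" (underline).
    """
    body = []
    for runs in paragraphs:
        parts = []
        for text, style in runs:
            escaped = escape_rtf(text)
            # A styled run is an RTF group: {\b some text}
            parts.append("{\\%s %s}" % (style, escaped) if style else escaped)
        body.append("".join(parts) + "\\par")
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times;}}\\f0 "
    return header + "\n".join(body) + "}"


# Hypothetical OCR output with one word marked bold.
doc = to_rtf([[("Call me ", ""), ("Ishmael", "b"), (".", "")]])
print(doc)
```

    Saving that string to a file ending in .rtf should give a document a word processor can open with the marked word in bold; a TXT file would simply hold the words with no styling at all.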

    Depending on the kind of OCR program, you can either import directly from the scanner or use another program to create the image files. Filtering out noise is easily done with programs like Photoshop, where you can even simplify flourished fonts by converting the image to simplified line art before saving in any number of file formats.
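
    The line-art conversion is, at its simplest, a hard threshold with no dithering. Here is a rough sketch in Python; the function name to_line_art and the cutoff of 128 are assumptions of mine for illustration, not what Photoshop does internally.

```python
def to_line_art(gray_rows, threshold=128):
    """Turn 8-bit grayscale pixels (0-255) into pure black (0) or white (255).

    No dithering is applied: each pixel is decided on its own, so light
    scanner noise becomes white and dark strokes become solid black.
    """
    return [[0 if px < threshold else 255 for px in row] for row in gray_rows]


# Hypothetical 3x4 fragment of a scan: dark strokes plus light gray noise.
sample = [
    [250, 30, 20, 240],
    [200, 40, 235, 180],
    [245, 25, 15, 210],
]

line_art = to_line_art(sample)
for row in line_art:
    # Show black pixels as '#' and white pixels as '.'
    print("".join("#" if px == 0 else "." for px in row))
```

    The right cutoff depends on the scan; a darker or lower-contrast page may need a different threshold than the 128 assumed here.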

    The AI in OCR programs is not perfect. Most OCR programs deal with text fairly well, but columns throw some of them. You might have to select specific areas of text for conversion. File formats shouldn't have too much to do with the accuracy of the OCR, except that the simpler the format, the better. B/W text should

    A simple test picture of some text I did came out to this.

    bitmap 24 bit color 200*200 120kB
    bitmap 16 bit color 200*200 20kB
    bitmap b/w 200*200 8kB
    JPEG standard 200*200 3.71kB
    GIF non interlaced 4.00kB

    While the GIF is slightly larger than the JPEG, the GIF may offer cleaner edges for the OCR if dithering is not selected.


    I have used Omnipage and it works well. It can preserve formatting and its AI is fairly good. I once used a stand-alone Kurzweil reader at work, until it was removed. I like the algorithms that they use, but it isn't for the Mac.

    A final note on output.
    You might be better off having the OCR (Omnipage perhaps) save the files as editable PDF files, as there are lots of dynamic options and you can then share your marked files with almost anybody.
     
