By James Powell
OCR market leaders OmniPage Pro 7.0 and TextBridge Pro 96 take radically different approaches to achieving the same end: turning paper into computer files.
I tested the latest releases of these two programs using various printed documents, scanned in with a Visioneer PaperPort Vx at a custom 300dpi setting. The first thing I noticed in both packages was a continued push toward accuracy and ease of use. The second: TextBridge operates within most popular applications, while OmniPage provides a separate environment for OCR chores.
Optical character recognition is a four-step process. First, you acquire the image, either by loading a graphics file (such as TIFF art or a fax image received by your computer) or by scanning in a paper image. Then you view the document, select the area you want to convert to text (known as a zone) and perform the conversion. After recognition you can check for questionable text: characters the program either can't recognize or has less than 100 percent confidence in. TextBridge and OmniPage both perform these functions admirably, using a series of buttons along the screen's top for setting options. There's also a Go button (TextBridge) or Auto button (OmniPage) to step you through the process.
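For readers who think in code, the four steps can be sketched roughly as follows. This is only an illustration under stated assumptions, not how either product works internally: the `Zone` class, the function names and the canned confidence values are all hypothetical, and the "scan" is faked with hard-coded data.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    kind: str     # "text" or "graphic"
    glyphs: list  # (character, confidence) pairs standing in for pixel data


def acquire():
    """Step 1: load a graphics file or scan a page (canned data here)."""
    return [
        Zone("headline", "text", [("O", 1.0), ("C", 1.0), ("R", 0.97)]),
        Zone("photo", "graphic", []),
        Zone("body", "text", [("t", 1.0), ("e", 0.62), ("x", 1.0), ("t", 1.0)]),
    ]


def select_zones(page):
    """Step 2: keep only the zones that should be converted to text."""
    return [zone for zone in page if zone.kind == "text"]


def recognize(zone):
    """Step 3: convert a zone to text, keeping per-character confidence."""
    text = "".join(ch for ch, _ in zone.glyphs)
    confidences = [conf for _, conf in zone.glyphs]
    return text, confidences


def questionable(text, confidences):
    """Step 4: list characters recognized with less than 100 percent confidence."""
    return [(i, text[i]) for i, conf in enumerate(confidences) if conf < 1.0]


def run_pipeline():
    """Run all four steps, as a Go or Auto button might."""
    results = {}
    for zone in select_zones(acquire()):
        text, confidences = recognize(zone)
        results[zone.name] = (text, questionable(text, confidences))
    return results
```

Calling `run_pipeline()` returns the recognized text for each text zone along with the positions a proofreader would want to check; the graphic zone is skipped.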
If you have a standard scanner, bringing paper documents into either package is a snap. Something less common, such as my PaperPort scanner, can complicate matters. OmniPage took the lead here, since it automatically adds an icon to the PaperPort desktop. When you drag and drop the scanned document over the icon, OmniPage is launched, and you'll do all your work there.
The OmniPage desktop, with its new thumbnail panel, lets you see all the documents you've scanned in. Its main desktop/work surface resembles previous versions, but several clever new changes simplify OmniPage operations. Each toolbar button lets you set options or perform just a single step. The new wizard is smart enough to know what you've already done and offer you options in plain English to help you complete your task.
In TextBridge, on the other hand, I dragged the scanned image from PaperPort to an icon of Word or WordPerfect. I did most of my work in the word processor. As an option, I found I could also save my scans as graphics files, then open the main TextBridge menu and work there.
Moreover, TextBridge's desktop is far less intuitive than OmniPage's. To change an option, I had to reload the scanned file. OmniPage lets you experiment without going through that aggravation. TextBridge lets you select the zones you want to scan and helps you identify whether they are text or graphic in nature. It also lets you add new zones as needed. But TextBridge doesn't show which zones it has automatically detected, which can slow the job.
OmniPage does a better job at autodetection, making it much easier, and faster, to scan only selected areas. You can also tell it when a zone contains numbers, which can otherwise complicate the recognition process. However, if you want to add new zones within a file, you'll have to start over; OmniPage only lets you delete or resize recognized zones.
In interactive proofing mode, OmniPage displays the original document, plus a smaller zoomed image and a familiar, word processor-like spell-checker dialog box. TextBridge uses the entire window to display the document at large magnification and places the word in question in a textbox at the screen's top.
OmniPage can't run OCR proofing from within your word processing application. TextBridge can. If you plan mostly OCR-and-edit sessions, this is a very smart way to go. You'll be able to see questionable words displayed in color (green is probably okay, while red indicates a potentially serious error). You can also see the word in context and make corrections in a spell-checker dialog box without leaving your word processor. TextBridge will also let you set the questionability level for its recognition engine.
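The color coding can be pictured as a simple threshold test. The sketch below is an assumption-laden illustration, not TextBridge's actual logic: the function names, the numeric confidence scale and both cutoff values are hypothetical, with `question_level` standing in for the user-adjustable questionability setting.

```python
def flag_color(confidence, question_level=0.95, serious_level=0.60):
    """Return None for a trusted word, "green" for probably okay, "red" for a likely error.

    Both cutoffs are made-up values for illustration only.
    """
    if confidence >= question_level:
        return None
    if confidence >= serious_level:
        return "green"
    return "red"


def mark_words(words, question_level=0.95):
    """Attach a color flag to each (word, confidence) pair."""
    return [(word, flag_color(conf, question_level)) for word, conf in words]
```

Lowering `question_level` flags fewer words for review, while raising it flags more.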
My tests, however, showed that OCR's biggest problem, accuracy, remains. Results from both products were a mixed bag. For straight-out text pages, OmniPage held the edge. Both worked far, far better in exporting the results to Word 7.0 than to WordPerfect 6.1 format.
But neither program could handle a page containing three irregularly shaped columns (an actual WinMag page). TextBridge worked much better when its source document contained smaller type sizes, although it was often stumped by italics and very small fonts at the bottom of a page.
On a test of text in table format, TextBridge got the data into the right cells in the subsequent Word table, but got fewer of the actual numbers right. OmniPage got the numbers right, but put them into a tabbed list, not a table, although it was easy enough to convert text to table manually afterwards.
On the plus side, both programs now export text to HTML: TextBridge can move text directly to HTML 2.0 format, including graphics; OmniPage translated my files to 1.0 format, leaving the graphics out. TextBridge includes a copy of SoftQuad's HoTMetaL Light 2.0 to help you edit your translated code.
You'll need a separate graphics editor with either program. OmniPage supports OLE 2.0's in-place editing within its main window. TextBridge does not.
Neither OCR package is perfect, mostly because results can sometimes be frustratingly incorrect. Although I prefer TextBridge's interaction with Word, OmniPage's superior user interface and slightly better accuracy give it the edge in my book.
-- Info File --
TextBridge Pro 96
Price: $349; upgrade from any OCR package, $129
Pros: Performance; can work within apps
Cons: Interface; output
Platforms: Windows 95, 3.1x
Disk Space: 9MB
WinMag Box Score: 3.0
-- Info File --
OmniPage Pro 7.0
Price: $499; upgrade, $129
Pros: Interface; installation and setup
Platforms: Windows 95
Disk Space: 12MB
WinMag Box Score: 3.5