<!-- received="Wed Nov 12 16:19:12 1997 PST" -->
<!-- sent="Wed, 12 Nov 1997 19:17:28 -0500 (EST)" -->
<!-- name="Andrew Kuchling" -->
<!-- email="amk@magnet.com" -->
<!-- subject="Re: OCR Software..?!" -->
<!-- id="199711130017.TAA14813@lemur.magnet.com" -->
<!-- inreplyto="OCR Software..?!" -->
<title>sane-devel: Re: OCR Software..?!</title>
<h1>Re: OCR Software..?!</h1>
<b>Andrew Kuchling</b> (<a href="mailto:amk@magnet.com"><i>amk@magnet.com</i></a>)<br>
<i>Wed, 12 Nov 1997 19:17:28 -0500 (EST)</i>
<p>
<ul>
<li> <b>Messages sorted by:</b> <a href="date.html#100">[ date ]</a><a href="index.html#100">[ thread ]</a><a href="subject.html#100">[ subject ]</a><a href="author.html#100">[ author ]</a>
<!-- next="start" -->
<li> <b>Next message:</b> <a href="0101.html">Colin 't Hart: "Re: OCR Software..?!"</a>
<li> <b>Previous message:</b> <a href="0099.html">Michael Burghart: "Re: OCR Software..?!"</a>
<li> <b>Maybe in reply to:</b> <a href="0094.html">Michael Burghart: "OCR Software..?!"</a>
<!-- nextthread="start" -->
<li> <b>Next in thread:</b> <a href="0101.html">Colin 't Hart: "Re: OCR Software..?!"</a>
<li> <b>Reply:</b> <a href="0101.html">Colin 't Hart: "Re: OCR Software..?!"</a>
<li> <b>Reply:</b> <a href="0107.html">becka@rz.uni-duesseldorf.de: "Re: OCR Software..?!"</a>
<!-- reply="end" -->
</ul>
<!-- body="start" -->
Random thoughts on OCR:<br>
<p>
An OCR program would require a user interface and a recognition<br>
engine. You can see a screen shot of my interface at the previously<br>
mentioned URL; my idea was that scanned data would wind up in a Tk<br>
text editing box, with possible errors (where the confidence value of<br>
the recognition is low) highlighted in red. You would click a 'Next<br>
error' button to move to the next highlighted problem area, edit the<br>
text so it's correct, and continue onward; a Save button would write<br>
the edited data out to a file.<br>
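<p>
A minimal sketch of the 'Next error' logic, assuming each recognized<br>
character arrives as a (character, confidence) pair with confidence in<br>
the range 0 to 1. The names error_spans and next_error and the 0.8<br>
threshold are hypothetical, chosen only for illustration:<br>
<pre>
```python
from typing import List, Optional, Tuple

# Hypothetical data model: one (character, confidence) pair per recognized letter.
Recognized = List[Tuple[str, float]]
Span = Tuple[int, int]


def error_spans(chars: Recognized, threshold: float = 0.8) -> List[Span]:
    """Return (start, end) index pairs for runs of low-confidence characters."""
    spans = []
    start = None
    for i, (_, conf) in enumerate(chars):
        if conf < threshold:
            if start is None:
                start = i          # a new doubtful run begins here
        elif start is not None:
            spans.append((start, i))
            start = None
    if start is not None:          # run extends to the end of the text
        spans.append((start, len(chars)))
    return spans


def next_error(chars: Recognized, cursor: int, threshold: float = 0.8) -> Optional[Span]:
    """Return the first doubtful span starting at or after cursor, or None."""
    for span in error_spans(chars, threshold):
        if span[0] >= cursor:
            return span
    return None


text = list(zip("modern", [0.9, 0.9, 0.9, 0.9, 0.4, 0.5]))
print(next_error(text, 0))
```
</pre>
In a Tk text widget, each span could be marked with a tag that has a<br>
red background, and the button would move the cursor to the span that<br>
next_error returns.<br>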
<p>
Recognition is the complicated part, of course. First the page is<br>
scanned, and the image is usually converted from grey-scale to 2-level<br>
black-and-white. Documents are often not perfectly aligned when<br>
they're scanned, so the angle at which they're tilted (called the<br>
"skew angle") has to be measured and compensated for. Then the image<br>
has to be segmented into words, and words into letters; each letter is<br>
then recognized, and usually a confidence value is attached to each<br>
letter. Often there's a post-processing step which uses a language<br>
dictionary to correct errors; for example, if you're scanning English<br>
text, "rn" might be a recognition error for "m".<br>
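<p>
The dictionary post-processing step can be sketched as follows. This<br>
is only an illustration: the CONFUSIONS table, the function names, and<br>
the tiny dictionary are all assumptions, and only one substitution per<br>
word is tried:<br>
<pre>
```python
# Hypothetical confusion table: recognized fragment -> what it may really be.
CONFUSIONS = {"rn": ["m"], "m": ["rn"], "cl": ["d"]}


def candidates(word):
    """Yield the word itself, then every variant with one fragment swapped."""
    yield word
    for wrong, rights in CONFUSIONS.items():
        start = word.find(wrong)
        while start != -1:
            for right in rights:
                yield word[:start] + right + word[start + len(wrong):]
            start = word.find(wrong, start + 1)


def correct(word, dictionary):
    """Return the first candidate found in the dictionary, else the word as-is."""
    for cand in candidates(word):
        if cand.lower() in dictionary:
            return cand
    return word


words = {"modern", "scanner"}
print(correct("rnodern", words))
```
</pre>
A real system would presumably also weigh the per-letter confidence<br>
values, so that only doubtful letters are considered for replacement.<br>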
<p>
The two major techniques for recognizing letters seem to be either<br>
neural networks, or building a feature vector from easily measured<br>
characteristics of the bitmap containing a letter; for example, xocr<br>
takes a histogram of the letter at 128 different angles. The<br>
feature-vector technique dates back at least to the 1970s, but neural<br>
networks seem to be what all modern systems use.<br>
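<p>
A minimal sketch of the feature-vector approach, simplified to two<br>
projection angles (rows and columns) instead of the 128 that xocr<br>
uses. This is not xocr's code; the glyph bitmaps, the function names,<br>
and the nearest-template matching are assumptions for illustration,<br>
and all bitmaps must have the same dimensions:<br>
<pre>
```python
import math


def parse(rows):
    """Turn rows of '0'/'1' characters into a 2-level bitmap (1 = ink)."""
    return [[int(ch) for ch in row] for row in rows]


def projection_features(bitmap):
    """Concatenate row and column ink counts, normalized by total ink."""
    rows = [sum(r) for r in bitmap]
    cols = [sum(c) for c in zip(*bitmap)]
    total = sum(rows) or 1          # avoid dividing by zero on a blank bitmap
    return [v / total for v in rows + cols]


def classify(bitmap, templates):
    """Return the label whose template vector is nearest (Euclidean distance)."""
    feats = projection_features(bitmap)
    return min(templates, key=lambda label: math.dist(feats, templates[label]))


# Hypothetical 5x3 reference glyphs.
GLYPHS = {
    "I": ["010", "010", "010", "010", "010"],
    "L": ["100", "100", "100", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}
TEMPLATES = {label: projection_features(parse(rows)) for label, rows in GLYPHS.items()}

noisy_l = parse(["100", "100", "100", "110", "111"])  # an "L" with one stray pixel
print(classify(noisy_l, TEMPLATES))
```
</pre>
The distance to the nearest template could also serve as a rough<br>
confidence value for the recognized letter.<br>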
<p>
There's a very helpful volume of IEEE reprints entitled "Document<br>
Image Analysis" edited by Lawrence O'Gorman and Rangachar Kasturi:<br>
ISBN 0-8186-6547-5.<br>
<p>
Hey, I just noticed that Stuart Inglis' page at<br>
<a href="http://www.cs.waikato.ac.nz/~singlis/ocr/">http://www.cs.waikato.ac.nz/~singlis/ocr/</a> has been updated as of<br>
Oct. 31. Inglis' name was forwarded to me by the FSF, and he seems to<br>
know what he's doing (he has lots of machine learning expertise), but<br>
the C/C++ code still isn't available from that page. We should<br>
approach him, and get a freeware-OCR mailing list set up.<br>
<p>
<p>
Andrew Kuchling<br>
<a href="mailto:amk@magnet.com">amk@magnet.com</a><br>
<a href="http://starship.skyport.net/crew/amk/">http://starship.skyport.net/crew/amk/</a><br>
<p>
<pre>
--
Source code, list archive, and docs: <a href="http://www.mostang.com/sane/">http://www.mostang.com/sane/</a>
To unsubscribe: echo unsubscribe sane-devel | mail <a href="mailto:majordomo@mostang.com">majordomo@mostang.com</a>
</pre>
<!-- body="end" -->