<!-- received="Sun Jul 19 16:17:31 1998 PDT" -->
<!-- sent="Mon, 20 Jul 1998 03:06:46 +0200 (MEST)" -->
<!-- name="becka@rz.uni-duesseldorf.de" -->
<!-- email="becka@rz.uni-duesseldorf.de" -->
<!-- subject="Re: ocr" -->
<!-- id="m0yy4Pe-000CGqC@hades.beck-sw.de" -->
<!-- inreplyto="35B22E8E.33CA2D2F@kootenay.com" -->
<title>sane-devel: Re: ocr</title>
<h1>Re: ocr</h1>
<a href="mailto:becka@rz.uni-duesseldorf.de"><i>becka@rz.uni-duesseldorf.de</i></a><br>
<i>Mon, 20 Jul 1998 03:06:46 +0200 (MEST)</i>
<p>
<ul>
<li> <b>Messages sorted by:</b> <a href="date.html#79">[ date ]</a><a href="index.html#79">[ thread ]</a><a href="subject.html#79">[ subject ]</a><a href="author.html#79">[ author ]</a>
<!-- next="start" -->
<li> <b>Next message:</b> <a href="0080.html">Glenn: "Re: ocr"</a>
<li> <b>Previous message:</b> <a href="0078.html">Goetz Bock: "Re: ocr"</a>
<!-- nextthread="start" -->
<!-- reply="end" -->
</ul>
<!-- body="start" -->
Hi!<br>
<p>
<i>&gt; Anyone working on any OCR software? I have an occasional need and have</i><br>
<i>&gt; tried the program which came with my Microtek scanner under Win95. I</i><br>
<i>&gt; just spent most of the morning playing with it and the best I got</i><br>
<i>&gt; was a nice little windows message box with the useful message "twain</i><br>
<i>&gt; error".</i><br>
<p>
*grin*.<br>
<p>
Hmm, that depends on whether you are willing to pay for it and on the quality you need.<br>
Vividata has its OCRshop available for Linux.<br>
You can even download a 30-day trial version.<br>
Please use a search engine to find the URL; I don't have it handy.<br>
<p>
I tried it recently, but I was not impressed. It correctly read the numbers<br>
on a sample TIFF that came with it, but it choked on a number of other<br>
TIFFs I gave it. That might be a format problem, though.<br>
<p>
<i>&gt; I would have been much better off just typing the text in from the</i><br>
<i>&gt; start.</i><br>
<p>
That has been my impression of _every_ OCR software I have found so far.<br>
<p>
I can type roughly 180 chars/minute, so I can type in an average<br>
page of around 2000 chars in about 10 minutes. It usually takes<br>
about the same time to acquire a scan, OCR it, and then (and this is<br>
the most time-consuming step) proofread it. Most OCR software tends to make<br>
mistakes that the eye "auto-corrects" when you just glance<br>
over the text, such as confusing i/l or rn/m.<br>
<p>
<i>&gt; Is OCR all that difficult to do?</i><br>
<p>
Yes. It is one of the most complex challenges in programming. Even voice<br>
recognition can be considered easier in some cases.<br>
<p>
CU, Andy<br>
<p>
<pre>
--
= Andreas Beck | Email : &lt;<a href="mailto:andreas.beck@ggi-project.org">andreas.beck@ggi-project.org</a>&gt; =
</pre>
<p>
<pre>
--
Source code, list archive, and docs: <a href="http://www.mostang.com/sane/">http://www.mostang.com/sane/</a>
To unsubscribe: echo unsubscribe sane-devel | mail <a href="mailto:majordomo@mostang.com">majordomo@mostang.com</a>
</pre>
<!-- body="end" -->