Google Open Sources Tesseract

Submitted by diegocg 2006-08-31 Google 11 Comments

Google has announced the release of the source code of an old OCR engine called Tesseract. “In a nutshell, we are all about making information available to users, and when this information is in a paper document, OCR is the process by which we can convert the pages of this document into text that can then be used for indexing.”

About The Author

Thom Holwerda

11 Comments

2006-08-31 7:39 pm
mike hess
On the surface, this seems like a very nice contribution that could be useful in lots of other applications.
But Sourceforge doesn’t have anything listed under “License”, so hopefully, it’ll get sorted out.

2006-09-01 8:11 pm
KugelKurt
It’s licensed under the Apache License 2.0. See http://tesseract-ocr.cvs.sourceforge.net/tesseract-ocr/tesseract/RE…

2006-08-31 7:45 pm
Ronald Vos
Google has stated before they want to organise the world’s information, and now they gain mindshare with people who would add substantive globs of information to the internet.
2006-08-31 7:55 pm
JCooper
…I haven’t had a chance to look at the potential of the code here, but could this be leveraged to provide another string to the bow of desktop search? OCR of images (png, jpg, gif etc) by Beagle would be fantastic – picking out signs and all sorts of text would be a sweet feature!
2006-08-31 8:27 pm
Adurbe
get people to ocr their own works, then google can help u share it with the world 🙂
2006-08-31 9:14 pm
twenex
Nice to know there’s a big company that knows what FOSS is all about.

2006-08-31 10:01 pm
NotParker
The Chinese like them too.
2006-09-01 6:51 am
Soulbender
“Nice to know there’s a big company that knows what FOSS is all about.”
Really? I thought this article was about Google?

2006-08-31 10:09 pm
CaptainPinko
I’m not aware of any other.

2006-08-31 11:25 pm
Anonymous Coward
I dunno…
Synaptic reveals: Clara, gocr, and ocrad
but I have all of the repositories enabled… so I’m not sure how free they are…

2006-09-01 9:12 pm
kadymae
“The University of Nevada in Las Vegas”.
::headdesk::
It’s The University of Nevada, Las Vegas.
—
Apache 2.0 license … interesting. It’s nearly as flexible as the BSD license in terms of what it permits.
—
And in the meantime, I’m interested in seeing who grabs the technology and runs with it and what interesting projects it spawns.
Edited 2006-09-01 21:16