OCR trouble

Zacariaz · Post by **Zacariaz** » Thu Feb 19, 2009 10:16 am

Hey all.
It has been some time since I've been hanging around here, and I don't suspect that I'll be very active for a while.
However, I have one of those problems, and this is really the only place I know of where people might be smart enough to give some useful advice.

Anyhow, I hope you have some useful input.

The sample image below is the very essence of my problem.
(part of a screenshot)

I need to extract the information shown, and at first glance it seems simple enough. The information is either recurring or numerical, the quality is usually reasonable, and it's easy to identify which area of the screenshot is relevant. Yet I have not been able to find any OCR software or similar that even comes close to doing a good job of it.

Originally the idea was that one could just upload a screenshot and a server-side OCR would grab the information and stuff it into a database. It all seemed so simple.

So, if you know of any software I haven't already tried, or have suggestions for solutions, other resources on the net, etc., please let me know.

Regards

Combuster · Post by **Combuster** » Thu Feb 19, 2009 10:27 am

Experience teaches that either you have to pay a lot for something that doesn't quite work exactly like you want, or you write your own.

Besides, public server side OCR is like waiting for your computer to die of strain and exhaustion.

Zacariaz · Post by **Zacariaz** » Thu Feb 19, 2009 10:32 am

nobody said public

The problem is of course that I don't have the skills to make something like this.

Btw, the sample image is very much like anything this OCR would be presented with; it's not supposed to work like a general-purpose OCR.

Again, I know nothing about OCR.

Osbios · Post by **Osbios** » Thu Feb 19, 2009 5:46 pm

Is the background of the text changing?

Do you know what font that is? You could use some "brute force" like detection if you know all the exact images of the letters and the pixel positions of the text lines.
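A minimal sketch of that brute-force idea, assuming the image has already been reduced to rows of "1" (text) and "0" (background) and that the glyph bitmaps are known pixel for pixel. The `GLYPHS` table, `match_glyph` and `read_line` are made up for illustration; the real font in the screenshot is unknown.

```python
# Hypothetical 3x5 glyph bitmaps: "1" = text pixel, "0" = background.
GLYPHS = {
    "1": ["010",
          "110",
          "010",
          "010",
          "111"],
    "7": ["111",
          "001",
          "010",
          "010",
          "010"],
}


def match_glyph(image, x, y, glyph):
    """Return True if glyph matches image exactly with its top-left at (x, y)."""
    for dy, row in enumerate(glyph):
        for dx, bit in enumerate(row):
            if image[y + dy][x + dx] != bit:
                return False
    return True


def read_line(image, y, glyphs=GLYPHS):
    """Scan one text line (top row at y) left to right and collect exact matches."""
    width, height = len(image[0]), len(image)
    text, x = "", 0
    while x < width:
        for char, glyph in glyphs.items():
            fits = x + len(glyph[0]) <= width and y + len(glyph) <= height
            if fits and match_glyph(image, x, y, glyph):
                text += char
                x += len(glyph[0])  # jump past the recognised glyph
                break
        else:
            x += 1  # nothing matched here, try the next column
    return text


line = ["0100111",
        "1100001",
        "0100010",
        "0100010",
        "1110010"]
print(read_line(line, 0))
```

Exact matching like this only works if the text is not anti-aliased and the background has been removed first; otherwise a per-pixel tolerance would be needed.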

And btw, what game is that?

DeletedAccount · Post by **DeletedAccount** » Thu Feb 19, 2009 6:47 pm

Hmmm ... sounds like pattern matching with artificial neural networks. Hope you will find information if you search based on the above.

Regards
Shrek

clange · Post by **clange** » Thu Feb 19, 2009 7:24 pm

I would break the problem into two parts:

1. binarize the image
2. do the ocr

then implement step one yourself and use off-the-shelf software for the OCR part.

Step one is probably easier than expected at first glance, especially if your input images are "nice" (I'll get back to this part later).

Start by converting the image to grey scale. The weight factors for the RGB components don't matter in the beginning, so use normal NTSC weights or just weight them equally. Later you can test which weight factors suit your purpose best.
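As a rough sketch of the grey scale step, assuming the image is available as rows of `(r, g, b)` tuples in the 0-255 range. The function name `to_grey` and the pixel layout are assumptions for illustration; the NTSC weights are the usual 0.299 / 0.587 / 0.114.

```python
NTSC_WEIGHTS = (0.299, 0.587, 0.114)
EQUAL_WEIGHTS = (1 / 3, 1 / 3, 1 / 3)


def to_grey(pixels, weights=NTSC_WEIGHTS):
    """Convert rows of (r, g, b) tuples to rows of grey values in 0..255."""
    wr, wg, wb = weights
    return [
        [min(255, round(wr * r + wg * g + wb * b)) for (r, g, b) in row]
        for row in pixels
    ]


row = [[(0, 0, 255), (255, 255, 255)]]   # one pure blue and one white pixel
print(to_grey(row))                      # NTSC weights: blue comes out dark
print(to_grey(row, EQUAL_WEIGHTS))       # equal weights: blue is mid grey
print(to_grey(row, (0.0, 0.0, 1.0)))     # blue channel only: blue is as bright as white
```

Trying different weights like this is one way to test which combination separates the text from the background best.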

Filter the image. I would use median filtering to remove pixel noise and, if the input source is JPEG (or another lossy compression format, especially a block-based one), also smooth the image (Gaussian filtering or other). You can also adjust brightness and gamma correction.
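A minimal sketch of a 3x3 median filter on a grey scale image stored as a list of rows. The name `median_filter` and the choice to leave border pixels untouched are assumptions made to keep the example short.

```python
def median_filter(grey):
    """Replace every interior pixel with the median of its 3x3 neighbourhood."""
    height, width = len(grey), len(grey[0])
    out = [row[:] for row in grey]  # border pixels are copied unchanged
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = sorted(
                grey[yy][xx]
                for yy in (y - 1, y, y + 1)
                for xx in (x - 1, x, x + 1)
            )
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out


noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
print(median_filter(noisy))
```

A single bright speck disappears, while larger structures such as glyph strokes a few pixels wide mostly survive. Very thin strokes can be eroded, so it is worth checking the result on real screenshots.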

Detect which parts of the image contain text (google for text detection in videos; it is quite similar to finding text in images with screen resolution). I can't remember which method I used, but it resulted in a map of the image describing which parts contained possible text. From this map it is quite easy to build a structure describing which rectangles in the image contain text.

If your input images are nice (text of equal height and (mostly) the same color, uniform background - it doesn't have to be the same color as long as there is a minimum of patterns) you will easily be able to detect where the text is. From here it is quite simple to select which grey scale values map to text pixels and which map to background, since you also know whether a pixel is within a text area.

This will give you a binary image to feed to standard OCR software.
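One simple way to sketch that last step, not necessarily the method used in the implementation described here: threshold each known text rectangle halfway between its darkest and brightest grey value and treat everything outside the rectangle as background. The function name `binarize`, the half-open `(x0, y0, x1, y1)` rectangle format, and the assumption that text is brighter than its background are all illustrative.

```python
def binarize(grey, rect):
    """Return a 0/1 image: 1 for text pixels inside rect, 0 everywhere else.

    rect is (x0, y0, x1, y1), half-open, and is assumed to come from an
    earlier text detection step. Text is assumed brighter than background.
    """
    x0, y0, x1, y1 = rect
    region = [grey[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    threshold = (min(region) + max(region)) / 2
    out = [[0] * len(grey[0]) for _ in grey]
    for y in range(y0, y1):
        for x in range(x0, x1):
            if grey[y][x] > threshold:
                out[y][x] = 1
    return out


grey = [[200, 200, 200],   # bright, but outside the text rectangle
        [40, 220, 50],
        [45, 60, 210]]
print(binarize(grey, (0, 1, 3, 3)))
```

Because the threshold is computed per rectangle, a background whose brightness differs between text areas is less of a problem than with one global threshold.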

The text detection step and foreground/background discrimination might not be needed. Start by playing around with the weight factors and the image adjustments and just feed the grey scale image to OCR. It sometimes gives improved results to scale the images up. The OCR software I tested seemed to be much happier with scan resolutions (300 dpi +) than screen resolution (75-100 dpi) but it was a while ago.
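Scaling up can be sketched with plain nearest-neighbour enlargement, where each pixel simply becomes a `factor` x `factor` block. The name `upscale` is an assumption, and a smoother resampling method may well give an OCR engine better input than this.

```python
def upscale(image, factor):
    """Enlarge an image (list of rows) by an integer factor, nearest neighbour."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


small = [[1, 2],
         [3, 4]]
for row in upscale(small, 2):
    print(row)
```

Going from roughly 100 dpi screen resolution to 300 dpi corresponds to a factor of 3.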

I have successfully extracted and analyzed text like I described above. If you have any questions I might be able to answer (I can't go into many more details than this since the implementation is proprietary).

I hope this will be an inspiration.

clange

Zacariaz · Post by **Zacariaz** » Fri Feb 20, 2009 4:30 pm

@ osbios
As the "window" is more or less transparent, I think the answer to your question is yes.
As for the font, that's a bit harder. The answer is of course no, I do not know it, and whether it is a commonly used font, I have no idea.

@ shrek
Exactly my thought.
www.numenta.com (dunno how to categorise or describe it) was actually my first thought, but again, I'm simply not smart enough.

@ clange
I haven't experimented with "binarizing". It sounds like something that might work, and I'll certainly try it out.

It's Friday and I'm kinda at a party, so a longer reply will have to wait, but thank you for your replies.

Regards

bewing · Post by **bewing** » Sun Feb 22, 2009 6:54 am

Are those 5 line items constant (i.e. do you need to grab the "white" text)? Or is the "blue" text the ONLY thing you need? Does the blue text ever change color at all? Is that blue text anti-aliased? That color seems very specific.

If it is as simple as it looks, I would just create 5 bitmaps, one for each line of text. Every pixel that is not that exact color of blue, I would set to black. All of the blue pixels, I would set to white. It would probably be easiest all around if it was a 1-bit (two-color) monochrome bitmap. THEN try passing it through a few generic OCR programs and see what happens (I will bet that they do a perfectly good job on non-colored text). If they still don't get it right, you may need to magnify and enhance the image of the text before passing it in. But do start by seeing if you can pull just the text you want into a simple black-and-white bitmap.
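A small sketch of that color mask, assuming the screenshot is available as rows of `(r, g, b)` tuples. The `TEXT_BLUE` value is a placeholder, not the real color from the screenshot, and the `tolerance` parameter is an added assumption for cases where the color is not exactly constant.

```python
TEXT_BLUE = (90, 160, 255)  # hypothetical value; sample the real one from a screenshot


def mask_colour(pixels, target=TEXT_BLUE, tolerance=0):
    """Return a 0/1 image: 1 (white) where a pixel matches target, else 0 (black).

    With tolerance=0 only the exact colour matches; a larger value accepts
    pixels whose channels each differ from target by at most that amount.
    """
    def matches(pixel):
        return all(abs(a - b) <= tolerance for a, b in zip(pixel, target))

    return [[1 if matches(pixel) else 0 for pixel in row] for row in pixels]


row = [[(90, 160, 255), (255, 255, 255), (88, 158, 250)]]
print(mask_colour(row))               # exact match only
print(mask_colour(row, tolerance=5))  # also accepts the slightly shifted blue
```

Cropping the mask into one bitmap per text line can then be done with plain list slicing before handing the result to an OCR program.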

Zacariaz · Post by **Zacariaz** » Wed Feb 25, 2009 8:42 pm

Almost forgot about this thread

Anyway, the white text is constant and the blue text may change.
The numerical values will of course stay numerical, but will change. As for the text, it will be recurring, and it would be possible to create a relatively short list of names/terms, whatever you call them, to test against in case of errors.
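Testing OCR output against such a list could be sketched like this, using `difflib` from the Python standard library. The entries in `KNOWN_TERMS`, the cutoff value and the table of look-alike characters are made-up examples; the real list would come from the application.

```python
import difflib

KNOWN_TERMS = ["Iron", "Copper", "Titanium"]  # hypothetical list of recurring terms

# Letters that OCR engines commonly confuse with digits (illustrative, not exhaustive).
DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})


def correct_term(ocr_text, known=KNOWN_TERMS, cutoff=0.6):
    """Return the closest known term, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(ocr_text, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def parse_number(ocr_text):
    """Map look-alike letters to digits, then parse; return None on failure."""
    cleaned = ocr_text.strip().translate(DIGIT_FIXES)
    if cleaned.isascii() and cleaned.isdigit():
        return int(cleaned)
    return None


print(correct_term("C0pper"))
print(correct_term("Titaniurn"))
print(parse_number("1O5l"))
```

Returning `None` instead of guessing makes it possible to flag a screenshot for manual review rather than storing a wrong value in the database.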

As for AA, this is an option, so the answer will be both yes and no. Also, the colors may vary slightly depending on how the gamma is set.

I've made some tests, just greyscaling the images (couldn't figure out how to binarize them), and it actually made a big difference, though there's still the problem of identifying the relevant area, as it will be a complete screenshot that will be processed.

Well, making progress, but this is still way out of my league. I wonder if I might be able to hire a professional for the job, though it would probably be very expensive...

bewing · Post by **bewing** » Wed Feb 25, 2009 9:03 pm

I'll pm you my email address. If you will send me a few complete screenshots (especially with changing gamma), I will see if it is easy to convert the image to monochrome. If it is, I will give you a quote for writing the code for you. I don't think it will be expensive.

Zacariaz · Post by **Zacariaz** » Wed Feb 25, 2009 9:35 pm

@ bewing:
I've sent you a PM with a link to some sample data.

@ all:

Just in case some of you feel like messing around with this, I'll provide the link here too: www.myhideout.eu/ocr/sample.zip

OSDev.org
