copy text on the screen as text?

Asked By Cal Who
20-Jan-10 02:41 PM
A long time ago I wrote a program that copied the screen.
But that produced an image of the screen.

I am guessing that that is the best one can do but I would like to copy text
on the screen as text (i.e. as a string)

Do you know if that is possible?

Or maybe, you know why that it is not possible - if so, of course, I'd like
to know that.

Thanks

j1mb0jay replied to Cal Who
20-Jan-10 03:08 PM
Hi, as a thought, you could capture the screenshot as an image, as you
mentioned you have done before. You could then pass the image
through an OCR SDK such as http://www.leadtools.com/sdk/ocr/default.htm
to get the characters from the image.

Hope this helps.

j1mb0jay

http://www.dotnethelp.co.uk

Peter Duniho replied to Cal Who
20-Jan-10 03:20 PM
Anything's possible.  :)

After you capture the image, you can use OCR techniques to convert to
text.  Depending on the target you are actually trying to capture, it is
even possible you could hack your way into the process and retrieve the
text data straight from the window or underlying data structures.

Reading text straight from the window is not really even that much of a hack, depending on how much
you know about the window hierarchy of the process, the specific OS
version (Vista and Windows 7 have stronger cross-process security), and
the data you are trying to get, because many windows will just give up
their text contents when sent a simple window message (e.g. WM_GETTEXT,
LB_GETTEXT, EM_GETTEXTEX, etc.)
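
As a rough illustration of the window-message approach, here is a minimal sketch in Python using ctypes (the thread itself is about C#/.NET, where the equivalent would go through p/invoke). It covers only WM_GETTEXT, it is Windows-only, and the helper name get_window_text is made up for this example. Whether a given window answers, and what it returns, depends on the target application and on the cross-process security mentioned above.

```python
import ctypes
import sys

# Standard Win32 message identifiers.
WM_GETTEXT = 0x000D
WM_GETTEXTLENGTH = 0x000E


def get_window_text(hwnd: int) -> str:
    """Ask the window identified by hwnd for its text via WM_GETTEXT.

    get_window_text is a hypothetical helper for illustration only.
    """
    if sys.platform != "win32":
        raise OSError("SendMessage is only available on Windows")

    from ctypes import wintypes

    user32 = ctypes.WinDLL("user32", use_last_error=True)
    send = user32.SendMessageW
    send.argtypes = [wintypes.HWND, wintypes.UINT,
                     wintypes.WPARAM, wintypes.LPARAM]
    send.restype = ctypes.c_ssize_t  # LRESULT is pointer-sized

    # First ask how many characters the window holds, then fetch them.
    length = send(hwnd, WM_GETTEXTLENGTH, 0, 0)
    buf = ctypes.create_unicode_buffer(length + 1)  # room for the terminator
    send(hwnd, WM_GETTEXT, len(buf), ctypes.addressof(buf))
    return buf.value
```

The handle would have to come from somewhere else, for example FindWindowW or GetForegroundWindow in user32. List boxes and rich edit controls use their own messages (LB_GETTEXT, EM_GETTEXTEX) with different parameters, which this sketch does not attempt.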

I guess the real question is, why are you trying to do this?  One
obvious application would be some sort of accessibility assistant.  But
there are already third-party applications out there to do that.

Actually, speaking of that, Windows does have some kind of accessibility
hooks that programs like that use.  I do not know anything about the
specifics, but it is possible you could make progress toward your goal
using those.

Pete

Cal Who replied to j1mb0jay
20-Jan-10 04:30 PM
Actually I tried OCR before I posted, but the screen resolution is such that
the image was not good enough.
After your post I changed the screen to 800x600 and tried again.
Used Photoshop to resample to 300 and then the OCR worked great.

Thanks
Cal Who replied to Peter Duniho
20-Jan-10 04:38 PM
I may try that just for fun.
Do you think I should try code like we did before managed code came about (I
still have my Petzold book)?
Do people still code against the old Windows API?
If not, can you point me to an example - anything - not necessarily related
to my original post.

What started this is I did a search for files and wanted to send the result
list to someone but folders do not allow you to copy the displayed text.

Thanks
Peter Duniho replied to Cal Who
20-Jan-10 04:57 PM
I am not aware of support in .NET that would specifically allow that.
But then, since I am not familiar at all with the accessibility stuff,
for all I know that might have .NET support.  For sure, the
SendMessage() approach would still require the unmanaged API, but of
course you can use p/invoke to access it from your managed code.


Sure, happens all the time.  In the context of .NET, p/invoke is the
more common way to do that, but a C++/CLI wrapper is another possible
approach.  For COM stuff, .NET will even handle most of the heavy
lifting for you, via COM interop.

There are still lots of programmers out there who do not do anything at
all with managed code.  It is all the old-school unmanaged Win32 API for
them (a small handful of them have no choice; as for the rest,
they do not know what they are missing! :) ).


I do not have any examples off the top of my head.


Well, for that specific task, I can think of a number of alternatives
that are FAR superior to trying to scrape text from an Explorer window,
depending on the exact nature of the search:

* Use the DIR command
* Use the FINDSTR command (similar to grep)
* Write your own file search utility that provides a text export feature

The first two are just regular "user with a command prompt" solutions.
Just pipe the output to a file if you want to save the output.

Frankly, even if it comes down to needing the third approach, that is
going to be WAY easier than text scraping.  For an experienced C#/.NET
developer who has specific search criteria already in mind, I'd guess
the basic search logic could be written in 30-60 minutes, with maybe
another 15-30 minutes to get the output to a text file.  And that is a
conservative guess; for a relatively simple search, it should take an
experienced programmer less than 30 minutes to do the whole thing.
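
To give a sense of how small the third option can be, here is a minimal sketch of a file search with text export. It is written in Python rather than C#/.NET purely for illustration, and the function names find_files and export_results, as well as the paths in the usage comment, are invented for this example. It matches on file name only, using a wildcard pattern.

```python
import fnmatch
import os
from pathlib import Path


def find_files(root, pattern):
    """Return the sorted paths under root whose file name matches pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # fnmatch handles shell-style wildcards such as *.txt
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def export_results(paths, output_file):
    """Write one path per line to output_file as UTF-8 text."""
    text = "".join(path + "\n" for path in paths)
    Path(output_file).write_text(text, encoding="utf-8")


# Example usage (hypothetical paths):
#   export_results(find_files(r"C:\Projects", "*.cs"), "search_results.txt")
```

The resulting text file can then be mailed or pasted anywhere, which is the part an Explorer search window does not offer.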

For an inexperienced C#/.NET programmer, I suppose it could easily be
three or four times that long, or even days for a complete newbie who
has to spend most of their time just researching what APIs are involved
and how to use them.

But either the screen-scraping (done programmatically; obviously using
existing tools is much simpler) or direct-to-window/accessibility
techniques would be MUCH more complicated.  For a programmer for whom it
would take days to implement the custom search utility approach, it
could easily take months or a year to accomplish the more complicated
technique.  :)

Pete
Jesse Houwing replied to Cal Who
20-Jan-10 05:05 PM
*  Cal Who wrote, On 20-1-2010 22:38:

You could try to access Windows Search directly:

http://dedjo.blogspot.com/2007/08/how-to-use-windows-vista-search-api.html

http://msdn.microsoft.com/en-us/library/ee872109%28VS.85%29.aspx

--
Jesse Houwing
jesse.houwing at sogeti.nl