How to extract text from text boxes
PT
2009-07-08 16:58:44 UTC
I scanned an article to a PDF file. Then I used the Nuance program
“PDF Convert” to OCR the PDF and save it to a Word 2003 file.
Everything worked fine – the text transferred accurately.

Here’s the problem:

Each paragraph of the text is now enclosed in a separate text box,
making it impractical to edit.

Is there a simple way to extricate the text from the text boxes and
end up with a normal, God-fearing Word document?
Terry Farrell
2009-07-08 19:15:59 UTC
If those are really frames rather than text boxes, then there is a
RemoveFrames command listed under All Commands. The command can be added to
a toolbar, but if you need further instructions, we need to know which
version of Word you are using.
--
Terry Farrell - MSWord MVP
PT
2009-07-08 22:18:16 UTC
I did some checking. The box has the same borders as a text box.
It’s a crosshatched border with small circular “handles”.

But just in case, I accessed the “Remove Frames” command and put it on
the toolbar. But as soon as I click in a “frame”, the toolbar command
grays out.

So assuming the converter put each paragraph into a separate text box,
how can I remove the boxes, while retaining the enclosed text?
Suzanne S. Barnhill
2009-07-09 00:14:03 UTC
Round handles mean text boxes; frames have square ones. You can convert each
text box to a frame (Format Text Box, Text Box tab, Convert to Frame) and
then use Remove Frames. In fact, just pressing Ctrl+Q will usually remove a
frame, since the frame is unlikely to be defined as part of the paragraph
style. But you will also lose any other directly applied paragraph
formatting.
--
Suzanne S. Barnhill
Microsoft MVP (Word)
Words into Type
Fairhope, Alabama USA
http://word.mvps.org
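For anyone who would rather script the extraction than convert each box by
hand, the sketch below reads the text straight out of the file. It is not a
method proposed in this thread, and it rests on an assumption: the document
has first been re-saved in the newer .docx format (Word 2003 needs the Office
Compatibility Pack for that, or the file can be opened in Word 2007 or
later). The function names text_box_paragraphs and text_boxes_from_docx are
made up for illustration; only the Python standard library is used.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used inside word/document.xml of a .docx file.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
MC_FALLBACK = (
    "{http://schemas.openxmlformats.org/markup-compatibility/2006}Fallback"
)


def _walk(element):
    """Yield all descendants in document order, skipping mc:Fallback.

    Newer Word versions store each text box twice (a drawing plus a legacy
    fallback copy), so the fallback copy is skipped to avoid duplicates.
    """
    for child in element:
        if child.tag == MC_FALLBACK:
            continue
        yield child
        yield from _walk(child)


def text_box_paragraphs(document_xml):
    """Return the text of every paragraph that sits inside a text box."""
    root = ET.fromstring(document_xml)
    paragraphs = []
    for node in _walk(root):
        if node.tag != W + "txbxContent":
            continue
        # One entry per paragraph; the runs of a paragraph are joined.
        for para in node.iter(W + "p"):
            text = "".join(t.text or "" for t in para.iter(W + "t"))
            paragraphs.append(text)
    return paragraphs


def text_boxes_from_docx(path):
    """Open a .docx (a zip archive) and extract its text-box paragraphs."""
    with zipfile.ZipFile(path) as archive:
        return text_box_paragraphs(archive.read("word/document.xml"))
```

Calling text_boxes_from_docx("scanned-article.docx") (a hypothetical file
name) returns one string per paragraph, in the order the boxes are stored in
the file, which may not match their visual order on the page. The strings
can then be pasted into a fresh document as ordinary paragraphs; all
formatting is lost.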
Graham Mayor
2009-07-09 04:30:55 UTC
Most OCR software makes a complete hash of converting to Word. Text boxes
and frames are typical examples. OK, you get a Word document, but the
document is not editable without a lot of work. FineReader 9 works better
than most, but perfect it isn't.

You might have been better off scanning into Microsoft Office Document
Imaging (included with Office 2003, though not installed by default). This
will not cause the formatting issues, because there will be no formatting in
the resulting file, but its text reading ability is reasonably good.
--
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
Graham Mayor - Word MVP

My web site www.gmayor.com
Word MVP web site http://word.mvps.org
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>