Talk About Network


Re: document processing

by "Jim Langston" <tazmaster@[EMAIL PROTECTED] > Nov 27, 2006 at 03:18 PM

"DoctorC" <enco@[EMAIL PROTECTED]
> wrote in message 
news:456ae3ac$0$17950$f69f905@[EMAIL PROTECTED]
> Hi,
> I need some suggestion about document processing techniques.
> I need to import documents in html, DOC and PDF formats and would like to
> parse them and automatically create fields to fill the documents.
> Any idea how to do it?

"import documents..." "automatically create fields to fill the
documents..."

HTML, DOC and PDF are three different animals.

The easiest would probably be HTML, since it'll probably have tags that
specify what are actually fields (form fields are marked up with tags such
as <input>, <textarea> and <select>, normally inside a <form>).
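For the HTML case, here is a rough sketch of pulling the fields out with nothing but Python's standard html.parser module. The names FieldCollector and find_fields and the sample form are made up for illustration, and it assumes the document really uses form tags; an HTML page that only draws blank lines with text next to them has the same problem as DOC and PDF.

```python
from html.parser import HTMLParser

# Tags that HTML uses for fillable form fields.
FIELD_TAGS = {"input", "textarea", "select"}


class FieldCollector(HTMLParser):
    """Collects (tag, name, type) for every form field tag it sees."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if tag in FIELD_TAGS:
            attributes = dict(attrs)
            self.fields.append(
                (tag, attributes.get("name"), attributes.get("type"))
            )


def find_fields(html_text):
    """Return the form fields found in html_text, in document order."""
    collector = FieldCollector()
    collector.feed(html_text)
    collector.close()
    return collector.fields


# Hypothetical sample form, only for demonstration.
sample = """
<form>
  Name: <input type="text" name="name">
  Comments: <textarea name="comments"></textarea>
</form>
"""

print(find_fields(sample))
```

The parser only reports what the markup declares, so the labels ("Name:", "Comments:") would still have to be matched to the fields separately.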

The problem with DOC and PDF is that, unless the author built explicit form
fields into the document, there is nothing really stating what a field is.
Let's take a PDF, which is often just scanned graphic images.  If it is
graphic, you'll need some type of OCR (Optical Character Recognition) to
read the text.  At least with DOC you already have the text.  But then
what?  How do you know what a field is?

We, as humans, see:
Name
and we know we're supposed to put our name there.  How is your software
supposed to distinguish that as a field though?  How does it know:
Enter your name:
is a field and
Do not write below this line:
isn't?
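To make the difficulty concrete, here is a small sketch of the kind of rule you might try first, and why it isn't enough. Both functions and the NOT_FIELD_HINTS list are hypothetical, invented just for this example; the keyword list in particular is a guess that will miss plenty of real cases.

```python
def looks_like_field_naive(line):
    """Naive rule: any line ending in a colon is treated as a field label."""
    return line.strip().endswith(":")


# Hypothetical phrases suggesting an instruction rather than a field.
# This list is an assumption for illustration, not a complete solution.
NOT_FIELD_HINTS = ("do not", "don't", "office use")


def looks_like_field(line):
    """Colon rule plus a crude keyword filter for instruction lines."""
    text = line.strip().lower()
    if not text.endswith(":"):
        return False
    return not any(hint in text for hint in NOT_FIELD_HINTS)


for line in ["Enter your name:", "Do not write below this line:"]:
    print(line, looks_like_field_naive(line), looks_like_field(line))
```

The naive rule calls both lines fields. The keyword filter gets these two right, but only because the phrases were hand-picked for them, which is exactly the problem: the software has no real idea what the text means.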
 




 3 Posts in Topic:
document processing
DoctorC <enco@[EMAIL P  2006-11-27 15:15:33 
Re: document processing
"Jim Langston"   2006-11-27 15:18:50 
Re: document processing
DoctorC <enco@[EMAIL P  2006-11-28 07:45:41 
