Null character in OCR'd text


This topic is 4023 days old. Please don't post here. Open a new topic instead.

I stored a PDF in DocuBin. The text from the PDF is stored in the "_c_Text" field in table "Version".

 

For the PDF that I'm using, the resulting text in _c_Text has all sorts of invisible characters which I would like to remove.

 

Btw, the file I'm referring to is the standard Apple "About Stacks.pdf" file. I've uploaded it here, although I don't think this problem is due to this particular PDF file.

 

These characters aren't visible in a text editor, but if you use the arrow key to move across the text characters, you can tell they're there, because the cursor all of a sudden stops moving.

 

 

In TextWrangler, if you turn on 'Show Invisibles', the characters are shown as "?". In a hex editor, they are shown as 00, i.e. null bytes.

I tried removing them with the FileMaker Substitute function and Char ( 0 ), but that did not work.
 
Can you provide a way to get the scanned (OCR'd) text that is stripped of these invisible characters?
 
Thanks.
 

About Stacks.pdf


DocuBin does not perform any OCR on inserted files. The invisible characters you are seeing in your PDF, About Stacks.pdf, come from how this particular PDF was constructed. I have uploaded a different PDF that you can test with; it does not have these invisible characters.

 
As a technical clarification: when a file is inserted into DocuBin, we extract the text from the file (if there is any) with our 360Works Scribe Plugin, and stick it in the field Versions::Text. Versions::_cText is only a calculation (¶ & Version::Text) that we use on the layout for several off-topic reasons.
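If the extracted text can be exported and cleaned outside FileMaker, stripping the null bytes is simple. The following is only a rough sketch in Python, not something DocuBin or the Scribe plugin does; the function names `strip_nul` and `strip_control_chars` are made up for illustration, and it assumes the text is already available as an ordinary string.

```python
def strip_nul(text: str) -> str:
    """Remove every NUL (U+0000) character, the 00 bytes seen in a hex editor."""
    return text.replace("\x00", "")


def strip_control_chars(text: str, keep: str = "\t\n\r") -> str:
    """Drop all C0 control characters and DEL, except those listed in `keep`.

    Hypothetical helper: useful if the PDF contains other invisible
    characters besides NUL.
    """
    return "".join(
        ch for ch in text
        if ch in keep or not (ord(ch) < 32 or ord(ch) == 127)
    )


if __name__ == "__main__":
    sample = "Ab\x00out\x00 Stacks"
    print(repr(strip_nul(sample)))
```

The second function is broader than what was asked for, so it should only be used if tabs and line breaks are the only control characters worth keeping.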

 

Michael

fmp12_tutorial.pdf
