cchaski

Scribe: extracting text from a PDF inserts spaces inside words


I am using Scribe to extract text from a PDF. The PDF has a file size of 705542, so it is fairly large. The Scribe result is fairly good, but it has a serious problem: spaces are inserted inside words. Examples: "NE W YORK AND NEW JERSEY", "airp orts", "Inte rnational", "Internat ional".

Why does Scribe do this? Is there a way to fix it? (It is close to a dealbreaker if you can extract text but then cannot use the text that is extracted.)

Many thanks in advance for your replies, as I certainly hope I can use Scribe since I've used 360Works products for years and Textractor before Scribe.

Carole
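
Editor's note: while the cause is being investigated, a possible stopgap is to repair the extracted text after the fact. The sketch below is not a Scribe feature and does not explain why the spaces appear; it is a generic post-processing heuristic written in Python. The function name `rejoin_split_words` and the `vocabulary` argument are hypothetical, and it assumes you can supply a list of words you expect in the document. Two adjacent fragments are merged only when the joined form is a known word and at least one fragment is not a word by itself.

```python
def rejoin_split_words(text, vocabulary):
    """Merge adjacent fragments whose concatenation is a known word.

    `vocabulary` is an assumed, caller-supplied list of expected words.
    Note: runs of whitespace (including line breaks) collapse to one space.
    """
    vocab = {w.lower() for w in vocabulary}

    def known(fragment):
        # Ignore surrounding punctuation and case when looking a word up.
        return fragment.strip(".,;:!?()\"'").lower() in vocab

    tokens = text.split()
    result = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            joined = tokens[i] + tokens[i + 1]
            # Merge only if the joined form is a word and the pair is not
            # already two valid words (avoids gluing legitimate neighbours).
            if known(joined) and not (known(tokens[i]) and known(tokens[i + 1])):
                result.append(joined)
                i += 2
                continue
        result.append(tokens[i])
        i += 1
    return " ".join(result)


words = ["new", "york", "and", "jersey", "airports", "international"]
sample = "NE W YORK AND NEW JERSEY airp orts Inte rnational Internat ional"
print(rejoin_split_words(sample, words))
```

This only handles a word broken into two pieces, and it can only repair words that appear in the supplied vocabulary, so results should be spot-checked against the original PDF.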

  

  

  

 


Hi Carole,

What version of Scribe are you using? Can you download and install this build and see if you get the same results? If you do, please send an email to support@360works.com with a description of the problem and steps to reproduce the issue. Also, please provide a PDF that you are seeing this behavior with.

 

