cchaski

Scribe Extracting Text from PDF inserts space inside words

I am using Scribe to extract text from a PDF. The PDF has a file size of 705542, so it is fairly large. The Scribe result is fairly good, but it has a serious problem: spaces are inserted inside words. Examples: NE W YORK AND NEW JERSEY, airp orts, Inte rnational, Internat ional.

Why does Scribe do this? Is there a way to fix it? (It is close to a dealbreaker if you can extract text but then cannot use the extracted text.)
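One possible stopgap, independent of Scribe itself, is to clean up the extracted text after the fact. The Python sketch below is only an illustration under stated assumptions: it is not part of Scribe or any 360Works product, the function name `rejoin_split_words` and the sample vocabulary are hypothetical, and it assumes a word list of expected terms is available. It merges two adjacent fragments when their concatenation is a known word and the fragments are not both known words on their own.

```python
def rejoin_split_words(text, vocabulary):
    """Merge adjacent fragments whose concatenation is a known word.

    Hypothetical helper, not a Scribe function. Note that text.split()
    collapses all whitespace, so original line breaks are not preserved.
    """
    vocab = {word.lower() for word in vocabulary}
    tokens = text.split()
    result = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            left, right = tokens[i], tokens[i + 1]
            merged = left + right
            # Ignore surrounding punctuation when looking the word up.
            key = merged.strip(".,;:!?()\"'").lower()
            # Skip the merge if both pieces are already valid words,
            # so that legitimate word pairs are left alone.
            both_known = left.lower() in vocab and right.lower() in vocab
            if key in vocab and not both_known:
                result.append(merged)
                i += 2
                continue
        result.append(tokens[i])
        i += 1
    return " ".join(result)


if __name__ == "__main__":
    # Assumed vocabulary for the examples from this post.
    words = ["new", "york", "and", "jersey", "airports", "international"]
    sample = "NE W YORK AND NEW JERSEY airp orts Inte rnational Internat ional"
    print(rejoin_split_words(sample, words))
```

This only handles words split into two pieces and depends entirely on the quality of the word list, so it is a workaround rather than a fix for the underlying extraction behavior.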

Many thanks in advance for your replies. I certainly hope I can use Scribe, since I've used 360Works products for years, and Textractor before Scribe.

Carole

  

  

  

 


Hi Carole,

What version of Scribe are you using? Can you download and install this build and see if you get the same results? If you do, please send an email to support@360works.com with a description of the problem and steps to reproduce the issue. Also, please provide a PDF that shows this behavior.

 


  • Similar Content

    • By Aaron Lamperti
      Has anyone had any luck getting Scribe to substitute text within a text box in a Word document?
    • By Jonathan Ackerman
      Is it possible to append text or images to the end of a loaded doc (not just another document)?

      I.e., something like:

      $result=ScribeDocAppend ("new stuff")

      It seems the function only appends other files, not text.
      What I need is to be able to add custom text to the end of some documents.

      I'm not sure how to do this.

      Thanks!
    • By Christian Chojnacki
      Hello, there,
      I'm trying to get the Scribe plugin running on my FileMaker Server 17.
      But I always get the error message:
      Error 474 Plugin could not be loaded: 360Works_Scribe.fmx64
      and
      Error 701 The FileMaker Script Engine process was terminated abnormally.
      I tried Scribe 3.08 and the beta Scribe-3.08009, but neither works.
      The server runs on Windows Server 2008 R2 Standard.
      JRE 1.8.0_171 (32 & 64 bit) & JRE 1.8.0_171 (32 bit) are installed.
      The permissions are set to full access for "Everyone".

      Restarting the server, reassigning permissions, and restarting services did not help.
      Does anyone have any idea why? I can send other log files if needed.
      Thank you and greetings
      Christian
      360Plugins_ServerScripting64.log
    • By Franziska B
      Dear 360 Works Team

      In our FileMaker solution we want to create an Excel report including text, numbers (dates), and images via Scribe.

      The FileMaker export of text and numbers works fine, but we have a problem with the export of images to the .xlsx.

      After the Excel report is created, all text and number elements are in the .xlsx, but the image is not.


       
      When we open the generated .xlsx, Excel shows the attached error message:


      To export the pictures, we tested the following script commands:

      ScribeDocValue[Name: "F5"; Value: table::cell]
       
      ScribeDocWriteValue("Table1!F5", table::cell)
      SetVariable[$PicXYZ; Value: ScribeDocWriteValue("Table1!F5", table::cell)]
       
      Are we making a mistake, or could this be a known bug?

      Our system configuration:
      FileMaker Pro Advanced 16.0.4.403
      Excel 2010 and Office 365;          
      360 Works Scribe 3.08 and Scribe 3.09
      Server:
      FM Server 16.0.3.304
      Microsoft Windows Server 2008 R2 Standard
      Version 6.1.7601 Service Pack 1 Build 7601
      Could you help me please?
      Thanks


    • By Dean Ingram
      I am trying to get a page count from a PDF file stored in a container field. ScribeDocLoad is successful. Next I'm attempting to use ScribeDocReadValue to get the PDFPageCount. The variable $LastPage is set to ScribeDocReadValue( "PDFPageCount" )
      This parameter, or any other of the metadata values listed in the documentation, results in an error like "Reading values is not supported for the file PDFPageCount"
      The documentation reads like this:
      Am I fundamentally misunderstanding what this is supposed to do? The PDF file has no formal "fields." It's just a scanned PDF.
      D