cchaski

  1. I am using Scribe to extract text from a PDF. The PDF has a file size of 705542, so it is fairly large. The Scribe result is fairly good, but it has a serious problem: a space is inserted into words. See examples like NE W YORK AND NEW JERSEY, airp orts, Inte rnational, Internat ional. Why does Scribe do this? Is there a way to fix it? (It is close to a dealbreaker if you can extract text but then cannot use the text that is extracted.) Many thanks in advance for your replies, as I certainly hope I can use Scribe, since I've used 360Works products for years and Textractor before Scribe. Carole
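One possible workaround for the stray spaces, independent of Scribe itself, is to post-process the extracted text against a word list. The sketch below is only an illustration: `rejoin_split_words` and the small vocabulary are hypothetical, and a real word list would have to come from a dictionary file. It merges two neighbouring tokens when the joined form is a known word and at least one of the pieces is not a word on its own.

```python
def rejoin_split_words(text, vocabulary):
    """Merge adjacent tokens whose concatenation is a known word.

    Hypothetical helper: a merge only happens when at least one of the
    two fragments is NOT itself in the vocabulary, so "NEW JERSEY" is
    left alone. Punctuation attached to tokens is not handled, and runs
    of whitespace are collapsed to single spaces.
    """
    known = {word.lower() for word in vocabulary}
    tokens = text.split()
    result = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            left, right = tokens[i], tokens[i + 1]
            joined = left + right
            has_fragment = left.lower() not in known or right.lower() not in known
            if has_fragment and joined.lower() in known:
                result.append(joined)
                i += 2  # both fragments consumed
                continue
        result.append(tokens[i])
        i += 1
    return " ".join(result)


# Assumed toy vocabulary, just enough for the examples in the post.
VOCAB = {"new", "york", "and", "jersey", "airports", "international"}
cleaned = rejoin_split_words("NE W YORK AND NEW JERSEY airp orts Inte rnational", VOCAB)
```

This only repairs words split into two pieces and depends entirely on the quality of the word list, so it is a cleanup step rather than a fix for the underlying extraction.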
  2. cchaski

    SCRIBE File As Text and Unicode

    Thanks, Ryan! I made sure that the Russian document was encoded in Unicode (not UTF-8, just plain old Unicode on the Windows options) and SCRIBE worked to extract the text. I am still working on the Chinese document (a docx document), but it is still coming through as half-width boxes. Is there a way to error-trap this problem, i.e. check the encoding of a file before extracting text? Thanks, Carole
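A check of the kind asked about here could be sketched outside FileMaker as follows. This is a minimal illustration, not a Scribe feature: `guess_encoding` and its return labels are hypothetical names. It looks for a byte order mark (the Windows "Unicode" save option usually writes UTF-16 with one), then tests whether the bytes are valid UTF-8. A docx file is a ZIP container rather than plain text, so the sketch only flags it instead of guessing a text encoding for it.

```python
import codecs

# UTF-32 marks are tested before UTF-16 because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes) -> str:
    """Return a best-guess label for the raw bytes of an uploaded file.

    Hypothetical helper. Possible results: "zip-container", a Python
    codec name ("utf-8-sig", "utf-16", "utf-32", "utf-8"), or "unknown".
    """
    if data.startswith(b"PK\x03\x04"):
        # ZIP local file header: docx and similar container formats.
        return "zip-container"
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Likely a legacy code page (e.g. cp1251); cannot tell which one.
        return "unknown"
```

A script could call this on the uploaded file's bytes and refuse to extract, or warn the user, whenever the result is "unknown".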
  3. cchaski

    Data Analysis Intro

    This is a most welcome forum: thank you jbante very much for starting it! I have been using FileMaker for computational linguistics since 1995 because it was the only database I could find that was truly cross-platform (Mac/Windows) at the time (MySQL came later). I use SPSS for my statistical analysis, and for some of my modules I have been able to implement the magic numbers (i.e. coefficients) from SPSS in the FileMaker scripts. But what I REALLY want is linear discriminant function analysis with two-class classification INSIDE FileMaker. I do not know how to code the math, so I end up exporting into Excel, then importing into SPSS. I have not used R. I can do some programming in Python but I have not really used numpy, although I am interested in this option too. DTREG is a great machine learning analysis tool written in C, and I have wondered if it would be possible to write FM plugins from DTREG, but I don't know how to do this. I have stayed with SPSS because I find that it is more precise than other tools; i.e. I get better classification accuracy with SPSS discriminant function analysis than I do with other tools. I am using IWP (still using 12 Server Advanced and 12 Pro) so that my modules can be web-accessible to research affiliates in other countries, so the exporting is a pain (have to use Remote Scripter, etc.). If anyone can help me with thi$, plea$e contact me at cchaski at linguisticEvidence.org or 302-856-9488 as I definitely need help and I am willing to pay for the consulting. Many thanks, Carole
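For readers wondering what "coding the math" involves, here is a minimal sketch of two-class linear discriminant analysis (Fisher's discriminant) in plain Python with no numpy. It is not SPSS's procedure: it assumes equal prior probabilities and places the cutoff midway between the projected class means, and all function names are made up for the illustration. The fitted weights and threshold play the same role as the coefficients mentioned above, so in principle they could be copied into a FileMaker calculation.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def scatter_matrix(rows, mean):
    """Sum of outer products of (row - mean) with itself."""
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mean)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("within-class scatter matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_lda(class0, class1):
    """Return (weights, threshold) for a two-class Fisher discriminant."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    s0, s1 = scatter_matrix(class0, m0), scatter_matrix(class1, m1)
    within = [[a + b for a, b in zip(r0, r1)] for r0, r1 in zip(s0, s1)]
    weights = solve(within, [b - a for a, b in zip(m0, m1)])
    midpoint = [(a + b) / 2 for a, b in zip(m0, m1)]
    # Assumption: equal priors, so the cutoff is the projected midpoint.
    threshold = sum(w * x for w, x in zip(weights, midpoint))
    return weights, threshold


def classify(x, weights, threshold):
    """Return 1 if x projects above the threshold, else 0."""
    return 1 if sum(w * v for w, v in zip(weights, x)) > threshold else 0


# Invented toy data: two features per case, two well-separated groups.
group0 = [[1.0, 2.0], [2.0, 1.0], [1.5, 1.5], [2.0, 2.5]]
group1 = [[6.0, 7.0], [7.0, 6.0], [6.5, 8.0], [8.0, 7.0]]
weights, threshold = fit_lda(group0, group1)
```

Once the weights are known, scoring a new case is just a weighted sum compared with the threshold, which is simple enough to express as a calculation field. Results may differ from SPSS output wherever SPSS applies unequal priors or a different scaling of the coefficients.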
  4. Hi, I use a lot of 360Works plugins happily, but I have found something I can't figure out with the Scribe FileAsText function. Here's the set-up:

       • FM 12 Server Advanced, Middle Eastern version from winSoft (I need Arabic text, among others, for this db)
       • FM 12 Pro Advanced, Middle Eastern version from winSoft (I need Arabic text, among others, for this db)
       • IWP-hosted database on a Windows 2008 Server; SuperContainer is installed on this server. SC is working fine.
       • Text fields are set for Unicode storage.

     I use SuperContainer to browse and upload a file with Russian text. I use a script that contains error trapping and the two essential functions:

       #extract the text into a variable
       Set Variable $text; Value: ScribeFileAsText(uploadedFile)
       #put the extracted text into the text field with Unicode storage
       Set Field (unicodeTextField; $text)

     When I run this script to "extract the text," I get no error. Instead, I get gobbledygook: what looks like half-width dread Unicode boxes interspersed with numbers, Roman letters, question marks and other alphanumeric symbols. It is just a mess. Hex code? It is certainly something I've never seen before. However, if I copy and paste the Russian text into the unicodeTextField, it looks fine: good Russian Cyrillic text. So that makes me think that the function ScribeFileAsText is not working with Unicode?

     I have also tested this with Chinese. Again, big problem: the extracted text looks like half-width Unicode boxes, not good Chinese ideographs. This also happens when I paste in the text from a Word docx. I have also tested this with Korean. No problem: the extracted text is good Korean hangul.

     Any ideas how to fix the Scribe output of FileAsText? Are there settings on the FM server for special fonts that I need to set? Or on the Windows server? I use a Toshiba laptop to interface with the server and it can show all the writing systems I've mentioned just fine in Word docx (Roman, Cyrillic, Chinese and Korean).

     Many thanks, Carole
  5. Hi, I have a database hosted in IWP that uses SuperContainer for people to upload documents. The process works fine when I use FMPA 12 to access the hosted database, but when I go through the web browser (the IWP approach), the process does not work. A button "Upload Doc" calls a script that goes to the upload layout, generates the SC id code for the record, and shows the webviewer with upload and delete buttons. This works fine when I am using FMP to access the database, but it does not work when I use IWP. When I click the button in IWP, nothing happens. Any ideas? Thanks in advance!