begreen Posted July 24, 2004 Posted July 24, 2004 Hi, Newbie here. I have to import many text files on a regular basis from a webform and want to extract data from each of them. The text files each contain a series of comma delimited text. There may be several of these text files in the folder at the time of import. Right now, I import them and get the text in a unique field per record. I would now like to add several fields that extract the text based on its position relative to the commas. For example: Record #1 has "My","dog","has","fleas", record #2 has "So","does","my","cat" and so on. I want each word to go to a separate calculated field. Is patterncount the best way to extract each word? Are there any good examples of how to use this? Thx, begreen
Lee Smith Posted July 24, 2004 Posted July 24, 2004 Hi begreen, Welcome to the Forum. As you might suspect, this wouldn't be a new problem to users. You will find that if you first do a search of the Forum using a keyword it will often provide the answer without having to ask a question. The Search Mechanism is somewhat hidden, but it is in the Pull down menu at the top right of your screen, labeled "Search and Active Topics. I just performed a search for Pars (short for Parse and Parsing), and it returned 200 hits. I also did a search for Extract and it return 200 hits also, including this recent post, that would be a good place to start, Recent Post HTH Lee
Fenton Posted July 24, 2004 Posted July 24, 2004 From the example given, this is just comma-delimited text. If the data is always in the same order, with no missing cells (ie., an empty value still has a placeholder), then it's a straight import into FileMaker, as separate fields in a record. No need to import into one field then take apart, comma-delimited import does that already. If the data is not as above, then I don't see how you could "parse" it, which requires some kind of "label" (like "name: Fenton Jones", "phone: 619-692-1529", etc.). Otherwise how can you tell what's what. Is there more info on this?
begreen Posted July 24, 2004 Author Posted July 24, 2004 Thanks Lee. I did many searches and didn't come up with good links. (never did on "parse" - good tip. the link to the recent post led to the tutorial. I'll start digging into it now.
begreen Posted July 24, 2004 Author Posted July 24, 2004 Hi Fenton, If there was only one text file at a time, I would do a straight import. But there are often several text files in the folder. These come in via an email program and are stored in a folder until a script triggers FM to look in the folder, import the files, then another applescript moves the files into a "done" folder. The info in the text files is coming from a webform and is always in the same order. If a webform field is blank, then the text file shows ,"", for that field. I am going to dig into the tutorial that Lee pointed me to. This is not intuitive to do the first time. Are there any books that you could recommend? - begreen
Fenton Posted July 24, 2004 Posted July 24, 2004 Ah, AppleScript. It's not obvious, as your OS says Windows XP. I'm fairly sure you can read the text files, (convert to tabbed text?) and write them directly to FileMaker with AppleScript. Some people (BruceR) could write such a script off the top of their heads. But I'll have to do it after lunch -) You can read the "parsing" posts, but this is a different type of parsing entirely.
begreen Posted July 24, 2004 Author Posted July 24, 2004 I have both platforms working, but I'm using the Mac for the scripting because AppleScript is easier for me. You did give me an idea though. I could set up a system that took the textfiles into a folder and gave them unique numbering scheme, then (I need help here) told FileMaker to import text1.txt, then text2.txt, then the next until all done. Then Applescript could move them into a "done" folder. What I would need is a way for Applescript to tell FM to work through a list of numbered text files one at a time. On the parsing front I have been reading suggestions and a tutorial. Even if I don't use this, it's a good learning experience. I think I'm getting close using the middle text function together with the position function. So far it looks something like this: Middle ( source ; Position ( source ; "," ; 1 ; 2 ) +2 ; (Position ( source ; "," ; 1; 2 ) -3) - Position ( source ; "," ; 1; 1 ) ) I'm trying to eliminate the , and "" from the calculated field. But it only works part time. It's ok for "My","dog","has","fleas' (calculates: has), but not for "So","does","our","cat" (calculates: our"). I have to step out, but will return to this later to solve my error.
Fenton Posted July 24, 2004 Posted July 24, 2004 I'm back, but haven't done anything (story of my life -) I don't think you need to do anything to the files, as you're just wanting the lines of text they contain (right?). As I said, you can set an entire record of FileMaker from AppleScript, with just a list of the data; no need to pass control to FileMaker and Import. AppleScript also easily walks through a list of file in a folder, no need to rename them to keep track. The only real question is how a comma-separated line of a text file becomes a AppleScript list of data. They look about the same, but I don't know for sure. Another question is whether there is more than 1 line (1 record) of text in each of the text files, and, if so, what's the "end of line" character. Perhaps you could post a folder of a few of these little files, so we can see.
Lee Smith Posted July 24, 2004 Posted July 24, 2004 Hi begreen, I want to also point out that sometimes the easiest way to hand the mixed up text, is to work on it in a Text Editor such as BBEdit or TextWrangler on the Mac. The have some incredible tools for finding and replacing text, using Grep, Perl, and POSIX. If you can share some examples of the text, maybe there are better ways to skin that cat. HTH Lee
Ugo DI LUCA Posted July 24, 2004 Posted July 24, 2004 From what I read, you'd simplify the process by substituting the quotes, as a first step. Then your source file ("My","dog","has","fleas') would read My, dog, has, fleas. Parsing the third word would be Middle(source, Start, Length) with Start = Position(source, ",",1,2)+1 and Length = Position(source, ",",1,3)-1-Position(source, ",",1,2) If you want to trim remaining leading spaces, use the Trim( ) function so that your final calc will look something like : Trim(Middle(source,Position(source, ",",1,2)+1,Position(source, ",",1,3)-1-Position(source, ",",1,2)) It's not clear exactly what your goal is (always the 3rd word ?), but if you were to extract each word, you'd probably use a script with a loop and an incrementing counter. Trim(Middle(Tricksource,Position(Tricksource, ",",1,gCounter)+1,Position(Tricksource, ",",1,gCounter+1)-1-Position(Tricksource, ",",1,gCounter)) where Tricksource is Source &"," and used to introduce a fake ending ",", as the calc above targets the next ",". HTH
Fenton Posted July 24, 2004 Posted July 24, 2004 OK, I've got something. It's not real pretty. I had a bit of a struggle with the quotes (basically getting rid of them). It's all AppleScript. It would work on plain .csv files only. The "fixed folder" version checks for that. It moves "done" files; it doesn't move non-csv files. You guys still on version 6 could take the included AppleScript and put it in a FileMaker 6 file. The only thing that's needed (AFAIK) is 3 text fields (and the included folders).
Fenton Posted July 24, 2004 Posted July 24, 2004 Oops. It broke when there was an empty "cell" at the beginning or end. I modified that, so here's a new one. [Removed the file. See later post.]
Fenton Posted July 24, 2004 Posted July 24, 2004 Or you could do the "file name switch," and call an FileMaker script to import the file (do script FileMaker script "Import csv"). Or run it all from a FileMaker Loop, getting the file names, and only using AppleScript to do the name switch in between. I really thought it was going to be easier to convert .csv to an AppleScript list. That's why I went the full AppleScript route. I think it would be the fastest, but perhaps not the most troublefree.
Fenton Posted July 25, 2004 Posted July 25, 2004 Oops again. The original file had a "hard-coded" path to the import folder, so it would only work if you edited that path to your own. This one gets the location from the position of the database file, so it should work for others. I began to do the "switch names then import" routine. But I ran into serious timing issues; basically the files are so small that the Finder cannot rename them fast enough between imports, so FileMaker can't find the file. I believe it would work if you split it into separate FileMaker scripts, rather than running it from one AppleScript and one FileMaker script. csvImportText.zip
begreen Posted July 26, 2004 Author Posted July 26, 2004 Thanks Fenton. I'll take a look at this tonight. I should be able to easily rename the files to end with .csv. I've tried Uno's suggestion. It partially works, but not all the way. I am getting text that spans the comma and in his example I got the 4th word, not the 3d. But, getting closer. Here is a sample of the email I get from the form. I will post the text file attachment that comes with the email to the list as well. email: From: [email protected] Date: Sun, 08 Feb 2004 10:44:12 -0800 To: [email protected] RELEASENUMBER: x45 LANGUAGE: English SYSTEMINFO: PowerMac G5 OS: Mac OS 10.2.X QUICKTIME: 6.1 BUG TITLE: Could not access help files DESCRIPTION: Can't open the Dogmeat Help files from x45. STEPS: 1) Set Safari as your main browser. 2) Open Dogmeat. 3) Go to Help > open help 4) Enjoy the error. SEVERITY: 1 ERRORMSG: Could not access help files. HARDWAREINFO: Dual 2 GHz G5 Kennsington Optical Elite mouse SYSMEM: 1.5 gig OTHERCOMMENTS: this is a test message comment 26481_200428104412.txt
begreen Posted July 26, 2004 Author Posted July 26, 2004 Hi Uno, This is getting closer. I'm getting the 4th word with your example. However, when the first field info between commas is an email address, then it does not work as well. I'll attach my test as an example. This has been a busy weekend, so I haven't had time to revisit the problem, but will on Monday morning. - begreen fm7test.zip
Fenton Posted July 26, 2004 Posted July 26, 2004 First, I apologize to anyone who had trouble with my earlier file (seems few). I used a non-standard (though common) AppleScript command, "list files", which is only available via the Satimage Scripting Addition (OSAX). Anyone serious about AppleScript should have this, it's free: http://www.satimage.fr/software/downloads/Satimage261.sit They also have an entire free script editing app, Smile, but that's another story. The above OSAX has a great "change" command that can use grep in AppleScript (for those attempting to avoid Shell command line as long as possible -) Also some astounding array and math stuff, which is beyond me. I used that command instead of painstakingly building the names from the folder name & the file name. I didn't realize until later that it was not in Standard Additions (probably will be some day). Also, I thought I'd redo my file to run the process from FileMaker. This is more like what you'd do if you used a plug-in. It may be more reliable. It lets FileMaker just Import the .csv files. AppleScripts run before and after the import, to get the files, set the file, rename the file and move it. You have to do it with this "piecemeal" method. If you try and run it all from AppleScript, calling a FileMaker Import, it seems to run into timing problems, as neither waits for the other. This method doesn't use the Satimage Addition. You end up having to put the names together anyway. (I could redo the earlier one also, but probably won't). All the AppleScripts are inside FileMaker, but I included the originals, if you want to look at/edit them in Script Editor. I don't really see the point in "parsing" a .csv file. FileMaker can import those things. It looks like we've got another far more difficult form to parse. From what I see, you might just want to hard-code the thing, due to the multiple line entries for some of them, like "STEPS" and "HARDWAREINFO". It would be possible to come up with a more generic Loop I suppose. But, 1. It would be MUCH slower, and, 2. Only a few entries are funky like that. The questions, as always, are: Do you know what ALL the possible labels are going to be? Are they always all present, in the same order? Can the STEPS all go into 1 field? ImportwFM.zip
begreen Posted July 26, 2004 Author Posted July 26, 2004 Ahh. Yes, I got the SATII error and thought it was me doing something wrong. I'll look at the sat addition. Downloading the new attachment now. Agreed that parsing shouldn't be necessary, but I have been working within my meager skills. Your AppleScript is above what I have tried. Thanks to the help from all, I am moving at a much more rapid and direct pace now. To answer your questions: Do you know what ALL the possible labels are going to be? Yes Are they always all present, in the same order? Yes Can the STEPS all go into 1 field? Yes Because of the predictable nature of the text (csv) attachments, I think you are correct in that hard-coding it should be possible. I'll try the new .FM7 file and will see how it works now.
Fenton Posted July 26, 2004 Posted July 26, 2004 I am confused about what .csv files have to do with the Form (which is no way a csv format). You didn't even mention the form earlier. If the data is mixed together, that's 3rd story. Which is it? It matters. Post or attach an example of an actual text file.
begreen Posted July 26, 2004 Author Posted July 26, 2004 There is an actual text file posted a couple of threads back in my post to Uno. The webform mails me when a report comes in and has the text file as an attachment. (I have no control over this process). I've been working with the sample that you sent and have modified it to include the fields in my text files. However, it never gets to the import stage as I step thru the script. My bad I'm sure.
Ugo DI LUCA Posted July 26, 2004 Posted July 26, 2004 See your file corrected. I like bith the energy and the way Fenton is trying to put this up though. test.zip
begreen Posted July 26, 2004 Author Posted July 26, 2004 Ah yes. That is very instructional. Now I see the correct pattern. I didn't realize that one could embed the substitute function within the main calculation. That is a neat trick. This all has been very helpful. Thanks! PS: I agree that Fenton has a more elegant approach. He has great AppleScript skills. I am trying to make that work as well. This has been a tremendous learning experience.
begreen Posted July 26, 2004 Author Posted July 26, 2004 Fenton, I'm getting close, but it appears to be stuck in a loop. Here's the project. I've changed the TextImport.csv to match my current fields. begreen_ImportwFM.zip
Fenton Posted July 27, 2004 Posted July 27, 2004 OK, I see, we're NOT really talking about a form with labels, we're still talking about a .csv file(s) that have the "fields" from a form (but without the labels, or column names). So the Import .csv is the way to go, not parsing. There are however 19 fields to your csv "form" (when I opened one with FileMaker); 2 more than you earlier "form." The last one is the "name of the file.txt." But the next to last has either "Yes" or "No," and I have no idea what it is. Names like "unknown", "field_2" and "field_3" are not very useful to those of us trying to line them up. We are close to a solution; but it would be really good if you could give us at least one fully populated example, with the correct names and field order on a layout (same as the .csv). Then it's pretty much done.
Fenton Posted July 27, 2004 Posted July 27, 2004 Now I'm going to contradict myself. I guess we're talking about EITHER an text form (received via email) OR an attached .csv text file, both with the same information, but in different formats. You could process either one (or process both and compare them -) For the email, you could get the POP3it plug-in. It allows you to receive email directly into FileMaker. It can be set to automatically check the email at an interval (that part's not working for version 7 yet :-(, but the rest is). It can run a script afterwards, which would process the form. I have such a parsing script; but it may need tweaking. There's also Dacon's Mailit plug-in, which does both SMTP and POP3, but it doesn't automatically check either (I don't think). In other words, do it all with no intervention. You would want a dedicated email address for this. This method has the advantage that it's isolated from other emails, and it just dealing with text, not files. Or, alternatively, run the process to import the .csv files. That would require the AppleScript, which we've pretty much got. How it's triggered, and the location of the text file attachments is another story. I imagine this could all be automated, via Rules in Mail, or some such. I've done very little with Mail myself, as I use Eudora (and don't receive forms).
Fenton Posted July 27, 2004 Posted July 27, 2004 Oh, the Import script in your file doesn't work, because I think you broke the scripts by importing them into a file before you had the fields and/or table name in place. Look at the field names, several are surrounded by /* field name */ <- comment tags that are FileMaker's way of saying, "I don't know what this is! But I'll leave 'til you figure it out." Often fixed by just removing them. It'll tell you if they don't exist. But I'm holding off on any more fixes until you get the field names cleaned up and the form (csv) fields placed in the correct order on a layout, so we know what we're doing.
begreen Posted July 27, 2004 Author Posted July 27, 2004 My FM form was just a test. Sorry for the rough edges. I left in the extra fields, but have deleted them now. I've posted t again here. Now there is only one extra field labeled "title" for info to be added later. Yes, the .csv text file is the "imported form" without labels or columns. It's just comma separated text. The TextImport.csv in the new zipped file has all the final fields. I am including a text version that has the field order of the information in the file. Sorry if the info was confusing. I am doing this in between several other tasks today. bug_imp.zip
Fenton Posted July 27, 2004 Posted July 27, 2004 This seems to work. I added a little extra record routine, so that the "import" script (which is actually multiple Imports) presents the new records when it's done, like any other import would. I also added a routine to delete blanks; there was at least one blank line in each .csv file. BugImporter.zip
begreen Posted July 27, 2004 Author Posted July 27, 2004 Awesome! I now see the /hey dude what field?/ marks. I thought that the /field/ was a Mac specific requirement of FM7 that I had missed. I've only been in FM 7 for a week. From this point I can handle setting up a timer to trigger FM to start the script. This is really remarkable. Thank you for all the help. i have learned so much between you and Uno. It's very exciting. I will now try to digest the rest of the tutorials and catch up to a semi-neophyte level. FM is such a powerful tool that I am simply amazed of what it is capable of in the right hands.
Recommended Posts
This topic is 7424 days old. Please don't post here. Open a new topic instead.
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now