mistery Posted October 17, 2012 Hello, I am posting a FileMaker file that I made to show how I am trying to separate data pasted from the online Yellow Pages into separate fields. The problem is that the text has a lot of variations, and that is why I can't figure this out. I am a complete newbie here and wanted to contribute to this forum by posting an example that I am trying to solve, in the hope that people can help me write the calculation to make this happen. I have included the URL for each record to show where it came from, and the information pasted from that page resides in the field called "yellowpage copy". I did the first record manually, but I didn't have a clue about the others. I included a button to execute a calculation, though there are probably many ways to do this. I also have an instruction field to help us all understand the thinking behind how each record should be parsed. Please understand that parsing text from the Yellow Pages is daunting to a new person, and I thought this would be a good example with a lot of variations for the good of all of us in the forum. Thank you very much. Yellow pages text parsing example.fmp12.zip
Lee Smith Posted October 17, 2012
I would probably use the Web Viewer to automate this, but you can also parse the text from "yellowpage copy" using a calculation (either in the field, or as a script step) such as:

// extract paragraph
Let ( [
  n = 3 ;
  t = "¶" & Yellow pages text parsing example::yellowpage copy & "¶" ;
  Start = Position ( t ; "¶" ; 1 ; n ) + 1 ;
  End = Position ( t ; "¶" ; 1 ; n + 1 )
] ;
  Trim ( Middle ( t ; Start ; End - Start ) )
)

Here n = 3 selects the third paragraph. Change it to n = 4 (and so on) to get the next paragraph. You will need to play with this to adjust for each paragraph. I prefer to do this as a script step, as it is more forgiving. :)
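For comparison, the same paragraph lookup can probably be written more compactly with the built-in GetValue function, which returns the nth return-delimited value of a text. This is only a sketch under the assumption that the field reference matches the one used in the thread; it is not taken from the attached file.

```filemaker
// Sketch: return paragraph n of the pasted text using GetValue.
// The field reference is copied from the thread; adjust it to your own file.
Let ( [
  n = 3 ;
  t = Yellow pages text parsing example::yellowpage copy
] ;
  // GetValue returns the value without its trailing return
  Trim ( GetValue ( t ; n ) )
)
```

Because GetValue counts values rather than return characters, it should not matter whether the last line of the pasted text ends with a return.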
mistery Posted October 17, 2012 Thanks for your answer, but I don't see how it is possible to automate this with a web viewer. I don't understand that at all. Thanks
Lee Smith Posted October 17, 2012 Actually, these are two different ways of approaching your need. The calculation I posted can be used either in a field, or in a script to populate a field. A second approach would be to use the Web Viewer to obtain the source information from a webpage. There are pros and cons to this approach; one of the cons is that any change to the webpage will affect the results. Since you show yourself as a novice, I recommend that you use the first approach.
mistery Posted October 18, 2012
I worked on some scripts and that was a good start. Now I have some problems with record three. I have written my questions there, but here they are:
1. I have a script to get the business and the address, but the city, state and zip are linked together. I was able to extract the zip and state, but they still remain in the city. I don't know how to get rid of the state and zip in the city field.
2. In "Ithaca, NY > Mason Supplies & Materials > Dolph Buzz quarry" I don't know how to extract the "Mason Supplies & Materials" that sits between the two ">" symbols and send it to the category field.
3. If it says "Local", the phone number follows. How do I get the 13 characters after the word "Local"?
4. If it says "Visit", the next paragraph is the website. It is always the next paragraph after the word "Visit". I don't know how to configure this.
5. The same holds true for the word "Email": the next paragraph after it is the email address.
I have enclosed the amended file. Thanks
Yellow pages text parsing example.fmp12 2.zip
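As an illustration of the city, state and zip question, here is a minimal sketch that splits a line of the form "City, ST zip". It assumes the line always contains exactly one comma after the city and that the state is a single word; the sample string is taken from the records quoted in this thread, and the variable names are made up for the example.

```filemaker
// Sketch: split "Ithaca, NY 14850-3247" into city, state and zip.
// Assumes one comma after the city and a one-word state abbreviation.
Let ( [
  line = "Ithaca, NY 14850-3247" ;
  comma = Position ( line ; "," ; 1 ; 1 ) ;
  city = Trim ( Left ( line ; comma - 1 ) ) ;
  // everything after the comma: "NY 14850-3247"
  rest = Trim ( Middle ( line ; comma + 1 ; Length ( line ) ) ) ;
  state = LeftWords ( rest ; 1 ) ;
  zip = Trim ( Middle ( rest ; Length ( state ) + 1 ; Length ( rest ) ) )
] ;
  city  // return state or zip here instead to fill the other fields
)
```

In a script, the same Let could be used in three Set Field steps, each returning a different variable. If a line has no comma, the city comes back empty, so that case would need its own handling.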
mistery Posted October 18, 2012 Hi, I can't get two things. 1. How do I get the text between two words when I know the words before and after it? 2. How do I get the next line after a specific word, i.e. the word "Visit"? Thanks
LaRetta Posted October 19, 2012 Hi Mistery, well, take a look at this idea (attached). It will not work in your first example, where the same symbol appears on both sides, unless it is modified to let the user specify the second occurrence of the same character (it wouldn't be that difficult to modify, though). Also, without recursion, it will not replace multiple occurrences of those strings within the same text field. You can also use xWords, but if you have symbols you wanted to keep on either side, you would lose them. I suppose one could use Substitute() but ... well, this is what comes to me tonight, LOL. The attachment contains both v7 and v12 files. GetTextBetween.zip
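For readers without the attachment, a "text between two markers" calculation might look roughly like the sketch below. It is not the contents of GetTextBetween.zip; the variable names are hypothetical and the sample string comes from the thread. It copes with identical opening and closing markers by searching for the closing marker only after the end of the opening one.

```filemaker
// Sketch: return the text between the first "before" marker and the next "after" marker.
// Not LaRetta's file; names and sample text are illustrative only.
Let ( [
  source = "Ithaca, NY > Mason Supplies & Materials > Dolph Buzz quarry" ;
  before = ">" ;
  after = ">" ;
  openPos = Position ( source ; before ; 1 ; 1 ) ;
  startPos = openPos + Length ( before ) ;
  // start looking for the closing marker just past the opening one
  endPos = Position ( source ; after ; startPos ; 1 )
] ;
  If ( openPos > 0 and endPos > 0 ;
    Trim ( Middle ( source ; startPos ; endPos - startPos ) )
  )
)
```

With the sample line this should give "Mason Supplies & Materials". If either marker is missing, the If has no default result and returns empty.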
LaRetta Posted October 19, 2012 Wow. I had just taken a look and saw another very recent post. Really, you should keep to the same thread, so in future please stick it out there. If something is unclear, just ask, and if you don't get a response, just BUMP it (post again so it jumps to everyone's attention). :-) And if you need the second or a different occurrence on either side, let me know.
mistery Posted October 19, 2012 Thank you LaRetta. I was able to get it done with the same symbol by using the Substitute function and changing the symbols into paragraphs. Thanks for alerting me to my second posting attempt. Your answer really helped me.
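The Substitute approach described above can be sketched as follows. The actual script was not posted, so this is only an assumption about what it looks like: each ">" is turned into a return, and the second resulting value is the category.

```filemaker
// Sketch: split the breadcrumb line on ">" and take the middle part.
// The sample line is from the thread; the real script may differ.
Let ( [
  line = "Ithaca, NY > Mason Supplies & Materials > Dolph Buzz quarry" ;
  parts = Substitute ( line ; ">" ; ¶ )
] ;
  // value 1 is the location, value 2 the category, value 3 the business
  Trim ( GetValue ( parts ; 2 ) )
)
```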
Steve E. Posted October 20, 2012 Mistery: As an "entry level" FMer, you might want to spend some time just playing around with the "If", "Case", and various "Left", "Middle", and "Right" functions, and with the "If" and "Else If" script steps, so you get a feel for these. If you already have, ignore this message. I'm on FM 11 and can't open your file.
mistery Posted October 20, 2012 How do I get the very next line after a word that I know? In the file, if the word "email" shows up, it is always followed by a colon, like this: EMAIL: [email protected] There is always a new paragraph, and the address is just the next line. Does someone know how to do that? Thanks
Lee Smith Posted October 21, 2012
Having a prefix will help you for this one. However, your description does not match what is in the field.

EMAIL: [email protected]

"EMAIL:" is really "Email:" (accuracy is a must when you are identifying key information like this). If there isn't a return after the email, it will also break. Try this calculation:

Let ( [
  text = Yellow pages text parsing example::yellowpage copy ;
  prefix = "Email:¶" ;
  suffix = ¶ ;
  start = Position ( text ; prefix ; 1 ; 1 ) + Length ( prefix ) ;
  end = Position ( text ; suffix ; start ; 1 )
] ;
  Middle ( text ; start ; end - start )
)
mistery Posted October 21, 2012 Thanks Lee. Somehow it didn't work. I have a URL with the "Email:" http://www.yellowbook.com/profile/american-arborist-corp_1635550376.html?classId=0 I don't know what's wrong.
mistery Posted October 21, 2012 Lee, I tried your calc in the email script with record 4 and it didn't work. I am sending the file. Yellow pages text parsing example 2.fmp12.zip
Lee Smith Posted October 21, 2012 That is because there isn't a paragraph return following it. "EMAIL:" is really "Email:" (accuracy is a must when you are identifying key information like this). If there isn't a return after the email, it will also break.
mistery Posted October 21, 2012 Lee, I just checked my file. There is a return after "Email:" in record 4. Just to be sure, I copied it and pasted it back with a return from my keyboard. When I copied it into Word, it showed a paragraph return. I screen-captured it to show you.
Lee Smith Posted October 21, 2012 I don't use Word. I use TextWrangler because of its tools. In TextWrangler, none of the records that have either an email or a "Visit" entry have a return after the address.
mistery Posted October 22, 2012 I don't have TextWrangler, but what should I do to make it work?
Lee Smith Posted October 22, 2012 Have you tried adding a return to see if that makes it work?
Lee Smith Posted October 22, 2012 "I don't have text wrangler" Go here and download it for free.
mistery Posted October 22, 2012
Ithaca, NY > Mason Supplies & Materials > Ithaca Stove Works Ithaca Stove Works 414 N Meadow St Ste A Ithaca, NY 14850-3247 Local: (607) 272-2650 0 Be the first to review Visit: www.ithacastoveworks.com Email: [email protected]

That is what my record 4 copies. This is the script I tried, with the return:

Let ( [
  text = Yellow pages text parsing example::yellowpage copy ;
  prefix = "Email:¶" ;
  suffix = ¶ ;
  start = Position ( text ; prefix ; 1 ; 1 ) + Length ( prefix ) ;
  end = Position ( text ; suffix ; start ; 1 )
] ;
  Middle ( text ; start ; end - start )
)

But it returns nothing.
Lee Smith Posted October 22, 2012 Here is the file with two scripts, one for the email and the other for the URL. Again, you must add a return to the end of the address when one is missing. Yellow pages text parsing example 2.fmp12.zip
mistery Posted October 22, 2012 Thank you Lee! I think it works fine, and I discovered the problem. I tried using the scripts with examples copied from the internet exactly as they were. Copied that way, the addresses have no return after them, except in the case of "Visit" when it is followed by an email: then the web address naturally has a return after it, because "Email" comes next. Unless I add the return to the new examples, the scripts don't work. In the example I am sending you, I changed the script names from "set email" to "email" and from "set visit" to "website", and I put notes in the notes field to show what I am getting at. The problem is that natively there is no return after the "Visit" address unless it is followed by an email, and when there is both an email and a web address, there is no return after the email in the form that I can copy. I'm not sure what to do about it. yellow pages with and without.fmp12.zip
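One possible way around the missing trailing return is to avoid searching for a closing return at all. The sketch below is only a suggestion under stated assumptions, not the script from the attached file: it takes everything after the label and reads values with GetValue, which does not need a return at the end of the text. It also accepts the address either on the same line as the label or on the next line. The variable names are hypothetical.

```filemaker
// Sketch: get the value that follows a label such as "Email:" or "Visit:",
// whether or not the pasted text ends with a return.
Let ( [
  source = Yellow pages text parsing example::yellowpage copy ;
  label = "Email:" ;
  labelPos = Position ( source ; label ; 1 ; 1 ) ;
  // everything after the label, to the end of the text
  rest = Middle ( source ; labelPos + Length ( label ) ; Length ( source ) ) ;
  sameLine = Trim ( GetValue ( rest ; 1 ) ) ;
  nextLine = Trim ( GetValue ( rest ; 2 ) )
] ;
  Case (
    labelPos = 0 ; "" ;                     // label not found
    not IsEmpty ( sameLine ) ; sameLine ;   // "Email: address" on one line
    nextLine                                // address on the line after the label
  )
)
```

Changing label to "Visit:" should give the website in the same way, so one calculation could serve both scripts.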
Lee Smith Posted October 24, 2012 Hi mistery, My computer is at the repair shop (it crashed on Sunday), so I'm unable to look at your file. I'm hoping someone else will jump in and help in my absence. Lee
mistery Posted October 24, 2012 Oh, I'm very sorry to hear that, Lee. I will wait for you. I hope it is not a big expense and that it gets back to normal. Good luck with it.
mistery Posted November 5, 2012 I would like to know more about getting the Web Viewer to parse the addresses, if I can.