hownow
Posted February 2, 2015

Hi,

I have been working with my database and have run into a problem. I collect large blocks of text by copying whole web pages. I cannot scrape the pages normally because the source code has some kind of encryption scheme that gets in the way.

I have copied the text of the web page, shown below, and placed it in a field called "webpageText". I want to capture ONLY the text in RED below and send it to a field called "portionText".

The layout is basically the same on each web page. The common denominator on all the pages is that the start text is "Ads Company Info" and the ending text is "Map".

I found this script:

    Set Field [ YourTable::MyINFO ; Trim ( Substitute ( YourTable::YourTextField ; [ "I used to go" ; "" ] ; [ "Now I stay home." ; "" ] ) ) ]

But this doesn't seem to work.

Below is the web page text:

USA SIC Directory (Untill 2014) Home State City SIC Code Submit Your info Main SIC Category Real estate agents and... Business Services Educational services Membership organizations Eating and drinking pl... Miscellaneous Services... Non Engineering, accountin... Miscellaneous retail Home furniture, furnis... Wholesale trade Printing and publishing Building construction Social services Amusement and recreati... Health services Personal services Hotels, rooming houses... Building materials, ha... Construction Executive, Legislative... Nondepository credit i... Insurance agents, brok... Automotive dealers and... Communications Post Service Depository institutions Legal services Security and commodity... Motor freight transpor...
Navigation: Home > Alaska > Fairbanks > St Matthew's Episcopal Church St Matthew's Episcopal Church St Matthew's Episcopal Church located at 1029 1st Ave,Fairbanks,Alaska,USA,It is Churches company, Tel is 9074562934 (+1-907-456-2934),fax is 9074565235 (+1-907-456-5235),address is 1029 1st Ave.This company SIC code is 866107,SIC Name is Membership organizations,You can find more St Matthew's Episcopal Church contact info like fax,email,website below. Ads Company Info St Matthew's Episcopal Church SIC Code: 866107 SIC Category: Membership organizations SIC Name: Churches State: Alaska - AK City: Fairbanks Country : United States Ads Contact Info Address: 1029 1st Ave Zipcode: 99701-4351 (99701) Tel: 9074562934 (+1-907-456-2934) Fax : 9074565235 (+1-907-456-5235) Website : www.stmatthewschurch.org Email : [email protected] Map This is map of St Matthew's Episcopal Church, address:1029 1st Ave,Fairbanks,Alaska,United States. If you have error address, please submit another address using the form in the map, then search again. Company Name:St Matthew's Episcopal Church Address: 1029 1st Ave,Fairbanks,Alaska,United States Search Address: Other Related Companyies Alaska Native Minorty TV & Radio Ministry Holy Assumption Russion Ortho Silver Salmon Creek Lodge Auke Bay Bible Church Holy Assumption Russion Ortho Association Of School Boards Holy Assumption Russion Ortho traini Jack Randolph - Jack Randolph Insurance Wrangell Chamber Of Commerce Copyright © 2009-2015 Privacy policy - DMCA Policy - Contact Us
the Otter
Posted February 2, 2015

If you’re going to be doing this kind of thing often (and if you’re using FileMaker Pro Advanced), it would make sense to use a custom function. However, if you’re just looking for a one-off solution to this specific scenario, we can use the following:

    Let ( [
        startText = "¶Ads¶Company Info" ;
        sStart = Position ( fieldName ; startText ; 1 ; 1 ) ;
        sLength = Length ( startText ) ;
        sEnd = sStart + sLength ;
        endText = "Map" ;
        eStart = Position ( fieldName ; endText ; sEnd ; 1 )
    ] ; //end define Let
        Middle ( fieldName ; sEnd ; eStart - sEnd )
    ) //end Let

Let’s analyze what we’re doing here:

The Let function allows us to define variables in our calculation. By using brackets, we can define multiple variables in a single Let function. The portions of the code preceded by // are comments and thus optional, but they help us read what’s going on here.

- startText is the text you want to find first. (Note: the pilcrow character, ¶, represents a carriage return in FileMaker calculations.)
- sStart is the position where the first occurrence of startText appears in the field, starting from the first character in the field.
- sLength is how many characters are in startText.
- sEnd is the position of the first character in startText plus the number of characters in startText, so it gives us the position of the first character after startText.
- endText is the text you want to find second.
- eStart is the position where the first occurrence of endText appears in the field, starting from the first character after startText.

To get the portion of the text you want, we use the Middle function, which takes three parameters: the text string (in this case, the field) you want to search; the number of the first character you want to start with; and how many characters you want to return. In this case, the first character we want to get is the one immediately following the startText, so sEnd (i.e. sStart + sLength); and the number of characters we want is however many characters are between the two strings, so eStart - sEnd.

As an aside, you could also simplify this by using a Web Viewer and the GetLayoutObjectAttribute function to get the contents of the web page automatically, instead of copying and pasting.

Would this help?
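
For the custom function route, the same logic could be wrapped up roughly like this. Treat it as a minimal sketch rather than a tested implementation: the function name TextBetween and the parameter names sourceText, startText and endText are made up for illustration, and it assumes FileMaker Pro Advanced, since that is what lets you define custom functions. Unlike the one-off calculation, it returns an empty string when either marker is missing.

```
// Hypothetical custom function: TextBetween ( sourceText ; startText ; endText )
// Returns the text between the first startText and the next endText,
// or "" if either marker cannot be found.
Let ( [
    sStart = Position ( sourceText ; startText ; 1 ; 1 ) ;
    sEnd = sStart + Length ( startText ) ;
    eStart = Position ( sourceText ; endText ; sEnd ; 1 )
] ;
    If ( sStart > 0 and eStart > 0 ;
        Middle ( sourceText ; sEnd ; eStart - sEnd ) ;
        ""
    )
)
```

With that in place, a script step along the lines of Set Field [ YourTable::portionText ; TextBetween ( YourTable::webpageText ; "¶Ads¶Company Info" ; "Map" ) ] should fill the second field, assuming the pasted text really does have carriage returns before "Ads" and "Company Info".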
hownow (Author)
Posted February 2, 2015

Thank you for such a great explanation! I think GetLayoutObjectAttribute would work, but I thought that it just got the source code. Yes, that would be easier if it does the same thing. I am going to try that today and will get back with the results here.

Again, thanks.
hownow (Author)
Posted February 3, 2015

I can't see how to copy the text of the web page using GetLayoutObjectAttribute.