rivet Posted February 28, 2005 Posted February 28, 2005 I have a string of CSS and I would like to remove all characters in between and including the opening and closing tags, '<code>' (code could be any length of characters). I figured a recursive text substitution would do it. Any ideas?
-Queue- Posted February 28, 2005 Posted February 28, 2005 How about Set Field [yourfield; Replace( yourfield; Position( yourfield; "<code>"; 0; 1 ); Position( yourfield; "</code>"; 0; 1 ) + 7 - Position( yourfield; "<code>"; 0; 1 ); "" )] or Set Field [yourfield; Let( B = Position( yourfield; "<code>"; 0; 1 ); Replace( yourfield; B; Position( yourfield; "</code>"; 0; 1 ) + 7 - B; "" ) )]
rivet Posted February 28, 2005 Author Posted February 28, 2005 What I would like is to have it all done within the calc. I know FMP7 can now do recursive calcs, but I have not been able to figure them out yet. __ see attached CleanTest.fp7.zip
comment Posted February 28, 2005 Posted February 28, 2005 If I understand this correctly, you have a text with <some code> in it, and you would like to remove that as well as <another code>, and perhaps <yet another one> to get: "If I understand this correctly, you have a text with in it, and you would like to remove that as well as, and perhaps to get:" If so, yes - you need either a custom function or a looping script.
-Queue- Posted February 28, 2005 Posted February 28, 2005 You could make yourfield an auto-enter calc with 'do not replace' deselected. Let( B = Position( yourfield; "<code>"; 0; 1 ); Case( B; Replace( yourfield; B; Position( yourfield; "</code>"; 0; 1 ) + 7 - B; "" ); yourfield ) ) If you only want to remove the code tags themselves, a simple Substitute( ) function would also work.
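For anyone who wants to check the Position/Replace logic outside FileMaker, here is a minimal Python sketch of the looping version: find the first opening tag, find the closing tag after it, cut out the span, and repeat until nothing is left. The name `strip_blocks` and its parameters are made up for illustration; this is not a FileMaker function.

```python
def strip_blocks(text, open_tag="<code>", close_tag="</code>"):
    """Remove every open_tag...close_tag span, tags included (hypothetical helper)."""
    while True:
        start = text.find(open_tag)
        if start == -1:
            return text  # no opening tag left
        end = text.find(close_tag, start + len(open_tag))
        if end == -1:
            return text  # unmatched opening tag: leave the rest alone
        # Same idea as Replace( text; start; length; "" ) in the calc above.
        text = text[:start] + text[end + len(close_tag):]


print(strip_blocks("Some text.<code>nasty code</code> More text<code>x</code>."))
```

The single-pass calc above handles only the first occurrence; the loop is what a recursive custom function or a looping script would add.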
BobWeaver Posted March 1, 2005 Posted March 1, 2005 I'm assuming Rivet wants to remove the code tags along with all the stuff between the opening and closing tags. So, how 'bout this: Evaluate ( "\"" & Substitute(YourField;["<code>";"\"&Left(\""];["</code>";"\";0)&\""])&"\"" ) This will remove multiple occurrences all at once. Example: Input: "Here is some sample text.<code>Oh oh, here is some nasty code that needs to be removed</code> Now what was I talking about? Oh yes, I was saying<code>Ha ha more evil code stuff</code> that this text is code-free." Output: "Here is some sample text. Now what was I talking about? Oh yes, I was saying that this text is code-free."
rivet Posted March 1, 2005 Author Posted March 1, 2005 wow...that is it - thanks (in the words of Morpheus 'You are the one')
comment Posted March 1, 2005 Posted March 1, 2005 Nice one, Bob! You managed to guess what the question was AND came up with a neat trick - double score!
BobWeaver Posted March 1, 2005 Posted March 1, 2005 Thanks, Comment. I suspect we will be seeing a lot of neat stuff based on the Evaluate() function. BTW, it occurred to me that if the original text contains quotation marks, the formula will fail. So here is an updated version to correct the problem: Evaluate ( """ & Substitute(TextField;["";""];[""";"""];["
comment Posted March 1, 2005 Posted March 1, 2005 I was trying to make a more readable version of your formula, and while doing that I inadvertently solved the other problem as well: Evaluate ( Substitute ( Quote ( TextField ) ; [ "<code>" ; Quote ( "& Left (" ) ] ; [ "</code>" ; Quote ( " ; 0 ) & " ) ] ) )
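To see why the Quote/Substitute/Evaluate trick works, it can help to mimic it in another language. The Python sketch below is only an analogy under stated assumptions: `json.dumps` stands in for Quote(), the slice `[:0]` stands in for Left( ...; 0 ), and `eval` stands in for Evaluate(). The function name `strip_code_by_eval` is hypothetical.

```python
import json


def strip_code_by_eval(text):
    """Analogy of Evaluate( Substitute( Quote( text ); ... ) ), not FileMaker code."""
    # Quote the whole text as one double-quoted literal, escaping any
    # quotation marks inside it (this is what fixes the quote problem).
    quoted = json.dumps(text, ensure_ascii=False)
    # An opening tag ends the current literal and starts a new one in parentheses;
    # a closing tag cuts that literal to zero length and resumes the outer text.
    expression = (
        quoted.replace("<code>", '" + ("')
              .replace("</code>", '"[:0]) + "')
    )
    # Evaluate the generated expression with no names or builtins available.
    return eval(expression, {"__builtins__": {}}, {})


print(strip_code_by_eval('He said "hi"<code>junk "here"</code> and left.'))
```

For the input in the example, the generated expression is `"He said \"hi\"" + ("junk \"here\""[:0]) + " and left."`, so every tagged span evaluates to an empty string. As with the FileMaker formula, unbalanced tags produce an expression that cannot be evaluated.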
-Queue- Posted March 1, 2005 Posted March 1, 2005 ******* spiffy! I'll definitely be stealing that one in the future.
rivet Posted March 9, 2005 Author Posted March 9, 2005 Part II: My ultimate goal is to determine which words/characters in a field of text have been styled bold, and to tag them with my own tags. So I figured I would convert the text with the GetAsCSS function and then create a recursive custom calc that cleans out all tagging except the bold, which would be retagged with something else (i.e. '{b}' '{/b}'). This solution (from the previous post) will clean all the CSS tagging: Evaluate ( Substitute ( Quote ( TextField ) ; [ "<" ; Quote ( "& Left (" ) ] ; [ ">" ; Quote ( " ; 0 ) & " ) ] ) ) But before that I need the recursive scan to substitute the tags on all lines with a SPAN style of bold. EXAMPLE: The quick brown fox jumped over the lazy dog GetAsCSS function: <SPAN STYLE= "" >The quick </SPAN> <SPAN STYLE= "font-weight: bold;" >brown</SPAN> <SPAN STYLE= "" > fox jumped </SPAN> <SPAN STYLE= "color: #AA0000;font-weight: bold;text-decoration:underline;" >over</SPAN> <SPAN STYLE= "" > the lazy dog</SPAN> Desired Result: The quick {b}brown{/b} fox jumped {b}over{/b} the lazy dog.
BobWeaver Posted March 10, 2005 Posted March 10, 2005 Hmm, gets more complicated all the time. Well, you can use this: Evaluate( Substitute(Quote(TextField); ["<SPAN";""&Let(; ["</SPAN>";""];Case(b;"{b}"&T&"{/b}";T))&""]; [">";"";"BOLD");T=""] )) It works with the sample text you provided, but I'm not sure how this will interact with other tags that are embedded in the text.
BobWeaver Posted March 10, 2005 Posted March 10, 2005 Actually, I do know how it will interact. It won't work unless you remove all non-SPAN tags first, then run it through the function I gave. So, your general search for "<" will have to be replaced with individual functions which look for and strip each different type of tag.
BobWeaver Posted March 10, 2005 Posted March 10, 2005 ...and then again... I think this should work without interference with other tags: Let([ a=Substitute(Quote(TextField); ["<";""&Let([T="<"]; [">";">"; S=PatternCount(T;"SPAN"); X=PatternCount(T;"</"); B=PatternCount(T;"BOLD")]; Case(X;T;S AND B;"<SPANBOLD>";S;"<SPAN>";T))&""]); b=Evaluate(a); c=Substitute(Quote(b); ["<SPANBOLD>";""&Let(; ["<SPAN>";""&Let(; ["</SPAN>";""];T&:&""]) ]; Evaluate(c) ) So, you should be able to run it through this first, to apply the bold tags, and then you can run the result through the other function to strip off all the rest of the tags. And they say you can't do regex in Filemaker. Ha! <<Edit note: Corrected the formula; I missed a slash & had a reference to a missing field. >>
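The two-step idea (apply the bold tags first, then strip whatever tags remain) can be sketched in Python with regular expressions. This is not a translation of the formula above, just an illustration of the same plan, assuming input shaped like the GetAsCSS sample earlier in the thread. The names `tag_bold`, `SPAN` and `ANY_TAG` are hypothetical.

```python
import re

# One SPAN element: captures the STYLE value and the text inside the span.
# The trailing \s* also swallows the whitespace that separates one span from the next.
SPAN = re.compile(r'<SPAN\s+STYLE=\s*"([^"]*)"\s*>(.*?)</SPAN>\s*',
                  re.IGNORECASE | re.DOTALL)
ANY_TAG = re.compile(r"<[^>]*>")


def tag_bold(css_text):
    """Wrap bold spans in {b}...{/b}, then drop every remaining tag."""
    def retag(match):
        style, inner = match.group(1), match.group(2)
        if re.search(r"font-weight:\s*bold", style, re.IGNORECASE):
            return "{b}" + inner + "{/b}"
        return inner

    step_one = SPAN.sub(retag, css_text)   # step 1: apply the bold tags
    return ANY_TAG.sub("", step_one)       # step 2: strip all other tags


sample = ('<SPAN STYLE= "" >The quick </SPAN> '
          '<SPAN STYLE= "font-weight: bold;" >brown</SPAN> '
          '<SPAN STYLE= "" > fox jumped </SPAN> '
          '<SPAN STYLE= "color: #AA0000;font-weight: bold;text-decoration:underline;" >over</SPAN> '
          '<SPAN STYLE= "" > the lazy dog</SPAN>')
print(tag_bold(sample))
```

Because `{b}` and `{/b}` use braces instead of angle brackets, the second pass leaves them untouched.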
BobWeaver Posted March 21, 2005 Posted March 21, 2005 Just one last followup. I was playing with a variation of the last formula for use in a current project, and I noticed a couple of things you need to watch out for: 1. When using the GetAsCSS() function to generate the original tagged text, it will 'escape' certain reserved characters such as <, >, &, ", and all non-ASCII characters (codes >127). So, you will need to use Substitute to convert them back, Example: Substitute(CSSText; [""";"""] ["’";"
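The un-escaping step can be illustrated in Python, assuming the escapes are ordinary HTML-style entities such as `&quot;`, `&lt;`, `&amp;` and numeric references like `&#8217;` (an assumption on my part; check the actual GetAsCSS output). The function name `strip_tags_then_unescape` is hypothetical. The order is the point: strip the tags first, then convert the entities back, otherwise an escaped `<` in the original text would be mistaken for the start of a tag.

```python
import html
import re


def strip_tags_then_unescape(css_text):
    """Remove tags first, then turn entities back into plain characters."""
    without_tags = re.sub(r"<[^>]*>", "", css_text)
    # html.unescape handles named entities and numeric character references.
    return html.unescape(without_tags)


print(strip_tags_then_unescape(
    '<SPAN STYLE= "" >a &lt; b &amp; &quot;c&quot; it&#8217;s</SPAN>'))
```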
rivet Posted March 21, 2005 Author Posted March 21, 2005 Bob this has been a great help, I am still wrapping my head around it. Thanks for the update.
comment Posted March 21, 2005 Posted March 21, 2005 There are more problems like that. For example, if the text contains a CR, the CSS code will convert it to <BR>. Now this is considered to be code, and therefore is removed. I believe the code indeed needs to be pre- and post-processed to catch these cases. BTW, your formula returns the text broken into separate lines, following the FMP convention of coding CSS into separate lines. So, not only are the original breaks removed, I am getting a lot of new ones. This can also be solved by removing the code's CR's in preprocessing. There is another problem for which I don't yet see a solution: Filemaker ends each codeline with a space and CR. The spaces are not inside < > brackets, so they are considered a part of the original text. Now, if the code is known to originate in Filemaker, it can be dealt with (but then, why would anyone bother). If the source of the code is unknown, it is unpredictable. Someone might write </SPAN> (some spaces here) <SPAN>.
BobWeaver Posted March 21, 2005 Posted March 21, 2005 As for the <BR> line breaks, yes that's something else created by the GetAsCSS function that needs to be processed. How these things are handled will depend on the circumstances of the specific application. In my own particular project, I was able to account for the space-CR at the end of the code line by including it in the search/replace. So, it effectively gets deleted. And I had already pre-processed the linebreaks in the original text, so any <BR> tags that occurred later were spurious and could be deleted. Finally, I was basing my formula on the assumption that the tagged text was well formed. If it's not, then there's no way of fixing that. Any other text processor would be equally unhappy finding a closing tag before an opening tag. But, if the source text is generated by the GetAsCSS() function, that should never happen.
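One possible pre-processing pass for the two issues discussed here, sketched in Python. It assumes the layout described above (each code line ends with a space and a line break, and real line breaks arrive as <BR>) and it keeps the <BR> breaks as real line breaks, which is only one of the ways to handle them. The name `preprocess` is hypothetical.

```python
import re


def preprocess(css_text):
    """Drop the code's own line endings, then turn <BR> into real line breaks."""
    # Remove the optional space plus line break that ends each code line.
    flat = re.sub(r" ?(?:\r\n|\r|\n)", "", css_text)
    # Convert <BR> (any letter case, optional slash) to a line break
    # so it survives the later tag stripping.
    return re.sub(r"<BR\s*/?>", "\n", flat, flags=re.IGNORECASE)


print(preprocess('<SPAN STYLE= "" >one<BR>two</SPAN> \r<SPAN STYLE= "" > three</SPAN> \r'))
```

If the line breaks were already handled before GetAsCSS was applied, the second substitution could replace <BR> with an empty string instead.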
comment Posted March 21, 2005 Posted March 21, 2005 "Any other text processor would be equally unhappy finding a closing tag before an opening tag" I am afraid that got lost in translation. I meant that even in well-formed marked-up text, you can have a closing tag followed directly by an opening tag. A browser is supposed to ignore any spaces or CR's in between. For example: <h1> my heading </h1> (any number of spaces/CR's here) <p> My real text... or: <SPAN STYLE= "" >This is a sample which </SPAN> <SPAN STYLE= "font-weight: bold;" >will</SPAN> <SPAN STYLE= "" > fail</SPAN>
BobWeaver Posted March 22, 2005 Posted March 22, 2005 Sorry, yes, I misunderstood what you were getting at. I have been doing a bunch of things lately with RTF files, where the rules are a bit different, and was thinking about two different things at the same time.