
Text Parsing Bug?


The result of this calculation should be "9999/STATE/CITY/STREET".


Instead, it returns "01/9999/STATE/CITY/STREET".

However, it returns the correct result for these 2 variations:



How do you get around this? (I am trying to extract the name of the Folder enclosing "STATE/CITY/STREET".)

comment    1,390

RightWords ( "ABC/01/9999/STATE/CITY/STREET"; 4 )
 returns "01/9999/STATE/CITY/STREET", because

WordCount ( "ABC/01/9999/STATE/CITY/STREET" )
 returns 5, and

WordCount ( "01/9999" )

returns 1.

Certain word delimiters follow special rules when placed between two numerical characters, or between a numerical character and a non-numerical one. The rules for the forward slash character are the same as for the comma and the hyphen (see here and also here).

One way to get around this is to substitute the "/" with another word delimiter, parse out the words you need, and restore the "/". Another way is to use the position and occurrence of "/". This is likely to be more reliable, since it doesn't assume folder names are single words.
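
As a rough sketch of the substitution idea, here is a variant that turns each "/" into a return and reads the result with the value functions instead of the word functions, so numeric names and spaces are left alone. Path is a hypothetical field or variable holding the text, and I am assuming it has no trailing slash and that no folder name contains a return or a slash of its own.

```
Let ( [
// Each path step becomes its own return-delimited value
values = Substitute ( Path ; "/" ; ¶ ) ;
n = ValueCount ( values )
] ;
// 4th value from the end: the folder enclosing the last three names
GetValue ( values ; n - 3 )
)
```

With "ABC/01/9999/STATE/CITY/STREET" there are 6 values, so this picks value 3, which is "9999".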


It’s not as easy as it appears. Consider the path

/Volumes/Film/Video/01/0042/CAMERA 3/CLIPS/

Here are some solutions that do not work.

(1) Parse with word-based functions such as RightWords().

Consecutive directories with numeric names, as in /01/0042/, are considered a single directory named “01,0042”. Note the comma. Slash, comma and hyphen are not word delimiters if they appear between two numerical characters.

(2) Exploit the position and occurrence of "/" to count the words.

In OS X, “Film/Video” is a legal name for folders and files!

Not recommended, but the Finder allows you to do it.

(3) Substitute for the "/" something you’re sure the user will never use in a folder name.

That would be “:”. Guess what? “01:0042” is also considered a single word! And using a space as a delimiter fails on the name “CAMERA 3”.

I am stumped. The only solution I see is to impose folder naming rules on the user: No spaces, no consecutive numbers in directory paths, etc.

comment    1,390

First, it would be helpful to know where your path is coming from. If it is coming from within FileMaker, the slashes INSIDE names will already be substituted by "%2F". If it's coming from AppleScript, they will be replaced by ":". In both cases, slashes will be present ONLY as path steps.

If your path data can contain both slashes-in-names AND slashes as path steps, then we are done here. Because there's no way to tell them apart. Not in Filemaker, not in any other application, not by any other means - with perhaps the exception of a crystal ball.

Now, I didn't suggest you use the position and occurrence of "/" to count words. You should find the positions of the two delimiting slashes you are interested in, and plug them into the Middle() function. For example, if I wanted to extract the name of the great-grandparent folder:

Let ( [
len = Length ( Path ) ;
// A negative occurrence counts slashes backwards from the end of the text
start = Position ( Path ; "/" ; len ; - 4 ) + 1 ;
end = Position ( Path ; "/" ; len ; - 3 )
] ;
Middle ( Path ; start ; end - start )
)

This will return "9999" in your original example.
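
If you need a different level, the same calculation can be written with the level as a variable. This is only a sketch: n is a name I am introducing here (it is not a built-in), it should be 1 or more, and I am assuming the path does not end with a slash. A trailing slash may shift the count, so check the result against sample data.

```
Let ( [
// Hypothetical setting: how many levels above the last name (3 = great-grandparent)
n = 3 ;
len = Length ( Path ) ;
// Slash just before the wanted name, counted backwards from the end
start = Position ( Path ; "/" ; len ; - ( n + 1 ) ) + 1 ;
// Slash just after the wanted name
end = Position ( Path ; "/" ; len ; - n )
] ;
Middle ( Path ; start ; end - start )
)
```

With n = 3 this is the same calculation as before, so it returns "9999" for the original example. Because it only looks at slashes, names made of digits or containing spaces are not a problem.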
