Jump to content
Sign in to follow this  
Emma in England

Need GREP ability

Recommended Posts

The customer (big UK company, NOT amenable to suggestions) supply data in a very unstructured way. Basically, I need to look at a field like this:

513/3831

Acoustic Solutions 256MB Pink MP3 Player

£17.99-£8.99

save £9

513/3161 Silver

5133178 Pink

Philips Silver MP3 Player

£29.99-£14.99

half price

and extract catalogue numbers, i.e. 513/3831, 513/3161, 513/3178 - note that they don't always put in the '/'!

I have been doing this successfully using AppleScript with Satimage's OSAX. But the main user of this system is on a PC (though she'd love a Mac!), and I have read David Kachel's excellent White Paper and can see that if I can do this within FM, I should.

Any pointers? Is there an array structure in Filemaker that I could set to {1,2,3,4,5,6,7,8,9,0}? (Sorry if this is a very dumb question, but I'm still waiting for a fat text book to make its way across the Atlantic).

Share this post


Link to post
Share on other sites

Although you say it's unstructured would the catalogue number always be 3 characters followed by 4 characters ( with or without the / )?

Does the number always start a new line ?

I would be looking to create 1 record in a temp table for the first 8 characters of each new line in your field,substitute out the "/" if it's there, and see if your left with just a number.

Share this post


Link to post
Share on other sites

Solved my own problem, though my head is aching badly now! I've written a script which splits the field into values, filters them and then checks each word - if it has a full stop (period), it's a price and is ignored. Otherwise, if it has 7 digits, it's a catalogue number.

The worst line looks like this:

Position(MiddleWords(Filter(GetValue ( GREP::Text to GREP ; $x );"1234567890 .");$y;1);".";1;1)=0 and

Length(MiddleWords(Filter(GetValue ( GREP::Text to GREP ; $x );"1234567890 .");$y;1))=7

which scares me. Style tips welcome!

And now I've got to transfer it into the actual database.

Share this post


Link to post
Share on other sites

Nice try, Robert, but I'm afraid not - sometimes they just type a list of products all on one line!

I seem to have double-posted my question, for which apologies. I have developed a tortuous calculation which solves my problem - see other post!

Share this post


Link to post
Share on other sites

Unfortunately I don't have Filemaker Advanced! But feel quite proud to have solved it anyway. Thanks for the file - I'll keep it for future reference.

Share this post


Link to post
Share on other sites

I wonder why you need to deal with values, if you're going to check each individual word anyway. And of course, as Robert points out, any word with 7 digits is a potential false positive - "18/5/2007" for example.

Share this post


Link to post
Share on other sites

Hi Comment,

I have to deal with values otherwise I get false catalogue numbers appearing across line breaks :)

save £9

513/3161 Silver

becomes

save 95133161

- and the 9 confuses everything! Pity as it added a layer of repeats, but there you go.

You're quite right about the other dangers. However, I've checked two publications' worth of files, and these sort of combinations do not occur. It is possible, and then a false record will be created, but as it will not find any associated text or price details it will get spotted and can be deleted. To be honest, that's the least of my worries right now!

Share this post


Link to post
Share on other sites

Line breaks are also word delimiters. Using your example above, MiddleWords ( yourtext ; 3 ; 1 ) should return "513/3161".

If that's not the case, you have a VERY strange line-break character. I don't think it's very likely, but even if so, you could substitute it for another character that WILL break words.

Share this post


Link to post
Share on other sites

I'd begun to forget what I'd done... OK: the first calculation on the field is the 'filter':

Filter ( text ; "1234567890 ." )

- and that lost all the line endings. But I've just had a play with it and realise I can add a return character to my filter, which should mean I can ignore values and just do words! Thanks! (disappears for about 3 hours to work out what on earth those calculations were and struggle with opening and closing parentheses....)

Share this post


Link to post
Share on other sites

The order, I think, should be this:

1. Extract the current word (MiddleWords);

2. Examine the word (Filter and PatternCount);

3. If passed, create a record;

4. On to the next word.

Share this post


Link to post
Share on other sites

Thanks, that was the clear thinking I needed. I've tidied the script up considerably and eliminated a whole variable. (I love variables)

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

×

Important Information

By using this site, you agree to our Terms of Use.