Sample file showing a method for finding occurrences of words within a specified distance of each other. Refer to the script documentation to see how it works.

This is in reference to the following thread:

http://www.fmforums.com/threads/showflat.php?Cat=&Board=UBB6&Number=40809&page=0&view=collapsed&sb=5&o=31&fpart=1

  • 5 weeks later...

Bob, a "proximity" find, an interesting concept.

So I looked at the first record and saw that big came 16 words after dog.

(...dog [1]jumped [2]over [3]the [4]fence, [5]and [6]before [7]Henry [8]was [9]able [10]to [11]stop [12]him, [13]he [14]chased [15]the [16]big...)

So I set "Find Word 1: dog", "Find Word 2: big", and "Proximity: 16". When I clicked the find button, all eleven records were found. (Record 7 = "...dog [1]jumped [2]over [3]the [4]fence, [5]and [6]before [7]Henry [8]was [9]able [10]to [11]stop [12]him, [13]he [14]chased [15]the [16]little...")

I have not studied the calculation or the script, but something is not right with a proximity this large. Is there a limit to the proximity that can be checked?

Bob, I had a good night's sleep and awakened thinking that I had not tested the solution quite enough. So I entered the following: "Find Word 1: big", "Find Word 2: little", and "Proximity: 18". Interestingly, this returned 4 records.

Now in two of those records "little" comes first in the sentence and precedes "big", e.g.,

"...little [1]white [2]dog [3]jumped [4]over [5]the [6]fence, [7]and [8]before [9]Henry [10]was [11]able [12]to [13]stop [14]him, [15]he [16]chased [17]the [18]big ..."

Then, on further examination of those records, I recognized their similarities and altered one (the last of the found set) by deleting the word "him". I then showed all records and, leaving the search criteria the same, performed the find again. Once again four records were found, including:

"...big [1]white [2]dog [3]jumped [4]over [5]the [6]fence, [7]and [8]before [9]Henry [10]was [11]able [12]to [13]stop, [14]he [15]chased [16]the [17]big ..."

I'm sorry to say that there seems to be a problem. I hope you find this useful.

There quite possibly are problems with the method. I threw it together rather quickly in response to the question in the original thread. Although I tried testing it for quite a few different situations, I'm sure I left out a few. I guess I posted this more as a concept than a finished solution (okay, I'm copping out here). Anyway, I'll have a look at it again when I have a few minutes, and try to figure out the problem.

Oh yes, I should have pointed out that the order of the search words is insignificant. So, in the first test that you pointed out, if the word 'big' occurs either 16 words before or after the word 'dog' then it is a match. Since many of the records contained the phrase 'big white dog', they would meet the criteria, and as a result, all 11 records in my sample file correctly match the criteria.

I didn't check out the situation mentioned in your second post. Could it be related to this same word order thing?

It's actually easier to make the method order-dependent, but I didn't think it would be quite as useful. For example, if you wanted to find all records that have references to Thomas Edison, you could enter Thomas as word 1, Edison as word 2, and a distance of 2. This would then match all of the following:

Edison, Thomas Alva

Thomas Alva Edison

Thomas Edison

which is probably what the user wants.
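For anyone who wants to experiment with the idea outside FileMaker, here is a rough Python sketch of the order-independent matching described above. It is not the script from the sample file. The names `tokenize` and `within_proximity` are made up for illustration, and it assumes "within a distance of N" means the second word is at most N words before or after the first, which is the counting used in the examples in this thread. The `ordered` flag shows the order-dependent variant for comparison.

```python
import re


def tokenize(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def within_proximity(text, word1, word2, distance, ordered=False):
    """Return True if word2 occurs within `distance` words of word1.

    By default the order of the two words does not matter. With
    ordered=True, word2 must come after word1.
    """
    words = tokenize(text)
    w1, w2 = word1.lower(), word2.lower()
    positions1 = [i for i, w in enumerate(words) if w == w1]
    positions2 = [i for i, w in enumerate(words) if w == w2]
    for i in positions1:
        for j in positions2:
            gap = j - i  # positive when word2 follows word1
            if ordered:
                if 0 < gap <= distance:
                    return True
            elif 0 < abs(gap) <= distance:
                return True
    return False


names = ["Edison, Thomas Alva", "Thomas Alva Edison", "Thomas Edison"]
for name in names:
    print(name, within_proximity(name, "Thomas", "Edison", 2))
```

With the default order-independent check, all three names match. Passing `ordered=True` would drop "Edison, Thomas Alva", because there Edison comes before Thomas.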


"...which is probably what the user wants." Yes, you are right there. And in that regard (e.g. Edison) it works well. I just saw these patterns and was testing what were no doubt extremes. But the explanation and caveats which you provide certainly are reasonable, esp. if the limits and usage are clearly defined for the client.
