Extract the most frequently occurring words in a text field


This topic is 2318 days old. Please don't post here. Open a new topic instead.

Hello, 

I'm trying to find the 5 most frequently occurring words in a text field.

For example, given the text: "Hello John how are you? John may I ask you to go out to night? I like going out at night, you remember."

the result would be: You (x3), John (x2), Out (x2), Night (x2), Hello (x1)

 

Could someone help me with that? :angel:

 

Thank you


:) Thanks Lee Smith

I need this because I want to generate keywords from a description.

For example, if I'm selling a printer, the word "printer" will be repeated a few times in the description, so the script would extract the most repeated words and use them as keywords for that description. One of the keywords would be "printer", along with the other most repeated words.

I'm not sure if I'm being clear :angel:


I think I'm getting close. I changed "pattern.fmp12" a bit.

I added a new table with two fields, "keyword" and "occurence".

So now, instead of the answer "John: came up 1 times", "John" goes into the field "keyword" and "1" goes into the field "occurence". Then I'll sort the records by occurrence, largest first, to get the top 5 words in the text :)))) and generate the list of keywords from them )))

I noticed an issue with "the": I should keep "the" attached to the word that follows it instead of counting it as a separate word.

 

Ibobo :) 

 

pattern.fmp12
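
For readers who cannot open the attachment, the "keyword" / "occurence" idea described above can be sketched outside FileMaker. This is only an illustration in Python, not the contents of pattern.fmp12: the function name `top_keywords`, the lowercasing, and the rule for what counts as a word are all assumptions.

```python
import re
from collections import Counter

def top_keywords(text, limit=5):
    """Return (keyword, occurrence) pairs, highest occurrence first."""
    # Assumption: case is ignored and a "word" is a run of letters,
    # digits, or apostrophes, so "you?" and "You" both count as "you".
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Counter tallies each word; most_common sorts by count, descending.
    return Counter(words).most_common(limit)

# The sample sentence from the first post, with the spelling of "John" fixed.
sample = ("Hello John how are you? John may I ask you to go out to night? "
          "I like going out at night, you remember.")
rows = top_keywords(sample)

for keyword, occurrence in rows:
    print(f"{keyword}: {occurrence}")
```

On the sample, "you" comes first with 3 occurrences. Several words tie at 2, including "i" and "to", which is the same kind of problem as the "the" issue mentioned above.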


Hi ibobo,

My bad, I thought this was a file I made a long time ago for a similar need. However, the credit for this file goes to rwoods, who posted it in response in this thread: Pattern count/Keyword density probelm.

I must have deleted my file, or I don't remember what I named it. It used a calculation rather than a script, and I provided the words to count. :(

I'll keep looking, or I'll recreate it for you.

Lee


I took it a little further. Fun exercise. I didn't comment my script, but if you stop and think about each step you'll see what's going on. The key is to keep track of the top count as you go. Also, FYI, I left the word list field in there so you could watch it while debugging, but I'd probably move that to a variable; the field isn't really needed. Let me know if you have any questions.

pattern.fmp12
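
The script itself is only in the attachment, so the following is a guess at the general idea of "keeping track of the top count as you go", written in Python rather than as FileMaker script steps. The names `top_words_by_bucket`, `counts`, `top_count`, and `buckets` are hypothetical, and grouping words into per-count buckets is an assumption about how the top count gets used.

```python
import re

def top_words_by_bucket(text, limit=5):
    """Return up to `limit` (word, count) pairs, highest count first."""
    counts = {}     # word -> occurrences seen so far
    top_count = 0   # highest occurrence count seen so far

    for word in re.findall(r"[a-z0-9']+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
        # Update the running maximum while counting.
        if counts[word] > top_count:
            top_count = counts[word]

    # Group the words into buckets keyed by their final count.
    buckets = {}
    for word, n in counts.items():
        buckets.setdefault(n, []).append(word)

    # Walk down from the top count; knowing it up front means
    # no sort step is needed to find the most frequent words.
    result = []
    for n in range(top_count, 0, -1):
        for word in buckets.get(n, []):
            if len(result) == limit:
                return result
            result.append((word, n))
    return result

print(top_words_by_bucket("a b a c a b", limit=2))
```

Because the top count is known when the loop ends, the result can be read off bucket by bucket, which may be why a sort was not needed in the script.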


Hi Lee, Hi Fitch :) 

 

Thanks a lot for helping me! That works perfectly ))

There is one adjustment, and I don't know which is best: ignore the words "for", "and", "the", and "to", or merge each one with the word that follows it, so that the combined phrase is counted as a single word?

 

Screen Shot 2017-12-14 at 09.48.26.png

Edited by ibobo

I think using Substitute to ignore them is fine, unless you really want to count them -- then I'd probably put them in their own $wordList "bucket" (or put them in the $count = 1 bucket).
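
As a rough illustration of the "ignore them" option, here is a Python sketch rather than a FileMaker Substitute calculation. The stop-word set holds only the four words named in this thread, and the names `STOP_WORDS` and `top_keywords_without_stop_words` are hypothetical. It drops whole words after splitting, since a plain substring replacement of "the" would also eat part of "other".

```python
import re
from collections import Counter

# Assumption: only the four words mentioned in the thread are ignored.
STOP_WORDS = {"for", "and", "the", "to"}

def top_keywords_without_stop_words(text, limit=5, stop_words=STOP_WORDS):
    """Count words, skipping stop words, and return the most frequent."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Compare whole words, so "other" survives even though it contains "the".
    kept = [w for w in words if w not in stop_words]
    return Counter(kept).most_common(limit)

print(top_keywords_without_stop_words("The printer and the ink for the printer"))
```

Extending the set with more common words ("i", "a", "at", and so on) is a one-line change.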
