
Posted

First of all I have to acknowledge that I have a permanent mental block against databases. I could probably figure this out in Excel and be done with it, but I have friends relying on a FileMaker solution. Here's the situation: I have a pile of CDs with lots of document files on them. The files on each CD are indexed in a .txt file on the same CD. The software that creates the files and the index has a limit of 64 characters (including spaces) per field. If a field's contents exceed 64 characters, the program creates a duplicate line with the same document number, with that field picking up where the text left off on the previous line. Furthermore, there are several fields in which this can happen. The result is an index that is 1,500 lines long for about 500 actual "records."
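To illustrate, two index lines for one document might look something like this (field names, widths, and values invented for the example):

DOC#    GRANTOR        DESCRIPTION
000123  SMITH, JOHN    LOT 4 BLK 2 ORIGINAL TOWNSITE, TOGETHER WITH AN EASEMENT FOR ING
000123  SMITH, JOHN    RESS AND EGRESS
000124  JONES, MARY    LOT 5 BLK 2

The description for document 000123 overflows the 64-character limit, so a second line repeats the document number (and, depending on the exporting software, possibly other fields as well) and carries the remainder of the text.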

Since the .txt file is not tab- or comma-delimited, I process the data by importing the txt file through Excel into FileMaker. The fields have fixed widths, so Excel parses the incoming data very easily, and the import into FMP is just as easy. Each line of text becomes one record in FMP. Now what I need to do is recombine the data fields that should never have been separated from each other. Here's an outline of what I think needs to happen:

1. Sort the data based on the document number.

2. Go to the first record and note its document number.

3. Check the document number in the current record and compare with the document number from the next record.

4. If the two document numbers do not match, go to the next record and repeat from step 3 above.

5. If the two document numbers match...

5a. Compare the contents of each field in the first record with the contents of the second record.

5b. If the contents of record 1 field 1 are the same as the contents of record 2 field 1, check the next field

5c. If the contents of record 1 field 1 are different from the contents of record 2 field 1, then append (concatenate) the contents of record 2 field 1 into record 1 field 1.

5d. Continue going field by field through record 1 and record 2 concatenating fields where necessary and not where not.

5e. Continue checking the document number of record 1 with the document number on subsequent records until all the records with the same document number have been concatenated back to record 1.

6. Delete the records which are now superfluous.

7. Go to the next record and repeat from step 3 above.

8. When the last record is reached, quit.

I think this logic works. With 1,500 records per CD and about 60 CDs, all this has to be hands-off.

I can write a script to do steps 1 and 2, but then things break down. I'm very proud that I was able to do the first two steps without help, but as another db-challenged person wrote, "I searched for the answer but I don't even know what questions to ask to figure this out."

I will accept any suggestions on this.

Posted

I would use a different approach. Use a second table with only one field containing unique document numbers. You can easily create and fill this table by exporting the document numbers sorted and grouped.

Now you can establish a relationship (you will love databases after this :) via the document number.

A simple additional calculation in the new table will do the rest of the job using the List() function, substituting the line breaks with nothing or a blank.
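For example (a sketch with hypothetical names, assuming a table occurrence "Lines" holding the imported rows, related to the new table by document number), the calculation might look like:

cGrantor = Substitute ( List ( Lines::Grantor ) ; ¶ ; " " )

List() gathers the Grantor value from every related line, and Substitute() joins them with a blank instead of a line break.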

-jens

Posted

Couldn't you just concatenate the cells in Excel into one cell and import just that cell? Then you wouldn't have all the extra records. Am I missing something?

Posted

Couldn't you just concatenate the cells in Excel into one cell and import just that cell? Then you wouldn't have all the extra records. Am I missing something?

Merging the cells in Excel would work, but there are 1,000 records on each CD that will need to be fixed, and multiple fields in those records, times 60 CDs, so that's easily 100,000 manual changes to make. Did you have a script in mind to do that in Excel?

Posted

I would use a different approach. Use a second table with only one field containing unique document numbers. You can easily create and fill this table by exporting the document numbers sorted and grouped.

Now you can establish a relationship (you will love databases after this :) via the document number.

A simple additional calculation in the new table will do the whole rest of the job using the list() function, substituting the line breaks by nothing or a blank.

-jens

Could I then export the new calculated table into a new flat file? The reason for this is that the people who are going to use this are not going to accept anything but a very simple file structure that they can understand. The most they will do is make different reports with it, but they have to have the data where they can find it (i.e. a flat file). Their eyes start to shoot blood when they hear me talk about relational databases.

Posted

Probably not the best name for the topic..

Anyway:

Freeze Window
Sort Records
Loop
  Set Variable [$id ; yourField]
  Go To Record [Next ; Exit After Last]
  If [$id = yourField]
    Delete Record
    Go To Record [Previous]
  End If
End Loop

I, ah, am pretty sure that will work, but make sure you try it with a small amount of dud data.

Posted

Probably not the best name for the topic..

Anyway:

Freeze Window
Sort Records
Loop
  Set Variable [$id ; yourField]
  Go To Record [Next ; Exit After Last]
  If [$id = yourField]
    Delete Record
    Go To Record [Previous]
  End If
End Loop

I, ah, am pretty sure that will work, but make sure you try it with a small amount of dud data.

Probably not the best name for the topic..

As I said, I don't even know the terminology to ask the question. But thanks for ringing in.

Okay I'm lost but we all knew that would happen. Now I know what Freeze Window means. That brings us to Set Variable. I found that in the help and discovered the $ means it's a local variable as opposed to a global. $id probably refers to the document number assigned by the creator?? Or is it a new variable called id and the part you called yourField would be called fldDocumentNumber in my world??? Would you mind elaborating on how that statement works?

Then we go to the next record. If $id equals the value previously assigned to $id, then we delete that record? What about the data I need to retrieve and append from the spill-over fields? Or were you leaving that part as an exercise for the interested student? Heh, heh. So then we back up to the previous record and hit the If statement again. But wait! If we return to the original If statement with the previous (original) record and test for the fldDocumentNumber, it is always going to be the same, and then the original record will be deleted in the next step. Do I have that right? Then assuming I delete that record, what happens next? It looks like the routine kicks out of both the If and the Loop statements.

Dud data: All I have is dud data until I can figure out how to index it properly. In any case the master text files are burned forever onto the CDs so I'm good to go.

Posted

Okay, I thought you just wanted to delete duplicates... well, it's similar anyway. The only issue here is that once you concatenate the value to fieldX of the previous record, a third duplicate record won't be handled correctly, because you've already changed the field value to be the original plus the duplicate... so even if the third record had the same value as the second record or the first record, it won't be properly picked up. So essentially this script won't work exactly as desired. I started writing a different one that uses GetNthRecord and stores the results in a variable, but thought it might take a while, so I decided not to write it yet... Anyway, hope this helps to some extent.

Freeze Window
Sort Records
Loop
  # $docNumber is just a name; you could call your variable whatever the heck you wanted.
  # Being a local variable, it only persists while the script runs and is destroyed after that.
  # All we do here is store the current document number so we can compare it to the next record.
  Set Variable [$docNumber ; docNumber]
  Go To Record [Next ; Exit After Last]
  # Here we check whether $docNumber is the same as the current "unique" document number.
  If [$docNumber = docNumber]
    # Seeing as we don't want to immediately delete, capture the fields for comparison first:
    Set Variable [$x ; fieldX]
    Set Variable [$y ; fieldY]
    Set Variable [$z ; fieldZ]
    Delete Record
    # You are now on the record after the one you deleted, with your initial record
    # being one record prior (i.e. the original before the duplicate).
    Go To Record [Previous]
    # Note the else branch, so a matching value leaves the field unchanged rather than blanking it,
    # and the bare ¶ (unquoted) so the delimiter is a real return, not a literal pilcrow glyph.
    Set Field [fieldX ; If ( $x <> fieldX ; fieldX & ¶ & $x ; fieldX )]
    Set Field [fieldY ; ... ]
    Set Field [fieldZ ; ... ]
  End If
End Loop

Posted

I don't know if this will help or not. Some of my people are MS Access users and some are FMP users, so I have the same problem to solve with MS Access. The answer I got for Access was the following, which seems to work.

Sub LoadXLData()
    Dim Db As DAO.Database
    Dim Rs1 As DAO.Recordset
    Dim Rs2 As DAO.Recordset

    ' Define Vars
    Dim tInstNumber

    On Error GoTo Err_Command41_Click

    Set Db = CurrentDb
    Set Rs1 = Db.OpenRecordset("DHIndexXLDataImport", dbOpenDynaset)
    Set Rs2 = Db.OpenRecordset("DHIndexDataResults", dbOpenDynaset)

    Rs1.MoveFirst

    ' Initialize 1st rec
    tInstNumber = ""

    Do While Not Rs1.EOF
        ' Step through records checking for new Inst Numbers
        If Rs1![inst Number] <> tInstNumber Then
            ' Create/Append a new record.
            Rs2.AddNew
            Rs2![NumberID] = Rs1![NumberID]
            Rs2![inst Number] = Rs1![inst Number]
            Rs2![Grantor] = Trim(Rs1![Grantor])
            Rs2![Grantee] = Trim(Rs1![Grantee])
            Rs2![instrument] = Rs1![instrument]
            Rs2![Records] = Rs1![Records]
            Rs2![Volume] = Rs1![Volume]
            Rs2![Page] = Rs1![Page]   ' field name assumed; this line was garbled in the original forum post
            Rs2![File Time] = Rs1![File Time]
            Rs2![Description] = Trim(Rs1![Description])
            Rs2![File Date] = Rs1![File Date]
            Rs2![inst Date] = Rs1![inst Date]
            ' Save Inst Number to compare later
            tInstNumber = Rs1![inst Number]
        Else
            ' Update last record - same Inst Number, so just add data to the last record
            Rs2.MoveLast
            Rs2.Edit
            ' Add more fields if you want ...
            If Trim(Rs2![Grantor]) <> Trim(Rs1![Grantor]) Then
                Rs2![Grantor] = Rs2![Grantor] + " , " + Trim(Rs1![Grantor])
            End If
            If Trim(Rs2![Grantee]) <> Trim(Rs1![Grantee]) Then
                Rs2![Grantee] = Rs2![Grantee] + " , " + Trim(Rs1![Grantee])
            End If
            If Trim(Rs2![Description]) <> Trim(Rs1![Description]) Then
                Rs2![Description] = Rs2![Description] + " , " + Trim(Rs1![Description])
            End If
        End If

        ' Save the record.
        Rs2.Update
        Rs1.MoveNext
    Loop

    Rs1.Close
    Rs2.Close
    Db.Close

    MsgBox ("Update Complete!")

Exit_Command41_Click:
    Set Rs1 = Nothing
    Set Rs2 = Nothing
    Set Db = Nothing
    Exit Sub

Err_Command41_Click:
    ' An unexpected error occurred; display the normal error message,
    ' then jump back to the cleanup above.
    MsgBox Err.Description
    Resume Exit_Command41_Click

End Sub

The guy who wrote that is an ex-co-worker who used the field names from an old database, so Inst Number = fldDocumentNumber today. But that code worked and shrank 1,547 records down to 461. I suppose I could export the Access result to Excel and import it back into FMP. I sort of understand the gist of what that code does, but don't ask me to provide comments (HAH!). But the point is, does it help with the development of a FMP solution?

Posted

Well, not really. It's not that I don't have the concept to write that much code for you... it's that I don't have the time right now. But if no one else is willing to give it a shot, I'll jot something down on Saturday for you... but it's perfectly doable.

For anyone who wants to give it a shot, I recommend using GetNthRecord, two loops, and a variable for each comparison field.
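Here's a rough sketch of that approach (untested; docNumber and fieldX are hypothetical field names, and only one comparison field is shown):

Freeze Window
Sort Records
Go To Record [First]
Loop
  # Collect this document's value, starting from the master record
  Set Variable [$x ; fieldX]
  Loop
    # Stop when there is no next record, or the next record starts a new document
    Exit Loop If [Get ( RecordNumber ) = Get ( FoundCount )]
    Exit Loop If [GetNthRecord ( docNumber ; Get ( RecordNumber ) + 1 ) <> docNumber]
    # Append the continuation text only if it differs from what we already have
    If [GetNthRecord ( fieldX ; Get ( RecordNumber ) + 1 ) <> $x]
      Set Variable [$x ; $x & ¶ & GetNthRecord ( fieldX ; Get ( RecordNumber ) + 1 )]
    End If
    # Delete the continuation line, then return to the master record by number
    Set Variable [$n ; Get ( RecordNumber )]
    Go To Record [Next]
    Delete Record [No dialog]
    Go To Record [By Calculation: $n]
  End Loop
  Set Field [fieldX ; $x]
  Go To Record [Next ; Exit After Last]
End Loop

Repeat the $x pattern with one variable per field to merge. Note this still appends a value that merely repeats part of an already-concatenated field, so duplicates within a field can survive.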

Posted

...It's not that I don't have the concept to write that much code for you... it's that I don't have the time right now. But if no one else is willing to give it a shot, I'll jot something down on Saturday for you...

Which brings up the rhetorical question, "Is THIS what you do all day?" It seems like you have an immediate and insightful answer for all questions related to FMP. At least every time I have had a question, it was not long before you got on it and the answer was forthcoming. I sincerely appreciate your talent for this and your willingness to share.

Posted

I just remember how helpful these forums were to me when I first started out (lol, you should take a look at some of my original topics).

Posted

I just remember how helpful these forums were to me when I first started out (lol, you should take a look at some of my original topics).

I think a lot of us feel the same way. I don't do FM only for a living, but I did gain a lot of knowledge from here which is why I try to give back a little to others starting out.

Posted

Lol and to the question:

"Is THIS what you do all day?"

No... but it does keep me quite entertained, and often looking at other people's problems helps me find fairly obvious solutions to my own that I just couldn't see before. And of course the practice helps -- I actually have to look things up sometimes lol (but it helps to know what you're looking for).

Take now, for example: I have an exam at 9am, in oh, 6.5 hours, but... reading tends to get boring after 18 hours straight :)

Posted

Hey,

Okay, I started out trying to do something really dynamic with this, but then remembered you can't set fields dynamically, so that was a pointless waste of time...

Anyway, the attached is the example in the fewest script steps I could think of.

sample.zip

Posted

One thing that gripes me is brain dead software. How many programs are there in any Windows format where Copy and Paste do not work? Soooo, why can’t you copy the text from a FMP script and paste it into Word?

Anyway...

Genx: I have added my (copious) comments to your comments in the script. Would you please take a look to see if I have the logic and purpose of the script correct? The script works for your example but I will have to extend it to other fields for my purpose, so I want to try to know what I’m doing. Your example (and my going back and forth to the Help file) taught me a lot. Thank you again. Unfortunately the Help file does not tell you why the steps are used, just how to use them. My comments are an attempt to describe why the steps are used.

I have a question here that relates to another question I asked on another forum. Instead of the delimiter you suggested in your script, can a carriage return be used instead? I would like the data from each concatenated field to start on a new line. This isn’t perfect for me but it is better than the imperfections induced by other delimiters.

Another question. The last time I coded anything was in 1980. At that time I would have been fried in oil for using a GoTo statement. One problem with them is they arbitrarily break the structure of the code and will take you places without regard for whether any of your conditions are met. They were also hard to trace in a debug. The preferred method of hopping through the code was to increment a counter and rely on the loop, If, or end of a subroutine to take you where you should be. So rather than saying,

Go to Record/Request/Page [Next], and cycling to the top of the loop, we would have said

docID=docID + 1, and then let the loop statement step back to the beginning. Has the prohibition on GoTo statements been lifted?

commented_Sample_Script.zip

Posted

Instead of the delimiter you suggested in your script, can a carriage return be used instead?

Yes - you can use whatever character you want; that is the reason I set it up as a variable. In this case, just change it to the little pilcrow character.
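Assuming the attached script stores the delimiter in a variable (say $delimiter; the name is hypothetical, since the sample file isn't shown here), the change would be a single step near the top of the script:

Set Variable [$delimiter ; ¶]

Note the ¶ is unquoted: in a FileMaker calculation, a bare ¶ is a carriage return, while "¶" in quotes is just the pilcrow glyph itself.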

Ah... it's the only way to move to the next record. Given that the records have been sorted, there is no issue with you mistakenly going to another record; it will always go to the next one and not somewhere quite odd. I.e., if you have records (1,3,2,4,5 - based on the current order of the records) and you're on record id 2, Go To Record [Next] will NEVER take you to any record but record 4 -- Anyway, I wouldn't know, I wasn't born until 1989 lol.

Okay, about the script:

Firstly -- just a general note: the reason we use Go To Record [First] immediately after the sort is that if we are on record one before the sort, and the sort leads record one to end up somewhere in the middle, then we'll start from the middle -- so we just need to explicitly bring ourselves back to the start.

Now, the rest of it is pretty much right (see, it's not that hard :), but make sure you add additional clear-variable steps, e.g. Set Variable [$names ; ""], for each variable that you use.
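For instance (variable names hypothetical), the top of the outer loop might begin:

Loop
  # Reset each collector so text from the previous document doesn't carry over
  Set Variable [$names ; ""]
  Set Variable [$comments ; ""]
  # ... rest of the loop as before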

Posted

Melbourne, we have a problem.

The script deletes all the records and then keeps running until I hit the Esc key. I believe I specified everything I needed to, but maybe I missed something.

I'm attaching a 1,500 record file with two fields. The script takes a minute or so to run, so you can watch the number of records shrinking. In MS Access I think I ended up with 411 records after running the script for that one. Certainly when the number shrinks to less than 400 you can escape out of the script any time.

LimitedFeb2006LOIndexImportWithNumberFields_Converted_Copy.zip

Posted

LOL!!

Open ScriptMaker -- now look at these 3 script steps:

Exit Loop If

Set Variable

Set Field

-- Note that they are surrounded by /* */ -- this means that the entire line is commented out... i.e. they effectively say:

Exit Loop If nothing

Set Variable to nothing

Set Field to nothing

Get Rid of the /* */

They get put in by FM when you copy from one file to a completely unrelated file to save you from generating errors lol.
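In other words, a disabled step like this (the calculation shown is hypothetical):

Exit Loop If [ /* Get ( RecordNumber ) = Get ( FoundCount ) */ ]

has to become:

Exit Loop If [ Get ( RecordNumber ) = Get ( FoundCount ) ]

before the loop can ever terminate.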

Posted

I just ran it and it brings it down to exactly 411 records.

Good Luck!!

Just double check those three script steps with the example files.

Posted

Get Rid of the /* */

They get put in by FM when you copy from one file to a completely unrelated file to save you from generating errors lol.

BA$TARD$! Do they do stuff like that just to keep people out of their club? -rhetorical question

I'll run it tomorrow morning. I can't believe I just blew another day. Well I learned something, and then something else, so all was not lost.

Posted

--snip-- Do they do stuff like that just to keep people out of their club?

Believe it or not, this is a big improvement over past versions of FileMaker, which simply left the broken steps blank instead.

At least now, you know what was there, and it is easier to fix it for the current file.

Lee

Posted

Lee, so this is the new and improved version, huh. Well, it is much easier to use than the other brand, so I'm not really complaining.

Genx, there were some quotes that had to come out, too. My first attempt had appended the text from one of the script lines. Pretty funny!! Oh and adding the dodgy little paragraph return character cured my problems on that other forum.

I seem to be on my way now. Thank you guys! I'm sure I'll be back to chat again soon.

Posted

Lol, dodgy little paragraph character. Someone on these forums ages ago worked out it was called a pilcrow :o

Posted

Lee, so this is the new and improved version, huh. Well, it is much easier to use than the other brand, so I'm not really complaining.

Sounds like a complaint to me.

If your files are prepared correctly, it will minimize this problem. When they are not, then at least FileMaker tells you what you forgot to do, rather than just saying you forgot something.

Lee

Posted

Wheeee HAH!

Ahem. Pardon my enthusiasm, but it works on the three fields I needed to concatenate. Thank you for bringing it up to the point where the interested 5th grader could have finished it. I needed the hand holding.

NOW. Now that I have the records put back together, I can see the flaws in the original text file and the database it came from. I can use the results as they are, but it is not as pretty as I had hoped. The original problem, if you'll recall, was that the originator of the text file exported data from their database to a text file. When one of three particular fields overflowed, they created a new line of text to contain the overflow text. The problem which is now visible is that the new line concatenates the data from all three fields whether the data in one field or another is new or not. For some reason, when they export the data to text, they repeated the data in my three target fields even when it did not overflow. So if one field overflowed to three lines, the data from the other two fields was (superfluously) repeated in each of the two new lines. (am I getting clearer or muddier? - the last three sentences were three ways of trying to say the same thing, so pick your favorite :o )

Genx, and anyone who runs the sample I uploaded earlier, will see that the names in the "grantor" field repeat sometimes. It would be nice to eliminate the duplication within each field.

My only question at this point is whether to open a new topic in the appropriate forum or just continue with the issue here, since Lee and Genx are the only two replying and you two are already familiar with the issues solved so far.

Posted

Sounds like a complaint to me.

If your files are prepared correctly, it will minimize this problem. When they are not, then at least FileMaker tells you what you forgot to do, rather than just saying you forgot something.

Lee

Here again, I am so unfamiliar with the program, and database programs in general, that I don't know what I should complain about or let go. Apparently I should be happy because things have improved so much.

Besides this extra duty of concatenating someone else's database to save someone else money, I also make maps. I've used ArcVIEW and I've used Manifold. In that realm, I KNOW the progress being made and can appreciate the ease of use in recent versions of the software. But if you are unfamiliar with the recent progress, you might complain about the interface on those programs. Same here with me and FMP. Just tell me what to be happy with and I can adjust my temper meter.

Posted

Lol, Michael, you always make me feel like a fifth grader (I'm learnding :o )

But seriously, if you've got a lot to loop through, I suggest you try comment's method instead.

Posted

That certainly is a different approach. To summarize: a second table (empty) and a value list (also empty) must be created before running the script. While the first table allows the document numbers to be duplicated, the second table requires the document numbers to be unique. Once the db is set up, you can run the script. The script imports the records from the first table into the second, and, since the second table requires the document numbers to be unique, the “duplicates” are never imported. However, somehow a magical thing happens when using the value list: when the script is finished, the Grantor field has been essentially concatenated without any duplicate values.

Here’s the script provided by comment...

1. Freeze Window
2. Go to Layout ["CleanUp" (CleanUp)]
3. Import Records [No dialog; "NameOfCurrentFile.fp7"; Add; Mac Roman]
4. Replace Field Contents [No dialog; CleanUp::Grantor; ValueListItems ( Get ( FileName ) ; "Related Grantors" )]
5. Commit Records/Requests []
6. Show All Records

Here are my step-by-step comments in an attempt to understand the script.

1. Stops the computer from refreshing the screen at every step along the way.

2. Another table called “CleanUp” was set up beforehand and not populated. Step 2 in the script puts you there to do the work. The CleanUp table defines the document number as indexed and the number must be unique.

3. This step imports the records from the specified file to “add” to the current table.

4. This one replaces the contents of the field called Grantor with the contents of the Value List called “Related Grantors.” The value list was also established beforehand.

5. Commit seems to be another word for update or save. This step permanently changes the data in all the imported records.

6. Shows all the records. (The window refreshes again on its own once the script ends; Freeze Window only lasts for the duration of the script.)

- Am I correct in assuming that to concatenate three fields I would have to have three value lists established beforehand?

- How does the value list get populated in the first place?

- How does the value list know to use only the data in the records with the same document ID number? I see it works, but why doesn’t the value list fill up with the fldGrantor data from every record? Or should I just leave well enough alone?

- Won’t the value list be alphabetized whether I want it to be or not? If all I want to join are names, it is not so bad, but one field I need to concatenate is general comments. Those really should be in order.

Posted

Am I correct in assuming that to concatenate three fields I would have to have three value lists established beforehand?

Yes.

How does the value list get populated in the first place?

Once it's defined, it's populated - or at least, potentially populated.

How does the value list know to use only the data in the records with the same document ID number?

The value list is defined to show only values from related records. Only documents with the same DocumentID are related.
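For reference, the value list definition behind this would look something like the following (dialog wording approximate for FileMaker 8.x; "Lines" is a hypothetical name for the table holding the imported rows):

Value list "Related Grantors":
  Use values from field: Lines::Grantor
  Include only related values, starting from: CleanUp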

Won’t the value list be alphabetized whether I want it to be or not?

Yes. If you need a different order, then this method is not suitable. You could use the List() function, but that will not delete the duplicates. You may need a custom function to accomplish this.
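If it comes to that, one possible shape for such a custom function (defining it requires FileMaker Pro Advanced; a recursive, untested sketch) that removes duplicates while preserving the original order:

// UniqueValues ( list )
// Keeps the first occurrence of each value, in the original order
Let ( [
  cnt   = ValueCount ( list ) ;
  last  = GetValue ( list ; cnt ) ;
  rest0 = LeftValues ( list ; cnt - 1 ) ;
  rest  = Left ( rest0 ; Length ( rest0 ) - 1 )   // strip the trailing ¶
] ;
  Case (
    cnt < 2 ; list ;
    // drop the last value if it already appears earlier in the list
    not IsEmpty ( FilterValues ( last ; rest ) ) ; UniqueValues ( rest ) ;
    UniqueValues ( rest ) & ¶ & last
  )
)

You could then wrap it around List(), e.g. Substitute ( UniqueValues ( List ( Lines::Grantor ) ) ; ¶ ; ", " ).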
