Audit Trail by Parsing

Brian C · February 24, 2005

I HAVE ATTACHED A NEW VERSION - PLEASE DOWNLOAD

The attachment now has this solution in both v6 and v7 of filemaker.

Last Upload Update: 3/1/2005 4:50PM PST

Update made to description to make it easier to understand.

----

This idea can be applied to auditing, validating multikeys, parsing value list data for conversions, etc. Any field that has data with a delimiting character can be evaluated in this manner.

For this example I decided to outline how do create an audit process.

The advantage of using this approach for auditing purposes is that in the files you want to audit you only need to create 5 scripts, 6 fields (3 calculation 3 global), and 1 relationship. The other benefit is that none of the fields outside of the multipurpose "c_RecNbr" (Record Number) field are stored in any way.

For easing the process of adding this to other tables, once you have created this process in one table, you can then duplicate the same structure into your other tables and import the scripts to make your life easier.

Brian C · February 26, 2005

I'm curious to know any feedback regarding this audit process. Specifically any strengths or weaknesses to this approach compared to other audit trail ideas seen in the past.

SteveB · February 26, 2005

I looked at your files, but you havn't provided enough information to determine what is going on. There are a number of other solutions that track field changes (Ray Cologon of Nightwing, Ugo in Paris, etc.) While all seem to work, they become very difficult to implement when a file has lots of fields, record deletes and creates. Core Solutions just published one which was designed for V7.

I need to do one to synch to a PDA, so I'd be happy to give you any feedback.

Steve

Brian C · March 1, 2005

I have Attached a new version of the files for BOTH v 6 and v7 of Filemaker Pro.

The entire reason why I made this solution is that I wanted to find a way to make it EASY to make a change if fields change in a table.

To add or remove a field from this audit process you only need to make one change to ONE unstored calculation field. The scripts take care of the rest.

Let me know what you think of the current incarnation of this process.

Thanks!

Brian C · March 2, 2005

I have just updated the files in both version so that my description of how the process works makes more sense and is more gramatically correct.

Brian C · March 3, 2005

I made mention in my solution of how you could further parse the data once it reaches the audit table.

In my footnote, I used the example of using a pipeline "|" character to further delimit the field name and field data.

Adding this idea to the original which separates each of the fields by using the tilde "~" your data should look something like this after being evaluated:

Field |name| Data |John Smith| ~

After the data has been sent to the Audit table you could then use the following to further extract each of the two items:

To extract the field name:

Middle(g_AuditFrom, (Position(g_AuditFrom, "|", 1 , 1 ) +1 ), (Position(g_AuditFrom, "|" , 1 , 2) - (Position(g_AuditFrom, "|", 1 , 1 ) + 1)))

To extract the FROM data:

Middle(g_AuditFrom, (Position(g_AuditFrom, "|", 1 , 3 ) +1 ), (Position(g_AuditFrom, "|" , 1 , 4) - (Position(g_AuditFrom, "|", 1 , 3 ) + 1)))

To extract the TO data:

Middle(g_AuditTo, (Position(g_AuditTo, "|", 1 , 3 ) +1 ), (Position(g_AuditTo, "|" , 1 , 4) - (Position(g_AuditTo, "|", 1 , 3 ) + 1)))

Notice that the only thing that changes in each of these calculations is the source of the data (the field name) and what occurance is used in the position functions to determine the beginning and end of each occurance of the pipeline "|".

By using this you can now store just the data and the name of the field. This saves a bit of space in the long run if your tables are very large. This also makes it easier to do a find on specific fields from a specific table.

Now one more footnote:

Something else you can do if you really want to push this farther:

If you decide you want to further separate the data so that text, numeric, date, and time data are stored in their correct field types, you could add a field for each of these types and farther tag your data with a marker that can be extracted and evaluated.

Sample Tags:

T = Text

N = Number

D = Date

I = Time

So your data would look something like the following example after being evaluated:

Tag |D| Field |date| Data |12/10/2004| ~

Using this you could use the same calculations to extract the tag "D" (date) and evaluate it as being a date by using an IF statement. As a result, you then just send the data to the proper field.

SteveB · March 5, 2005

Hi Brian: I downloaded and looked at your Audit Trail. I agree with virtually everything you've done. The calc field c_AudStore1CR can either be eliminated entirely, or replaced by the Substitute function (substitute '~', '

Brian C · March 7, 2005

The Substitue() function is a good idea. Thanks for suggesting it.

When I made the sample files I didn't want to make it too complex so that everyone has the opportunity to learn text parsing. I created the c_AudStore1CR field to do a direct dump for "create" and "delete" to keep things easy. This is fine if your table doesn't have a lot of fields and you're not concerned with anything beyond tracking simple user changes to data. However, if you want to do more than just track changes, or if you have a lot of fields to store, I don't think this approach should be used.

In my own solutions I don't use this approach. I parse the data within the CREATE and DELETE steps within the compare script. I then send the parsed data over to the audit table to have new records created for each field that contains data. I don't create a record for a field that is empty in either of these steps.

Once the data reaches the audit table, I parse the data further so that the tag, field name, and field data can be stored in separate fields. Since I have 4 different types of data, I also have 4 different field types to have this data stored in. By examining the tag, I selectively determine which field the data gets sent to (text,number,data,time).

This helps out later when attempting to do searches on specific types of data.

Since, the field name is now stored in its own field, you can also create a value list based on this field to make searching specific fields easier.

MEMORY LIMITATIONS IN FMP v6:

A text field has a "64k" limitation of 64,000 characters.

A calculation field has a 30k limitation of 30,000 characters on the calculation itself.

A calculation field has a 64k limitation for the data it references. If the evaluated result of a calculation exceeds the 64000 character limit, FMP will not evaluate the calculation, and the result will be empty.

A global field has a 64k limitation for the data stored in it.

MEMORY LIMITATIONS IN FMP v7

A text field has a 1 billion character limitation per repetition.

A calculation field has a 30k limitation of 30,000 characters on the calculation itself.

A calculation field has a 1 billion character limitation for the data it references. If the evaluated result of a calculation exceeds the 1 billion character limit, FMP will not evaluate the calculation, and the result will be empty.

A global field is limited by available memory up to a theoretical limit of 1 billion characters.

The limiting factors depend greatly on the version you are using, how much data is actually stored in each field, and/or how long the calculation is.

To help with setting up your audit steps, I recommend creating an unstored calculation field that contains LENGTH(c_AudStore1). Use this to report the character length of your evaluated calculation to determine when you're approaching the 64k limit as you build the calculation.

I also recommend implementing maximum field lengths. This allows you to gauge how much extra room you will need to reserve so the calculation result won't exceed the memory limit of 64k.

If the data you're tracking is very small, it's possible for the input calculation to approach the 30,000 character limit before you exceed the evaluated limit of the calculation. To keep things to a bare minimum in the calculation field, I recommend removing the descriptive tags which exist only for tutorial purposes.

Example:

"Tag |T| Field |Name| Data|" & name & "|~"

Could become:

"T|Name|" & name & "~"

Which would evaluate to the following if the field had "John Smith" stored in it:

T|Name|John Smith~

This means if you're using any calculations that extract the data once it reachs the audit table, you'll need to fix them so that they extract the data correctly since the format has changed.

Since your solutions have quite a lot of fields, it sounds like you'll probably approach these limts and need to expand this process.

I recommend the following:

Create another unstored calculation field to add your additional fields to.

Create a global field to store the snapsnot of this new field.

Duplicate the same Initialization step for copying the data to the global field.

In the compare script, duplicate the script steps inside of the UPDATE if statement.

Change the references in this duplicated set of script steps from the first calc and global fields to the newly created calc and global fields.

When the script runs and the first loop finishes, just make sure you have 2 set field steps to reset the counters to 1 again before entering the next loop for doing the comparison on the next set of fields.

-----

If you want I could modify the current example I've posted and post a new set of files that include the other processes mentioned in this post.

Brian C · March 7, 2005