Jump to content
Sign in to follow this  
Jodin

need ID's that are simple and powerful

Recommended Posts

we have a large solution that will be bound and distributed, and we hope to update about every 6 months over the next few years. i've been working on my import/upgrade scripts and keep reading about all the troubles with the built in auto-increment ID's that filemaker uses. we have been using them up until now, putting some descriptive text in front of the number. example - our equipment table numbers each record Equip100, Equip101, Equip102, etc.. because we have to reference the equipment by it's number occasionally (like in value lists in related files, etc.) i wanted to keep the number relatively simple.

all the solutions i've seen for custom generated keys are long seemingly random looking numbers. my users are already upset they have to see "Equip100" at all and would rather just use the name field. i've yet to find an easy way to completely shield them from the primary key in a highly relational system. if i move to a generated ID in the form of something like 234-2321-23435 they'll surely complain more, even if i know it would make my import/upgrade process much easier.

is there a way to auto-generate ID's that are more sequential and recognizable by users without using FM's built in serial numbering?

thanks in advance,

jodin

Share this post


Link to post
Share on other sites

Sure, just find the largest ID number and add one when creating a new record. Create a self relationship using a calculation field Constant = 1, create a new records using a script, and include the script step:

Set Field (ID, Max(SelfByConstant::ID) + 1)

-bd

Share this post


Link to post
Share on other sites

thanks for the tip but this isn't quite working for us. i set it all up and it works great for single users but in a multi-user environment it broke down because if two people started to make a new record at roughly the same time it couldn't update the Self relationship quick enough. i trapped for duplicate ID's first once and then twice and then looped the script to keep trying ID's until one was valid (based off of field level validation) but both times we got an occasional duplicate ID. this was with two people running the script repeatedly as fast as their workstations would allow.

i'm thinking that we can continue to use the built-in Filemaker serial numbers because the next time we bind a solution (for our first upgrade) it will be bound under developer 5.5 and better let us use the "reset serial number". whaddya think about that?

jodin

Share this post


Link to post
Share on other sites

kurt -

i checked out your solution and although it looks awesome, i had hoped to keep our serial numbers more sequential. i really love the 'look and feel' of the built-in sequential numbers and have never had an issue as long as you don't have to do an upgrade in the field. but this time we do.... so i'm kinda stuck.

for now we are waiting on word from FM on whether they can guarantee that the "set next serial number" script step works when bound with dev5.5 and if so, when will we have the full 5.5 dev in our hands. if we can bind a copy of our solution under 5.5 before a customer needs an upgrade/replacement we should be okay.

thanks,

jodin

Share this post


Link to post
Share on other sites

Thanks Moon!

Something that I did not really address was the idea of implementing an "end-user ID" into my scheme, mainly because my scheme does not require such a thing. However this is really something that everyone should be aware of.

The user might need and want some kind of ID, which for a variety of reasons is totally unsuitable for relational design. In this case you can easily present the ID to the user, let them design it, change it, delete it, or whatever, but since the database is using a real ID behind the scenes, the user never has to experience a problem.

Again, thanks for the props, and perhaps I will update the article to include some of your ideas on the use of "end-user IDs" as well as some examples of why simple sequential serial number or end-user controlled ID are so dangerous.

Share this post


Link to post
Share on other sites

I thought I would offer my two cents' worth of opinion on the issue of unique ID fields.

First, it is the DATABASES, not the end user, that require unique key fields to operate properly. The end user often can be presented with a different ID field to help them identify records that are not as complex or foolproof as Capt Kurt's scheme. User-oriented IDs can be as simple as a concatenation of initials, a serial number, or any other scheme you may decide the user needs to help navigate and search for records. It is when you get into related files that you need to ensure a rock-solid identifier to keep things matched up. I have found in my experience with clients that ID numbers seldom get (humanly) referenced in their use of database solutions. Rather, REAL values, such as names, dates, and non-abstract numbers are the what us humans use for our interaction. I think that usually unique idenentifiers used for relationships can be very safely hidden from the end user, and a simpler scheme can be presented if necessary for the user.

I want to add an endorsement for how Capt Kurt's ID scheme. I spent a great deal of time analyzing its impact and have embraced it as the best method of creating unique record ID I have encountered. It may be overkill for simple solutions, but I have systems that are in place and updated periodically with twenty to forty-five related files in them, which used to make it a monster to administer when a new clone was introduced, as each clone database had to have its record ID manually changed to to match what the imported file's sequence demanded. It was the single obstacle to making user-accessible updating possible. Converting the solutions to using the twenty-digit method developed by Capt Kurt makes it a snap.

A word on the conversion process. The word is WORK, in capital letters. It took me more than a week of concentrated planning, work, and testing to convert my forty-file database from serial number to CKS (Capt Kurt Scheme). It is not a trivial task, as the old serial number system has to be maintained and every current relationship must be analyzed and temp files created to retain old keys until each realationship is proved and tested. I have two other big systems to convert that are currently on the back burner as they get updated less often, but in spite of the size of the task I will do them, too, as I find the effort worth it. It goes without saying that all new databases I create now use CKS for unique key fields. For most record IDs where users want to see a number, I use the first six digits for display. No complaints from any of my users, and I have queried them all.

Bottom line: Don't compromise unique IDs because of a perception that users depend on them. They don't. If they think they do, give them a simple one that will serve their purposes, but use as ironclad a scheme as you can for your related Key fields. CKS is the very best I have seen for this, and I am very grateful to Kurt for sharing his excellent method with us and for his outstanding documentation in his FAQ. Thanks, Kurt. I am in your debt.

Share this post


Link to post
Share on other sites

To captkurt...

I tried the link to your ID scheme but only got an error 404 -- page not found. Could you possibly repost this? I've just inherited a very large, totally UNdocumented database here at the university and I could use all the help I can get locking the relationships up tight.

Thanks.

Share this post


Link to post
Share on other sites

<AOL>Me too!</AOL>

Or should I say, me neither... the link is dead.

Share this post


Link to post
Share on other sites

Okay, all those endorsements have me ready to check out CKS... but the link doesn't go anywhere. Anyone know the correct link?

thanks

Share this post


Link to post
Share on other sites

Thanks. Very interesting. But it raises a bunch of questions for me... not sure if I should ask here or there... I figured here and you can post a nice pretty summary there, if my questions aren't too stupid...

As a newbie, I am unaware of the problems you mention but don't describe:

1) What's the problem with numeric IDs vs. Text?

I intend to print the ID as a barcode, so I must restrict to numerals in any case.

A number enforces that its numeric and seems more efficient. Are there precision problems or issues with large integers? What are those functions you can do with numeric Text that you can't do with Numbers?

2) What's the problem with serial numbers?

I assume if I check that they must be unique, that it does that check before accepting a serial number. So, as long as I deal with imports and duplicating records (which you must anyway), it seems like it should work fine.

3) Why is non-sequential important?

Is there some reason non-sequential itself is important? Or were you just distinguishing from the serial numbers?

4) For guaranteed unique'ness, is the RecordID not good enough? Documentation says that it never changes. If it never changes, then nothing else could get that same ID in the same database. Thus, the Date/Time additions seem pointless. Unless you are just doing that to make it horribly unlikely that imports would ever collide since getting the same recordID at exactly the same second would be unlikely. But then, why not just use the time.

5) Most of my files will never reach a million... if they do, I'll be a wealthy guy. So, a 10-digit ID is overkill. 20 is ridiculous... and makes for gigantic barcodes. Is there any reason I should want such extra large IDs?

6) What's the value of adding a couple random characters?

Thanks for educating me,

Brian

Share this post


Link to post
Share on other sites

1) What's the problem with numeric IDs vs. Text?

It is not so much a problem as it is functionality. There is NOTHING that you cannot do with text, while there are lots of useful functions for deling with keys, multi-keys and compound-keys which require text.

Also this is NOT an ID that should ever be used for anything other than internally linking your records together. This is NOT what should be printed on a barcode. Use a simple sequential serial number for that.

2) What's the problem with serial numbers?

Is "1" unique? Nope. Is "032456-234321-987690" unique? Pretty much, as defined by my key strategy. Simple sequential serial numbers are unique ONLY the the given database, plus they are of a given range and that implys meaning. Meaning is bad for keys, because it implys change.

3) Why is non-sequential important?

Is there some reason non-sequential itself is important? Or were you just distinguishing from the serial numbers?

Sequential implys a range a values, it also implys meaning. Non-random, non-sequential strings of 20 characters have no pattern, no meaning, no range of values. This is what you want from a key in a database.

4) For guaranteed unique'ness, is the RecordID not good enough? Documentation says that it never changes. If it never changes, then nothing else could get that same ID in the same database. Thus, the Date/Time additions seem pointless. Unless you are just doing that to make it horribly unlikely that imports would ever collide since getting the same recordID at exactly the same second would be unlikely. But then, why not just use the time.

Share this post


Link to post
Share on other sites

Also this is NOT an ID that should ever be used for anything other than internally linking your records together. This is NOT what should be printed on a barcode. Use a simple sequential serial number for that.

But my barcodes must be unique... they go on member ID cards and on product tags... thus, they might as well also be the ID used in my database. Why not?

And given that, it returns me to your previous point:

It is not so much a problem as it is functionality. There is NOTHING that you cannot do with text, while there are lots of useful functions for deling with keys, multi-keys and compound-keys which require text.

Yes, & is simple for implementing compound keys. But given I restrict my barcode numbers to have the same number of digits (e.g. 100000 to 999999), conversion to text is easy enough.

Is "1" unique? Nope. Is "032456-234321-987690" unique? Pretty much, as defined by my key strategy. Simple sequential serial numbers are unique ONLY the the given database,

I was only trying to be unique for a given database. And strictly speaking, yours aren't guaranteed to be unique either. Sure, the odds are much better, but its not a guarantee. Why try to be unique across all databases? (Again, not being argumentative... just trying to figure out what I am missing... what am I going to run into in the future that I don't know about now.)

plus they are of a given range and that implys meaning. Meaning is bad for keys, because it implys change.

I have to argue against this! My codes have no meaning... yours have meaning! Mine are simply an arbitrary number with N digits. Yours are the creation date, creation time, and database reference ID. The latter is meaningless, but the former two are meaning-rich. Sure, its "scrambled"... but its there and has meaning. I could output a report into Excel and sort by creation date... using your ID field.

This key will survive any kind of import or synchronization with no problems.

Practically, yes its unlikely to collide. But theoretically, this is false. Let's say I make a copy of the database. As soon as its done, I go to make a change on my copy and someone else is freed up to make a change on the original. We both add the new record at the same Time... since its an exact copy, it could get the same ReferenceID... same ID.

On import, either way, I would re-generate the IDs. Isn't that standard practice?

In reply to:

------------------------------------------------------------------------

6) What's the value of adding a couple random characters?

------------------------------------------------------------------------

It was suggested by another member, who felt that some additional randomization was needed. I think that it is a good idea, although largely unnecessary.

I assume the goal is to ensure there is no "meaning". This seems like textbook warnings taken too far. In the old days, people used meaningful codes for IDs. And people would use them as informational fields as well as IDs. BAD... because when the information changed, the ID had to. When the ID couldn't, confusion resulted. The lesson learned is NOT that a sequential field is bad because it has "sequential meaning" or that you need "random numbers"... the lesson is to never let the user use ID fields for any real content. A sequential numbering *is* the recommendation... it has no useful info for the user.

As for barcodes, the textbooks say the same thing about barcodes. They should not be informational fields. They should be meaningless numbers in some range.

And the barcode is facilitating a lookup from the physical world database made up of member ID cards to my members database in the digital world. Exactly the same thing.

Am I missing something???

(I probably am... and you guys can probably save me some huge headaches in my future by explaining what that is.)

Thanks.

Share this post


Link to post
Share on other sites

Lots of questions...good.

But my barcodes must be unique... they go on member ID cards and on product tags... thus, they might as well also be the ID used in my database. Why not?

Any data that is accessed by the user is ANY way is subject to change. Even if you have totally locked down you solution, it still has the possibility of change. This is not an attack on you system or its development, simply good relational design and database theory here.

I was only trying to be unique for a given database. And strictly speaking, yours aren't guaranteed to be unique either. Sure, the odds are much better, but its not a guarantee. Why try to be unique across all databases? (Again, not being argumentative... just trying to figure out what I am missing... what am I going to run into in the future that I don't know about now.)

My system is NOT guaranteed to be unique. The odds are so extremely against it as to make it impossible without deliberate effort. However in many years of testing across thousands of DB in dozens of systems, I have yet to have a single duplicate. Baring deliberate effort of course.

I have to argue against this! My codes have no meaning... yours have meaning! Mine are simply an arbitrary number with N digits. Yours are the creation date, creation time, and database reference ID. The latter is meaningless, but the former two are meaning-rich. Sure, its "scrambled"... but its there and has meaning. I could output a report into Excel and sort by creation date... using your ID field.

Every alpha-numeric sting of characters has meaning at some level. Yes my scheme follows a pattern, but as far as the system is concearned it has no meaning. Users never have access to the ID in my system so they never even know it exists and thus has no meaning to them.

Practically, yes its unlikely to collide. But theoretically, this is false. Let's say I make a copy of the database. As soon as its done, I go to make a change on my copy and someone else is freed up to make a change on the original. We both add the new record at the same Time... since its an exact copy, it could get the same ReferenceID... same ID.

Theoritically possible, but extremely unlikely. Assuming that records are created one for one in two copies of the same DB, the RecordIDs will stay in sync. However it is very unlikely (except through deliberate effort) for the exact same records to be created in the exact same pattern, on the same date at the same time. (This was actually part of the reason for the inclusing of the random characters).

On import, either way, I would re-generate the IDs. Isn't that standard practice?

Why bother with the extra effort? Why not just use a system where this is unnecessary?

I assume the goal is to ensure there is no "meaning". This seems like textbook warnings taken too far. In the old days, people used meaningful codes for IDs. And people would use them as informational fields as well as IDs. BAD... because when the information changed, the ID had to. When the ID couldn't, confusion resulted. The lesson learned is NOT that a sequential field is bad because it has "sequential meaning" or that you need "random numbers"... the lesson is to never let the user use ID fields for any real content. A sequential numbering *is* the recommendation... it has no useful info for the user.

No this system was specifically designed to address REAL WORLD issues, although it is also a classic example of good relational database design. This was not developed as a theory, but actually for practice in a system where database synchronization was used, and it was further refined in a system with over 400 database on 5 servers used by over 300 users.

As for barcodes, the textbooks say the same thing about barcodes. They should not be informational fields. They should be meaningless numbers in some range.

And the barcode is facilitating a lookup from the physical world database made up of member ID cards to my members database in the digital world. Exactly the same thing.

Am I missing something???

(I probably am... and you guys can probably save me some huge headaches in my future by explaining what that is.)

Why in the world should a barcode be meaningless? It is a way of identifying a document and that has lots of meaning.

I think that you barcode should be very meaningful, regardless of how you actually generate it.

The only thing that I am suggesting is that you NOT use this as part of the underlying linkage that ties your databases together.

Share this post


Link to post
Share on other sites

Thanks for putting up with my questions. You've mostly convinced me... but with mods...

Why bother with the extra effort? Why not just use a system where this is unnecessary?

Unlikely to be necessary. But, if you want to be absolutely safe, you must re-generate.

But I think I am seeing the WISDOM of your technique... it is sooo unlikely to generate the same RecordID at the same second in time, that you can virtually ignore the possibility and pretend you have a unique ID for all objects in all databases. And then you can cheat and use them that way in exports and imports... and that is sooo much easier. Yes, there is small chance of corruption, but that is so rare that it can be ignored (as long as your database isn't life-and-death). Just go edit the field when discovered... probably never happen.

That works for me...

So, now the other issue... my barcodes...

Why in the world should a barcode be meaningless? It is a way of identifying a document and that has lots of meaning. I think that you barcode should be very meaningful, regardless of how you actually generate it. The only thing that I am suggesting is that you NOT use this as part of the underlying linkage that ties your databases together.

The Barcode is just another way of generating an external database that you need to re-import records from. For that you need a universally unique ID code, just like when cloning a database. By using your ID, I could potentially clone a database, continue generating people's ID cards on both databases independently, import all the changes across, and likely never have a collision. Cool!

The only downside is your IDs might be too long for many barcode readers. However, I could easily shorten it up and not increase the odds of collision in practice by much at all. For example, the hyphens add nothing but readability... drop those... users aren't supposed to be reading them. The first two digits of the date won't change for 30 years. Drop them without thought. The third of the date only changes every 3 years. Minimal value... drop that. The first digit of the time is always zero... no value... drop that. So, now I am down to a 14-digit code. I think that's a workable barcode.

Right( "00" & Status(CurrentDate), 3) &

Right( "0000" & Status(CurrentTime), 5) &

Right( "00000" & Status(CurrentRecordID), 6)

Do you see any problems in shortening it up that way?

Thanks.

Share this post


Link to post
Share on other sites

My main reason for the padding and the dashes was simply to use the full 20 characters. Your 14 character idea is fine, especially since you have clearly stated the possible problems and decided that dropping those characters will not affect you.

I am glad that our discussion on this topic has help you in your development.

Share this post


Link to post
Share on other sites

Also as a clarification on Kennedy's use of a 14 digit key. His scheme is exactly the same as what I described, where as I padded the number out to 20 digits and added a "-" to seperate the components, Kennedy omits the dash and only pads for 14 characters.

Other than the specific personal preference on size and appearance, we are in agreement on the implementation of the key scheme.

Good job on the templates, I especially like the column sorting deal. I may just steal that!

Share this post


Link to post
Share on other sites

What is the issue with the built-in serial numbers, except for "update" (export import) situations? In that case, I would save the current serial number to an "update control" database and set it from there again. The current serial number can be found from a New Record/Delete Record command in three lines (if not from a status function - I'm stranger).

And regarding the bar code application: it does not hurt to have two unique ID's. one is the FM internal record key, and one is the unique member number.

Share this post


Link to post
Share on other sites

nope it does not do this, at least I did not find it. It explains the very clever ID-time-date concept.

Share this post


Link to post
Share on other sites

What exactly are you looking for?

The "built-in" serial number in Filemaker is a problem because it is a simple sequential serial number. Without some massive effort on the part of the developer, with validation or external storage it is impossible to adequately protect the contents of the field or to insure that it is unique. It is very easy to override, to replicate, to duplicate, etc.

That said, this was also NOT the point of this function, since Filemaker internally stores a unique number for each record. This was really just designed as a simple sequential number, good for a whole load of things, but not for any kind of unique ID.

Share this post


Link to post
Share on other sites

Kurt, me is somehow new to Filemaker but the data base involved is already large and I'm browsing.... I would be alerted if it would break down easily because of the use of serial numbers, as FM is known to have a few bugs, and I encountered some even fewer already.

From what is written, serial numbers are not that unreliable. FM doesn't, in a multiuser environment, assign two records the same number. It seems that the only issue, on the FM level, is the starting value after a data base update.

FM isn't rock-solid anyway. Its great advantages are elsewhere. I completely agree however, that the 20-digit record id scheme makes sense.

"Peter"

Share this post


Link to post
Share on other sites

And, what happens when you need unique serial-sequential numbers?. Example: New Invoice (invoice number will be key to the dependant Invoice details table)

Share this post


Link to post
Share on other sites

Hi,

 

As I am going thru many interesting subject of this forum, I want to share my approach of ID numbers. I have read the serial number incremented by one and automatically set is the best, there are also people they don't like this formula....!

In my opinion, the number can be what we (each one wants), when I was in Mexico, yes in Mexico, I found the ID card the citizens use is in fact their credential for voting. So, the document is kind of highly secure, and they use a number which is unique to each mexico citizen (110 millions in 2010)

The number is calculated based on the fields of their Voters database. Name, Father name, mother name, date of birth, place of birth (inlucing city and state).. By the way it should be something similar in the USA for voters ID reference..

Of course I made a deviation of this unique number for a database which purpose is to list victims of any natural disaster.

 

 

Left (Familia::JDFApellidoP ; 3 ) & Left (Familia::JDFNombre ; 2 ) & Right (FechaNacimAño ; 2)  & Left (FechaNacimMes ; 2)  & Left (FechaNacimDia ; 2)  & Left (Familia::JDFApellidoM ; 3 )

 

The 15 characters are very secure, and the probability of having two persons with exactly the same names, date of birth, remains remote. In the worst case of finding two equal ID numbers, the number of them make the problem solved with manual changing one of them. Of course, other parameters can be added to the calculated field to make the number larger than 15 char., and even more secure.

 

I am interested in reading opinions on this matter

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Sign in to follow this  

×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.