Been racking my brain: Supertype / subtype

dmontano · October 13, 2008

I want to "try" this table structure:

1. Supertype Table: Parties

2. Subtype Table of Parties: People

3. Subtype Table of Parties: Organizations

Sample data entry flow would be:

Data is entered only in the Subtype tables (People or Organizations) which creates a one-to-one relationship automatically to the Supertype table "Parties". I can get the tables to connect automatically using the method Fenton had provided some time ago, at this link:

http://fmforums.com/forum/showtopic.php?tid/194695/tp/1/

but that method requires a field from the Supertype table to be on the layout of the Subtype tables that the User would need to enter some data or select a value in order to trigger the connection.

My problem lies in the fact that the "Party" supertype table is acting only as the structure to "represent" these two subtypes to be used in other solution relationships as described in some of the books and papers I have been researching.

Due to the fact that the attributes for each of these subtype tables are unique to themselves, they share no attributes which leaves the supertype table with just the "Party" primary key. In the books, they refer to using the "Organization" name and the concatenation of the First and Last name of a person from the people table to provide a value in the "Parties" table (I think).

But I am "trying" to implement the one-to-one as described in the link above and can not get it to work.

On my subtype table layout I have no real mandatory field from the supertype table that a user would enter or select a value from to trigger the one-to-one connection.

If ANYONE has any experience with actually using, implementing, or understanding the implementation of Supertype / Subtype tables... your help would be greatly appreciated.

Dave Graham · October 13, 2008

Data is entered only in the Subtype tables (People or Organizations) which creates a one-to-one relationship automatically to the Supertype table "Parties". I can get the tables to connect automatically using the method Fenton had provided some time ago, at this link:

http://fmforums.com/forum/showtopic.php?tid/194695/tp/1/

I had no idea I'd become famous! ;-)

but that method requires a field from the Supertype table to be on the layout of the Subtype tables that the User would need to enter some data or select a value in order to trigger the connection.

Correct.

On my subtype table layout I have no real mandatory field from the supertype table that a user would enter or select a value from to trigger the one-to-one connection.

The reverse join technique is a handy shortcut; nothing more. In your case you should create a custom menu and script to control the new record creation. The script would switch context to the PARTY table, create the record, capture it's primary key, switch context back to the subtype table, create the record and set the PARTY_ID value.

- Dave

comment · October 13, 2008

I'm afraid that no matter how you implement this, you might be disappointed with the result. If the subtypes do not have any shared attributes, the usefulness of the supertype table will be quite limited. It will be able to display its SerialID and some calculated data to its parent table, but in order to provide any functionality beyond that you will need to employ some fancy jumping around, which might not be worth the final result.

Dave Graham · October 13, 2008

If the subtypes do not have any shared attributes, the usefulness of the supertype table will be quite limited.

Actually, it's quite the opposite of what you've mentioned. There are two huge advantages of the party model implementation, regardless of whether they share any common attributes:

1. The supertype entity binds all of the subtypes together in a logical Table Occurrence Group. All of the related entities and scripting can be shared amongst the various subtypes, which is a tremendous advantage. You can also copy and paste groups of layout elements between layouts based on the various subtype TOs and they just "work".

2. Reporting or batch operations such as mailing campaigns can be done from a single context without having to jump around to gather data.

It will be able to display its SerialID and some calculated data to its parent table, but in order to provide any functionality beyond that you will need to employ some fancy jumping around, which might not be worth the final result.

One of the major whole points to implement the party model is to prevent "jumping around". I think you may have missed some of the critical elements of this approach.

dmontano · October 13, 2008

Thanks David and Comment,

I hope this discussion really continues. I have researched high and low to find examples of the implementation of the "Party" concept into a physical database - and have come up quite empty.

I have found numerous white papers, books, and educational material in the form of slides, etc., and the following "Macro" concept of Supertype and Subtype is presented. However, these are always presented as "logical"models. Since I am ignorant to database design to a high degree AND I am not formally trained to interpret the logical model into a physical one - I get stuck quite easily.

I fully understand that a logical model can be implemented in a few ways, but have no experience to see when one situation calls for one implementation over the other NOR do I know how to implement the Super/Sub in Filemaker.

Before I was exposed to the Super/subtype concept, I frequently ran into "entities" that where very difficult to model in a clear and concise manner. It all started with "Equipment". I could have joined the masses and stuck it all in one table - it would have been ugly and it would have had some complexities in its own right. But then I started to see more and more corallaries with other entity types. So, I thought I should wrap my head around the concept of supertype / subtype and learn how to implement it and then I would be able to see what complexities it presents so I can become better at making the right choice as to how to build my entity structures and relationships.

Another case in point that I came across was this:

I built an Employee table (for our employees)

then I built a table for Vendor Employees

the structures where practically identical. Same for Vendor Companies, Customers, etc.

I could have continued building my entities like that, but then I realized that I would be building a solution that was predestined to take one approach for good.

Given the material I have been reading - the Supertype / Subtype approach seems to be anticipatory of what may come down the pipe and not be tied up because of the data structure.

I am not professing this as the way things should be done. I have expresses that I am ignorant in this area. So I want to build it and see what its pros and cons are. I need help building a few connections and be exposed to the techniques that should be employed when encountering one particular type of Supertype/Subtype structure versus another.

Sorry so long winded, but I hope this will put what I want to learn in perspective.

Thanks!

comment · October 13, 2008

Given a structure of:


Parent -< Supertype - TypeA

              |

            TypeB

1. You cannot show data in a portal from Parent to Supertype, without using either a calculation field or stacked fields. Now, which fields should be paired together for this, if the subtypes do not have any shared attributes?

2. The same is true for a report produced from the Supertype table. What are the columns for such layout? It's easy to combine CompanyName with a person's FullName; but when SubtypeA has fields for Color, Size and Material, while SubtypeB has fields for Country, Province and ZIP code, then the layout design becomes quite difficult. Or one could simply leave room for both - which would be the same as having both subtypes in a single table from the beginning.

3. You cannot easily GTRR from Parent to a subtype - you need to figure out which subtype is the correct one first.

I could go on, but I think that's sufficient. I am no stranger to this model - I have used it as far back as version 4. It can be VERY useful in some situations, for example when different transaction types need to have an entry in a common ledger. But it's not a magic cure for anything and it has costs alongside the benefits.

Dave Graham · October 14, 2008

Given a structure of:
Parent -< Supertype - TypeA

              |

            TypeB
1. You cannot show data in a portal from Parent to Supertype, without using either a calculation field or stacked fields. Now, which fields should be paired together for this, if the subtypes do not have any shared attributes?

I set indicator fields in the supertype table to flag the type(s) of related subtype records. This can be done automatically via relational create or via script. It's actually quite trivial.

2. The same is true for a report produced from the Supertype table. What are the columns for such layout? It's easy to combine CompanyName with a person's FullName; but when SubtypeA has fields for Color, Size and Material, while SubtypeB has fields for Country, Province and ZIP code, then the layout design becomes quite difficult. Or one could simply leave room for both - which would be the same as having both subtypes in a single table from the beginning.

I don't get it. You first imply that it would be difficult to determine the columns for the layout then describe the simple nature of adding a column from wherever the source field resides; quite a simple process really. How does this become difficult? I don't get what you are saying at all.

NB, if you had fields for Country, Province and Postal code in a contact subtype table then you most certainly have a flawed data model. These fields belong to an ADDRESS entity and should exist in their own related table. The ADDRESS table can be further broken down to be a subtype of a CONTACTMECHANISM supertype, in the same fashion as the Party Model does for contacts.

3. You cannot easily GTRR from Parent to a subtype - you need to figure out which subtype is the correct one first.

I don't understand the logic again. How would you navigate without first knowing where you want to go? If you know where you want to go, you'd know which one is the correct one to begin with.

I could go on, but I think that's sufficient. I am no stranger to this model - I have used it as far back as version 4. It can be VERY useful in some situations, for example when different transaction types need to have an entry in a common ledger. But it's not a magic cure for anything and it has costs alongside the benefits.

There was simply no easy way to implement a super/subtype model with FileMaker prior to version 7 due to the limitation of the relationship model. I'm beginning to understand how you might not be crazy about this now; I suspect you haven't revisited it lately.

I'm finishing up a conversion project now that heavily leans on these principles. When the project began in FileMaker 2 it was a bunch of separate database files. Over time it grew and grew into a massive bunch of silos (i.e., contact information literally existed in eight distinctly separate groups of files). When we converted this system we merged all contact information into a single party table, and all of the former silos became subtype tables, which branched off of it. The system is currently in currently in use by 65+ users in four offices and contains >200 tables. I can tell you unequivocally that it's both simple to implement (once you grasp a few tricks) and hugely beneficial with almost no "cost" to speak of.

While it is easier to deal with some of FileMaker's quirks by placing all of the attributes in the supertype table, it makes for a far cleaner and more semantically correct model to keep the attributes where they belong.

Ocean West · October 14, 2008

Dave, the next logical extension to this model is usually the discussion regarding temporal data, such as keeping historical data - would care to enlighten the inquiring minds :

Dave Graham · October 14, 2008

I hope this discussion really continues. I have researched high and low to find examples of the implementation of the "Party" concept into a physical database - and have come up quite empty.

There are some minor quirks to implement it in FileMaker, but it's largely just a matter of creating the tables and relationships just like any other RDBMS.

I fully understand that a logical model can be implemented in a few ways, but have no experience to see when one situation calls for one implementation over the other NOR do I know how to implement the Super/Sub in Filemaker.

I've attached an old example file I whipped up. It's a bit rough around the edges but demonstrates how you can quickly create a related Contact supertype record from a Company or Person subtype layout.

NB, you will see that I break my normal Anchor Buoy conventions and use bidirectional communication from layouts based on a buoys. This is key to unlocking a huge benefit from this model and is well worth the slight deviation from the rules.

Another case in point that I came across was this:
I built an Employee table (for our employees)

then I built a table for Vendor Employees

Both of these entities in most cases will be modeled on a single PERSON table. An employee could also have related EMPLOYMENT records, but that's a separate entity. If you know for a certainty that an attribute belongs only to a single subtype then it's safe to add it in the appropriate subtype table. For all others it is bother more correct and requires less time to just add it to the supertype and be done with it. Be careful how far down the path of normalization you go; the longer the relationship chain the more difficult it can be to do things in FileMaker.

Given the material I have been reading - the Supertype / Subtype approach seems to be anticipatory of what may come down the pipe and not be tied up because of the data structure.

I think it's more correct to say that proper data modeling provides flexibility and scalability. Employing sub and supertyping is part of the structure and is the proper way to handle certain models like the PARTY model (among others).

You are definitely on the right track. Experiment and explore this a bit further and you will not regret it.

Contact_Type_Example.fp7.zip

Dave Graham · October 14, 2008

Dave, the next logical extension to this model is usually the discussion regarding temporal data, such as keeping historical data - would care to enlighten the inquiring minds

When you hear about temporal data it sounds like a big deal. In fact it's just a simple matter of not overwriting things that change over time. It's useful to see how information changes and for auditing as well.

For example, what if you needed to know the address on file for a customer last August? While this may or may not be important to you, it could be for some industries (credit reporting agencies for example). Rather than write over the address when a customer moves, you simply indicate the date(s) that the related entity was valid for (e.g., Date From, Date Thru), and optionally an indicator field for the current one.

Another example:

Let's say that you manage a sales database and you are wisely using the Party Model. You could have a LEAD subtype for those contacts in your system which represent a sales opportunity. Rather than deleting a lead if there is no further opportunity, you should instead just expire it, along with a note as to why it has been expired. Since a contact could become a lead many times over it's lifetime, it could be useful to know which contacts were former leads and optionally research how those opportunities worked out in the past.

comment · October 14, 2008

I'm afraid you are ignoring the starting point:

Due to the fact that the attributes for each of these subtype tables are unique to themselves, they share no attributes which leaves the supertype table with just the "Party" primary key.

Now let's say I am a user looking at a portal to the supertype table: what do I see (other than a serial number)?

Or let's say I am looking at a report produced from the supertype table: what are my column headings?

If you are not using such portal, and do not need to produce a union report (I notice your demo does neither), you can safely ignore my reservations.

Dave Graham · October 14, 2008

Now let's say I am a user looking at a portal to the supertype table: what do I see (other than a serial number)?

Usually there are common attributes, but either way it's not terribly important since it's not a requirement. Even if you don't have shared attributes, you are likely to see other data such as related phones, addresses, invoice history, call notes, etc. which are related to the PARTY.

Or let's say I am looking at a report produced from the supertype table: what are my column headings?

Any related field from the super and/or subtype tables, or any other related field for that matter. How is this different from any other report?

comment · October 14, 2008

Even if you don't have shared attributes, you are likely to see other data such as related phones, addresses, invoice history, call notes, etc. which are related to the PARTY.

If you can see that, then they DO have shared attributes. Whether they are in a related table or not makes no difference in the current context. If one of the subtypes doesn't have ANY address (local or related), then such portal is not going to be very useful for that subtype.

Usually there are common attributes, but either way it's not terribly important since it's not a requirement.

I didn't say it was a requirement. I said OP might be disappointed with the result.

How is this different from any other report?

It's different because when the attributes of each subtype sre different (or even just different enough), then each subtype has different and conflicting requirements upon the layout's real estate. This is true for every union report, regardless of how it's implemented.

Dave Graham · October 15, 2008

If you can see that, then they DO have shared attributes. Whether they are in a related table or not makes no difference in the current context.

A shared attribute would be a field in the PARTY table. An address field would be a related field in an ADDRESS table. These are not the same thing and it does make quite a bit of difference in the current context. The purpose of relating ADDRESS, PHONE and other such tables to the PARTY is so that they can be shared amongst all of the subtypes from a single context without duplication.

If one of the subtypes doesn't have ANY address (local or related), then such portal is not going to be very useful for that subtype.

I can't follow this line of thinking. Even if one subtype could not have an address -- which is purely a hypothetical line of thinking because in the real world all PARTY subtypes would --, how does this limit the tremendous overall advantage otherwise offered by the PARTY model to other contact subtypes?

I didn't say it was a requirement. I said OP might be disappointed with the result.

You're entitled to your own opinion. My own experience, the experience of many expert colleagues of mine, and the published works from many experts in data modeling tell me that this is not true.

It's different because when the attributes of each subtype sre different (or even just different enough), then each subtype has different and conflicting requirements upon the layout's real estate. This is true for every union report, regardless of how it's implemented.

I can't grasp what you are referring to. Any different field has an effect on a layout's real estate, and is in no way limited to the PARTY model. There's simply no way that these fields can somehow conflict with one another.

I'll give you a concrete example of some reports that can be done simply from a layout based on the PARTY supertype. Say that I want to produce a demographics report showing every contact in my database subsummarized by zip code. From the party table I can include customers, vendors, employees, etc (some or all) in this report. What if I wanted a list of all contacts with e-mail addresses? Again, it's trivial if you have properly modeled your data.

I think it's time that you whipped up a sample file that illustrates some of the challenges you are facing. Without it I suspect we could go around and around like this forever and never agree.

comment · October 15, 2008

We are indeed going in a circle here. You say that it's hypothetical for the subtypes not to have shared attributes. But that's exactly what OP said. You keep giving examples where there ARE shared attributes, in order to prove the validity of the model. I have no dispute with the model and I have already said it can VERY useful - when the real-life situation fits. I don't know what the real-life situation is here, and OP could be very well mistaken in his assessment (it does seem strange for contacts not to have any shared attributes) - but that's what he said, and my reservation was, and still is, directed at that.

If you need an example of a difficult union report, then consider this:

CUSTOMER: Adam

CHARGES

Labor:

• Design | 5hrs | $50 | $250

• Coding | 3hrs | $75 | $225

...

Materials:

• Widgets | 3" | 7" | Red | Office | $25

...

External Expenses:

• TV Spot | 30" | WXYZ | 2008-10-15 | Prime | $750

...

PAYMENTS

• 2008-10-15 | Check | #1205 | Bank of Singapore | $1000

...

BALANCE

$250

CUSTOMER: Betty

...

This may seem a bit contrived, but I have real-life examples that are not too far away from it. In fact, I have one that's even worse.

Dave Graham · October 15, 2008

We are indeed going in a circle here. You say that it's hypothetical for the subtypes not to have shared attributes. But that's exactly what OP said.

He said that there were no shared attributes between PEOPLE and ORGANIZATIONS. It's perfectly acceptable for these two PARTY types to not have shared attributes.

You keep giving examples where there ARE shared attributes, in order to prove the validity of the model. I have no dispute with the model and I have already said it can VERY useful - when the real-life situation fits. I don't know what the real-life situation is here, and OP could be very well mistaken in his assessment (it does seem strange for contacts not to have any shared attributes) - but that's what he said, and my reservation was, and still is, directed at that.

Again ... related data is NOT the same as shared attributes, and this is the entire justification for the validity of this model. If the PARTY types he mentioned had no shared attributes and NO similar related data then yes, I would agree with you. While he didn't mention it specifically, you should infer that the entities he wishes to track would no doubt have a lot of related entities in common (even if they aren't shared attributes).

I have attached a modified version of my sample file, which I cleaned up a bit. This new sample is properly normalized and has no shared attributes between PERSONS and ORGANIZATIONS.

If you need an example of a difficult union report, then consider this:(...)

This may seem a bit contrived, but I have real-life examples that are not too far away from it. In fact, I have one that's even worse.

I'll grant you that a report which contains joins from multiple tables with multiple related records can be a little tricky to do in FileMaker, which forces you to rely on portals or a dedicated reporting table. That example doesn't compare well with what we're talking about, since 99% of the time there is a one-to-one relationship between a super and subtype. Even when there isn't a 1-to-1 relationship, there should only be a single current subtype for each supertype. Since you don't have to show many related records on supertype reports, it's no more difficult to report on this data than it would be if you had all of the fields local to the supertype table. I still don't understand how this model could impose "different and conflicting requirements upon the layout's real estate?"

dmontano · October 15, 2008

Whle I haven't a chance to look at the file you posted David, I am very interested in looking at it when I get home.

As for my original statement of not having any shared attributes, I have stumbled upon an observation last night when I was staring at my tables, layouts and my dilemma....

Party is somewhat unique in comparison to my previous attempt at wanting to use a Supertype / Subtype model without realizing it (this was the issue that Fenton had provided input on and the link I posted in the first post.).

Equipment "seems" to be an easier fit to implement. Shared attributes for the Supertype would be:

Manufacturer (I now think manufacturer would reside in subtype Organization; supertype Party)

Model

Make or Brand

then the various equipment subtypes would hold the data particular to their characterists (Camera would have focal length, body type, shutter speed, etc.. While Chainsaws would have a totally different set of fields in which to record their characterists.) I am being extreme in my example and I believe the main point is that only the "Equipment" items that warrant the tracking to such a degree would be held in this type of data model.

Therefore the Equipment Supertype would have relatively meaningful data to present at the Equipment Supertype level for a portal or a report. Equipment is easier for me to see the separation and what fields would go where.

Party on the other hand has my head twisted. I have been so accustom to describing a person entity by their role: vendor, employee, customer, etc. I still am prodding along trying to become more aware of these ethereal-to-me entities. I struggle.

Hi Comment, your comment about the sole attribute ID got me thinking and after trying to get it to work, I experience the difficulty of triggering the related record creation and the User (me) experience. I absolutely could not get a connection made that I could use for a User interface.

Thats when I thought, "Why don't I try and move the name fields up to the Supertype. Then place those related fields on their respective subtype table's layout and upon entry in those fields the related record is cerated in the Supertype. It does work. Told everyone I was ignorant. I did not know that I could do that - but I probably have been doing it for a while - if that makes any sense.

David offered a scripted method and I built that and it does work, but how to "discriminate" between types is a challenge for me and still is - will work on this. I think the scripted method could get over my hump, but it did shed some light on what "I" would be facing. Given the fact that you know my extreme limitations and database noviceness - your comments are helpful as they provide some insight into the challenges I may face.

However, my brain is seeing the data in a particular organization scheme and placing fields in dedicated tables is much more logical to me. I have a hard time keeping things straight - and Supertype / subtype gives me that framework. As I eluded to previously - I have not turned up via research example physical files in which I can implement and see for myself - in particular... using FileMaker.

If anyone happens to have a copy of Len Silverston's book (Page 30 I think) it would illustrate what I am looking at and what I am so uncertain about. When the "logical" data model is illustrated with "Subtypes" inside of "Subtypes" inside of "Subtype" inside a "Supertype" it does makes sense. What does not make ANY sense to me is how to build the physical from the logical illustration. For now, all I have been able to do is approach the model literally and build tables for each of the logically modeled subtypes.

This discussion has turned out to be the most helpful bit of information I have found. My fingers are crossed that the discussion continues.

Dave Graham · October 15, 2008

If anyone happens to have a copy of Len Silverston's book (Page 30 I think) it would illustrate what I am looking at and what I am so uncertain about. When the "logical" data model is illustrated with "Subtypes" inside of "Subtypes" inside of "Subtype" inside a "Supertype" it does makes sense. What does not make ANY sense to me is how to build the physical from the logical illustration. For now, all I have been able to do is approach the model literally and build tables for each of the logically modeled subtypes.

Keep in mind that Len provides a lot of information, some of which is optional for certain industries or applications. You are likely to get a little buried if you start off trying to implement his complete models! }:( -)

You can see from my first example file that I denormalized the PARTY types a bit and put all of the individual attributes in the PARTY table. This gives use the ability to easily create sub and supertype entities in one shot via relationship. The modified sample I just posted would require you to script the creation of the records but is a better example of a properly normalized database.

dmontano · October 15, 2008

Sorry for my confusing previous post:

Here is my observation about no shared attributes:

PERSON

First Name

Middle Name

Last Name

ORGANIZATION

Name

These are not "physically" shared fields. However, are they shared attributes logically? The "data structures" are different AND separated as illustrated in the field label prompting for that type of "specific" value from the User.

On one hand - they are not shared if viewed from this physical perspective. On the other hand - in reality - they share the same "meaning". They are attributes that as humans we use to convey WHO is the party we are referring to.

Because the fields themselves are NOT shared, I came to believe they were not shared. Maybe this will reveal how literal I take things and maybe that is a fundamental mistake.

I have no idea - nor leg to stand on - if this observation is true or false? Are these shared attributes? If so, do we agree the fields as presented above a physically separate?

If my assumptions are correct, does one concatenate the "PERSON" FirstName, MiddleName, LastName; then for ORGANIZATION use "Name" as the values for the Party Name field? Seems obviously simple - but the fact that I can not store a calculated field from a related table in another table is problematic and may be one reason why this model is not used more in FileMaker.

However, if it is advisable to place these "meaningfully shared attribute" type fields originating from the Supertype table and place them on their respective subtype layouts - that would seem to address that issue.

comment · October 15, 2008

you should infer that the entities he wishes to track would no doubt have a lot of related entities in common

If by "related entities" you mean things like addresses and phones, I'm afraid I'll have to disagree. In a contacts database, addresses are NOT an entity (something we need to track), but an attribute (something we want to know about a tracked entity).

Having multiple addresses per contact can promote them from local fields to related records - but it doesn't change their nature. Addresses ARE an entity in a real-estate agent's or a landlord's database. But in a contacts database, when a contact is deleted, its addresses are deleted along. Their only reason of existence is to describe the contact - and I believe that's as good a definition of an attribute as you'll find.

That example doesn't compare well with what we're talking about, since 99% of the time there is a one-to-one relationship between a super and subtype.

Actually, that's exactly what it is (or can be) }:( there's a supertype of Transactions, with a one-to-one relationship to various transaction subtypes (Labor, Materials, External Expenses and Payments). These subtypes share (at least) one common attribute of Amount, which is used to produce the statement's balance. Each row marked with a bullet is a transaction - one record in the supertype table, and one record in a subtype table, exactly like you said.

I still don't understand how this model could impose "different and conflicting requirements upon the layout's real estate?"

I thought that much was obvious: the same physical "column" needs to show data from different fields, depending of the subtype of the current row. To use the latest example:

"Camera would have focal length, body type, shutter speed, etc.. While Chainsaws would have a totally different set of fields in which to record their characterists."

Even if you decide to stagger the fields horizontally (which is not user-friendly, IMHO), the page is only so wide. Perhaps it will become clearer if you try to produce a printed alphabetical directory (details, addresses, phones and all) of the contacts in your demo.

David Jondreau · October 15, 2008

If my assumptions are correct, does one concatenate the "PERSON" FirstName, MiddleName, LastName; then for ORGANIZATION use "Name" as the values for the Party Name field? Seems obviously simple - but the fact that I can not store a calculated field from a related table in another table is problematic and may be one reason why this model is not used more in FileMaker.

However, if it is advisable to place these "meaningfully shared attribute" type fields originating from the Supertype table and place them on their respective subtype layouts - that would seem to address that issue.

The inability to store related data is, IMHO, one of the weaknesses of this model. But I do use it. In the case of Companies and Individuals, they do share some attributes, like name. And, I do agree with comment, that addresses and phone numbers and the like are attributes of the party as well.

I don't see the point of using a super/sub type model if there were no shared attributes. But we're talking about Companies and Individuals here (which despite initial appearance do share attributes), not Elephants and Impressionism (try and find a shared attribute between those two!).

What I do with the stored/unstored issue is have an auto-enter calc name field (usually several for different name formats) for the party and a trigger field in the calc. Whenever a name is changed on the subtype table, I make sure the trigger field is "tickled" via script.

dmontano · October 15, 2008

David Graham said:

NB, you will see that I break my normal Anchor Buoy conventions and use bidirectional communication from layouts based on a buoys. This is key to unlocking a huge benefit from this model and is well worth the slight deviation from the rules.

I actually do not use Anchor Bouy. I understand it, just do not see the advantage of it over losing the bidirectional benefits of "my method" which is I guess... let the relationships determine the graph. I keep it clean and organized... looks like clusters that are almost always attached to some other cluster. So NOT following Anchor Bouy is fine.

Both of these entities in most cases will be modeled on a single PERSON table. An employee could also have related EMPLOYMENT records, but that's a separate entity. If you know for a certainty that an attribute belongs only to a single subtype then it's safe to add it in the appropriate subtype table. For all others it is bother more correct and requires less time to just add it to the supertype and be done with it. Be careful how far down the path of normalization you go; the longer the relationship chain the more difficult it can be to do things in FileMaker.

Just as you described is how I am attempting to model Persons. Same concept for Organizations. Which led to...

The Party Type

I have been working on the 'Party Type" entity which I am currently setting up as (I think) a reference table that connects to "Party Classification" entity which is a join between Party and Party Type to allow a Party to have many "Party Types".

That attempt turned out to be the following tables:

Party Type Category

Party Type Subcategory

that culminates in "Party Type".

Party Type Categories ----< Party Type Subcategories

Party Type Subcategories ----< Party Types

then:

Party Types -----Parties

I think it's more correct to say that proper data modeling provides flexibility and scalability.

This is ONE of the primary reasons I want to go this route - that is if I can build it. The more and more I build "modules", I find myself needing the data built somewhere else, OR I find my table structures was being duplicated.

By the way.. what is NB?

dmontano · October 15, 2008

Hi David,

I think you forgot to attached your revised file. Could you post it?

Thanks

Dave Graham · October 15, 2008

If by "related entities" you mean things like addresses and phones, I'm afraid I'll have to disagree. In a contacts database, addresses are NOT an entity (something we need to track), but an attribute (something we want to know about a tracked entity).

There is your opinion, and there is relational theory. The saying goes "The key, the whole key, and nothing but the key, so help me Codd." Basically in 2NF, a non-key column must depend on the whole primary key and in 3NF it must not depend on anything outside of the primary key. An address is not solely dependent on the PARTY's primary key (it can be share amongst many PARTYS for instance, a PARTY may not have an address at all, or could have many of them, and certainly could depend on attributes outside of the primary key which are related to locations and geography.

Having multiple addresses per contact can promote them from local fields to related records - but it doesn't change their nature.

I'd say that you would have had to circumvent the rules of normalization in order to force an address into the PARTY table to begin with, which actually did change it's nature; albeit in reverse of the example that you've given.

Addresses ARE an entity in a real-estate agent's or a landlord's database. But in a contacts database, when a contact is deleted, its addresses are deleted along. Their only reason of existence is to describe the contact - and I believe that's as good a definition of an attribute as you'll find.

Because you believe it doesn't make it true. The fact is that referential integrity is not the same thing as normalization. Address do NOT describe the contact, but rather exist solely for the purpose of managing information relating to the contact. Do you really think that a zip code shares the same importance in defining a person as does the the person's first name?

Actually, that's exactly what it is (or can be) there's a supertype of Transactions, with a one-to-one relationship to various transaction subtypes (Labor, Materials, External Expenses and Payments). These subtypes share (at least) one common attribute of Amount, which is used to produce the statement's balance. Each row marked with a bullet is a transaction - one record in the supertype table, and one record in a subtype table, exactly like you said.

I misspoke. I meant to say that 99% of the time there is a 1 to 1 relationship between super and subtypes in the PARTY model. I can see how reporting on other more complex sub/supertype models can be difficult. I continue to believe that your initial claims about the reporting challenges in this specific example are overstated.

I thought that much was obvious: the same physical "column" needs to show data from different fields, depending of the subtype of the current row. To use the latest example:

"Camera would have focal length, body type, shutter speed, etc.. While Chainsaws would have a totally different set of fields in which to record their characterists."

Even if you decide to stagger the fields horizontally (which is not user-friendly, IMHO), the page is only so wide.

Now I understand what you are referring to, which still doesn't apply to the PARTY model. Assuming it did apply though, how else would you model the data to make that report any better? If you took all of those fields and shoved them into the same table you'd have precisely that same challenges with screen real estate. And ultimately this begs the question: is this report useful in any way in the real world? A listing of different products along with all of their attributes simply doesn't make sense.

Perhaps it will become clearer if you try to produce a printed alphabetical directory (details, addresses, phones and all) of the contacts in your demo.

I've produced such a report tens if not hundreds of times and have yet to run into a challenge.

comment · October 15, 2008

There is your opinion, and there is relational theory.

Well, I could say the same thing, except I'm not into making condescending observations.

Whether something is an entity or an attribute has nothing to do with normalization rules or referential integrity. It's a BUSINESS decision that must be taken BEFORE you can even begin normalizing your data.

Case in point:

An address is not solely dependent on the PARTY's primary key (it can be share amongst many PARTYS for instance

No. Not in a contacts database. You do NOT share an address among contacts in a contacts database (unless it's somehow relevant that two unrelated contacts share the same address - but that would be a business decision again). See what I wrote on the very same issue here:

http://fmforums.com/forum/showpost.php?post/253766/

Do you really think that a zip code shares the same importance in defining a person as does the the person's first name?

Importance is one thing, classification another. In some circumstances, a ZIP code might be more important than a person's first name, in others less. That doesn't change the fact that they both exist solely to describe the person.

how else would you model the data to make that report any better?

You are still missing my main (and only) point. If the data fits the model, then that's how it's going to be organized, and any issues arising from that decision are going to be dealt with. However, the advantages of the model decrease as the number of shared attributes goes down, while the costs of implementation go up. There comes a point where the costs outweigh the benefits.

I've produced such a report tens if not hundreds of times and have yet to run into a challenge.

Why don't you show us a screenshot of what it looks like in layout mode.

David Jondreau · October 15, 2008

I don't have a deep understanding of the normalized forms, so whether or not an address is an "attribute" in terms of relational theory is unclear and frankly doesn't matter to me. But one of the strengths of the Party model is the ability to track Addresses (attribute or entity) over multiple subtypes.

So instead of having:

a) Addresses stored in the Company or Individual table

}:( a Company Address table and an Individual Address table

or c) one Address table with a complex foreign key

I get to have d) one Address table with a simple foreign key.

Also, in the three Party models I've done that are in use, they've all been one to many super/sub types. At least a few contacts in those systems are a combination of Customer, Vendor, Consumer, Employee, etc etc.

Dave Graham · October 15, 2008

I actually do not use Anchor Bouy. I understand it, just do not see the advantage of it over losing the bidirectional benefits of "my method" which is I guess... let the relationships determine the graph. I keep it clean and organized... looks like clusters that are almost always attached to some other cluster. So NOT following Anchor Bouy is fine.

This is really a matter of personal preference. AB imposes a lot of duplicate schema and overhead to the graph, but ultimately make discrete manageable groups of relationships. In very complex systems your clusters are going to get huge, and I think you might revisit this at some point.

I have been working on the 'Party Type" entity which I am currently setting up as (I think) a reference table that connects to "Party Classification" entity which is a join between Party and Party Type to allow a Party to have many "Party Types".

Your quickly getting into an area that FileMaker is not too great at. I dispense with PARTY TYPE and PARTY CLASSIFICATION and it's other related entities for the sake of practicality. You will often need to relate to a PARTY and PARTY TYPE(S). If these are stored in separate tables you wont be able to draw a simple relationship. Unless you really need to manage information relating to the types and classifications I wouldn't bother with it.

By the way.. what is NB?

It stands for "nota bene" which means to "observe carefully or take special notice (used in written text to draw attention to what follows)."

Dave Graham · October 15, 2008

Well, I could say the same thing, except I'm not into making condescending observations.

For some reason I found myself getting defensive but didn't notice it. I re-read my posts and notice it now. You have my sincere apology for the condescending tone.

Ultimately we are splitting the hair pretty fine here. We're both talking about the same thing but defining it differently. I consider an attribute as the same as a non-key column whereas you consider it an element of business logic. The fact remains that we both agree that however you define an attribute, the PARTY model does make sense when there are elements of shared information, regardless of whether that data resides in the PARTY table or other related tables. Correct?

You are still missing my main (and only) point. If the data fits the model, then that's how it's going to be organized, and any issues arising from that decision are going to be dealt with. However, the advantages of the model decrease as the number of shared attributes goes down, while the costs of implementation go up. There comes a point where the costs outweigh the benefits.

I don't see this model being any more difficult than if you were to denormalize the information down to a single table. I think we're just going to have to agree to disagree on this one.

I've produced such a report tens if not hundreds of times and have yet to run into a challenge.

Why don't you show us a screenshot of what it looks like in layout mode.

mr_vodka · October 15, 2008

This is one of the most interesting threads posted in a while... :thumbup:

I had started to look at the party model a few months ago but have not implemented it yet in any of my solutions. Its good to know the pros and cons of it.

comment · October 15, 2008

we both agree that however you define an attribute, the PARTY model does make sense when there are elements of shared information, regardless of whether that data resides in the PARTY table or other related tables. Correct?

Essentially yes, except I would prefer to put it in the negative: the model does NOT make sense when there is no (or very little*) "shared information". The opposite is not NECESSARILY true (though the model should certainly be considered, at least).

I've attached a modified version of my sample file, which includes a contact listing

Yes, but that only shows the FIRST related address and phone of each contact, and none of the details (e.g. company's tagline). And even so, the NAME field is already a moderately complex calculation - and that's just with only two subtypes.

These are just examples of the implementation costs I mentioned. Granted, such layout is going to be difficult to design in other data models, too. But there are added complexities that are unique to this model (such as having to condition every GTRR from "outside" to a subtype). These things add up and the total cost needs to be considered against the potential benefits.

---

(*) Perhaps it should be noted that "model fits data" is NOT a binary proposition. A data model rarely, if ever, fits the real-world situation perfectly. At most, one can say that "this model fits better than the other one". In any but the simplest situations, it's more likely to be "well, this model is good in this aspect, but not so good in another, while the other model will give us a great advantage here, but it will cause problems there".

Dave Graham · October 15, 2008

Yes, but that only shows the FIRST related address and phone of each contact, and none of the details (e.g. company's tagline). And even so, the NAME field is already a moderately complex calculation - and that's just with only two subtypes. These are just examples of the implementation costs I mentioned.

If I wanted to include the tagline there would likely be a specific report for Companies, or I could've included it under the company name. If I wanted to include related addresses and phones I could've done so with portals.

This has nothing to do with the PARTY model and everything to do with related values. I.e., the same challenge of representing the name would exist if you had the entire thing in one table, ditto for related addresses and phones. There is absolutely zero inherent costs associated with the PARTY model in this example.

But there are added complexities that are unique to this model (such as having to condition every GTRR from "outside" to a subtype). These things add up and the total cost needs to be considered against the potential benefits.

I'll admit that FileMaker's inherent weakness when it comes to relationships can be a challenge. Generically speaking, the choice of whether to employ sub/super types in your model may yield varying degrees of value to your solution, but the advantages to the Party Model so drastically outweigh it's advantages that I'd say you'd be hard-pressed to find a case where it doesn't make sense.

(*) Perhaps it should be noted that "model fits data" is NOT a binary proposition. A data model rarely, if ever, fits the real-world situation perfectly. At most, one can say that "this model fits better than the other one". In any but the simplest situations, it's more likely to be "well, this model is good in this aspect, but not so good in another, while the other model will give us a great advantage here, but it will cause problems there".

I partially agree. For instance, I recommended against fully implementing some of Len's complete logical models on the basis that it would overcomplicate the solution (even Len mentions this). At it's core though, the fundamentals of the Party model make it a perfect fit for managing contact information. I can't imagine a situation that I wouldn't recommend this for, since it offers so much in the way of flexibility and scalability.

So much of my company's revenue is derived from retrofitting old databases that weren't designed well to handle changes and requirements over time. Database schemas are not unlike the foundation in your home; you shouldn't have to re-pour the foundation for a room addition.

comment · October 15, 2008

the choice of whether to employ sub/super types in your model may yield varying degrees of value to your solution, but the advantages to the Party Model so drastically outweigh it's advantages that I'd say you'd be hard-pressed to find a case where it doesn't make sense.

Obviously this is a case-by-case decision, but in general I'd say that if there's no requirement for dedicated and/or "filtered" layouts, then having a few extra fields for each type is by itself not reason enough to make the effort. Even the layouts issue is rather moot when navigation is scripted.

you shouldn't have to re-pour the foundation for a room addition.

That's a good point. OTOH, pouring concrete costs money. Preparing foundations for a skyscraper when you're building a one-story house, just in case it might grow, is not economic - so that doesn't lead anywhere.

Dave Graham · October 15, 2008

That's a good point. OTOH, pouring concrete costs money. Preparing foundations for a skyscraper when you're building a one-story house, just in case it might grow, is not economic - so that doesn't lead anywhere.

Let's just beat the dead horse some more. In FileMaker terms, the home analogy could at best be expanded to a multi-story dwelling. If you needed a skyscraper you would demo the entire house and use an industrial strength back-end.

If the difference in a foundation for a ranch home and a multi-story dwelling was one extra inch in thickness and larger pipes and there was even the most remote chance that this dwelling could become a multi-story dwelling, then I'd argue that it's appropriate to invest a little more in the planning and pouring of the foundation now.

'Nuff said.

Anuviel · October 16, 2008

Nice post, good reading, good, valid points, don't turn it into a Conundrum :goodpost:

dmontano · October 16, 2008

I kind of feel like I am driving a VW in a formula one race in this discussion... as mentioned before - this is very informative and I hope this doesn't die-out.

Hi David,

I started last night to look at what you posted originally and will look at your recent file post as well tonight - thank you for providing something I can look at. Your comments about Len's book and his observations are understood as well - to what degree I understand - still a little grey for me.

If anyone would take a moment and review this link and help translate a "little" I would appreciate it.

http://www.dmreview.com/issues/20020901/5675-1.html

I have read and re-read a portion of Len's book to see if the light bulb would go off - no such luck. I assure you I am not asking for others to "do-it-for-me", rather, I am asking for some enlightenment. I am missing (among many other things) HOW, when designing the actual physical files one could build from such a model?

Referring to Figure 1 in the link I attached:

For example:

Party Role Supertype (Could be 1 table)

Person Role (Subtype of Party Role)

Organization Role (Subtype of Party Role )

Now, notice the ORGANIZATION ROLE, which has Subtypes inside of it, such as:

DISTRIBUTION CHANNEL, which has 2 other subtypes inside of it. ORGANIZATIONAL UNIT has 5 subtypes in it.

If I take just the PARTY ROLE portion literally, I would have 29 tables to build. Am I understanding this correctly?

Part of the biggest problem is my inexperience, coupled with the fact that the book (and any other material I have researched) does not give extensive sample data for each of these "subutypes" and so the "meaning" of what these "subtypes" are is really subject to misinterpretation - on my part.

Please keep in mind, I am not trying to model the universe. However, I see this logic to a large degree as pretty straight-forward. The implementation has me puzzled.

Looking at Figure 1, am I correct in assuming that this is modeled to a high degree of normalization? And if so, do most DB designers easily see what and how to denormalize the model to make implementation or retrieval easier for them, their system, their requirements?

And lastly, most importantly to me, is it "possible" to physically build each subtype as shown in the figure as their own tables?

Some of the previous posts has me observing that it is at the designers discretion on where to normalize out to a higher degree than in other areas. Am I somewhat on the right track with the assumption?

Sign In

Been racking my brain: Supertype / subtype

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Create an account or sign in to comment

Create an account

Sign in