Sunday, 10 January 2010

Handle SuperTypes and SubTypes while converting LDM to PDM.Transition from Logical Data Model to Physical data Model?

Super Types and Subtypes
The final issue is really just a matter of choice – and knowing what the choices are, as well as their pro’s and con’s.

Super and Sub Types (i.e. inheritance) are a very powerful logical modeling construct. Most business users are more than capable of assisting with construction of the logical data model – because it’s really nothing more than a static picture of their data at rest. As one accounting manager said to me once during a business user interview, “I know my own business whether it’s painting on cave walls or this format.”

So let’s take a very simple example – bank accounts can be either savings or checking accounts (but not both at the very same time) – yet all accounts share some common characteristics or attributes. Here’s a simple example of this:

You have a Super Type entity as Account and two subtype entities as Checking and Saving Account.

So how does one physically implement this construct in the physical data model – and why would one choose one way versus another? Now this is one place where the DBA should choose a direction and make a critical design implementation choice. Let’s now examine the three options:

Option #1 - One Big Table

This is probably the most common choice.

•Generate Parent = Y and Generate Children = N
•Requires Discriminator attribute (e.g. Account Type)
•Violates third normal form (… nothing but the key …)
•PRO: Easy to code against, just one big table …
•CON: All child columns optional, so need table check constraint

Option #2 - Table per Sub-Type

This is another common and simple alternative.

•Results in N-1 tables
•Gen. Parent = N, Gen. Children = Y, Inherit All Attributes = Y
•PROS: All child columns implemented as expected
•CON: Two tables to code against


Option #3 - Table per Super and Sub-Type

This choice is valid – but has some issues related to complex application coding and resulting performance (due to excessive joins).

•Results in N tables
•Gen. Parent = Y, Gen. Children = Y, Inherit Only Primary Attr. = Y
•NOT RECOMMENDED: Just Plain Overkill

Which should you use? That’s up to you. But you do need to understand the three alternatives – because there is not a one-size-fits-all answer that works every time.

Also note that not all data modeling tools support these choices the same way. Most data modeling tools fail to create the complex constraints and triggers required by the first two alternatives – so your DBA will have to compensate and add the missing database logic.

No comments: