Monday, February 9, 2009

Entity-Relationship Diagramming: Part IV

In Entity-Relationship Parts I, II, and III, I described a method of normalization that builds up from individual data elements. Following this method results in a database design that is normalized to the Third Normal Form -- without having to know what the Normal Forms are.

(I haven't discussed the Normal Forms. Perhaps I will in a later post.)

In Part III, I left off with the completed Entity-Relationship Diagram shown in Figure 1.

Figure 1: Employee Database E-R Diagram

When I do E-R diagramming, I like to ignore the attributes, concentrating instead on just the entities and their relationships. But at some point, obviously, the attributes have to come back into it. I like to save that to the end.

There are some good E-R diagramming tools available out there, like Visio and ERWin, but for small projects, I usually just implement the tables directly in Access and create the Relationships there. The Access Relationship Window works pretty well as a diagramming tool, and you've got a completed database at the end.

Implementing an E-R Diagram

So far, I've talked about Entities and Attributes. I do that deliberately to keep myself from thinking about implementation issues while I'm designing the data model. But at the implementation phase, entities become tables and attributes become fields.

The first thing I do is add an Autonumber, Primary Key field (Surrogate Key), to each table, naming it after the table with a suffix of "ID". Thus the tblEmployee table will have a field EmployeeID. Then I create a Unique Index on the Natural Key I identified during E-R diagramming process.

At this point, my tables look like Figure 2.

Figure 2: Tables with Surrogate Keys Added

Now it's time to look at my relationships. Relationships are created on fields holding related information, Primary Key to Foreign Key. In a One-to-Many (1:M) relationship, the primary key of the table on the "One" side is added to the table on the "Many" side table and becomes the foreign key.

Since there is a 1:M relationship between tblEmployee and tblEmployeeCertification, I'll put EmployeeID in EmployeeCertification and it will become the foreign key for tblEmployee.

I'll do the same with every relationship. The resultant tables look like Figure 3.

Figure 3: Tables with Foreign Keys Added

Lastly, I need to create the actual relationships. I do this by clicking and dragging the primary key of one table into its corresponding foreign key in the other. For instance, I'll click on EmployeeID in tblEmployee and drag it to EmployeeID in tblJobHistory.

When I do, I will get an "Edit Relationships" dialog box shown in Figure 4.

Figure 4: Edit Relationships Dialog Box

The Referential Integrity box will not be checked by default. However, you MUST check it in order to create a true relationship. Without it, Access will create a line between the tables, but it will NOT be a relationship.

When you click the Referential Integrity box, Access should correctly identify the relationship as a One-to-Many relationship. If it does not, if it says Indeterminate, you've done something wrong.

Clicking OK results in the relationship being created. Figure 5 shows that.

Figure 5: Relationship Between tblEmployee and tblJobHistory.

Now, I'll just do that for the rest of the relationships, and I'm done.

Figure 6: Completed Data Model

The process I've outlined here is really just the bare bones. If you're interested in seeing examples in more detail, you can find some in the Tutorials: Database Design section of my website.

Link to top: Entity-Relationship Diagramming: Part IV



Anonymous said...

Opulently I assent to but I contemplate the collection should prepare more info then it has.

Anonymous said...

Nice dispatch and this mail helped me alot in my college assignement. Thank you on your information.

Anonymous said...

"Then I create a Unique Index on the Natural Key"

Why do we make an autonumber ID and then do this too?

Why not just make the natural key into the primary key?

Your articles are the best for learning database design by the way... =) Thanks for the resource

Roger Carlson said...

Well, there's a whole school of thought that says you should do just that. However, I feel that having a surrogate key and a unique index has some significant advantages over natural keys.

1. First of all, a surrogate (autonumber) field will never need to change. Any real-world key has the possibility of needing to change, which then requires a mechanism to update them. The Cascade Updates is supposed to handle this, but I've had cases where I needed to change the actual data type of the primary key. That's a real mess.

2. If you have a multi-field natural key and you make that the primary key, then ALL of the fields in the key have to be added to any related table. This means in addition to a multi-field primary key in one table, you may have many multi-field foreign keys.

3. Lastly, most natural keys are text data (even if they're numbers). Text fields are more "costly" to link data. Joining tables on a single numeric field is much faster, so an autonumber field is much more efficient.

So the thought with surrogates is that you're splitting the functions of the primary key in two. One field for joining tables, and another (or multiple fields) for maintaining real-world uniqueness.