Into a structure like this:
Using a One-Time process that creates the tables and relationships in addition to copying the data.
Appending to Existing Tables
The process for normalizing repeated column data into Existing Tables is similar to the One-Time process. It does, however, require you to create the tables and relationships correctly ahead of time. Because I've just created them in the previous section, I'll simply delete the data and re-append it from the linked spreadsheet.
The two processes are very similar, but I'll be using Append queries rather than Make-Table queries.
Step 1: Remove the common PC elements.
Since I've already got a table, I'll use an Append query rather than the Make-Table query I used earlier:
INSERT INTO PC ( PC_Num, OperatingSystem )
SELECT PC_Num, OperatingSystem
FROM PC_RepeatedColumns_Link;
Or in the Query Builder:
The resulting table looks like this:
Since I already have an autonumber primary key defined on the table, I don't need to do anything further.
Step 2: Create Software_Temp table
This step is exactly the same as Step 2 above. I'll use a Make-Table query to create a temporary table that will be used for creating the relationships later.
SELECT PC_ID, Software INTO Software_Temp
FROM
(SELECT PC.PC_ID, PC_RC_Link.Software1 As Software
FROM PC INNER JOIN PC_RC_Link ON PC.PC_Num = PC_RC_Link.PC_Num
WHERE Software1 IS NOT NULL
UNION
SELECT PC.PC_ID, PC_RC_Link.Software2
FROM PC INNER JOIN PC_RC_Link ON PC.PC_Num = PC_RC_Link.PC_Num
WHERE Software2 IS NOT NULL
UNION
SELECT PC.PC_ID, PC_RC_Link.Software3
FROM PC INNER JOIN PC_RC_Link ON PC.PC_Num = PC_RC_Link.PC_Num
WHERE Software3 IS NOT NULL
UNION
SELECT PC.PC_ID, PC_RC_Link.Software4
FROM PC INNER JOIN PC_RC_Link ON PC.PC_Num = PC_RC_Link.PC_Num
WHERE Software4 IS NOT NULL
UNION
SELECT PC.PC_ID, PC_RC_Link.Software5
FROM PC INNER JOIN PC_RC_Link ON PC.PC_Num = PC_RC_Link.PC_Num
WHERE Software5 IS NOT NULL)
The result will be a table that looks like this:
Step 3: Remove Duplicates from Software_Temp
I need to create a query that removes the duplicate values. The DISTINCT predicate works well for that. As before, I'll use an Append write the data to my existing Software table:
INSERT INTO Software ( Software )
SELECT DISTINCT Software_Temp.Software
FROM Software_Temp;
With the resulting table:
Again, this table already has an autonumber primary key, so I don't need to create one.
Step 4: Creating the Linking Table: PC_Software
So now, I've got my two "One-Side" tables: PC and Software. Now I just need to fill the linking table.
I can do that with an Append Query and a simple join of Software and Software_Temp:
INSERT INTO PC_Software ( SoftwareID, PC_ID )
SELECT Software.SoftwareID, Software_Temp.PC_ID
FROM Software INNER JOIN Software_Temp ON Software.Software = Software_Temp.Software;
Or in the Query Builder:
The final result will look like this:
It is important to note that the order in which the data is moved is vital. Data must be written into the "one-side" tables (PC and Software) first, and then data can be moved into the "linking" tables (PC_Software).
This two-part post is available as a single download (with sample database) here: http://www.rogersaccesslibrary.com/forum/normalizingrepeatingcolumns1_topic562.html.
No comments:
Post a Comment