What I'm going to concentrate in the next few post are specific problems with querying data stored in repeated columns. This time, it's Multiple Unions.
Others in this series:
- Multiple ORs
- Multiple Unions (this post)
- Multiple Joins
- Multiple IIFs
- Impossible Joins
- Summing
- Counting
- Averaging
Multiple UNIONs
Last time, in Querying Repeated Columns: Multiple ORs, I talked about the difficulties of querying data in repeated columns. But it gets worse. Suppose I want to categorize each type of software. In which case, I'd have to add a category column for each software column (Software1Type, Software2Type, ...etc). It could look something like this:
Figure 1: Repeated Columns with repeated category columns. |
By comparison, the normalized version would look like Figure 3, with both Software and Category removed to their own table:
Figure 2: Fully normalized with Category column added. |
Now, suppose I want to query a listing of all word processors for each PC. To create such a query with repeated columns, I'd have to resort to multiple UNIONs:
SELECT PC_ID, Software1 as Software, Software1Type as Category
FROM Worse_PC_RepeatedColumns
WHERE Software1Type="word processor"
UNION
SELECT PC_ID, Software2 as Software, Software2Type as Category
FROM Worse_PC_RepeatedColumns
WHERE Software2Type="word processor"
UNION
SELECT PC_ID, Software3 as Software, Software3Type as Category
FROM Worse_PC_RepeatedColumns
WHERE Software3Type="word processor"
Unfortunately, this must be done in SQL view since union queries cannot be viewed in the Query Builder.
Figure 3: Word processor for each PC with repeated columns. |
By contrast, creating the query in the normalized structure looks like this:
SELECT PC.PC_Num, Software, Category
FROM Software INNER JOIN
(PC INNER JOIN PC_Software ON PC.PC_ID = PC_Software.PCID)
ON Software.SoftwareID = PC_Software.SoftwareID
WHERE Category="word processor";
This one can be viewed in the Query Builder:
Figure 4: Word processor for each PC with normalized structure. |
Again, the results of both queries are identical with the exception that the query with repeated columns (UNION) is not updateable.
Figure 5: Results of two queries. |
More columns in the repeated columns table makes the query more complex. I'd have to add another UNION for each column pair. Worse yet, there is a limit to the number of Unions you can use in a query, so if you have too many categories, your query may not just be difficult, but impossible.
By contrast, adding more values to the the normalized tables requires no change to the query at all, and there is virtually no limit on the number of categories.
No comments:
Post a Comment