Tuesday, December 15, 2015

Union Query: Part 2


More Union Queries
Last time in Union Queries: Part 1, I discussed simple Union queries and some of their uses. This time I want to talk about some advanced topics, starting with Union All.

Union Vs. Union All

As you remember from last time, a Union query takes the result of one query and "appends" it to another. At the same time, it removes duplicates and orders the records on the first field in the field list.

The Union All does the same thing as the Union, but without removing the duplicates or sorting the resultset, so:

SELECT * FROM TableA
UNION ALL
SELECT * FROM TableB


Would return:

CustomerID CreditLimit
---------- -----------
1001       $500
1010       $350
1017       $375
1020       $425
1017       $375
1008       $600


The order is not determined by the base tables. The result has no order, so the records could be returned in any order.

Sorting Union Queries

So what if you want to sort your Union All or want a different sort order for your Union? Like any Select query, you can add an Order By clause to the end.

SELECT CustomerID, CreditLimit
FROM TableA
UNION ALL
SELECT CustomerID, CreditLimit
FROM TableB
ORDER BY CustomerID;


Will return:

CustomerID CreditLimit
---------- -----------
1001       $500
1008       $600
1010       $350
1020       $425
1017       $375
1017       $375


The Order By applies to the entire result. If the column names are different between queries (which, you'll remember, I said was allowable), the column names from the *first* query must be used. Order By clauses in any of the other component queries will be ignored. Only the last one will be applied.

Restricting Rows

Like any other Select query, you can restrict rows from the resultset with a Where clause. Unlike the Order By, however, each component query can have its own criteria. If you put a Where clause on the last query, it will apply to ONLY the results of the last query. So this:

SELECT CustomerID, CreditLimit
FROM TableA
UNION
SELECT CustomerID, CreditLimit
FROM TableB
WHERE CreditLimit > 500


Produces this:

CustomerID CreditLimit
---------- -----------
1001       $500
1008       $600
1010       $350
1017       $375


Since 1010 and 1017 are in TableA, they can be less than 500, whereas 1020 is in TableB, so it is restricted. If you want a Where to apply to the entire resultset, you have to put it in each query.

SELECT CustomerID, CreditLimit
FROM TableA
WHERE CreditLimit > 500
UNION
SELECT CustomerID, CreditLimit
FROM TableB
WHERE CreditLimit > 500


Tables With Unequal Numbers Of Columns

As I said, the individual queries don't have to have the same column names, nor do they have to be the same datatype, but the *number* of columns in the field list must be equal.

What if your tables have unequal columns? In that case, you need to create false columns with an alias.

So suppose you have two tables which hold much the same data, but have slightly different structures.*

TABLE A
CustomerID CreditLimit
---------- -----------
1001       $500
1010       $350
1017       $375


TABLE B
CustomerNum Credit Active
---------- ------ ------
1008       $600   Y
1017       $375   Y
1020       $425   Y


*Please Note: I am not advocating this as a good database design. In a properly designed database, you would never have two tables which stored the same type of data. However, you may find yourself in this situation if you are fixing a poorly designed database or if you are merging separate databases.

If I don't know the value of "Active" for TableA, I can leave the value NULL.

SELECT CustomerID, CreditLimit, NULL AS Active
FROM TableA
UNION
SELECT CustomerNum, Credit, Active
FROM TableB;


CustomerID CreditLimit Active
---------- ----------- ------
1001       $500
1008       $600        Y
1010       $350
1017       $375        Y
1020       $425        Y


However, if I assume all of the customers in TableA are active

SELECT CustomerID, CreditLimit, "Y" AS Active
FROM TableA
UNION
SELECT CustomerNum, Credit, Active
FROM TableB;


CustomerID CreditLimit Active
---------- ----------- ------
1001       $500        Y
1008       $600        Y
1010       $350        Y
1017       $375        Y
1020       $425        Y


Because Union Queries cannot be viewed in the Query Builder, they are often over looked by Access novices, but they are a powerful tool to have in your SQL arsenal.

.

Tuesday, December 8, 2015

Union Query: Part 1

Simple Union Query

Introduction

In Microsoft Access, a Union Query is an SQL Specific query, which means it can only be written in SQL. It cannot be created or edited in the Access Query Builder. Novices often confuse UNIONS with JOINS.

Joins (see What is a Join: Part 1 and subsequent posts) combine two (or more) tables row-wise, that is, the results of matching information will be displayed on a single row.

Unions, on the other hand, combine tables column-wise, that is, the results of one SQL statement will be "appended" to the results of another as additional rows. In essence, a Union sticks the results of one query on to the bottom of another.

The structure of a Union Query is very simple. It is two or more queries with the UNION keyword in between. For instance:

SELECT Field1, Field2 FROM Table1
UNION
SELECT FieldA, FieldB FROM Table2

As you can see from the example, the field names do not need to be the same. The column name in the result will come from the first query. The fields don't even need to be the same datatype. In the query above, Field1 could be an integer, Field2 a date/time, FieldA a text field, and FieldB a currency field.

The only real requirement is that there must be an equal number of columns in each of the component SQL statements. If there are different numbers of fields in the field lists, an error will occur.

One important thing to remember about the Union is that it is non-updateable (for more on this see: (This Recordset Is Not Updateable. Why?). As a result, Unions are useful for displaying data, but not for entering or editing it.

Union Example

One of the most common uses for a Union query is to consolidate tables.
For example, imagine a situation where you have similar datasets from 2 different sources, and you want to consolidate/synchronize/merge them:

TABLE A

CustomerID CreditLimit
---------- -----------
1001       $500
1010       $350
1017       $375

TABLE B

CustomerID CreditLimit
---------- -----------
1008       $600
1017       $375
1020       $425

This query:

SELECT CustomerID, CreditLimit FROM TableA
UNION
SELECT CustomerID, CreditLimit FROM TableB

Will produce this result:

CustomerID CreditLimit
---------- -----------
1001       $500
1008       $600
1010       $350
1017       $375
1020       $425

The Union combines the two tables, while at the same time removing the duplicate records. The Union also sorts the recordset on the first column, in this case CustomerID.

Since TableA and TableB have the same number of fields, I could also have done this:

SELECT * FROM TableA
UNION
SELECT * FROM TableB

To actually consolidate the tables into a single table, save the query (say, CustomerUnion), then use it as the table in a Make-Table query.

SELECT * INTO TableC FROM CustomerUnion;

Or use the Union Query in the "From" clause of the Make-Table:

SELECT * INTO TableC
FROM
(SELECT * FROM TableA
UNION
SELECT * FROM TableB)
AS CustomerUnion;

(I'll discuss using a query in the From clause in more detail in a later post.)

Other Uses

There are many applications for the Union Query. One is in the Access implementation of a Full Outer Join (which I will address in a later post). But one very common use is in the Row Source of Combo Boxes.

Suppose I have a combo box to filter records on a form by Customer Name.




My Row Source for the combo could look like this:

SELECT CustomerID, CustomerName FROM Customer;

But suppose I want the combo to have an option to choose all records.



I can use a Union query in the Row Source to add artificial records to the drop down list:

SELECT "*" as CustomerID, "All" as CustomerName FROM Customer
UNION
SELECT CustomerID, CustomerName FROM Customer;

This will produce a result of

CustomerID CustomerName
---------- -----------
*          All
1010       Carlson
1017       Smith

With CustomerID as the bound column in the combo box, I can use a query like:

SELECT * FROM TableA
WHERE CustomerID Like '" & Combo1 & "'"


to filter my form.

Next time, I'll look at some advanced topics with Union Queries.

.

Thursday, December 3, 2015

Select Queries Part 4: PARAMETERS, TOP, DISTINCT, and TRANSFORM...PIVOT

This is my final post in this series about Select Queries.

In Select Queries Part 1: Simple Queries, I discussed the SELECT and FROM clauses. In Select Queries Part 2: Restricting Rows - the Where Clause, I talked about the WHERE clause. Select Queries Part 3: Sorting and Grouping (ORDER BY, GROUP BY) showed the Order By and Group By clauses.
 

This time I'm going to discuss a few lesser known but powerful clauses: Top, Distinct, Transform...Pivot, and Parameters.

PREDICATE (optional)

Another less well known section of a SQL statement is the Predicate. The Predicate follows the SELECT keyword and precedes the Field List. There are two important predicates: TOP and DISCTINCT.

Top Predicate

The Top Predicate allows you to display the top(x) values from a query in terms of either a number or percentage. I can't really talk about the TOP predicate without discussing the ORDER BY clause because the rows displayed are determined by the ORDER BY clause.

Examples:

SELECT TOP 10 ProductName
FROM Products
ORDER BY Cost

This will display ten rows of ProductNames whose costs are the LOWEST.

SELECT TOP 25% ProductName
FROM Products
ORDER BY Cost DESC

This will display 25% of the total number of rows of the Product table whose costs are HIGHEST.

In other words, the query creates a standard Select Query, applies the sort order in the ORDER BY clause, then displays just the top X values from the query.

To add a TOP predicate in the Query Builder, go to the Properties of the Query and look for the Top Values property:


A word about duplicates: If there are duplicates in Top rows, they will all be displayed. So, sometimes Top 10 could return 11 or even more rows. It is possible to remove these duplicates with a subquery. To do that and a lot of interesting things with the TOP predicate, download a free database sample from my website called: TopQuery which goes into more detail including:

  • Removing Duplicates.
  • Top values with Aggregates (Totals Query)
  • Top values per Group
  • Returning Random X records from your table.
  • User input TOP value (parameter).

DISTINCT Predicate

The DISTINCT predicate will remove duplicates from your result set based on the fields in your Field List. For instance, if I had a Products table that looked like this:

ProductName Cost
----------- ----
Ax           $3
Hammer       $15
Hammer       $10
Wrench       $5
Wrench       $4

SELECT DISTINCT ProductName
FROM Products

Would return:

ProductName
-----------
Ax
Hammer
Wrench

Removing the duplicated rows. The effect is the same as using GROUP BY clause without an aggregate function:

SELECT ProductName
FROM Products
GROUP BY ProductName;

DISTINCTROW

The DISTINCTROW predicate is unique to Access. It will return unique records across the entire record, not just the fields in the Field List.

SELECT DISTINCTROW ProductName
FROM Products

returns:

ProductName
-----------
Ax
Hammer
Hammer
Wrench
Wrench

Honestly, though, I've never found a good use for DISTINCTROW because I always have a Primary Key in all my tables, so all of the rows are already unique.

To create either the DISTINCT predicate in the Query Builder, set the Unique Values property to Yes, to create the DISTINCTROW predicate set the Unique Rows value to Yes.



TRANSFORM...PIVOT (Crosstab Query)

The Crosstab Query is an Access Specific query that allows you to display recordset results in a much more compact form. For instance, if I had an OrderDetail table that looked like this:


I could use a Crosstab Query to show the data like this:



The Crosstab Query used two clauses, the TRANSFORM, which comes before the SELECT statement, and the PIVOT, which follows the GROUP BY. In SQL View, it looks like this:

TRANSFORM Sum(Quantity) AS SumOfQuantity
SELECT CustomerNum, OrderID
FROM OrderDetail
GROUP BY CustomerNum, OrderID
PIVOT ProductName;

In the Query Builder, it looks like this:


The easiest way to learn Crosstab Queries is to use the Crosstab Wizard that's available when you create a New Query.

PARAMETERS Clause (Optional)

Queries can also accept user input. This input is called a Parameter. Parameters are placed in the WHERE clause of a SQL statement and takes the form of a Prompt in square brackets. For example:

SELECT * FROM Customers
WHERE CustomerNum [Enter Customer Number]

When the query is run, a parameter input box appears:


When the user inputs a valid customer number, the query returns the row(s) applicable to the customer entered.

A parameter can also reference a control on a form. Suppose I have a Customer form and I want to see all of the orders associated with a particular customer. I could have a query that looks at the CustomerNum in the current form and return all of the order numbers for that customer.



So if I had a query that looked like this:

SELECT CustomerNum, OrderNumber
FROM Customer INNER JOIN Orders
ON Customer.CustomerID = Orders.CustomerID
WHERE CustomerNum=[forms]![frmCustomer]![CustomerNum]

It would look at the value in the CustomerNum text box on the frmCustomer form and use that as the parameter value to return all the orders for customer 100.

Access does a pretty good job of figuring out what the data type of a parameter is. However, there are times when it gets confused. This is particularly true when trying to use parameters with a Crosstab query. In these cases, you can define what the data type of the parameter will be with the PARAMETERS Clause.

The PARAMETERS clause goes before the SELECT statement in a Select query:

PARAMETERS [Enter Customer Num] Text(255);
SELECT Customer.*
FROM Customer
WHERE CustomerNum=[Enter Customer Number];

Notice the semi-colon at the end.

In a Crosstab query, it goes before the TRANSFORM statement:

PARAMETERS [Enter CustomerNum] Text(255);
TRANSFORM Sum(Quantity) AS SumOfQuantity
SELECT CustomerNum, OrderID
FROM OrderDetail
WHERE CustomerNum=[Enter CustomerNum]))
GROUP BY CustomerNum, OrderID
PIVOT ProductName;

To create parameters in the Design View, choose Parameters from the Design Ribbon (A2007) or from the Query>Parameters Menu (A2003 and before). You will get a dialog box that looks something like this:


.