Thursday, June 25, 2009

Select Queries Part 3: Sorting and Grouping (ORDER BY, GROUP BY)

In the first post of this series, Select Queries Part 1: Simple Queries, I talked about creating simple queries with the first two sections of a Select Query, the SELECT clause and the FROM clause. In the second post, Select Queries Part 2: Restricting Rows - the Where Clause, I discussed how to restrict rows returned with the WHERE clause. This time, I'm going to discuss how to sort your results with the ORDER BY clause and how to aggregate your data with the GROUP BY clause.

ORDER BY Clause

Tables in relational databases like Access (or SQL Server or Oracle, for that matter) do not have any intrinsic order. This means they can be returned in any order, not necessarily the order they were entered or that they appear when you open the table. So if you want them in a particular order, you have to sort them yourself. You do that in a query in the ORDER BY clause.

The ORDER BY clause has two parts: [field] [ASC/DESC] (repeat for as many fields as you need).

The [field] designates the field on which the sort will be performed and [ASC/DESC] tells how the field will be sorted. ASC means ascending (smallest to largest). DESC means descending (largest to smallest). ASC is the default, so if you don't designate an order, it will be smallest to largest.

Letters are sorted alphabetically A-Z (or Z-A if DESC). Numbers are sorted numerically. This is all very obvious until you have numbers that are stored as text. While they look like numbers, they will not sort numerically. For instance, these character strings sorted alphabetically:

CustomerNum
-----------
101
102
1100
201
3001
301

One solution is to add leading zeros to your text numbers. Numeric data will not save leading zeros, but text will:

CustomerNum
-----------
0101
0102
0201
0301
1100
3001

Later on, I'll discuss another method that does not require altering your data.

The Order By clause follows the Where clause:

SELECT ProductName, Cost, Price, [Price]-[Cost] AS Margin
FROM Products
WHERE ProductName = "hammer"
ORDER BY Cost DESC

Examples of ORDER BY clauses:

ORDER BY ProductName (Products alphabetically A-Z)

ORDER BY [Price]-[Cost]DESC (margin highest to lowest)

ORDER BY ProductName, Cost DESC

The last example shows that you can also sort on multiple fields. In this case, the table will be sorted on ProductName (ascending) and within each group of products, it will be sorted on Cost (descending).

ProductName Cost
----------- ----
Ax           $3
Hammer       $15
Hammer       $10
Wrench       $5
Wrench       $4

The fields in the ORDER BY clause do not have to be in the Field List of the SELECT clause. For instance, this will work just fine:

SELECT ProductName
FROM Products
ORDER BY Cost

In the Query Builder, simply uncheck the checkbox to do this:


You can also sort on an expression. I showed [Price]-[Cost], but you can also apply functions to your fields. For instance, earlier I showed how text numbers will sort incorrectly. I showed how leading zeros will correct it, but another solutions is to display the text numbers as they are in the field list, but to add a function to the ORDER BY clause that converts them to numeric.

SELECT [CustomerNum]
FROM Customer
ORDER BY CLng([CustomerNum]);

CLng () is a built-in function that converts Text to Long Integer.

GROUP BY and HAVING Clauses

The GROUP BY and HAVING clauses are used with "Totals" or Aggregate queries.

The GROUP BY allows you to group your data and apply a function to the grouped data. For instance:

SELECT ProductName, Cost
FROM Products

Will result in the following:

ProductName Cost
----------- ----
Ax           $3
Wrench       $4
Hammer       $15
Wrench       $5
Hammer       $10

However, if I wanted to see the average cost for each group, I could add a GROUP BY clause:

SELECT ProductName, Cost
FROM Products
GROUP BY ProductName, Avg(Cost)

ProductName Cost
----------- ----
Ax         $3
Wrench       $ 4.5
Hammer       $
12.5

Every field in the Field List MUST be represented in the GROUP BY clause either to group on or with an aggregate function. Examples of aggregate functions are: Sum, Avg, Min, Max, and Count.

The HAVING clause works like the WHERE clause, but it restricts rows after they have been grouped.

SELECT ProductName, Cost
FROM Products
GROUP BY ProductName, Avg(Cost)
HAVING Avg(Cost)

Will return:

ProductName Cost
----------- ----
Wrench       $
4.5
Hammer       $
12.5

There's a lot more to be said about Totals Queries, and I'll do that in greater depth in a later post.

Next time, in Select Queries: Part 4, I'll finish up the Select Query with some odds and ends: Top Query, Distinct Query, Crosstab Query and Parameters.

.

Monday, June 22, 2009

New Sample: Query_SplitToMultiRowsReMergeSorted

Query_SplitToMultiRowsReMergeSorted

by AD Tejpal

Explanatory Notes

This sample db demonstrates splitting of comma (,) or star (*) separated string elements into multiple rows followed by re-merging of these elements duly sorted, back into strings. It is a virtually codeless solution, but for use of two wrapper functions related to built in Split() function.

For splitting the strings into multiple rows, a single field table (T_Ref) populated with numbers 0 to 10 is used in Cartesian join with the table holding source data. The sorted output is appended to table T_Jobs_Split, which in turn is used to generate the crosstab query followed by final select query, re-merging the sorted elements back into strings.

Note:
(a) If the maximum number of elements in one string is likely to exceed 10, additional numbers can be added to table T_Ref. At the same time, pivot clause of crosstab query and calculated field of final select query (based upon crosstab query) will need o be expanded to accommodate additional columns.

(b) Use of number reference table (as compared to master job table) in Cartesian join with source table for generating multiple rows has the advantage that it eliminates the risk of inconsistent results arising out of partial match using Like operator. Moreover, it does away with the need of preparing such a table and keeping it up-to-date, apart from ensuring that there is no inadvertent discrepancy in its contents.

Version: Access 2000 file format.
Reference: DAO 3.6

You can find the sample here: http://www.rogersaccesslibrary.com/forum/forum_posts.asp?TID=459
.

Thursday, June 18, 2009

Select Queries Part 2: Restricting Rows - the Where Clause

In my last post, Select Queries Part 1: Simple Queries, I talked about the first two sections of a Select Query, the SELECT clause and the FROM clause. These allow you to display certain columns from designated tables. The next section is one of the most powerful features of a query: the WHERE clause.

WHERE Clause


The WHERE Clause is one of the most powerful features of a query. It allows you to restrict the rows returned based one or more criteria. These criteria are in the form of an expression, the general form of which is:


[Field] <comparison operator> [Value]

The field is a column in the base table(s) or a calculated column. It does not have to be in the Field List. The comparison operators are the standard math comparators: =, <, >, <=, >=, or <>; with some SQL specific ones added: IN/EXISTS, BETWEEN, LIKE, and IS NULL. The value portion can be either a hard coded value (like "hammer" or 25), or it can be another field or expression.

The WHERE clause follows the From Clause in a SQL Statement like so:

SELECT ProductName, Cost, Price, [Price]-[Cost] AS Margin
FROM Products
WHERE ProductName="hammer"

Examples of simple WHERE clauses:

WHERE ProductName="hammer"
(to show all hammers)

WHERE ProductName <> "hammer"
(to show all products EXCEPT hammers )

WHERE Cost <= 0.05

WHERE Price < Cost
(products sold below cost)

WHERE [Price]-[Cost]>2*[Cost]
(where the margin is greater than twice the cost)

LIKE is used with character data only and uses the asterisk as a wildcard symbol:

WHERE ProductName LIKE "ham*"
(returns "hammer" and "hammock")

WHERE ProductName LIKE "*nail"
(returns "10p nail" and "8p nail")

IN allows you to test if a field matches one of a list of values.

WHERE Cost IN (1, 2, 5, 8)

IN can also contain another SQL statement. The SQL statement contained in the IN must have only one field in the Field List.

WHERE ProductName IN (SELECT ProductName FROM Products)

I'll talk more about this in a later post when I discuss subqueries.

BETWEEN allows you to test for a range of values:

WHERE Cost BETWEEN 5 AND 10
WHERE BeginDate BETWEEN #1/1/2008# AND #12/31/2008#


IS NULL is a special comparator that tests whether or not a field is Null. The other comparison operators do not work with Null, so IS NULL is the only way to test for it.

WHERE ProductName IS NULL

Multiple Criteria and Logical Operators

You can also have multiple criteria by using multiple expressions joined by the Logical Operators: AND/OR. Examples of simple multiple criteria:

WHERE ProductName="10p nail" OR ProductName="8p nail"
WHERE ProductName="saw" AND Cost BETWEEN 5 AND 10
WHERE ProductName="saw" OR Cost BETWEEN 5 AND 10

Unfortunately, the Query Builder View of WHERE clauses (or Criteria as it's called in the QB) looks quite different than in the SQL View. In the QB, you do not need to repeat the field name with an OR or AND statement like you do in SQL. For instance, if I wanted to display both 10p nails and 8p nails, my WHERE clause would look like this:

WHERE ProductName="10p nail" OR ProductName="8p nail"

But the Query builder would look like this:

Figure 1

Of course, this would also work:

Figure 2

You have to be careful when creating multiple criteria in the Query Builder because sometimes they don't say exactly what you think they do. It matters which lines in the Criteria that you put your expressions. You create ORs on separate lines, while you create ANDs on the same line.

So this:
WHERE ProductName="saw" OR Cost BETWEEN 5 AND 10

Translates to:

Figure 3

(Notice the criteria are on separate lines.)

But this:

WHERE ProductName="saw" AND Cost BETWEEN 5 AND 10

Translates to:

Figure 4

(Notice here the criteria are on the same line.)

It gets even more complicated with multiple ANDs and Ors. For instance, what does this statement mean?

WHERE ProductName="saw" OR ProductName="hammer" AND Cost BETWEEN 5 AND 10

You might think it means I want "saws and hammers with a cost between 5 and 10". However, it doesn't. There is an Order of Precedence to the logical operators as there are with arithmetic symbols. AND always takes precedence over OR. So what will really be returned is "ALL saws and only those hammers that cost between 5 and 10". The AND expression will be evaluated before the OR expression.

So how do I get what I want? The answer is parentheses. If I want something evaluated out of precedence (like my OR expression) I surround it with parentheses. So it would be this:

WHERE (ProductName="saw" OR ProductName="hammer") AND Cost BETWEEN 5 AND 10

In the Query Builder, it would look like this:

Figure 5


One last thing about the Query Builder and WHERE clauses. The QB overdoes it with table names and parentheses. If I create the following query in the Query Builder:

Figure 6

I will get the following SQL statement:

SELECT Products.ProductName, Products.Price,
FROM Products
WHERE (((Products.ProductName)="saw"
Or (Products.ProductName)="hammer")
AND ((Products.Cost) In (1,2,5,7)));

Because there is only one table in the query, the table name preceding every field is not necessary. However, it also puts in parentheses, which, while it makes the query technically correct, actually makes it harder to read. Many of these parentheses can be removed, but not all. Naturally, the parentheses that surround the IN list must remain and also the OR statement that we know we want to take precedence. The query can be cleaned to look like this:

SELECT ProductName, Price
FROM Products
WHERE (ProductName="saw"
Or ProductName="hammer")
AND Cost In (1,2,5,7);

Because of the possibility of error when creating complex WHERE clauses in the Query Builder, I usually create them in the SQL View where I know I can control the parentheses.

NOT Operator

There is one more important Logical Operator, and that's the NOT operator. The NOT operator reverses an expression. It can be used with the LIKE, IN, and BETWEEN operators or entire multiple-expression WHERE clauses. It returns all rows EXCEPT those that would have been returned if you hadn't used the NOT.

WHERE ProductName NOT LIKE "ham*"
WHERE Cost NOT IN (1, 2, 5, 8)
WHERE Cost NOT BETWEEN 5 AND 10
WHERE Cost NOT IS NULL

It's important to remember that NOT also reverses both the comparison operators and the other logical operators, so you need to be really careful. For instance,

WHERE NOT(ProductName="saw" Or ProductName="hammer")

Is equivalent to:

WHERE (ProductName<>"saw" AND ProductName<>"hammer")


Next time, I'll look at sorting and groupin in: Select Queries Part 3: Sorting and Grouping (ORDER BY, GROUP BY).

.

Thursday, June 11, 2009

Select Queries Part 1: Simple Queries

Select Queries

The most common type of query is the Select Query. Its purpose is to return a dataset that is a dynamic view into your database. In fact, in other SQL dialects, like T-SQL (SQL Server) or SQL*Plus (Oracle) a saved Select Query is called a View.

The Select Query is very flexible and allows you to see your data in a variety of ways.
  • You can specify which columns you want to see.
  • You can also restrict the rows returned based on one or more criteria.
  • You can join multiple tables together on a common field(s).
  • You can return unique records, the top or bottom number (or percentage), sort your data on one or more columns.
  • And you can make your query interactive by adding parameters.
A Select Query can be created in the Query Builder View or directly in the SQL View. Before Access 2007, you would create a new query by going to the Database Window, select the Queries tab, and click New. Since Access 2007, go to the Create Tab on the Ribbon and select Query Design.

In Access 2007, it looks like Figure 1. The top pane shows which table or tables on which the query is based. The bottom pane shows which fields will be used in the query and the criteria used to restrict the rows.

Figure 1:

Switching to SQL View will result in the above query to look like this:

SELECT Products.ProductName, Products.Cost, Products.Price
FROM Products
WHERE (((Products.Cost)>=5))
ORDER BY Products.ProductName;

It is fairly easy to learn SQL by creating a query in the Query Builder and switching to SQL View. So I'm going to talk about what the various sections of a Select Query are.

In general, a Select Query will look like this:

SELECT [optional Predicate] [Field List]
FROM [Table/Query/Join]
WHERE [criteria to restrict rows]
ORDER BY [Field List] ASC/DESC


SELECT Clause

The Select Clause identifies which fields will be displayed in the result set of the query. The Query Builder always qualifies the field name with the table name, which is not strictly necessary unless the field name exists in multiple tables. This will be important when we talk about Joins, but for now, the Select Clause could also look like this:

SELECT ProductName, Cost, Price

You can also create calculated columns in the Select Clause. For instance, suppose I wanted to know the difference between the cost and the price of a product. I could add a calculation to a new column. In the Query Builder, it would look like this:

Figure 2:

Which would look like this in SQL:

SELECT ProductName, Cost, Price, [Price]-[Cost] AS Margin

You have to give the calculated column a name also called an 'alias', which I decided to call "Margin". In the Query Builder, it goes in front of the calculation. In SQL View, it follows the calculation with the AS keyword. The square brackets ([]) are not strictly necessary here, but there are times that they are. For instance, if you had a space in your field name (Product Name), you would be required to put square brackets around it ( [Product Name] ).

A word about naming: While Access SQL will allow you to use spaces, odd characters, and reserved words as field names by surrounding them in brackets, there are times when this will get you into trouble. Therefore it is best practice to avoid field and table names which will require brackets. You won't go wrong if you name your objects with letters, numbers, and the underscore only and avoid words like Date, Month, Table and other reserved words. You can find a list of reserved words here: http://support.microsoft.com/kb/286335.

You can also select ALL columns with the asterisk:

SELECT *

The asterisk is not the most efficient way of specifying columns, especially in a large table with many columns. If performance is an issue, you might want to restrict your query to just the columns you need. On the other hand, if your base table structure changes frequently, the asterisk can be useful because you don't have to change your query when you add or remove a column.

FROM Clause

The From Clause specifies the table or tables that form the basis of the query. Ultimately, of course, all the data comes tables, but you are not limited to specifying table names. You can also specify saved queries just as if they were tables. This is useful for creating "stacked" queries, that is, queries that build upon a previous query or queries.

In the Query Builder, the From clause is represented by "tables" in the upper pane (see Figure 1). To add another table or query, right-click in the upper pane and select Add Table in the pop-up box.

In SQL View, the from clause looks like this:

FROM Products (Products being a table in our database)
Or
FROM qryProductsSold (assuming qryProductsSold is a saved query)

Adding the From clause to the query above yields:

SELECT ProductName, Cost, Price, [Price]-[Cost] AS Margin
FROM Products

Another very important function of the From clause is to create Joins, that is, merge the records of two or more tables on common fields. For instance, if I wanted to see all of my Orders and their corresponding Order Details, I can join the Order table and the Order Details table. Figure 3 shows the Query Builder View.

Figure 3

The SQL View looks like this:

FROM Orders INNER JOIN OrderDetails ON Orders.OrderID = OrderDetails.OrderID;

In this case, the fully qualified field name (that is with the table name in front of the field name) is required because the OrderID field exists in both tables.

There's a lot more to be said about Joins, and I'll do that in a later post.

Lastly, it is also possible to use an entire SQL statement in a From Clause by simply putting it in parentheses followed by an alias. In the SQL View, this would be perfectly legal:

SELECT FullName, Address, City, State, Zip
FROM (SELECT [FirstName] & " " & [Lastname] AS FullName, Address, City, State, Zip FROM Customer) AS FullCustomer;

This example is not particularly useful, but it does demonstrate how a SQL Statement can be used as a result set in the From clause. You have to be careful, however, because Access doesn't always handle this well. I'll also talk about that in a later post.

Next time, I'll continue with the Select Query by looking at how to restrict the rows returned (WHERE clause) in Select Queries Part 2: Restricting Rows - the Where Clause

.

Monday, June 1, 2009

What Is A Query?

What is a query?
A query is a multi-purpose object in Microsoft Access. It can be used to display data from a single table or multiple tables, perform calculations on your data, modify data within your tables, and create or modify the structure of tables, indexes or relationships. Basically, anything that uses Structured Query Language (SQL) is called a query.

It is important to realize that queries are dynamic, that is, they allow you to view or manipulate data stored in your tables. I had one customer who complained that she was losing data from her database. Turned out she was creating queries and then deleting rows from the query that she didn't want to see. She didn't realize that in doing so, she was actually deleting the rows from the table the query was based on.

Query Builder View vs. SQL View
Access has two ways to create or modify a query. The Query Builder View allows you to create queries in a graphical interface. SQL View allows you to create queries directly in the SQL language. The two views are interactive. In other words, a query created in the Query Builder will be viewable and editable in the SQL View and vice versa. A good way to learn SQL is to create one in the Query Builder and view it in the SQL View.

Types of Queries

Select queries
Select queries create a dynamic view into the database. In fact, in other SQL implementations like SQL Server and Oracle, Select Queries are called Views. They allow you to see just what you want out of your tables. You can choose to see all columns or just selected ones. You can also restrict the number of rows based on criteria. (see: Select Queries Part 1: Simple Queries)

Joins
One important feature of a query is its ability to join multiple tables together on one or more common fields. If you have normalized your database properly (see: What is Normalization?), this is how you put your data back together again. For instance, if you had a CustomerID field in two tables: Customers and Orders, you could join the two tables on the CustomerID field to link all Customers with their respective Orders. (see: What is a Join: Part 1 (Introduction))

Aggregate (Totals) Queries
There are times when you want to aggregate your data, that is, you want to group your data and apply some mathematical function each group. The most common functions are sum and average. Access provides a type of query called a Totals Query that will do this.

Crosstab Queries
A Crosstab Query is a special kind of aggregate query, which calculates an aggregate function (like sum or average), and groups the resulting dataset by two sets of values — one down the side and another across the top. Crosstab queries are particularly useful for displaying summarized data in an easy to understand format.

Action Queries (Data Manipulation Language Queries)
Action Queries (also called Data Manipulation Language Queries or DML) make changes to the data in your tables. You can delete all or some records from your table (see Action Queries: Delete Query). You can change the value of a field or fields in all or some records in your table (see Action Queries: Update Query). You can append records to a table, either selecting records from another table or append specific values (see Action Queries: Append Query). Lastly, you can create a new table from selected record from an existing table (see Action Queries: Make-Table Query).

SQL Specific Queries
The only thing SQL Specific queries have in common is that they cannot be represented in the Query Builder window, so once created, you can never go back to the Query Builder window.

DDL Queries (Data Definition Language Queries)
DDL queries (also called Data Definition Language queries) allow you to create new tables and indexes modify existing ones.

Union Queries
Union queries combine one or more tables. However instead of combining them row-wise like the Join, the union combines tables column-wise. (see: Union Query: Part 1)


Pass-through Queries
A pass-through query is used to return records or manipulate data in a server-based database like SQL Server or Oracle. It simply passes the query on to the host database engine, bypassing the Jet database engine altogether. A pass-through query must be written in the SQL dialect of the host engine. For SQL Server, that would be T-SQL. For Oracle, it would be SQL *Plus.

Updateable vs. Non-updateable Queries

A query can be updateable or non-updateable, depending on a lot of conditions. If a Select Query is updateable, you can actually add, delete and modify records in the base table through the query. In Access, the * button on the Navigation Bar will be visible if the query is updateable. It will be grayed out if it is not. The actual conditions under which a query is updateable can be found in the Access Help System.

Action queries can also be updateable or not depending on the same conditions as the Select query. This is based on whether the dataset returned in the FROM clause of the action query is updateable.

Some queries are inherently non-updateable. These include Union queries, Crosstab queries, and Totals queries. Non-updateable queries can only be used to display data. For more detailed information see "This Recordset Is Not Updateable. Why?"

Creating and Using Queries in Code

Some SQL dialects, like T-SQL, allow you to use procedural constructs like IF statements and Loops, in a query. Access SQL does not. However, Access allows you to create and execute queries in Visual Basic procedures through the use of DAO (Data Access Objects), which is a library of objects that allows you to programmatically manipulate the database.

.