Thursday, February 25, 2016

What Is A Join: Part 3 (Cartesian Joins)

Cartesian Joins (Cross Joins)

This is part 3 of a multi-part series on the SQL Join. Last time I talked about the most common type of join, the Equi-Join (see: What is a JOIN: Part 2 (Inner Join)). This time, I'm going to talk about the least common join type: the Cartesian or Cross Join.

Cartesian Joins are notable mostly for being avoided. A Cartesian Join matches every record in one table to every record in another. Except in very rare instances, this is something to be avoided at all costs. Since it joins all records from both tables, the resultset of a Cartesian join can quickly become huge. A Cartesian join of two tables, each holding a thousand records will result in a query displaying a million records (1000 x 1000). Cross-joining two tables with a hundred thousand records each will exceed the 2GB limit of Access.

A Cartesian join is most commonly created in the Query Builder by neglecting to add a join line between the tables.


Since Orders has 4 records and OrderDetails has 7 records, the resultset of the Cartesian join will have 28 records:

Cartesian Join

OrderNumber

OrderID

Quantity

111

1

1

222

1

1

333

1

1

444

1

1

111

2

2

222

2

2

333

2

2

444

2

2

111

3

3

222

3

3

333

3

3

444

3

3

111

1

2

222

1

2

333

1

2

444

1

2

111

1

1

222

1

1

333

1

1

444

1

1

111

3

2

222

3

2

333

3

2

444

3

2

111

3

1

222

3

1

333

3

1

444

3

1

 

As you can see, the information is fairly useless.

The corresponding SQL statement (SQL View) looks like this:

SELECT Orders.OrderNumber, OrderDetails.OrderID, OrderDetails.Quantity
FROM Orders, OrderDetails;

There is no Join clause between the tables or Join criteria, simply a comma between the tables.

But as I said, there are some rare uses for a Cartesian join.

One almost trivial use is to add a value to each record of a table. For instance, suppose I have a table called User with one record.

User

UserName

Roger

 

If I create a Cartesian join with my Orders table, I'll get:

Cartesian Join 2

OrderID

CustomerID

OrderNumber

UserName

1

1

111

Roger

2

2

222

Roger

3

3

333

Roger

4

1

444

Roger

 

Now, of course there are other ways to do this. However, the Cartesian join has the additional property of being non-updateable. If this is a desirable effect, then a Cartesian join can be useful.

But the main use of a Cartesian join is in creating joins in the Where clause, which I'll talk about in What is a Join: Part 4 (Equi-Joins in the WHERE Clause) .

[top]

Wednesday, February 17, 2016

What is a JOIN: Part 2 (Inner Join)

Equi-Join (Inner Join)

This is part 2 of a multi-part series on the SQL Join. For an introduction to the whole topic, take a look at:
What Is A Join: Part 1.

This time I'm going to talk about the Equi-Join, also called an Inner Join. An equi-join combines records from two tables which have common values in both tables and displays only those with matching records. Since an equi-join is the most common type of join, it is often simply referred to as a Join.

Consider the following Order and OrderDetail table

Orders

OrderID

OrderNumber

1

111

2

222

3

333

4

444

 

OrderDetails

OrderDetailID

OrderID

Quantity

1

1

1

2

2

2

5

3

3

6

1

2

7

1

1

8

3

2

9

3

1

Joining them on the OrderID field results in:

Single Join Query

OrderNumber

OrderID

Quantity

111

1

1

111

1

2

111

1

1

222

2

2

333

3

3

333

3

2

333

3

1


Notice that although the Orders table has a record for Order Number 444, it does not display in Single Join Query. That's because there are no matching records in OrderDetails. An equi-join shows only those records with matching values in both tables.

The SQL for this query is:

SELECT Orders.OrderNumber, OrderDetails.OrderID, OrderDetails.Quantity
FROM Orders INNER JOIN OrderDetails
ON Orders.OrderID = OrderDetails.OrderID;

In the Query Builder, it looks like this:


Notice the 1 and infinity symbol (∞) on the Join line. These indicate that a Relationship exists between these tables and that the OrderID is a Primary Key for the table on the "One-side". This is important, because if you want a query with a join of two tables to be updateable, the field or fields on the "one-side" of the join must either be a Primary Key or a Unique Index. I'll talk more about the updateability of queries in a later post.

Multiple Joins

A query is not limited to a single join. Each join is between just two tables, but the FROM clause of the query can have many joins. For instance, I can add the Products table

Products

ProductID

ProductName

Cost

Price

1

hammer

$1.00

$2.00

2

nail - 10p

$0.01

$0.02

3

saw

$5.00

$10.00

4

wrench

$6.00

$7.00

5

nail - 8p

$0.01

$0.02

6

drill

$20.00

$40.00

7

screw - 1x4

$0.02

$0.04

8

hammer

$4.00

$5.00

to the above query to get the product and price for the order.

Something like this:

MultipleJoin

OrderNumber

OrderID

Quantity

ProductName

Price

111

1

1

hammer

$2.00

222

2

2

nail - 10p

$0.02

333

3

3

saw

$10.00

111

1

2

wrench

$7.00

111

1

1

saw

$10.00

333

3

2

hammer

$2.00

333

3

1

saw

$10.00

In the Query Builder, it looks like this:


The SQL is:

SELECT Orders.OrderNumber, OrderDetails.OrderID, OrderDetails.Quantity, Products.ProductName, Products.Price
FROM Products INNER JOIN
(Orders
INNER JOIN OrderDetails
ON Orders.OrderID = OrderDetails.OrderID)
ON Products.ProductID = OrderDetails.ProductID;

Next time, I'll discuss the least common type of join, the Cartesian or Cross Join.

 

.

Wednesday, February 10, 2016

What is a Join: Part 1 (Introduction)

What is a Join: Part 1

Introduction

This is the beginning of a multi-part series devoted to Joins in Microsoft Access.

A Join connects the records of two tables based on common information in the records. In What Is Normalization: Part II, I discussed how to organize data into smaller tables to reduce redundancy. In What is Normalization: Part III, I discussed relationships and their role in defining how tables should be put back together so all of the data is retained. The Join is a SQL (Structured Query Language) implementation of a relationship. In other words, if normalization splits tables apart, the Join is how you put them back together.

In Select Queries: Part I, I talked briefly about the SQL FROM clause, where Joins are created. The general SQL syntax for the most common type of Join (Equi-Join) is:

FROM Table1 JOIN Table2 ON Table1.JoinField = Table2.JoinField

Where JoinField is a field in each table which holds the same data. So, if I had Order and OrderDetail tables, they would both have to have an OrderID field in common to relate the records in the two tables. The FROM clause for this join would be:

FROM Order JOIN OrderDetails ON Order.OrderID = OrderDetails.OrderID

Although the Equi-Join (also known as an Inner Join) is the most common, there are many different types of joins:

  1. Equi-Join (Inner Join)
  2. Cartesian Join (Cross Join)
  3. Equi-Joins in the Where Clause
  4. Outer Join (Left Join or Right Join)
    1. Unmatched Query
    2. Full Outer Join
  5. Self-Join
  6. Theta Join

In subsequent posts, I'll be discussing each of these in turn. Next up, the Equi-Join.

.

Friday, February 5, 2016

Top Query Revealed: Parameters

Top Query Revealed: Parameters

This is the fifth and last in a series on the Top Query. In previous posts, I've discussed Simple Top Queries, Aggregate and Grouping Top Queries, Problem of Ties in a Top Query and Finding Random Records.

Create a Parameter for Top Value?

One question which often comes up with regard to Top queries is if you can supply the TOP predicate as a parameter like you can with Where criteria. Unfortunately, the answer is you can't, at least not with native SQL. You can, however, use code to modify your Top query programmatically.

To do this, we need to create a subroutine called TopParameter in a General Module. Of course, I could hard code it for a specific query, but it would be far more useful to have code that will modify any Top query that I supply it. So my code will include a QueryName argument that passes in the name of the query to be modified.

Sub TopParameter(QueryName As String)
On Error GoTo Err_TopParameter

Next, I'll declare some object and scalar variables.

Dim db As DAO.Database
Dim qdf As DAO.QueryDef
Dim strSQL As String
Dim strTopVal As String
Dim strSQLTemp As String
Dim StartPos As Integer

Then I need to open the current database and set the database object variable (db) to it. The QueryDefs method of the database object lets me open the query definition of the Top query I want to modify and set it to a querydef variable (qdf).

Set db = CurrentDb
Set qdf = db.QueryDefs(QueryName)

The SQL property of the QueryDef object reads the SQL statement of the query into a string variable so I can modify it.

strSQLTemp = qdf.SQL

Since every Top query has the TOP predicate with a space before and after it, I can find the position of the first space after the existing Top value. I need to do this because I have to remove the existing Top value and replace it with the new value.

StartPos = InStr(strSQLTemp, " TOP ") + 5

It's a good idea to test whether the query is actually a Top query or not, which I can do like this:

If StartPos = 5 Then
MsgBox "Not a Top Query"
GoTo Exit_TopParameter
End If

Now I'll prompt the user for the new Top value with an inputbox.

strTopVal = InputBox("Enter TOP value:")

Next I'll read all the text after the old Top value into a variable. To do that, find the first space after the TOP predicate like this:

strSQLTemp = Mid(strSQLTemp, InStr(StartPos, strSQLTemp, " "))

Now I've got all the pieces needed to rebuild the SQL string with the inputted Top value.

strSQL = "SELECT TOP " & strTopVal & strSQLTemp

Lastly, I'll set the SQL property of the query to the new SQL string, which will save the query with the new Top value.

qdf.SQL = strSQL

To finish it off, I'll add the Exit_TopParameter label which I used earlier to exit the routine if it is not a Top query. In addition, I'll clean up the object variables and add error trapping.

Exit_TopParameter:
db.Close
qdf.Close
Set db = Nothing
Set qdf = Nothing
Exit Sub

Err_TopParameter:
MsgBox Err.Description
Resume Exit_TopParameter
End Sub

To call the Query, I need to first call this code, then open the query, like this:

Call TopParameter("MyTopQuery")
DoCmd.OpenQuery "MyTopQuery", acNormal, acEdit

And that's it. Every time the code is run, it prompts for a new Top value, modifies, and then opens the query.

A free sample illustrating this process (and, indeed, all the information in the Top Query Series) can be found here: http://www.rogersaccesslibrary.com/forum/forum_posts.asp?TID=233.

.