SQL Server has a nice built-in function called COUNT DISTINCT, which is missing in Access SQL.
What does COUNT DISTINCT do? Well, there are times when you want to count distinct values in a query, that is, a count of values without duplicates. For instance, given the following table, how many distinct customers have orders?
|8||1/17/2008||Crystal & Co.||$111.85|
|9||1/17/2008||Crystal & Co.||$511.00|
|13||1/18/2008||Crystal & Co.||$311.00|
|14||1/18/2008||Crystal & Co.||$811.50|
In SQL Server, I can do this:
SELECT COUNT(DISTINCT Customer) AS CountOfCustomer FROM Orders
Which will give me the following:
In Access, if I use the Distinct predicate with the count:
SELECT DISTINCT Count(Customer) AS CountOfCustomer FROM Orders;
Since Access SQL does not have the Count Distinct function, what can I do?
There are actually three different methods for simulating the Count Distinct:
- Subqueries in the FROM Clause
- Subqueries in the Field List
- User-Defined Function (this post)
- Crosstab Query (reader submitted method)
A user-defined function is simply a piece of VBA code that you can call from other code or from a query. In this case, I'll call it from a query.
This process works pretty much like the built-in Domain functions like DCount, DMax, DLookup, etc. If you're unfamiliar with Domain Functions, you can read more in my post: Domain Functions Demystified.
Into the function, I send the name of a field, a table (or query), and a Where clause (without the WHERE keyword), In addition, I also need to send the Group By clause (again without the GROUP BY keyword).
Then I take the query created in Subqueries in the Field List and build it in code with the addition of a WHERE clause that restricts the query to a single row matching the row being evaluated in the query. The evaluated query returns a single value.
Lastly, I open a recordset and return the value from the "countof" column from the function.
First of all, create the following Function in a General Module:
Function DistinctCount(FieldName As String, _
TableName As String, _
WhereClause As String, _
GroupByClause As String) As Long
'declare variablesDim db As DAO.Database
Dim rs As DAO.Recordset
Dim strSQL As String
On Error GoTo DistinctCount_Error
'set database variable to current databaseSet db = CurrentDb
'create SQL string with function argument valuesstrSQL = "SELECT Count(Temp." & FieldName & ") as countof " & _
" FROM (SELECT " & GroupByClause & ", Customer FROM " & _
TableName & " GROUP BY " & GroupByClause & ", " & _
FieldName & " ORDER BY " & GroupByClause & ") AS Temp " & _
" WHERE " & WhereClause & _
" GROUP BY " & GroupByClause
'open a recordset of the resultant SQL statementSet rs = db.OpenRecordset(strSQL)
'if a record is foundIf Not rs.EOF Then
'return the value DistinctCount = rs!countof
'return -1 to indicate "no records" DistinctCount = -1
On Error GoTo 0
'destroy object variables Set rs = Nothing
Set db = Nothing
MsgBox "Error " & Err.Number & " (" & Err.Description & _
") in procedure DistinctCount of Module Module1"
Call the Function
Calling the function, from a query looks something like this in the Query Builder:
CountOfCustomer: DistinctCount("Customer","Orders","OrderDate =# " & [OrderDate] & "#","OrderDate")
In SQL view:
SELECT Orders.OrderDate, DistinctCount("Customer","Orders","OrderDate =# " & [OrderDate] & "#","OrderDate") AS CountOfCustomer, Count(Orders.OrderID) AS CountOfOrderID, Sum(Orders.Amount) AS SumOfAmount
GROUP BY Orders.OrderDate
ORDER BY Orders.OrderDate;
Which results in the following:
How it works
The function runs for each row evaluated. For instance, in the above query, for 1/17/2008, the query that is created in the code looks like this:
SELECT Count(Temp.Customer) as countof FROM (SELECT OrderDate, Customer FROM Orders GROUP BY OrderDate, Customer ORDER BY OrderDate) AS Temp WHERE OrderDate =#1/17/2008# GROUP BY OrderDate
The SQL for the next row is:
SELECT Count(Temp.Customer) as countof FROM (SELECT OrderDate, Customer FROM Orders GROUP BY OrderDate, Customer ORDER BY OrderDate) AS Temp WHERE OrderDate =#1/18/2008# GROUP BY OrderDate
The SQL for the last row is:
SELECT Count(Temp.Customer) as countof FROM (SELECT OrderDate, Customer FROM Orders GROUP BY OrderDate, Customer ORDER BY OrderDate) AS Temp WHERE OrderDate =#1/19/2008# GROUP BY OrderDate
As you can imagine, with a lot of customers, this might not be the best performer. But I'll leave that for next time, when I'll look at the methods and discuss the pros and cons of each.