Monday, April 13, 2009

What is the fastest way to return the Record Count of a table?

As with most things in Access, there are many ways to get the record count of a table. So which is the fastest? The answer is: it depends.

There are a many factors that determine performance:

  • Whether the database is on the local drive or a network,
  • Whether the table is native or linked,
  • Whether the table is indexed or not,
  • The number of records in the table.

There are other factors, of course, like processor speed and disk access speed, but we'll assume we're testing all of the proposed methods on the same machine, so the results will be proportional.

I don't propose to test all combinations of these performance variables. If there are only a few records, all the methods shown will return the record count so quickly that it doesn't really matter. Similarly, if the database is on a local drive (C:), there is no perceptible difference. On the other end, if the table is not indexed, some of the methods are so slow, they are impractical.

So I've chosen a table with 7125822 indexed records (mostly because I had a table with 7125822 records available). I will test both native and linked tables, so I need two databases, which I will access across a standard 100mbps network.

Methods
There are three basic methods to find the number of records in a table:

  1. DCount domain aggregate function.
  2. SQL Count aggregate query.
  3. Recordcount property of the OpenRecordset method of the DAO database object.

Each of these methods has several variations as well. For instance, both the DCount and SQL methods can count on * or on a specific field. This may affect performance. The SQL method can use a saved query or an embedded SQL statement, which also may affect its performance. Lastly, the RecordCount method will ONLY work on a native table, not on a linked table. However, I can use the OpenDatabase method to open a linked table as a native table.
So here are the basic methods I tested.

Method 1a (DCount with *)

lntCount = DCount("*", "MyTable")

Method 1b (DCount with fieldname)

lntCount = DCount("ID", "MyTable")

Method 2a (embedded SQL with *)

lntCount = CurrentDb.OpenRecordset _
("SELECT Count(*) AS [CountOfID] FROM MyTable;") _
![CountOfID]


Method 2b (embedded SQL with fieldname)

lntCount = CurrentDb.OpenRecordset _
("SELECT Count([ID]) AS [CountALL] FROM MyTable;") _
![CountALL])


Method 3 (saved query)

lntCount = CurrentDb.OpenRecordset("Count_Query") _
![CountOfID]


Method 4a (RecordCount Native table)

lntCount = CurrentDb. _
OpenRecordset("Mytable").RecordCount

As I said, the RecordCount property will only return the correct record count if the table is a LOCAL table. If it is a linked table or a query, it will always return 1 as the record count. However, I can explicitly open the table with the OpenDatabase method.

Method 4b (RecordCount Linked table)

lntCount = OpenDatabase _
("c:\My Documents\LinkedDatabase.mdb"). _
OpenRecordset("Mytable").RecordCount


Testing the Methods

To minimize any differences in network traffic and processor usage, I ran each of the methods 10 times and averaged the elapsed time. The Figure 1 below shows the outcome.

Figure 1: Comparison of Record Count Methods

The results of whether to use a * or a fieldname in either the DCount or SQL statement appears inconclusive. It also doesn't seem to matter much if the table is native or linked. However, I can say that that the RecordCount property of a TableType Recordset object is the fastest method by at least an order of magnitude.

.

4 comments:

Wayne Phillips said...

Nice find, Roger.

I took this a little further and found the reason _why_ the OpenRecordset version is so much faster...

The reason is that the OpenRecordset version simply reads the cached value which is stored in the Jet table definition page (not accessible by any ordinary means other than with a hex editor), whereas the other methods don't use that (and most of the methods probably don't have _access_ to that extra low-level data) and so have to resort to calculate the value from iterating through the indices.

From a technical point of view it means that the OpenRecordset method will _never_ take any longer on a direct local table since it's just reading a cached 32 bit value from the file - even when you're dealing with millions of records.

Wayne Phillips
http://www.everythingaccess.com

Wayne Phillips said...

Correction to my previous comment: just found out that the record number cached in the Jet table definition page is actually exposed in the TableDef object - so you can get at it that way too.

Roger Carlson said...

Thanks for the follow-up, Wayne.

Wayne Phillips said...

Welcome, Roger.

(p.s. to anyone else reading this, in my first comment I meant: "I took this a little further and found the reason _why_ the OpenRecordset version (method 4a/b) is so much faster...", as I wasn't very clear in that post. Need more sleep :-))