**The DMedian() Function:**

In my last post, (__How do I calculate a Median in Access? Part 1____)__ I defined Median and described in general terms how to calculate a median in Access. This time, I'm going to describe in detail the **DMedian()** user-defined function.

First of all, you have to create a Public Function in a General Module. Create a new module by selecting **Module** in the **Database Window** and click the **New** button. Immediately click the **Save** button and name the module "**basDMedian**". Next, type the following function into the module.

Public Function DMedian(FieldName As String, _

TableName As String, _

Optional Criteria As Variant) As Variant

' Created by Roger J. Carlson

' __http://www.rogersaccesslibrary.com__

' Terms of use: You may use this function in any application, but

' it must include this notice.

'Returns the median of a given field in a given table.

'Returns -1 if no recordset is created

' You use this function much like the built-in Domain functions

' (DLookUp, DMax, and so on). That is, you must provide the

' 1) field name, 2) table name, and 3) a 'Where' Criteria.

' When used in an aggregate query, you MUST add each field

' in the GROUP BY clause into the into the Where Criteria

' of this function.

' See Help for more on Domain Aggregate functions.

On Error GoTo Err_Median

Dim db As DAO.Database

Dim rs As DAO.Recordset

Dim strSQL As String

Dim RowCount As Long

Dim LowMedian As Double, HighMedian As Double

'Open a recordset on the table.

Set db = CurrentDb

strSQL = "SELECT " & FieldName & " FROM " & TableName

If Not IsMissing(Criteria) Then

strSQL = strSQL & " WHERE " & Criteria & " ORDER BY " & FieldName

Else

strSQL = strSQL & " ORDER BY " & FieldName

End If

Set rs = db.OpenRecordset(strSQL, dbOpenDynaset)

'Find the number of rows in the table.

rs.MoveLast

RowCount = rs.RecordCount

rs.MoveFirst

'Determine Even or Odd

If RowCount Mod 2 = 0 Then

'There is an even number of records. Determine the low and high

'values in the middle and average them.

rs.Move Int(RowCount / 2) - 1

LowMedian = rs(FieldName)

rs.Move 1

HighMedian = rs(FieldName)

'Return Median

DMedian = (LowMedian + HighMedian) / 2

Else

'There is an odd number of records. Return the value exactly in

'the middle.

rs.Move Int(RowCount / 2)

'Return Median

DMedian = rs(FieldName)

End If

Exit_Median:

'close recordset

rs.Close

Exit Function

Err_Median:

If Err.Number = 3075 Then

DMedian = 0

Resume Exit_Median

ElseIf Err.Number = 3021 Then

'EOF or BOF ie no recordset created

DMedian = -1

Resume Exit_Median

Else

MsgBox Err.Description

Resume Exit_Median

End If

End Function

To use this function in a query, simply use the query above that calculates the average with the DAvg, and replace DAvg with DMedian, like this:

SELECT Hospital, DMedian("TimeToProc", "tblTimeToProcedure",

"[Hospital] = '" & Hospital &"'") AS MedianTimeToProc

FROM tblTimeToProcedure

GROUP BY Hospital

Which will return this:

Hospital | MedianTimeToProc |

A | 2.60 |

B | 3.5 |

**Caveats:**

You have to be careful, however, because it is very easy to produce data that looks correct, but is not. If you're not careful to include every field contained in the Where clause and Group By clause of your main query in the Criteria of the DMedian function, your data will not be correct.

Let's look at a slightly more complex example. Suppose we want to aggregate our TimeToProc data at the Hospital and Unit level. We might have a table that looks like this:

tblTimeToProcedure

PatientID | Hospital | Unit | TimeToProc |

1 | A | C1 | 1.5 |

2 | A | C2 | 2 |

3 | A | C1 | 2.2 |

4 | A | C2 | 3 |

5 | A | C1 | 2 |

6 | A | C1 | 4 |

7 | A | C2 | 5 |

8 | A | C2 | 100 |

9 | B | A1 | 2.5 |

10 | B | A1 | 2 |

11 | B | A1 | 4.2 |

12 | B | A2 | 3 |

13 | B | A2 | 2.2 |

14 | B | A2 | 4 |

15 | B | A1 | 5 |

16 | B | A1 | 6 |

Suppose I create a totals query in Access that aggregates the data on **Hospital** and **Unit**, but only provide the **Hospital** to the DMedian function's criteria. Like this:

SELECT Hospital, Unit, DMedian("TimeToProc","tblTimeToProcedure",

"[Hospital] = '" & [Hospital] & "'") AS MedianTimeToProc

FROM tblTimeToProcedure

GROUP BY Hospital, Unit;

I'll get a result that looks like this:

Hospital | Unit | MedianTimeToProc |

A | C1 | 2.60 |

A | C2 | 2.60 |

B | A1 | 3.5 |

B | A2 | 3.5 |

This is obviously wrong. The DMedian function is aggregating ONLY at the Hospital level, while the main query is aggregating at both Hospital and Unit.

In order to produce the correct median values, we have to supply both the Hospital and the Unit to the DMedian function.

DMedian("TimeToProc", "tblTimeToProcedure", "[Hospital] = '" & Hospital &"' AND [Unit] = '" & Unit & "'")

The finished query will look like this:

SELECT Hospital, Unit, DMedian("TimeToProc", "tblTimeToProcedure",

"[Hospital] = '" & Hospital &"'AND [Unit] = '" & Unit & "'") AS MedianTimeToProc

FROM tblTimeToProcedure

GROUP BY Hospital, Unit;

This will produce the correct results:

Hospital | Unit | MedianTimeToProc |

A | C1 | 2.10 |

A | C2 | 4 |

B | A1 | 4.2 |

B | A2 | 3 |

So the rule is that every field that participates in the **Where** clause or **Group By** clause of the main aggregate query must also be represented in the **Criteria** argument of the domain aggregate function.

**Making Sure Your Query is Correct**

One way to make sure you are aggregating the data correctly in both the main query and the domain aggregate function is to create an aggregate query using the SQL AVG function, and then create the same query using the DAvg domain aggregate function. If the two queries produce the same values, you know you have created it correctly. Then just replace DAvg with your DMedian function.

For instance, first create an AVERAGE aggregate query using the SQL AVG function. The SQL statement would look like this:

SELECT Hospital, Unit, AVG(TimeToProc) AS AvgerageTimeToProc

FROM tblTimeToProcedure

GROUP BY Hospital, Unit

Hospital | Unit | AverageTimeToProc |

A | C1 | 2.43 |

A | C2 | 27.50 |

B | A1 | 3.94 |

B | A2 | 3.07 |

Now create the same query, but this time use the DAvg function.

SELECT Hospital, Unit, DAvg("TimeToProc", "tblTimeToProcedure",

"[Hospital] = '" & Hospital &"' AND [Unit] = '" & Unit & "'") AS

AverageTimeToProc

FROM tblTimeToProcedure

GROUP BY Hospital, Unit;

Hospital | Unit | AverageTimeToProc |

A | C1 | 2.43 |

A | C2 | 27.50 |

B | A1 | 3.94 |

B | A2 | 3.07 |

Since the two queries produce identical results, we know that we have set up the domain function correctly. Now just replace DAvg with your DMedian function:

SELECT Hospital, Unit, DMedian("TimeToProc", "tblTimeToProcedure",

"[Hospital] = '" & Hospital &"' AND [Unit] = '" & Unit & "'") AS

MedianTimeToProc

FROM tblTimeToProcedure

GROUP BY Hospital, Unit;

Hospital | Unit | MedianTimeToProc |

A | C1 | 2.10 |

A | C2 | 4 |

B | A1 | 4.2 |

B | A2 | 3 |

**On-line Database Sample:**

On my website, (__http://www.rogersaccesslibrary.com__), there is a small sample database called Median.mdb. This sample has the above example and another, more complex, example in an Access database, so you can see how the DMedian function works.

There are two versions of the DMedian function. DMedian is in the basMedianADO module, and DMedian is in the basMedianDAO. You must use the DMedian97 in an Access 97 database.

**Conclusion:**

Because Microsoft Access^{®} does not have a Median aggregate function, finding the median value of a group of data can be a challenge. But with a little programming expertise, you can create your own DMedian domain-aggregate function that you can use to find the Median value for an aggregate query.