SQL GROUP BY
The SQL GROUP BY clause is used along with the SQL aggregate functions and specifies the groups where selected rows are placed. WHEN one or more aggregate functions are presented in the SQL SELECT column list, the SQL GROUP BY clause calculates a summary value for each group. To grasp this concept we will illustrate the SQL GROUP BY application with example. Consider the already familiar Weather table:
City | AverageTemperature | Date |
New York | 22 C | 10/10/2005 |
Seattle | 21 C | 10/10/2005 |
Washington | 20 C | 10/10/2005 |
New York | 18 C | 10/09/2005 |
Seattle | 20 C | 10/09/2005 |
Washington | 17 C | 10/09/2005 |
Our task is to calculate the average temperature for each of the cities in the Weather table. Here is how to accomplish that using the SQL GROUP BY clause:
The result of this SQL GROUP BY statement is the following:
City | AverageTemperature |
Washington | 18.5 C |
Seattle | 20.5 C |
New York | 20 C |
As you can see in the result data set, we have one row for each city. Each city in our SQL ORDER BY statement, represents one group because the City column is specified in the GROUP BY clause.
But what is the number in the AverageTemperature column? How come we have only one number for each city when we have multiple temperature entries for each of the cities in the Weather table? Because the AverageTemperature column is an argument for the SQL AVG aggregate function, the single value we see in the final result set is simply the average temperature value for its respective city:
Washington average temperature = (20 + 17)/2 = 18.5 Seattle average temperature = (21 + 20)/2 = 20.5 New York average temperature = (20 + 20)/2 = 20
When GROUP BY is used, one of the following must be true:
1. Each column which is not part of an aggregate expression in the select list is presented in the SQL GROUP BY clause.
2. The SQL GROUP BY expression matches exactly the select list expression.
Here is an example of incorrect GROUP BY statement:
This SQL GROUP BY statement is wrong because the Date column is presented in the SQL SELECT list, but is not in the GROUP BY clause. The correct SQL GROUP BY statement should look like this: