GROUP BY
The GROUP BY clause is one of the most commonly used parts of SQL, and is similar to Excel’s Pivot Table functionality. There are several nuances in using GROUP BY, so let’s consider a simple example: calculating the average population estimate for NY, CA, and TX in a single query.
SELECT ST, AVG("2022 estimate")
FROM database_table
WHERE ST IN ('NY', 'CA', 'TX')
GROUP BY ST
The result has 3 rows, one each for NY, CA, and TX, with the average as a new column. GROUP BY groups the rows by the ST column, and AVG is calculated separately within each state’s group.
Without the WHERE filter, think of the GROUP BY as returning one row per distinct state: 50 rows for the 50 states in the broader table. An aggregate function such as AVG is essential here because each state can appear only once in the result. We can’t have 6 rows for TX, for example, so the AVG aggregates across those 6 rows and returns a single value.
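To see the grouping in action, here is a minimal sketch that runs the query against an in-memory SQLite database using Python's built-in sqlite3 module. The sample rows and their values are made up purely for illustration (they are not real population figures), and the column name "2022 estimate" is assumed to be the literal column name, which is why it is wrapped in double quotes.

```python
import sqlite3

# Hypothetical sample data: several rows per state, values invented for illustration.
rows = [
    ("NY", 100), ("NY", 300),
    ("CA", 200), ("CA", 400), ("CA", 600),
    ("TX", 500),
    ("FL", 700), ("FL", 900),
]

conn = sqlite3.connect(":memory:")
# The column name contains a space and starts with a digit, so it must be quoted.
conn.execute('CREATE TABLE database_table (ST TEXT, "2022 estimate" REAL)')
conn.executemany("INSERT INTO database_table VALUES (?, ?)", rows)

# The query from the text, with ORDER BY added so the row order is predictable.
filtered = conn.execute("""
    SELECT ST, AVG("2022 estimate")
    FROM database_table
    WHERE ST IN ('NY', 'CA', 'TX')
    GROUP BY ST
    ORDER BY ST
""").fetchall()

# Without the WHERE clause, every distinct state in the table gets one row.
# COUNT(*) shows how many source rows were collapsed into each group.
all_states = conn.execute("""
    SELECT ST, COUNT(*), AVG("2022 estimate")
    FROM database_table
    GROUP BY ST
    ORDER BY ST
""").fetchall()

for row in filtered:
    print(row)
for row in all_states:
    print(row)
```

With this sample data, the filtered query returns three rows (one per requested state), while the unfiltered query returns one row for each of the four distinct states in the table, however many source rows each state had.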