GROUP BY Statement in SQL

Using GROUP BY: Grouping Data Based on One or More Columns

The GROUP BY statement in SQL is used to arrange identical data into groups. This statement is often used with aggregate functions (COUNT, AVG, SUM, MIN, and MAX) to perform a calculation on each group of data. GROUP BY can be a powerful tool for analyzing data and extracting meaningful insights from a database.

Syntax of GROUP BY:

SELECT column_name(s), aggregate_function(column_name) 
FROM table_name 
WHERE condition 
GROUP BY column_name(s);

Example 1: Grouping by One Column

Consider a table named employees that contains data about the employees of a company, including their department. To count the number of employees in each department, you would use:

SELECT department, COUNT(*) AS employee_count 
FROM employees 
GROUP BY department;

In this query:

COUNT(*) counts all rows in each group.
department is used to group rows with the same department value together.
employee_count is a label for the count column in the output.

Aggregate Functions: Using COUNT, AVG, SUM, MIN, and MAX with GROUP BY

Aggregate functions perform a calculation on a set of values and return a single value. When used with GROUP BY, these functions produce a summary for each group.

Example 2: Using AVG with GROUP BY

If you want to find the average salary in each department:

SELECT department, AVG(salary) AS average_salary 
FROM employees 
GROUP BY department;

This query calculates the average salary (AVG(salary)) for each department and displays it along with the department name.

Example 3: Using SUM, MIN, and MAX

To get more insights, such as the total salary expenditure, and the minimum and maximum salaries in each department:

SELECT department, SUM(salary) AS total_salary, MIN(salary) AS min_salary, MAX(salary) AS max_salary 
FROM employees 
GROUP BY department;

This query shows the total, minimum, and maximum salaries for each department, giving a clear view of the salary distribution within departments.

Example 4: Grouping by Multiple Columns

You can also group by more than one column. For example, to find the average sales by each employee within each region:

SELECT region, employee_id, AVG(sales_amount) AS average_sales 
FROM sales 
GROUP BY region, employee_id;

This groups the sales data first by region and then by employee_id, calculating the average sales for each employee within each region.

Practical Tips and Considerations

Order of Columns: The order of columns in the GROUP BY clause is important when grouping by multiple columns. The results are grouped by the first column and then sub-grouped by the next.
Where vs. Having: When using GROUP BY, if you need to filter rows based on the group, use the HAVING clause instead of WHERE. WHERE filters rows before they are grouped, and HAVING filters after groups are formed.

Example 5: Using HAVING with GROUP BY

To find departments where the average salary is greater than $50,000:

SELECT department, AVG(salary) AS average_salary 
FROM employees 
GROUP BY department HAVING AVG(salary) > 50000;

This query first groups the employees by department and calculates the average salary per department. Then, it filters out the departments where the average salary is not greater than $50,000.

Previous Lesson

Back to Tutorial

Next Lesson

Cookie	Duration	Description
cookielawinfo-checkbox-advertisement	1 year	Set by the GDPR Cookie Consent plugin, this cookie is used to record the user consent for the cookies in the "Advertisement" category .
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
CookieLawInfoConsent	1 year	Records the default button state of the corresponding category & the status of CCPA. It works only in coordination with the primary cookie.
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Cookie	Duration	Description
_ga	2 years	The _ga cookie, installed by Google Analytics, calculates visitor, session and campaign data and also keeps track of site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognize unique visitors.
_gid	1 day	Installed by Google Analytics, _gid cookie stores information on how visitors use a website, while also creating an analytics report of the website's performance. Some of the data that are collected include the number of visitors, their source, and the pages they visit anonymously.
tk_lr	1 year	The tk_lr is a referral cookie set by the JetPack plugin on sites using WooCommerce, which analyzes referrer behaviour for Jetpack.
tk_or	5 years	The tk_or is a referral cookie set by the JetPack plugin on sites using WooCommerce, which analyzes referrer behaviour for Jetpack.
tk_r3d	3 days	JetPack installs this cookie to collect internal metrics for user activity and in turn improve user experience.
tk_tc	session	JetPack sets this cookie to record details on how user's use the website.

SQL Basics