SQL query practics focusing on Null, Count, Sum and Avg

Icon made by Freepik from www.flaticon.com

SQL is the language for all database queries. It is easy to start due to its structural characteristics, but sometimes, to get correct result SQL queries, especially for the beginner of SQL user, practic is needed to understand the possible situations.

At this article, Null, Count, Sum and Avg will be the focuses. The code is based on MySQL. Let’s start.

  • NULL

NULL is different as zero or a space. For example, for business, zero might mean that sales has attampted, but not success to sale the product, but null means there is no attampt. That’s meaning ful difference. Zero or space are values, the syntax is value=0 or ´´, but for null, the syntax is: IS NULL or IS NOT NULL.

Using below codes to create a table:

The table looks like:

Record 1: with full information, record 2: last name is empty, record 3: last name is text with NULL, record 4: last name is null.

How to get record 1 is direct, and let’s see how to get different record for 2 to 4:

The record is

  • The record is
  • COUNT

Use below code to create a table and insert data:

Count is an aggregate function. It can be used to find the null or non-null data, not only numerical but also text:

The COUNT(*) returns the number of rows including duplicate, non-NULL and NULL rows.

The COUNT(expression) returns the number of rows that do not contain NULL values as the result of the expression.

The COUNT(DISTINCT expression) returns the number of distinct rows that do not contain NULL values as the result of the expression.

The return type of the COUNT() function is BIGINT. The COUNT() function returns 0 if there is no matching row found.

The demo database looks like:

COUNT(*)

The result looks like:

Using star(*) will count all of the records including null.

The result looks like:

COUNT(expression):

The result looks like:

At this case, we can see that given a specifical column, the null will not be counted.

The result looks like:

COUNT(DISTINCT expression)

The result looks like:

With distinct, null will not be counted at the result.

  • SUM and mathmatical operator +:

mathmatical operator did not use aggregate, but SUM does. SUM can’t be used as count with star: it needs to sepcify the column.

Let’s create a database as below:

The table is:

The result looks like:

The result looks like:

The result looks like:

The result looks like:

Now you might have found the difference between sum(val_1+val_2) =35 and sum(val_1)+sum(val_2)=37, this difference comes when using operator for null: it can’t be recognized as null automatically, and how to overcome this problem:

useing COALESCE to convert null to zero can help on it

The result looks like:

  • Avg

Avg() function itself doesn’t include the null value.

Based on the table above from Sum, let’s check the average value:

Normally, by definition, AVG(val_1) = SUM(val_1)/COUNT(*) but, because of the existing of null, if the question is to be in line with the definition AVG() should be changed as AVG(val_1)=SUM(val_1)/COUNT(val_1)

Let’s check the result, if the question is to get the average of all id, including the null value:

The result looks like:

If the question is to get the average without the id if the value is null:

The result looks like:

Same for val_2:

The result looks like:

But how about Avg(val_1+val_2)? As we already know, if using avg(val_1+val_2), it equales to 5, which doesn’t include null+null order 2+null, the count is 7 which also excluds the null values.

The result looks like:

If the question is to get all of the value including null (transfered into zero) for all of the value including null:

The result looks like:

As you can see, including or excluding null value can change the value very differently, sometimes it is very hard to recognize it if we don’t understand the basic definition behind.

Hope it helps directly and simply.

(The process of writing this article inspired me summarize further on SQL, so another about translating business requirements into queries for SQL in data analysis is also done. In case you are interested in it, have fun.)

passionate about data analysis and data science

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store