top of page
  • Dec 23, 2016

Here's a continuation of my review of Getting Started with SQL, an O'Reilly guide by Thomas Nield, last blogged about on November 27, 2016.

Chapter 8 of the guide covers the JOIN function, which is roughly analogous to Excel's VLOOKUP formula but allows a whole array to be pulled.

The INNER JOIN function allows a user to merge fields from two different tables by referring to a field that both tables have in common. In this example, we see that the CUSTOMER_ORDER and CUSTOMER tables only have the CUSTOMER_ID field in common.

So we can see this command selects a number of fields from both tables and then with the command FROM [TABLE 1] INNER JOIN [TABLE 2] specifies where they are to come from and which field is used as a cross reference. When the same field name appears in more than one table we put the table name in front of the field name separated with a period. INNER JOIN will exclude any records that are not referenced in both tables by the same CUSTOMER_id. If INNER JOINT is substituted with LEFT JOIN, all of the records from TABLE 1 will be in the results regardless. The fields which only appear in TABLE 2 will show NULL values. RIGHT JOIN simply does the opposite, and OUTER JOIN will get all records from both tables.

If you want to check for orphaned records in TABLE 1 - just find the records from TABLE 1 that don't have any matches in TABLE 2 you can run this script:

The WHERE command limits the results to the null values in TABLE 2.

The script to combine three tables would appear this way, with TABLE 1 being referenced first at FROM, and then TABLE 2 and TABLE 3 being specified by separate INNER JOIN commands.


  • Nov 28, 2016

Here's a continuation of my review of Getting Started with SQL, an O'Reilly guide by Thomas Nield, last blogged about on November 14, 2016.

In Chapter 7 of the guide, we learn about the SQL Case statement. In this example, we see how a script run in SQLiteStudio does the following:

1. Selects five fields [report_code; year; month; data; and wind_speed] from the table named STATION_DATA. The SELECT command at the beginning references these fields, and then the FROM command at the end pulls them from the table.

2. Creates entries named 'HIGH' in a new field called wind_severity when the wind speed is greater than 40. The new field is created following the END command, after conditions are given between WHEN and THEN after the CASE command.

3. Creates entries named 'MODERATE' in the new field called wind_severity when the wind speed is between 30-40.

4. Lists all other wind speeds as 'LOW' in the new field.

When you enter a simpler script like this, we merely create a new field called 'tornado_precipitation' that lists the number of tornadoes for each month in each year.

SELECT year, month, SUM (precipitation) as tornado_precipitation FROM station_data WHERE tornado = 1 GROUP by year, month

A script like this one below takes into account when a specified value is not listed in a field. So on the second line we're getting the average temperature when there are entries in the RAIN or HAIL fields but not counting the entries in these fields that are null, and putting them in the new 'avg_precipitation_temp' field. The WHERE command restricts the results to where year is higher than 2000.

SELECT month,

AVG(CASE WHEN rain OR hail THEN temperature ELSE null END) AS avg_precipitation_temp,

AVG(CASE WHEN NOT (rain OR hail) THEN temperature ELSE null END) AS avg_non_precipitation_temp

FROM station_data WHERE year > 2000 GROUP BY month



Here's a continuation of my review of Getting Started with SQL, an O'Reilly guide by Thomas Nield, last blogged about on November 6, 2016.

We can use the COUNT command to input in a new field - the 'AS record_count' part - to count the number of records in a table. So this example:

SELECT year, month, COUNT(*) AS record_count FROM station_data

WHERE tornado = 1

GROUP BY year, month

ORDER BY year, month

So here we begin by selecting fields, counting the total number of records in a table, specifying a value in one column, and then end by grouping records and specifying a sort order from them.

Replacing the asterisk in the parentheses after 'COUNT' with a field name lets the record count only count the non-null values in that field.

SQL Aggregate functions include COUNT(); SUM(); AVG(); MAX(); and MIN() . So in this example we're creating a new field 'total_snow' with the sum of the snow_depth values for individual years.

WHERE cannot be used to filter on aggregated fields. You need to use HAVING instead.

Also note that DISTINCT can be used to get a list of unique records in a field. So if in the SQL Editor you input:

SELECT DISTINCT month FROM station_data

ORDER by month

You'll just a get a list of numbers 1 to 12.


Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco.   He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.

If you have a question or comment about this blog, please make a submission using the form to the right. 

Your details were sent successfully!

© 2015 by Sean O'Shea . Proudly created with Wix.com

bottom of page