SQL (Structured Query Language) is a widely used programming language for storing, managing, and retrieving data in databases. It is especially valuable for analyzing large datasets that cannot be stored on a single machine, a.k.a. big data. One of the key advantages of using SQL for big data analysis is that it does not require data to be directly imported, which in turn keeps the processes of data analysis and data maintenance separate. This separation enhances the efficiency and quality of both processes, ensuring that the insights derived from big data are accurate and actionable.
There are many dialects of SQL, including SQL Server, PostgreSQL, Oracle, MySQL, and SQLite. In this walkthrough of SQL fundamentals, we’ll use SQLite to work with a database stored on our computer. Using SQLite will allow us to execute SQL queries within the RMarkdown environment and having the data local will simplify our connection to it. The database we’ll use is from the company data.world.
To follow along in RStudio, please download the sql-fundamentals-walkthrough repository to your local computer and place the sqlite files in the same folder as this RMarkdown file.
To begin, follow the code chunks below, using the accompanying markdown snippets for insight and guidance. If you see something that can be improved, please do reach out!
library(DBI)# To connect to database management systems
library(RSQLite) # To use SQLite in RMarkdown
library(RMariaDB) # To use MySQL (if needed)
library(dplyr) # Part of tidyverse, has commands for reading data
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Now, let’s set up a connection to the animals database using the dbConnect function. Note that we are not importing data. Instead, we create a connection or pathway that allows us to access it.
db <- dbConnect(
SQLite(), # Dialect of SQL we will be using
dbname = "animals.sqlite" # Name of database and where to find it
)
db
## <SQLiteConnection>
## Path: C:\Users\chira\Documents\Projects\animals.sqlite
## Extensions: TRUE
Finally, before diving into the essential keywords of SQL, let’s examine the tables in this database.
dbListTables(db)
## [1] "austin_animal_center_age_at_outcome"
## [2] "austin_animal_center_intakes"
## [3] "austin_animal_center_intakes_by_month"
## [4] "austin_animal_center_outcomes"
## [5] "austin_animal_center_outcomes_for_animal_type_and_subtype"
## [6] "sqlite_stat1"
## [7] "sqlite_stat4"
Most SQL queries start with the SELECT statement. As the name suggests, SELECT is used to retrieve data from tables in a database.
Here, SELECT is used to retrieve the name and intake_type columns from the table, austin_animal_center_intakes:
SELECT name, intake_type
FROM austin_animal_center_intakes
name | intake_type |
---|---|
Scamp | Stray |
Scamp | Public Assist |
Bri-Bri | Stray |
Tyson | Public Assist |
Jo Jo | Public Assist |
Oso | Owner Surrender |
Oso | Public Assist |
*Dottie | Stray |
Manolo | Owner Surrender |
Manolo | Owner Surrender |
Note: Keywords are case-insensitive, but convention is to capitalize them. It’s also convention to use multiple lines for different parts of the query. This makes the queries more readable.
“*” is a wildcard used to select all columns:
SELECT * FROM austin_animal_center_intakes
animal_id | name | datetime | monthyear | found_location | intake_type | intake_condition | animal_type | sex_upon_intake | age_upon_intake | breed | color |
---|---|---|---|---|---|---|---|---|---|---|---|
A006100 | Scamp | 1512655620 | POINT(17.0 12.0) | Colony Creek And Hunters Trace in Austin (TX) | Stray | Normal | Dog | Neutered Male | 10 years | Spinone Italiano Mix | Yellow/White |
A006100 | Scamp | 1418984460 | POINT(14.0 12.0) | 8700 Research Blvd in Austin (TX) | Public Assist | Normal | Dog | Neutered Male | 7 years | Spinone Italiano Mix | Yellow/White |
A191351 | Bri-Bri | 1447430220 | POINT(15.0 11.0) | 1912 E William Cannon Rd in Austin (TX) | Stray | Normal | Cat | Intact Female | 16 years | Domestic Longhair Mix | Black/White |
A322813 | Tyson | 1425566940 | POINT(15.0 3.0) | Austin (TX) | Public Assist | Normal | Dog | Neutered Male | 11 years | Rottweiler Mix | Black/Brown |
A553074 | Jo Jo | 1459037040 | POINT(16.0 3.0) | 3614 Bill Price in Travis (TX) | Public Assist | Normal | Dog | Spayed Female | 7 years | Labrador Retriever/German Shepherd | Brown/Black |
A672744 | Oso | 1392657000 | POINT(14.0 2.0) | Travis (TX) | Owner Surrender | Normal | Dog | Intact Male | 3 years | Rottweiler Mix | Black/Tan |
A672744 | Oso | 1392657000 | POINT(14.0 2.0) | Austin (TX) | Public Assist | Normal | Dog | Intact Male | 3 years | Rottweiler Mix | Black/Tan |
A672745 | *Dottie | 1392548460 | POINT(14.0 2.0) | Mlk And Fm 969 in Austin (TX) | Stray | Normal | Dog | Spayed Female | 2 years | American Staffordshire Terrier Mix | White/Red |
A672746 | Manolo | 1400679420 | POINT(14.0 5.0) | Austin (TX) | Owner Surrender | Normal | Dog | Neutered Male | 9 years | Jack Russell Terrier Mix | White/Black |
A672746 | Manolo | 1392549660 | POINT(14.0 2.0) | Travis (TX) | Owner Surrender | Normal | Dog | Neutered Male | 9 years | Jack Russell Terrier Mix | White/Black |
To see how many animals, count animal_id:
SELECT COUNT(animal_id)
FROM austin_animal_center_intakes
COUNT(animal_id) |
---|
75947 |
Suppose we want to collect together all the different types of animals. Add keyword DISTINCT to SELECT:
SELECT DISTINCT animal_type
FROM austin_animal_center_intakes
animal_type |
---|
Dog |
Cat |
Other |
Bird |
Livestock |
DISTINCT applies to all columns listed in SELECT:
SELECT DISTINCT animal_type,
sex_upon_intake,
age_upon_intake
FROM austin_animal_center_intakes
animal_type | sex_upon_intake | age_upon_intake |
---|---|---|
Dog | Neutered Male | 10 years |
Dog | Neutered Male | 7 years |
Cat | Intact Female | 16 years |
Dog | Neutered Male | 11 years |
Dog | Spayed Female | 7 years |
Dog | Intact Male | 3 years |
Dog | Spayed Female | 2 years |
Dog | Neutered Male | 9 years |
Dog | Spayed Female | 1 year |
Other | Unknown | 3 years |
Note: The output of this query has 539 rows, meaning there are 539 combinations of these three values in the data set.
In SQL, we use WHERE to filter data.
To find all cats in the monthly intakes data set:
SELECT year, month, count, animal_type
FROM austin_animal_center_intakes_by_month
WHERE animal_type == "Cat"
year | month | count | animal_type |
---|---|---|---|
2013 | 10 | 542 | Cat |
2013 | 11 | 436 | Cat |
2013 | 12 | 331 | Cat |
2014 | 1 | 335 | Cat |
2014 | 2 | 269 | Cat |
2014 | 3 | 353 | Cat |
2014 | 4 | 566 | Cat |
2014 | 5 | 901 | Cat |
2014 | 6 | 821 | Cat |
2014 | 7 | 881 | Cat |
Note: Logical operator is =, not ==
In SQL, we use ORDER BY to sort data in ascending or descending order.
To order by year with ties broken by month:
SELECT year, month, count, animal_type
FROM austin_animal_center_intakes_by_month
WHERE animal_type == "Cat"
ORDER BY year, month
year | month | count | animal_type |
---|---|---|---|
2013 | 10 | 542 | Cat |
2013 | 11 | 436 | Cat |
2013 | 12 | 331 | Cat |
2014 | 1 | 335 | Cat |
2014 | 2 | 269 | Cat |
2014 | 3 | 353 | Cat |
2014 | 4 | 566 | Cat |
2014 | 5 | 901 | Cat |
2014 | 6 | 821 | Cat |
2014 | 7 | 881 | Cat |
Note: WHERE must come before ORDER BY.
To put more recent data first, use DESC. DESC goes after the variable name:
SELECT year, month, count, animal_type
FROM austin_animal_center_intakes_by_month
WHERE animal_type == "Cat"
ORDER BY year DESC, month DESC
year | month | count | animal_type |
---|---|---|---|
2017 | 12 | 100 | Cat |
2017 | 11 | 427 | Cat |
2017 | 10 | 513 | Cat |
2017 | 9 | 656 | Cat |
2017 | 8 | 565 | Cat |
2017 | 7 | 669 | Cat |
2017 | 6 | 895 | Cat |
2017 | 5 | 914 | Cat |
2017 | 4 | 565 | Cat |
2017 | 3 | 353 | Cat |
To order by count in descending order:
SELECT year, month, count, animal_type
FROM austin_animal_center_intakes_by_month
WHERE animal_type == "Cat"
ORDER BY count DESC
year | month | count | animal_type |
---|---|---|---|
2015 | 6 | 1103 | Cat |
2015 | 5 | 1009 | Cat |
2016 | 5 | 921 | Cat |
2017 | 5 | 914 | Cat |
2014 | 5 | 901 | Cat |
2017 | 6 | 895 | Cat |
2014 | 7 | 881 | Cat |
2014 | 6 | 821 | Cat |
2015 | 8 | 812 | Cat |
2015 | 10 | 784 | Cat |
By default, ORDER BY puts NULL values at end; to put them first, add NULLS FIRST at the end of the line.
ORDER BY [colname] NULLS FIRST
Logical operators:
Cats and dogs where intake condition was not normal:
SELECT animal_type, intake_type,
Intake_condition, age_upon_intake
FROM austin_animal_center_intakes
WHERE (animal_type = "Cat" OR animal_type = "Dog") AND (NOT intake_condition = "Normal")
animal_type | intake_type | intake_condition | age_upon_intake |
---|---|---|---|
Cat | Owner Surrender | Sick | 15 years |
Dog | Stray | Injured | 3 years |
Cat | Stray | Injured | 1 year |
Dog | Stray | Injured | 5 months |
Dog | Stray | Injured | 5 years |
Cat | Stray | Injured | 1 year |
Cat | Owner Surrender | Sick | 11 years |
Dog | Public Assist | Other | 3 years |
Dog | Stray | Injured | 1 year |
Dog | Stray | Injured | 2 years |
To check if data value is NULL, use IS NULL:
SELECT * FROM austin_animal_center_intakes
WHERE animal_type IS NULL
animal_id | name | datetime | monthyear | found_location | intake_type | intake_condition | animal_type | sex_upon_intake | age_upon_intake | breed | color |
---|
For numerical data, we can use BETWEEN:
SELECT *
FROM austin_animal_center_intakes_by_month
WHERE count BETWEEN 900 and 2000
year | month | animal_type | count |
---|---|---|---|
2013 | 10 | Dog | 965 |
2014 | 5 | Cat | 901 |
2014 | 5 | Dog | 966 |
2014 | 7 | Dog | 926 |
2014 | 9 | Dog | 917 |
2015 | 5 | Cat | 1009 |
2015 | 5 | Dog | 988 |
2015 | 6 | Cat | 1103 |
2015 | 6 | Dog | 1014 |
2016 | 5 | Cat | 921 |
To rename variables, use AS keyword:
SELECT name AS Name, intake_type AS 'Intake Type'
FROM austin_animal_center_intakes
Name | Intake Type |
---|---|
Scamp | Stray |
Scamp | Public Assist |
Bri-Bri | Stray |
Tyson | Public Assist |
Jo Jo | Public Assist |
Oso | Owner Surrender |
Oso | Public Assist |
*Dottie | Stray |
Manolo | Owner Surrender |
Manolo | Owner Surrender |
To mutate, use AS within select:
SELECT monthyear,
animal_type,
outcome_type,
(age_in_days/365) AS 'Years Old'
FROM austin_animal_center_age_at_outcome
monthyear | animal_type | outcome_type | Years Old |
---|---|---|---|
2014-03 | Dog | Return to Owner | 6.668493 |
2014-12 | Dog | Return to Owner | 7.454795 |
2015-11 | Cat | Return to Owner | 16.252055 |
2015-03 | Dog | Return to Owner | 11.972603 |
2015-04 | Dog | Return to Owner | 7.638356 |
2014-09 | Dog | Return to Owner | 2.668493 |
2014-01 | Dog | Euthanasia | 2.002740 |
2014-01 | Cat | Euthanasia | 15.013699 |
2014-01 | Dog | Return to Owner | 3.005479 |
2014-01 | Dog | Return to Owner | 2.013699 |
The LIKE and NOT LIKE operators are used in a WHERE clause to search for a specified pattern in a column. They are often used with wildcard characters to substitute for one or more characters in a pattern string:
Examples:
Suppose we want to find all data where the breed includes “wolfhound”:
SELECT sex_upon_outcome,
outcome_type,
outcome_subtype,
breed
FROM austin_animal_center_outcomes
WHERE animal_type = "Dog"
AND breed LIKE "%wolfhound%"
ORDER BY monthyear
sex_upon_outcome | outcome_type | outcome_subtype | breed |
---|---|---|---|
Neutered Male | Transfer | Partner | Irish Terrier/Irish Wolfhound |
Spayed Female | Adoption | NA | Irish Wolfhound Mix |
Neutered Male | Transfer | Partner | Irish Wolfhound/Great Pyrenees |
Neutered Male | Adoption | NA | Irish Wolfhound Mix |
Neutered Male | Adoption | NA | Catahoula/Irish Wolfhound |
Intact Female | Return to Owner | NA | Irish Wolfhound/Great Dane |
Neutered Male | Transfer | Partner | Irish Wolfhound/Australian Shepherd |
Neutered Male | Return to Owner | NA | Irish Wolfhound Mix |
Intact Male | Return to Owner | NA | Irish Wolfhound Mix |
Intact Female | Transfer | Partner | Irish Wolfhound Mix |
You can select different parts of the data to look at.
SELECT found_location, intake_type
FROM austin_animal_center_intakes
LIMIT 100
OFFSET 20000
found_location | intake_type |
---|---|
Barton Creek Square in Austin (TX) | Stray |
8411 Ganttcrest Dr in Austin (TX) | Stray |
1802 Cherry Orchid in Austin (TX) | Stray |
Austin (TX) | Owner Surrender |
Austin (TX) | Owner Surrender |
Mckinney Rd & William Cannon in Austin (TX) | Stray |
Jubilee Tr in Austin (TX) | Stray |
4537 S Ih 35 in Austin (TX) | Public Assist |
Del Valle (TX) | Owner Surrender |
4537 S Ih 35 in Austin (TX) | Public Assist |
When done, it’s good practice to disconnect from the database:
dbDisconnect(db)
In part 2 of this walkthrough, we’ll be using the sales.sqlite database. Let’s set up a connection to it:
db_sales <- dbConnect(
SQLite(), # Dialect of SQL we will be using
dbname = "sales.sqlite" # Name of database and where to find it
)
db_sales
## <SQLiteConnection>
## Path: C:\Users\chira\Documents\Projects\sales.sqlite
## Extensions: TRUE
dbListTables(db_sales)
## [1] "accounts" "employees" "intl_accounts" "sales_pipeline"
## [5] "sales_pipline" "sales_teams" "sqlite_stat1" "sqlite_stat4"
Method 1: Use WHERE
SELECT sales_pipeline.sales_agent,
sales_teams.manager,
sales_pipeline.account
FROM sales_teams, sales_pipeline
WHERE (sales_pipeline.sales_agent = sales_teams.sales_agent)
Sales_Agent | manager | Account |
---|---|---|
Moses Frase | Dustin Brinkmann | Cancity |
Darcel Schlecht | Melvin Marxen | Isdom |
Darcel Schlecht | Melvin Marxen | Cancity |
Moses Frase | Dustin Brinkmann | Codehow |
Zane Levy | Summer Sewald | Hatfan |
Anna Snelling | Dustin Brinkmann | Ron-tech |
Vicki Laflamme | Celia Rouche | J-Texon |
Markita Hansen | Celia Rouche | Cheers |
Niesha Huffines | Melvin Marxen | Zumgoity |
James Ascencio | Summer Sewald | NA |
What’s going on here?
This is an inner join. It only keeps agents that appear in both tables.
In WHERE, in addition to defining the join, we can also specify filtering conditions:
SELECT sales_pipeline.sales_agent,
sales_teams.manager,
sales_pipeline.account
FROM sales_teams, sales_pipeline
WHERE (sales_pipeline.sales_agent = sales_teams.sales_agent) AND
sales_pipeline.deal_stage = "Won"
Sales_Agent | manager | Account |
---|---|---|
Moses Frase | Dustin Brinkmann | Cancity |
Darcel Schlecht | Melvin Marxen | Isdom |
Darcel Schlecht | Melvin Marxen | Cancity |
Moses Frase | Dustin Brinkmann | Codehow |
Zane Levy | Summer Sewald | Hatfan |
Anna Snelling | Dustin Brinkmann | Ron-tech |
Vicki Laflamme | Celia Rouche | J-Texon |
Markita Hansen | Celia Rouche | Cheers |
Niesha Huffines | Melvin Marxen | Zumgoity |
Anna Snelling | Dustin Brinkmann | Bioholding |
For extra readability, we can relabel variables/columns using AS. We can also rename whole tables:
SELECT pipeline.sales_agent,
teams.manager,
pipeline.account
FROM sales_teams AS teams, sales_pipeline AS pipeline
WHERE (pipeline.sales_agent = teams.sales_agent) AND
pipeline.deal_stage = "Won"
Sales_Agent | manager | Account |
---|---|---|
Moses Frase | Dustin Brinkmann | Cancity |
Darcel Schlecht | Melvin Marxen | Isdom |
Darcel Schlecht | Melvin Marxen | Cancity |
Moses Frase | Dustin Brinkmann | Codehow |
Zane Levy | Summer Sewald | Hatfan |
Anna Snelling | Dustin Brinkmann | Ron-tech |
Vicki Laflamme | Celia Rouche | J-Texon |
Markita Hansen | Celia Rouche | Cheers |
Niesha Huffines | Melvin Marxen | Zumgoity |
Anna Snelling | Dustin Brinkmann | Bioholding |
Note: Even when renaming tables in FROM, we can use those new names in SELECT.
Method 2: Use JOIN keyword
When joining columns that have the same name in both tables, we can use USING:
SELECT pipeline.sales_agent AS Agent,
Teams.manager,
pipeline.account
FROM sales_teams AS teams
JOIN sales_pipeline AS pipeline USING (sales_agent)
WHERE pipeline.deal_stage = "Won"
Agent | manager | Account |
---|---|---|
Moses Frase | Dustin Brinkmann | Cancity |
Darcel Schlecht | Melvin Marxen | Isdom |
Darcel Schlecht | Melvin Marxen | Cancity |
Moses Frase | Dustin Brinkmann | Codehow |
Zane Levy | Summer Sewald | Hatfan |
Anna Snelling | Dustin Brinkmann | Ron-tech |
Vicki Laflamme | Celia Rouche | J-Texon |
Markita Hansen | Celia Rouche | Cheers |
Niesha Huffines | Melvin Marxen | Zumgoity |
Anna Snelling | Dustin Brinkmann | Bioholding |
If the columns we’re joining by have different names, use ON instead of USING:
SELECT pipeline.sales_agent AS Agent,
Teams.manager,
pipeline.account
FROM sales_teams AS teams
JOIN sales_pipeline AS pipeline
ON pipeline.sales_agent = teams.sales_agent
WHERE pipeline.deal_stage = "Won"
Agent | manager | Account |
---|---|---|
Moses Frase | Dustin Brinkmann | Cancity |
Darcel Schlecht | Melvin Marxen | Isdom |
Darcel Schlecht | Melvin Marxen | Cancity |
Moses Frase | Dustin Brinkmann | Codehow |
Zane Levy | Summer Sewald | Hatfan |
Anna Snelling | Dustin Brinkmann | Ron-tech |
Vicki Laflamme | Celia Rouche | J-Texon |
Markita Hansen | Celia Rouche | Cheers |
Niesha Huffines | Melvin Marxen | Zumgoity |
Anna Snelling | Dustin Brinkmann | Bioholding |
Note: This is still an inner join. It only keep agents that appear in both tibbles.
Right joins, left joins, and full joins are all examples of outer joins.
Example of a Left Outer Join:
SELECT sales_teams.sales_agent AS Agent,
sales_pipeline.deal_stage
FROM sales_teams
LEFT OUTER JOIN sales_pipeline
ON sales_teams.sales_agent = sales_pipeline.Sales_Agent
WHERE Agent = "Carol Thompson"
Agent | Deal_Stage |
---|---|
Carol Thompson | NA |
Note: Carol Thompson has no deals in pipeline and, thus, does not appear in the sales_pipeline table. However, she is included when we do the left outer join as above because she’s in sales_teams.
To do a right join:
SELECT sales_teams.sales_agent AS Agent,
sales_pipeline.deal_stage
FROM sales_pipeline
LEFT OUTER JOIN sales_teams
ON sales_teams.sales_agent = sales_pipeline.Sales_Agent
WHERE Agent = "Carol Thompson"
Agent | Deal_Stage |
---|
To do a full join:
These go within the SELECT:
SELECT SUM(Revenue) AS "Total Revenue"
FROM accounts
Total Revenue |
---|
173358 |
To count the number of rows, use COUNT(*):
SELECT COUNT(*) AS "Number of Deals Won"
FROM sales_pipeline
WHERE deal_stage = "Won"
Number of Deals Won |
---|
4238 |
Note: Counting of rows happens after WHERE.
Find average closing value of each deal:
SELECT COUNT(*) AS "Deals Won",
AVG(close_value) AS "Average Closing Value"
FROM sales_pipeline
WHERE deal_stage = "Won"
Deals Won | Average Closing Value |
---|---|
4238 | 2360.909 |
SELECT sales_agent AS "Agent",
COUNT(*) AS "Deals Won",
AVG(close_value) AS "Average Closing Value"
FROM sales_pipeline
WHERE deal_stage = "Won"
GROUP BY sales_agent
Agent | Deals Won | Average Closing Value |
---|---|---|
Anna Snelling | 208 | 1322.385 |
Boris Faz | 101 | 2590.406 |
Cassey Cress | 163 | 2763.736 |
Cecily Lampkin | 107 | 2147.664 |
Corliss Cosme | 150 | 2806.907 |
Daniell Hammack | 114 | 3194.991 |
Darcel Schlecht | 349 | 3304.338 |
Donn Cantrell | 158 | 2821.899 |
Elease Gluck | 80 | 3614.938 |
Garret Kinder | 75 | 2636.973 |
Works with joins:
SELECT sales_teams.manager,
COUNT(*) AS "Number Deals Won",
AVG(sales_pipeline.close_value) AS "Average Closing Value"
FROM sales_teams
JOIN sales_pipeline ON (sales_teams.sales_agent = sales_pipeline.sales_agent)
WHERE sales_pipeline.deal_stage = "Won"
GROUP BY sales_teams.manager
manager | Number Deals Won | Average Closing Value |
---|---|---|
Cara Losch | 480 | 2354.269 |
Celia Rouche | 610 | 2629.339 |
Dustin Brinkmann | 747 | 1465.011 |
Melvin Marxen | 882 | 2553.209 |
Rocco Neubert | 691 | 2837.258 |
Summer Sewald | 828 | 2372.886 |
To filter observations before an aggregation occurs (or, within an aggregation), use FILTER(WHERE(…)) within the SELECT.
To only count deals with a closing value over 1000:
SELECT sales_agent,
COUNT(close_value) AS total,
COUNT(close_value) FILTER(WHERE(close_value > 1000)) AS 'over 1000'
FROM sales_pipeline
WHERE deal_stage = "Won"
GROUP BY sales_agent
Sales_Agent | total | over 1000 |
---|---|---|
Anna Snelling | 208 | 68 |
Boris Faz | 101 | 70 |
Cassey Cress | 163 | 112 |
Cecily Lampkin | 107 | 63 |
Corliss Cosme | 150 | 97 |
Daniell Hammack | 114 | 91 |
Darcel Schlecht | 349 | 272 |
Donn Cantrell | 158 | 106 |
Elease Gluck | 80 | 32 |
Garret Kinder | 75 | 46 |
To filter observations after an aggregation occurs, use HAVING. Let’s find sales agents with large deals:
SELECT sales_agent,
COUNT(close_value) AS total
FROM sales_pipeline
WHERE deal_stage = "Won"
GROUP BY sales_agent
HAVING COUNT(close_value) > 200
Sales_Agent | total |
---|---|
Anna Snelling | 208 |
Darcel Schlecht | 349 |
Kary Hendrixson | 209 |
Vicki Laflamme | 221 |
To find all accounts, including international:
SELECT intl_accounts.account,
intl_accounts.office_location AS location
FROM intl_accounts
account | location |
---|---|
Ganjaflex | Japan |
Genco Pura Olive Oil Company | Italy |
Globex Corporation | Norway |
Betatech | Kenya |
Bioholding | Philipines |
Hottechi | Korea |
Streethex | Belgium |
Sumace | Romania |
Zencorporation | China |
Sunnamplex | Poland |
SELECT accounts.account,
"USA" AS Location
FROM accounts
Account | Location |
---|---|
Acme Corporation | USA |
Betasoloin | USA |
Betatech | USA |
Bioholding | USA |
Bioplex | USA |
Blackzim | USA |
Bluth Company | USA |
Bubba Gump | USA |
Cancity | USA |
Cheers | USA |
To do the union:
SELECT intl_accounts.account,
intl_accounts.office_location AS location
FROM intl_accounts
UNION
SELECT accounts.account,
"USA" AS Location
FROM accounts
account | location |
---|---|
AWOLEX | USA |
Acme Corporation | USA |
Betasoloin | USA |
Betatech | Kenya |
Betatech | USA |
Bioholding | Philipines |
Bioholding | USA |
Bioplex | USA |
Blackzim | USA |
Bluth Company | USA |
Note: To do a full join, do a left join, do a right join (by swapping the order of the tables in your join), and then take their union.
To find agents in sale_teams but not in sales_pipeline:
SELECT sales_agent
FROM sales_teams
EXCEPT
SELECT sales_agent
FROM sales_pipeline
sales_agent |
---|
Carl Lin |
Carol Thompson |
Elizabeth Anderson |
Mei-Mei Johns |
Natalya Ivanova |
Note: This is like a set difference.
Keywords must appear in the following order (not all have to appear):
dbDisconnect(db_sales)