In database management an aggregate function is a function where the values of multiple rows are grouped together to form a single value of more significant meaning or measurement such as a set, a bag or a list. Wikipedia
count()
returns total number of rows that match our searchsum()
returns added sum of values for a group of rowsavg()
returns calculated average value for a group of rowsmax()
returns largest value in a group of rowsmin()
returns smallest value in a group of rowssum()
, average()
, max()
, min()
only work if the values are numbers (numeric values)count()
doesn’t count null valuescount( * )
will count all the rows, regardless of the null values 1SELECT count( * ) FROM movies; # count all rows in movies database
2
3SELECT count( title ) FROM movies; # count all rows in title column, which are not `null`
4
5SELECT sum( cost ) FROM movies; # product cost for all movies
6
7SELECT avg( tickets ) FROM movies; # average ticket cost
8
9SELECT max( tickets ) FROM movies; # most popular movie, by determining the max amount of tickets sold
10
11SELECT min( tickets ) FROM movies; # least popular movie, by determining the max amount of tickets sold
12
13SELECT max( tickets ), min( tickets ) FROM movies; # multiple functions = multiple values
GROUP BY
groups results by column values1# SELECT ___, aggregateFunction( ___ ) FROM ___ GROUP BY ___
2# SELECT columnName, aggregateFunction( columnName ) FROM tableName GROUP BY columnName
3
4SELECT genre, sum(cost)
5FROM movies
6GROUP BY genre # returns lists of all genres & sum total of costs for each of them
HAVING
is the keyword for inclusion, show results only if they have met the conditionHAVNIG
works in conjunction with an aggregate function 1# SELECT ___, aggregateFunction( ___ )
2# FROM ___
3# GROUP BY ___
4# HAVING aggregate_functino(___) operator value;
5
6# SELECT columnName, aggregateFunction( columnName ) FROM tableName GROUP BY columnName HAVING aggregate_functino(columnName) operator value;
7
8SELECT genre, sum(cost)
9FROM movies
10GROUP BY genre
11HAVING count(*) > 1 # returns lists of all genres (grouped) & sum total of costs for each of them, only if there are more than one row
WHERE
clause before GROUP BY
to further refine your queryConstraints are meant to avoid addition of bad or unwanted data. They provide additional validation
NOT NULL
prevents NULL valuesUNIQUE
Ensures column values are unique (prevents duplicates)PRIMARY KEY
unique identifiers (both UNIQUE and NOT NULL). Can only be assigned once per table (i.e. you can not have more than one primary key column).PRIMARY KEY
can only be defined on only one column (once per table) while UNIQUE
and NOT NULL
can be used on multiple columns 1CREATE TABLE Promotions
2(
3id int PRIMARY KEY, # name must be NOT NULL, must be UNIQUE
4name varchar(150) NOT NULL, # name can not be null
5category varchar(15)
6);
7
8
9CREATE TABLE Promotions
10(
11id int PRIMARY KEY,
12name varchar(150) NOT NULL UNIQUE, # name can't be null, or the same as another value already in column
13category varchar(15)
14);
ERROR: duplicate key value violates unique constraint "promotions_name_key"
DETAIL: Key (name)=(Matinee) already exists.
You can define your constraints (i.e. assign it a name). Makes it easy to edit them later. Also, you get the constraint name in the error message instead of the key value
1CREATE TABLE Promotions
2(
3id int PRIMARY KEY,
4name varchar(150) NOT NULL,
5category varchar(15)
6
7CONSTRAINT unique_name UNIQUE (name) # unique_name is what we assigned for our custom constraint name
8);
9
10CREATE TABLE Promotions
11(
12id int PRIMARY KEY,
13name varchar(150) NOT NULL,
14category varchar(15)
15
16CONSTRAINT unique_name UNIQUE (name, category) # constraint on multiple columns
17);
ERROR: duplicate key value violates unique constraint "unique_name"
DETAIL: Key (name)=(Matinee) already exists.
With value constraints you create references between two tables.
Naming convention for these is to singularize the table you are referencing, followed by an underscore and the column name (e.g. country_id
, movie_id
, customer_id
)
movie_id
column is a foreign key referencing the id
primary key column in the Movies
table1# find out movie id
2SELECT id
3FROM movies
4WHERE title = "Gone With The Wind"; # assume it returns 2
5
6# use that movie id to find out promotions
7SELECT name, category
8FROM Promotions
9WHERE movie_id = 2;
REFERENCES
constraint1movie_id int REFERENCES movies(id)
1# table being referenced must be created first (should be existing)
2CREATE TABLE Movies
3(
4 id int PRIMARY KEY,
5 title varchar(120) NOT NULL UNIQUE
6);
7
8
9CREATE TABLE Promotions
10(
11 id int PRIMARY KEY,
12 movie_id int REFERENCES movies(id),
13 name varchar(50),
14 category varchar(15)
15);
16
17
18CREATE TABLE Promotions # same as above
19(
20 id int PRIMARY KEY,
21 movie_id int REFERENCES movies, # primary key column is used by default
22 name varchar(50),
23 category varchar(15)
24);
25
26
27CREATE TABLE Promotions # same as above, different syntax
28(
29 id int PRIMARY KEY,
30 movie_id int,
31 name varchar(50),
32 category varchar(15),
33 FOREIGN KEY (movie_id) REFERENCES movies
34);
Once you have a foreign key constraint in place, it’ll give you an error if you try to reference a foreign key value that doesn’t exist
1INSERT INTO Promotions (id, movie_id, name, category)
2VALUES (4, 999, 'Fake Promotion, 'Hoax');
ERROR: insert or update on table "promotions" violates foreign key constraint "promotions_movie_id fkey"
DETAIL: Key (movie_id)=(999) is not present in table "movies".
(Child) records with a foreign key reference to a (parent) record that has been deleted
movie_id
11 in your Promotions table, but the movie has been deleted.1DELETE FROM Movies WHERE id = 6;
ERROR: update or delete on table "movies" violates foreign key constraint "promotions_movie_id_fkey" on table "promotions"
DETAIL: Key (id)=(6) is still being referenced from table "promotions".
1DROP TABLE Movies;
ERROR: can not drop table movies because other objects depend on it
DETAIL: constraint promotions_movie_id_fkey on table promotions depends on table movies
If you delete both the Movie and Promotions, you’ll get no errors. Basically, you’ll delete all referenced/associated records first and then delete the thing you wanted to delete
1DELETE FROM Promotions WHERE movie_id = 6;
2DELETE FROM Movies WHERE id = 6;
Same goes for dropping tables. First, drop any tables that are making references to the table you want to delete, and then delete the actual table
1DROP TABLE Promotions;
2DROP TABLE Movies;
For example, let’s say you want to make sure people are not entering negative values
1CREATE TABLE Movies
2(
3 id int PRIMARY KEY,
4 title varchar(120) NOT NULL UNIQUE,
5 genre varchar(120),
6 duration int CHECK (duration > 0) # validate input for this column
7);
1INSERT INTO Movies (id, title, genre, duration)
2VALUES (7, 'Intouchables', 'Drama', -10);
ERROR: new row for relation "movies" violates check constraint "movies_duration_check"
DETAIL: Failing row contains (7, 'Intouchables', 'Drama', -10 )