02. Statement Types

SQL databases work by you sending a “statement” to a server, which you hope will return a set of data as well as some metadata. If you’re less lucky, it’ll return an error.

These tend to look like “SELECT * FROM …” or similar.

There are 4 different types of statement:

DDL (Data Definition Language) - statements that will let you alter the structure of a database - for example, creating a new Table/Index or adding a column to an existing table
DML (Data Manipulation Language) - statements that will alter the data inside the database - for example retrieving data or inserting/deleting data
TCL (Transaction Control Language) - statements that are concerned with the initiation/completion/termination of transactions
DCL (Data Control Language) - statements that are to do with configuring user permissions

DDL [CREATE TABLE/ALTER TABLE/CREATE INDEX]

I think going into detail on a lot of these is perhaps a little futile, so I’ll just quickly point out some pitfalls/tips tricks

CREATE TABLE

If you’re using MySQL - make sure you’re using the format utf8_mb4 as your charset for creation - see the MySQL Addendum for more details

CREATE INDEX

If you’re using Postgres, you can run CREATE INDEX concurrently - this will take longer and thrash the disk, but it will avoid locking the table for INSERTs.

DML [INSERT/UPDATE/UPSERT]

INSERT

Always remember you can insert multiple rows at once using the syntax:

INSERT INTO my_table (col1, col2) VALUES ("val1", "val2"), ("val3", "val4");

This plays into rule number one of databases: make as few calls as possible

If you’re using Postgres, you can use RETURNING * at the end of your INSERTs to get the data back you’ve just inserted. This is useful for populating any IDs you’ve just created in a simple fashion.

UPDATE

Again, if you’re using Postgres, you can use RETURNING * at the end of your INSERTs to get the data back you’ve just inserted. This is useful for populating any IDs you’ve just created in a simple fashion.

UPSERT

A common problem you may come across is having some data and wanting to insert it into the database, but being unsure about whether this data already exists or not.

For this, we have the concept of an UPSERT - an insert where we can add additional behaviour to run if it finds a colliding entry.

In Postgres, this looks like this:

INSERT INTO my_table (id, name) VALUES (1, 'Steve') ON CONFLICT (id) DO NOTHING;

This would avoid failing if it finds that the record already exists.

If on collision you’d prefer to update it to the latest value, you can write that like so:

INSERT INTO my_table (id, name) VALUES (1, 'Steve') ON CONFLICT (id) DO UPDATE SET name = excluded.name;

In MySQL this looks more like so:

INSERT IGNORE INTO my_table (id, name) VALUES (1, 'Steve')

REPLACE INTO my_table (id, name) VALUES (1, 'Steve')

TCL - Transactions

Often ignored to the peril of data - but transactions ensure that you can run multiple queries atomically - this is useful if you to add data into multiple tables and need a guarantee data will be inserted into either both or none.

That is a bad summary of transactions - I’d recommended reading into it elsewhere than here as it’s important.

DCL - User Controls

This is useful if you want to control individual user access on your database. This is out of scope of this tutorial as it doesn’t really impact on performance (afaik) and has greatly differing syntaxes between databases.