SQL Fundamentals #
What SQL Is (and Isn’t) #
SQL (Structured Query Language) is the standard language for interacting with relational databases. Virtually every relational database — PostgreSQL, MySQL, SQLite, SQL Server, Oracle — speaks SQL.
SQL is a declarative language. You describe what result you want, not how to compute it. When you write a query, the database engine decides the best execution plan — which indexes to use, in what order to scan tables, how to join data. You focus on the logic; the engine handles the mechanics.
SQL is not a general-purpose programming language. You can’t write a web server or a command-line tool in SQL. It’s purpose-built for four categories of operations:
- DDL (Data Definition Language): Creating and modifying tables and schemas —
CREATE,ALTER,DROP. - DML (Data Manipulation Language): Reading and modifying data —
SELECT,INSERT,UPDATE,DELETE. - DCL (Data Control Language): Managing permissions —
GRANT,REVOKE. - TCL (Transaction Control Language): Managing transactions —
BEGIN,COMMIT,ROLLBACK.
This module focuses on DDL and DML — defining tables and working with data.
Creating Tables #
Before you can store data, you need to define the structure. Here’s how you’d create the tables from the previous module:
CREATE TABLE customers (
customer_id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
email TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
order_id INTEGER PRIMARY KEY,
customer_id INTEGER NOT NULL,
product TEXT NOT NULL,
quantity INTEGER NOT NULL CHECK (quantity > 0),
price REAL NOT NULL CHECK (price > 0),
order_date DATE NOT NULL DEFAULT CURRENT_DATE,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
Notice how constraints are declared inline with the table definition. The database will enforce these rules for every row, regardless of which application is writing the data.
Inserting Data #
INSERT INTO customers (customer_id, name, email)
VALUES (1, 'Alice', 'alice@example.com');
INSERT INTO customers (customer_id, name, email)
VALUES (2, 'Bob', 'bob@example.com');
INSERT INTO customers (customer_id, name, email)
VALUES (3, 'Charlie', 'charlie@example.com');
You can also insert multiple rows in a single statement:
INSERT INTO orders (order_id, customer_id, product, quantity, price, order_date)
VALUES
(101, 1, 'Widget', 3, 9.99, '2025-03-15'),
(102, 2, 'Gadget', 1, 24.99, '2025-03-16'),
(103, 1, 'Gizmo', 2, 14.99, '2025-03-17'),
(104, 3, 'Widget', 5, 9.99, '2025-03-18'),
(105, 1, 'Gadget', 1, 24.99, '2025-03-19');
If you try to insert an order with a customer_id that doesn’t exist in the customers table, the database will reject it — that’s the foreign key constraint at work.
Querying Data with SELECT #
The SELECT statement is the workhorse of SQL. At its simplest:
SELECT * FROM customers;
This returns every column and every row from the customers table. In practice, you’ll almost always want to be more specific.
Selecting Specific Columns #
SELECT name, email FROM customers;
Only returns the columns you ask for. This matters for performance — especially with wide tables that have many columns.
Filtering with WHERE #
SELECT * FROM orders
WHERE price > 10;
Returns only the orders where the price exceeds 10. The WHERE clause supports the comparison operators you’d expect: =, != (or <>), <, >, <=, >=.
You can combine conditions with AND and OR:
SELECT * FROM orders
WHERE price > 10 AND quantity >= 2;
Other useful WHERE operators:
-- Pattern matching
SELECT * FROM customers WHERE name LIKE 'A%';
-- Checking for NULL
SELECT * FROM orders WHERE order_date IS NOT NULL;
-- Membership
SELECT * FROM orders WHERE product IN ('Widget', 'Gadget');
-- Range
SELECT * FROM orders WHERE price BETWEEN 10 AND 30;
Sorting with ORDER BY #
SELECT * FROM orders
ORDER BY order_date DESC;
Results are returned in descending date order. Use ASC (the default) for ascending.
You can sort by multiple columns:
SELECT * FROM orders
ORDER BY customer_id ASC, order_date DESC;
Limiting Results #
SELECT * FROM orders
ORDER BY price DESC
LIMIT 3;
Returns only the three most expensive orders. LIMIT is essential for pagination and for quickly inspecting large tables.
Updating Data #
UPDATE customers
SET email = 'alice.new@example.com'
WHERE customer_id = 1;
The WHERE clause is critical. Without it, the UPDATE affects every row in the table — a common and painful mistake.
You can update multiple columns at once:
UPDATE orders
SET quantity = 4, price = 8.99
WHERE order_id = 101;
Deleting Data #
DELETE FROM orders
WHERE order_id = 102;
Again, the WHERE clause determines what gets deleted. DELETE FROM orders without a WHERE clause deletes every row in the table.
Joins: Combining Data from Multiple Tables #
The real power of relational databases emerges when you combine data from related tables. This is what joins do.
INNER JOIN #
Returns rows where there’s a match in both tables:
SELECT customers.name, orders.product, orders.quantity, orders.price
FROM orders
INNER JOIN customers ON orders.customer_id = customers.customer_id;
Result:
| name | product | quantity | price |
|---|---|---|---|
| Alice | Widget | 3 | 9.99 |
| Bob | Gadget | 1 | 24.99 |
| Alice | Gizmo | 2 | 14.99 |
| Charlie | Widget | 5 | 9.99 |
| Alice | Gadget | 1 | 24.99 |
The join replaced customer_id with the actual customer name. No data was duplicated in storage — the relationship was reconstructed at query time.
LEFT JOIN #
Returns all rows from the left table, even if there’s no match in the right table:
SELECT customers.name, orders.product
FROM customers
LEFT JOIN orders ON customers.customer_id = orders.customer_id;
If a customer has no orders, they still appear in the result, with NULL for the order columns. This is useful for finding “orphaned” records — customers who haven’t ordered, products that haven’t been sold, etc.
Table Aliases #
For readability, you can give tables shorter aliases:
SELECT c.name, o.product, o.quantity
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE o.price > 10;
Aggregation: Summarizing Data #
Aggregate functions compute a single value from a set of rows.
Common Aggregate Functions #
-- Count all orders
SELECT COUNT(*) FROM orders;
-- Total revenue
SELECT SUM(price * quantity) AS total_revenue FROM orders;
-- Average order value
SELECT AVG(price * quantity) AS avg_order_value FROM orders;
-- Most expensive single item
SELECT MAX(price) FROM orders;
-- Least expensive single item
SELECT MIN(price) FROM orders;
GROUP BY #
Aggregation becomes powerful when combined with GROUP BY, which partitions rows into groups:
SELECT c.name, COUNT(*) AS order_count, SUM(o.price * o.quantity) AS total_spent
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
GROUP BY c.name;
Result:
| name | order_count | total_spent |
|---|---|---|
| Alice | 3 | 84.94 |
| Bob | 1 | 24.99 |
| Charlie | 1 | 49.95 |
HAVING: Filtering Groups #
WHERE filters individual rows before grouping. HAVING filters groups after aggregation:
SELECT c.name, SUM(o.price * o.quantity) AS total_spent
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
GROUP BY c.name
HAVING total_spent > 30;
This returns only customers who have spent more than $30 total.
Altering Tables #
Schemas evolve. You can modify existing tables without losing data:
-- Add a column
ALTER TABLE customers ADD COLUMN phone TEXT;
-- Remove a column
ALTER TABLE customers DROP COLUMN phone;
-- Rename a table
ALTER TABLE orders RENAME TO customer_orders;
The specifics vary between database systems — PostgreSQL, MySQL, and SQLite each support slightly different ALTER TABLE capabilities.
The Order of a SQL Query #
A common source of confusion is the order in which SQL clauses are processed. You write them in one order, but the database evaluates them in another:
| Evaluation order | Clause | Purpose |
|---|---|---|
| 1 | FROM |
Which tables to read |
| 2 | JOIN |
How to combine tables |
| 3 | WHERE |
Filter individual rows |
| 4 | GROUP BY |
Partition into groups |
| 5 | HAVING |
Filter groups |
| 6 | SELECT |
Choose which columns to return |
| 7 | ORDER BY |
Sort the result |
| 8 | LIMIT |
Restrict how many rows to return |
This explains, for example, why you can’t use a column alias from SELECT in a WHERE clause — WHERE is evaluated before SELECT.
Understanding this evaluation order makes debugging complex queries much easier.