If more than one table is included in the query, and the tables share a column with the same name, a column name must be preceded by the relevant table name, with a full stop in between.
Functions and operators may be used to produce results that are calculated from the column. The set of functions that is provided varies widely between DBMS, but the normal mathematical operators for addition, subtraction, multiplication, and division, plus a set of basic aggregation functions for maximum value (MAX), minimum value (MIN), summation (SUM), and arithmetic mean (AVG), should always be available.
A column name can also be a constant value (number or text), in which case the value is replicated for every row of the result.
The FROM clause must contain at least one table and all columns that are referred to in the query must exist in at least one of the tables in the FROM clause.
If a single table is specified, then the result is all rows of that table, subject to any filtering applied by a WHERE clause.
A table name may be followed by a table alias, which can be used in place of the original table name anywhere else in the query.
If two tables are specified, separated only by a comma, the result is all possible combinations of the rows of the two tables (a Cartesian product). This is known as a cross join.
An inner join is created from a cross join by specifying a condition so that only rows that have matching values are returned (typically using a foreign key to match with a primary key). The condition may be specified within the WHERE clause (see Section 8.2.3), or as part of an INNER JOIN syntax as shown below.
An outer join extends the inner join by including in the result rows from one table that have no match in the other table. There are left outer joins (where rows are retained from the table named on the left of the join syntax), right outer joins, and full outer joins (where non-matching rows from both tables are retained).
A self join is a join of a table with itself. This requires the use of table aliases.
By default, all rows from a table, or from a combination of tables, are returned. However, if the WHERE clause is used to specify a condition, then only rows matching that condition will be returned.
Conditions may involve any column from any table that is included in the query. Conditions usually involve a comparison between a column and a constant value, or a comparison between two columns.
A constant text value should be enclosed in single-quotes.
Valid comparison operators include: equality (=), greater-than or less-than (>, <), greater-than or equal-to or less-than or equal-to (>=, <=), and inequality (!= or <>).
Complex conditions can be constructed by combining simple conditions with logical operators: AND, OR, and NOT. Parentheses should be used to make the order of evaluation explicit.
For the case where a column can match several possible values, the special IN keyword can be used to specify a range of valid values.
Comparison with text constants can be generalized to allow patterns using the special LIKE comparison operator. In this case, within the text constant, the underscore character, _, has a special meaning; it will match any single character. The percent character, %, is also special and it will match any number of characters of any sort.
SELECT ... FROM ... WHERE stroke LIKE '%stroke';
The order of the columns in the results of a query is based on the order of the column names in the query.
The order of the rows in a result is undetermined unless an ORDER BY clause is included in the query.
The ORDER BY clause consists of one or more column names. The rows are ordered according to the values in the named columns. The keyword ASC is used to indicate ascending order and DESC is used for descending order.
The results can be ordered by the values in several columns simply by specifying several column names, separated by commas. The results are ordered by the values in the first column, but if several rows in the first column have the same value, those rows are ordered by the values in the second column.
The aggregation functions MAX, MIN, SUM, and AVG (see Section 8.2.1) all return a single value from a column. If a GROUP BY clause is included in the query, aggregated values are reported for each unique value of the column specified in the GROUP BY clause.
Results can be reported for combinations of unique values of several columns simply by naming several columns in the GROUP BY clause.
The GROUP BY clause can include a HAVING sub-clause that works like the WHERE clause but operates on the rows of aggregated results rather than the original rows.
The result of an SQL query may be used as part of a larger query. The subquery is placed within parentheses but otherwise follows the same syntax as a normal query.
Subqueries can be used in place of table names within the FROM clause and to provide comparison values within a WHERE clause.
Paul Murrell
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 New Zealand License.