Gadfly SQL constructs
=====================

:Version: $Revision: 1.3 $

This  document  describes  SQL  constructs supported by Gadfly. The
presentation  does not define the complete syntax -- see sqlgram.py
for  the precise syntax as BNF -- nor the complete semantics -- see
a  good  book on SQL for more detailed coverage of semantic (or use
the  source,  Luke  ;c)  ). Also, please have a look at my evolving
database  course  notes  for  more coverage of SQL. Examples of all
supported  constructs  are also shown in the test suite source file
gftest.py.  This  document is only a very brief guide, primarily of
use  to  those  who already understand something about SQL -- it is
neither  a  tutorial  nor  a  detailed  discussion  of  syntax  and
semantics.

.. contents::


The Standard, with omissions
----------------------------

Gadfly supports a large subset of ODBC 2.0 SQL. The reason ODBC 2.0 is
chosen  is  because it provides a fairly strong set of constructs, but
does  not  include  some  of  the  more  obscure features of other SQL
standards  which would be extremely difficult and complex to implement
correctly (and perhaps, not used very frequently (?)).

Supported  features  include  views,  groupings,  aggregates, subquery
expressions,  quantified  subquery  comparisons,  EXISTS,  IN,  UNION,
EXCEPT,  INTERSECT,  searched mutations and Indices, among others (see
below).

Some important omissions from ODBC 2.0 at this point are

- Nulls.
- Outer joins.
- CHECK conditions.
- Enforced data type constraints.
- Alter table (can't implement until NULLs arrive).
- Date, Time, and Interval data types

It is hoped these will be implemented at some future time.

Less important omissions include

Cursor based updates and deletes
  (justification: if you really need them the db design
  is flawed, and it's possible to use python instead).
LIKE string predicate
  (justification: use Python regexes in python code).
Users and permissions
  (justification: Gadfly is not intended for full multiuser use at this time).

These may or may not be implemented at some future time.


Statements
----------

All  interaction  with SQL databases is mediated by SQL statements, or
statement  sequences.  Statement sequences are statements separated by
semicolons. SQL keywords and user defined names are not case sensitive
(but string values are, of course).

SQL statements include the following.

Select Statement
~~~~~~~~~~~~~~~~

The select statement derives a table from tables in the database. It's
general form is::

    sub_query
      optorder_by

Where sub_query is given by::

    SELECT alldistinct select_list
    FROM table_reference_list
    optwhere
    optgroup
    opthaving
    optunion

Read the statement::

    SELECT [DISTINCT|ALL] expressions or *
    FROM tables
    [WHERE condition]
    [GROUP BY group-expressions]
    [HAVING aggregate-condition]
    [union-clause]
    [ORDER BY columns]

as follows:

1) Make all combinations of rows from the tables (FROM line)
2) Eliminate those combinations not satisfying condition (WHERE line)
3) (if GROUP present) form aggregate groups that match on group-expressions
4) (if HAVING present) eliminate aggregate groups that don't satisfy
   the aggregate-condition.
5) compute the columns to keep (SELECT line).
6) (if union-clause present) combine (union, except, intersect)
   the result with the result of another select statement.
7) if DISTINCT, throw out redundant entries.
8) (if ORDER present) order the result by the columns (ascending
   or descending as specified, with precedence as listed).

This  reading has little to do with the actual implementation, but the
answer produced should match this intuitive reading.

Create and drop table
~~~~~~~~~~~~~~~~~~~~~

The  create  and  drop table constructs initialize and destroy a table
structure, respectively::

    CREATE TABLE user_defined_name ( colelts )

    DROP TABLE user_defined_name

The  colelts  declare the names of the columns for the table and their
data  types.  The data types are not checked or enforced in any way at
this time.

Table mutations (INSERT, UPDATE, DELETE)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Insert,  Update, and Delete statements insert rows into tables, modify
rows in tables in place, or remove rows from tables respectively::

   INSERT INTO table_name optcolids insert_spec

   DELETE FROM user_defined_name optwhere

   UPDATE user_defined_name
   SET assns
   optwhere

The  insert statement has two variants (in this implementation) INSERT
sub-select and INSERT VALUES::

   insert into r (a,b,c) select a,b,c from s

   insert into r (a,b,c) values (1,2,3)

The  first  inserts  the  result of a SELECT statement into the target
table  and  the  other  inserts  explicit values (which may be dynamic
parameters, see below).

Cursor based updates are not supported at the SQL level, eg::

   update r set a=1 where current of curs

is not supported.

Indices
~~~~~~~

The  create  and  drop  index  statements initialize and destroy index
structures respectively::

    CREATE INDEX user_defined_name
    ON user_defined_name
    ( namelist )

    DROP INDEX user_defined_name

Indices  allow fast access to a table, based on values for the indexed
columns in the namelist.

Indices can be UNIQUE, meaning that the attributes of the index cannot
take on the same values in the table twice::

    CREATE UNIQUE INDEX user_defined_name
    ON user_defined_name ( namelist )

Unique  indices  can  be  used  to  enforce  primary and secondary key
constraints.  After  a UNIQUE index on a table is created inserts that
attempt  to  insert  repeat  values  for  the  indexed columns will be
rejected.

Views
~~~~~

Create  view  and  drop  view  statements  initialize  and drop views,
respectively::

    CREATE VIEW user_defined_name optnamelist
    AS select_statement

    DROP VIEW user_defined_name

Views  are  "derived  tables"  which  are  defined  as  stored  SELECT
statements.  They  can  be  used as tables, except that they cannot be
directly mutated.

It  is  possible  to  "implement your own views in Python". Please see
remotetest.py, gfintrospect and the FAQ for discussion.

Conditions
----------

Conditions  are  truth  valued  boolean  expressions formed from basic
conditions possibly combined using NOT, AND, OR (where NOT has highest
precedence and OR has lowest precedence) and parentheses.

Basic conditions include simple comparisons::

    expression = expression
    expression < expression
    expression > expression
    expression <= expression
    expression >= expression
    expression <> expression

Variants  of  the  simple  comparisons  are  the  quantified  subquery
comparisons::

    expression = ANY ( subquery )
    expression = ALL ( subquery )

(and  similarly  for the other comparison operators). The IN predicate
tests membership (like =ANY)::

    expression IN ( subquery )
    expression NOT IN ( subquery )

For all the quantified comparisons and IN the subquery must generate a
single column table.

Also included are the the BETWEEN and NOT BETWEEN predicates::

    expression BETWEEN expression AND expression
    expression NOT BETWEEN expression AND expression

The  most  general  subquery  predicate is EXISTS and NOT EXISTS which
places no restriction on the subquery::

    EXISTS (subquery)
    NOT EXISTS (subquery)

Expressions
-----------

Expressions  occur  in  conditions  (WHERE,  HAVING,  etc.), in UPDATE
searched assignments, and in the select list of select statements.

Expressions  are  formed  from  primary expressions, possibly combined
using  the  standard  arithmetic  operators  and  parentheses with the
normal precedence.

Primary  expressions  include  numeric  and  string  literals. Numeric
literals  supported  are the Python numeric literals. String constants
are  set  off  by  apostrophies,  where  two  apostrophe's in sequence
represent an apostrophy in the string:
'SQL string literals ain''t pretty'

Column name expressions may be unqualified if they are unambiguous, or
may be qualified with a table name or table alias::

    bar
    frequents.bar
    f.bar

The  rules  for  scoping  of column names are not covered here. Column
names  in  subqueries  may refer to bindings in the query (or queries)
that contain the sub-query.

Subquery expressions of form::

    ( select_statement )

must produce a single column and single row table.

Aggregate  operations  are only permitted in the select list or in the
HAVING condition of SELECT statements (including subselects)::

    COUNT(*)
    COUNT(expression)
    AVG(expression)
    MAX(expression)
    SUM(expression)
    MIN(expression)

and also including the non-standard extension MEDIAN::

    MEDIAN(expression)

Aggregate operations can be applied to distinct values as in::

    COUNT(DISTINCT expression)

The  Dynamic  expression  "?"  is  a  placeholder for a value bound at
evaluation  time  (from  Python values). See the gadfly overview_ doc
"Dynamic Values" section for more details on the use of dynamic parameters.

.. _overview: overview.html

