Mini SQL 2.0 (Beta)
Language Specifications
Introduction
The mSQL language offers a significant subset of the features provided
by ANSI SQL. It allows a program or user to store, manipulate and retrieve data in table
structures. It does not support some relational capabilities such as views and nested
queries. Although it does not support all the relational operations defined in the ANSI
specification, it does provide the capability of "joins" between multiple
tables.
The definitions and examples below depict mSQL key words in upper case,
but no such restriction is placed on the actual queries.
The Create Clause
The create clause as supported by mSQL 2 can be used to create tables,
indices, and sequences. It cannot be used to create other definitions such as views. The
three valid constructs of the create clause are shown below:
- CREATE TABLE table_name (
- col_name col_type [ not null ]
- [ , col_name col_type [ not null ] ]**
- )
- CREATE [ UNIQUE ] INDEX index_name ON table_name (
- field_name
- [ , field_name ] **
- )
- CREATE SEQUENCE ON table_name [ STEP step_val ] [ VALUE initial_val ]
An example of the creation of a table is show below:
- CREATE TABLE emp_details (
- first_name char(15) not null,
- last_name char(15) not null,
- comment text(50),
- dept char(20),
- emp_id int
- )
The available types are:-
| char (len) |
String of characters (or other 8 bit data) |
| text (len) |
Variable length string of chracters (or other 8 bit data) The defined
length is used to indicate the expected average length of the data. Any data longer than
the specified length will be split between the data table and external overflow buffers.
Note : text fields are slower to access than char fields and
cannot be used in an index nor in LIKE tests.
|
| int |
Signed integer values |
| real |
Decimal or Scientific Notation real values |
The table structure shown in the example would benefit greatly from the creation of some indices.
It is assumed that the emp_id field would be a unique value that is used to
identify an employee. Such a field would normally be defined as the primary key. mSQL 2.0
has removed support for the primary key construct within the table creation syntax
although the same result can be achieved with an index. Similarly, a common query may be
to access an employee based on the combination of the first and last names. A compound
index (i.e. constructed from more than 1 field) would improve performance. We could
construct these indices using :
CREATE UNIQUE INDEX idx1 ON emp_details (emp_id)
CREATE INDEX idx2 ON emp_details (first_name, last_name)
These indices will be used automatically whenever a query is sent to the database engine
that uses those fields in its WHERE clause. The user is not required to specify any
special values in the query to ensure the indices are used to increase performance.
Sequences provide a mechanism via which a sequence value can be maintained by the
mSQL server. This allows for atomic operations (such as getting the next sequence value)
and removes the concerns associated with performing these operations in client
applications. A sequence is associated with a table and a table may contain at most one
sequence.
Once a sequence has been created it can be accessed by SELECTing the
_seq system variable from the table in which the sequence is defined. For example
CREATE SEQUENCE ON test STEP 1 VALUE 5
SELECT _seq FROM test
The above CREATE operation would define a sequence on the table called test
that had an initial value of 5 and would be incremented each time it is accessed (i.e.
have a step of 1). The SELECT statement above would return the value 5. If the SELECT was
issued again, a value of 6 would be returned. Each time the _seq field is selected from test
the current value is returned to the caller and the sequence value itself is incremented.
Using the STEP and VALUE options a sequence can be created that starts
at any specified number and is incremented or decremented by any specified value. The
value of a sequence would decrease by 5 each time it was accessed if it was defined with a
step of -5.
The Drop Clause
The Drop clause is used to remove a definition from the database. It is
most commonly used to remove a table from a database but can also be used for removing
several other constructs. In 2.0 it can be used to remove the definition of an index, a
sequence, or a table. It should be noted that dropping a table or an index removes
the data associated with that object as well as the definition.
The syntax of the drop clause as well as examples of its use are given
below.
DROP TABLE table_name
DROP INDEX index_name FROM table_name
DROP SEQUENCE FROM table_name
for example
DROP TABLE emp_details
DROP INDEX idx1 FROM emp_details
DROP SEQUENCE FROM emp_details
The Insert Clause
Unlike ANSI SQL, you cannot nest a select within an insert (i.e. you
cannot insert the data returned by a select). If you do not specify the field names they
will be used in the order they were defined - you must specify a value for every field if
you do this.
- INSERT INTO table_name [ ( column [ , column ]** ) ]
- VALUES (value [, value]** )
for example
- INSERT INTO emp_details
- (first_name, last_name, dept, salary)
- VALUES (`David', `Hughes', `Development','12345')
- INSERT INTO emp_details
- VALUES (`David', `Hughes', `Development','12345')
The number of values supplied must match the number of columns.
The Select Clause
The SELECT offered by mSQL lacks some of the features provided by the
standard SQL specification. Development of mSQL 2 is continuing and some of this missing
functionality will be made available in the next beta release. At this point in time,
mSQL's select does not provide
- Nested selects
- Implicit functions (e.g. count(), avg() )
It does however support:
- Joins - including table aliases
- DISTINCT row selection
- ORDER BY clauses
- Regular expression matching
- Column to Column comparisons in WHERE clauses
- Complex conditions
The formal definition of the syntax for mSQL's select clause is
- SELECT [table.]column [ , [table.]column ]**
- FROM table [ = alias] [ , table [ = alias] ]**
- [ WHERE [table.] column OPERATOR VALUE
- [ AND | OR [table.]column OPERATOR VALUE]** ]
- [ ORDER BY [table.]column [DESC] [, [table.]column [DESC] ]
OPERATOR can be <,> , =, <=, =, <>, LIKE, RLIKE or CLIKE
VALUE can be a literal value or a column name
Where clauses may contain '(' ')' to nest conditions e.g. "where
(age <20 or age>30) and sex = 'male'" .
A simple select may be
- SELECT first_name, last_name FROM emp_details
- WHERE dept = `finance'
To sort the returned data in ascending order by last_name and descending
order by first_name the query would look like this
- SELECT first_name, last_name FROM emp_details
- WHERE dept = `finance'
- ORDER BY last_name, first_name DESC
And to remove any duplicate rows from the result of the select, the
DISTINCT operator could be used:
- SELECT DISTINCT first_name, last_name FROM emp_details
- WHERE dept = `finance'
- ORDER BY last_name, first_name DESC
mSQL provides three regular expression operators for use in where
comparisons. The standard SQL syntax provides a very simplistic regular expression
capability that does not provide the power nor the flexibility UNIX programmers or users
will be accustomed to. mSQL supports the "standard" SQL regular expression
syntax, via the LIKE operator, but also provide further functionality if it is required.
The available regular expression operators are:
- LIKE - the standard SQL regular expression operator.
- CLIKE - a standard LIKE operator that ignores case.
- RLIKE - a complete UNIX regular expression operator.
Note : CLIKE and RLIKE are not standard SQL and may not be
available in other implementations of the language if you decide to port your application.
They are however very convenient and powerful features of mSQL.
The regular expression syntax supported by the LIKE and CLIKE operators
is that of standard SQL and is outlined below
| `_' |
matches any single character |
| `%' |
matches 0 or more characters of any value |
| `\' |
escapes special characters (e.g. `\%' matches % and `\\' matches \ ) |
|
all other characters match themselves |
As an example of the LIKE operator, it is possible to search for anyone
in the finance department who's last name consists of any letter followed by `ughes', such
as Hughes. The query to perform this operation could look like
SELECT first_name, last_name FROM emp_details
- WHERE dept = `finance' and last_name like `_ughes'
The RLIKE operator provides access to the power of the UNIX standard
regular expression syntax. The UNIX regular expression syntax provides far greater
functionality than SQL's LIKE syntax. The UNIX regex syntax does not use the '_' or '%'
characters in the way SQL's regex does (as outlined above). The syntax available in the
RLIKE operator is
| '.' |
matches any single character |
| '^' |
When used as the first charactr in a regex, the caret character forces
the match to start at the first character of the string |
| '$' |
When used as the last charactr in a regex, the dollar sign forces the
match to end at the last character of the string |
| '[ ]' |
By enclosing a group of single characters withing square brackets, the
regex will match a single character from the group of characters. If the ']' character is
one of the characters you wish to match you may specifiy it as the first character in the
group without closing the group (e.g. '[]abc]' would match any single character that was
either ']', 'a', 'b', or 'c'). Ranges of characters can be specified within the group
using the 'first-last' syntax (e.g. '[a-z0-9]' would match any lower case letter or a
digit). If the first charactr of the group is the '^' character the regex will match any
single character that is not contained within the group. |
| '*' |
If any regex element is followed by a '*' it will match zero or more
instances of the regular expression. |
The power of a relational query language starts to become apparent when
you join tables together during a select operation. Lets say you had two tables defined,
one containing staff details and another listing the projects being worked on by each
staff member, and each staff member has been assigned an employee number that is unique to
that person. You could generate a sorted list of who was working on what project with a
query like:
SELECT emp_details.first_name, emp_details.last_name,
project_details.project
- FROM emp_details, project_details
- WHERE emp_details.emp_id = project_details.emp_id
- ORDER BY emp_details.last_name, emp_details.first_name
mSQL places no restriction on the number of tables "joined"
during a query so if there were 15 tables all containing information related to an
employee ID in some manner, data from each of those tables could be extracted, by a single
query. One key point to note regarding joins is that you must qualify all column names
with a table name. mSQL does not support the concept of uniquely named columns spanning
multiple tables so you are forced to qualify every column name as soon as you access more
than one table in a single select.
mSQL also supports table aliases so that you can perform a join of a
table onto itself. This may appear to be an unusual thing to do but it is a very powerful
feature if there are rows within a single table relate to each other in some way. An
example of such a table could be a list of people including the names of their parents. In
such a table there would be multiple rows with a parent/child relationship. Using a table
alias you could find out any grandparents contained in the table using something like
- SELECT t1.parent, t2.child from parent_data=t1, parent_data=t2
- where t1.child = t2.parent
The table aliases t1 and t2 both point to the same table (parent_data in
this case) and are treated as two different tables that just happen to contain exactly the
same data.
The Delete Clause
The SQL DELETE construct is used to remove one or more entries from a
database table. The selection of rows to be removed from the table is based on the same where
construct as used by the SELECT clause. The syntax for mSQL's delete clause is
DELETE FROM table_name
- WHERE column OPERATOR value
- [ AND | OR column OPERATOR value ]**
OPERATOR can be <,>, =, <=, =, <>, LIKE, RLIKE, or CLIKE
for example
DELETE FROM emp_details WHERE emp_id = 12345
The Update Clause
The SQL update clause is used to modify data that is already in the
database. The operation is carried out on one or more rows as specified by the where
construct. The value of any number of fields on the rows matching the where construct can
be updated. mSQL places a limitation on the operation of the update clause in that it
cannot use a column name as an update value (i.e. you cannot set the value of one field to
the current value of another field). Only literal values may by used as an update value.
The syntax supported by mSQL is
UPDATE table_name SET column=value [ , column=value ]**
- WHERE column OPERATOR value
- [ AND | OR column OPERATOR value ]**
OPERATOR can be <,> , =, <=, =, <>, LIKE, RLIKE or CLIKE
for example
UPDATE emp_details SET salary=30000 WHERE emp_id = 1234
Mini SQL 2.0 (Beta) System Variables
Introduction
Mini SQL 2.0 includes internal support for system variables (often known
as pseudo fields or pseudo columns). These variables can be accessed in the same way that
normal table fields are accessed although the information is provided by the database
engine itself rather than being loaded from a database table. System variables are used to
provide access to server maintained information or meta data relating to the databases.
System variables may be identified by a leading underscore in the
variables name. Such an identifier is not valid in mSQL for table or field names. Examples
of the supported system variables and uses for those variables are provided below.
Available System Variables
The mSQL 2 engine currently supports the following system variables:
_rowid
The _rowid system variable provides a unique row identifier for any row
in a table. The value contained in this variable is the internal record number used by the
mSQL engine to access the table row. It may be included in any query to uniquely identify
a row in a table. An example of such queries could be :
- select _rowid, first_name, last_name from emp_details
- where last_name = 'Smith'
- update emp_details set title = 'IT Manager'
- where _rowid = 57
The candidate row module is capable of utilising _rowid values to increase the
performance of the database. In the second example query above, only 1 row (the row with
the internal record ID of 57) would be accessed. This is in contrast to a sequential
search through the database looking for that value which may result in only 1 row being
modified but every row being accessed. Using the _rowid value to constrain a search is the
fastest access method available in mSQL 2.0. As with all internal access decisions, the
decision to base the table access on the _rowid value is automatic and requires no action
by the programmer or user other than including the _rowid variable in the where
clause of the query.
_timestamp
The _timestamp system variable contains the time at which a row was last
modified. The value, although specified in the standard UNIX time format (i.e. seconds
since the epoch), is not intended for interpretation by application software. The value is
intended to be used as a point of reference via which an application may determine if a
particular row has was modified before or after another table row. The application should
not try to determine an actual time from this value as the internal representation used
may change in a future release of mSQL.
The primary use for the _timestamp system variable will be internal to
the mSQL engine. Using this information, the engine may determine if a row has been
modified after a specified point in time (the start of a transaction for example). It may
also use this value to synchronise a remote database for database replication. Although
neither of these functions is currently available, the presence of a row timestamp is the
first step in the implementation.
Example queries may be:
- select first_name, _timestamp from emp_details
- where first_name like '%fred%'
- order by _timestamp
- select * from emp_details
- where _timestamp 88880123
_seq
The _seq system variable is used to access the current sequence value of
the table from which it is being selected. The current sequence value is returned and the
sequence is update to the next value in the sequence (see the CREATE section of the
Language Specification section from more information on sequences).
An example query using _seq could be
- select _seq from staff
_sysdate
The server can provide a central standard for the current time and date.
If selected from any table, the _sysdate system variable will return the current
time and date on the server machine using the standard UNIX time format (e.g. seconds
since the epoch).
An example query using _sysdate could be
- select _sysdate from staff
_user
By selecting the _user system variable from any table, the server
will return the username of the user who submitted the query.
An example query using _user could be
- select _user from staff
Mini SQL 2.0 (Beta) Standard Programs
and Utilities
The monitor - msql
| Usage |
msql [-h host] [-f confFile] database |
| Options |
-h |
Specify a remote hostname or IP address on
which the mSQL server is running. The default is to connect to a server on the localhost
using a UNIX domain socket rather than TCP/IP (which gives better performance) |
| -f |
Specify a non-default configuration file to be
loaded. The default action is to load the standard configuration file located in
INST_DIR/msql.conf (usually /usr/local/Hughes/msql.conf) |
| Description |
The mSQL monitor is an interactive interface to
the mSQL server. It allows you to submit SQL commands directly to the server. Any valid
mSQL syntax can be entered at the prompt provided by the mSQL monitor.
Control of the monitor itself is provided by 4 internal commands. Each
command is comprised of a backslash followed by a single character. The available command
are
|
|
| \q |
Quit |
|
| \g |
Go (Send the query to the server) |
| \e |
Edit (Edit the previous query) |
| \p |
Print (Print the query buffer) |
Schema viewer - relshow
| Usage |
relshow [-h host] [-f confFile] [database [rel
[idx] ] ] |
| Options |
-h |
Specify a remore hostname or IP address on which the mSQL
server is running. The default is to connect to a server on the localhost using a UNIX
domain socket rather than TCP/IP (which gives better performance) |
| -f |
Specify a non-default configuration file to be loaded. The
default action is to load the the standard configuration file located in
INST_DIR/msql.conf (usually /usr/local/Hughes/msql.conf) |
| Description |
Relshow is used to display the structure of the
contents of mSQL databases. If no arguments are given, relshow will list the names of the
databases currently defined. If a database name is given it will list the tables defined
in that database. If a table name is also given then it will display the structure of the
table (i.e. field names, types, lengths etc).
If an index name is provided along with the database and table names,
relshow will display the structure of the specified index including the type of index and
the fields that comprise the index.
|
Admin program - msqladmin
| Usage |
msqladmin [-h host] [-f confFile] [-q] Command |
| Options |
-h |
Specify a remore hostname or IP address on
which the mSQL server is running. The default is to connect to a server on the localhost
using a UNIX domain socket rather than TCP/IP (which gives better performance) |
| -f |
Specify a non-default configuration file to be
loaded. The default action is to load the the standard configuration file located in
INST_DIR/msql.conf (usually /usr/local/Hughes/msql.conf) |
| -q |
Put msqladmin into quiet mode. If this flag is
specified, msqladmin will not prompt the user to verify dangerous actions (such as
dropping a database). |
| Description |
msqladmin is used to perform administrative
operations on an mSQL database server. Such tasks include the creation of databases,
performing server shutdowns etc. The available commands for msqladmin are |
| create db_name |
Creates a new database called db_name |
| drop db_name |
Removes the database called db_name from the server. This
will also delete all data contained in the database! |
| shutdown |
Terminates the mSQL server. |
| reload |
Forces the server to reload ACL information. |
| version |
Displays version and configuration information about the
currently running server. |
| stats |
Displays server statistics. |
|
Note : most administrative functions can only be executed by the
user specified in the run-time configuration as the admin user. They can also only be
executed from the host on which the server process is running (e.g. you cannot shutdown a
remote server process).
|
Data dumper - msqldump
| Usage |
msqldump [-h host] [-f confFile] [-c] [-v]
database [table] |
| Options |
-h |
Specify a remore hostname or IP address on which the mSQL
server is running. The default is to connect to a server on the localhost using a UNIX
domain socket rather than TCP/IP (which gives better performance) |
| -f |
Specify a non-default configuration file to be loaded. The
default action is to load the the standard configuration file located in
INST_DIR/msql.conf (usually /usr/local/Hughes/msql.conf) |
| -c |
Include column names in INSERT commands generated by the
dump. |
| -v |
Run in verbose mode. This will display details such as
connection results etc. |
| Description |
msqldump produces an ASCII text file containing
valid SQL commands that will recreate the table or database dumped when piped through the
mSQL monitor program. The output will include all CREATE TABLE commands required to
recreate the table structures, CREATE INDEX commands to recreate the indices, and INSERT
commands to populate the tables with the data currently contained in the tables.
Note : msqldump does not recreate sequences at this time.
|
Data exporter - msqlexport
| Usage |
msqlexport [-h host] [-f conf] [-v] [-s Char]
[-q Char] [-e Char] database table |
| Options |
-h |
Specify a remore hostname or IP address on which the mSQL
server is running. The default is to connect to a server on the localhost using a UNIX
domain socket rather than TCP/IP (which gives better performance) |
| -f |
Specify a non-default configuration file to be loaded. The
default action is to load the the standard configuration file located in
INST_DIR/msql.conf (usually /usr/local/Hughes/msql.conf) |
| -v |
Verbose mode |
| -s |
Use the character Char as the separation character. The
default is a comma. |
| -q |
Quote each value with the specified character |
| -e |
Use the specifed Char as the escape character. The default
is \ |
| Description |
msqlexport produces an ASCII export of the data
from the specified table. The output produced can be used as input to other programs such
as spreadsheets. It has been designed to be as flexible as possible allowing the user to
specify the character to use to separate the fields, the character to use to escape the
separator character if it appears in the data, and whether the data should be quoted and
if so what character to use as the quote character.
The output is sent to stdout with one data row per line.
|
Data importer - msqlimport
| Usage |
msqlimport [-h host] [-f conf] [-v] [-s Char]
[-e Char] [-c col,col...] database table |
| Options |
-h |
Specify a remore hostname or IP address on which the mSQL
server is running. The default is to connect to a server on the localhost using a UNIX
domain socket rather than TCP/IP (which gives better performance) |
| -f |
Specify a non-default configuration file to be loaded. The
default action is to load the the standard configuration file located in
INST_DIR/msql.conf (usually /usr/local/Hughes/msql.conf) |
| -v |
Verbose mode |
| -s |
Use the character Char as the separation character. The
default is a comma. |
| -e |
Use the specifed Char as the escape character. The default
is \ |
| -c |
A comma separated list of column names into which the data
will be inserted.
Note : there can be no spaces in the list. |
| Description |
msqlimport loads a flat ASCII data file into an
mSQL database table. The file can be formatted using any character as the column
separator. When passed through msqlimport, each line of the txt file will be loaded as a
row in the database table. The separation character as specified by the -s flag, will be
used to split the line of text into columns. If the data uses a specific character to
escape any occurence of the separation character in the data, the escape character can be
specified with the -e flag and will be removed from the data before it is inserted. |
Mini SQL 2.0 (Beta) Run Time
Configuration
Introduction
mSQL 1.x offered several configuration options, including such details
as the user the server should run as, the location of the TCP and UNIX sockets for
client/server communications, the location of the database files etc. The problem with
configuring mSQL 1.x was that all these details were hard-coded into the software at
compile time. Once the software was compiled and installed you couldn't easily change
those settings.
To overcome this problem, mSQL 2.0 utilises an external run-time
configuration file for definition of all these values. The file is called msql.conf
and is located in the installation directory (usually /usr/local/Hughes). An application
can choose to use a different configuration file by calling the new msqlLoadConfigFile(
) API function. All standard mSQL applications and utilities provide a command
line flag, -f ConfFile , that allows you to specify a non-standard
configuration file. When an application first calls the mSQL API library, a check is made
to see if a configuration file has been loaded via a call to the msqlLoadConfigFile( )
function. If no such call has been made, the API library loads the default config file.
Any values that are specified in that file will over-ride the normal operating paramaters
used by mSQL.
Structure of the config file
The configuration file is a plain text file organised into sections. The
file can contain blank lines and comments. A comment is a line that begins with the '#'
character. Each section of the configuration file has a section header, which is written
as the section name enclosed in square brackets (for example [ general ]).
Currently the only section defined is the general section although further sections
covering security and access control will be added later.
Configuration values within a section are presented using the config
parameter name followed by and equals sign and then the new value. There can only be one
entry per line and if an entry is defined multiple times in the one config file the last
value defined will be used. If a parameter is not defined in the config file then an
internal default value will be used at run-time.
Elements of the General section
The following configuration parameters are available in the general
section of the config file. Please note that %I may be used in configuration
entries to signify the mSQL installation directory (e.g. /usr/local/Hughes).
Parameter |
Default Value |
Definition |
| Inst_Dir |
/usr/local/Hughes |
The full path to the installation directory. This is the directory in
which all the mSQL files are located (such as the program files, the database files etc). |
| mSQL_User |
msql |
The user that the mSQL server should run as. If the server is started
by a user other than this user (e.g. it is started as root from a boot script) it will
change UID so that it runs as the specified user. |
| Admin_User |
root |
The user that is allowed to perform privileged operations such as
server shutdown, cration of databases etc. |
| Pid_File |
%I/msql2.pid |
The full path of a file in which the PID of the running mSQL server
process will be stored. |
| TCP_Port |
1114 |
The TCP port number on which the mSQL server will accept client/server
connections over a TCP/IP network. If this value is modified it must be modified on the
machine running the client software also. |
| UNIX_Port |
%I/msql2.sock |
The full path name of the UNIX domain socket created by the mSQL server
for connections from client applications running on the same machine. |
Example configuration file
Below is a sample configuration file. This file does not achieve
anything as it just sets the parameters to their default values.
#
# msql.conf - Configuration file for Mini SQL Version 2
#
#--------------------------------------------------------------
#
# This file is an example configuration and may require
# modification to suit your needs or your site. The values
# given are the default values and will be used by the
# software if either this file is missing or a specific value
# is not specified.
#
#--------------------------------------------------------------
[general]
Inst_Dir = /usr/local/Hughes
mSQL_User = msql
Admin_User = root
Pid_File = %I/msql2.pid
TCP_Port = 1114
UNIX_Port = %I/msql2.sock
|
|