Over the years a the world has seen the rise of a pletora of domain specific and generalized scripting languages. Although most of them offer some sort of database engine hook, a seamless SQL integration in the host language, be it in the form of embedded SQL or sympatetic grammar, is never in the cards. With the lack of a DBA domain specific scripting language, DBAs have to put up with either using multiple tools or clumsy language and operating system interaction (or both) in order to be able to pass statements to a database engine and get rows back.
The Structured Query Scripting Language aims to bridge this gap, bringing the ability to script in a SQL manner from outside the database engine, offering seamless connection, control and formatting constructs and interaction with the operating system. With syntax features comparable to the SQL language itself, the language is designed to be obvious to anyone with SQL, awk, sh, Informix SPL, Informix 4GL or PL/SQL expertise.
Back in the day, when in charge of designing and developing the ERP system at
my old employer, people would often come to me and ask to implement a simple
report which would consist at most of a couple of SELECT statements, minimal
glue code and some formatting.
Simple task you would think, except that the
publishing cycle for the application, considering the powerful technological
means at our disposal (our development machine was a NCR Tower32 200 with a
whooping 4M of RAM and 100MB hard disk), would be at the very least 24 hours, if
not 48.
Wouldn't it be nice, I said to myself (yes, being a geek can be a
very lonely affair), if I could just store the sql and formatting somewhere in
the database, and have a scripting language runner do the rest of the work for
me, or rather, my users?
A few late nights later a prototype implementation
had been deployed, with an expansion facility and formatting capabilities. SQSL
was born.
Suppose data needs to be migrated between two different database engines, be
it on a regular basis, or just as a one off. Suppose also that there is a
requirement to transform the data somehow, or collect aggregates.
Several
approaches spring to mind: for example unload to file from the source database
engine, load from file to the target engine and finally update the data and
obtain aggregates through simple SQL. Although feasible, the drawbacks of such
an approach are many, and obvious.
A more effective alternative would be to
write small application in java, perl or python, which selects from one source,
transforms or aggregates and finally inserts in the target, although the
complexity of interacting with two SQL sources from your chosen host language
might not make this a simple to implement or efficient proposition.
And your
friendly ETL salesperson will be very happy to hear about your latest
project.
Or you could do it the SQSL way:
CONNECT TO "db1" SOURCE db2; CONNECT TO "db2" SOURCE ifmx; LET min=1000000; LET max=0; LET avg=0::float; LET rows=0; SELECT col1, col2, col3, col4 FROM tab1 CONNECTION "db1" AGGREGATE rows=rows+1, avg=avg+($1/rows)::float, ($1>max) max=$1, ($1<min) min=$1 INSERT INTO tab1 VALUES (?, ?, ?, ?) CONNECTION "db2"; DISPLAY min, max, avg, rows FORMAT "%08d %08d %10f %08d" HEADERS "min", "max", "average", "nrows";
and if you wanted to speed up things, you maybe could
CLONE 10 INTO child, pid(child); CONNECT TO "db1" SOURCE db2; CONNECT TO "db2" SOURCE ifmx; SELECT * FROM tab1 WHERE col1>? AND col1<? USING (child-1)*100000, child*100000 CONNECTION "db1" INSERT INTO tab1 VALUES (?, ?, ?, ?) CONNECTION "db2"; DONE; WAIT FOR pid.* INTO r, d; DONE;
In addition to the above, and aside from standard control and simple statements, the language sports a number of other useful features such as a simple but effective expansion facility and the ability to load dinamically data sources or user defined functions. All of them are detailed in the language reference and demonstrated in the following sample scripts.
The language is implemented in the form a of a library (two API flavours available: c or Informix 4gl) which can be used to plug the interpreter in your application of choice, and a couple of sample clients, of which one has a readline library based interface and the other curses (I have also been toying with the idea of a KDE or gnome demo).
Connectivity is currently provided for Informix and DB2 engines, the first being built in, the second dinamically loadable.
Further reading