Category: MySQL / MariaDB


Releasing ocelotgui 1.0.0

Today ocelotgui, the Ocelot Graphical User Interface for MySQL and MariaDB, version 1.0.0, is generally available for Linux. Read the manual, see the screenshots, and download binary (.deb or .rpm) packages and GPL-licensed C++ source here.

Client-Side Predictive Parsing of MySQL/MariaDB Grammar

The Ocelot GUI client for MySQL/MariaDB is now beta. The final feature is client-side predictive parsing of every SQL clause and statement. Readers who only care how we did it can skip to the section Recursive Descent Parsers”. I’ll start by illustrating why this is a good feature.

Error Checks

Compare this snapshot from mysql client:
post_client_select
with this from ocelotgui:
post_ocelogui

The GUI advantage is that the error message is more clear and the error location is more definite. This is not always true. However, anybody who dislikes the famous message “You have an error …” should like that there is another way to hear the bad news. It’s like getting an extra opinion.

In theory the further advantage is: this saves time on the server, because the client catches the syntax errors. In practice that should not be important, because: (a) ocelotgui is lax and can be erroneous; (b) there are better ways to catch some syntax errors before sending them to the server, for example setting up a test machine.

Finding the end

All respectable clients can scan an SQL statement to find its tokens. For example the mysql client knows that anything enclosed in quote marks, except a pair of quote marks, is a single token. This is pretty well essential, because it has to know whether a semicolon is just part of a string, or is the end of a statement.

Unfortunately a semicolon might not be the end of a compound statement, and that’s where simple “tokenizers” like mysql’s can’t cope. So in my MySQL days we had to come up with the concept of delimiters (I say “we” because I think the hassle of delimiters might have been my idea, but others are welcome to take the credit). It should be clear that a statement like

CREATE PROCEDURE p() BEGIN WHILE 0 = 0 DO ... END WHILE;

is not complete and should not go to the server, so it’s nice that we can say we don’t need delimiters.

But they still can have their uses. The scenario I dread is that a user has a spelling error, causing the client to think that input is complete, and shipping it off to the server prematurely. If there was some symbol that always meant “ship it!” and no other symbol ever meant “ship it!”, that would become unlikely. That’s what delimiters are. But we should have decided on a fixed unchangeable symbol instead of a user-defined one.

Highlighting

All respectable GUIs have highlighting, that is, they can display different colours for literals, operators, or keywords. But most GUIs cannot figure out what is a keyword, unless they ask the MySQL-server’s parser. The SQL problem is that there are two kinds of keywords: reserved words (which are guaranteed to have syntactic meaning if they’re not errors), and ordinary keywords (which might have syntactic meaning but might be identifiers, depending where they appear). An example statement is

CREATE PROCEDURE begin () end: BEGIN END;

where “end” and “begin” are identifiers, but BEGIN and END are not.

(A much easier problem is that you’d need two keyword lists, because in MySQL 5.7 GENERATED and GET and IO_AFTER_GTIDS and IO_BEFORE_GTIDS and MASTER_BIND and STORED and VIRTUAL are reserved words, but in MariaDB they’re not. But, since we don’t think the MariaDB manual’s reserved-word list is correct, we think those are the only reserved-word differences.)

So we have to beware of anything that depends on a list of keywords. For example my favoured editor is a variant of Kate, which recognizes a hodgepodge of symbols in its “SQL Keyword” list, and displays utterly useless highlighting when my file’s extension is .sql. Apparently Vim would be similar. Most clients that claim to work with MySQL or MariaDB are better than editors — they at least use a DBMS-specific keyword list rather than a generic one — and they’re generally good because in a sense they train users not to use keywords as identifiers. For example, if users see that “begin” has a non-identifier colour while they’re typing, they’ll avoid creating objects named “begin” even though it’s not a reserved word.

I’m not sure whether, in the wider SQL world, well known GUI clients have advanced past the keyword-list stage. I see hints here and there that SQL Server Management Studio has not and Toad has not, and Oracle SQL Developer has, but nothing explicit, and I repeat I’m not sure.
(UPDATE 2016-03-28: Mike Lischke’s description of Oracle’s new approach is here.)

In the end, then, a GUI with keyword-list highlighting will be right 90% of the time (I’m just guessing here, but I think it will sound right to experienced readers). On the other hand, a GUI that recognizes grammar should be right 97% of the time, and — just guessing again — I expect that’s preferable.

Predicting

The next step upwards in intelligence is knowing what the next token might be before the user starts typing it, or knowing how the current token ends before the user finishes typing it. And here, if you contemplate the Error Message example that we started with, you might realize: the MySQL/MariaDB server parser can’t do this. Which is why I emphasized right in the title: this is “predictive” parsing.

Example 1: suppose you’ve typed SELECT * FROM T ORDER. The GUI will show what the next word must be …
post_order

Example 2: suppose you’ve typed CREATE T. The GUI will show what possible words start with T …
post_create_t

And in either case, hitting the Tab key will save a bit of typing time. I should note that “hitting the Tab key to complete” is something many can do, even the mysql client — but for identifiers not keywords. Technically we can do both, though we prefer to avoid discussing identifiers.

Initial Summary

For error checking, finding the end, highlighting, and predicting: whoopie for predictive parsing of the whole MySQL / MariaDB grammar.

As an additional point, I suppose it’s obvious that we wouldn’t have been able to incorporate a stored-procedure debugger in ocelotgui without parsing. Admittedly it is not using the new parsing code, but it is necessary for it to do a lot more than looking at keywords. So I class debugger capability as the fifth advantage of having client-side parsing.

Recursive Descent Parsers

The algorithms for recursive descent parsers are in most textbooks for compilers, and even in Wikipedia. The “recursive” means that the process can call itself; the rest of the algorithm looks like this:
If (next thing is X) accept it and get the next thing
if (next thing is Y) accept it and get the next thing
if (next thing is Z) accept it and get the next thing

else give up and say there’s an error.
Simple, eh?

Now for some quotes from those textbooks for compilers:

“The advantages of recursive descent parsers are that it’s easy to write, and once written, it’s easy to read and understand. The main disadvantage is that it tends to be large and slow.” — Ronald Mak, Writing Compilers And Interpreters, page 810

“The parser text shows an astonishingly direct relationship to the grammar for which it was written. This similarity is one of the great attractions of recursive descent parsing. … In spite of their initial good looks, recursive descent parsers have a number of drawbacks … repeated backtracking … often fails to produce a correct parser [text] … error handling leaves much to be desired.” — Dick Grune, Henri Bal, Ceriel Jacobs, Koen Langendoen, pages 117-119

And they’re right. Have a look at the parsing code of ocelotgui, which can be done by opening a source file here and searching for the first line that begins with “void MainWindow::hparse_f_” (line 9565 in today’s beta). Then scroll downwards till there are no more routines that begin with hparse_f_ — 5000 lines later. Readable, sure, because it’s the simplest of C code. But also tedious, repetitive, and yes, “large and slow”. And this is without knowing anything about object names, since it’s a purely syntactic syntax checker and won’t look at metadata.

On a server the disadvantages could be crippling, but on a client they don’t matter — the meanest laptop has megabytes to spare and the response time is still way faster than a user can blink. And SQL doesn’t require repeated backtracking because it’s rarely necessary to look ahead to the next tokens in order to figure out what the current token means. Here is the worst example that we ran into (I’m quoting the MariaDB 10.1 manual

                     
GRANT role [, role2 ... ] TO grantee [, grantee2 ... ]

which can be instantiated as

GRANT EXECUTE, ROLE55 TO JOE;

See the problem? It’s perfectly okay for EXECUTE to be a role name — it’s not reserved — but typically it’s in statements like GRANT EXECUTE … ON PROCEDURE. So we have to look ahead to find whether ON follows, or whether TO follows. Which we did … and then found out that MariaDB couldn’t. I’d mentioned that this looked tough when I wrote about MariaDB roles two years ago. But for some reason it was attempted anyway and the inevitable bug report has been labelled “stalled” for a long time.

Oh, and one more detail that you’ll find in those compiler textbooks: correctly speaking, ocelotgui has a “recognizer” not a “parser” because it doesn’t generate a tree. That’s why I’ve carefully said it “does parsing” but not said it contains a parser.

Beta Status

The above represents the last feature that we intend to add. It’s at last “ocelotgui beta” rather than “ocelotgui alpha”. The C source and Linux executables, as usual, are at
https://github.com/ocelot-inc/ocelotgui.

Comments in SQL Statements

First I will say what the comment syntax is for various vendors’ dialects. Then I will get specific about some matters that specifically affect MySQL or MariaDB.

Syntax Table

DBMS –… /*…*/ #… Nesting Hints
Standard YES YES NO YES NO
Oracle 12c YES YES NO NO YES
DB2 YES YES NO YES NO

SQL Server YES YES NO YES NO

PostgreSQL YES YES NO YES NO

MySQL/MariaDB YES(99%) YES YES NO YES

The first column is for the type of DBMS. “Standard” is the ISO/IEC SQL standard document. For the others, just click on the DBMS name to see the relevant documentation. The standard, incidentally, clarifies that strings of comments are to be treated as a newline, so if you hear somebody say “comments are ignored”, that’s slightly wrong.

The first column is for comments that begin with “–” (two hyphen-minus signs), what the standard document calls “simple comments”, the ones that look like this:

SELECT * FROM t; -- SIMPLE COMMENT

Everyone supports simple comments, the only problem with MySQL/MariaDB is their insistence that the — must be followed by a space. I’ve had it explained to me that otherwise the parser had problems.

The second column is for comments enclosed by /* and */, what the standard document calls “bracketed comments”, the ones that look like this:

SELECT * FROM t; /* BRACKETED COMMENT */

According to the standard document, bracketed comments are not mandatory, they are optional feature T351. However, it would be surprising to find a modern SQL implementation that doesn’t support them.

The third column is for comments that begin with “#” (what Unicode calls Number Sign but an American dictionary allows for the word Octothorpe ), the ones that look like this:

SELECT * FROM t; # OCTOTHORPE COMMENT

Notice how, in every row but the MySQL/MariaDB row, the key word is NO? In fact I’ve only encountered one other SQL DBMS that is octothorpophiliac: mSQL. Old-timers may recall that mSQL from Hughes Technologies was, for a while, an inspiration for one of MySQL’s founders. Anyway, it’s unnecessary because simple comments do the job just as well.

The fourth column is for nesting, that is, putting bracketed comments within bracketed comments, that look like this:

SELECT * FROM t; /* OUTER /* INNER */ COMMENT */

I’ve often been irritated that I can’t nest in C, so I approve of the DBMSs that support this standard requirement. But I never saw it as important in my MySQL-architect days. There were a few what I guess could be categorized
as “feature requests” (here and here and here) and I approved of my colleagues’ clear responses, it’s low priority.

The final column is for hints. A hint is a bit of syntax that the server might ignore, signalled by an extra character or two in a bracketed comment, like this:

SELECT /*+ HINT COMMENT */ * FROM t;

Typically a hint is a suggestion for an optimizer, like “use index X instead of the default”. It’s found in Oracle; it’s not found in PostgreSQL and some PostgreSQL folks don’t like it; but it’s found in EnterpriseDB’s “advanced PostgreSQL”; and of course it’s found in MySQL and MariaDB. A newish point is that MariaDB has an extra signal “/*M!###### MariaDB-specific code */” that MySQL won’t recognize, which is a good thing since the optimizers have diverged somewhat.

Passing comments to the server

In the MySQL 5.7 manual we see the client has an interesting option:

–comments, -c
Whether to preserve comments in statements sent to the server. The default is –skip-comments (discard comments), enable with –comments (preserve comments).

and a good question is: huh? Surely we should preserve comments, especially in stored procedures, no? Well, the obvious answer is that the parser has to spend time skipping over them, but I doubt that the effect is significant nowadays. The better answer is merely that behaviour changes are serious so let’s leave this up to the users. Our GUI client supports –comments too, which is no surprise since we support all mysql-client options that make sense in a GUI.

But what happens if it’s hard to tell where comments belong? Buried in the source download is a file named mysql-test/t/mysql_comments.sql which is checking these questions:
* Ignore comments outside statements, on separate lines?
* Ignore comments at the end of statements, on the same line but after the semicolon?
* Ignore comments inside CREATE PROCEDURE/FUNCTION/TRIGGER/EVENT, but not in the body?
The test should be updated now that compound statements in MariaDB don’t have to be inside CREATE PROCEDURE/FUNCTION/TRIGGER/EVENT.

Therefore

Steve McConnell’s “Code Complete” book advises: “A common guideline for Java and C++ that arises from a similar motivation is to use // synax for single-line comments and /* … */ syntax for larger comments.”

I guess that the equivalent for SQL purposes would be to say: use — for single-line comments and /* … */ for longer ones. But don’t use #, and be wary with standalone or endline comments, and turn –comments on.

Update

In an earlier blog post I predicted that ocelotgui, our GUI client for MySQL and MariaDB, would be beta in February. Now it’s February 29, so I have to modify that to: “any day now (watch this blog for updates or click Watch on the github project page)”. The latest feature additions are in the downloadable source code, by the way, but not in the binary release.

Privileges in MySQL and MariaDB: The Weed Of Crime Bears Bitter Fruit

Let’s look at how well MySQL and MariaDB support privileges (part of “access control” in standard terms), compared to other DBMSs, and consequences thereof.

Count the Privileges

I go to the DBMS manuals (here and starting here and here) and I count the privileges. This is like judging a town by the number of traffic lights it claims to have, but I’m trying to get an indicator for how granular the DBMS’s “authorization” is.

Number of privileges listed in the manuals

MySQL/MariaDB  Oracle 12c     DB2 9.7   SQL Server 2014
31             240            52        124

Pretty small number in the first column, eh? There are historical reasons that MySQL was reluctant to add new privileges, illustrated by Bug#43730.

What is the effect of having a limited number of privileges? Sometimes the same privilege has to be used for two different things. For example, the SUPER privilege is good for “CHANGE MASTER TO, KILL, PURGE BINARY LOGS, SET GLOBAL, and mysqladmin debug command”, while the PROCESS privilege is what you need for SHOW PROCESSLIST — but also for selecting from information_schema.innodb_sys_tables.

Why is this a flaw? If administrators want to allow access to a goose, they are forced to allow access to a gander as well — even when the gander is none of the grantee’s business. As an example that affected us: to make Ocelot’s stored-procedure debugger work, we have to be able to set values in a single global variable, which is impossible without the SUPER privilege, therefore to allow people to use the debugger you have to allow them to purge binary logs too.

Standard Requirements

The SQL standard mentions 9 privileges: INSERT UPDATE DELETE SELECT REFERENCES USAGE UNDER TRIGGER EXECUTE. MySQL and MariaDB do a fair job of handling them:

INSERT: perfect support, including column-level grants.

UPDATE: perfect support, including column-level grants.

DELETE: perfect support.

SELECT: perfect support, including column-level grants.

REFERENCES: perfect support, not including column-level grants. I think this is not well known yet. Deep in the caverns of the manual are the words: “The REFERENCES privilege is unused before MySQL 5.7.6. As of 5.7.6, creation of a foreign key constraint requires the REFERENCES privilege for the parent table.” Kudos to MySQL. The MariaDB manual, on the other hand, still says the REFERENCES privilege is “unused”. For some background about this new feature, click the high-level architecture tab in the old worklog task Foreign keys: reference privilege.

USAGE: no support. In standard SQL, USAGE is for access to domains or UDTs or sequence generators or transliterations or character sets or collations. In MySQL/MariaDB, USAGE is the minimum privilege — you can log in, that’s all. So USAGE is unsupported, but unimportant.

UNDER: no relevance. This is for optional UDT features.

TRIGGER: perfect support.

EXECUTE: perfect support.

Looking at the down side, MySQL and MariaDB don’t allow for the standard GRANT OPTION. Yes, they have a GRANT OPTION privilege, but that’s not standard — what’s needed (and what’s supported by the other serious DBMSs) is an option to grant a particular privilege, not a privilege to grant any privileges.

Packaging

The objection about having hundreds of possible privileges is: it’s hard to keep track of them, or even remember what they are. This should be a solved problem: allow a package of privileges, in other words support CREATE ROLE. This time the kudos go to MariaDB which has allowed roles for over two years. But what if you have MySQL and it’s tedious to grant multiple times?

It’s still simple. You either make a script which contains a bunch of GRANT statements, or you create a stored procedure. Certainly I’d recommend a stored procedure, because it will be “inside the database”, and therefore subject to tracking. Scripts are a tad more dicey security-wise, since changing or deleting files is a process outside the DBMS’s control.

After all, doing grants via an insecure mechanism would kinda mess up the idea of using grants for extra security.

information_schema

There is a standard and reasonable way to get at metadata: you can see the information_schema table, but you won’t see rows for database objects that you don’t have access to.

MySQL and MariaDB follow this plan, but there is a major exception: InnoDB. Consider INNODB_SYS_TABLES, which has information about other tables. Of course this table should not exist at all (the sensible place is information_schema.TABLES), but the more troubling fact is that the relevant privilege is not “whether you have access to the other tables”, but — wow — the PROCESS privilege. And to top it off, in MySQL (though not MariaDB) instead of an empty table you get an error message.

Statement: select * from information_schema.innodb_sys_tables;

Response from MySQL 5.7: ERROR 1227 (42000): Access denied; you need (at least one of) the PROCESS privilege(s) for this operation

We Live In A Just World

Therefore, here is how I can crash my MySQL 5.7 server. Provided, ironically, that I do not have any privileges on any database objects. In other words, I’ve logged in as a user who has been granted the minimum:

GRANT USAGE ON *.* to ‘peter’@’localhost’;

The important prerequisites are: MySQL 5.7.9 compiled from source, a new installation, and an unprivileged user. It doesn’t seem to happen under any other circumstances. So this is not a vulnerability alert. I like to show it, though, as an illustration of the punishment that awaits violators of the precepts of privileges.

As I indicated, I’ve logged in, and the database is empty. Now I say:

SELECT A.NAME, B.NAME
  FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES A
  LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_TABLES B
  ON A.SPACE = B.SPACE;

On the client I see
ERROR 2013 (HY000): Lost connection to MySQL server during query

On the server I see
mysqld: /home/pgulutzan/Downloads/mysql-5.7.9/sql/sql_error.cc:444: void Diagnostics_area::set_error_status(uint, const char*, const char*): Assertion `! is_set() || m_can_overwrite_status’ failed.
18:37:12 UTC – mysqld got signal 6 ;

As this is a crash and something is definitely wrong, the information collection process might fail.
etc.

Those who live in glass houses

I don’t think it would be fair to end this without confessing: us too.

For example, the ocelotgui GUI client for MySQL and MariaDB can crash if I ask it not to send /* comments */ to the server, and there is a very long comment at the end of a statement after the semicolon. We are all sinners.

However, that bug, and a few minor ones, have been found during alpha tests. I’m still hopeful that we’ll go beta within a few weeks, and invite anyone to try and find an embarrassing problem before that happens. The readme and the download are on github at https://github.com/ocelot-inc/ocelotgui.

Generated columns in MariaDB and MySQL

It has been seven years since the last time I blogged about generated columns, and a lot has happened — now both MariaDB and MySQL support them. So it’s time to look again, see how well they’re doing, and compare to the SQL standard’s Optional feature T175 Generated columns.

This is not an introductory description or an explanation why you’d want to use generated columns rather than (say) triggers and views. For that, I’d recommend the relevant manuals or the blog posts by Alexander Rubin and Anders Karlsson.

The Generation Clause

Standard            MariaDB 10.1             MySQL 5.7
---------           ------------             ---------
[data type]         data_type                data type
GENERATED ALWAYS    [GENERATED ALWAYS]       [GENERATED ALWAYS]
AS                  AS                       AS
(expression)        (expression)             (expression)
                    [VIRTUAL | PERSISTENT]   [VIRTUAL | STORED]       
[constraints]       [constraints]            [constraints]
                    [COMMENT 'string']       [COMMENT 'string']

The above side-by-side BNFs show the standard syntax and the syntax that MariaDB and MySQL actually allow at the time I’m writing this. The MariaDB manual says incorrectly that either VIRTUAL or PERSISTENT is mandatory. The MySQL manual suggests incorrectly that the clause order is fixed, and has a typo: there should be two “]]”s after “STORED”.

The first important deviation from standard SQL is that “data type” is not an optional clause in either MariaDB or MySQL. The data type can be figured out from the type of the expression, after all.

The second important deviation is that [GENERATED ALWAYS] is optional and there’s a way to say whether the column is virtual (column value is generated when it’s accessed) or persistent/stored (column value is generated when it’s set, and kept in the database). I call this a single deviation because it’s got a single backgrounder: compatibility with Microsoft SQL Server. In fact the original title of the worklog task (WL#411) was “Computed virtual columns as MS [i.e. Microsoft] SQL server has”. We changed it to “Generated columns”, but the perfume of Microsoftness lingers, and you’ll see traces in the vocabulary too. For example, MariaDB has an error message: “”HY000 A computed column cannot be based on a computed column”.

So the tip sheet is: for the sake of compatibility with the standard rather than with Microsoft, always say GENERATED ALWAYS, and call it a “generated” column not a “computed” column. It’s okay to say VIRTUAL, though, because Oracle does.

Restrictions

In standard SQL these restrictions apply:

“Every column reference contained in [the generation expression) shall reference a base column of [the same table].”
In other words, a generated column cannot be based on another generated column. MariaDB adheres to this, but MySQL, as a harmless extension, allows
CREATE TABLE t (a INT, b INT GENERATED ALWAYS AS (a), c INT GENERATED ALWAYS AS (b));

“[The generation expression] shall be deterministic.”
This is pretty reasonable, and both MariaDB and MySQL comply.

“[The generation expression] shall not contain a routine invocation whose subject routine possibly reads SQL-data.”
This is reasonable too, but MariaDB and MySQL go much further — they forbid every user-defined function, even if it’s declared that it’s deterministic and reads no SQL data.

“[The generation expression] shall not contain a query expression”.
In other words, GENERATED ALWAYS AS (SELECT …)” is a no-no. Again, reasonable, and I doubt it will occur to anyone to try.

Differences between MariaDB and MySQL

We’re actually looking at two different implementations — MariaDB’s generated columns come ultimately from a user contribution by Andrey Zhakov, while MySQL’s generated columns are younger and are more of an in-house development. (Update added later: Mr Zhakov deserves credit for the MySQL development too, see the comments.) Things worth noticing are:
* the PERSISTENT versus STORED syntax detail, mentioned earlier,
* GENERATED is a reserved word in MySQL but not in MariaDB,
* MariaDB has some restrictions about foreign keys that MySQL doesn’t have.

MySQL lacks some restrictions about foreign keys, eh? That could lead to interesting results. I tried this sequence of statements:

CREATE TABLE t1 (s1 INT PRIMARY KEY);
CREATE TABLE t2 (s1 INT,
                 s2 INT AS (s1) STORED,
                 FOREIGN KEY (s1) REFERENCES t1 (s1)
                 ON UPDATE CASCADE);
INSERT INTO t1 VALUES (1);
INSERT INTO t2 VALUES (1, DEFAULT);
UPDATE t1 SET s1 = 2;
SELECT * FROM t1;
SELECT * FROM t2;

And the results from the two SELECTs looked like this:

mysql> SELECT * FROM t1;
+----+
| s1 |
+----+
|  2 |
+----+
1 row in set (0.00 sec)

mysql> SELECT * FROM t2;
+------+------+
| s1   | s2   |
+------+------+
|    2 |    1 |
+------+------+
1 row in set (0.00 sec)

If you’re thinking “whoa, that shouldn’t be right”, then I’m sure you’ll understand why MariaDB doesn’t allow this trick.

Metadata

In the standard there are two relevant INFORMAIION_SCHEMA columns: IS_GENERATED and GENERATION_EXPRESSION. In MySQL all we get is a value in the EXTRA column: “VIRTUAL GENERATED”. In MariaDB also we get a value in the EXTRA column: “VIRTUAL”.

I’d say that both implementations are deficient here — you can’t even see what the “(expression)” was.

Hurray For Everybody

Both MariaDB and MySQL have slight flaws in their implementations of generated columns, but my complaints here shouldn’t cause worry. Both are robust, as I found out by wasting time looking for bugs. This feature can be used.

What about us?

In my last blog post I may have been over-confident when I predicted ocelotgui (our GUI for MariaDB and MySQL) would allow multiple connections and go beta this month. We will have something that, I hope, will be far more exciting.

Send messages between sessions on MySQL or MariaDB

Suppose you want to send a message from one SQL session to one or more other sessions, like “Hey, session#13, here is the latest figure for your calculation, please acknowledge”. I’ll say what Oracle and EnterpriseDB and DB2 do, then rate the various ways to implement something similar in MySQL and MariaDB, and finish with a demo of the procedure that we use, which is available as part of an open-source package.

The syntax was established by Oracle, with EnterpriseDB and IBM following suit. The details are in the Oracle 12c manual and the DB2 9.7 manual. The oversimplification is this:

  DBMS_PIPE.PACK_MESSAGE('message');
  SET status = DBMS_PIPE.SEND_MESSAGE('channel#1');
  SET status = DBMS_PIPE.RECEIVE_MESSAGE('channel#1');
  DBMS_PIPE.UNPACK_MESSAGE(target_variable);

The idea is that PACK_MESSAGE prepares the message, SEND_MESSAGE ships the message on a channel, RECEIVE_MESSAGE receives something on a channel, and UNPACK_MESSAGE puts a received message in a variable. The whole package is called DBMS_PIPE because “pipe” is a common word when the topic is Inter-process communication.

MySQL and MariaDB don’t have DBMS_PIPE, but it’s possible to write it as an SQL stored procedure. I did, while working for Hewlett-Packard. Before saying “here it is”, I want to share the agony that we endured when deciding what, at the lowest level, would be the best mechanism.

The criteria are:

size -- how many bits can a message contain?
isolation -- how many conversations can take place simultaneously?
privilege -- how specific is the authorization, if any? is eavesdropping easy?
danger -- what are the chances of fouling up some other process?
reliability -- can messages be delayed or destroyed?
utility -- can it be used at any time regardless of what's gone before?

There is no “overhead” criterion because messaging should be rare.

These are the 5 candidate mechanisms.

1. Tables.
Session#1 INSERTs to a table, Session#2 SELECTs from the table.
Rating: size=good, isolation=good, privilege=good, danger=low, reliability=good.
But utility=terrible. First: with some storage engines you have to COMMIT in order to send and this might not be a time that you want to COMMIT. Second and more important: there’s a bit of fine print in the MySQL Reference Manual:

A stored function acquires table locks before executing, to avoid inconsistency in the binary log due to mismatch of the order in which statements execute and when they appear in the log.

Think about it. It means that you can’t read a message that’s sent by a function while the function is running. And you can’t work around that by writing the messaging code in a stored procedure — there’s no guarantee that the stored procedure won’t be called from a function.

2. Files.
Session#1 uses SELECT … INTO OUTFILE. Session#2 says LOAD_FILE.
(We don’t consider LOAD DATA because it won’t work in stored procedures.)
Rating: size=good, isolation=good, privilege=good, utility=good.
But danger=high, reliability=bad. The problem is that you can’t overwrite a file, so the messages would pile up indefinitely.

3. information_schema.processlist.
Session#1 says SELECT SLEEP(0.5),’message’. Session#2 says SELECT from information_schema.processlist.
Rating: size=bad, isolation=bad, privilege=good, danger=low, utility=bad.
This is okay for short messages if you’re not worried about eavesdropping. But notice that the message can only be a literal, like ‘message’. It cannot always be a variable, because then it’s dynamic SQL, and dynamic SQL is illegal in functions, and now you’ve got the same problem as with mechanism number 1.

4. GET_LOCK.
Session#1 says GET_LOCK(). Session#2 says IS_USED_LOCK().
Rating: size=bad, isolation=good, privilege=good, danger=low, utility=good.
Technically reliability=low because the message disappears when the server goes down, but in many situations that would actually be a good thing. The rating “size=bad” is easy to understand: effectively there’s only one bit of information (yes/no) that Session#2 is getting by checking IS_USED_LOCK(). However, one-bit signals are great for lots of applications so this would still fit in a toolkit if it weren’t for The Great GET_LOCK Showstopper. Namely, you can only have one GET_LOCK at a time.

Now for the good news. Multiple GET_LOCK invocations are on their way. The person to thank (and I say “thank” because this was a contribution done to the whole community) is Konstantin Osipov, who wrote a patch and a blog post — “MySQL: multiple user level locks per connection”. As I understand matters, this was a basis for the code that is coming in a future MySQL version and is now in the MySQL 5.7.5 manual. Konstantin Osipov, by the way, nowadays writes for the Tarantool NoSQL DBMS and Lua application server, to which I am pleased to contribute in small ways.

5. System variables.
Session#1 says SET @@variable_name = ‘message’. Session#2 says target = @@variable_name.
Rating: size=bad, isolation=good, privilege=good, danger=high, utility=good.
The system variable must be a string, must be dynamically writable, and must not change the server’s behaviour if you write a bad value. Only one item does all that: @@init_connect. It’s actually easy to ensure that changes to @@init_connect won’t affect its official purpose — just put the message /* inside a comment */. However, I still rate it as danger=high because anybody could overwrite the message inadvertently.

And the winner, as far as we’re concerned, is … #5 System variables. Remember, Ocelot is supplying a debugger for MySQL/MariaDB routines. It would be a pretty poor debugger that used a message mechanism that wouldn’t work with functions, so mechanism#1 and mechanism#3 are out. The GET_LOCK of mechanism#4 is in fact used by a different debugger, but in my opinion that means it’s hard to have two debugger sessions on the same server, or to run without pausing after every statement. So our implementation involves setting @@init_connect.

If you want to see our implementation, here is how (on Linux).

Download, install, and start ocelotgui. The instructions are in the
README.md file at https://github.com/ocelot-inc/ocelotgui (just scroll past the list of files). Connect to a MySQL/MariaDB server as a user with privileges to create databases and tables, and execute routines. Then type in, on the statement widget

$INSTALL

After this, you actually won’t need ocelotgui any more. So, although I think the real “demo” would be to use the debugger now that you’ve installed it, I’ll show how to use pipes with the mysql client instead.

Start a shell. Start mysql. You need the SUPER privilege, and the EXECUTE privilege for routines in the xxxmdbug database. Send a message.

MariaDB [(none)]> call xxxmdbug.dbms_pipe_send('channel#1','message');
Query OK, 0 rows affected (0.00 sec)

Start a second shell. Receive a message.

MariaDB [(none)]> call xxxmdbug.dbms_pipe_receive
    -> ('channel#1',1,@message_part_1,@message_part_2);
Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> select @message_part_1;
+-----------------+
| @message_part_1 |
+-----------------+
| message         |
+-----------------+
1 row in set (0.00 sec)

You can see how it’s implemented by saying

SELECT *
FROM information_schema.routines
WHERE routine_schema = 'xxxmdbug'
AND routine_name like 'dbms_pipe_%';

The dbms_pipe_send and dbms_pipe_receive routines are GPL and copyrighted by Hewlett-Packard.

I might not moderate comments on this blog while on vacation. Tomorrow I leave for Reykjavik, Amsterdam and London.

MYSQL_HISTFILE and .mysql_history

The MySQL manual says:
“mysql Logging
On Unix, the mysql client logs statements executed interactively to a history file. By default, this file is named .mysql_history in your home directory. To specify a different file, set the value of the MYSQL_HISTFILE environment variable.”
The trouble with that is: it doesn’t tell you what you don’t need to know. So I’ll tell you.

Heritage

The history-file concept that MySQL and MariaDB are following is indeed “on Unix” and specifically is like the GNU History Library. There is a dependence on external libraries, Readline or EditLine/libedit, depending on the licence. The similarity to (say) the Bash shell history is clear when one compares some names and some actions.

Thing Bash MySQL
Default file name $HOME/.bash_history $HOME/.mysql_history
Environment variable for name HISTFILE MYSQL_HISTFILE
Environment variable for ignoring HISTIGNORE MYSQL_HISTIGNORE

This explains why the history file is hidden, why the facility doesn’t work on Windows, and why there’s no orthogonality (that is, one can set an environment variable but one cannot set a configuration-file or command-line parameter).

Is it really a log?

The manual says .mysql_history is a log, but that’s a bit misleading, since it’s not much like the “logs” that the server produces (the transaction log, the slow query log, etc.). There are no timestamps; there is a large class of SQL statements that are ignored such as ones that come via file input; there is no way to know which statements succeeded and which ones ended with errors.

So the only thing .mysql_history is truly good for is making some edits easier. If you like to say “repeat last statements”, or some sophisticated variant that requires knowing what earlier statements were, then the mysql client — actually, the Readline or EditLine library — needs the history for that.

Also it’s occasionally useful for answering questions like “How did I get myself into this mess?” when odd things start to happen, and especially so when you have to write a bug report and your friendly MySQL or SkySQL or Percona support worker enquires: “Could you please show us the things you were doing just before the, er, anomaly occurred.” A tee file would be better, but sometimes a histfile might do.

Format

Not all versions of the mysql client will write the .mysql_history file in the same way. That is why sometimes you will see “X Y” if you print the file, but sometimes you will see “X\040Y”. Of course 040 is the octal for the ASCII for the space character, but this has led to bug reports such as Bug#14634386 – “History written by Libedit is not readable by Readline” and Bug#16633124 – Compatibility issue with mysql history (“\040” instead of space). These are Oracle internal bug numbers, so don’t bother trying to find them on bugs.mysql.com. They’re unimportant bugs, since they should not cause problems for editing.

A more severe matter is that multi-line statements are logged twice. First there is a record of each line. Then there is a record of the statement as a whole.

Stopping

Although disk space is cheap, people do worry sometimes when they realize their computer might be writing secret stuff in a file that burglars could see. The easiest prevention is to start the mysql client with –batch or an option that implies –batch. However, those options all do something else, as well as stopping writing to .mysql_history, so they’re not “well targeted” solutions.

There’s a recommendation to set the MYSQL_HISTFILE environment variable to say “/dev/null”, or to link the .mysql_history file to /dev/null. Loosely one might think: oh, statements are being sent to /dev/null and therefore they won’t get saved. More accurately, they might not be sent anywhere at all, as one can see from these snippets of mysql.cc code (Copyright (c) 2000, 2014, Oracle and/or its affiliates):

  if (my_readlink(link_name, histfile, 0) == 0 &&
    strncmp(link_name, "/dev/null", 10) == 0)
  {
    my_free(histfile);
    histfile= 0;
  }
  ...
  if (histfile && strncmp(histfile, "/dev/null", 10) == 0)
    histfile= NULL;

Limiting

Despite the similarities to Bash, there is no MYSQL_HISTSIZE environment variable to be like Bash’s HISTSIZE. With some unreliable fiddling, one can try to tell the underlying library what the maximum history size should be, that is, how many lines are allowed in the history file before starting to discard. For example, instead of saying “/usr/bin/mysql”, say

rlwrap -a --histsize=5 /usr/bin/mysql

Usually this particular trick is best avoided because of its side effects.

MYSQL_HISTIGNORE

In MySQL’s mysql, there’s a newish environment variable:

export MYSQL_HISTIGNORE=pattern

It’s not in my copy of MariaDB’s mysql, but I suppose we can expect it soon. The idea is: if a statement matches the pattern, then it doesn’t get saved in the history file. But there are some flaws …

  • There are only two wildcards for the pattern: “?” meaning “one occurrence of any character”, and “*” meaning “zero or more occurrences of any character”. There is no regular-expression variation, and the choice is un-SQL-ish (in SQL one would use _ and % not ? and *).
  • A certain pattern is always in effect so that passwords won’t be saved. A fine default, but it’s regrettable that there’s no way to remove it.
  • It appears to be impossible to escape certain characters. For example, if I wanted the pattern to include a colon, I couldn’t say “export MYSQL_HISTIGNORE=a\:b”.

ocelotgui

The objective of Ocelot Computer Services Inc. is that the ocelotgui program will do anything that the mysql client can do, and one can dream that there are ways to do even more (porting to Windows, recognizing history size, matching more expressions, etc.) along with the advantages that a GUI automatically has over a command-line editor. But at the moment we don’t have a histfile, we only support tee. I think that’s the last big feature that ocelotgui needs before we call it ‘feature complete’, which is why I’ve been looking at .mysql_history details.

The latest ocelotgui release is 0.6.0, source and binaries are on github.com/ocelot-inc/ocelotgui.

Connecting to MySQL or MariaDB with sockets on Linux

The MySQL manual says

–socket=file_name, -S file_name … On Unix, the name of the Unix socket file to use, for connections made using a named pipe to a local server.
The default Unix socket file name is /tmp/mysql.sock.

which might surprise folks who’ve had to contend with the error message
“Can’t connect to local MySQL server through socket ‘[something-other-than-/tmp/mysql.sock]'”.

I’ll try to explain here why the name is often something quite different, how to know what the MySQL server is really listening for, what the fixes are for either users or application developers, and why it still matters.

Why the name is not always /tmp/mysql.sock

First, the Linux Foundation publishes a document “Filesystem Hierarchy Standard”. Version 2.3 says in the section about the /var/run directory: “Programs that maintain transient UNIX-domain sockets must place them in this directory.” Unfortunately Version 3 says something a bit different
in the section about the /run directory: “System programs that maintain transient UNIX-domain sockets must place them in this directory or an appropriate subdirectory as outlined above.” But version 3 also says: “In general, programs may continue to use /var/run to fulfill the requirements set out for /run for the purposes of backwards compatibility.” so /var/run is still standard.

Second, there’s a bit of fine print tucked away in an appendix of the MySQL manual: “For some distribution formats, the directory might be different, such as /var/lib/mysql for RPMs.” That’s a vague way of saying it’s determined at source-installation time by -DINSTALL_LAYOUT={STANDALONE|RPM|SVR4|DEB} which in effect causes this:

SET(INSTALL_UNIX_ADDRDIR_STANDALONE     "/tmp/mysql.sock")
SET(INSTALL_UNIX_ADDRDIR_RPM            "/var/lib/mysql/mysql.sock")
SET(INSTALL_UNIX_ADDRDIR_DEB            "/var/run/mysqld/mysqld.sock")
SET(INSTALL_UNIX_ADDRDIR_SVR            "/tmp/mysql.sock")

but anybody can override that by setting MYSQL_UNIX_ADDR to something else.

And so different machines have different defaults. The following comes from notes I made long ago so may not be the latest information:

Non-Linux e.g. FreeBSD or Solaris: /tmp/mysql.sock
Debian-based e.g. Ubuntu, and archlinux: /var/run/mysqld/mysqld.sock
SUSE (after v11.2): /var/run/mysql/mysql.sock
Red Hat, and SUSE (before v11.2): /var/lib/mysql/mysql.sock
archlinux (very old versions): /tmp/mysqld.sock

Sometimes you can find out what the real default on your machine was,
by typing mysql_config –socket.

Finding what the server is really listening for

If you’re not the one who started the server, or the starting has disappeared in the mists of memory, there are various flawed ways to find what socket
it’s really opened.

Possible Method #1: netstat -lpn | grep mysqld
Example:

$ netstat -lpn | grep mysqld
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp6 0 0 :::3306 :::*  LISTEN 4436/mysqld     
unix 2 [ ACC ] STREAM LISTENING 101567 4436/mysqld ./mysql_sock

… This method’s flaw is that, as the warning says, you won’t see everything unless you’re root. Also the “grep mysqld” filtering means it’s assumed the server’s name is mysqld.

Possible Method #2: find directory-list -name “mysql*.sock”
Example:

$ find /tmp /var/lib/mysql /var/run/mysqld -name "mysql*.sock"
find: 'var/lib/mysql': Permission denied
/var/run/mysqld/mysqld.sock

… This method’s flaw is that you have to guess in advance what directories the socket might be on.

Possible Method #3: ask via TCP
Example:

mysql -h 127.0.0.1 -e "select @@socket"
+-----------------------------+
| /var/run/mysqld/mysqld.sock |
+-----------------------------+

… This method’s flaw is in the assumption that the port is the default (3306), that the local host is accessible (I think it’s theoretically possible that it won’t be), and that everyone has the privilege to access MySQL this way.

Possible Method #4: look at the running process
Example (after finding with ps -A that mysqld process ID = 3938):

$ ps -fp 3938
UID   PID  PPID  C STIME TTY    TIME CMD
1000  3938 18201 0 09:58 pts/2  00:00:00 bin/mysqld --socket=./sock

… This method’s flaw is that it only works if –socket was specified explicitly on a mysqld command line, overriding the default configuration.

What a user can do

Once you know that the server is listening on socket X, you can redirect with ln, for example

ln -s /var/run/mysqld/mysqld.sock /tmp/mysql.sock

… The flaw this time is that it’s trying to solve a MySQL/MariaDB problem with a Linux workaround. The only reason that I mention it is that I’ve seen it recommended on Internet forums, so I guess some people see advantages here, which I don’t.

A better answer, then, would be: tell the client program what the socket is. On a client that follows the MySQL guidelines (such as the mysql client itself, or ocelotgui), this would mean setting the MYSQL_UNIX_PORT environment variable, or starting with –socket=X on the command line, or changing one of the configuration files such as ~/.my.cnf to add the line socket = X. Beware that the socket location might also be stored in other places, such as /etc/mysql/debian.cnf or php.ini.

What a client program should do

People who write client programs shouldn’t pass this socket hassle on to the user, if they don’t have to.

The mysql client tries to make things easier by, in effect, hard-coding a socket name so it’s the same as what the server got installed with. That’s a good try. My only criticism is that mysql –help will say that the socket parameter has “(No default value)” when, in a sense, it does.

I’ve been told that another client-GUI product tries to make things easier by automatically going with TCP. It’s difficult to criticize this — I’ve always thought that MySQL made trouble for itself by deciding that even when a user says “host=localhost” we should ignore the usual implication that the user is trying to use TCP and try to find a socket anyway — but here at Ocelot we try to behave the way the mysql client behaves, flawed or not.

So ocelotgui will try to make things easier by hard-coding the most likely paths. That is, if the user doesn’t specify a socket but the host is either default or ‘localhost’, then ocelotgui will try to connect via /tmp/mysql.sock, and if that fails then /var/lib/mysql/mysql.sock, and if that fails then /var/run/mysqld/mysqld.sock. That’s just the plan; currently ocelotgui isn’t doing it.

Is it worth caring about?

Couldn’t the Gordian socket be cut by saying protocol=TCP? What’s the point of finding the socket, anyway?

The answer usually is performance. The transfer rate with sockets is faster than with TCP because why do I need to worry about protocol when I’m talking to myself? I haven’t done a benchmark myself but Brandon D. Northcutt did and it does appear that sockets are worth the trouble if huge quantities of data are coming through.

However, really the TCP-versus-socket difference is a Linux problem, so why isn’t it being solved by the Linux crowd rather than the MySQL or MariaDB crowds? Well, unsurprisingly, that question has been asked before. So there is a fix, or more precisely there is a Google patch, which turns a TCP/IP connection into a socket connection if the target is localhost. There was an article about it on lwn.net in 2012. What happened after that, I don’t know.

By the way

The current state of ocelotgui (our open-source GUI client for MySQL and MariaDB) is still alpha, but it’s stable and good-looking now. Some screenshots are here. The latest release is 0.7, here.

The ocelotgui debugger

I have merged a debugger for MySQL/MariaDB stored procedures and functions into our GUI client and posted the source and binaries on github. It allows breakpoint, clear, continue, next, skip, step, tbreakpoint, and variable displays. Features which are rare or missing in other debuggers include:
its current platform is Linux;
it allows breakpoints on functions which are invoked within SQL statements;
it never changes existing stored procedures or functions;
it is integrated with a general GUI client;
it allows commands like gdb and allows menu items / shortcut keys like ddd;
it is available on github as C++ source with GPL licence.

It’s alpha and it’s fragile but it works. Here is a demo.

Start the client and connect to a running server, as root. Actually the required privileges are merely for creation of certain objects and SUPER, but I’m making this simple.
connection_dialog_box

Type the statement: DELIMITER // (For all typed-in instructions, execute by typing carriage-return after the delimiter or by clicking Run|Execute.)delimiter_1

Create a function fdemo which updates and returns a counter.

CREATE FUNCTION fdemo () RETURNS INT
  BEGIN
    IF @counter IS NULL THEN SET @counter = 0; END IF;
    SET @counter = @counter + 1;
    RETURN @counter;
    END
  //


fdemo

Create a procedure pdemo which contains a loop of “INSERT INTO tdemo VALUES (fdemo())” statements.

CREATE PROCEDURE pdemo ()
  BEGIN
    DECLARE i INT;
    CREATE TABLE IF NOT EXISTS tdemo (s1 INT);
    SET i = 0;
    WHILE i < 100 DO
      INSERT INTO tdemo VALUES (fdemo());
      SET i = i + 1;
      END WHILE;
    SET i = i + 1;
    END
  //


pdemo

Type the statement: DELIMITER ;
delimiter_2

Type the statement: $install;
install

Type the statement: $setup fdemo, pdemo;
setup

Type the statement: $debug pdemo;
After this statement is executed a tabbed widget appears. The first line of pdemo is highlighted. There is always a breakpoint before the first line.
debug

Click on the Debug menu to see what options are available. Debugger instructions may be done via the menu, via Alt keys, or via command-line statements. For example to enter a Next instruction one may now click Debug|Next, or type Alt+3, or type the statement: "$next;".
menu

Enter a Next instruction repeatedly, watching how the highlighted line changes, until the INSERT line is highlighted.
next

Enter a Step instruction. After this is executed, the function now comes into the foreground.
step

Enter three more Step (or Next) instructions. After these are executed, the procedure now is in the foreground again.
three_steps

Set a breakpoint for the final executable line of the procedure. This can be done by clicking over the line number, or by moving the cursor to the line and then clicking Debug|Breakpoint, or by typing the statement "$breakpoint pdemo 9;". After this is executed, line 9 of pdemo has a small red mark showing that it has a breakpoint.
breakpoint

Enter a Continue instruction. After this is executed, watch the debugger highlight hop around 100 times as it moves through the loop, and finally settle on line 9.
continue

Type the statements "$refresh variables;" and "select old_value, value from xxxmdbug.variables;". After this is executed, the result-set widget will contain the old value of variable i (99) and the current value (100), This is the way that one examines DECLAREd variables (there are other statements for user variables and server variables).
select

Enter an $exit instruction. This stops the debugging session, so the effects of the earlier $debug instruction are cancelled. The effects of the earlier $install and $setup instructions are not cancelled, so they will not have to be repeated for the next debugging session involving pdemo and fdemo.
exit

Thus ends our demo. If you would like to confirm it: an introduction for how ocelotgui in general works is in the earlier blog post An open-source MySQL/MariaDB client on Linux" and the source + executable download files for version 0.3.0 are on github.com/ocelot-inc/ocelotgui.

If there is anything Ocelot can do to make your next debugging trip more enjoyable, please leave a comment. If you have private thoughts, please write to pgulutzan at ocelot.ca.

Stored Procedures: critiques and defences

I’ve gathered the main critiques of MySQL / MariaDB stored procedures, and will try some defences.

Monoglots

The critique:

SQL/PSM is the standard 4GL and it was the work of Andrew Eisenberg. Andy based it on ADA. Unless you are military, you have never seen ADA. Be grateful it is dead.
— Joe Celko, reminiscing about his days on the SQL standard committee

Actually I believe Mr Celko likes SQL/PSM, which is the standard that MySQL and MariaDB follow. Here at last is your chance to see some Ada code, and compare with MySQL code …

Ada
MySQL
declare a: integer;
begin

  a := 0;
  loop
    a := a + 100;
    exit when a = 200;
  end loop;
  if a /= 300 then
    a := 400;
  else
    a := 500;
  end if;
  case a is
    when 600 a := 700;
    when others a := 800;
  end case;
end;

begin
  declare a integer;
  set a = 0;
  x:loop
    set a = a + 100;
    if a = 200 then leave x; end if;
  end loop;
  if a <> 300 then
    set a = 400;
  else
    set a = 500;
  end if;
  case a
    when 600 then set a = 700;
    else set a = 800;
  end case;
end;

The trouble is, Ada isn’t even among the top ten programming languages according to O’Reilly’s count of book sales, and the choice of only 1.6% of programmers according to the Language Popularity Index. So programmers aren’t used to seeing syntax like the above. And I, who am just as good a psychologist as I am an Ada programmer, explain: they’re not familiar with it, so they call it ugly.

The defence:

Look at the first alternative: UDFs. Percona posts point out that UDFs can be faster. of course.
But look at the example in Managing & Using MySQL Second Edition. Do you understand it without reading the long commentary?

Now look at the second alternative: External stored procedures. There’s a worklog task for stored procedures in other languages, of course.
But it’s moribund. An implementation on launchpad, “External Language Stored Procedures for MySQL”, exists but doesn’t seem to have been updated since 2009. Are you going to go to the trouble of downloading and adjusting it?

If you answered yes to either of the above questions, please check the length of your nose now.

Migrants

The critique:

“Migrating a stored procedure is much more complex than rewriting one because the relevant standards of various vendors differ greatly. In this situation, users have no choice but stick to one database vendor rigidly. There is not any room left for users to beat down the price if database vendors overcharge them on upgrading their servers, storages, and user license agreements.”
— Couchbase, “Alternative to Difficult Stored Procedures in Big Data Computation”

The defence: That’s not true. There’s only one relevant standard, there are multiple migration paths, and there’s no charge.

The other DBMSs that follow the ANSI/ISO SQL/PSM standard are: DB2, Mimer, SolidDB, Sybase iAnywhere (Advantage Database Server).

Selective quotes from a case study about “experiences with porting stored procedures written in SQL PL (DB2) to MySQL”:

To find out how “standardized” the MySQL implementation of the SQL/PSM specification really is, we tried to port all our DB2 stored procedures to MySQL. First, we ran into DB2 non-standard extensions of SQL PL that were used in our existing procedures. Actually, there was only one such extension [SQLCODE] … So we first rewrote our DB2 procedures –in DB2–, making sure not to use SQLCODE anymore. Instead we had to introduce the corresponding continue handler(s), thereby introducing an additional “flag” variable. This worked out fine: the new procedures ran perfectly in DB2. … Now we observed some syntactic differences, luckily not in the body of the procedures but in the optional clauses specified just before the body … The only option clause which we could keep was the “LANGUAGE SQL”: required in DB2, optional with MySQL. After these small modifications, the CREATE PROCEDURE statements from DB2 worked on MySQL!
But did they run properly? To verify this, we had to create identical tables on both systems, have the same test data in both, and migrate then run the unit test programs from DB2 on MySQL. And indeed: it turned out that MySQL worked exactly as expected!
— Peter Vanroose (ABIS), MySQL: stored procedures and SQL/PSM

As for Oracle, its PL/SQL language is not standard, but read the Oracle documentation: “PL/SQL is based on the programming language Ada.” Due to the common Ada heritage that PL/SQL and SQL/PSM share, I’ve been able to convert Oracle stored procedures to MySQL stored procedures at a rate of about a line per minute. I expect I’d achieve the same speed with NuoDB (NuoDB architect Jim Starkey once assured me that their stored procedures will follow PL/SQL), and with PostgreSQL (PostgreSQL’s stored procedures are deliberately Oracle-like). There’s also an SQL/PSM add-on for PostgreSQL, although I don’t think it’s popular.

Finally, there are some commercial tools that try to automate the migration process. I think the one from ispirer is the best known.

Toolmakers

The critique:

Your stored routines are not likely to integrate well with your IDE. … [SQL-oriented GUI tools] do not integrate (well?) with your source control … While engineers are keen on writing unit tests for every class and method they create, they are less keen on doing the same for stored routines. … MySQL routines have [no?] built in debugging capabilities …
— Shlomi Noach, “Why delegating code to MySQL Stored Routines is poor engineering practice”

The defence:

These are important considerations, but strictly speaking they’re about what people should have written to go with stored procedures, not about stored procedures themselves. It’s not the wiener’s fault if there’s no mustard.

Take versioning. For Oracle it can be done with a client GUI named Oracle SQL Developer. For MySQL there are open-source utilities like dbv. Either way, it’s not a server task.

Take unit tests. Hmm, okay, you can’t, but some people might be satisfied with MyTAP or maybe even STK/Unit. And did you notice in the quote above that one can make unit tests with DB2 tools and run them in MySQL?

Take debuggers. I’m aware of five, although I haven’t tried any of them, except the one that I wrote myself. I am going to integrate it with ocelotgui.

Could-do-betters

The critique:

Stored Procedures … do not perform very well
— Percona employee, How can we bring query to the data”

The defence:

Supposedly there are more than 12 million installations and that means there are 12 million which are, as the old saying goes, “not Facebook”. I haven’t interviewed them all (only Monty Widenius can do that), but the ones that I’ve talked to are concerned more about the effects of badly-written queries or database design mistakes.

Grapevines

The critique:

It’s easy to find articles and blog posts with titles like “Why I hate stored procedures”, “Goodbye Stored procedures, It’s the time to move on”, “Stored Procedures – End of days”, “Why I avoid stored procedures and you should too”.

The defence:

Usually those articles are about SQL Server. They are by people who got fed up with Microsoft’s T-SQL language, or got enamoured of ORM (Object Relational Management). Inevitably some of the bad vibes get picked up in the MySQL / MariaDB community due to morphic resonance. Filter out the Microsoft material, and it gets harder to find such articles or blog posts, and it gets easier to find articles with more walking-in-the-sunshine outlook.

Besides, some of the alternatives might just be fads. As Winston Churchill said:

Many forms of Database Code have been tried, and will be tried in this world of sin and woe. No one pretends that Stored Procedures are perfect or all-wise. Indeed, it has been said that Stored Procedures are the worst form of Database Code except all those other forms that have been tried from time to time.
— Winston Churchill, Hansard, November 11 1947

Okay, he didn’t actually say Database Code or Stored Procedures, he said Government and Democracy. Close enough.

Sloths

The critique:

“The following features have been added to MySQL 5.6: … GET DIAGNOSTICS”
“The following features have been added to MySQL 5.7: … GET STACKED DIAGNOSTICS”
— MySQL user manual for 5.6 and 5.7

All right, you have to read between the lines a bit to see the critique here.
What MySQL/Oracle is implying, by omission, is: progress is glacial. As I said in a previous post I think GET DIAGNOSTICS is good to see. But the first MySQL 5.6 releases were four years ago. So, one significant new feature every two years.

What about MariaDB? Well, I did see a new trick recently in a blog post by Federico Razzoli: How MariaDB makes Stored Procedures usable, about using cursors even for SHOW statements. And I suppose that MariaDB’s “compound statements” feature could be looked on as at least a feature that’s closely related to stored procedures. Still, small potatoes.

The defence:

* The current implementation has pretty well all the significant matters required by the standard.
* There has been no official announcement that any significant stored procedure feature is deprecated.

Next page →
← Previous page