Month: February 2016


Comments in SQL Statements

First I will say what the comment syntax is for various vendors’ dialects. Then I will get specific about some matters that specifically affect MySQL or MariaDB.

Syntax Table

DBMS –… /*…*/ #… Nesting Hints
Standard YES YES NO YES NO
Oracle 12c YES YES NO NO YES
DB2 YES YES NO YES NO

SQL Server YES YES NO YES NO

PostgreSQL YES YES NO YES NO

MySQL/MariaDB YES(99%) YES YES NO YES

The first column is for the type of DBMS. “Standard” is the ISO/IEC SQL standard document. For the others, just click on the DBMS name to see the relevant documentation. The standard, incidentally, clarifies that strings of comments are to be treated as a newline, so if you hear somebody say “comments are ignored”, that’s slightly wrong.

The first column is for comments that begin with “–” (two hyphen-minus signs), what the standard document calls “simple comments”, the ones that look like this:

SELECT * FROM t; -- SIMPLE COMMENT

Everyone supports simple comments, the only problem with MySQL/MariaDB is their insistence that the — must be followed by a space. I’ve had it explained to me that otherwise the parser had problems.

The second column is for comments enclosed by /* and */, what the standard document calls “bracketed comments”, the ones that look like this:

SELECT * FROM t; /* BRACKETED COMMENT */

According to the standard document, bracketed comments are not mandatory, they are optional feature T351. However, it would be surprising to find a modern SQL implementation that doesn’t support them.

The third column is for comments that begin with “#” (what Unicode calls Number Sign but an American dictionary allows for the word Octothorpe ), the ones that look like this:

SELECT * FROM t; # OCTOTHORPE COMMENT

Notice how, in every row but the MySQL/MariaDB row, the key word is NO? In fact I’ve only encountered one other SQL DBMS that is octothorpophiliac: mSQL. Old-timers may recall that mSQL from Hughes Technologies was, for a while, an inspiration for one of MySQL’s founders. Anyway, it’s unnecessary because simple comments do the job just as well.

The fourth column is for nesting, that is, putting bracketed comments within bracketed comments, that look like this:

SELECT * FROM t; /* OUTER /* INNER */ COMMENT */

I’ve often been irritated that I can’t nest in C, so I approve of the DBMSs that support this standard requirement. But I never saw it as important in my MySQL-architect days. There were a few what I guess could be categorized
as “feature requests” (here and here and here) and I approved of my colleagues’ clear responses, it’s low priority.

The final column is for hints. A hint is a bit of syntax that the server might ignore, signalled by an extra character or two in a bracketed comment, like this:

SELECT /*+ HINT COMMENT */ * FROM t;

Typically a hint is a suggestion for an optimizer, like “use index X instead of the default”. It’s found in Oracle; it’s not found in PostgreSQL and some PostgreSQL folks don’t like it; but it’s found in EnterpriseDB’s “advanced PostgreSQL”; and of course it’s found in MySQL and MariaDB. A newish point is that MariaDB has an extra signal “/*M!###### MariaDB-specific code */” that MySQL won’t recognize, which is a good thing since the optimizers have diverged somewhat.

Passing comments to the server

In the MySQL 5.7 manual we see the client has an interesting option:

–comments, -c
Whether to preserve comments in statements sent to the server. The default is –skip-comments (discard comments), enable with –comments (preserve comments).

and a good question is: huh? Surely we should preserve comments, especially in stored procedures, no? Well, the obvious answer is that the parser has to spend time skipping over them, but I doubt that the effect is significant nowadays. The better answer is merely that behaviour changes are serious so let’s leave this up to the users. Our GUI client supports –comments too, which is no surprise since we support all mysql-client options that make sense in a GUI.

But what happens if it’s hard to tell where comments belong? Buried in the source download is a file named mysql-test/t/mysql_comments.sql which is checking these questions:
* Ignore comments outside statements, on separate lines?
* Ignore comments at the end of statements, on the same line but after the semicolon?
* Ignore comments inside CREATE PROCEDURE/FUNCTION/TRIGGER/EVENT, but not in the body?
The test should be updated now that compound statements in MariaDB don’t have to be inside CREATE PROCEDURE/FUNCTION/TRIGGER/EVENT.

Therefore

Steve McConnell’s “Code Complete” book advises: “A common guideline for Java and C++ that arises from a similar motivation is to use // synax for single-line comments and /* … */ syntax for larger comments.”

I guess that the equivalent for SQL purposes would be to say: use — for single-line comments and /* … */ for longer ones. But don’t use #, and be wary with standalone or endline comments, and turn –comments on.

Update

In an earlier blog post I predicted that ocelotgui, our GUI client for MySQL and MariaDB, would be beta in February. Now it’s February 29, so I have to modify that to: “any day now (watch this blog for updates or click Watch on the github project page)”. The latest feature additions are in the downloadable source code, by the way, but not in the binary release.

Privileges in MySQL and MariaDB: The Weed Of Crime Bears Bitter Fruit

Let’s look at how well MySQL and MariaDB support privileges (part of “access control” in standard terms), compared to other DBMSs, and consequences thereof.

Count the Privileges

I go to the DBMS manuals (here and starting here and here) and I count the privileges. This is like judging a town by the number of traffic lights it claims to have, but I’m trying to get an indicator for how granular the DBMS’s “authorization” is.

Number of privileges listed in the manuals

MySQL/MariaDB  Oracle 12c     DB2 9.7   SQL Server 2014
31             240            52        124

Pretty small number in the first column, eh? There are historical reasons that MySQL was reluctant to add new privileges, illustrated by Bug#43730.

What is the effect of having a limited number of privileges? Sometimes the same privilege has to be used for two different things. For example, the SUPER privilege is good for “CHANGE MASTER TO, KILL, PURGE BINARY LOGS, SET GLOBAL, and mysqladmin debug command”, while the PROCESS privilege is what you need for SHOW PROCESSLIST — but also for selecting from information_schema.innodb_sys_tables.

Why is this a flaw? If administrators want to allow access to a goose, they are forced to allow access to a gander as well — even when the gander is none of the grantee’s business. As an example that affected us: to make Ocelot’s stored-procedure debugger work, we have to be able to set values in a single global variable, which is impossible without the SUPER privilege, therefore to allow people to use the debugger you have to allow them to purge binary logs too.

Standard Requirements

The SQL standard mentions 9 privileges: INSERT UPDATE DELETE SELECT REFERENCES USAGE UNDER TRIGGER EXECUTE. MySQL and MariaDB do a fair job of handling them:

INSERT: perfect support, including column-level grants.

UPDATE: perfect support, including column-level grants.

DELETE: perfect support.

SELECT: perfect support, including column-level grants.

REFERENCES: perfect support, not including column-level grants. I think this is not well known yet. Deep in the caverns of the manual are the words: “The REFERENCES privilege is unused before MySQL 5.7.6. As of 5.7.6, creation of a foreign key constraint requires the REFERENCES privilege for the parent table.” Kudos to MySQL. The MariaDB manual, on the other hand, still says the REFERENCES privilege is “unused”. For some background about this new feature, click the high-level architecture tab in the old worklog task Foreign keys: reference privilege.

USAGE: no support. In standard SQL, USAGE is for access to domains or UDTs or sequence generators or transliterations or character sets or collations. In MySQL/MariaDB, USAGE is the minimum privilege — you can log in, that’s all. So USAGE is unsupported, but unimportant.

UNDER: no relevance. This is for optional UDT features.

TRIGGER: perfect support.

EXECUTE: perfect support.

Looking at the down side, MySQL and MariaDB don’t allow for the standard GRANT OPTION. Yes, they have a GRANT OPTION privilege, but that’s not standard — what’s needed (and what’s supported by the other serious DBMSs) is an option to grant a particular privilege, not a privilege to grant any privileges.

Packaging

The objection about having hundreds of possible privileges is: it’s hard to keep track of them, or even remember what they are. This should be a solved problem: allow a package of privileges, in other words support CREATE ROLE. This time the kudos go to MariaDB which has allowed roles for over two years. But what if you have MySQL and it’s tedious to grant multiple times?

It’s still simple. You either make a script which contains a bunch of GRANT statements, or you create a stored procedure. Certainly I’d recommend a stored procedure, because it will be “inside the database”, and therefore subject to tracking. Scripts are a tad more dicey security-wise, since changing or deleting files is a process outside the DBMS’s control.

After all, doing grants via an insecure mechanism would kinda mess up the idea of using grants for extra security.

information_schema

There is a standard and reasonable way to get at metadata: you can see the information_schema table, but you won’t see rows for database objects that you don’t have access to.

MySQL and MariaDB follow this plan, but there is a major exception: InnoDB. Consider INNODB_SYS_TABLES, which has information about other tables. Of course this table should not exist at all (the sensible place is information_schema.TABLES), but the more troubling fact is that the relevant privilege is not “whether you have access to the other tables”, but — wow — the PROCESS privilege. And to top it off, in MySQL (though not MariaDB) instead of an empty table you get an error message.

Statement: select * from information_schema.innodb_sys_tables;

Response from MySQL 5.7: ERROR 1227 (42000): Access denied; you need (at least one of) the PROCESS privilege(s) for this operation

We Live In A Just World

Therefore, here is how I can crash my MySQL 5.7 server. Provided, ironically, that I do not have any privileges on any database objects. In other words, I’ve logged in as a user who has been granted the minimum:

GRANT USAGE ON *.* to ‘peter’@’localhost’;

The important prerequisites are: MySQL 5.7.9 compiled from source, a new installation, and an unprivileged user. It doesn’t seem to happen under any other circumstances. So this is not a vulnerability alert. I like to show it, though, as an illustration of the punishment that awaits violators of the precepts of privileges.

As I indicated, I’ve logged in, and the database is empty. Now I say:

SELECT A.NAME, B.NAME
  FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES A
  LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_TABLES B
  ON A.SPACE = B.SPACE;

On the client I see
ERROR 2013 (HY000): Lost connection to MySQL server during query

On the server I see
mysqld: /home/pgulutzan/Downloads/mysql-5.7.9/sql/sql_error.cc:444: void Diagnostics_area::set_error_status(uint, const char*, const char*): Assertion `! is_set() || m_can_overwrite_status’ failed.
18:37:12 UTC – mysqld got signal 6 ;

As this is a crash and something is definitely wrong, the information collection process might fail.
etc.

Those who live in glass houses

I don’t think it would be fair to end this without confessing: us too.

For example, the ocelotgui GUI client for MySQL and MariaDB can crash if I ask it not to send /* comments */ to the server, and there is a very long comment at the end of a statement after the semicolon. We are all sinners.

However, that bug, and a few minor ones, have been found during alpha tests. I’m still hopeful that we’ll go beta within a few weeks, and invite anyone to try and find an embarrassing problem before that happens. The readme and the download are on github at https://github.com/ocelot-inc/ocelotgui.

Generated columns in MariaDB and MySQL

It has been seven years since the last time I blogged about generated columns, and a lot has happened — now both MariaDB and MySQL support them. So it’s time to look again, see how well they’re doing, and compare to the SQL standard’s Optional feature T175 Generated columns.

This is not an introductory description or an explanation why you’d want to use generated columns rather than (say) triggers and views. For that, I’d recommend the relevant manuals or the blog posts by Alexander Rubin and Anders Karlsson.

The Generation Clause

Standard            MariaDB 10.1             MySQL 5.7
---------           ------------             ---------
[data type]         data_type                data type
GENERATED ALWAYS    [GENERATED ALWAYS]       [GENERATED ALWAYS]
AS                  AS                       AS
(expression)        (expression)             (expression)
                    [VIRTUAL | PERSISTENT]   [VIRTUAL | STORED]       
[constraints]       [constraints]            [constraints]
                    [COMMENT 'string']       [COMMENT 'string']

The above side-by-side BNFs show the standard syntax and the syntax that MariaDB and MySQL actually allow at the time I’m writing this. The MariaDB manual says incorrectly that either VIRTUAL or PERSISTENT is mandatory. The MySQL manual suggests incorrectly that the clause order is fixed, and has a typo: there should be two “]]”s after “STORED”.

The first important deviation from standard SQL is that “data type” is not an optional clause in either MariaDB or MySQL. The data type can be figured out from the type of the expression, after all.

The second important deviation is that [GENERATED ALWAYS] is optional and there’s a way to say whether the column is virtual (column value is generated when it’s accessed) or persistent/stored (column value is generated when it’s set, and kept in the database). I call this a single deviation because it’s got a single backgrounder: compatibility with Microsoft SQL Server. In fact the original title of the worklog task (WL#411) was “Computed virtual columns as MS [i.e. Microsoft] SQL server has”. We changed it to “Generated columns”, but the perfume of Microsoftness lingers, and you’ll see traces in the vocabulary too. For example, MariaDB has an error message: “”HY000 A computed column cannot be based on a computed column”.

So the tip sheet is: for the sake of compatibility with the standard rather than with Microsoft, always say GENERATED ALWAYS, and call it a “generated” column not a “computed” column. It’s okay to say VIRTUAL, though, because Oracle does.

Restrictions

In standard SQL these restrictions apply:

“Every column reference contained in [the generation expression) shall reference a base column of [the same table].”
In other words, a generated column cannot be based on another generated column. MariaDB adheres to this, but MySQL, as a harmless extension, allows
CREATE TABLE t (a INT, b INT GENERATED ALWAYS AS (a), c INT GENERATED ALWAYS AS (b));

“[The generation expression] shall be deterministic.”
This is pretty reasonable, and both MariaDB and MySQL comply.

“[The generation expression] shall not contain a routine invocation whose subject routine possibly reads SQL-data.”
This is reasonable too, but MariaDB and MySQL go much further — they forbid every user-defined function, even if it’s declared that it’s deterministic and reads no SQL data.

“[The generation expression] shall not contain a query expression”.
In other words, GENERATED ALWAYS AS (SELECT …)” is a no-no. Again, reasonable, and I doubt it will occur to anyone to try.

Differences between MariaDB and MySQL

We’re actually looking at two different implementations — MariaDB’s generated columns come ultimately from a user contribution by Andrey Zhakov, while MySQL’s generated columns are younger and are more of an in-house development. (Update added later: Mr Zhakov deserves credit for the MySQL development too, see the comments.) Things worth noticing are:
* the PERSISTENT versus STORED syntax detail, mentioned earlier,
* GENERATED is a reserved word in MySQL but not in MariaDB,
* MariaDB has some restrictions about foreign keys that MySQL doesn’t have.

MySQL lacks some restrictions about foreign keys, eh? That could lead to interesting results. I tried this sequence of statements:

CREATE TABLE t1 (s1 INT PRIMARY KEY);
CREATE TABLE t2 (s1 INT,
                 s2 INT AS (s1) STORED,
                 FOREIGN KEY (s1) REFERENCES t1 (s1)
                 ON UPDATE CASCADE);
INSERT INTO t1 VALUES (1);
INSERT INTO t2 VALUES (1, DEFAULT);
UPDATE t1 SET s1 = 2;
SELECT * FROM t1;
SELECT * FROM t2;

And the results from the two SELECTs looked like this:

mysql> SELECT * FROM t1;
+----+
| s1 |
+----+
|  2 |
+----+
1 row in set (0.00 sec)

mysql> SELECT * FROM t2;
+------+------+
| s1   | s2   |
+------+------+
|    2 |    1 |
+------+------+
1 row in set (0.00 sec)

If you’re thinking “whoa, that shouldn’t be right”, then I’m sure you’ll understand why MariaDB doesn’t allow this trick.

Metadata

In the standard there are two relevant INFORMAIION_SCHEMA columns: IS_GENERATED and GENERATION_EXPRESSION. In MySQL all we get is a value in the EXTRA column: “VIRTUAL GENERATED”. In MariaDB also we get a value in the EXTRA column: “VIRTUAL”.

I’d say that both implementations are deficient here — you can’t even see what the “(expression)” was.

Hurray For Everybody

Both MariaDB and MySQL have slight flaws in their implementations of generated columns, but my complaints here shouldn’t cause worry. Both are robust, as I found out by wasting time looking for bugs. This feature can be used.

What about us?

In my last blog post I may have been over-confident when I predicted ocelotgui (our GUI for MariaDB and MySQL) would allow multiple connections and go beta this month. We will have something that, I hope, will be far more exciting.