18.104.22.168 Reasons NOT to Use Foreign Keys constraints
There are so many problems with foreign key constraints that we don't know where to start:
Foreign key constraints make life very complicated ...,
The speed impact is terrible ...,
There is also a need to hold locks on many more tables ...,
And so on.
And so on.
Of course those fierce words quietly disappeared from the manual some time after InnoDB came out with foreign key support in 2001. But since then the attempts to get foreign key support outside of InnoDB have usually gotten stuck.
* SolidDB had it, and I made a big deal out of that while predicting a bright future for the SolidDB storage engine.
* Falcon had it, although there were bugs in the implementation (they're marked "to be fixed later").
* PBXT had it
* IBMDB2I had it.
All of those storage engines are, for one reason or another, gone. So, is any independent storage engine left that supports foreign keys? I know of only one: ScaleDB. ScaleDB is targeted for clusters, and I'm not sure how much traction its makers are getting in that market, but I'm always joyous when I see a company paying attention to the standard feature requirements.
Anyway, to finish our look at MySQL documentation on the matter, I'll quote once more:
"Foreign key support generic for all storage engines is scheduled for MySQL 6.x."
This ancient promise is still in the current manual but I predict it too will silently disappear. The promise originally was made because there was progress on a project to do a lot for foreign keys, and the detailed plan is still publicly available on the mysql.com page for Worklog Item #148 (click the "High Level Architecture" button to view the important part). Though the project got put on the shelf, the worklog document is still good for seeing what, at the time, everyone agreed needed work.
The document's "Feature Comparison Chart" compared InnoDB feature details with the ISO standard, SQL Server, DB2, Oracle, and PostgreSQL. It's a long list. I'll just pick the five standard "core" items where every vendor follows the standard. Except InnoDB.
1. References Primary/Unique
InnoDB lets foreign keys refer to keys that are not primary or unique.
CREATE TABLE pk (s1 CHAR, KEY (s1), s2 INT); CREATE TABLE fk (s1 CHAR, FOREIGN KEY (s1) REFERENCES pk(s1)); INSERT INTO pk VALUES ('a',1),('a',2); INSERT INTO fk VALUES ('a');
So now there's a foreign key that points to two different parent-key rows. Guess what happens if you delete only one of the referenced-key rows, thus:
DELETE FROM pk WHERE s2 = 2;
If you guessed "that's illegal because you can't delete a row if there's a foreign-key referring to it", congratulations.
If you guessed "that's legal because the foreign key is still referencing a valid undeleted row", congratulations.
Either answer is logical, and it doesn't matter which one is right.
What matters is that neither logic nor precedent are of any help when InnoDB goes boldly beyond both theory and common practice.
2. Default Primary Reference
It ought to be possible to just say that the foreign key references table PK, and not have to list the columns -- if PK has a primary key, the DBMS can figure it out. But, partially as a result of the previous problem, InnoDB doesn't figure it out.
3. Statement Check
Sometimes a check for a foreign-key violation should occur when the statement is truly over, when all other changes (due to triggers or cascading updates) have finished, so that the check isn't affected by some temporary situation that would be cleared up before the statement ended.
4. Column REFERENCES
This looks okay, and there's no error or warning:
CREATE TABLE pk (s1 INT); CREATE TABLE fk (s1 INT REFERENCES pk(s1));
... but, woe to the uninitiated, it doesn't make a foreign key. You can't make a foreign key by putting a REFRENCES clause in the column definition.
5. REFERENCES Privilege
I suppose that newcomers might also try this:
GRANT REFERENCES ON t TO joe;
and think that they're affecting security for foreign keys. In fact they're creating a privilege that does nothing.
The current situation is less than perfect, and I cast the blame on all the closet xenokleidophobes. You know who you are.
Many people have tested whether the TokuDB storage engine runs faster and compresses better than the default storage engine, InnoDB. I am more concerned about TokuDB's features. Or, to put it unfairly: can it do everything that InnoDB can do, in a current version?
Vadim Tkachenko of Percona supplied a binary download for a platform that I happen to have (Ubuntu 12.04), so I got MySQL 5.6 + TokuDB 7.1 up and running within 15 minutes. Then I found that I could not break anything within 60 minutes. I I conclude that "ease of use" and "stability" are okay. (For anyone who thinks I'm rushing: yes, but usually finding a bug in a new version takes less than an hour so I'm judging from experience.)
I tried compound indexes, long strings, the relatively-new TIMESTAMP(6) data type with fractional seconds, utf8mb4 varchar columns with German collation, transaction rollbacks with or without savepoints, crash recovery, and partitions.
I always got the same results as I get with InnoDB. It seems that the Tokutek folks looked at compatibility before declaring that their product is a "drop-in replacement storage engine for MySQL and MariaDB".
These statements didn't cause an error but the FOREIGN KEY clause was ignored:
CREATE TABLE tp (s1 INT, PRIMARY KEY (s1)) engine=tokudb; CREATE TABLE tf (s1 INT, FOREIGN KEY (s1) REFERENCES tp (s1)) engine=tokudb;
If I use InnoDB, the statements succeed. In basic English, TokuDB doesn't support referential integrity.
This is a big deal because foreign keys are part of "core" standard SQL. In mitigation, I have to allow that many users eschew foreign keys in this benighted sharded world, and that in theory one could replace some foreign-key functionality with triggers. But I doubt that anybody ever does.
These statements resulted in an error:
CREATE TABLE tq (s1 TEXT) engine=tokudb; CREATE FULLTEXT INDEX iq ON tq (s1);
If I use InnoDB, the statements succeed, because InnoDB (since version 5.6.4) supports full-text indexing, TokuDB doesn't.
I'm sure I could hear objections like "that's not what TokuDB is for" (true), "InnoDB's implementation is bad" (actually that's my own opinion but I'm sure somebody somewhere shares it), or "use Sphinx" (I don't understand the logic here but I heard it when I complained about a full-text deficiency in another product).
To refute all objections, I went to bugs.mysql.com, clicked "Return only bugs with status = ALL", clicked "Show = ALL", clicked "Restrict to bugs in selected categories = MySQL Server: FULLTEXT search", and clicked "Search". I got 107 hits So full text must be important to somebody, else they wouldn't report bugs.
I started two connections, both with autocommit = 0 and default transaction isolation = REPEATABLE READ. On the first connection I said
CREATE TABLE t2 (s1 INT, UNIQUE(s1)) engine=tokudb; INSERT INTO t2 VALUES (0);
On the second connection I said:
INSERT INTO t2 VALUES (1);
Result: ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction.
If I use engine=innodb, I do not see this error, both INSERTs work.
I also see that there's a documented behaviour difference with SELECT FOR UPDATE.
I conclude that TokuDB's different locking methods mean that some SQL statements will have different effects in multi-user environments, if one switches from InnoDB.
A storage engine maker cannot muck with MySQL's syntax, with two exceptions: (1) this rule doesn't apply to InnoDB; (2) new variables or information_schema tables can be added that are specific to tuning or monitoring.
So I said "SHOW VARIABLES LIKE 'tokudb%'" ... and I did see some things that bothered me. For example:
() tokudb_lock_timeout seems to be something like innodb_lock_wait_timeout so it's unfortunate that they didn't call it tokudb_lock_wait_timeout
() tokudb_commit_sync seems to do some things like innodb_flush_log_at_trx_commit so it's unfortunate that they didn't use the same numbers
() tokudb_fs_reserve_percent? well, I couldn't tell if it's unfortunate or not because I couldn't find it in the TokuDB user manual.
Tokutek's FAQ says that TokuDB was tested on Centos 5 and CentOS 6. Doubtless it also works on all other Linux 64-bit distros (I downloaded for Ubuntu 12.04 and the MariaDB manual mentions Fedora). I see that Antony Curtis got it going on his Macbook.
That leaves three major items from the list of platforms that MySQL supports: FreeBSD, Solaris, and Windows. The trouble is that one of them, Windows, is what beginners or students might start with. As a result, MySQL + InnoDB or MariaDB + InnoDB will remain the default choice for introductory texts and some personal uses.
I can't see how this will change, since there's no money in the entry-level market and there's no point in a "Windows server" for Tokutek's real objective, which is to capture a piece of the must-have-performance market. But if they don't care, the mind share won't be there.
I fear that, by concentrating on TokuDB's performance advantages, people have failed to worry about losing features that are available with InnoDB. Even if you conclude that you want to use TokuDB -- especially if you conclude that you want to use TokuDB -- you should be aware of them.