Monthly Archives: December 2022


Offset and fetch clauses in MariaDB

A new feature in MariaDB 10.6 is "Implement SQL-standard SELECT ... OFFSET ... FETCH".

I'll say what's not in the manual: mention other vendors, add details, talk style, show how to get the wrong answer, describe WITH TIES, mention alternatives, experiment with locks, and list 11 minor flaws.

Other vendors

Products with support for OFFSET ... FETCH:
Oracle, DB2, SQL Server.

Products with support for LIMIT ... OFFSET:
Tarantool, MySQL, Vertica, PostgreSQL.

Products with support for neither:
Firebird, ScyllaDB, Actian.

This isn't a complete list -- there are lots of SQL products nowadays -- but it does include the "Big Three". All of them support SELECT ... OFFSET ... FETCH.

Syntax details

The BNF actually is:

[ OFFSET start { ROW | ROWS } ]
[ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES } ]

The "start" and count" must be unsigned integers if they're literals, but if they're declared variables then they can be NULL (in which case they're treated as zeros), or can be negative (in which case they're treated as -infinity for "start" and +infinity for "count") (but depending on undefined behaviour isn't always fun).

The lowest offset is zero which is the first row, and the lowest count is zero, that is, it is possible to select zero rows. If count is omitted, then 1 is assumed.

Style

If you haven't read the descriptive SQL style guide and clicked like, then you won't understand this section, skip it.

OFFSET and FETCH are both either "clause starts" or "sub-clause starts" so if the choice is to put separate clauses on separate lines, then they are.

Deciding whether to use ROW rather than ROWS, or FIRST rather than NEXT, (neither of which make a difference for server behaviour), one could base a rule on English grammar:
(1) If there was an OFFSET clause with a non-zero number-of-rows, then NEXT, else FIRST. If number-of-rows is greater than one, or unknown, then ROWS, else ROW.
or one could base a rule on what comes first in the MariaDB manual BNF:
(2) The BNF has {FIRST|NEXT} and {ROW|ROWS} so FIRST and ROW.
or one could base a rule on what the examples in the manual use most often:
(3) FIRST and ROWS
or one could base a rule on whatever is shortest:
(4) NEXT and ROW.

I like (3), for example

  OFFSET 5 ROWS
  FETCH FIRST 1 ROWS ONLY;

but there's no established rule yet so I'll vary the format in what follows.

How to get the wrong answer

Try this:

CREATE TABLE td (char_1_column CHAR(1) DEFAULT '1',
                 null_column INT,
                 char_2_column CHAR(2));
CREATE INDEX int_char_1_column_plus_char_2_column
  ON td (char_1_column, char_2_column);
INSERT INTO td (char_2_column)
  VALUES ('a'),('b'),('c'),('d'),('e'),('ab');
SELECT char_2_column, null_column
  FROM td
  WHERE char_1_column > '0'
  OFFSET 0 ROWS
  FETCH FIRST 4 ROWS ONLY;
SELECT char_2_column, null_column
  FROM td WHERE char_1_column > '0'
  OFFSET 4 ROWS
  FETCH FIRST 4 ROWS ONLY;

Result of first SELECT:
('a',NULL),('ab',NULL),('b',NULL),('c',NULL)

Result of second SELECT:
('e',NULL),('ab',NULL)

So there's no ('d',NULL) and there's a duplicate ('ab',NULL). Wrong answer.

I don't think it's a bug because, in the absence of an ORDER BY statement, there is no guarantee which rows will be skipped in the second SELECT. But luckily it's due to an exotic situation which is easy to avoid: the combined index on a non-unique column and a multi-character string column. Just don't use the index, and all is well.

WITH TIES

The effects of the new syntax are mostly the same as with LIMIT ... OFFSET, but WITH TIES is new.

Suppose there are 6 rows, selected in order (there must be an ORDER BY clause):
(NULL),(NULL),(1),(1),(2),(2).
Then OFFSET 0 ROWS FETCH NEXT 1 ROWS WITH TIES will gives us (NULL),(NULL).
Thus WITH TIES doesn't mean that the two values are technically "equal", they merely must be peers, so NULLS tie. The interesting thing is that when you ask for "NEXT 1 ROW" you get the next 2 rows. But that only applies to FETCH, it doesn't apply to OFFSET, therefore this could be a mistake:

SELECT * FROM t
  ORDER BY pk
  OFFSET 0 ROWS
  FETCH NEXT 1 ROWS WITH TIES;
SELECT * FROM t
  ORDER BY pk
  OFFSET 1 ROWS
  FETCH NEXT 1 ROWS WITH TIES;

With the example 6 rows, the second SELECT will return a row that was already returned for the first SELECT. It would be better to say, instead of "OFFSET 1 ROWS", "OFFSET (number of rows returned by the previous SELECT) ROWS".

Alternatives

A standard way to fetch only one row:

BEGIN NOT ATOMIC
  DECLARE v INT;
  DECLARE c CURSOR FOR SELECT * FROM t;
  OPEN c;
  FETCH NEXT FROM c INTO v;
END;

A non-standard way to fetch only one row: (remember to restore @@sql_select_limit later):

SET sql_select_limit = 1;
SELECT * FROM t;

A way to offset by 10 and fetch 10 (this assumes that pk has all the integers in this range i.e. it depends on user knowing about contents and indexes, and assumes that user only wants to go forward without skipping)

SELECT * FROM t
WHERE pk > 10 AND pk <= 10 + 10;

A way to let the user do the offsets and fetches (this depends on the client being a GUI and the user knowing how to click on a scroll bar)

SELECT * FROM t;

... although users often only look at the first few pages (think of how often you use Google and only click Next a few dozen times).

Locking

Will locking, or SET TRANSACTION ISOLATION LEVEL, guarantee consistency?

For a test, I started two clients. On the first, I said:

CREATE TABLE pks (pk INT PRIMARY KEY);
BEGIN NOT ATOMIC
  DECLARE v INT DEFAULT 0;
  WHILE v <= 10000 DO
    INSERT INTO pks VALUES (v);
    SET v = v + 1;
  END WHILE;
END;
SET @@autocommit=0;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
SELECT * FROM pks
  WHERE pk > 400
  OFFSET 10 ROWS
  FETCH FIRST 2 ROWS ONLY FOR UPDATE;

On the second, I said:

SET @@autocommit=0;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
DELETE FROM pks WHERE pk = 7227;
DELETE FROM pks WHERE pk = 405;
DELETE FROM pks WHERE pk = 411;

Result: The first DELETE succeeds, the second DELETE hangs until I say ROLLBACK; on the first client. The third DELETE would also have hung if I had said it first.

In other words: the rows between 401 and 410 (the rows skipped due to OFFSET) are locked, and the rows between 411 and 412 (the rows picked up by FETCH) are locked, but the rows after 412 (the rest of the rows matching WHERE) are not locked.

What this means, of course, is that ordinarily you can't be certain that you're paging through a consistent set. If this is a concern (it shouldn't always be), make a copy on either the server or the client.

Minor Flaws

A few incompatibilities with standard SQL or another DBMS or another bit of MariaDB 10.10 syntax.

Table t has 4 rows created thus:

CREATE TABLE t (s1 INT, s2 INT);
INSERT INTO t VALUES (1,2),(2,1),(3,2),(4,1);

1.

WITH x AS (SELECT * FROM t OFFSET 5 ROWS) SELECT * FROM x;

Result: Success, all 4 rows in table t are returned. It appears that the OFFSET clause is ignored.

2.

BEGIN NOT ATOMIC
  DECLARE var1 DECIMAL(0);
  SELECT * FROM t OFFSET var1 ROWS;
END;

Result: Error 1691 (HY000) A variable of a non-integer based type in LIMIT clause
Never mind whether var1 should be accepted. The error message is obsolete, I was not using a LIMIT clause.

3.

BEGIN NOT ATOMIC
  DECLARE var1 BIT DEFAULT NULL;
  SELECT * FROM t OFFSET var1 ROWS;
END;

Result: Success, all rows are returned.
Sometimes non-integer variables are illegal, but here a BIT variable is legal.
MariaDB calls BIT a numeric data type but SELECT * FROM t OFFSET b'1' ROWS wouldn't work.

4.

SET sql_mode='oracle';
CREATE PROCEDURE p2 AS
  var1 NUMBER := 1;
  BEGIN
    SELECT * FROM t OFFSET var1 rows;
  END;

(Don't forget to restore sql_mode later.)
Result: Error 1691 (HY000) A variable of a non-integer based type in LIMIT clause
That's right for standard SQL ("The declared type ... shall be an exact numeric with scale 0") but Oracle says it might be more forgiving ("If rowcount includes a fraction, then the fractional portion is truncated.")

5.

SELECT * FROM t
  OFFSET 1 ROWS
  FETCH FIRST 0 ROWS ONLY;

Result: Success, zero rows are returned.
That's wrong for standard SQL, which would return "data exception — invalid row count in fetch first clause".

6.

SELECT * FROM t
  FETCH FIRST 1 ROWS ONLY;

Result: Success.
But the manual's BNF is

OFFSET start { ROW | ROWS }
FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES }

which might suggest to newbies that the OFFSET clause is compulsory before FETCH.

7.

SET @x = 1;
SELECT * FROM t OFFSET @x ROWS;

Result: Syntax error.
I can find, by saying CREATE TABLE .. AS SELECT @x; and then saying SHOW CREATE TABLE or SELECT ... FROM INFORMATION_SCHEMA, that @x is BIGINT. BIGINT declared variables are legal, and I can use variables that start with @ in some other situations. This is how things go when variables have variable types.

8.

SHOW WARNINGS OFFSET 1 ROW;

Result: Success. I believe this is reasonable, and of course any non-standard syntax can be handled in an implementor-defined way. But it's not documented on the documentation page for SHOW WARNINGS.

9.

BEGIN NOT ATOMIC
  DECLARE only INT DEFAULT 2e2;
  DECLARE row INT DEFAULT (SELECT 5);
  SELECT * FROM t OFFSET ROW ROW FETCH FIRST ONLY ROW ONLY;
END;

Result: Success.
In standard SQL:2016 ROW and ONLY are reserved words.
MariaDB tends to accept goofiness since it's not your nanny, but I believe in egregious cases the standard strictness is preferable after reasonable notice.

10.

SELECT * FROM t
  FETCH FIRST 1 ROWS ONLY ROWS EXAMINED 1;

Result: Error.
Another inconsistency, since LIMIT 1 ROWS EXAMINED 1 is legal.

11.

VALUES (1),(2),(3) OFFSET 1 ROW FETCH FIRST ROW ONLY;

Result: Success.
This is fine, but once again undocumented.

ocelotgui support

The Ocelot GUI client recognizes MariaDB syntax when connected to a MariaDB server, so if you need features like autocomplete after OFFSET to be correct then download it from github.