Monthly Archives: November 2016

Pronouncing Database Terms

It is the business of educated people to speak so that no-one may be able to tell in what county their childhood was passed. -- A. Burrell, A Handbook for Teachers in Public Elementary School, 1891

The terms that reveal where a person (mis)spent a DBMS-related childhood are "char", "data", "GIF", "gigabyte", "GUI", "JSON", "query", "schema", "tuple", "_", "`", and "«".


(1) Like "Care" because it's short for "Character" (so hard C and most folks say "Character" that way)?
(2) Like "Car" because it's short for "Character" (so hard C and a few folks in the British Isles say it that way and perhaps all other English words ending in consonant + "ar" are pronounced that way)?
(3) Like "Char" (the English word for a type of trout)?
C/C++ programmers say (3), for example Bjarne Stroustrup of C++ fame says that's illogical but usual. However, people who have not come to SQL from another programming language may be more likely to go with (2), leading one online voter to exclaim "I've known a lot of people who say "car" though. (Generally SQL-y people; is this what they teach in DBA classes?)" and Tom Kyte of Oracle fame reportedly says "var-car" .


The Oxford English Dictionary (OED) shows 4 (four!) variations:
"Brit. /ˈdeɪtə/, /ˈdɑːtə/, U.S. /ˈdædə/, /ˈdeɪdə/".
It's only the first syllable that matters -- DAY or DA?
I haven't seen the Longman Pronunciation Dictionary, but a blog post says the results of Longman's preference poll were:
"BrE: deɪtə 92% ˈdɑːtə 6% ˈdætə 2% AmE: ˈdeɪțə 64%ˈdæțə 35% ˈdɑːțə 1%" (notice it's ț not t for our American friends). By the way OED says in a computing context it's "as a mass noun" so I guess "data is" is okay.


It's "jif", says its creator.


That letter at the start is a hard G; The "Jigabyte" pronunciation is unknown to Merriam-Webster, Cambridge, and Oxford dictionaries.


No question it's "gooey", for all the dictionaries I checked. So pronounce our product as "osselot-goey""osselot-gooey".


The author of "Essential COM" says

The exact pronunciation of GUID is a subject of heated debate among COM developers. Although the COM specification says that GUID rhymes with fluid, the author [Don Box] believes that the COM specification is simply incorrect, citing the word languid as setting the precedent.

The COM specification is a standard and therefore cannot be incorrect, but I can't find it, and I like setting-a-precedent games, so let's use the exact word Guid, eh? It appears in Hugh MacDiarmid's masterpiece "A Drunk Man Looks At The Thistle"

But there are flegsome deeps
Where the soul o'Scotland sleeps
That I to bottom need
To wauk Guid kens what deid

.. which proves that Guid is a one-syllable word, though doubtless MacDiarmid pronounced it "Gweed".


Doug Crockford of Yahoo fame, seen on Youtube, says:

So I discovered JAYsun. Java Script Object Notation. There's a lot of argument about how you pronounce that. I strictly don't care. I think probably the correct pronunciation is [switching to French] "je sens".

The argument is mostly between people who say JAYsun and people who say JaySAWN. It's controversial, and in our non-JSON environment it's a foreign word, so spelling it out J S O N is safe and okay.


In the 1600s the spelling was "quaery", so it must have rhymed with "very", and it still does, for some Americans. But the OED says that both American and British speakers say "QUEERie" nowadays.


It's "Skema". The "Shema" pronunciation is unknown to Merriam-Webster, Cambridge, and Oxford dictionaries.


See the earlier post "How to pronounce SQL" which concluded:

In the end, then, it's "when in Rome do as the Romans do". In Microsoft or Oracle contexts one should, like Mr Ellison, respect Microsoft's or Oracle's way of speaking. But here in open-source-DBMS-land the preference is to follow the standard.


See the earlier post "Tuples". It's "Tuhple".


According to Swan's "Practical English Usage" the _ (Unicode code point 005F) character is more often called underline by Britons, more often called underscore by Americans. (The SQL-standard term is underscore.) (The Unicode term is LOW LINE; SPACING UNDERSCORE was the old Unicode-version-1.0 term.)

` `

This is a clue for telling if people have MySQL backgrounds -- they'll pronounce the ` (Unicode code point 0060) symbol as "backtick". Of course it also is found in other programming contexts nowadays, but there are lots of choices in the Jargon File:

Common: backquote; left quote; left single quote; open quote; ; grave. Rare: Backprime; [backspark]; unapostrophe; birk; blugle; back tick; back glitch; push; ; quasiquote.

By the way The Jargon File is a good source for such whimsical alternatives of ASCII names.

« »

You might be fooled by an Adobe error, as I was, into thinking that these French-quote-mark thingies are pronounced GEELmoes. Wrong. They are GEELmays. (The Unicode term is left-point or right-point double angle quotation marks.) This matter matters because, as Professor Higgins said, "The French don't care what they do actually, as long as they pronounce it properly."

Meanwhile ...

Enhancements made to the source code for the next version of ocelotgui, Ocelot's Graphical User Interface for MySQL and MariaDB, are: error messages are optionally in French, and grid output is optionally in HTML. As always, the description of the current version is on and the downloadable source and releases are on github.