Naming Rules
This page is part of the book SQL-99 Complete, Really, by Peter Gulutzan & Trudy Pelzer. The authors have graciously allowed us to reproduce the contents of the book here. Because the book is about the SQL-99 standard, the contents of this and other pages in the book may not directly apply to MariaDB. Use the navigation bar to navigate the book.
Everyone says that onomatopoeia is the oldest profession. Or, at least, they would say that, if they knew that onomatopoeia originally meant "the making of names", and that Adam's first job was to name all the beasts in the Garden of Eden.
16. Give everything a name
The DBMS often lets you skip giving an Object a name: it just assigns a default name. But this default name is arbitrary. And besides, no two DBMSs use the same rules for default names. So, give explicit names to expressions in select lists. For example:
SELECT (length + width) AS length_and_width FROM Widgets;
Consider giving explicit names to Constraints in CREATE TABLE
,
ALTER TABLE
, CREATE DOMAIN
, and ALTER DOMAIN
statements. If you
don't, how will you drop the Constraints later? And how will you interpret the
diagnostics, which include <Constraint name>s? Here's an example:
CREATE TABLE Widgets (length INT, CONSTRAINT Widgets_Length_Checker CHECK (length > 0));
Exception: Usually one does not give names to simple Column Constraints like
NOT NULL
or PRIMARY KEY
.
Actually, naming is just one prominent example of a case where the DBMS will assign some "implementation-dependent" value if you don't specify one yourself. In all such cases, it's probably safer to specify.
17. When a name has two parts, separate the parts with the underscore character (_). For example: ytd_sales, initial_extent.
Alternative: For <Table name>s especially, you can keep the parts unseparated but capitalize the second word. For example: OrderItems, DepartmentNumbers.
18. Avoid names that might be reserved words in some SQL dialect
The way to do this is to use names that refer to objects in the real world that you're modelling with your database. You can be fairly sure that names like CandyStores, book_title or swather are not names that the DBMS needs for its own purposes. If you must be absolutely sure, you can take further measures — but there are problems with all of them.
- You can use the list of <keyword>s, in our chapter on general SQL concept. This list includes reserved words used in major SQL dialects, as well as reserved words used in all standard SQL variations at the time of printing. It's better to look it up here rather than depend on a vendor's manual. But it's impossible to keep such a list up to date.
- You can check by passing to your DBMS an SQL statement containing the
<identifier> and looking for an error message. For example, try to execute
something like "
CREATE TABLE
<word> (<word>INT
);" If the SQL statement works, <word> is not a reserved word. However, this won't tell you if some other DBMS reserves that word, or if the next version of your DBMS will reserve it. - You can put underscores (_) in names. This is unpopular. The SQL Standards
committee doesn't intend to add <keyword>s containing underscores in any
future SQL version. However, there are some exceptions: words that begin with
- CURRENT_
or
SESSION_or
SYSTEM_, or words that end with
- _LENGTH
. Underscores have special meanings when used with introducers, with
LIKEpredicates and with
SIMILARpredicates. The SQL Standards committee will also avoid <keyword>s containing digits in all future versions. So try
Mussels4. But first read Rule 25.
- CURRENT_
- You can enclose all names with quotes (""). But <delimited identifier>s cause their own problems: see Rule 19.
19. Avoid <delimited identifier>s.
The troubles with them are, first, that double quote marks are false signals to many people who are used to thinking that quote marks appear around strings instead of names. Second, there's case sensitivity — "X" is not the same as "x". Third, quote marks are ugly.
Exception: <Table name>s might require <delimited identifier>s, because some DBMSs use files for Tables. File names include special characters — . or / or \ or : — that are illegal in regular <identifier>s.
Exception: Microsoft Access programmers often use <delimited identifier>s for <Table name>s (Access is a non-standard SQL which uses []s instead of ""s to mark the delimitation).
Exception: Applications which generate SQL statements, such as user interfaces, might automatically enclose all <identifier>s inside ""s.
Exception: Of course, if you use SQL <special character>s in your names, you must use <delimited identifier>s.
With all these exceptions, you might decide to take the minority line and use <delimited identifier>s regularly. If you do, at least avoid names that have lead or trailing spaces. Some DBMSs' processes include an automatic TRIM.
20. Names of Tables are plural; names of all other Objects are singular
Thus, in the INFORMATION_SCHEMA
, we have a View named SCHEMATA
and the
Columns of this View are: CATALOG_NAME
, SCHEMA_NAME
and so on. Often a
plural is a collective noun, for example: INVENTORY
. Admittedly, this means
that <Table name>s will be longer (at least in English), but it's a subtle
signal that distinguishes <Table name>s from other <identifier>s.
Alternative: The dissenting minority points out that the English phrases
for many tabular items are singular: "ADDRESS BOOK
"
(not "ADDRESSES BOOK
"), "PHONE BOOK
", "INVESTMENT PORTFOLIO
",
"RESTAURANT LIST
", etc.
21. Use words in your national language.
The fact that SQL <keyword>s look like English is irrelevant. For example, this sample SQL statement appeared in an article in a Polish magazine:
UPDATE studenci SET nazwisko='Kowalski';
This does mean that names will sometimes include characters outside the regular English alphabet. Obviously the effect on portability is unfortunate, but if your DBMS doesn't support accented characters in names then it doubtless won't properly support them in data values either, so why would you use such a DBMS anyway? Precisely because you don't know what a nazwisko is, you can see that a Pole would have trouble understanding the word that you use instead of nazwisko.
22. Don't worry about how <Column name>s appear when a Table is displayed on the screen
That's something that changes anyway (use AS
clauses). Instead, worry about
how names appear if you print out a program. Remember: The goal is long-term
comprehension, so ephemeral considerations such as screen-display deserve low
priority.
23. Names should be descriptive, but not too descriptive
Minimally, you should avoid algebra like "UPDATE k SET k1=4
"
— where no one could possibly guess what k
and
k1
are supposed to represent. Medianly, you should avoid non-specific
descriptors like PHONE_NUMBER
— where no one can
be sure whether the referent is a home- or office- or general-contact-
telephone number. But stop there! Avoid names like SOLDIERS_IN_THE_ARMY
because (presumably) all the soldiers in the database are in the army; the "in
the army" bit is only helpful if you also have soldiers in the navy and you
have to distinguish between them. This part of the rule
— avoid making accidents part of the identification
— is analogous to one of the normalization rules.
24. If two Columns from different Tables are based on the same Domain, they should have the same name
In fact, they should have the Domain's name. For example:
CREATE DOMAIN surname VARCHAR(25); CREATE TABLE Students (surname surname, ...); CREATE TABLE Professors (surname surname, ...);
This rule would apply even if your DBMS doesn't support explicit creation of Domains, or if you use SQL3's user-defined type feature — you're still using the concept of Domains.
Exception: This rule does not apply for two Columns in the same Table.
Incidentally, when <Column name>s are the same, NATURAL JOIN is easier. That's usually a blessing, but some caution is required — you certainly don't want to cause a join over two Columns which have the same name by accident.
25. Digits are a bad sign
Too often we use digits as arbitrary distinguishers —
e.g.: Lines_1
/ Lines_2
— when there is some intrinsic
difference between Lines_1
and Lines_2
that could be expressed in the names,
for example, Lines_Freshwater
and Lines_Longitude
. Particularly bad are
the digits '0
' and '1
', which look too much like the letters 'O
'
and 'l
'.
26. Try to stop a name at 18 characters: the maximum length allowed in SQL-89
Mainly, it's hard to remember a long name. For example, do you remember if the
name mentioned in rule 8 was Parts_Which_Have_No_Serial_Numbers
? Or was it
Parts_Which_Have_No_Serialnumber
?
27. Repeat the <Table name> in the <Column name> ... not
Firstly, you'd end up violating rule 24. Secondly, if you make a View of the
Table you'll have to either violate this rule, or make View <Column name>s not
equal to Table <Column name>s. For example, the INFORMATION_SCHEMA
View
called GRANTS
has a Column called IS_GRANTABLE
instead of
GRANT_IS_GRANTABLE
. Remember, if you really need to make it clear what
Table the Column is in, you can use a <Column reference>:
GRANTS.IS_GRANTABLE
.
Exception: A Column which is part of the primary key of the Table could
include the <Table name>, in the singular. For example, the
INFORMATION_SCHEMA
View called SCHEMATA
has a Column called
SCHEMA_NAME
— and any foreign keys that reference
SCHEMATA
would be Columns called SCHEMA_NAME
too (assuming that Views
could have such Constraints). There are several conflicting conventions for
foreign keys. In any case, though, it is not part of your mandate to ensure
that all <Column name>s in the database must be unique.
28. Depend on a dialect ... not
This can be subtle, e.g.: UCASE
is a function name that some people seem to
think is standard SQL (in fact it's ODBC). Write with lowest-common-denominator
syntax when you can, but test it first with an SQL3 parser to make sure you're
not going to violate a rule when you upgrade.
29. Sometimes <Correlation name>s (or aliases) are simply necessary because the actual <Table name> is unwieldy, containing qualifiers or lengthy path names
In Oracle, use of <Correlation name>s actually helps the optimizer. But should
you always use <Correlation name>s? No — they're most
appropriate in SELECT
statements where <Column name>s must be qualified.
30. Abbreviations
Legacy SQL code has frequent abbreviations: PROV
for PROVINCE
, DEPT
for DEPARTMENT
, LEN
for LENGTH
, FNAME
for FIRST NAME
and so
on. Judging from trends in other computer languages, this taste will become
obsolete. At this moment it's still a matter of taste. A few abbreviated
prefixes/suffixes are used for some common Domains: _id
for single-Column
candidate key (e.g.: author_id
, program_id
), _no
for ordinal number
(e.g.: player_no
, receipt_no
), qty_
for quantity (e.g.:
qty_of_goods_sold
), avg_
for average (e.g.: avg_qty_of_goods_sold
),
min_
for minimum (e.g.: min_weight
), max_
for maximum (e.g.:
max_length
) and sum_
for total (e.g.: sum_balance
). Notice that
some of the prefixes are derived from SQL <keyword>s.
- Examples of <Domain name>s/<Column name>s Some names that we have seen in use
in databases for banking/libraries /retail shops/government
include:
firstname
,lastname
orsurname
,street
,houseno
,aptno
orunitno
orsuiteno
,city
,state
orprovince
,country
,phoneno
oremail
,sex
,birth_date
,account_id
,balance
,account_open_date
,account_close_date
,transaction_code
,author_firstname
,author_lastname
,title
,callno
,isbn
,year_published
,checkout_date
,loan_type
,amount
,itemno
,transaction_time
,transaction_code
.
Certainly we've seen many other names too, in many styles. We picked ones that generally fit the criteria that we've described heretofore.
31. Host language conventions
There is certainly an argument that this C code snippet looks fine:
EXEC SQL INSERT INTO Recordings (szRecording) VALUES (:szRecording);
The point here is that the C host variable szRecording
associates with the SQL
Column szRecording
. Hence the same name. In general we could say that SQL
Object names are often influenced by conventions used in the most common host
language, such as C in this case. We don't condemn this practice, we just
ignore it, since our concern is SQL conventions rather than host language
conventions.
One detail about the szRecording
in the preceding example: it's in a Polish
notation, that is, the sz
in the name indicates the data type (string
zero). We will concern ourselves solely with the question: is it good SQL to
embed <data type> information in names, for example szrecording
or
name_char
or (more subtly) namestring
? The answer, judging as usual
from what seems to be common practice, is yes that's okay, but nobody is doing
so systematically. Sometimes we do see <Column name>s that end in _date
or
num[eric]
, but we don't see consistency.
For the narrower concept — Domains — we have Rule 24.
32. User Names
If you have control over user names, prefer first names: Ralph, Mike, Lucien. Where necessary add the first letter of the last name: JeanC, RalphK, LucienB. This convention appears to derive from names on the Internet.
Often there is no choice in this regard, because the operating system feeds user names to the DBMS.
33. Comma Lists
Whenever a list is disordered, people wonder why. For example, these SQL statements look a trifle curious:
SELECT firstname, travel_allowance, surname FROM SalesPersons; SELECT * FROM States WHERE state_abbreviation IN('WY','MI','AK','CO');
If there is some hidden order, add a note explaining what it is. Otherwise, change to a natural or alphabetical order.