Naming Rules


This page is part of the book SQL-99 Complete, Really, by Peter Gulutzan & Trudy Pelzer. The authors have graciously allowed us to reproduce the contents of the book here. Because the book is about the SQL-99 standard, the contents of this and other pages in the book may not directly apply to MariaDB. Use the navigation bar to navigate the book.

Contents

  1. 16. Give everything a name
  2. 17. When a name has two parts, separate the parts with the underscore character (_). For example: ytd_sales, initial_extent.
  3. 18. Avoid names that might be reserved words in some SQL dialect
  4. 19. Avoid <delimited identifier>s.
  5. 20. Names of Tables are plural; names of all other Objects are singular
  6. 21. Use words in your national language.
  7. 22. Don't worry about how <Column name>s appear when a Table is displayed on the screen
  8. 23. Names should be descriptive, but not too descriptive
  9. 24. If two Columns from different Tables are based on the same Domain, they should have the same name
  10. 25. Digits are a bad sign
  11. 26. Try to stop a name at 18 characters: the maximum length allowed in SQL-89
  12. 27. Repeat the <Table name> in the <Column name> ... not
  13. 28. Depend on a dialect ... not
  14. 29. Sometimes <Correlation name>s (or aliases) are simply necessary because the actual <Table name> is unwieldy, containing qualifiers or lengthy path names
  15. 30. Abbreviations
  16. 31. Host language conventions
  17. 32. User Names
  18. 33. Comma Lists

Everyone says that onomatopoeia is the oldest profession. Or, at least, they would say that, if they knew that onomatopoeia originally meant "the making of names", and that Adam's first job was to name all the beasts in the Garden of Eden.

16. Give everything a name

The DBMS often lets you skip giving an Object a name: it just assigns a default name. But this default name is arbitrary. And besides, no two DBMSs use the same rules for default names. So, give explicit names to expressions in select lists. For example:

SELECT (length + width) AS length_and_width
FROM   Widgets;

Consider giving explicit names to Constraints in CREATE TABLE, ALTER TABLE, CREATE DOMAIN, and ALTER DOMAIN statements. If you don't, how will you drop the Constraints later? And how will you interpret the diagnostics, which include <Constraint name>s? Here's an example:

CREATE TABLE Widgets
    (length     INT,
     CONSTRAINT Widgets_Length_Checker CHECK (length > 0));

Exception: Usually one does not give names to simple Column Constraints like NOT NULL or PRIMARY KEY.

Actually, naming is just one prominent example of a case where the DBMS will assign some "implementation-dependent" value if you don't specify one yourself. In all such cases, it's probably safer to specify.

17. When a name has two parts, separate the parts with the underscore character (_). For example: ytd_sales, initial_extent.

Alternative: For <Table name>s especially, you can keep the parts unseparated but capitalize the second word. For example: OrderItems, DepartmentNumbers.

18. Avoid names that might be reserved words in some SQL dialect

The way to do this is to use names that refer to objects in the real world that you're modelling with your database. You can be fairly sure that names like CandyStores, book_title or swather are not names that the DBMS needs for its own purposes. If you must be absolutely sure, you can take further measures but there are problems with all of them.

  • You can use the list of <keyword>s, in our chapter on general SQL concept. This list includes reserved words used in major SQL dialects, as well as reserved words used in all standard SQL variations at the time of printing. It's better to look it up here rather than depend on a vendor's manual. But it's impossible to keep such a list up to date.
  • You can check by passing to your DBMS an SQL statement containing the <identifier> and looking for an error message. For example, try to execute something like "CREATE TABLE <word> (<word> INT);" If the SQL statement works, <word> is not a reserved word. However, this won't tell you if some other DBMS reserves that word, or if the next version of your DBMS will reserve it.
  • You can put underscores (_) in names. This is unpopular. The SQL Standards committee doesn't intend to add <keyword>s containing underscores in any future SQL version. However, there are some exceptions: words that begin with
    1. CURRENT_ or SESSION_ or SYSTEM_, or words that end with
    2. _LENGTH. Underscores have special meanings when used with introducers, with LIKE predicates and with SIMILAR predicates. The SQL Standards committee will also avoid <keyword>s containing digits in all future versions. So try Mussels4. But first read Rule 25.
  • You can enclose all names with quotes (""). But <delimited identifier>s cause their own problems: see Rule 19.

19. Avoid <delimited identifier>s.

The troubles with them are, first, that double quote marks are false signals to many people who are used to thinking that quote marks appear around strings instead of names. Second, there's case sensitivity "X" is not the same as "x". Third, quote marks are ugly.

Exception: <Table name>s might require <delimited identifier>s, because some DBMSs use files for Tables. File names include special characters . or / or \ or : that are illegal in regular <identifier>s.

Exception: Microsoft Access programmers often use <delimited identifier>s for <Table name>s (Access is a non-standard SQL which uses []s instead of ""s to mark the delimitation).

Exception: Applications which generate SQL statements, such as user interfaces, might automatically enclose all <identifier>s inside ""s.

Exception: Of course, if you use SQL <special character>s in your names, you must use <delimited identifier>s.

With all these exceptions, you might decide to take the minority line and use <delimited identifier>s regularly. If you do, at least avoid names that have lead or trailing spaces. Some DBMSs' processes include an automatic TRIM.

20. Names of Tables are plural; names of all other Objects are singular

Thus, in the INFORMATION_SCHEMA, we have a View named SCHEMATA and the Columns of this View are: CATALOG_NAME, SCHEMA_NAME and so on. Often a plural is a collective noun, for example: INVENTORY. Admittedly, this means that <Table name>s will be longer (at least in English), but it's a subtle signal that distinguishes <Table name>s from other <identifier>s.

Alternative: The dissenting minority points out that the English phrases for many tabular items are singular: "ADDRESS BOOK" (not "ADDRESSES BOOK"), "PHONE BOOK", "INVESTMENT PORTFOLIO", "RESTAURANT LIST", etc.

21. Use words in your national language.

The fact that SQL <keyword>s look like English is irrelevant. For example, this sample SQL statement appeared in an article in a Polish magazine:

UPDATE studenci SET nazwisko='Kowalski';

This does mean that names will sometimes include characters outside the regular English alphabet. Obviously the effect on portability is unfortunate, but if your DBMS doesn't support accented characters in names then it doubtless won't properly support them in data values either, so why would you use such a DBMS anyway? Precisely because you don't know what a nazwisko is, you can see that a Pole would have trouble understanding the word that you use instead of nazwisko.

22. Don't worry about how <Column name>s appear when a Table is displayed on the screen

That's something that changes anyway (use AS clauses). Instead, worry about how names appear if you print out a program. Remember: The goal is long-term comprehension, so ephemeral considerations such as screen-display deserve low priority.

23. Names should be descriptive, but not too descriptive

Minimally, you should avoid algebra like "UPDATE k SET k1=4" where no one could possibly guess what k and k1 are supposed to represent. Medianly, you should avoid non-specific descriptors like PHONE_NUMBER where no one can be sure whether the referent is a home- or office- or general-contact- telephone number. But stop there! Avoid names like SOLDIERS_IN_THE_ARMY because (presumably) all the soldiers in the database are in the army; the "in the army" bit is only helpful if you also have soldiers in the navy and you have to distinguish between them. This part of the rule avoid making accidents part of the identification is analogous to one of the normalization rules.

24. If two Columns from different Tables are based on the same Domain, they should have the same name

In fact, they should have the Domain's name. For example:

CREATE DOMAIN surname VARCHAR(25);
CREATE TABLE Students (surname surname, ...);
CREATE TABLE Professors (surname surname, ...);

This rule would apply even if your DBMS doesn't support explicit creation of Domains, or if you use SQL3's user-defined type feature you're still using the concept of Domains.

Exception: This rule does not apply for two Columns in the same Table.

Incidentally, when <Column name>s are the same, NATURAL JOIN is easier. That's usually a blessing, but some caution is required you certainly don't want to cause a join over two Columns which have the same name by accident.

25. Digits are a bad sign

Too often we use digits as arbitrary distinguishers e.g.: Lines_1 / Lines_2 when there is some intrinsic difference between Lines_1 and Lines_2 that could be expressed in the names, for example, Lines_Freshwater and Lines_Longitude. Particularly bad are the digits '0' and '1', which look too much like the letters 'O' and 'l'.

26. Try to stop a name at 18 characters: the maximum length allowed in SQL-89

Mainly, it's hard to remember a long name. For example, do you remember if the name mentioned in rule 8 was Parts_Which_Have_No_Serial_Numbers? Or was it Parts_Which_Have_No_Serialnumber?

27. Repeat the <Table name> in the <Column name> ... not

Firstly, you'd end up violating rule 24. Secondly, if you make a View of the Table you'll have to either violate this rule, or make View <Column name>s not equal to Table <Column name>s. For example, the INFORMATION_SCHEMA View called GRANTS has a Column called IS_GRANTABLE instead of GRANT_IS_GRANTABLE. Remember, if you really need to make it clear what Table the Column is in, you can use a <Column reference>: GRANTS.IS_GRANTABLE.

Exception: A Column which is part of the primary key of the Table could include the <Table name>, in the singular. For example, the INFORMATION_SCHEMA View called SCHEMATA has a Column called SCHEMA_NAME and any foreign keys that reference SCHEMATA would be Columns called SCHEMA_NAME too (assuming that Views could have such Constraints). There are several conflicting conventions for foreign keys. In any case, though, it is not part of your mandate to ensure that all <Column name>s in the database must be unique.

28. Depend on a dialect ... not

This can be subtle, e.g.: UCASE is a function name that some people seem to think is standard SQL (in fact it's ODBC). Write with lowest-common-denominator syntax when you can, but test it first with an SQL3 parser to make sure you're not going to violate a rule when you upgrade.

29. Sometimes <Correlation name>s (or aliases) are simply necessary because the actual <Table name> is unwieldy, containing qualifiers or lengthy path names

In Oracle, use of <Correlation name>s actually helps the optimizer. But should you always use <Correlation name>s? No they're most appropriate in SELECT statements where <Column name>s must be qualified.

30. Abbreviations

Legacy SQL code has frequent abbreviations: PROV for PROVINCE, DEPT for DEPARTMENT, LEN for LENGTH, FNAME for FIRST NAME and so on. Judging from trends in other computer languages, this taste will become obsolete. At this moment it's still a matter of taste. A few abbreviated prefixes/suffixes are used for some common Domains: _id for single-Column candidate key (e.g.: author_id, program_id), _no for ordinal number (e.g.: player_no, receipt_no), qty_ for quantity (e.g.: qty_of_goods_sold), avg_ for average (e.g.: avg_qty_of_goods_sold), min_ for minimum (e.g.: min_weight), max_ for maximum (e.g.: max_length) and sum_ for total (e.g.: sum_balance). Notice that some of the prefixes are derived from SQL <keyword>s.

  • Examples of <Domain name>s/<Column name>s Some names that we have seen in use in databases for banking/libraries /retail shops/government include: firstname, lastname or surname, street, houseno, aptno or unitno or suiteno, city, state or province, country, phoneno or email, sex, birth_date, account_id, balance, account_open_date, account_close_date, transaction_code, author_firstname, author_lastname, title, callno, isbn, year_published, checkout_date, loan_type, amount, itemno, transaction_time, transaction_code.

Certainly we've seen many other names too, in many styles. We picked ones that generally fit the criteria that we've described heretofore.

31. Host language conventions

There is certainly an argument that this C code snippet looks fine:

EXEC SQL
INSERT INTO Recordings (szRecording)
VALUES (:szRecording);

The point here is that the C host variable szRecording associates with the SQL Column szRecording. Hence the same name. In general we could say that SQL Object names are often influenced by conventions used in the most common host language, such as C in this case. We don't condemn this practice, we just ignore it, since our concern is SQL conventions rather than host language conventions.

One detail about the szRecording in the preceding example: it's in a Polish notation, that is, the sz in the name indicates the data type (string zero). We will concern ourselves solely with the question: is it good SQL to embed <data type> information in names, for example szrecording or name_char or (more subtly) namestring? The answer, judging as usual from what seems to be common practice, is yes that's okay, but nobody is doing so systematically. Sometimes we do see <Column name>s that end in _date or num[eric], but we don't see consistency.

For the narrower concept Domains we have Rule 24.

32. User Names

If you have control over user names, prefer first names: Ralph, Mike, Lucien. Where necessary add the first letter of the last name: JeanC, RalphK, LucienB. This convention appears to derive from names on the Internet.

Often there is no choice in this regard, because the operating system feeds user names to the DBMS.

33. Comma Lists

Whenever a list is disordered, people wonder why. For example, these SQL statements look a trifle curious:

SELECT firstname, travel_allowance, surname
FROM   SalesPersons;

SELECT *
FROM   States
WHERE  state_abbreviation IN('WY','MI','AK','CO');

If there is some hidden order, add a note explaining what it is. Otherwise, change to a natural or alphabetical order.


Tags