The main goal of a primary key is to stop you storing duplicate values in its columns. You can also achieve this using unique constraints.
Both of these can be the target of foreign keys. So you could build a schema (with foreign keys) that only has unique constraints.
But by convention foreign keys reference primary keys, not unique constraints. This is because a table can have many unique constraints, but at most one primary key. By having all foreign keys point to primary keys, you make it easier for other developers/DBAs to understand the schema.
You also limit cascading update problems when you change unique/primary column values. This is where you update foreign key referenced columns in the parent table, then need to apply this update to all rows in all child tables.
To further limit the cascading update problem, often these primary keys are surrogates - synthetic values such as incrementing integers or GUIDs. These have no meaning outside the application, so should never change. Unlike the values in unique constraints (aka candidate or business keys) such as SSNs, vehicle license plate numbers, passport numbers, etc. which can and do change.
It's also worth noting that many are unaware a foreign key can point to a unique constraint. By sticking to primary keys, you reduce the "surprise factor" for the application. Which makes it easier to maintain.
Finally many frameworks expect tables to have single-column primary keys. It can be harder to get these to work with tables with compound keys.
Is there some underlying rule or DBA agreement that every table should have a primary key?It's rare for a table to need neither a primary key nor any unique constraints. If both are missing, this is usually a sign of a modelling failure.
Not all tables need
surrogate keys though. This is most likely when the business keys are standardized short character codes. For example, ISO currency and country codes or IATA airport codes. In these cases I would make these values the primary key:
create table currencies (
currency_code varchar2(3) primary key,
...
)
That said there are
few types of table where neither constraint are appropriate. Staging tables are the main ones that come to mind.
My assumption is that a primary key is only needed when it has a child table?No. As stated above, a table may not need a
surrogate key. But I would strive to find the unique constraints and designate one the primary key.
It's also worth bearing in mind: just because there are no child tables
now, as the application evolves it's likely many childless tables will become parents at some stage!
On the other side, if a large, frequently accessed table doesn't have any child tables and doesn't have a unique column, should we create a column based off the value of a sequence and add a unique index (instead of primary key) on it?I would first ask what the keys are for the table - the columns that shouldn't have duplicate values. Then define unique constraints on the keys and either
- pick one to be the primary key
- add a surrogate key to the table and making that the PK.
Which I'd do depends on what's stored, how it's used by the application, etc. I wouldn't create a surrogate key "just because".
The optimizer chooses to perform a full table when both those columns are used in the "Where" clause since the primary key isn't very selectiveThis statement makes no sense to me. If you've got equality conditions on all the columns of the primary key, you're accessing at most one row. It's selective by definition.
Please share an example of this situation.