|
Database models differ in the way that they represent data to users and designers by using metaphors and abstractions that make the data more meaningful and useful to humans. A given DBMS acts as a bridge between the abstract model and the physical or 'stored' database.
The important metaphor used by relational databases is the two dimensional table called a relation. In this view,
- a relational database is a collection of two dimensional tables.
- each table has a unique name within the database
- each row of the table is an instance of that table (also called a tuple)
- each row in the table is unique in the table
- each column of the table has a unique name within the table
- each column contains an attribute of the table
- each row of a column is called a data value
- data values are atomic. further subdivision loses meaning
- the number of columns in a table is its degree
- the number of rows in a table is its cardinality
- the order or rows and columns does not matter
- all data within a column must be of the same domain.
A well designed relational database is structured to capture the semantic content of the data and, in theory, should be able to produce any report that makes sense. This is possible only if the database is normalized. The following normalization concepts are important.
- a table may be thought of as a statement of functional dependencies (FD)
- every non key attribute (NKA) is FD on the key, the whole key, and nothing but the key
If the set of tables is not normalized the database will lose query flexibility and may become subject to addition, deletion, and update anomalies. Examples
- addition: can't hire new faculty without a teaching assignment
- deletion: deleting the course 'bus422' also deletes jamal from faculty
- update: change of course number to 'bus423' requires changing many rows
Correspondence between ERD and the normalized set
- frequently the ERD directly corresponds with the table set, or, at any rate, an ERD could be constructed post hoc to represent the normalized set
- ETs become tables; each entity is a row in that table and their common attributes are columns in the table
- each N:M RT becomes a table with a composite key structure drawn from the ETs it connects; each relationship is a row in the table and their common attributes are columns in the table
- 1:N RTs usually do not become tables but indicate that the key of the '1' ET exists as a foreign key in the 'N' ET
- 1:1 RTs are often indicate that one ET may become a column in the other
The collection of relations containing attributes is able to provide information because of keys: primary keys and foreign keys
- non key attribute: column containing descriptive information only
- key attribute: column containing defining information
- a primary key: one or more columns whose values together identify a single row in the table
- foreign key in tableA: a key that appears in tableA but does not have a defining function in tableA. It has a defining function in another table.
- foreign keys allow us to model RTs in the table metaphor
- sometimes database designers must invent keys that are not drawn from the users' information needs.
All relational operations work on one or more source tables and always produce one result table. The operations may be classified as selection, projection, and join
- selection: pick out certain rows from a table depending on conditions using the WHERE clause
- projection: pick out certain columns of a table using the SELECT clause
- join: the result table contains columns from more than one table. an additional WHERE clause is needed that supplies the join condition.
Some data in a table may be set as NULL which represents a non-existent value and is used for
- data that are unknowable
- data that are knowable but missing
- data that may not apply to this row
- for data integrity some columns must be forced to be NOT NULL
In an RDBMS all data including the data dictionary and structural information about the internal view are organized into tables
- the base tables contain source data
- the system tables contain meta data
- index tables are not required but may be generated to speed up queries
- the COL table contains information on all columns (column name, table name, domain, description)
- the TAB table contains information on all tables (table name, degree,description)
- the SYS table contains user access privilege information
A useful object in RDBs is the VIEW.
- A view is created as a result table that serves some specific information need.
- The table itself does not exist but it may be referred to as if it were a table.
- Each reference actually triggers the saved SQL and generates the table.
- Views simplify queries and are useful in the data security function.
Data integrity
- table integrity: primary key columns are NOT NULL
- table integrity: rows are unique
- referential integrity: foreign key exists as primary key in another table
- referential integrity: foreign key set to null if it does not exist as a primary key elsewhere
- example of loss of referential integrity: an auto part is deleted from the PARTS table but its part number exists as a foreign key in many rows of the invoice table
- RDBMS contain methods for conserving referential integrity
|