References on Database Design

                         Document #: US-13264,EN
------------------------------------------------------------------------------

Major subject: analysis    Minor subjects: checklist

Keywords: tech_notes

Abstract:
References on Database Design - Equivalent to INGRES Release 
6 Technical Note #24 or Release 5 Technical Note #41.


Expert note:
References on Database Design 
=============================

Overview
--------
Database design in a relational environment is  much  easier  than  with
other  DB models because the logical design phase is separated from phy-
sical design phase. It is possible to do  the  logical  design  ignoring
physical  design  issues, and then adjust the physical design at a later
stage or even adjust it during production use of the  DB.  This  is  the
basic principle of provide a fast application development environment.

Database design is 50% common sense; the primary objective  being  rela-
tions  that  avoid  any 'data redundancy' and thus prevent inconsistency
that can occur between multiple copies of the same data.  It  is  advis-
able,  however,  to  use a more structured method to achieve this objec-
tive. These approaches are well documented and this technical note  will
briefly  list  two common approaches, their advantages and disadvantages
and some of the easier-to-read references that explain them.

o	The Different Approaches to DB Design
o	Entity Relationship Method
o	Normalization
o	Other References
o	Summary

The Different Approaches to DB Design
-------------------------------------
There are two schools of thought on how logical design should be carried
out:

     top down analysis:      e.g. Entity-Relationship modelling
     bottom up analysis:     e.g. Normalization from functional dependencies

Top down analysis tends to be done with the aid of diagrams where it  is
easy  to identify the critical areas at an early stage. This is particu-
larly important with very complex or large databases (large here meaning
many  potential  relations).  This  approach considers the relationships
between 'entities'. Entities are 'things' or 'concepts' about which data
is collected. The entities are described by attributes. The final design
usually generates one relation per entity with the  attributes  becoming
columns in relations.

Bottom-up analysis works at a lower-level and deals  with  relationships
between the attributes directly. A list of 'functional dependencies' are
built up and these are decomposed (normalized) into sets of dependencies
that  lead to relations. This decomposition goes through stages known as
'normal forms'. For many applications, third normal form is sufficient.


Entity Relationship Method
--------------------------
Advantages:

     * good for early stages in design when not all attributes known
     * good for semantically complex databases

     * diagrams are easy to read
     * more oriented to common sense than mathematics
     * good for postmortem analysis of database and of queries
             that run on database; good for analysis of areas of high
             traffic volume in heavy concurrent situations
     * multivalued dependencies easy to understand
     * a good 'uncluttered' medium for discussion purposes

Disadvantages:

     * not as rigorous as normalization
     * confusing at first, needs time to acquire art
     * more 'iterative' than normalization
     * it is not easy to define and identify entities

References:

Martin, J. and McClure, C., "Diagramming Techniques for Analysts and
                Programmers" (publ: Prentice Hall 1985)

-an excellent encyclopedic text on diagrammatic methods for all phases of system
        and DB design, chap. 20 covers the ER approach.

Howe, D. R., "Data Analysis for Database Design" (publ: Edward Arnold 1983)

-a readable text book covering both ER and normalization approaches.
        Not often seen in bookstores, try contacting the publishers:
        (301)-539-1529

Yao, S. B. (editor) "Principles of Database Design Volume 1"
                        (publ: Prentice Hall 1985)
        see chapter 5:
                Chen, P. S., "Database Design based on Entity and Relationship"

- a very easy to read introduction

Chen P. "The Entity-Relationship Model-Toward a Unified view of Data"
        ACM Trans on Database systems vol. 1 ,no. 1 Mar 1976

- the first paper on the subject


Normalization
-------------
Advantages:

     * a rigorous, methodical, approach to DB design

Disadvantages:

     * some mathematics involved
     * diagrams to aid the design can become difficult to read for large
             or complex databases
     * overlapping composite keys makes decomposition more difficult
     * its not clear what the latest normal form is or how relevant it is!
     * not good for post mortem analysis after database in use

References:

Date, C.J.  "An Introduction to Database Systems Vol 1 Forth edition" ,
                Chapter 17 "Further Normalization" (publ: Addison Wesley 1986)

-classic textbook introduction to normalization

Kent, W., "A Simple Guide to Five Normal Forms in Relational Database Theory"
        Communications of ACM vol 26, no 2 ,Feb 1983

- a very pragmatic introductory paper

Smith, H., "Database Design: Composing fully normalized tables from a rigorous
                        dependency diagram,"
                Communications of the  ACM vol. 28, no 8, Aug 1985

- presents decomposition in a diagrammatic manner

(D. R. Howe's book above also covers normalization)


Other References
----------------

Aranow, E.B.  Developing Good Data Definitions, "Database 
	Programming ad Design," Vol. 2, No. 8 (Aug. 1989), pp. 36-39.

Codd, E.F.  Is Your Database Really Relational? "Computerworld," 
	(Oct. 14, 1985).

Cutts, G.  "SSADM: Structured Systems Analysis and Design Methodology,"
	Van Nostrand, 1988.

Date, C.J.  A Practical Approach to Database Design, Chapter 19 
	of "Relational Database: Selected Writings," Addison-Wesley, 1986.

Davis, C. (editor) "Entity-Relationship Approach to Software Engineering'
                (publ: North Holland 1983)
                see chapter 1: Kent W., 'Fact-Based Data Analysis and Design'

-introduces a connectivity graph approach

Finkelstein, Richard R.  No Easy Road to Good Design, "Database 
	Programming and Design," premier issue, 1987, pp. 9-14.

Inmon, W.H.  Optimizing Performance with Denormalization, "Database 
	Programming and Design," premier issue, 1987, pp. 34-39.

Kent, W,        "Data and Reality",
                (publ: North Holland 1978)

- a very readable book on the problems of modelling data

Martin, J., and C. McClure.  "Structured Techniques: The Basis for 
	CASE," Prentice Hall, 1988.

Ross, R.g.  "Entity Modeling: Techniques and Application," Database 
	Research Group, 1987.

Sclaer, S., and S. Mellor.  "Object Oriented Systems Analysis," 
	Prentice Hall, 1988.

Weinberg, G.M.  WIGGLE Charts a Sketching Tools for Designers, 
	Part VII of "Rethinking Systems Analysis & Design," Dorset House, 
	1988.

Roussopoulos N. and Yeh, R. "An Adaptable Methodology for Database Design"
                IEEE Computer , vol. 17, no. 5 May 1984

- a good overview on all aspects of DB design

(note ACM above refers to the Association for Computing Machinery,
        ACM, 11 west 42nd St, New York, 10036)


Summary
-------
Two approaches to DB design have been mentioned with references that  go
into  detail.  However, users without any DB experience would be advised
to attend the INGRES Application Development Class that introduces  this
subject.  Also  available are the Ingres Application Note #3 and the DBA
guide (available soon).

It should be pointed out that when performance is an issue, the rules of
DB  design  are  sometimes  adjusted to bias the design towards certains
classes of operation. In particular  'controlled  redundancy'  might  be
included.  This is safe as long as all access to the database is through
Equel or OSL applications that manage the consistency of any redundancy.
This topic is covered in 'Ingres Advanced Performance Seminar.'



Releases affected:     all(all.all) -   Releases not affected: 
Errors:                                                             
Bugs/SIRS:                                                             
------------------------------------------------------------------------------
Ingres Database Reference
To William's Home Page

© William Yuan 2000

Email William