Data Modeling Standards and Guidelines

Listen to this article

Logical data models and database models are essential components of technology today. The following lists data modeling standards and guidelines in addition to outlining some data modeling principles.

Conceptual Data Modeling (CDM)

  • CDM consists of data entities and their relationships.
  • CDM describes key business information by subject area from a data perspective.
  • CDM should be divided into subject areas of manageable size. In practical terms, this means a model usually has between 4 and 15 entities per subject area.
  • Every subject area must have a unique title.

Logical Data Modeling (LDM) Standards and Guidelines

  • A corresponding logical data model also has associative classes to resolve many-to-many relationships, is fully attributed, and is normalized to Third Normal Form (3NF).
  • If a CDM was used as a foundation for adding details to develop a logical data model, then the non-specific relationship line between entities will be replaced with identifying or non-identifying relationships.
  • A LDM also shows all native (that is, non-foreign key) primary key attributes and non-key attributes in the attribute area.
  • A fully attributed logical data model will be in Third Normal Form (3NF). This means that each entity instance has exactly one unique record. All non-key attributes fully depend on primary key attributes, and no non-key attributes depend on any other non-key attributes.
  • Depending on the particular logical data modeling methodology and tool used, there are a number of acceptable ways to indicate cardinality or multiplicity on the ends of relationships between two equally important entities.

Physical Data Model (PDM) Standards and Guidelines

  • Designate a unique primary key column for every table.
  • Each column name should contain all of the elements of the logical attribute from which it was derived, but should be abbreviated to fit within the maximum length.
  • Do not use hyphens in table or column names because some programming languages interpret hyphens as subtraction operators.
  • Implement table and column names in a way that is supported by all target DBMS tools.
  • The physical model will assign lengths and data types to all columns. Data types should be specific to the target DBMS tool.
  • The physical data model will, at a minimum, provide examples of possible values for identifier, indicator, and code columns.
  • A certain amount of demoralization is usually necessary when implementing the physical data model.
  • Estimate the expected storage requirements for each table based on the size of each row, expected growth, number of rows, and archiving requirements.
  • Understand the capabilities of the specific database product. Performance improvements may be realized by taking advantage of features such as clustered indices, caching, and index optimization.