Data Modeling

What is a data model?

A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be composed of several other elements which, in turn, represent the color and size of the car and define its owner.
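As a minimal sketch of the car example above, the composition could be written with Python dataclasses. The class and field names (`Owner`, `Car`, `length_m`) and the choice of length as the measure of size are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Owner:
    """A person who owns a car (hypothetical attributes)."""
    name: str


@dataclass
class Car:
    """A car data element composed of other data elements."""
    color: str       # element representing the car's color
    length_m: float  # element representing the car's size (length in meters is an assumption)
    owner: Owner     # element defining who owns the car


car = Car(color="red", length_m=4.2, owner=Owner(name="Alice"))
print(car.owner.name)
```

The point of the sketch is only that the model fixes which elements make up a car and how they relate, independent of any particular car.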

The term data model can refer to two distinct but closely related concepts. Sometimes it refers to an abstract formalization of the objects and relationships found in an application domain: for example, the customers, products, and orders found in a manufacturing organization. At other times it refers to the set of concepts used in defining such formalizations: for example, concepts such as entities, attributes, relations, or tables. So, the “data model” of a banking application may be defined using the entity-relationship “data model”.

A data model organizes data elements and standardizes how the data elements relate to one another. Since data elements document real-life people, places, and things and the events between them, the data model represents reality. Data models are often used as an aid to communication between the business people defining the requirements for a computer system and the technical people defining the design in response to those requirements. They are used to show the data needed and created by business processes.

A data model refers to the logical interrelationships and data flow between the different data elements involved in the information world. It also documents the way data is stored and retrieved. Data models facilitate communication between business and technical development teams by accurately representing the requirements of the information system and by designing the responses needed for those requirements. Data models help represent what data is required and what format is to be used for different business processes.

Why is a data model important?

  • A data model ensures that all data objects required by the database are accurately represented. Omitting data leads to faulty reports and incorrect results.
  • A data model helps design the database at the conceptual, logical, and physical levels.
  • The data model structure helps to define the relational tables, primary and foreign keys, and stored procedures.
  • It provides a clear picture of the base data and can be used by database developers to create a physical database.
  • It is also helpful to identify missing and redundant data.
  • Though the initial creation of a data model is labor- and time-consuming, in the long run it makes IT infrastructure upgrades and maintenance cheaper and faster.
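To illustrate how a data model translates into relational tables with primary and foreign keys, here is a small sketch using Python's built-in `sqlite3` module. The table and column names (`customer`, `customer_order`, `total_cents`) are hypothetical, and SQLite is used only because it needs no setup; it does not support stored procedures, so those are left out.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total_cents INTEGER NOT NULL
);
""")

conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Acme Ltd')")
conn.execute(
    "INSERT INTO customer_order (order_id, customer_id, total_cents) VALUES (10, 1, 2500)"
)

# An order that points at a missing customer violates the foreign key.
try:
    conn.execute(
        "INSERT INTO customer_order (order_id, customer_id, total_cents) VALUES (11, 99, 100)"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

Declaring the keys in the schema lets the database itself reject rows that contradict the model, rather than relying on every application to check.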

What is data modeling?

Data modeling is a process for defining and ordering data for use and analysis by certain business processes. The goal of data modeling is to produce high quality, consistent, structured data for running business applications and achieving consistent results. Data modeling is influenced by the type of IT systems used and the objectives of the organization involved. Data modeling encompasses three progressive steps:

  • Conceptual: to define the model’s parameters;
  • Logical: to structure the data for use; and,
  • Physical: to produce sets of data in a common format for analysis and use by the organization.
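The three steps above can be sketched in Python as successive refinements of one small model. This follows one common reading of the conceptual, logical, and physical levels; the entity names, the `Column` helper, and the `to_ddl` function are all hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass

# Conceptual level: which entities exist and how they relate (no attributes or types yet).
conceptual = {
    "entities": ["Customer", "Order"],
    "relationships": [("Customer", "places", "Order")],
}


# Logical level: attributes, types, and keys, independent of any specific database.
@dataclass
class Column:
    name: str
    type: str
    primary_key: bool = False


logical = {
    "customer": [
        Column("customer_id", "integer", primary_key=True),
        Column("name", "text"),
    ],
    "customer_order": [
        Column("order_id", "integer", primary_key=True),
        Column("customer_id", "integer"),
    ],
}


# Physical level: concrete DDL for a target system (generic SQL here, as an assumption).
def to_ddl(table: str, columns: list[Column]) -> str:
    parts = []
    for col in columns:
        suffix = " PRIMARY KEY" if col.primary_key else ""
        parts.append(f"{col.name} {col.type.upper()}{suffix}")
    return f"CREATE TABLE {table} ({', '.join(parts)});"


for table, columns in logical.items():
    print(to_ddl(table, columns))
```

Each level adds detail to the one before it, so a change in requirements can usually be traced from the conceptual description down to the generated tables.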

Data Model Examples

There are two primary approaches to data modeling: top-down and bottom-up.

  • Top-down data modeling starts with a question and works to arrive at an answer. For example, an organization may believe that certain inefficiencies are costing it money, and so it gathers data from its various departments to analyze resource utilization and identify ways to eliminate redundancies and better allocate available capital.
  • Bottom-up data modeling starts with the collection of data and seeks to find interesting insights and correlations. For example, by running all of its transactional data, an organization might learn that customers are more likely to respond.

Top-down and bottom-up data modeling are complementary. Bottom-up modeling may reveal questions that top-down modeling can then answer; and top-down modeling may be used to identify different sources and combinations of data for analysis.
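The difference between the two approaches can be shown on a toy data set. In this rough sketch the records, field names, and the question being asked are all invented for illustration; real analyses would use far more data and more careful methods.

```python
from collections import Counter

# Hypothetical transaction records; all values are made up for illustration.
transactions = [
    {"department": "sales", "channel": "email", "amount": 120},
    {"department": "sales", "channel": "phone", "amount": 80},
    {"department": "support", "channel": "email", "amount": 40},
    {"department": "sales", "channel": "email", "amount": 60},
]

# Top-down: start from a specific question and use only the data that answers it.
# Question: how much did each department spend?
spend_by_department = Counter()
for t in transactions:
    spend_by_department[t["department"]] += t["amount"]

# Bottom-up: start from all the data, tabulate every categorical field,
# then look for anything that stands out.
value_counts = {
    field: Counter(t[field] for t in transactions)
    for field in ("department", "channel")
}

print(spend_by_department.most_common(1))
print(value_counts["channel"].most_common(1))
```

The top-down pass touches only the fields its question needs, while the bottom-up pass summarizes everything and leaves the interpretation for later.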

Data modeling has been used to achieve better outcomes in nearly every industry, including:

  • Business management
  • Civil engineering
  • Education
  • Environmental science
  • Financial services
  • IT management
  • Manufacturing
  • Marketing
  • Research and development
  • Software development
  • Transportation and logistics

What are some data modeling best practices?

Because data modeling has many applications but can be difficult to understand, teams tasked with data modeling for their organization should gain buy-in from the groups they will be working with by communicating the benefits of cooperation in non-technical terms. Once buy-in has been achieved, focus on simpler tasks first to demonstrate success and cultivate champions. Other best practices to follow include:

  • Outline processes and objectives in advance
  • Document each step of your process to facilitate course correction and ensure replicability
  • Work collaboratively with all associated teams in order to leverage domain expertise
  • Revisit your model regularly and modify based on updated objectives and data sources

Why is having a common data model important?

In a large, complex enterprise, there are many diverse sources of structured, semi-structured, and unstructured data. All of it may be useful for understanding operations, but if the formats are inconsistent, applying the data will produce inconsistent and unreliable results. Data modeling transforms diverse data sets into a common format, maximizing their usefulness to the business processes to which they are applied.
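As a minimal sketch of mapping differently formatted sources into one common format, the Python example below converts two hypothetical exports into a shared `Customer` record. The source names ("CRM" and "billing"), their field names, and their date formats are assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Customer:
    """The common format every source is mapped into (hypothetical fields)."""
    customer_id: str
    name: str
    signup_date: date


def from_crm(row: dict) -> Customer:
    # Hypothetical CRM export: separate name fields, ISO-formatted dates.
    return Customer(
        customer_id=str(row["id"]),
        name=f"{row['first_name']} {row['last_name']}",
        signup_date=date.fromisoformat(row["created"]),
    )


def from_billing(row: dict) -> Customer:
    # Hypothetical billing export: single name field, month/day/year dates.
    return Customer(
        customer_id=str(row["cust_no"]),
        name=row["full_name"].strip(),
        signup_date=datetime.strptime(row["signup"], "%m/%d/%Y").date(),
    )


customers = [
    from_crm({"id": 7, "first_name": "Ada", "last_name": "Lovelace", "created": "2021-03-15"}),
    from_billing({"cust_no": "B-19", "full_name": " Alan Turing ", "signup": "06/23/2020"}),
]

for c in customers:
    print(c)
```

Once every source is converted at the boundary, downstream reports and applications only need to understand the one shared structure.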
