Data Modeling 101

Sean Salleh 17 Jun 2013 Modeling methods

Think data modeling is just for the geeks? Think again! Deep down in the IT layers, data models can make Egyptian hieroglyphs look like a Dr. Seuss book by comparison (like ‘The Cat in the Hat’ – a classic!), but everybody should still know a little about data modeling. The reason is that a data model describes how the information in an enterprise is recorded and manipulated. If your data model is poor, meaning that it does not reflect reality, your information processing will be faulty and your business decisions are likely to be unsound. Business people and technical people must therefore work together to produce clear, good-quality data models.

Rough data model and spreadsheet. Image source: Flickr.com

What Does a Data Model Do?

A data model is typically a diagram that shows the relationships between different pieces of data. When business people need software to perform a certain function, the data model represents the business need in a form that business people and IT engineers can discuss together. It bridges the gap between the two worlds by showing how different data are related: for example, ‘Customer Order’ is linked to ‘Shipping’ and then to ‘Billing’, but also to ‘Warehouse Inventory’, which in turn is linked to ‘Production’. Depending on the system concerned, a data model can be more or less complex. Either way, a good data model is a solid start toward a satisfactory IT system.
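The linked entities in the example above can be sketched in a few lines of Python. This is only an illustration: the entity names come from the text, while the dictionary layout and the `downstream` helper are assumptions made for this sketch, not part of any particular modeling tool.

```python
# Each entity maps to the entities it links to directly.
# The names come from the example in the text; the structure is hypothetical.
DATA_MODEL = {
    "Customer Order": ["Shipping", "Warehouse Inventory"],
    "Shipping": ["Billing"],
    "Billing": [],
    "Warehouse Inventory": ["Production"],
    "Production": [],
}


def downstream(entity, model=DATA_MODEL):
    """Return every entity reachable from `entity`, in discovery order."""
    seen = []
    # Reverse so that links are visited in the order they are listed.
    stack = list(reversed(model[entity]))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(reversed(model[current]))
    return seen


if __name__ == "__main__":
    # Which parts of the business does a customer order touch?
    print(downstream("Customer Order"))
```

Even a sketch this small lets business and IT colleagues check together whether a link is missing or points the wrong way.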

 

Diagram of business data model. Image source: wikimedia.org

 

Recognized as Good Development Practice

Modern software development standards take business priorities into account. The American standard IEEE 1074-1995, for example, specifies the following three documents to help keep business goals and technical activities connected:

  • System requirements specification (overall/business oriented).
  • Software requirements specification (what the software will do, but not how).
  • Software description document (how the software does its job).

Standards like this express good practice, rather than best practice. Nevertheless, their targeted application to a software development project can bring significant improvement for relatively little cost.

 

Different Types of Data Model

Types of data models include:

  • Relational Model. Organizes data into tables (relations) of rows and columns, with keys linking related tables. Already close to the design of the software itself, and usually drawn as a diagram with pre-defined graphical symbols.
  • Graph Model. Based on graph theory, with entities as nodes and relationships as edges. Again, already close to the IT design itself.
  • Hierarchical Model. Arranges data as a tree in which each record has a single parent, much like the organizational charts commonly used by HR. Closer to the business point of view, although IT colleagues may need more detail.
  • Network Model. Similar to the Hierarchical Model, but a record can have multiple parents, so multiple links between entities give more detail on relationships.
  • Dimensional Model. Builds on the Relational Model by pairing measured facts with descriptive dimensions (such as time, product or region) that help explain business impact.
  • Object Relational Model. A newer model with a technical bias that adds object-oriented features to the Relational Model, now finding a niche in engineering and scientific sectors.
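To make two of these types more concrete, here is a minimal Python sketch that contrasts a relational layout with a hierarchical one. The table names, column names and sample rows (`customer_order`, `shipping`, 'Acme', 'Example Freight') and the small organization tree are invented for illustration; only the standard-library `sqlite3` module is used.

```python
import sqlite3


def build_relational_example():
    """Relational model: two tables linked by a foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customer_order ("
        " id INTEGER PRIMARY KEY,"
        " customer TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE shipping ("
        " id INTEGER PRIMARY KEY,"
        " order_id INTEGER NOT NULL REFERENCES customer_order(id),"
        " carrier TEXT NOT NULL)"
    )
    # Hypothetical sample rows.
    conn.execute("INSERT INTO customer_order (id, customer) VALUES (1, 'Acme')")
    conn.execute(
        "INSERT INTO shipping (id, order_id, carrier) "
        "VALUES (10, 1, 'Example Freight')"
    )
    return conn


def shipments_for(conn, customer):
    """Follow the order-to-shipping relationship with a join."""
    rows = conn.execute(
        "SELECT s.carrier FROM shipping AS s "
        "JOIN customer_order AS o ON o.id = s.order_id "
        "WHERE o.customer = ? ORDER BY s.id",
        (customer,),
    ).fetchall()
    return [carrier for (carrier,) in rows]


# Hierarchical model: every node has exactly one parent, like an org chart.
ORG_TREE = {
    "name": "Company",
    "children": [
        {"name": "Sales", "children": []},
        {"name": "Operations", "children": [
            {"name": "Warehouse", "children": []},
        ]},
    ],
}


def depth(node):
    """Number of levels in the tree rooted at `node`."""
    return 1 + max((depth(child) for child in node["children"]), default=0)
```

In the relational version the relationship lives in a key shared between tables, whereas in the hierarchical version it is implied by nesting. That difference is one reason a tree is easy to read but awkward when an entity needs more than one parent.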

 

The Relationship Between Data Modeling and Predictive Modeling

Predictive modeling resembles data modeling in that it identifies a number of factors likely to influence future developments or behavior. In the example above, ‘Customer Order’ might also feed a model of the estimated share price of the business, which is also determined by ‘Billing’, ‘Warehouse Inventory’ and ‘Production’. Predictive modeling may differ from other types of data modeling in that the software model used to predict future results may be updated as additional information becomes available.
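The idea of a model that is updated as information arrives can be illustrated with a small Python sketch. It fits a straight line by ordinary least squares from running totals, so each new observation refines the prediction without reprocessing old data. This technique is chosen here purely for illustration and is not one the article prescribes; the class name `RunningLinearModel` and the sample numbers are hypothetical.

```python
class RunningLinearModel:
    """Least-squares line y = a + b*x, kept up to date from running sums."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.sum_xx = 0.0
        self.sum_xy = 0.0

    def update(self, x, y):
        """Fold one new observation into the model."""
        self.n += 1
        self.sum_x += x
        self.sum_y += y
        self.sum_xx += x * x
        self.sum_xy += x * y

    def predict(self, x):
        """Predict y for a given x using all observations seen so far."""
        if self.n == 0:
            raise ValueError("no observations yet")
        mean_x = self.sum_x / self.n
        mean_y = self.sum_y / self.n
        var_x = self.sum_xx / self.n - mean_x ** 2
        if var_x == 0:
            # Not enough spread in x to estimate a slope; fall back to the mean.
            return mean_y
        cov_xy = self.sum_xy / self.n - mean_x * mean_y
        slope = cov_xy / var_x
        return mean_y + slope * (x - mean_x)


if __name__ == "__main__":
    model = RunningLinearModel()
    # Hypothetical (orders, billing) pairs arriving one at a time.
    for orders, billing in [(1, 2), (2, 4), (3, 6)]:
        model.update(orders, billing)
        print(model.predict(4))
```

Because the model stores only five running totals, adding information is cheap. A production forecasting model would be far richer, but the update-then-predict cycle is the same.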

 

Diagram of data modeling process. Image source: wikimedia.org

 

Because a predictive model may be extended in this way, make sure the software modeling solution you use can handle changes without requiring extensive rework.

Sean Salleh

Sean Salleh is a data scientist with experience in guiding marketing strategy by building marketing mix models, forecasting models, scenario planning models, and algorithms. He is passionate about consumer technologies and resource management. He has master's degrees in Operations Research from the University of California, Irvine and in Mathematics from Northeastern University.