Tag: Design
Five Common Dimensional Modeling Mistakes and How to Solve Them
When designing your dimensional model, it is worthwhile to watch out for mistakes that commonly occur during the process. Specifically, they can occur in the relationships between tables, both in fact-to-dimension and dimension-to-dimension relationships. In this post, we’re going to take a closer look at five common modeling mistakes and what you can do about them.
As you start a BI-related project, bulletproof dimensional design is hugely important. What makes a design bulletproof is the early mitigation of common design mistakes.
Using Configuration Tables to Define the Actual Workflow
The first part of this series introduced some basic steps for managing the lifecycle of any entity in a database. Our second and final part will show you how to define the actual workflow using additional configuration tables. This is where the user is presented with allowable options each step of the way. We’ll also demonstrate a technique for working around the strict reuse of ‘assemblies’ and ‘sub-assemblies’ in a Bill of Materials structure.
Crow’s Foot Notation in Vertabelo
Various ERD notations follow different styles for entities, relationships, and attributes. Usually there isn’t much standardization between them, so notations bear little resemblance to each other. Among the plethora of ERD diagram notations, crow’s foot notation is definitely the most used. In this article, we’ll investigate its components within the Vertabelo database model.
Before we start looking into crow’s foot notation, we must understand that there are various levels of Entity-Relationship diagrams: conceptual data model – an overview of what should be included in the general database model.
Using Workflow Patterns to Manage the State of Any Entity
Have you ever come across a situation where you need to manage the state of an entity that changes over time? There are many examples out there. Let’s start with an easy one: merging customer records.
Suppose we are merging lists of customers from two different sources. We could have any of the following states arise: Duplicates Identified – the system has found two potentially duplicate entities; Confirmed Duplicates – a user validates the two entities are indeed duplicates; or Confirmed Unique – the user decides the two entities are unique.
13 Blog Articles on Database Design Best Practices and Tips
There’s a lot to keep in mind when you’re designing a database, and very few of us can remember every valuable tip and trick we’ve learned. So, let’s take a look at some online resources that feature database design tips and best practices. As we go, I’ll share my own opinions on the ideas presented, based on my experience in database design. Obviously, this article is not an exhaustive list, but I’ve tried to review and comment on a cross section of sources.
Viewing Holidays With Data Modeler’s Eyes
Celebration!! Family time!! Long drive!! A day at the beach!! All these words buzz into our minds when we think of holidays. Have you ever considered how a multinational company keeps track of holidays across the globe? There must be a data dictionary to maintain all these details so that they can ensure seamless business with their local partners.
This article will explain such a data model.
The Project Requirements in a Nutshell I have quite simple and straightforward requirements this time.
19 Online Resources for Learning About Database Design Errors
We all make mistakes, and we can all learn from other people’s mistakes. In this post, we’ll take a look at numerous online resources for avoiding poor database design that can lead to many problems and cost both time and money. And in an upcoming article, we’ll tell you where to find tips and best practices.
Database Design Errors and Mistakes to Avoid There are numerous online resources to help database designers avoid common errors and mistakes.
Renting out Cars Is as Simple as Driving: A Data Model for a Car Rental Company
You might have rented a car on your last vacation. You reserved your car online, and then picked it up from its designated location after paying all the previously-agreed charges. Once you were done, you returned it to the agency and perhaps paid some additional fees. Did you ever think about the system that makes all these things happen? In this article, we’ll look at a data model for a car rental system.
Database Modeling
I wrote a song about dental floss but did anyone’s teeth get cleaner?
Frank Zappa
When we think of the dental office, our first associations are the drill, the pain, and the fear. OK, that sounds bad. Besides taking care of teeth, a dentist has many other obligations that are professional, legal, or both. All of them require proper data management.
To meet this documentation requirement, many dental and medical offices use paper records.
Database Modeling Course (2)
You’re finally ready to get down to real data modeling. We’ll start off with entities and their attributes. Entities are the basic building block of every data model. In this post, you’ll find out what they are and how to identify them.
What Is an Entity? What is a Specific Instance of an Entity? Data models help us to identify what kind of information we’ll store in our system. We use such models to address the question What will the data in our system be about?
Naming Conventions in Database Modeling
What’s In A Name? The Database Edition Database models require that objects be named. While several facets of naming an object deserve consideration, in this article we’ll focus on the most important one: defining a convention and sticking to it.
Why Use Naming Conventions? Look at the database model below. I went a bit overboard and removed as many traces of a naming convention as I could. This proves my first point: a naming convention is an important part of a well-built data model.
Building an Information Mart With Your Data Vault
In the 3rd post in this series, we looked at how we prepare data for use with a concept called the Business Data Vault. Now, in this final part, I will show you the basics of how we project the Business Vault and Raw DV tables into star schemas which form the basis for our Information Marts.
Raw Data Mart vs. Information Marts As of Data Vault 2.0, the terminology changed a bit to be more precise.
Creating Tables for Products and Services
Buying books was a way anyone could acquire a work of art for very little.
Solomon “Sol” LeWitt, American artist, 1928–2007
Selling products and services can be two very different propositions. This originates in their differing definitions and real-world representations. Previously in this series, we discussed the table basics in the context of database design and sales. In this post, we’ll analyze the differences between products and services, how they impact the database model, and how we can accommodate both on one database.
The Business Data Vault
In my last post, we looked at the basics of modeling your data warehouse using the Data Vault 2.0 technique. This time, we get into the details of how we prepare the DV tables for business user access.
What is a Business Data Vault? If you have done any investigation into Data Vault on various blogs or the LinkedIn discussion group, you have seen a few terms used that often cause confusion.
Applying Simple Access Control to a Small Data Model
“Information is the lifeblood of any organization…” We hear a lot of statements like this, or about an “information age,” or an “information economy.” When we agree with belief that amplifies the importance of information in the world today, we have to consider how to make that all-important information secure. Who can see my bank account? Was the facilities maintenance contract lost? Why can’t I get the latest lab report?
Data Vault 2.0 Modeling Basics
In my last post, we looked at the need for an Agile Data Engineering solution, issues with some of the current data warehouse modeling approaches, the history of data modeling in general, and Data Vault specifically. This time we get into the technical details of what the Data Vault Model looks like and how you build one.
For my examples I will be using a simply Human Resources (HR) type model that most people should relate to (even if you have never worked with an HR model).
Database Modeling Course (1)
Data modeling is an essential step in the process of creating any complex software. It helps developers understand the domain and organize their work accordingly. In this article, which begins a new series devoted to database modeling, we’ll try to convince you why you should include it in your projects and what it looks like.
Do I Really Need Data Modeling? As a novice developer, you often start your programming adventure with simple applications like the sieve of Eratosthenes or enumerating the Fibonacci sequence.
Tips for Better Database Design
Over the years, working as a data modeler and database architect, I have noticed that there are a couple rules that should be followed during data modeling and development. Here I describe some tips in the hope that they might help you. I have listed the tips in the order that they occur during the project lifecycle rather than listing them by importance or by how common they are.
What Is the Actual Definition of First Normal Form (1NF)?
The First Normal Form (1NF) is exceptional. The other normal forms (2NF, 3NF, BCNF) talk about functional dependencies and 1NF has nothing to do with functional dependencies. Moreover, we have precise definitions for other normal forms and there is no generally accepted definition of 1NF.
Does 1NF Equate to “Atomic Attributes”? When you look at various descriptions of 1NF the word that you see most often is atomic. It is common to say that a relation is in 1NF if all its attributes are atomic.
Tackling Your Troubles – Building a Bug and Problem Database
Death and taxes – add “software problems” to that list of the inevitable. There is always a new issue, a new failure, a new key opportunity that an organization must address. And to avoid repeating the problems, or to revise your prior fixes, it is critical to capture the problems accurately and completely. You need a history of what happened and when. In this piece, we create the logical model for a problem or “bug” reporting system.
Defining Identifying and Non-Identifying Relationships in Vertabelo
Various data modeling tools allow modelers to define relationships in a data-model as identifying or non-identifying. We can define a relationship as identifying or non-identifying in Vertabelo as well. This article will explain the way to do so.
Introduction Before moving ahead with the article, I’d like to explain what identifying or non-identifying mean.
Let’s take a real time example of a book storing system. In the system, a book belongs to an owner, and an owner can own multiple books.
Managing Roles and Statuses in a System
There are many ways to solve a problem, and that’s the case with administering roles and user statuses in software systems. In this article you’ll find a simple evolution of that idea as well some useful tips and code samples.
Basic Idea In most systems, there is usually a need to have roles and user statuses.
Roles are related to rights that users have while using a system after successfully logging in.
How to Keep Track of What the Users Do
In several of the projects we have worked on, customers have asked us to log more user actions in the database. They want to know all of the actions the users perform in the application, but capturing and recording all human interactions can be challenging. We had to log all modifications of data performed via the system. This article describes some of the pitfalls we encountered and the approaches that we used to overcome them.
Email Confirmation and Recovering Passwords
Modern applications have plenty of authentication features beside registration and login. In this article we will take a look at how to design the database for two such features: email confirmation and password recovery.
Email Confirmation What Is It? Most people familiar with the Internet know what an activation email is. An activation email is sent to the user after he or she registers for an account on a website or web application and contains a link that will allow the user into the system.
7 Common Database Design Errors
Why Talk About Errors? Model Setup 1 – Using Invalid Names 2 – Insufficient Column Width 3 – Not Indexing Properly 4 – Not Considering Possible Volume or Traffic 5 – Ignoring Time Zones 6 – Missing Audit Trail 7 – Ignoring Collation Why Talk About Errors? The art of designing a good database is like swimming. It is relatively easy to start and difficult to master.
How to Store Authentication Data in a Database. Part 1
How difficult is it to program a user login function for an application? Novice developers think it’s very easy. Experienced developers know better: it is the most sensitive process in your application. Errors in login screens can lead to serious security issues. In this article we take a look at how to store authentication data in your database.
The most common way to authenticate users nowadays is with user name and password.
Database Design: More Than Just an ERD
Database design is the process of producing a detailed model of a database. The start of data modelling is to grasp the business area and functionality being developed.
Before Modeling: Talk to the Business People This is a key principle in information technology. We must remember that we provide a service and must deliver value to the business. In data modeling that means solving a business problem from the data-side such that the required data is available in a responsive and secure way.
Spider Schema – a Bridge Between OLTP and OLAP?
Introduction As I mentioned in my article “OLAP for OLTP practitioners”, I am working on a project that needs to create an analytical database for on-line analytical processing (OLAP). I have mostly worked with on-line transaction processing (OLTP) with some limited reporting features. OLAP is a new area for me. In OLAP, the main focus of the database itself is simply to store data for analysis; there is limited maintenance of data.
A Book Review
Book and Author Importance of Data Model Quality Takeoff Checklist Merciless Review Merciless Humiliation? Is it Agile? Conclusion Book and Author Today I’m going to review “A Check List for Doing Data Model Design Reviews” by Kent Graziano. This publication is available as an e-book on Amazon.com.
The book is very short – it will take you less than an hour to read it. But don’t let the small volume mislead you.
OLAP for OLTP Practitioners
I am currently working on a project where we need to create a database that will be primarily used to store data for reporting and forecasting. In the past, I have mostly worked with databases used for typical CRUD (create, retrieve, update, and delete) operations of data with some limited reporting features. When performing CRUD operations, normalization is important; while in analytics, a de-normalized structure is generally preferred.
5 Steps for an Effective Database Model
Database design is the process of producing a detailed model of a database. This model contains the necessary logical (table names, column names) and physical (column datatypes, foreign keys) choices to translate the design into a data definition language (aka SQL), which can be used to create the actual physical database.
When I need to create the design for a new database, in other words, the data layer for an application, I follow a few mental steps that I think can help others when they need to go through the same process.
5 Must-Read Database Modeling Books?
I recently realized that our database modeling library could use a few more advanced titles. So I headed over to Amazon to see what they had on offer.
There are plenty of introductory books for beginners that tell you how to normalize data, and introduce you to indexes, but what about something for the professional, grown-up database modeler? Here are 5 of the best database modeling books I found (listed in no particular order) that go beyond the basics and come highly recommended by Amazon reviewers.
How Do You Make Your Database Speak Many Languages?
The Scenario You are the owner of an online store, located in Poland. The majority of your customers are from Poland and they speak Polish. But you want to sell your products abroad too and your international customers mainly speak English. So you want your online store to be available in both Polish and English. You also expect that your products will sell well in France, so you anticipate that you’ll have to prepare a French version of the store as well (and maybe Spanish too, because why not?
UML Notation
UML is popular for its notations. We all know that UML is for visualizing, specifying, and documenting the components of software and non software systems. What’s more, UML has many types of diagrams which are divided into two categories. Some types represent structural information, others general types of behaviors. Among these, there is one that is commonly used for entity relationship diagrams.
In UML, an entity is represented by a rectangle:
Arrow Notation
Arrow notation has become one of the less recognized notations in entity relationships diagrams in recent years. Let’s discuss its elements.
Entity and relationships As you can see below, an entity is always represented by a rectangle, which is common to most notations (there isn’t a distinction if it is dependent or independent entity). Relationships and cardinality are represented by various combinations of arrows as the diagram below presents.
IDEF1X Notation
IDEF1X (Integration DEFinition for Information Modeling) is a method for designing relational databases with a syntax that supports constructs in developing conceptual schema.
Not everyone knows that this notation has an interesting history. Indeed, the need for semantic data models was first recognized by the U.S. Air Force in the mid-1970s. As a result, the ICAM Program came into being (It identified a need for better analysis and communication techniques for people involved in improving manufacturing productivity), that later developed a series of techniques known as the IDEF; IDEF1X being one of them.
Chen Notation
Continuing our trip through different ERD notations, let’s review the Chen ERD notation.
Peter Chen, who developed entity-relationship modeling and published his work in 1976, was one of the pioneers of using the entity relationship concepts in software and information system modeling and design. The Chen ERD notation is still used and is considered to present a more detailed way of representing entities and relationships.
Entities An entity is represented by a rectangle which contains the entity’s name.
Barker’s Notation
When looking at different kinds of ERD notations, it is hard not to come across Barker’s ERD notation, which is commonly used to describe data for Oracle. Richard Barker and his coworkers developed this ERD notation while working at the British consulting firm CACI around 1981, and when Barker joined Oracle, his notation was adopted.
Let’s take a closer look at Barker’s syntax.
The most important components in the ERD diagram are:
ERD Notations in Data Modeling
An entity relationship diagram (ERD) is a diagram that defines the structure of database instances. Choosing which notation to use is typically left up to personal preference or conventions. Here, you can find some useful information about each notation:
Part 1 – Barker’s Notation Part 2 – Chen Notation Part 3 – IDEF1X Notation Part 4 – Arrow Notation Part 5 – UML Notation Part 6 – Crow’s Foot Notation Which ERD notation are you using?
How to Model Inheritance in a Relational Database
In the process of designing our entity relationship diagram for a database, we may find that attributes of two or more entities overlap, meaning that these entities seem very similar but still have a few differences. In this case, we may create a subtype of the parent entity that contains distinct attributes. A parent entity becomes a supertype that has a relationship with one or more subtypes.
First, let’s take a closer look at a simple class diagram.
N-ary relationship types
When we design a database, we draw an entity relationship diagram (ERD). It helps us understand what kind of information we want to store and what kind of relationships there are. It is imperative that this diagram is easy to read and understand.
The number of entities in a relationship is the arity of this relationship. The aim of this article is to give some examples and show how big an impact the arity of relationships has on not only the readability of the diagram, but also the database itself.
Database Design 101: How to Start Creating a Database Model
In this video you will learn how to start creating your database model. You will find out why nouns are important and how you should handle them when creating a database model.
If you want to learn more, read our beginner tutorial on how to create a database model.
iframe.video-plugin { width: 735px; height: 415px; border: 0px solid #CCC; margin: 0px; } @media all and (max-width: 767px) { iframe.
The Boyce-Codd Normal Form (BCNF)
Why do you need all of this normalization stuff? The main goal is to avoid redundancy in your data. Redundancy can lead to various anomalies when you modify your data. Every fact should be stored only once and you should know where to look for each fact. The normalization process brings order to your filing cabinet. You decide to conform to certain rules where each fact is stored.
Nowadays the go-to normal forms are either the Boyce-Codd normal form (BCNF), which we will cover here today, or the third normal form (3NF), which will be covered later.
Update Anomalies
Let’s take a look at the following table:
Customer Purchase date Product name Amount Price Total price Joe Smith 2014-02-14 Yoga mat 1 80 80 Jane Bauer 2014-02-16 Yoga block 2 30 60 Joe Smith 2014-02-14 Yoga block 2 30 60 Joe Smith 2014-02-14 Yoga strap 1 10 10 Thomas Apple 2014-02-18 Dumbbells 2kg 2 30 60 Jane Bauer 2014-02-16 Yoga mat 1 80 80 What’s wrong with this table?
How to Create a Database Model From Scratch
So you want to create your first database model but you don’t know how to start? Read on!
I assume you already know a little about tables, columns, and relationships. If you don’t, watch our video tutorials before you continue.
Start With a System Description You should always start creating a database model with a description of a system. In a classroom situation, a system description is given to you by a teacher.
Database Design 101: References
In this video tutorial you will learn about references – how to create a relationship between the tables, how it affects their structure, and how it looks in the data.
iframe.video-plugin { width: 735px; height: 415px; border: 0px solid #CCC; margin: 0px; } @media all and (max-width: 767px) { iframe.video-plugin { width: 250px; height: 141px; } }
Designing a Database: Should a Primary Key Be Natural or Surrogate?
Suppose we design a database. We’ve created some tables, each one has a few columns. Now we need to choose columns to be primary keys (PK) and make references between tables. And here some inexperienced designers face the dilemma – should a primary key be natural or surrogate?
There’s one and only one answer to that question: it depends. If anyone ever tried to convince you that you should have only natural keys or only surrogate keys, just smile :)
Good news: we now support views
Today we are happy to announce that Vertabelo has a new feature we've been working on for some time – views. Now you can easily add them to the diagram, change their definition and other properties and have them generated in your SQL scripts.
Quick overview of views Let's take a look at how simple it is to create a new view in our editor.
Our goal is to make a view which joins three tables as follows:
Five reasons why we built Vertabelo
We are launching a new product on a market, so let me tell you the short story about how and why Vertabelo was born.
e-point and I personally have more than fifteen years of experience in building business applications. All of them use relational databases as a storage for their data. Most of those applications are rather big - think about hundreds of tables in a database and hundreds of screens in a UI.