Speaking of Relational Databases, for some programmers, the normalization process and its levels are the subject of discussion.
For me, its clear that a good database design must include foreign keys (of course, if the database engine allows it). At least for conceptual design, the connections between relationships (or tables) must be clear. So, here I will explain what it is to normalize a database, benefits, disadvantages and other concepts related to this process. Let’s start.
Normalization is the process through a data collection is organized into smaller pieces in order to avoid duplicated information and reduction of possible human and system errors. This process is important to try to do at the beginning, before selectioning the database engine and independient from development languages and tools.
For example: You can take your phonebook (from your smartphone or in an old style, a physical book). Observe what data can you register for each contact: name, address, city, group, cel phone, e-mail, etc. We can create a long list with a certain amount of columns (fields for database terms) and we will see that city has repetitive information, maybe wrote in different forms, for group is the same case. So, let’s normalize our phone book or contact list.
Our first step will be to assign a number to each contact. Because this number is meant to be unique and each contact will have one, we will call it our primary key.
The next move will be to extract a list of cities and assign numbers to each city, just like our previous step. After that, we will replace the name of the city in our contact list with the new number. This list of new cities will be a catalog, and now has a primary key as well. The city number for the main list is called Foreign Key, and refers to the main key in our city catalog. Let’s repeat the process with our group field.
Now, if we look at the new organization of data, we can see that for each city in our catalog, we have some records of contacts, the same with the relationship between our catalog of groups and the main list.
From now on, whenever we want to add a new contact, we just have to type the city number and group number, instead of write the name or abreviation, or what else, and our information will keep consistent.
This is the basic normalization process. You can repeat this steps every time you will need.
The benefits of this process include: avoiding repetitive information, clear data, less memory requirement and storage capacity. Perhaps a couple of disadvantages will be that the engine will have to navigate through different tables to get the information requested and the level of complexity in the design.