Introduction
Database denormalization is a technique used to improve data access performance. When a database is normalized and methods such as indexing are not enough, denormalization serves as one of the final options to speed up data retrieval.

This article explains what database denormalization is and the different techniques used to speed up a database.

Database denormalization is the process of systematically combining data to get information quickly. The process brings relations down to lower normal forms, reducing the overall integrity of the data. In exchange, data retrieval performance increases.

Instead of performing multiple costly JOINs on numerous tables, database denormalization brings together information that is commonly or logically combined.

Lower normal forms introduce redundancy, and with it the risk of database anomalies. One way to contain that risk is to add software-level restrictions on how data is entered into the database.

Database Normalization vs. Denormalization

Database normalization and denormalization are two different ways to alter the structure of a database. The table describes the main differences between the two methods:
Database normalization takes an unnormalized database through normal forms to improve data structure. On the other hand, denormalization starts with a normalized database and combines data for faster execution of commonly used queries.

Why and When Should You Denormalize a Database?

Database denormalization is a viable technique when data retrieval speed is an essential factor. However, the method changes the overall database structure. Denormalization is helpful in the following scenarios:
If a database has low performance, denormalization is not always the right way to go. Since the process changes the database structure, existing functionality is at risk of breaking.

Note: Database issues often arise due to a poorly designed database. Consider learning about the different database types available and choosing the right option for your use case.

Keep a point of reference to return to when changing the database structure. Ultimately, database denormalization serves as a last resort rather than a quick solution.

Denormalization Techniques

There are various database denormalization techniques, and the right one depends on the use case. Each method has an appropriate place of use, advantages, and disadvantages.

Pre-joining Tables

Pre-joined tables store frequently used pieces of information together in one table. The process comes in handy when:
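To make pre-joining concrete, here is a minimal sketch in Python using the standard-library sqlite3 module and an in-memory database. The `items` and `categories` tables, their columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE items (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
    INSERT INTO categories VALUES (1, 'Fruit'), (2, 'Dairy');
    INSERT INTO items VALUES (1, 'Apple', 1), (2, 'Milk', 2), (3, 'Pear', 1);
""")

# Normalized read: the category name requires a JOIN.
joined = conn.execute("""
    SELECT i.name, c.name
    FROM items AS i
    JOIN categories AS c ON c.id = i.category_id
    ORDER BY i.id
""").fetchall()

# Pre-join: copy the category name into the items table as a redundant column.
conn.execute("ALTER TABLE items ADD COLUMN category_name TEXT")
conn.execute("""
    UPDATE items
    SET category_name = (
        SELECT c.name FROM categories AS c WHERE c.id = items.category_id
    )
""")

# Denormalized read: the same information without a JOIN.
prejoined = conn.execute(
    "SELECT name, category_name FROM items ORDER BY id"
).fetchall()
```

The redundant `category_name` column is only as fresh as the last `UPDATE`, so renaming a category means rerunning it.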
The method creates massive redundancies, so it is essential to copy as few columns as possible and to update the copied information periodically.

Example of Pre-Joining Tables

A store keeps information about items and the categories to which the items belong. A foreign key in the items table serves as a reference to the item category. Pre-joining the tables adds the category name to the items table.

Adding the category name directly to the table of items allows viewing the items by category quickly. For lengthier queries, this method saves time and reduces the number of JOINs.

Mirrored Tables

A mirrored table is a copy of an existing table. The table is either:
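A mirrored table can be sketched as a full copy that reports read from while transactions continue on the original. The example below is an assumption-laden illustration in Python with sqlite3: the `orders` table, the `orders_mirror` name, and the `refresh_mirror` helper are hypothetical, and both tables live in a single in-memory database purely to keep the sketch self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
    INSERT INTO orders VALUES (1, 'ann', 20.0), (2, 'bob', 35.5), (3, 'ann', 4.5);
""")


def refresh_mirror(conn):
    """Rebuild the hypothetical orders_mirror table as a full copy of orders."""
    conn.execute("DROP TABLE IF EXISTS orders_mirror")
    conn.execute("CREATE TABLE orders_mirror AS SELECT * FROM orders")
    conn.commit()


refresh_mirror(conn)

# Transactions keep writing to the original table ...
conn.execute("INSERT INTO orders VALUES (4, 'bob', 10.0)")
conn.commit()

# ... while the demanding report aggregates over the copy.
report = conn.execute("""
    SELECT customer, SUM(total)
    FROM orders_mirror
    GROUP BY customer
    ORDER BY customer
""").fetchall()

original_rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
mirror_rows = conn.execute("SELECT COUNT(*) FROM orders_mirror").fetchone()[0]
```

The report reflects the state at the last refresh, so the copy lags behind the original until `refresh_mirror` runs again.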
The goal is to reproduce the data from the original in a new table. Making duplicates is also a good technique for creating a backup that preserves the initial state of the database.

Example of Mirrored Tables

Mirroring tables is a method often used for preparing data in decision support systems. Since the queries in such systems usually aggregate over many data points, running them against the original table would significantly degrade system performance.

Decision support systems greatly benefit from the use of mirrored tables. Transactions on the original table continue uninterrupted while demanding reports run on the duplicate table.

Table Splitting

Table splitting means dividing normalized tables into two or more relations. Dividing tables happens in two dimensions:
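As a rough sketch of table splitting, assume the two dimensions meant here are rows (a horizontal split) and columns (a vertical split). The Python example below uses sqlite3 with a hypothetical `customers` table; every table and column name is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY, name TEXT, region TEXT, notes TEXT
    );
    INSERT INTO customers VALUES
        (1, 'Ann', 'EU', 'long free-text notes'),
        (2, 'Bob', 'US', 'more notes'),
        (3, 'Cy',  'EU', 'even more notes');

    -- Horizontal split (assumed meaning): same columns, a subset of the rows.
    CREATE TABLE customers_eu AS SELECT * FROM customers WHERE region = 'EU';
    CREATE TABLE customers_us AS SELECT * FROM customers WHERE region = 'US';

    -- Vertical split (assumed meaning): same rows, a subset of the columns.
    -- The key column is repeated so the parts can be matched up again.
    CREATE TABLE customers_core  AS SELECT id, name, region FROM customers;
    CREATE TABLE customers_notes AS SELECT id, notes FROM customers;
""")

eu_names = [row[0] for row in conn.execute(
    "SELECT name FROM customers_eu ORDER BY id"
)]
# PRAGMA table_info returns one row per column; index 1 is the column name.
core_columns = [row[1] for row in conn.execute(
    "PRAGMA table_info(customers_core)"
)]
us_count = conn.execute("SELECT COUNT(*) FROM customers_us").fetchone()[0]
```

Queries that only need one region or only the frequently read columns then touch a smaller table.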
The method's goal is to split tables into smaller units for faster and more convenient data handling. If the database also contains the original table, this method is considered a particular case of mirrored tables.

Examples of Table Splitting

The usage examples depend on the criteria of table splitting. The most common reasons for dividing tables are:
Storing Derivable Values

Storing frequently executed calculations is worthwhile in situations where:
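One way to picture a stored derivable value is an age column filled in from a date of birth. The sketch below uses Python and sqlite3 rather than MySQL. The `people` table, the `age_on` and `refresh_ages` helpers, and the fixed reference date are all assumptions made so the example is reproducible.

```python
import sqlite3
from datetime import date


def age_on(birth: date, today: date) -> int:
    """Return the number of full years between birth and today."""
    before_birthday = (today.month, today.day) < (birth.month, birth.day)
    return today.year - birth.year - int(before_birthday)


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        name TEXT,
        date_of_birth TEXT,  -- source value, ISO 8601 date
        age INTEGER          -- stored derivable value
    )
""")
conn.executemany(
    "INSERT INTO people (id, name, date_of_birth) VALUES (?, ?, ?)",
    [(1, "Ann", "1990-06-15"), (2, "Bob", "2000-01-01")],
)


def refresh_ages(conn, today: date) -> None:
    """Recompute the stored age for every row as of the given date."""
    rows = conn.execute("SELECT id, date_of_birth FROM people").fetchall()
    conn.executemany(
        "UPDATE people SET age = ? WHERE id = ?",
        [(age_on(date.fromisoformat(dob), today), pid) for pid, dob in rows],
    )
    conn.commit()


# A fixed date keeps the sketch deterministic; a real job would pass date.today().
refresh_ages(conn, date(2024, 6, 14))

# Reports read the stored value instead of recalculating it per query.
ages = dict(conn.execute("SELECT name, age FROM people"))
```

The date of birth never changes, but the stored age does, so a value like this needs a scheduled refresh to stay correct.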
Directly storing derivable data ensures calculations are already done when generating a report and eliminates the need to look up the source values for each query.

Example of Storing Derivable Values

If we have a database table that keeps track of information about people, a person's age is a calculated value based on their date of birth. Derive the age by finding the difference between the current date and the date of birth using a MySQL date function.

Age is an essential piece of information when analyzing any demographic information. The source value, which is the date of birth, does not change.

Hierarchy Tables

A hierarchy table is a tree-like structure with a one-to-many relation. One parent table has many children. However, the children have only one parent table. Hierarchy tables are used in cases where:
Hard-Coded Values

Hard-coded values remove a reference to a commonly used entity. Use this method in situations where:
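A minimal sketch of hard-coded values, in Python with sqlite3: the allowed codes live in the application and a check constraint enforces them, so no look-up table or JOIN is needed. The `ALLOWED_GENDERS` tuple, its codes, and the `people` table are hypothetical.

```python
import sqlite3

# Hypothetical set of allowed codes, kept in the application
# instead of a small look-up table.
ALLOWED_GENDERS = ("F", "M", "X")

conn = sqlite3.connect(":memory:")

# Build the CHECK list from the trusted constant above (not from user input).
allowed_sql = ", ".join(f"'{code}'" for code in ALLOWED_GENDERS)
conn.execute(f"""
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        gender TEXT NOT NULL CHECK (gender IN ({allowed_sql}))
    )
""")

conn.execute("INSERT INTO people VALUES (1, 'Ann', 'F')")

# A value outside the hard-coded set violates the check constraint.
try:
    conn.execute("INSERT INTO people VALUES (2, 'Bob', 'unknown')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
```

The trade-off is that adding a new allowed value now means changing both the application constant and the constraint, rather than inserting a row.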
Instead of using a small look-up table, the values are hard-coded into the application directly. The process also avoids having to perform joins on the look-up table.

Example of Hard-Coded Values

A table with information about people could use a small look-up table to store information about the gender of individuals. Since the look-up table holds a limited number of values, consider hard-coding the data into the table of people directly.

Hard-coded values eliminate the need for a look-up table and the JOIN operation with that table. Any change to the set of values, or the recording of new values, requires adding a check constraint.

Storing Details with Master

The master table contains the main information, whereas other tables contain specific details. Store the details with the master table when:
Keeping all the details with the master table is convenient when selecting data. The method works best when there are fewer details. Otherwise, the data retrieval process slows down significantly.

Example of Storing Details with Master

A master table with customer information typically stores specific details about the person in a separate table. Information about the particular location, for example, usually resides in a series of smaller tables.

Any report that considers the customers' location benefits from adding the location details to the master table.

Repeating Single Detail with Master

Queries often need just a single detail added to the master table instead of pre-joining multiple values. Use this method when:
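Repeating a single detail with the master can be sketched as a history table plus one redundant column on the master row. The Python and sqlite3 example below assumes hypothetical `items` and `price_history` tables and a hypothetical `record_price` helper that is always called in chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        id INTEGER PRIMARY KEY,
        name TEXT,
        current_price REAL  -- the single repeated detail
    );
    CREATE TABLE price_history (
        item_id INTEGER REFERENCES items(id),
        valid_from TEXT,
        price REAL
    );
    INSERT INTO items (id, name) VALUES (1, 'Apple');
""")


def record_price(conn, item_id, valid_from, price):
    """Append to the history and repeat the newest price on the master row.

    Assumes calls arrive in chronological order.
    """
    with conn:  # both writes commit together or roll back together
        conn.execute(
            "INSERT INTO price_history VALUES (?, ?, ?)",
            (item_id, valid_from, price),
        )
        conn.execute(
            "UPDATE items SET current_price = ? WHERE id = ?",
            (price, item_id),
        )


record_price(conn, 1, "2024-01-01", 0.50)
record_price(conn, 1, "2024-03-01", 0.75)

# Reading the current price needs only the master table.
current = conn.execute(
    "SELECT current_price FROM items WHERE id = 1"
).fetchone()[0]

# The same value is still recoverable from the history.
latest_in_history = conn.execute("""
    SELECT price FROM price_history
    WHERE item_id = 1
    ORDER BY valid_from DESC
    LIMIT 1
""").fetchone()[0]
```

Writing both rows in one transaction is one way to keep the repeated value consistent with the history.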
Adding a single detail to a master table is most common when the database contains historical data. The repeated detail is usually the newest information.

Example of Single Detail with Master

A store database normally has a master table of information about the items it sells. Another table with details about historical price changes also contains the current price.

Since this single detail helps analyze current item prices, the latest price is handy to repeat in the master table. For consistency, any price change must be written to the master table as well.

Short-Circuit Keys

In a database with three or more tables of related information, the short-circuit keys method skips the middle table(s) and "short-circuits" the grandparent and grandchild tables. Use the short-circuit technique in situations where:
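A short-circuit key can be sketched as an extra foreign key that links the first and last table directly. The Python and sqlite3 example below assumes a hypothetical three-table chain of `people`, `addresses`, and `areas`; the names and rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE areas (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        street TEXT,
        area_id INTEGER REFERENCES areas(id)
    );
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        name TEXT,
        address_id INTEGER REFERENCES addresses(id)
    );
    INSERT INTO areas VALUES (1, 'North'), (2, 'South');
    INSERT INTO addresses VALUES (10, '1 Main St', 1), (11, '2 Oak Ave', 2);
    INSERT INTO people VALUES (100, 'Ann', 10), (101, 'Bob', 11), (102, 'Cy', 10);
""")

# Normalized path: people -> addresses -> areas needs two JOINs.
via_middle = conn.execute("""
    SELECT ar.name, COUNT(*)
    FROM people AS p
    JOIN addresses AS ad ON ad.id = p.address_id
    JOIN areas AS ar ON ar.id = ad.area_id
    GROUP BY ar.name
    ORDER BY ar.name
""").fetchall()

# Short-circuit key: store the area directly on each person.
conn.execute("ALTER TABLE people ADD COLUMN area_id INTEGER REFERENCES areas(id)")
conn.execute("""
    UPDATE people
    SET area_id = (
        SELECT addresses.area_id
        FROM addresses
        WHERE addresses.id = people.address_id
    )
""")

# Direct path: the middle table is no longer part of the query.
direct = conn.execute("""
    SELECT ar.name, COUNT(*)
    FROM people AS p
    JOIN areas AS ar ON ar.id = p.area_id
    GROUP BY ar.name
    ORDER BY ar.name
""").fetchall()
```

If a person's address changes, the copied `area_id` has to be updated along with it, which is the maintenance cost of the shortcut.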
If two relations relate through a middle table, omit the JOIN on the intermediate relation and connect the first and last table directly.

Example of Short-Circuit Keys

An information system could keep information about people in one table, their addresses in another table, and the geographic area of each address in a third table.

For a demographic report, the exact address is not a critical piece of information. However, the area in which a person lives is essential for analysis. Short-circuiting the table of people with the table of areas omits the JOIN on the middle table.

Denormalization Advantages

The advantages of database denormalization are:
Denormalization Disadvantages

The disadvantages to consider when denormalizing a database are:
Conclusion

This article provides a clear idea of what database denormalization is and how to apply it to certain situations. Carefully consider the reasons and methods behind denormalization. Changes made to a database might be permanent and irreversible.

What process is used to eliminate data redundancy and improve efficiency in a database?

Database normalization involves efficiently arranging data in a database to ensure redundancy elimination. This process ensures that a company's database contains information that appears and reads similarly throughout all records.
What is data redundancy in a database management system?

Data redundancy refers to the practice of keeping data in two or more places within a database or data storage system. Data redundancy ensures an organization can provide continued operations or services in the event something happens to its data, for example, in the case of data corruption or data loss.
Which methods introduce denormalization to a database?

Methods of denormalization include adding redundant columns, adding derived columns, collapsing tables, snapshots, VARRAYs, and materialized views.

What is the reason to introduce data redundancy into a normalized database design?

It is important that a database is normalized to minimize redundancy (duplicate data) and to ensure only related data is stored in each table. Normalization also prevents issues stemming from database modifications such as insertions, deletions, and updates. The stages of organization are called normal forms.