Normalization in DBMS is a technique for organizing data in database tables so that there is less repetition of data.
DBMS Normalization is a systematic strategy of decomposing (breaking down) tables in order to minimize data redundancy (repetition) and undesired features such as insertion, update, and deletion anomalies in DBMS.
It is a multi-step procedure that converts data into tabular format, removes duplicate data, and establishes the relationship between tables.
Normalization is essential for:
- Eliminating redundant data ensures data integrity, as repeated data raises the likelihood of inconsistent data.
- It helps to keep data consistent by storing it in a single table and referring it everywhere else.
- Storage optimization, nevertheless, is not a concern these days because database storage is inexpensive.
- Breaking down huge tables into smaller tables with linkages improves the database's scalability and adaptability.
- Ensuring data dependencies make sense, i.e. data is stored logically.
Problems without Normalization in DBMS
If a table is not correctly normalized and has data redundancy (repetition), it will not only consume additional memory space but will also make it harder to handle and update data in the database without losing data. Insertion, update, and deletion anomalies are common if the database is not normalized.
rollno name branch hod office_tel 401 Akon CSE Mr. X 53337 402 Bkon CSE Mr. X 53337 403 Ckon CSE Mr. X 53337 404 Dkon CSE Mr. X 53337
To understand these anomalies, consider a Student table.
The table above contains data for four Computer Science students.
As can be seen, data for the fields branch, hod (Head of Department), and office_tel are duplicated for students in the same branch of the college, indicating data redundancy.
1. Insertion anomaly in DBMS
Assume that for a new admission, unless a student chooses a branch, the student's data cannot be inserted; otherwise, the branch information must be set to NULL.
Also, if we need to insert data for 100 students from the same branch, the branch information will be repeated for all 100 students.
These possibilities are nothing more than insertion anomalies.
If you must repeat the same data in each row of data, it is preferable to store the data separately and reference it in each row.
So, in the above table, we may keep the branch information separate and only utilize the branch_id in the student table, where the branch_id can be used to retrieve branch data.
2. Anomaly in Database Management System Update
What happens if Mr. X leaves the college? or is Mr. X no longer the head of the computer science department?
In that situation, all student records must be updated, and if any records are missed by mistake, data inconsistency will occur.
This is an Updation anomaly since you must update all records in your table only because one piece of information has changed.
3. Deletion Anomaly in DBMS
In our Student table, we save two types of information together: student information and branch information.
So, if only one student is enrolled in a branch and quits the institution, or if the student's entry is deleted for any reason, we will also lose the branch information.
So, never in DBMS, we should keep two different entities together, which in the preceding example is Student and branch.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.