undefined | unit 3 database design

Unit-3

Database Design

Q1) What is a Good Database Design and what are their importance?

A1)

A good database design process is governed by specific rules. The first rule dictates that redundant data must be avoided; as it wastes space and increases the probability of faults and discrepancies within the database. The next rule is that the accuracy and comprehensiveness of information is extremely imperative. If the database contains erroneous information, any documents that fetch data from such a database will also include inaccurate information. Consequently, any decisions based on those documents will be misleading.

So, how can you ensure that your database design is good? A well-designed database is the one that:

Distributes your data into tables based on specific subject areas to decrease data redundancy

Delivers the database the information needed to link the data in the tables

Provides support, and guarantees the precision and reliability of data

Caters to your information processing and reporting requirements

Functions interactively with the database operators as much as possible

ETL Your Enterprise Data Using MYSQL Connector

Move MySQL Data to a Destination of Your Choice Through a Codeless, Drag-and-Drop UI.

Importance of Database Design

Database design defines the database structure used for planning, storing, and managing information. Accuracy in data can only be accomplished if a database is designed to store only valuable and necessary information.

A well-designed database is imperative in guaranteeing information consistency, eliminating redundant data, efficiently executing queries, and improving the performance of the database. Meticulously designing a database saves you from wasting time and getting frustrated during the database development phase. A good database design also allows you to easily access and retrieve data whenever needed.

The reliability of data depends on the table structure; whereas creating primary and unique keys guarantees uniformity in the stored information. Data replication can be avoided by forming a table of probable values and using a key to denote the value. So, whenever the value changes, the alteration happens only once in the main table.

As the general performance of a database depends on its design, a good database design uses simple queries and faster implementation. It is easy to maintain and update; whereas fixing trivial interruptions in a poor database design may harm stored events, views, and utilities.

Q2) What is the database development life cycle?

A2)

Database Development Life Cycle

There are various stages in database development. However, it is not necessary to follow each of the steps sequentially. The life cycle can be broadly divided into three steps: requirement analysis, database designing, and implementation.

1- Requirement Analysis

Requirement analysis requires two steps:

Planning: In this stage, the plan of the entire Database Development Life Cycle is decided. It also requires an analysis of the organization’s information systems strategy.

Defining the system: This stage explains and lays out the proposed database system’s scope.

2- Database designing

The actual database designing takes into account two key models:

Logical model: It is concerned with using the given requirements to create a database model. The complete design is laid out on paper at this stage, without considering any specific DBMS requirement or physically implementing it.

Physical model: This stage comes after the logical model and therefore involves physically implementing the logical model. It takes the DBMS and other physical implementation factors into consideration.

3- Implementation

The implementation stage of database development life cycle is concerned with:

Data conversion and loading: It comprises data importation and data conversion coming from the old system into the new database.

Testing: Finally, this stage identifies errors in the new system and make sure if all the database requirement specification are met.

Q3) What is functional dependency?

A3)

Functional Dependency

Functional dependency (FD) is a set of constraints between two attributes in a relation. Functional dependency says that if two tuples have same values for attributes A1, A2,..., An, then those two tuples must have to have same values for attributes B1, B2, ..., Bn.

Functional dependency is represented by an arrow sign (→) that is, X→Y, where X functionally determines Y. The left-hand side attributes determine the values of attributes on the right-hand side.

Armstrong's Axioms

If F is a set of functional dependencies then the closure of F, denoted as F+, is the set of all functional dependencies logically implied by F. Armstrong's Axioms are a set of rules, that when applied repeatedly, generates a closure of functional dependencies.

Reflexive rule− If alpha is a set of attributes and beta is_subset_of alpha, then alpha holds beta.

Augmentation rule− If a → b holds and y is attribute set, then ay → by also holds. That is adding attributes in dependencies, does not change the basic dependencies.

Transitivity rule−Same as transitive rule in algebra, if a → b holds and b → c holds, then a → c also holds. a→ b is called as a functionally that determines b.

Trivial Functional Dependency

Trivial− If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a trivial FD. Trivial FDs always hold.

Non-trivial− If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial FD.

Completely non-trivial− If an FD X → Y holds, where x intersect Y = Φ, it is said to be a completely non-trivial FD.

Q4) Explain normalization in database and what are the different types of normalization

A4)

Normalization

If a database design is not perfect, it may contain anomalies, which are like a bad dream for any database administrator. Managing a database with anomalies is next to impossible.

Update anomalies− If data items are scattered and are not linked to each other properly, then it could lead to strange situations. For example, when we try to update one data item having its copies scattered over several places, a few instances get updated properly while a few others are left with old values. Such instances leave the database in an inconsistent state.

Deletion anomalies−We tried to delete a record, but parts of it was left undeleted because of unawareness, the data is also saved somewhere else.

Insert anomalies−We tried to insert data in a record that does not exist at all.

Normalization is a method to remove all these anomalies and bring the database to a consistent state.

First Normal Form

First Normal Form is defined in the definition of relations (tables) itself. This rule defines that all the attributes in a relation must have atomic domains. The values in an atomic domain are indivisible units.

unorganized relation

Fig 3 – 1NF

We re-arrange the relation (table) as below, to convert it to First Normal Form.

Relation in 1NF

Fig 4 - Re-arrange the relation 1NF

Each attribute must contain only a single value from its pre-defined domain.

Second Normal Form

Before we learn about the second normal form, we need to understand the following −

Prime attribute−An attribute, which is a part of the candidate-key, is known as a prime attribute.

Non-prime attribute−An attribute, which is not a part of the prime-key, is said to be a non-prime attribute.

If we follow second normal form, then every non-prime attribute should be fully functionally dependent on prime key attribute. That is, if X →A holds, then there should not be any proper subset Y of X, for which Y → A also holds true.

Relation not in 2NF

Fig 5 – 2 NF

We see here in Student_Project relation that the prime key attributes are Stu_ID and Proj_ID. According to the rule, non-key attributes, i.e. Stu_Name and Proj_Name must be dependent upon both and not on any of the prime key attribute individually. But we find that Stu_Name can be identified by Stu_ID and Proj_Name can be identified by Proj_ID independently. This is called partial dependency, which is not allowed in Second Normal Form.

Relation in 2NF

Fig 6 - Partial dependency

We broke the relation in two as depicted in the above picture. So there exists no partial dependency.

Third Normal Form

For a relation to be in Third Normal Form, it must be in Second Normal form and the following must satisfy −

No non-prime attribute is transitively dependent on prime key attribute.

For any non-trivial functional dependency, X → A, then either −

X is a superkey or,
A is prime attribute.

Relation not in 3NF

Fig 7 – 3 NF

We find that in the above Student_detail relation, Stu_ID is the key and only prime key attribute. We find that City can be identified by Stu_ID as well as Zip itself. Neither Zip is a superkey nor is City a prime attribute. Additionally, Stu_ID→ Zip → City, so there existstransitive dependency.

To bring this relation into third normal form, we break the relation into two relations as follows −

Relation in 3NF

Boyce-Codd Normal Form

Boyce-Codd Normal Form (BCNF) is an extension of Third Normal Form on strict terms. BCNF states that −

For any non-trivial functional dependency, X → A, X must be a super-key.

In the above image, Stu_ID is the super-key in the relation Student_Detail and Zip is the super-key in the relation ZipCodes. So,

Stu_ID→Stu_Name, Zip

and

Zip → City

Which confirms that both the relations are in BCNF.

Q5) What are the rules and types of functional dependency?

A5)

Rules of Functional Dependencies

Below are the Three most important rules for Functional Dependency in Database:

Reflexive rule –. If X is a set of attributes and Y is_subset_of X, then X holds a value of Y.

Augmentation rule: When x -> y holds, and c is attribute set, then ac ->bc also holds. That is adding attributes which do not change the basic dependencies.

Transitivity rule: This rule is very much similar to the transitive rule in algebra if x -> y holds and y -> z holds, then x -> z also holds. X -> y is called as functionally that determines y.

Types of Functional Dependencies in DBMS

There are mainly four types of Functional Dependency in DBMS. Following are the types of Functional Dependencies in DBMS:

Multivalued Dependency

Trivial Functional Dependency

Non-Trivial Functional Dependency

Transitive Dependency

Multivalued Dependency in DBMS

Multivalued dependency occurs in the situation where there are multiple independent multivalued attributes in a single table. A multivalued dependency is a complete constraint between two sets of attributes in a relation. It requires that certain tuples be present in a relation. Consider the following Multivalued Dependency Example to understand.

Example:

Car_model	Maf_year	Color
H001	2017	Metallic
H001	2017	Green
H005	2018	Metallic
H005	2018	Blue
H010	2015	Metallic
H033	2012	Gray

In this example, maf_year and color are independent of each other but dependent on car_model. In this example, these two columns are said to be multivalue dependent on car_model.

This dependence can be represented like this:

car_model ->maf_year

car_model-> colour

Trivial Functional Dependency in DBMS

The Trivial dependency is a set of attributes which are called a trivial if the set of attributes are included in that attribute.

So, X -> Y is a trivial functional dependency if Y is a subset of X. Let's understand with a Trivial Functional Dependency Example.

For example:

Emp_id	Emp_name
AS555	Harry
AS811	George
AS999	Kevin

Consider this table of with two columns Emp_id and Emp_name.

{Emp_id, Emp_name} ->Emp_id is a trivial functional dependency as Emp_id is a subset of {Emp_id,Emp_name}.

Non Trivial Functional Dependency in DBMS

Functional dependency which also known as a nontrivial dependency occurs when A->B holds true where B is not a subset of A. In a relationship, if attribute B is not a subset of attribute A, then it is considered as a non-trivial dependency.

Company	CEO	Age
Microsoft	Satya Nadella	51
Google	SundarPichai	46
Apple	Tim Cook	57

Example:

(Company} -> {CEO} (if we know the Company, we knows the CEO name)

But CEO is not a subset of Company, and hence it's non-trivial functional dependency.

Transitive Dependency in DBMS

A Transitive Dependency is a type of functional dependency which happens when t is indirectly formed by two functional dependencies. Let's understand with the following Transitive Dependency Example.

Example:

Company	CEO	Age
Microsoft	Satya Nadella	51
Google	SundarPichai	46
Alibaba	Jack Ma	54

{Company} -> {CEO} (if we know the compay, we know its CEO's name)

{CEO } -> {Age} If we know the CEO, we know the Age

Therefore according to the rule of rule of transitive dependency:

{ Company} -> {Age} should hold, that makes sense because if we know the company name, we can know his age.

Note: You need to remember that transitive dependency can only occur in a relation of three or more attributes.

What is Closure of a set of FD's?

Closure of an Attribute: Closure of an Attribute can be defined as a set of attributes that can be functionally determined from it.

Closure of a set F of FDs is the set F+ of all FDs that can be inferred from F

Closure of a set of attributes X concerning F is the set X+ of all attributes that are functionally determined by X

Pseudocode to find Closure of an Attribute?

Determine X+, the closure of X under functional dependency set F

X Closure : = will contain X itself;

Repeat the process as:

old X Closure : = X Closure;

for each functional dependency P → Q in FD set do

if X Closure is subset of P then X Closure := X Closure U Q ;

Repeat until ( X Closure = old X Closure);

Algorithm of Determining X+, the Closure of X under F

Input: A set F of FDs on a relation schema R, and a set of attributes X, which is a subset of R.

X+ := X;

repeat

oldX+ := X+ ;

for each functional dependency Y → Z in F do

if X+ ⊇ Y then X+ := X+ ∪ Z;

until (X+ = oldX+ );

7. QUESTIONS ON CLOSURE SET OF ATTRIBUTE:

1) Given relational schema R( P Q R S T U V) having following attribute P Q R S T U and V, also there is a set of functional dependency denoted by FD = { P->Q, QR->ST, PTV->V }.

Determine Closure of (QR)+ and (PR)+

a) QR+ = QR (as the closure of an attribute or set of attributes contain same).

Now as per algorithm look into a set of FD that complete the left side of any FD contains either Q, R, or QR since in FD QR→ST has complete QR.

Hence QR+ = QRST

Again, trace the remaining two FD that any left part of FD contains any Q, R, S, T.

Since no complete left side of the remaining two FD{P->Q, PTV->V} contain Q, R, S, T.

Therefore QR+ = QRST (Answer)

Note: In FD PTV→V, T is in QRST but that cannot be entertained, as complete PTV should be a subset of QRST

b) PR + = PR (as the closure of an attribute or set of attributes contain same)

Now as per algorithm look into a set of FD, and check that complete left side of any FD contains either P, R, or PR. Since in FD P→Q, P is a subset of PR, Hence PR+ = PRQ

Again, trace the remaining two FD that any left part of FD contains any P, R, Q, Since, in FD QR → ST has its complete left part QR in PQR

Hence PR+ = PRQST

Again trace the remaining one FD { PTV->V } that its complete left belongs to PRQST. Since complete PTV is not in PRQST, hence we ignore it.

Therefore PR+ = PRQST ( Answer)

2. Given relational schema R( P Q R S T) having following attributes P Q R S and T, also there is a set of functional dependency denoted by FD = { P->QR, RS->T, Q->S, T-> P }.

Determine Closure of ( T )+

T + = T (as the closure of an attribute or set of attributes contain same) Now as per algorithm look into a set of FD that complete the left side of any FD contains T since, in FD T → P, T is in T, Hence T+ = TP Again trace the remaining three FD that any left part of FD contain any TP, Since in FD P → QR has its complete left part P in TP, Hence T+ = TPQR Again trace the remaining two FD { RS->T, Q->S } that any of its Complete left belongs to TPQR, Since in FD Q → S has its complete left part Q in TPQR, Hence T+ = TPQRS Again trace the remaining one FD { RS->T } that its complete left belongs to TPQRS, Since in FD RS → T has its complete left part RS in TPQRS Hence T+ = TPQRS ( no changes, as T, is already in TPQRS) Therefore T+ = TPQRS ( Answer).

Q6) Explain normalization and what are its different types explain with examples

A6)

Normalization

Normalization is the process of organizing the data in the database.

Normalization is used to minimize the redundancy from a relation or set of relations. It is also used to eliminate the undesirable characteristics like Insertion, Update and Deletion Anomalies.

Normalization divides the larger table into the smaller table and links them using relationship.

The normal form is used to reduce redundancy from the database table.

Types of Normal Forms

There are the four types of normal forms:

DBMS Normalization

Fig 8 – Normal Forms

Normal Form	Description
1NF	A relation is in 1NF if it contains an atomic value.
2NF	A relation will be in 2NF if it is in 1NF and all non-key attributes are fully functional dependent on the primary key.
3NF	A relation will be in 3NF if it is in 2NF and no transition dependency exists.
4NF	A relation will be in 4NF if it is in Boyce Codd normal form and has no multi-valued dependency.
5NF	A relation is in 5NF if it is in 4NF and not contains any join dependency and joining should be lossless.

First Normal Form (1NF)

A relation will be 1NF if it contains an atomic value.

It states that an attribute of a table cannot hold multiple values. It must hold only single-valued attribute.

First normal form disallows the multi-valued attribute, composite attribute, and their combinations.

Example: Relation EMPLOYEE is not in 1NF because of multi-valued attribute EMP_PHONE.

EMPLOYEE table:

EMP_ID	EMP_NAME	EMP_PHONE	EMP_STATE
14	John	7272826385, 9064738238	UP
20	Harry	8574783832	Bihar
12	Sam	7390372389, 8589830302	Punjab

The decomposition of the EMPLOYEE table into 1NF has been shown below:

EMP_ID	EMP_NAME	EMP_PHONE	EMP_STATE
14	John	7272826385	UP
14	John	9064738238	UP
20	Harry	8574783832	Bihar
12	Sam	7390372389	Punjab
12	Sam	8589830302	Punjab

Second Normal Form (2NF)

In the 2NF, relational must be in 1NF.

In the second normal form, all non-key attributes are fully functional dependent on the primary key

Example: Let's assume, a school can store the data of teachers and the subjects they teach. In a school, a teacher can teach more than one subject.

TEACHER table

TEACHER_ID	SUBJECT	TEACHER_AGE
25	Chemistry	30
25	Biology	30
47	English	35
83	Math	38
83	Computer	38

In the given table, non-prime attribute TEACHER_AGE is dependent on TEACHER_ID which is a proper subset of a candidate key. That's why it violates the rule for 2NF.

To convert the given table into 2NF, we decompose it into two tables:

TEACHER_DETAIL table:

TEACHER_ID	TEACHER_AGE
25	30
47	35
83	38

TEACHER_SUBJECT table:

TEACHER_ID	SUBJECT
25	Chemistry
25	Biology
47	English
83	Math
83	Computer

Third Normal Form (3NF)

A relation will be in 3NF if it is in 2NF and not contain any transitive partial dependency.

3NF is used to reduce the data duplication. It is also used to achieve the data integrity.

If there is no transitive dependency for non-prime attributes, then the relation must be in third normal form.

A relation is in third normal form if it holds atleast one of the following conditions for every non-trivial function dependency X → Y.

X is a super key.

Y is a prime attribute, i.e., each element of Y is part of some candidate key.

Example:

EMPLOYEE_DETAIL table:

EMP_ID	EMP_NAME	EMP_ZIP	EMP_STATE	EMP_CITY
222	Harry	201010	UP	Noida
333	Stephan	02228	US	Boston
444	Lan	60007	US	Chicago
555	Katharine	06389	UK	Norwich
666	John	462007	MP	Bhopal

Super key in the table above:

{EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}....so on

Candidate key: {EMP_ID}

Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime.

Here, EMP_STATE & EMP_CITY dependent on EMP_ZIP and EMP_ZIP dependent on EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) transitively dependent on super key(EMP_ID). It violates the rule of third normal form.

That's why we need to move the EMP_CITY and EMP_STATE to the new <EMPLOYEE_ZIP> table, with EMP_ZIP as a Primary key.

EMPLOYEE table:

EMP_ID	EMP_NAME	EMP_ZIP
222	Harry	201010
333	Stephan	02228
444	Lan	60007
555	Katharine	06389
666	John	462007

EMPLOYEE_ZIP table:

EMP_ZIP	EMP_STATE	EMP_CITY
201010	UP	Noida
02228	US	Boston
60007	US	Chicago
06389	UK	Norwich
462007	MP	Bhopal

Boyce Codd normal form (BCNF)

BCNF is the advance version of 3NF. It is stricter than 3NF.

A table is in BCNF if every functional dependency X → Y, X is the super key of the table.

For BCNF, the table should be in 3NF, and for every FD, LHS is super key.

Example: Let's assume there is a company where employees work in more than one department.

EMPLOYEE table:

EMP_ID	EMP_COUNTRY	EMP_DEPT	DEPT_TYPE	EMP_DEPT_NO
264	India	Designing	D394	283
264	India	Testing	D394	300
364	UK	Stores	D283	232
364	UK	Developing	D283	549

In the above table Functional dependencies are as follows:

EMP_ID → EMP_COUNTRY

EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

Candidate key: {EMP-ID, EMP-DEPT}

The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone are keys.

To convert the given table into BCNF, we decompose it into three tables:

EMP_COUNTRY table:

EMP_ID	EMP_COUNTRY
264	India
264	India

EMP_DEPT table:

EMP_DEPT	DEPT_TYPE	EMP_DEPT_NO
Designing	D394	283
Testing	D394	300
Stores	D283	232
Developing	D283	549

EMP_DEPT_MAPPING table:

EMP_ID	EMP_DEPT
D394	283
D394	300
D283	232
D283	549

Functional dependencies:

EMP_ID → EMP_COUNTRY

EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

Candidate keys:

For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}

Now, this is in BCNF because left side part of both the functional dependencies is a key.

Fourth normal form (4NF)

A relation will be in 4NF if it is in Boyce Codd normal form and has no multi-valued dependency.

For a dependency A→ B, if for a single value of A, multiple values of B exists, then the relation will be a multi-valued dependency.

Example

STUDENT

STU_ID	COURSE	HOBBY
21	Computer	Dancing
21	Math	Singing
34	Chemistry	Dancing
74	Biology	Cricket
59	Physics	Hockey

The given STUDENT table is in 3NF, but the COURSE and HOBBY are two independent entity. Hence, there is no relationship between COURSE and HOBBY.

In the STUDENT relation, a student with STU_ID, 21 contains two courses, Computer and Math and two hobbies, Dancing and Singing. So there is a Multi-valued dependency on STU_ID, which leads to unnecessary repetition of data.

So to make the above table into 4NF, we can decompose it into two tables:

STUDENT_COURSE

STU_ID	COURSE
21	Computer
21	Math
34	Chemistry
74	Biology
59	Physics

STUDENT_HOBBY

STU_ID	HOBBY
21	Dancing
21	Singing
34	Dancing
74	Cricket
59	Hockey

Fifth normal form (5NF)

A relation is in 5NF if it is in 4NF and not contains any join dependency and joining should be lossless.

5NF is satisfied when all the tables are broken into as many tables as possible in order to avoid redundancy.

5NF is also known as Project-join normal form (PJ/NF).

Example

SUBJECT	LECTURER	SEMESTER
Computer	Anshika	Semester 1
Computer	John	Semester 1
Math	John	Semester 1
Math	Akash	Semester 2
Chemistry	Praveen	Semester 1

In the above table, John takes both Computer and Math class for Semester 1 but he doesn't take Math class for Semester 2. In this case, combination of all these fields required to identify a valid data.

Suppose we add a new Semester as Semester 3 but do not know about the subject and who will be taking that subject so we leave Lecturer and Subject as NULL. But all three columns together acts as a primary key, so we can't leave other two columns blank.

So to make the above table into 5NF, we can decompose it into three relations P1, P2 & P3:

SEMESTER	SUBJECT
Semester 1	Computer
Semester 1	Math
Semester 1	Chemistry
Semester 2	Math

SUBJECT	LECTURER
Computer	Anshika
Computer	John
Math	John
Math	Akash
Chemistry	Praveen

SEMSTER	LECTURER
Semester 1	Anshika
Semester 1	John
Semester 1	John
Semester 2	Akash
Semester 1	Praveen

Q7) Explain relational decomposition and what are its types

A7)

Relational Decomposition

When a relation in the relational model is not in appropriate normal form then the decomposition of a relation is required.

In a database, it breaks the table into multiple tables.

If the relation has no proper decomposition, then it may lead to problems like loss of information.

Decomposition is used to eliminate some of the problems of bad design like anomalies, inconsistencies, and redundancy.

Types of Decomposition

DBMS Relational Decomposition

Fig 9 - Decomposition

Lossless Decomposition

If the information is not lost from the relation that is decomposed, then the decomposition will be lossless.

The lossless decomposition guarantees that the join of relations will result in the same relation as it was decomposed.

The relation is said to be lossless decomposition if natural joins of all the decomposition give the original relation.

Example:

EMPLOYEE_DEPARTMENT table:

EMP_ID	EMP_NAME	EMP_AGE	EMP_CITY	DEPT_ID	DEPT_NAME
22	Denim	28	Mumbai	827	Sales
33	Alina	25	Delhi	438	Marketing
46	Stephan	30	Bangalore	869	Finance
52	Katherine	36	Mumbai	575	Production
60	Jack	40	Noida	678	Testing

The above relation is decomposed into two relations EMPLOYEE and DEPARTMENT

EMPLOYEE table:

EMP_ID	EMP_NAME	EMP_AGE	EMP_CITY
22	Denim	28	Mumbai
33	Alina	25	Delhi
46	Stephan	30	Bangalore
52	Katherine	36	Mumbai
60	Jack	40	Noida

DEPARTMENT table

DEPT_ID	EMP_ID	DEPT_NAME
827	22	Sales
438	33	Marketing
869	46	Finance
575	52	Production
678	60	Testing

Now, when these two relations are joined on the common column "EMP_ID", then the resultant relation will look like:

Employee ⋈ Department

EMP_ID	EMP_NAME	EMP_AGE	EMP_CITY	DEPT_ID	DEPT_NAME
22	Denim	28	Mumbai	827	Sales
33	Alina	25	Delhi	438	Marketing
46	Stephan	30	Bangalore	869	Finance
52	Katherine	36	Mumbai	575	Production
60	Jack	40	Noida	678	Testing

Hence, the decomposition is Lossless join decomposition.

Dependency Preserving

It is an important constraint of the database.

In the dependency preservation, at least one decomposed table must satisfy every dependency.

If a relation R is decomposed into relation R1 and R2, then the dependencies of R either must be a part of R1 or R2 or must be derivable from the combination of functional dependencies of R1 and R2.

For example, suppose there is a relation R (A, B, C, D) with functional dependency set (A->BC). The relational R is decomposed into R1(ABC) and R2(AD) which is dependency preserving because FD A->BC is a part of relation R1(ABC).

Q8) What is multivalued dependency give an example?

A8)

Multivalued Dependency

Multivalued dependency occurs when two attributes in a table are independent of each other but, both depend on a third attribute.

A multivalued dependency consists of at least two attributes that are dependent on a third attribute that's why it always requires at least three attributes.

Example: Suppose there is a bike manufacturer company which produces two colors(white and black) of each model every year.

BIKE_MODEL	MANUF_YEAR	COLOR
M2011	2008	White
M2001	2008	Black
M3001	2013	White
M3001	2013	Black
M4006	2017	White
M4006	2017	Black

Here columns COLOR and MANUF_YEAR are dependent on BIKE_MODEL and independent of each other.

In this case, these two columns can be called as multivalued dependent on BIKE_MODEL. The representation of these dependencies is shown below:

BIKE_MODEL → → MANUF_YEAR

BIKE_MODEL → → COLOR

This can be read as "BIKE_MODEL multidetermined MANUF_YEAR" and "BIKE_MODEL multidetermined COLOR".

Q9) Explain Fourth normal form (4NF)

A10)

A relation will be in 4NF if it is in Boyce Codd normal form and has no multi-valued dependency.

For a dependency A→ B, if for a single value of A, multiple values of B exists, then the relation will be a multi-valued dependency.

Example

STUDENT

STU_ID	COURSE	HOBBY
21	Computer	Dancing
21	Math	Singing
34	Chemistry	Dancing
74	Biology	Cricket
59	Physics	Hockey

The given STUDENT table is in 3NF, but the COURSE and HOBBY are two independent entity. Hence, there is no relationship between COURSE and HOBBY.

So to make the above table into 4NF, we can decompose it into two tables:

STUDENT_COURSE

STU_ID	COURSE
21	Computer
21	Math
34	Chemistry
74	Biology
59	Physics

STUDENT_HOBBY

STU_ID	HOBBY
21	Dancing
21	Singing
34	Dancing
74	Cricket
59	Hockey

Q10) Explain join dependency

A10)

Join Dependency

Join decomposition is a further generalization of Multivalued dependencies.

If the join of R1 and R2 over C is equal to relation R, then we can say that a join dependency (JD) exists.

Where R1 and R2 are the decompositions R1(A, B, C) and R2(C, D) of a given relations R (A, B, C, D).

Alternatively, R1 and R2 are a lossless decomposition of R.

A JD ⋈ {R1, R2,..., Rn} is said to hold over a relation R if R1, R2,....., Rn is a lossless-join decomposition.

The *(A, B, C, D), (C, D) will be a JD of R if the join of join's attribute is equal to the relation R.

Here, *(R1, R2, R3) is used to indicate that relation R1, R2, R3 and so on are a JD of R.

Sign Up