RDBMS stands for Relational Database Management System. It is a type of database management system based on the relational model. It is the basis for SQL and for all modern database systems such as MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access. Before relational systems, data was managed with network databases, hierarchical databases, and plain file systems. Research continues on the next level, the Object-Oriented Database Management System.

History of SQL

Dr. E. F. Codd published the paper, “A Relational Model of Data for Large Shared Data Banks”, in June 1970 in the Association for Computing Machinery (ACM) journal, Communications of the ACM. Codd’s model is now accepted as the definitive model for relational database management systems (RDBMS). The language, Structured English Query Language (“SEQUEL”), was developed by IBM Corporation to use Codd’s model. SEQUEL later became SQL (still pronounced “sequel”). In 1979, Relational Software, Inc. (now Oracle Corporation) introduced the first commercially available implementation of SQL. Today, SQL is accepted as the standard RDBMS language.

SQL

SQL uses the terms table, row, and column for relation, tuple, and attribute, respectively. A table is uniquely identified by its name and consists of rows that contain the stored information, each row holding exactly one tuple (or record). The following is an example of an Employees table:

+----+---------+------------+----------+
| ID | NAME    | D.O.B      | SALARY   |
+----+---------+------------+----------+
| 1  | John    | 05.10.1985 |  2000.00 |
| 2  | Milly   | 02.04.1970 |  7500.00 |
| 3  | Bob     | 25.06.1977 |  8000.00 |
| 4  | Ronny   | 12.01.1980 |  6500.00 |
| 5  | Den     | 30.07.1975 |  8500.00 |
| 6  | Claudia | 22.12.1981 |  4500.00 |
| 7  | Muffy   | 06.11.1968 | 10000.00 |
+----+---------+------------+----------+

A table can have one or more columns. A column is made up of a column name and a data type, and it describes an attribute of the tuples. The structure of a table, also called relation schema, thus is defined by its attributes. The type of information to be stored in a table is defined by the data types of the attributes at table creation time. The table is a collection of related data entries and it consists of columns and rows.
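As a minimal sketch of these ideas, the snippet below creates a small Employees table and queries it. It assumes Python's built-in sqlite3 module purely so that the example is runnable; the article does not name a particular database. The column name DOB, the REAL type for SALARY, and the ISO date strings are illustrative choices, not part of the original table.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Each column is a name plus a data type, fixed at table creation time.
conn.execute("""
    CREATE TABLE EMPLOYEES (
        ID     INTEGER PRIMARY KEY,
        NAME   VARCHAR(20) NOT NULL,
        DOB    DATE,
        SALARY REAL
    )
""")

# Each tuple below becomes one row of the table.
rows = [
    (1, "John", "1985-10-05", 2000.00),
    (2, "Milly", "1970-04-02", 7500.00),
    (3, "Bob", "1977-06-25", 8000.00),
]
conn.executemany("INSERT INTO EMPLOYEES VALUES (?, ?, ?, ?)", rows)

# The relation schema: column names and declared types.
schema = [(col[1], col[2]) for col in conn.execute("PRAGMA table_info(EMPLOYEES)")]

# Select two attributes of the rows whose salary exceeds 5000.
high_paid = conn.execute(
    "SELECT NAME, SALARY FROM EMPLOYEES WHERE SALARY > 5000 ORDER BY ID"
).fetchall()

print(schema)
print(high_paid)
```

The PRAGMA call is specific to SQLite; other database systems expose the same schema information through their own catalog views.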

Database Normalization

(Tip: Freshers are asked about this segment in detail.)

Database normalization is the process of efficiently organizing data in a database. The normalization process has two goals:

  1. Eliminating redundant data, for example, storing the same data in more than one table.
  2. Ensuring data dependencies make sense.

Normalization guidelines are divided into normal forms; think of form as the format or the way a database structure is laid out. The aim of normal forms is to organize the database structure so that it complies with certain rules of which first normal form, then second normal form, and finally third normal form are most talked about.

It’s your choice to take it further and go to fourth normal form, fifth normal form, and so on, but generally speaking, third normal form is enough. (TIP: Freshers are asked more about the other normal forms and should cram them with examples. Interviewers often present an example like the one below and ask you to apply the normal forms to resolve it into well-structured tables.)

First normal form (1NF) sets the basic rules for an organized database:

  • Define the data items required, because they become the columns in a table. Place related data items in a table.
  • Ensure that there are no repeating groups of data.
  • Ensure that there is a primary key.

First Rule of 1NF:

You must define the data items. This means looking at the data to be stored, organizing the data into columns, defining what type of data each column contains, and finally putting related columns into their own table.

For example, you put all the columns relating to locations of meetings in the Location table, those relating to employees in the EmployeeDetails table, and so on.

Second Rule of 1NF:

The next step is ensuring that there are no repeating groups of data. Consider the following table:

Problem:

CREATE TABLE CUSTOMERS(
    ID INT NOT NULL,
    NAME VARCHAR(20) NOT NULL,
    AGE INT NOT NULL,
    ORDERS VARCHAR(155)
);

If we populate this table for a single customer with multiple orders, it looks as follows:

+-----+-------+-----+-----------------+
| ID  | NAME  | AGE | ORDERS          |
+-----+-------+-----+-----------------+
| 101 | Rohit | 30  | Cannon Xius-Red |
| 101 | Rohit | 30  | Sony Laptop     |
| 101 | Rohit | 30  | Tripod Large    |
+-----+-------+-----+-----------------+

But as per 1NF, we need to ensure that there are no repeating groups of data. So let us break the above table into two parts and join them using a key as follows:

Solution:

CUSTOMERS table:

CREATE TABLE CUSTOMERS(
    ID INT NOT NULL,
    NAME VARCHAR(20) NOT NULL,
    AGE INT NOT NULL,
    PRIMARY KEY (ID)
);

This table would have the following record:

+-----+-------+-----+
| ID  | NAME  | AGE |
+-----+-------+-----+
| 101 | Rohit | 30  |
+-----+-------+-----+

ORDERS table:

CREATE TABLE ORDERS(
    ID INT NOT NULL,
    CUSTOMER_ID INT NOT NULL,
    ORDERS VARCHAR(155),
    PRIMARY KEY (ID)
);

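This table would have the following records:

+----+-------------+-----------------+
| ID | CUSTOMER_ID | ORDERS          |
+----+-------------+-----------------+
| 10 | 101         | Cannon Xius-Red |
| 11 | 101         | Sony Laptop     |
| 12 | 101         | Tripod Large    |
+----+-------------+-----------------+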
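The split can be tried out with a rough sketch like the one below. It assumes Python's built-in sqlite3 module as a convenient stand-in for any SQL database, loads the two tables with the rows shown above, and joins them back together on CUSTOMER_ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMERS (
        ID   INT NOT NULL,
        NAME VARCHAR(20) NOT NULL,
        AGE  INT NOT NULL,
        PRIMARY KEY (ID)
    );
    CREATE TABLE ORDERS (
        ID          INT NOT NULL,
        CUSTOMER_ID INT NOT NULL,
        ORDERS      VARCHAR(155),
        PRIMARY KEY (ID)
    );
    INSERT INTO CUSTOMERS VALUES (101, 'Rohit', 30);
    INSERT INTO ORDERS VALUES (10, 101, 'Cannon Xius-Red');
    INSERT INTO ORDERS VALUES (11, 101, 'Sony Laptop');
    INSERT INTO ORDERS VALUES (12, 101, 'Tripod Large');
""")

# The customer's details are stored once ...
customer_count = conn.execute("SELECT COUNT(*) FROM CUSTOMERS").fetchone()[0]

# ... and the key CUSTOMER_ID joins each order back to its customer.
joined = conn.execute("""
    SELECT c.ID, c.NAME, c.AGE, o.ORDERS
    FROM CUSTOMERS AS c
    JOIN ORDERS AS o ON o.CUSTOMER_ID = c.ID
    ORDER BY o.ID
""").fetchall()

for row in joined:
    print(row)
```

The join returns the same three wide rows as the original single table, but the name and age are now stored only once.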

Third Rule of 1NF:

The final rule of first normal form is to create a primary key for each table, which we have already done above.

2NF:

Second normal form states that a table must meet all the rules for 1NF and that no column may have a partial dependency on the primary key.

Consider a customer-order relation in which you want to store the customer ID, customer name, order ID, order detail, and date of purchase:

CREATE TABLE CUSTOMERS(
    CUST_ID INT NOT NULL,
    CUST_NAME VARCHAR(20) NOT NULL,
    ORDER_ID INT NOT NULL,
    ORDER_DETAIL VARCHAR(20) NOT NULL,
    SALE_DATE DATETIME,
    PRIMARY KEY (CUST_ID, ORDER_ID)
);

This table is in first normal form, in that it obeys all the rules of first normal form.

Problem:

In this table, the primary key consists of CUST_ID and ORDER_ID. Combined, they are unique, on the assumption that the same customer would hardly place the same order twice.

However, the table is not in second normal form, because some columns depend on only part of the primary key. CUST_NAME depends on CUST_ID alone; there is no real link between a customer’s name and what he purchased. ORDER_DETAIL and SALE_DATE depend on ORDER_ID, but not on CUST_ID, because there is no link between a CUST_ID and an ORDER_DETAIL or its SALE_DATE.
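To see why a partial dependency is a practical problem, consider the sketch below. It assumes Python's built-in sqlite3 module and some hypothetical sample rows (the order details and dates are made up for illustration). Because CUST_NAME is repeated in every order row, an update that touches only one row leaves the customer with two different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE CUSTOMERS (
        CUST_ID      INT NOT NULL,
        CUST_NAME    VARCHAR(20) NOT NULL,
        ORDER_ID     INT NOT NULL,
        ORDER_DETAIL VARCHAR(20) NOT NULL,
        SALE_DATE    DATETIME,
        PRIMARY KEY (CUST_ID, ORDER_ID)
    )
""")

# Hypothetical sample data: one customer with two orders.
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?)",
    [
        (101, "Rohit", 10, "Sony Laptop", "2024-01-05"),
        (101, "Rohit", 11, "Tripod Large", "2024-02-11"),
    ],
)

# A rename that touches only one of the customer's rows is accepted ...
conn.execute(
    "UPDATE CUSTOMERS SET CUST_NAME = 'Rohit K' WHERE CUST_ID = 101 AND ORDER_ID = 10"
)

# ... leaving two different names for the same CUST_ID.
names = conn.execute(
    "SELECT DISTINCT CUST_NAME FROM CUSTOMERS WHERE CUST_ID = 101 ORDER BY CUST_NAME"
).fetchall()
print(names)
```

The database has no way to reject the inconsistent update, since nothing in the table design says that a CUST_ID has exactly one name.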

Solution:

To make this table comply with second normal form, you need to separate the columns into three tables.

First, create a table to store the customer details as follows:

CREATE TABLE CUSTOMERS(
    CUST_ID INT NOT NULL,
    CUST_NAME VARCHAR(20) NOT NULL,
    PRIMARY KEY (CUST_ID)
);

Next, create a table to store the details of each order, including its sale date, which depends on ORDER_ID alone:

CREATE TABLE ORDERS(
    ORDER_ID INT NOT NULL,
    ORDER_DETAIL VARCHAR(20) NOT NULL,
    SALE_DATE DATETIME,
    PRIMARY KEY (ORDER_ID)
);

Finally, create a third table storing just CUST_ID and ORDER_ID to keep track of all the orders for a customer:

CREATE TABLE CUSTOMERORDERS(
    CUST_ID INT NOT NULL,
    ORDER_ID INT NOT NULL,
    PRIMARY KEY (CUST_ID, ORDER_ID)
);
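The three-table design can be exercised with a sketch along these lines. It assumes Python's built-in sqlite3 module, spells the link table CUSTOMERORDERS, uses hypothetical sample rows, and leaves SALE_DATE out to keep the example short. With the name stored in one place, a single UPDATE renames the customer for every order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMERS (
        CUST_ID   INT NOT NULL,
        CUST_NAME VARCHAR(20) NOT NULL,
        PRIMARY KEY (CUST_ID)
    );
    CREATE TABLE ORDERS (
        ORDER_ID     INT NOT NULL,
        ORDER_DETAIL VARCHAR(20) NOT NULL,
        PRIMARY KEY (ORDER_ID)
    );
    CREATE TABLE CUSTOMERORDERS (
        CUST_ID  INT NOT NULL,
        ORDER_ID INT NOT NULL,
        PRIMARY KEY (CUST_ID, ORDER_ID)
    );
    INSERT INTO CUSTOMERS VALUES (101, 'Rohit');
    INSERT INTO ORDERS VALUES (10, 'Sony Laptop');
    INSERT INTO ORDERS VALUES (11, 'Tripod Large');
    INSERT INTO CUSTOMERORDERS VALUES (101, 10);
    INSERT INTO CUSTOMERORDERS VALUES (101, 11);
""")

# The name now lives in exactly one row, so one UPDATE renames the customer.
conn.execute("UPDATE CUSTOMERS SET CUST_NAME = 'Rohit K' WHERE CUST_ID = 101")

# Joining through the link table rebuilds the original wide rows.
report = conn.execute("""
    SELECT c.CUST_ID, c.CUST_NAME, o.ORDER_ID, o.ORDER_DETAIL
    FROM CUSTOMERORDERS AS co
    JOIN CUSTOMERS AS c ON c.CUST_ID = co.CUST_ID
    JOIN ORDERS AS o ON o.ORDER_ID = co.ORDER_ID
    ORDER BY o.ORDER_ID
""").fetchall()

for row in report:
    print(row)
```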

3NF

A table is in third normal form when the following conditions are met:

  • It is in second normal form.
  • All non-key fields depend only on the primary key, not on other non-key fields.

A dependency between non-key fields arises from the data itself. For example, in the table below, street name, city, and state are bound to the zip code rather than to the customer.

Problem:

CREATE TABLE CUSTOMERS(
    CUST_ID INT NOT NULL,
    CUST_NAME VARCHAR(20) NOT NULL,
    DOB DATE,
    STREET VARCHAR(200),
    CITY VARCHAR(100),
    STATE VARCHAR(100),
    ZIP VARCHAR(12),
    EMAIL_ID VARCHAR(256),
    PRIMARY KEY (CUST_ID)
);

The dependency between zip code and address is called a transitive dependency: the address fields depend on ZIP, which in turn depends on CUST_ID. To comply with third normal form, move the STREET, CITY, and STATE fields into their own table keyed by zip code, called ADDRESS here:

Solution:

CREATE TABLE ADDRESS(
    ZIP VARCHAR(12),
    STREET VARCHAR(200),
    CITY VARCHAR(100),
    STATE VARCHAR(100),
    PRIMARY KEY (ZIP)
);

Next, alter the CUSTOMERS table as follows:

CREATE TABLE CUSTOMERS(
    CUST_ID INT NOT NULL,
    CUST_NAME VARCHAR(20) NOT NULL,
    DOB DATE,
    ZIP VARCHAR(12),
    EMAIL_ID VARCHAR(256),
    PRIMARY KEY (CUST_ID)
);

The advantages of removing transitive dependencies are mainly twofold. First, the amount of data duplication is reduced and therefore your database becomes smaller.

The second advantage is data integrity. When duplicated data changes, there’s a big risk of updating only some of the data, especially if it’s spread out in a number of different places in the database. For example, if address and zip code data were stored in three or four different tables, then any changes in zip codes would need to ripple out to every record in those three or four tables.
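The integrity benefit can be illustrated with a short sketch. It assumes Python's built-in sqlite3 module and hypothetical sample rows (the names, zip code, and deliberately misspelled city are invented for the example). Two customers share a zip code, and the city is corrected with one UPDATE against the ADDRESS table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ADDRESS (
        ZIP    VARCHAR(12),
        STREET VARCHAR(200),
        CITY   VARCHAR(100),
        STATE  VARCHAR(100),
        PRIMARY KEY (ZIP)
    );
    CREATE TABLE CUSTOMERS (
        CUST_ID   INT NOT NULL,
        CUST_NAME VARCHAR(20) NOT NULL,
        DOB       DATE,
        ZIP       VARCHAR(12),
        EMAIL_ID  VARCHAR(256),
        PRIMARY KEY (CUST_ID)
    );
    -- Hypothetical sample rows: two customers share one zip code.
    INSERT INTO ADDRESS VALUES ('12345', 'Main Street', 'Springfeild', 'Illinois');
    INSERT INTO CUSTOMERS VALUES (1, 'Rohit', '1990-01-01', '12345', 'rohit@example.com');
    INSERT INTO CUSTOMERS VALUES (2, 'Milly', '1970-04-02', '12345', 'milly@example.com');
""")

# Correct the misspelled city once, in the single row that stores it.
conn.execute("UPDATE ADDRESS SET CITY = 'Springfield' WHERE ZIP = '12345'")

# Every customer with that zip code sees the corrected value.
cities = conn.execute("""
    SELECT c.CUST_NAME, a.CITY
    FROM CUSTOMERS AS c
    JOIN ADDRESS AS a ON a.ZIP = c.ZIP
    ORDER BY c.CUST_ID
""").fetchall()
print(cities)
```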

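BCNF

Consider the following non-BCNF table whose functional dependencies follow the {AB → C, C → B} pattern:

Nearest Shops

+----------+-------------+----------------+
| Person   | Shop Type   | Nearest Shop   |
+----------+-------------+----------------+
| Davidson | Optician    | Eagle Eye      |
| Davidson | Hairdresser | Snippets       |
| Wright   | Bookshop    | Merlin Books   |
| Fuller   | Bakery      | Doughy’s       |
| Fuller   | Hairdresser | Sweeney Todd’s |
| Fuller   | Optician    | Eagle Eye      |
+----------+-------------+----------------+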

For each Person / Shop Type combination, the table tells us which shop of this type is geographically nearest to the person’s home. We assume for simplicity that a single shop cannot be of more than one type. The candidate keys of the table are:

  • {Person, Shop Type}
  • {Person, Nearest Shop}

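The table is in 3NF but not in BCNF: Shop Type depends on Nearest Shop (the C → B part of the pattern), and {Nearest Shop} is not a superkey. Nothing therefore stops a row from pairing a shop with the wrong shop type.

There are several possible solutions, but a solution must not break the guarantees of the earlier normal forms. A design that eliminates these anomalies (but does not itself conform to BCNF) is possible. It consists of the original “Nearest Shops” table supplemented by a “Shop” table that maps each shop to its type.

Nearest Shops

+----------+-------------+----------------+
| Person   | Shop Type   | Nearest Shop   |
+----------+-------------+----------------+
| Davidson | Optician    | Eagle Eye      |
| Davidson | Hairdresser | Snippets       |
| Wright   | Bookshop    | Merlin Books   |
| Fuller   | Bakery      | Doughy’s       |
| Fuller   | Hairdresser | Sweeney Todd’s |
| Fuller   | Optician    | Eagle Eye      |
+----------+-------------+----------------+

Shop

+----------------+-------------+
| Shop           | Shop Type   |
+----------------+-------------+
| Eagle Eye      | Optician    |
| Snippets       | Hairdresser |
| Merlin Books   | Bookshop    |
| Doughy’s       | Bakery      |
| Sweeney Todd’s | Hairdresser |
+----------------+-------------+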

Conclusions from applying BCNF:

  • When there is more than one candidate key, a relational table may be in 3NF and anomalies may still result.
  • This occurs when there is a composite primary key, and there are two equally valid candidates to make up part of this composite primary key. If there is an attribute (one or more columns) on which any other attribute is fully dependent, and this attribute is NOT itself a candidate key, then the table is not in Boyce-Codd Normal form (BCNF).
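The BCNF condition can be checked against the sample rows of the “Nearest Shops” table with a small sketch in plain Python. The helper names fd_holds and is_unique are hypothetical, introduced only for this illustration. Note that testing sample data can only show that a dependency is not violated by those rows; whether it holds in general is a design decision about the schema.

```python
# Rows of the "Nearest Shops" table: (person, shop_type, nearest_shop).
nearest_shops = [
    ("Davidson", "Optician", "Eagle Eye"),
    ("Davidson", "Hairdresser", "Snippets"),
    ("Wright", "Bookshop", "Merlin Books"),
    ("Fuller", "Bakery", "Doughy's"),
    ("Fuller", "Hairdresser", "Sweeney Todd's"),
    ("Fuller", "Optician", "Eagle Eye"),
]
PERSON, SHOP_TYPE, NEAREST_SHOP = 0, 1, 2


def fd_holds(rows, lhs, rhs):
    """Return True if the columns in lhs determine the columns in rhs in these rows."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs)
        value = tuple(row[i] for i in rhs)
        # setdefault stores the first value seen for a key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_unique(rows, cols):
    """Return True if no two rows agree on all of the given columns."""
    keys = [tuple(row[i] for i in cols) for row in rows]
    return len(keys) == len(set(keys))


# The determinant {Nearest Shop} fixes the shop type ...
shop_determines_type = fd_holds(nearest_shops, [NEAREST_SHOP], [SHOP_TYPE])
# ... yet it does not identify a row, because Eagle Eye appears twice.
shop_is_key = is_unique(nearest_shops, [NEAREST_SHOP])
# Both candidate keys do identify rows in this sample.
keys_ok = is_unique(nearest_shops, [PERSON, SHOP_TYPE]) and is_unique(
    nearest_shops, [PERSON, NEAREST_SHOP]
)

print(shop_determines_type, shop_is_key, keys_ok)
```

A determinant that is not a key is exactly the situation the second conclusion describes, which is why the table falls short of BCNF.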

 

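4NF

Pizza Delivery Permutations

+------------------+---------------+---------------+
| Restaurant       | Pizza Variety | Delivery Area |
+------------------+---------------+---------------+
| A1 Pizza         | Thick Crust   | Springfield   |
| A1 Pizza         | Thick Crust   | Shelbyville   |
| A1 Pizza         | Thick Crust   | Capital City  |
| A1 Pizza         | Stuffed Crust | Springfield   |
| A1 Pizza         | Stuffed Crust | Shelbyville   |
| A1 Pizza         | Stuffed Crust | Capital City  |
| Elite Pizza      | Thin Crust    | Capital City  |
| Elite Pizza      | Stuffed Crust | Capital City  |
| Vincenzo’s Pizza | Thick Crust   | Springfield   |
| Vincenzo’s Pizza | Thick Crust   | Shelbyville   |
| Vincenzo’s Pizza | Thin Crust    | Springfield   |
| Vincenzo’s Pizza | Thin Crust    | Shelbyville   |
+------------------+---------------+---------------+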

Each row indicates that a given restaurant can deliver a given variety of pizza to a given area.

The table has no non-key attributes because its only key is {Restaurant, Pizza Variety, Delivery Area}. Therefore it meets all normal forms up to BCNF. If we assume, however, that pizza varieties offered by a restaurant are not affected by delivery area, then it does not meet 4NF. The problem is that the table features two non-trivial multivalued dependencies on the {Restaurant} attribute (which is not a superkey). The dependencies are:

  • {Restaurant} ↠ {Pizza Variety}
  • {Restaurant} ↠ {Delivery Area}

This state of affairs leads to redundancy in the table: for example, we are told three times that A1 Pizza offers Stuffed Crust, and if A1 Pizza starts producing Cheese Crust pizzas then we will need to add multiple rows, one for each of A1 Pizza’s delivery areas.

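To remove this redundancy, place the facts about varieties offered in a different table from the facts about delivery areas, yielding two tables that are both in 4NF:

Varieties By Restaurant

+------------------+---------------+
| Restaurant       | Pizza Variety |
+------------------+---------------+
| A1 Pizza         | Thick Crust   |
| A1 Pizza         | Stuffed Crust |
| Elite Pizza      | Thin Crust    |
| Elite Pizza      | Stuffed Crust |
| Vincenzo’s Pizza | Thick Crust   |
| Vincenzo’s Pizza | Thin Crust    |
+------------------+---------------+

Delivery Areas By Restaurant

+------------------+---------------+
| Restaurant       | Delivery Area |
+------------------+---------------+
| A1 Pizza         | Springfield   |
| A1 Pizza         | Shelbyville   |
| A1 Pizza         | Capital City  |
| Elite Pizza      | Capital City  |
| Vincenzo’s Pizza | Springfield   |
| Vincenzo’s Pizza | Shelbyville   |
+------------------+---------------+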
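The effect of the decomposition can be sketched in plain Python by treating each table as a set of rows. The function name natural_join is a hypothetical helper for this illustration, and the example assumes, as the text does, that a restaurant's varieties do not depend on the delivery area. Adding Cheese Crust for A1 Pizza then takes one new row, and the join derives one permutation per delivery area.

```python
# The two 4NF tables, as sets of rows.
varieties_by_restaurant = {
    ("A1 Pizza", "Thick Crust"),
    ("A1 Pizza", "Stuffed Crust"),
    ("Elite Pizza", "Thin Crust"),
    ("Elite Pizza", "Stuffed Crust"),
    ("Vincenzo's Pizza", "Thick Crust"),
    ("Vincenzo's Pizza", "Thin Crust"),
}
delivery_areas_by_restaurant = {
    ("A1 Pizza", "Springfield"),
    ("A1 Pizza", "Shelbyville"),
    ("A1 Pizza", "Capital City"),
    ("Elite Pizza", "Capital City"),
    ("Vincenzo's Pizza", "Springfield"),
    ("Vincenzo's Pizza", "Shelbyville"),
}


def natural_join(varieties, areas):
    """Join the two tables on Restaurant, giving (restaurant, variety, area) rows."""
    return {
        (restaurant, variety, area)
        for restaurant, variety in varieties
        for other, area in areas
        if restaurant == other
    }


before = natural_join(varieties_by_restaurant, delivery_areas_by_restaurant)

# A1 Pizza starts offering Cheese Crust: one new row in one table ...
varieties_by_restaurant.add(("A1 Pizza", "Cheese Crust"))
after = natural_join(varieties_by_restaurant, delivery_areas_by_restaurant)

# ... which the join expands to one permutation per A1 Pizza delivery area.
new_rows = after - before
for row in sorted(new_rows):
    print(row)
```

In the single-table design the same change would have required inserting each of those permutations by hand.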

TIP: Interviewers usually ask only for the definitions of the following normal forms, because they are rarely applied at the SQL level. They are mostly the concern of database administrators.

Fifth normal form (5NF), also known as Project-join normal form (PJ/NF) is a level of database normalization. A table is said to be in the 5NF if and only if every join dependency in it is implied by the candidate keys.

Domain/key normal form (DKNF) is a normal form used in database normalization which requires that the database contains no constraints other than domain constraints and key constraints. A domain constraint specifies the permissible values for a given attribute, while a key constraint specifies the attributes that uniquely identify a row in a given table.

6NF is intended to decompose relation variables to irreducible components. Any relation in 6NF is also in 5NF.

An Oracle database is a collection of data. The purpose of a database is to store and retrieve related information. A database server is the key to solving the problems of information management. The server receives and processes the SQL and PL/SQL statements that originate from client applications. In general, it manages a large amount of data in a multiuser environment so that many users can concurrently access the same data, all while delivering high performance. Database administrators are appointed to maintain that performance. A database server also prevents unauthorized access and provides efficient solutions for failure recovery.

The database has logical structures (e.g. schema objects) and physical structures (e.g. database files). Because the physical and logical structures are separate, the physical storage of data can be managed without affecting the access to logical storage structures.

Schemas and Schema Objects

A schema is a collection of database objects. A schema is owned by a database user and has the same name as that user. Schema objects are the logical structures that directly refer to the database’s data. Schema objects include structures like tables, views, and indexes. For example, Scott is a schema, owned by the database user of the same name.