UNIT 1: Database

Data: Data are the collection of raw facts and figure which are unorganized, uninterpreted, and isolated, which doesn’t give any particular sense. They are randomly gathered. So that they can be processed to obtain the desired result.

Ram, 17, Pokhara, 18

Information: Information are the processed form of data. Information gives a meaning full result after data are interpreted.

Ram 17 year old and lives in Pokhara 18.

As we know, very organization from the prehistoric time deals with data and information. Data are very necessary for organization to operate. In previous time, data used to be collected randomly and were not organized in any particular order which leads to following drawbacks.

  1. Some information was stored in more than one file. (Data redundancy).
  2. Difficult to present data from the user’s view.
  3. Data security was one of the major problem.
  4. Data processing and manipulation were difficult.
  5. Data retrieval were very slow.

Hence, concept of database arise in order to overcome the above drawbacks or limitations. Database is collection of interrelated data or entities stored in tabular form. It gives very useful information for an organization during data manipulation and decision making. It provides a base for foundation for managing large volume of data in a well organized manner. Simply database is a systematic and organized form of data so that user can retrieve or access data whenever required within no time. For eg, telephone directory, dictionary, marks ledger, customer records etc.

Roll NoNameAddressContact
1Aaishu karkiPokhara0090786
2Bishaka LohaniBhairab tole3452423
[Note: Column of a database table is known a field. Row of a database table is known as record.]

In past people used file processing system / flat file system in order to store data and keep record due to difficulty in processing file and having following  limitations. This system are no longer in use rather people use DBMS in to order Keep record systematically, scientifically and securely.

Limitations of flat file system.

  1. Data security system is one of the major problem in flat file system.
  2. Data Sharing was almost impossible or difficult.
  3. Some record was stored in more than one file. (Data redundancy)
  4. Difficult to represent data through user prospective.
  5. It is difficult for processing complex queries.

DBMS (Database Management System)

DBMS is a computerized record keeping system. It is a software that defines, manipulates and  manage the database. It allows to access the file, update the record and retrieve data as requested. In other word, DBMS is defined as the collection of interrelated  data and set of programs to access there data. The collection of data is usually a database which contains the information about any particular organization. The primary goal of DBMS is to provide an effective and efficient environment for both data retrieval and  storing data in database. For example MS Access, Oracle, MY SQL, Fox-Pro, D-base etc

Advantages of DBMS

1) Sharing data: Using DBMS, data stored in database can be shared among multiple users or computer. For example, branches of bank share data from database.

2) Reduce data redundancy: Data redundancy refer to duplication or repetition of same data over and over. DBMS reduces such type of unnecessary repetition of data.

3) Data backup and recovery: Using DBMS, we can easily create spare copy of original files and documents that can be later used in case of accidental or intentional loss and damages.

4) Data security: Using DBMS we can restrict use of database to the unauthorized person. It helps to ensure date security.

5) Multiple user interface: DBMS facilitates sharing of data which means same data can be accessed from multiple device and location. Hence, user can experience multiple interface to access same set of data.

6) Data Integrity: Data Integrity refers to consistency of data. Using database we can arrange same sort of data in a like order. We can define and enforce constraints for data integrity.

Disadvantages of DBMS

1) Expensive: It may be expensive to run and operate DBMS for any organization.

2) Changing Technology: It is very much difficult to incorporate latest changing technology in existing system. In order to create and maintain database technical manpower and trainings are required.

3) Backup is needed: Since data maybe damage anytime accidentally or intentionally. So, it is mandatory to create a backup.

Database Model

There are different forms of Database Management system. Each characterized by the way where data are defined and structured. This arrangement of data in several structure are known as data base model. Different types of database model. Click here for full explanation with pros, cons and diagram.

Hierarchical database model

It is one of the oldest type of database model. In this model data are represented in the forms of record, each record has multiple field or attributes. All records are arranged in database as tree like structure. The relationship between the records is called parent child relationship in which one child record relates to only a single parent i.e child posses property only property of a single parent. Here child are restricted to use the property of a parent to whom it doesn’t belong.

Advantages

  1. It is the simplest and the easiest model.
  2. It supports one to one or one to many relationship.
  3. Searching is easier and faster if parent is known.

Disadvantages

  1. It is an old fashion and outdated database model.
  2. It doesn’t support many to one relationship.
  3. It doesn’t reduce data redundancy because some data are written over different places.

Network database model

This network model replace hierarchical model due to some limitation on the model. Suppose an employee relates to two different department then hierarchical model cannot be able to arrange record in proper place. So, network database model was emerged to arrange non-hierarchical database. The structure of database is more like graph rather than tree structure. A network model consists of collection of record which are interrelated to each other with the help of relationship. Each records have multiple fields and each field has only one data value. In this type of model a parent may have multiple child, as well as child can have multiple parent.

Advantages

  1. It accepts many to many relationships. So, It is more flexible .
  2. It reduces data redundancy.
  3. This network mode is simple and easy to design.
  4. Searching is faster due to use of multi-directional pointer.

Disadvantages

  1. Needs long program to handle the relationship.
  2. Lack of structural independence.
  3. Less security

Relational database model

In relational database model, the data are organized into tables which contains multiple row and columns. These tables are called relations. A row in a table represents a relationship among a set of values. Since, a table is collection of such relationship. It is generally referred to the mathematical term relations from which the relational database model derives its name. It is also known as RDBMS.

Note: The database system which stores and display data in tabular format of rows and column like spreadsheet is known as RDBMS. It is the most practical DBMS those days. For example, MS-Access, MY SQL, Oracle etc

Advantages

  1. There is less data redundancy.
  2. Breaking of complex database into simple is very much easier.
  3. Database processing is faster than other model.

Disadvantage

  1. Establishing more relationships complex.
  2. It requires powerful computer and data storage device.

Centralized database VS Distributed database

Centralized database

It is a simple type which works on client server basis. In this type clients or user are directly connected to the centrally totally located server. This server hosts the data of its client or user and helps them to store and retrieve data as requirement. This type of system are used in small scale industries which doesn’t have to deal with large volume of data and user. Centralized database runs on single computer which may have single or multiple user. Since database is centralized, security is not a crucial part here. The maintenance of database is easier because of data are centrally stored. This type of system denote allow unauthorized person to access data.

Advantages

  1. Suitable for small scale industries.
  2. Operation and maintenance is easier.
  3. Since it prevent unauthorized person being accessed to database, it minimizes risk factor.

Disadvantages

  1. Data are not secured in this type of system.
  2. Not suitable for large scale industries.
  3. Failure of centrally located serves will collapse whole network.

Distributed database

This type of database system are complex in structure, instead of storing and retrieving data from centrally located server, it uses several numbers of database and server randomly located at different place. It is the collection of multiple logically interrelated database which are distributed in many geographical location. Since server are located at different locations user can experience a good speed of bandwidth. Similarly, back up and recovery process is lot more easier there, which makes data more secured. This type of system is used by large organization who has to deal with large volume of data and user all around the world. Since it is distributed in nature there may arises security issue and are costly to maintain and operate. Simply, distributed database system are the collection of several number of centralized database system in different locations.

Advantages

  1. Backup and recovery of data is easier.
  2. It can handle large volume of data and user all over the word.
  3. User can experience high speed bandwidth.

Disadvantages

  1. Very expensive to operate and maintain.
  2. Data security may be real issue.
Differences between centralized and distributed database system.
CentralizedDistributed
Simple typeComplex type
Located on particular locationMany geographical location
Only one serverMultiple server in many location
Suitable for small scale industriesSuitable for large scale industries
Maintenance is easyMaintenance is difficult
Security is highSecurity is low
Low speedHigh Speed
CheapExpensive
Failure of server affect whole networkDoesn’t affect whole network
High chance of data lossLess chance of data loss

Structure query Language (SQL)

SQL stands for Structured Query Language. It is an international standard data base query language for accessing and managing data in the database. SQL was introduced and developed by IBM in early 1970’s. It was able to control relational database. SQL is not a complete programming language. It is only used for communicating with database. SQL has statement for data definition (DDL), data manipulation (DML) and data control (DCL). A query is a request to a DBMS for the retrieval, modification, insertion and deletion of the data from database.

SQL is made of three sub languages: DDL, DML and DCL

1) DDL (Data Definition Language): DDL is used by the database designer and programmers to specify the content and the structure of table. It is used to define the physical characteristics of record. It includes commands that manipulate the structure of object such as tables: For eg, to create table

Syntax:

CREATE TABLE table_name (field_name1 data_type1  field_name2 data_type2 ………);

CREATE TABLE Student (SN Number Fname text);

2) DML (Data Manipulation Language): DML is related with manipulation of records such as retrieval, sorting, display and deletion of records or data. It helps user to use query and display report of the table. It provide technique for processing the database. It includes commands like insert, delete, select, and update to manipulate the information stored in the database.

Syntax:

INSERT INTO table_name VALUES (list of values);

INSERT INTO student VALUES (1 RAM);

3) DCL (Data Control Language): All provides additional feature for security of table and database. It includes commands for controlling data and access to the database. Some of the example of this command are grant, commit, etc

Entity Relationship Database model (ER Model)

The entity relationship database model (ER diagram) is based on the perception of a real world that contains a collection of basic object called entities and relationship among these objects. The ER diagram is an overall logical structure of database that can be expressed graphically. It was developed to facilitates database design. The major objectives of ER diagram is to show relationship among different entities. It has following components.

1) Entity: The distinguishable object of this real world is known as entities. It has a set of properties which uniquely identifies an entity. For eg, if student is an entity then his/her name may be property. It is denoted by rectangle.

2) Attributes: Attributes are the properties possessed by an entity. They are represented by ellipse or oval sign. For eg, if student is an entity then its attribute can be registration number, name, roll no, class, address, etc.

3) Link: The flow of information is indicated by the link in ER diagram. It is simply denoted by a line. It is a connection of entity, attributes and relationships.

4) Relationships: A relationship is a association among several entities. It is represented by diamond. For eg, if teachers and students are two entities the association can be derived as teacher teachers students. It shows meaningful dependencies between several entities. There are 3 types of relationships. One to one. One to many, Many to many.

DBA (Database Administrator)

DBA is the most responsible person in an organization with sound knowledge of DBMS. He/she is the overall administrator of the program. He/she has the maximum amount of privileges for accessing database and defining the role of the employee which use the system. The main goal of DBA is to keep the database server up to date, secure and provide information to the user on demand.

Qualities of good DBA

  1. He/she should have sound and complete knowledge about DBMS and its operation.
  2. He/she should be familiar with several DBMS packages such as MS Access, MY SQL, Oracle etc
  3. He/she should have depth knowledge about the OS in which database server is running.
  4. He/she should have good understanding of network architecture.
  5. He/she should hove good database designing skill.

Responsibilities

  1. DBA has responsibility to install, monitor, and upgrade database server.
  2. He/she should has responsibility to maintain database security by creating backup for recovery.
  3. He/she has responsibility to conduct training on the uses of database.
  4. DBA defines user privilege, relationships and manages form, reports in database.

Normalization (v.imp)

The process of breaking down or decomposing as complex relation into simple relation. It reduces redundancy (unnecessary repetition of data) using principle of non-loss decomposition in which table are reduce to smaller tables without loss of information.

Normalization is the database design process in which complex database table is broken down into simple separate tables. It makes data model more flexible and easier to  maintain.

EF Codd has introduced few rules for normalizing the database in 1970 and these rules are known as normal forms. This process minimizes and controls the duplication of data in a database and also provides a rapid search for data from database.

Unnormalized database

Emp_codeJanuary
Emp_nameFebruary
AddressMarch
Contact no.April
Date of birthMay
DepartmentJune
DesignationJuly
Basic_salaryDaily_allowance
Travel_AllowanceGross_salary
TaxProvident _fund

Normalized database

EmployeeSalaryMonth
Emp_codeBasic_SalaryJanuary
Emp_nameTravel_allowanceFebruary
AddressDaily_allowanceMarch
Contact no.Gross_salaryApril
Date of birthProvident_funtMay
DepartmentTaxJune
Designation July

Advantages of normalization

  1. It reduces data redundancy (duplication of data)
  2. It improves faster sorting and indexing.
  3. It simplifies the structure of the database table.
  4. It improves the performance of a system.
  5. It avoids loss of information.

Normal Forms

Let us consider a following unnormalized table.

NameRollClassSub_nameSub_marksSub_nameSub_marks
Ram111Computer95Account78
Sita112Computer98Account80
Hari211Computer80Account82
Shyam212Computer92Account83

A) 1NF ( First Normal form)

  1. 1NF sets the very basic rules for on organized database.
  2. It eliminates duplicate columns from the same table.
  3. It creates separate tables from each group of related data and identify each row with a unique column called primary key.

The objective of 1NF is to divide the base datas into logical units called tables.

In above table, we can see that column of subject nome and marks are repeated which are eliminated in 1NF.

NameRollClassSub_nameSub_marks
Ram111Computer95
Ram111Account78
Sita112Computer98
Sita112Account80
Hari211Computer80
Hari211Account82
Shyam212Computer92
Shyam213Account83

B) 2NF (Second Normal form)

  1. It further addresses the concept of remaining duplicate data.
  2. It should be in first normal form.
  3. It removes the date that applies to multiple row of a table and place them in separate tables.

The objective of second NF is to take data i.e. partly dependent upon the primary key and enter the data into another table.

In above table name depends upon roll no and class, subject name only depends upon class, subject marks depends upon name and subject_name. Hence, above table can be decomposed as:

NameRollClass
Ram111
Sita112
Hari211
Shyam212
SubjectClass
Computer11
Account11
Computer12
Account12
NameSub_nameSub_marks
RamComputer95
RamAccount78
SitaComputer98
SitaAccount80
HariComputer80
HariAccount82
ShyamComputer92
ShyamAccount83

3NF (Third Normal Form)

  1. It should be in second normal form.
  2. It removes the column that are not dependent on primary key using 3NF above table can be decomposed as:
Sub_idSubject
C1Computer
A1Account
Class_idClass
XI11
XII12
Std_idClassRollClass_id
1Ram1XI
2Sita1XII
3Hari2XI
4Shyam2XII
Std_idSub_idMarks
1C195
1A178
2C198
2A180
3C180
3A182
4C192
4A183

Frequently Asked Question (FAQs)

1) Define DBMS. List out objectives of DBMS.

2) What is relational database model? List out the advantage of this model.

3) Differentiate between DBMS and RDBMS with example

4) Describe the centralized and distributed database model with figure.

5) What is hierarchical database model? list out its advantage and disadvantage.

6) What is normalization? Explain the normalization process with example.

7) Define DBA. what are the roles/ responsibilities and criteria of a good DBA.

8) Differentiate between database and DBMS with example.

9) write short notes on

  1. ERD
  2. SQL -> DDL, DML, DCL
  3. Data Integrity
  4. Data security
  5. Data Dictionary