Wednesday, 19 August 2015

clustered index vs non-clustered index

An index is an on-disk structure associated with a table or view that speeds retrieval of rows from the table or view. An index contains keys built from one or more columns in the table or view. These keys are stored in a structure (B-tree) that enables SQL Server to find the row or rows associated with the key values quickly and efficiently.

A table or view can contain the following types of indexes:

Clustered
- Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be sorted in only one order.
- The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.
Nonclustered

Nonclustered indexes have a structure separate from the data rows. A nonclustered index contains the nonclustered index key values and each key value entry has a pointer to the data row that contains the key value.
The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key.
You can add nonkey columns to the leaf level of the nonclustered index to by-pass existing index key limits, 900 bytes and 16 key columns, and execute fully covered, indexed, queries. For more information, see Create Indexes with Included Columns.

Tuesday, 18 August 2015

In SQL, what’s the difference between the having clause and the where clause?

The difference between the having and where clause is best illustrated by an example. Suppose we have a table called emp_bonus as shown below. Note that the table has multiple entries for employees A and B.

emp_bonus

Employee	Bonus
A	1000
B	2000
A	500
C	700
B	1250

If we want to calculate the total bonus that each employee received, then we would write a SQL statement like this:

select employee, sum(bonus) from emp_bonus group by employee;

The Group By Clause

In the SQL statement above, you can see that we use the "group by" clause with the employee column. What the group by clause does is allow us to find the sum of the bonuses for each employee. Using the ‘group by’ in combination with the ‘sum(bonus)’ statement will give us the sum of all the bonuses for employees A, B, and C.
Subscribe to our newsletter for more free interview questions.
Running the SQL above would return this:

Employee	Sum(Bonus)
A	1500
B	3250
C	700

Now, suppose we wanted to find the employees who received more than $1,000 in bonuses for the year of 2007. You might think that we could write a query like this:

BAD SQL:
select employee, sum(bonus) from emp_bonus 
group by employee where sum(bonus) > 1000;

The WHERE clause does not work with aggregates like SUM

The SQL above will not work, because the where clause doesn’t work with aggregates – like sum, avg, max, etc.. Instead, what we will need to use is the having clause. The having clause was added to sql just so we could compare aggregates to other values – just how the ‘where’ clause can be used with non-aggregates. Now, the correct sql will look like this:

GOOD SQL:
select employee, sum(bonus) from emp_bonus 
group by employee having sum(bonus) > 1000;

Difference between having and where clause

So we can see that the difference between the having and where clause in sql is that the where clause can not be used with aggregates, but the having clause can. One way to think of it is that the having clause is an additional filter to the where clause.

Friday, 31 July 2015

DKNF

Domain-key normal form (DK/NF) is a normal form used in database normalization which requires that the database contains no constraints other than domain constraints and key constraints.

A domain constraint specifies the permissible values for a given attribute, while a key constraint specifies the attributes that uniquely identify a row in a given table.

The domain/key normal form is achieved when every constraint on the relation is a logical consequence of the definition of keys and domains, and enforcing key and domain restraints and conditions causes all constraints to be met. Thus, it avoids all non-temporal anomalies

DDL , DML ,DCL

What are the difference between DDL, DML and DCL commands?

DDL

Data Definition Language (DDL) statements are used to define the database structure or schema. Some examples:

CREATE - to create objects in the database
ALTER - alters the structure of the database
DROP - delete objects from the database
TRUNCATE - remove all records from a table, including all spaces allocated for the records are removed
COMMENT - add comments to the data dictionary
RENAME - rename an object

DML

Data Manipulation Language (DML) statements are used for managing data within schema objects. Some examples:

SELECT - retrieve data from the a database
INSERT - insert data into a table
UPDATE - updates existing data within a table
DELETE - deletes all records from a table, the space for the records remain
MERGE - UPSERT operation (insert or update)
CALL - call a PL/SQL or Java subprogram
EXPLAIN PLAN - explain access path to data
LOCK TABLE - control concurrency

DCL

Data Control Language (DCL) statements. Some examples:

GRANT - gives user's access privileges to database
REVOKE - withdraw access privileges given with the GRANT command

TCL

Transaction Control (TCL) statements are used to manage the changes made by DML statements. It allows statements to be grouped together into logical transactions.

COMMIT - save work done
SAVEPOINT - identify a point in a transaction to which you can later roll back
ROLLBACK - restore database to original since the last COMMIT
SET TRANSACTION - Change transaction options like isolation level and what rollback segment to use

WHAT DO U MEAN BY RELATION SHIP TYPE?

Q-TYPE OF RELATIONSHIP IN DATA BASE ?
ANS-

Explain the types of relationships in database.

One-to-one
One to one is implemented using single table by establishing relationship between same type of columns in a table.

One-to-many
Implemented using two tables with primary key and foreign key relationships.

Many-to-many
Implemented using a junction table.
The keys from both the tables form composite primary key of the junction table

OR WE CAL CLASSIFY RELATIONSHIP AS

(A)- UNARY RELATIONSHIP -

RELATISONSHIP WITHIN SAME TABLE

(B) - BINARY RELATION SHIP

RELATION SHIP B/W TWO TABLES

Q- WHAT DO U MEAN BY DEGREE OF RELATIONSHIP ?

ANS- No of entity participate in the relationship is called degree of the relationship..

SYMBOLS FOR ER DIAGRAM

E-R Diagram

ER-Diagram is a visual representation of data that describes how data is related to each other.

Symbols and Notations

Components of E-R Diagram

The E-R diagram has three main components.

1) Entity

An Entity can be any object, place, person or class. In E-R Diagram, an entity is represented using rectangles. Consider an example of an Organisation. Employee, Manager, Department, Product and many more can be taken as entities from an Organisation.

Weak Entity

Weak entity is an entity that depends on another entity. Weak entity doen't have key attribute of their own. Double rectangle represents weak entity.

2) Attribute

An Attribute describes a property or characterstic of an entity. For example, Name, Age, Address etc can be attributes of a Student. An attribute is represented using eclipse.

Key Attribute

Key attribute represents the main characterstic of an Entity. It is used to represent Primary key. Ellipse with underlying lines represent Key Attribute.

Composite Attribute

An attribute can also have their own attributes. These attributes are known as Composite attribute.

3) Relationship

A Relationship describes relations between entities. Relationship is represented using diamonds.

There are three types of relationship that exist between Entities.

Binary Relationship
Recursive Relationship
Ternary Relationship

Binary Relationship

Binary Relationship means relation between two Entities. This is further divided into three types.

One to One : This type of relationship is rarely seen in real world.

The above example describes that one student can enroll only for one course and a course will also have only one Student. This is not what you will usually see in relationship.

One to Many : It reflects business rule that one entity is associated with many number of same entity. For example, Student enrolls for only one Course but a Course can have many Students.

The arrows in the diagram describes that one student can enroll for only one course.

Many to Many :

The above diagram represents that many students can enroll for more than one courses.

Recursive Relationship

When an Entity is related with itself it is known as Recursive Relationship.

Ternary Relationship

Relationship of degree three is called Ternary relationship

Generalization

Generalization is a bottom-up approach in which two lower level entities combine to form a higher level entity. In generalization, the higher level entity can also combine with other lower level entity to make further higher level entity.

Specialization

Specialization is opposite to Generalization. It is a top-down approach in which one higher level entity can be broken down into two lower level entity. In specialization, some higher level entities may not have lower-level entity sets at all.

Aggregration

Aggregration is a process when relation between two entity is treated as a single entity. Here the relation between Center and Course, is acting as an Entity in relation with Visitor.

QUESTIONS

1. What is database?

A database is a collection of information that is organized so that it can easily be accessed, managed, and updated. In one view, databases can be classified according to types of content: bibliographic, full-text, numeric, and images.

In computing, databases are sometimes classified according to their organizational approach. The most prevalent approach is the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways. A distributed database is one that can be dispersed or replicated among different points in a network. An object-oriented programming database is one that is congruent with the data defined in object classes and subclasses.

2. What is DBMS?

It is a collection of programs that enables user to create and maintain a database. In other words it is general-purpose software that provides the users with the processes of defining, constructing and manipulating the database for various applications..

3. What is a Database system?

The database and DBMS software together is called as Database system.

4. What is extension and intension?

Extension: It is the number of tuples present in a table at any instance. This is time dependent.
Intension: It is a constant value that gives the name, structure of table and the constraints laid on it.

5 What is System R? What are its two major subsystems?

System R was designed and developed over a period of 1974-79 at IBM San Jose Research Center. It is a prototype and its purpose was to demonstrate that it is possible to build a Relational System that can be used in a real life environment to solve real life problems, with performance at least comparable to that of existing system.
Its two subsystems are

Research Storage
System Relational Data System.

6 How is the data structure of System R different from the relational structure?

Unlike Relational systems in System R

Domains are not supported
Enforcement of candidate key uniqueness is optional
Enforcement of entity integrity is optional
Referential integrity is not enforced

7. What is Data Independence?

Data independence means that "the application is independent of the storage structure and access strategy of data". In other words, The ability to modify the schema definition in one level should not affect the schema definition in the next higher level.
Two types of Data Independence:

Physical Data Independence: Modification in physical level should not affect the logical level.
Logical Data Independence: Modification in logical level should affect the view level.

NOTE: Logical Data Independence is more difficult to achieve

8. What is a view? How it is related to data independence?

A view may be thought of as a virtual table, that is, a table that does not really exist in its own right but is instead derived from one or more underlying base table. In other words, there is no stored file that direct represents the view instead a definition of view is stored in data dictionary.

Growth and restructuring of base tables is not reflected in views. Thus the view can insulate users from the effects of restructuring and growth in the database. Hence accounts for logical data independence.

9. What is an Entity type?

It is a collection (set) of entities that have same attributes.

10. What is an Entity set?

It is a collection of all entities of particular entity type in the database.

11. What is Data Model?

A collection of conceptual tools for describing data, data relationships data semantics and constraints.

12. What is a Relation Schema and a Relation?

A relation Schema denoted by R(A1, A2, ..., An) is made up of the relation name R and the list of attributes Ai that it contains. A relation is defined as a set of tuples. Let r be the relation which contains set tuples (t1, t2, t3, ..., tn). Each tuple is an ordered list of n-values t=(v1,v2, ..., vn).

13. What is degree of a Relation?

It is the number of attribute of its relation schema.

14. What is Relationship?

It is an association among two or more entities.

15 -How does Tuple-oriented relational calculus differ from domain-oriented relational calculus?

The tuple-oriented calculus uses a tuple variables i.e., variable whose only permitted values are tuples of that relation. E.g. QUEL
The domain-oriented calculus has domain variables i.e., variables that range over the underlying domains instead of over relation. E.g. ILL, DEDUCE

16. What is indexing and what are the different kinds of indexing?

Indexing is a technique for determining how quickly specific data can be found.
Types:

Binary search style indexing
B-Tree indexing
Inverted list indexing
Memory resident table
Table indexing