Tuesday, May 10, 2016

sql interview questions

1)what is key difference between dbms and rdbms:

The key difference is that RDBMS (relational database management system) applications store data in a tabular form, while DBMS applications store data as files.
Does that mean there are no tables in a DBMS?
There can be, but there will be no “relation” between the tables, like in a RDBMS. In DBMS, data is generally stored in either a hierarchical form or a navigational form

2)What is Primary key,Foreign key,candidate key,Alternate key
Primary is used to identify a record uniquely in a table
Foregin key:.The primary key is the column or set of columns used to uniquely identify the items in a table. A foreign key is used to uniquely identify the items in a different table, allowing join operations to happen.
candidate keys: are keys which can be considered for use to primary or foreighn keys
 Alternate keys:
 3)Difference between primary and unique
Primary does not allow null,restricts entry of duplciate as well as null
unique allws 1 null value, restricts entry of duplicate values
why unique Allows only 1 null : Its a design decision.
 

Referential Integrity:

Data is usually spread across several tables, which are related to each other through key columns. Database normalization is used to avoid duplicates in a database, thus avoiding potential errors. Although it's true that database normalization minimizes the chance for errors, it doesn't eliminate them. There is still the need for a set of data-integrity rules that will enforce the relationships between tables and keep the database as consistent as possible. This set of rules is called referential integrity, and it is a part of the wider set of rules that enforce data integrity in general.
4)What is normalization,forms of normalization


Problem:
The definition of first, second, third, fourth and fifth normal forms.

Solution:
In the process of efficiently storing data, and eliminating redundancy,  tables in a database are 
designed and created to be in one of five possible normal forms.  Each normal form contains and 
enforces the rules of the previous form,  and,  in turn,  applies some stricter rules on the design of 
tables.

A set of tables in a database are initially said to be in 0 normal form.


First Normal Form:
===============
  Tables are said to be in first normal form when:
  - The table has a primary key.
  - No single attribute (column) has multiple values.
  - The non-key attributes (columns) depend on the primary key.

  Some examples of placing a table in first normal form are:

  author_id:  stories:
  000024   novelist, playwright  // multiple values
  000034   magazine columnist
  002345   novella, newpaper columnist // multiple values

  In first normal form the table would look like:

  author_id:  stories:
  000024   novelist
  000024   playwright
  000034   magazine columnist
  002345   novella
  002345   newpaper columnist


Second Normal Form:
=================
  Tables are said to be in second normal form when:
  - The tables meet the criteria for first normal form.
  - If the primary key is a composite of attributes (contains multiple 
    columns), the non key attributes (columns) must depend on the whole
    key.

  Note: Any table with a primay key that is composed of a single 
  attribute (column) is automatically in second normal form.

Third Normal Form:
================
  Tables are said to be in third normal form when:
  - The tables meet the criteria for second normal form.
  - Each non-key attribute in a row does not depend on the entry in 
    another key column.

Fourth Normal Form:
================
  Tables are said to be in fourth normal form when:
  - The table meets the criteria for third normal form.
  - Situations where non-key attributes depend on the key column 
    exclusive of other non-key columns are eliminated.


Fifth Normal Form:
===============
  Tables are said to be in fifth normal form when:
  - The table meets the criteria for fourth normal form.
  - The table consists of a key attribute and a non-key attribute only.
 
 
1NF is the most basic of normal forms - each cell in a table must contain only one piece of information, and there can be no duplicate rows.
2NF and 3NF are all about being dependent on the primary key. Recall that a primary key can be made up of multiple columns. As Chris said in his response:
The data depends on the key [1NF], the whole key [2NF] and nothing but the key [3NF] (so help me Codd).

2NF

Say you have a table containing courses that are taken in a certain semester, and you have the following data:
|-----Primary Key----|               uh oh |
                                           V
CourseID | SemesterID | #Places  | Course Name  |
------------------------------------------------|
IT101    |   2009-1   | 100      | Programming  |
IT101    |   2009-2   | 100      | Programming  |
IT102    |   2009-1   | 200      | Databases    |
IT102    |   2010-1   | 150      | Databases    |
IT103    |   2009-2   | 120      | Web Design   |
This is not in 2NF, because the fourth column does not rely upon the entire key - but only a part of it. The course name is dependent on the Course's ID, but has nothing to do with which semester it's taken in. Thus, as you can see, we have duplicate information - several rows telling us that IT101 is programming, and IT102 is Databases. So we fix that by moving the course name into another table, where CourseID is the ENTIRE key.
Primary Key |

CourseID    |  Course Name |
---------------------------|
IT101       | Programming  |
IT102       | Databases    |
IT103       | Web Design   |
No redundancy!

3NF

Okay, so let's say we also add the name of the teacher of the course, and some details about them, into the RDBMS:
|-----Primary Key----|                           uh oh |
                                                       V
Course  |  Semester  |  #Places   |  TeacherID  | TeacherName  |
---------------------------------------------------------------|
IT101   |   2009-1   |  100       |  332        |  Mr Jones    |
IT101   |   2009-2   |  100       |  332        |  Mr Jones    |
IT102   |   2009-1   |  200       |  495        |  Mr Bentley  |
IT102   |   2010-1   |  150       |  332        |  Mr Jones    |
IT103   |   2009-2   |  120       |  242        |  Mrs Smith   |
Now hopefully it should be obvious that TeacherName is dependent on TeacherID - so this is not in 3NF. To fix this, we do much the same as we did in 2NF - take the TeacherName field out of this table, and put it in its own, which has TeacherID as the key.
 Primary Key |

 TeacherID   | TeacherName  |
 ---------------------------|
 332         |  Mr Jones    |
 495         |  Mr Bentley  |
 242         |  Mrs Smith   |
No redundancy!!
One important thing to remember is that if something is not in 1NF, it is not in 2NF or 3NF either. So each additional Normal Form requires everything that the lower normal forms had, plus some extra conditions, which must all be fulfilled.
 
 5)difference between truncate and drop
truncate empties table  ------ drop delete data and structure as well
6) what are acid properties?
Atomicity - Allon non rule for db modifications
Consistency gurantess that transaction never leaves db in inconsistent state
Integrity keeps transaction seperate from each other until they are finished
Durability :db will keep track of changes ins such a way that it can be recovered
7) Type of constraints
not null
null
foreign_key
Primary key
Unique
default 
check
8)DML commands,DDl,DCL commands

DDL :create/alter/drop
DML : Insert update,delete,Select
DCL:Grant Commit,Rollback,revoke

9)Difference between having and where?
10)Difference between union and union all
     union all does not include duplicates
     union all does not include duplicates
11) distinct
DISTINCT (if used outside an aggregation function) that is superfluous when the SELECT clause includes all the columns specified in the GROUP BY clause. The GROUP BY clause groups the rows in such a way that the columns on which they are grouped no longer contain duplicate values.

DISTINCT filters out duplicate values of your returned fields.

12) Aggregate functions
min()
max()
avg()
sum()
count()
15) select from table where condition with and or 
16) IN/not in
17) Rules for group by group by rules 
 Any column specification specified in the SELECT clause must exclusively
occur as a parameter of an aggregated function
OR

In the list of columns given in the GROUP BY clause,
OR
or in both

2)An expression that occurs in the GROUP BY clause can appear in the SELECT clause.


Incorrect as TOWN does not appear in aggregated count function and it also does not occur in group by clause

SELECT   TOWN, COUNT(*)
FROM     PLAYERS
GROUP BY PLAYERNO

Correct as sal appears in count,

SELECT   sal, COUNT(sal)
FROM     PLAYERS
GROUP BY deptno;


Correct as dept no occurs in group by

SELECT   deptno, COUNT(*) as number
FROM     emp
GROUP BY deptno;


10.5 Grouping of NULL Values
If grouping is required on a column that contains NULL values,
all these NULL values form one group. When rows are grouped,
NULL values are also considered to be equal.
The reason is that, with a GROUP BY, a vertical comparison is applied.

Group By X means put all those with the same value for X in the one group.

Group By X, Y means put all those with the same values for both X and Y in the one group.

JOINS in SQL:

What is SQL JOIN ?

SQL JOIN is a method to retrieve data from two or more database tables.

What are the different SQL JOINs ?

There are a total of five JOINs. They are :
  1. JOIN or INNER JOIN
  2. OUTER JOIN

     2.1 LEFT OUTER JOIN or LEFT JOIN
     2.2 RIGHT OUTER JOIN or RIGHT JOIN
     2.3 FULL OUTER JOIN or FULL JOIN

  3. NATURAL JOIN
  4. CROSS JOIN
  5. SELF JOIN

1. JOIN or INNER JOIN :

In this kind of a JOIN, we get all records that match the condition in both the tables, and records in both the tables that do not match are not reported.
In other words, INNER JOIN is based on the single fact that : ONLY the matching entries in BOTH the tables SHOULD be listed.
Note that a JOIN without any other JOIN keywords (like INNER, OUTER, LEFT, etc) is an INNER JOIN.

Join: list all orders with customer information


2. OUTER JOIN :

OUTER JOIN retrieves
Either, the matched rows from one table and all rows in the other table Or, all rows in all tables (it doesn't matter whether or not there is a match).
There are three kinds of Outer Join :
2.1 LEFT OUTER JOIN or LEFT JOIN
This join returns all the rows from the left table in conjunction with the matching rows from the right table. If there are no columns matching in the right table, it returns NULL values.
left  join : all customer who have placed orders or not
2.2 RIGHT OUTER JOIN or RIGHT JOIN

This JOIN returns all the rows from the right table in conjunction with the matching rows from the left table. If there are no columns matching in the left table, it returns NULL values.

right inner join : customers who have not placed orders
 




2.3 FULL OUTER JOIN or FULL JOIN
This JOIN combines LEFT OUTER JOIN and RIGHT OUTER JOIN. It returns row from either table when the conditions are met and returns NULL value when there is no match.
In other words, OUTER JOIN is based on the fact that : ONLY the matching entries in ONE OF the tables (RIGHT or LEFT) or BOTH of the tables(FULL) SHOULD be listed.
 
Note that `OUTER JOIN` is a loosened form of `INNER JOIN`.

full join: match all customers and suppliers by country 

3. NATURAL JOIN :

It is based on the two conditions :
  1. the JOIN is made on all the columns with the same name for equality.
  2. Removes duplicate columns from the result.
This seems to be more of theoretical in nature and as a result (probably) most DBMS don't even bother supporting this.

4. CROSS JOIN :

It is the Cartesian product of the two tables involved. The result of a CROSS JOIN will not make sense in most of the situations. Moreover, we wont need this at all (or needs the least, to be precise).

5. SELF JOIN :

It is not a different form of JOIN, rather it is a JOIN (INNER, OUTER, etc) of a table to itself.

self join : match customers which are from same city and country


SUB Queries:

1. List the employees working in research department
2. List employees who are located in New York and Chicago
3. Display the department name in which ANALYSTS are working
4. Display employees who are reporting to JONES
5. Display all the employees who are reporting to Jones Manager
6. Display all the managers in SALES and ACCOUNTING department
7. Display all the employee names in Research and Sales Department who are having at least 1 person reporting to them
8. Display all employees who do not have any reportees
9. List employees who are having at least 2 reporting
10. List the department names which are having more than 5 employees
11. List department name having at-least 3 salesman
12. List employees from research and accounting having at-least 2 reporting
13. Display second max salary
14. Display 4th max salary
15. Display 5th max salary 

21)Corelated subqueries
16. Write a query to get 4th max salary from EMP table
17. Write a query to get 2nd & 6th max salary from EMP table
18. Write a query to get first 3 salaries from the EMP table
19. Write a query to get 2nd least salary from the EMP table
20. Write a query to get least 3 salaries from the EMP table
21. List all the employees whose salaries are greater than their respective departmental average salary.
Trigger:
is a procedure that initates an action when event such as insert,delete,update happens.

View: stored query accesible as a virtual table.
It can be used for retrieving updating deleting rows
can also contain aggregate fields

Rows  updated  or  deleted  in  the  view  are  updated 
or  deleted in the table the view was created with. It should also be noted
that  as  data  in  the  original  table  changes,  so  does
the data  in 
the  view as  views  are  the  way  to  look  at  parts of  the  original
table.

Rows  updated  or  deleted  in  the  view  are  updated 
or  deleted in the table the view was created with. It should also be noted
that  as  data  in  the  original  table  changes,  so  does
the data  in 
the  view as  views  are  the  way  to  look  at  parts of  the  original
table. 
Two main  purposesof creating a view are 1.)
provide a security  
mechanism which restricts users to a certain subset of data and
2.)
provide a mechanism for developers to customize how users can
logically view the data.

UNION and UNION All:

The union and union all operators allow you to combine multiple data sets. The difference between the two is that union sorts the combined set and removes duplicates while union all does not

The exists keyword can be used in that way, but really it's intended as a way to avoid counting:
--this statement needs to check the entire table
select count(*) from [table] where ...

--this statement is true as soon as one match is found
exists ( select * from [table] where ... )
This is most useful where you have if conditional statements, as exists can be a lot quicker than count.
The in is best used where you have a static list to pass:
 select * from [table]
 where [field] in (1, 2, 3)
When you have a table in an in statement it makes more sense to use a join, but mostly it shouldn't matter. The query optimiser should return the same plan either way. In some implementations (mostly older, such as Microsoft SQL Server 2000) in queries will always get a nested join plan, while join queries will use nested, merge or hash as appropriate. More modern implementations are smarter and can adjust the plan even when in is used.

Find Duplicates:

SELECT title, site_id, location, id, count( * ) 
FROM jobs
GROUP BY site_id, company, title, location
HAVING count( * ) >1
 
You can create indexes on columns to speed up queries. Indexes provide faster access to data for operations that return a small portion of a table's rows.
In general, you should create an index on a column in any of the following situations:
  • The column is queried frequently.
  • A referential integrity constraint exists on the column.
  • A UNIQUE key integrity constraint exists on the column.
You can create an index on any column; however, if the column is not used in any of these situations, creating an index on the column does not increase performance and the index takes up resources unnecessarily.
Although the database creates an index for you on a column with an integrity constraint, explicitly creating an index on such a column is recommended.
You can use the following techniques to determine which columns are best candidates for indexing:
  • Use the EXPLAIN PLAN feature to show a theoretical execution plan of a given query statement.
  • Use the V$SQL_PLAN view to determine the actual execution plan used for a given query statement.