Unit 2 | unit 2 structured query language sql

Back to Study material

RDBMS

Unit - 2

Structured Query Language (SQL)

2.1 SQL: Data types - fixed length, variable length, ex

In order to describe the values that a column can hold, SQL Datatype is used.

Each column is needed in the database table to have a name and data type.

Fig 1: SQL data types

Binary data types

Three kinds of binary data types are given below:

Data type	Description
Binary	It has a fixed byte length of 8000. It includes binary data of fixed-length.
Varbinary	It has a fixed byte length of 8000. It includes binary data of variable-length.
Image	It has a maximum of 2,147,483,647 bytes in length. It includes binary data of variable-length.

2. Numeric data types

The subtypes are given below:

Data type	From	To	Description
Float	-1.79E + 308	1.79E + 308	Used to specify a floating-point value
Real	-3.40e + 38	3.40E + 38	Specifies a single precision floating point number

3. Extract numeric data types

The subtypes are given below:

Data types	Description
Int	Used to specify an integer value.
Smallint	Used to specify small integer value
Bit	Number of bits to store.
Decimal	Numeric value that can have a decimal number
Numeric	Used to specify a numeric value

4. Character String data types

Data types	Description
Char	It contains Fixed-length (max - 8000 character)
Varchar	It contains variable-length (max - 8000 character)
Text	It contains variable-length (max - 2,147,483,647 character)

5. Date and Time data types

Data types	Description
Date	Used to store the year, month, and days value.
Time	Used to store the hour, minute, and second values.
Timestamp	Stores the year, month, day, hour, minute, and the second value.

2.2 Data Constraints - Primary key, Foreign key, Null, Check, Default

Constraints are rules that can be applied to a table's data type. That is, we may use constraints to limit the type of data that can be recorded in a specific column in a table.

Constraints set restrictions on how much and what kind of data can be inserted, modified, and deleted from a table. Constraints are used to ensure data integrity during an update, removal, or insert into a table.

Primary key

The primary key is a field that uniquely identifies each table row. If a column in a table is designated as a primary key, it cannot include NULL values, and all rows must have unique values for this field. To put it another way, this is a combination of NOT NULL and UNIQUE constraints.

The ROLL NO field is marked as primary key in the example below, which means it cannot contain duplicate or null entries.

CREATE TABLE STUDENT(

ROLL_NO INT NOT NULL,

STU_NAME VARCHAR (35) NOT NULL UNIQUE,

STU_AGE INT NOT NULL,

STU_ADDRESS VARCHAR (35) UNIQUE,

PRIMARY KEY (ROLL_NO)

);

Foreign key

A foreign key differs from a super key, candidate key, or primary key in that it is used to join two tables or link them together.

A foreign key is a secondary key that connects two tables via the primary key. It signifies that one table's columns point to the other table's primary key attribute. It also means that any attribute set as a primary key attribute will serve as a foreign key attribute in another table. A foreign key, on the other hand, has nothing to do with the primary key.

A Foreign Key column establishes a connection between two tables. Foreign keys are used to guarantee data integrity while also allowing navigation between two different instances of the same entity. It functions as a cross-reference between two tables since it refers to another table's main key.

Example

DeptCode	DeptName
1	Computer
2	Science
3	English

TeacherID	Fname	Lname
T007	Sara	Choubey
T019	David	Chopda
T004	Raj	Sharma

We have two tables in this key in dbms example, teach and department in a school. There is, however, no way to see which searches are used in which departments.

We may construct a relationship between the two tables in this table by adding the foreign key in DeptCode to the Teacher name.

TeacherID	DeptCode	Fname	Lname
T007	1	Sara	Choubey
T019	2	David	Chopda
T004	3	Raj	Sharma

This concept is also known as Referential Integrity.

Null

The NOT NULL constraint ensures that no NULL values are stored in a column. When we don't specify a value for a field when inserting a record into a table, the column defaults to NULL. We can ensure that a certain column(s) cannot have NULL values by specifying a NULL constraint.

Example:

CREATE TABLE STUDENT(

ROLL_NO INT NOT NULL,

STU_NAME VARCHAR (35) NOT NULL,

STU_AGE INT NOT NULL,

STU_ADDRESS VARCHAR (235),

PRIMARY KEY (ROLL_NO)

);

Check

This constraint is used to specify a table's range of values for a certain column. When this constraint is applied to a column, it assures that the value of the specified column must be inside the provided range.

CREATE TABLE STUDENT(

ROLL_NO INT NOT NULL CHECK(ROLL_NO >1000) ,

STU_NAME VARCHAR (35) NOT NULL,

STU_AGE INT NOT NULL,

EXAM_FEE INT DEFAULT 10000,

STU_ADDRESS VARCHAR (35) ,

PRIMARY KEY (ROLL_NO)

);

The check constraint on the ROLL NO column of the STUDENT database was set in the preceding example. The value of the ROLL NO field must now be bigger than 1000.

Default

When no value is provided when inserting a record into a table, the DEFAULT constraint assigns a default value to the field.

Example

The following SQL, for example, creates a new table called CUSTOMERS and adds five fields to it. The SALARY column is set to 5000.00 by default in this situation, so if the INSERT INTO statement does not supply a value for this column, it will be set to 5000.00 by default.

CREATE TABLE CUSTOMERS(

ID INT NOT NULL,

NAME VARCHAR (20) NOT NULL,

AGE INT NOT NULL,

ADDRESS CHAR (25) ,

SALARY DECIMAL (18, 2) DEFAULT 5000.00,

PRIMARY KEY (ID)

);

If the CUSTOMERS table has already been established, you'll need to perform a query similar to the one shown in the code block below to add a DEFAULT constraint to the SALARY column.

ALTER TABLE CUSTOMERS

MODIFY SALARY DECIMAL (18, 2) DEFAULT 5000.00;

Drop Default Constraint

Use the SQL statement below to remove a DEFAULT constraint.

ALTER TABLE CUSTOMERS

ALTER COLUMN SALARY DROP DEFAULT;

Key takeaway

Constraints are rules that can be applied to a table's data type. That is, we may use constraints to limit the type of data that can be recorded in a specific column in a table.
The primary key is a field that uniquely identifies each table row.
A foreign key is a table field that uniquely identifies each row of a different table.

2.3 Clauses - (Select, where, group by, order by)

Select

The SELECT statement is used in the database to select / retrieve data from a table. The most frequently used statement is the SQL Pick Statement.

The SELECT Statement question retrieves data from the table as a whole or from some particular columns.

We need to write SELECT statement queries in SQL if we want to retrieve any data from a table.

We retrieve the data by columns in SELECT Statement wise.

Syntax

The SQL SELECT Syntax categorizes into two sections, first we get complete table data from all columns, second we get some unique columns from a table.

SELECT all Columns data from a table

SELECT * FROM table_name;

* = indicate the retrieve all the columns from a table.

Table_name = is the name of table from which the data is retrieved.

SELECT some columns data from a table

SELECT Col1, Col2, Col3 FROM table_name

Col1, Col2, Col3 = is the column name from which data is retrieved.

Table_name = is the name of table from which the data is retrieved.

A SELECT command has number of clauses:

WHERE clause: Any output row from the FROM stage applies a criterion, further reducing the number of rows.
GROUP BY clause: Class sets of rows with values balanced by columns in a group.
HAVING clause: Criteria for each group of rows are added. Criteria can only be extended to columns within a category that have constant values (those in the grouping columns or aggregate functions applied across the group).
ORDER BY clause: The rows returned from the SELECT stage are sorted as desired. Supports sorting, ascending or descending on multiple columns in a specified order. The output columns will be similar and have the same name as the columns returned from the SELECT point.
INTO clause: Logically, the INTO clause is last used in processing.

Key takeaway:

The SELECT statement is used in the database to select / retrieve data from a table. We retrieve the data by columns in SELECT Statement wise.

The SELECT Statement question retrieves data from the table as a whole or from some particular columns.

Where

When collecting data from a single table or merging many tables, the SQL WHERE clause is used to set a condition. Only a specific value from the table is returned if the stated condition is met. To filter the records and retrieve only the ones you need, use the WHERE clause.

The WHERE clause is utilized not just in the SELECT statement, but also in the UPDATE, DELETE, and other statements.

Syntax

The SELECT statement with the WHERE clause has the following basic syntax.

SELECT column1, column2, columnN

FROM table_name

WHERE [condition]

You can use comparison or logical operators like >, =, LIKE, NOT, etc. to describe a condition.

Example

Consider the following records in the CUSTOMERS table:

+----+----------+-----+-----------+----------+

+----+----------+-----+-----------+----------+

| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |

| 2 | Khilan | 25 | Delhi | 1500.00 |

| 3 | kaushik | 23 | Kota | 2000.00 |

| 4 | Chaitali | 25 | Mumbai | 6500.00 |

| 5 | Hardik | 27 | Bhopal | 8500.00 |

| 6 | Komal | 22 | MP | 4500.00 |

| 7 | Muffy | 24 | Indore | 10000.00 |

+----+----------+-----+-----------+----------+

The following code retrieves the ID, Name, and Salary fields from the CUSTOMERS database, where the salary is larger than 2000 dollars.

SQL> SELECT ID, NAME, SALARY

FROM CUSTOMERS

WHERE SALARY > 2000;

This would result in the following:

+----+----------+----------+

| ID | NAME | SALARY |

+----+----------+----------+

| 4 | Chaitali | 6500.00 |

| 5 | Hardik | 8500.00 |

| 6 | Komal | 4500.00 |

| 7 | Muffy | 10000.00 |

+----+----------+----------+

The following query, for example, would retrieve the ID, Name, and Salary fields for a client named Hardik from the CUSTOMERS table.

It's vital to remember that all strings should be enclosed in single quotes ("). Numeric values, on the other hand, should be presented without any quotation marks, like in the example above.

SQL> SELECT ID, NAME, SALARY

FROM CUSTOMERS

WHERE NAME = 'Hardik';

This would result in the following:

+----+----------+----------+

| ID | NAME | SALARY |

+----+----------+----------+

| 5 | Hardik | 8500.00 |

+----+----------+----------+

Group by

What is the SQL Group by Clause?

The GROUP BY clause is a SQL command that is used to group rows that have the same values. The GROUP BY clause is used in the SELECT statement. Optionally it is used in conjunction with aggregate functions to produce summary reports from the database.

That's what it does, summarizing data from the database.

The queries that contain the GROUP BY clause are called grouped queries and only return a single row for every grouped item.

SQL GROUP BY Syntax

Now that we know what the SQL GROUP BY clause is, let's look at the syntax for a basic group by query.

SELECT statements... GROUP BY column_name1[,column_name2,...] [HAVING condition];

HERE

● "SELECT statements..." is the standard SQL SELECT command query.

● "GROUP BYcolumn_name1" is the clause that performs the grouping based on column_name1.

● "[,column_name2,...]" is optional; represents other column names when the grouping is done on more than one column.

● "[HAVING condition]" is optional; it is used to restrict the rows affected by the GROUP BY clause. It is similar to the WHERE clause.

Grouping using a Single Column

In order to help understand the effect of SQL Group By clause, let's execute a simple query that returns all the gender entries from the member’s table.

SELECT gender FROM members;

Gender

Female

Male

Female

Male

Suppose we want to get the unique values for genders. We can use a following query -

SELECT gender FROM members GROUP BY gender;

Executing the above script in MYSQL workbench against the Myflixdb gives us the following results.

Gender

Female

Male

Note only two results have been returned. This is because we only have two gender types Male and Female. The GROUP BY clause in SQL grouped all the "Male" members together and returned only a single row for it. It did the same with the "Female" members.

Grouping using multiple columns

Suppose that we want to get a list of movie category_id and corresponding years in which they were released.

Let's observe the output of this simple query

SELECT category_id,year_released FROM movies;

Category_id	Year_released
1	2011
2	2008
NULL	2008
NULL	2010
8	2007
6	2007
6	2007
8	2005
NULL	2012
7	1920
8	NULL
8	1920

The above result has many duplicates.

Let's execute the same query using group by in SQL -

SELECT category_id,year_released FROM movies GROUP BY category_id,year_released;

Executing the above script in MySQL workbench against the myflixdb gives us the following results shown below.

Category_id	Year_released
NULL	2008
NULL	2010
NULL	2012
1	2011
2	2008
6	2007
7	1920
8	1920
8	2005
8	2007

The GROUP BY clause operates on both the category id and year released to identify unique rows in our above example.

If the category id is the same but the year released is different, then a row is treated as a unique one. If the category id and the year released is the same for more than one row, then it's considered a duplicate and only one row is shown.

Order by

The SQL ORDER BY clause is used to sort data by one or more columns in ascending or descending order. By default, some databases sort query results in ascending order.

Syntax

The ORDER BY clause's basic grammar is as follows:

SELECT column-list

FROM table_name

[WHERE condition]

[ORDER BY column1, column2, .. ColumnN] [ASC | DESC];

In the ORDER BY clause, you can utilize more than one column. Make sure that whichever column you're using to sort is included in the column-list.

Example

Consider the following records in the CUSTOMERS table:

+----+----------+-----+-----------+----------+

+----+----------+-----+-----------+----------+

| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |

| 2 | Khilan | 25 | Delhi | 1500.00 |

| 3 | kaushik | 23 | Kota | 2000.00 |

| 4 | Chaitali | 25 | Mumbai | 6500.00 |

| 5 | Hardik | 27 | Bhopal | 8500.00 |

| 6 | Komal | 22 | MP | 4500.00 |

| 7 | Muffy | 24 | Indore | 10000.00 |

+----+----------+-----+-----------+----------+

The following code block shows an example of sorting the results by NAME and SALARY in ascending order.

SQL> SELECT * FROM CUSTOMERS

ORDER BY NAME, SALARY;

This would result in the following:

+----+----------+-----+-----------+----------+

+----+----------+-----+-----------+----------+

| 4 | Chaitali | 25 | Mumbai | 6500.00 |

| 5 | Hardik | 27 | Bhopal | 8500.00 |

| 3 | kaushik | 23 | Kota | 2000.00 |

| 2 | Khilan | 25 | Delhi | 1500.00 |

| 6 | Komal | 22 | MP | 4500.00 |

| 7 | Muffy | 24 | Indore | 10000.00 |

| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |

+----+----------+-----+-----------+----------+

2.4 SQL Operators: Logical, Relational, Special- In Between, Like

An operator is a reserved word or character that is used in the WHERE clause of a SQL statement to perform operations like comparisons and arithmetic operations. These Operators are used to define conditions in SQL statements and to function as conjunctions for multiple conditions in a single statement.

● Arithmetic operators

● Comparison operators

● Logical operators

● Operators used to negate conditions

Logical

Assume that 'variable a' and 'variable b' exist. In this case, 'a' has 20 and 'b' has 10.

Sr.No	Operator & Description
1	ALL When a value is compared to all of the values in another value set, the ALL operator is used.
2	AND The AND operator allows multiple conditions to occur in the WHERE clause of a SQL statement.
3	ANY The ANY operator compares a value to any valid value in the list according to the condition.
4	BETWEEN The BETWEEN operator is used to find values that are between the minimum and maximum values in a set of values.
5	EXISTS The EXISTS operator is used to look for a row in a given table that meets a certain set of criteria.
6	IN When a value is compared to a list of literal values that have been defined, the IN operator is used.
7	LIKE The LIKE operator compares a value to other values that are identical using wildcard operators.
8	NOT The NOT operator flips the definition of the logical operator it's used with. For example, NOT EXISTS, NOT BETWEEN, NOT IN, and so on. It's a negation operator.
9	OR The OR operator is used in the WHERE clause of a SQL statement to combine several conditions.
10	IS NULL When a value is compared to a NULL value, the NULL operator is used.
11	UNIQUE The UNIQUE operator looks for uniqueness in every row of a table (no duplicates).

Relational

When two expressions or values are compared, a Boolean result is returned. The table below lists all of the relational operations that PL/SQL supports. Assume variable A contains 10 and variable B contains 20, then

Operator	Description	Example
=	Checks whether the values of two operands are equal, and if they are, the condition is true.	(A = B) is not true.
!= <> ~=	Checks whether the values of two operands are equivalent; if they aren't, the condition is true.	(A != B) is true.
>	If the left operand's value is greater than the right operand's value, the condition becomes true.	(A > B) is not true.
<	If the value of the left operand is less than the value of the right operand, the condition is satisfied.	(A < B) is true.
>=	If the left operand's value is larger than or equal to the right operand's value, the condition becomes true.	(A >= B) is not true.
<=	If the left operand's value is less than or equal to the right operand's value, then the condition is true.	(A <= B) is true

Special - In

IN operator

The IN operator allows you to specify two or more expressions to use in a query search. If the value of the corresponding column equals one of the phrases indicated by the IN predicate, the condition is true.

Syntax:

SELECT column_name(s)

FROM table_name

WHERE column_name IN (list_of_values);

Between

The BETWEEN operator gives a range that sets the qualifying values' bottom and upper bounds. This operator combines the and and operators. A range of values larger than or equal to 5000 and less than or equal to 20000 has been chosen as the data. Numeric, text, and date data types can all be used with the Between operator.

Syntax:

SELECT column_name(s)

FROM table_name

WHERE column_name BETWEEN value1 AND value2;

LIKE is a comparison operator that compares column values to a pattern. Regular characters must match the characters supplied in the character string exactly during pattern matching. Any character or date data type can be used in the column. Wildcard characters are characters that can be used anywhere in the pattern. I utilized four different forms of wildcards:

Percent sign (%): It is used to represent or search any string of zero or more characters.
Underscore (_): It is used to represent or search a single character.
Bracket ([]): It is used to represent or search any single character within the specified range.
Caret (^): It is used to represent or search any single character not within the specified range.

Syntax:

SELECT *

FROM users

WHERE first_name LIKE ‘%Bob%’;

Key takeaway

An operator is a reserved word or character that is used in the WHERE clause of a SQL statement to perform operations like comparisons and arithmetic operations.

2.5 Sub Queries

A Subquery is a query within another SQL query and embedded within the WHERE clause.

Important Rule:

● A subquery can be placed in a number of SQL clauses like WHERE clause, FROM clause, HAVING clause.

● You can use Subquery with SELECT, UPDATE, INSERT, DELETE statements along with the operators like =, <, >, >=, <=, IN, BETWEEN, etc.

● A subquery is a query within another query. The outer query is known as the main query, and the inner query is known as a subquery.

● Subqueries are on the right side of the comparison operator.

● A subquery is enclosed in parentheses.

● In the Subquery, ORDER BY command cannot be used. But GROUP BY command can be used to perform the same function as ORDER BY command.

1. Subqueries with the Select Statement

SQL subqueries are most frequently used with the Select statement.

Syntax

SELECT column_name
FROM table_name
WHERE column_name expression operator
( SELECT column_name from table_name WHERE ... );

Example

Consider the EMPLOYEE table have the following records:

ID	NAME	AGE	ADDRESS	SALARY
1	John	20	US	2000.00
2	Stephan	26	Dubai	1500.00
3	David	27	Bangkok	2000.00
4	Alina	29	UK	6500.00
5	Kathrin	34	Bangalore	8500.00
6	Harry	42	China	4500.00
7	Jackson	25	Mizoram	10000.00

The subquery with a SELECT statement will be:

SELECT *
FROM EMPLOYEE
WHERE ID IN (SELECT ID
FROM EMPLOYEE
WHERE SALARY > 4500);

This would produce the following result:

ID	NAME	AGE	ADDRESS	SALARY
4	Alina	29	UK	6500.00
5	Kathrin	34	Bangalore	8500.00
7	Jackson	25	Mizoram	10000.00

2. Subqueries with the INSERT Statement

● SQL subquery can also be used with the Insert statement. In the insert statement, data returned from the subquery is used to insert into another table.

● In the subquery, the selected data can be modified with any of the character and date functions.

Syntax:

INSERT INTO table_name (column1, column2, column3....)
SELECT *
FROM table_name
WHERE VALUE OPERATOR

Example

Consider a table EMPLOYEE_BKP with similar as EMPLOYEE.

Now use the following syntax to copy the complete EMPLOYEE table into the EMPLOYEE_BKP table.

INSERT INTO EMPLOYEE_BKP
SELECT * FROM EMPLOYEE
WHERE ID IN (SELECT ID
FROM EMPLOYEE);

3. Subqueries with the UPDATE Statement

The subquery of SQL can be used in conjunction with the Update statement. When a subquery is used with the Update statement, then either single or multiple columns in a table can be updated.

Syntax

UPDATE table
SET column_name = new_value
WHERE VALUE OPERATOR
(SELECT COLUMN_NAME
FROM TABLE_NAME
WHERE condition);

Example

Let's assume we have an EMPLOYEE_BKP table available which is backup of EMPLOYEE table. The given example updates the SALARY by .25 times in the EMPLOYEE table for all employee whose AGE is greater than or equal to 29.

UPDATE EMPLOYEE
SET SALARY = SALARY * 0.25
WHERE AGE IN (SELECT AGE FROM CUSTOMERS_BKP
WHERE AGE >= 29);

This would impact three rows, and finally, the EMPLOYEE table would have the following records.

ID	NAME	AGE	ADDRESS	SALARY
1	John	20	US	2000.00
2	Stephan	26	Dubai	1500.00
3	David	27	Bangkok	2000.00
4	Alina	29	UK	1625.00
5	Kathrin	34	Bangalore	2125.00
6	Harry	42	China	1125.00
7	Jackson	25	Mizoram	10000.00

4. Subqueries with the DELETE Statement

The subquery of SQL can be used in conjunction with the Delete statement just like any other statements mentioned above.

Syntax

DELETE FROM TABLE_NAME
WHERE VALUE OPERATOR
(SELECT COLUMN_NAME
FROM TABLE_NAME
WHERE condition);

Example

Let's assume we have an EMPLOYEE_BKP table available which is backup of EMPLOYEE table. The given example deletes the records from the EMPLOYEE table for all EMPLOYEE whose AGE is greater than or equal to 29.

DELETE FROM EMPLOYEE
WHERE AGE IN (SELECT AGE FROM EMPLOYEE_BKP
WHERE AGE >= 29 );

This would impact three rows, and finally, the EMPLOYEE table would have the following records.

ID	NAME	AGE	ADDRESS	SALARY
1	John	20	US	2000.00
2	Stephan	26	Dubai	1500.00
3	David	27	Bangkok	2000.00
7	Jackson	25	Mizoram	10000.00

Key takeaway

A Subquery is a query within another SQL query and embedded within the WHERE clause.

Important Rule:

● A subquery can be placed in a number of SQL clauses like WHERE clause, FROM clause, HAVING clause.

● You can use Subquery with SELECT, UPDATE, INSERT, DELETE statements along with the operators like =, <, >, >=, <=, IN, BETWEEN, etc.

● A subquery is a query within another query. The outer query is known as the main query, and the inner query is known as a subquery.

● Subqueries are on the right side of the comparison operator.

● A subquery is enclosed in parentheses.

● In the Subquery, ORDER BY command cannot be used. But GROUP BY command can be used to perform the same function as ORDER BY command.

2.6 Join - Sub queries

A subquery can be rebuilt as a JOIN in many circumstances, but not all.

Rewriting Subqueries as JOINS

The DISTINCT keyword can be used to rewrite a subquery that uses IN, for example:

SELECT * FROM table1 WHERE col1 IN (SELECT col1 FROM table2);

Can be rephrased as:

SELECT DISTINCT table1.* FROM table1, table2 WHERE table1.col1=table2.col1;

You can also rewrite NOT IN or NOT EXISTS queries. These two queries, for example, yield the same result:

SELECT * FROM table1 WHERE col1 NOT IN (SELECT col1 FROM table2);

SELECT * FROM table1 WHERE NOT EXISTS (SELECT col1 FROM table2 WHERE table1.col1=table2.col1);

Both of which can be rewritten as:

SELECT table1.* FROM table1 LEFT JOIN table2 ON table1.id=table2.id WHERE table2.id IS NULL;

Sometimes, subqueries that can be rebuilt as an LEFT JOIN are more efficient.

Using Subqueries instead of JOINS

However, there are some situations when subqueries are preferable to joins:

● When you need duplicates but not bogus ones. Assume Table 1 has three rows numbered 1,1,2 and Table 2 contains two rows numbered 1,2,2. Only this subquery-based SELECT statement will give you the correct result (1,1,2) if you need to list the rows in Table 1 that are also in Table 2:

SELECT Table_1.column_1

FROM Table_1

WHERE Table_1.column_1 IN

(SELECT Table_2.column_1

FROM Table_2);

This SQL statement is incorrect:

SELECT Table_1.column_1

FROM Table_1,Table_2

WHERE Table_1.column_1 = Table_2.column_1;

Because the output will be 1,1,2,2 — and 2 duplication is a mistake This SQL query will also fail:

SELECT DISTINCT Table_1.column_1

FROM Table_1,Table_2

WHERE Table_1.column_1 = Table_2.column_1;

Since the outcome is 1,2 — thus removing the repeated 1 is also an error.

● When the statement before the query is not a query. The SQL command:

UPDATE Table_1 SET column_1 = (SELECT column_1 FROM Table_2);

Cannot be expressed using a join unless some uncommon SQL3 features are employed.

● When the join is complete, an expression is used. The SQL command:SELECT * FROM Table_1

WHERE column_1 + 5 =

(SELECT MAX(column_1) FROM Table_2);

It's difficult to convey with a join. In fact, this SQL query is the only thing that comes to mind:

SELECT Table_1.*

FROM Table_1,

(SELECT MAX(column_1) AS max_column_1 FROM Table_2) AS Table_2

WHERE Table_1.column_1 + 5 = Table_2.max_column_1;

Nothing is gained from the change because it still involves a parenthesized query.

2.7 Nesting subqueries

Subqueries can be nested inside each other. SQL allows you to stack queries within each other. A subquery is a SELECT statement that returns intermediate results and is nested within another SELECT statement. The innermost subquery is executed first, followed by the following level.

You may recall that joins allow you to filter queries. Rather than filtering in the WHERE clause, it's customary to connect a subquery that hits the same table as the outer query.

SELECT *

FROM tutorial.sf_crime_incidents_2014_01 incidents

JOIN ( SELECT date

FROM tutorial.sf_crime_incidents_2014_01

ORDER BY date

LIMIT 5

) sub

ON incidents.date = sub.date

When paired with aggregations, this can be extremely useful. When you join, the subquery output requirements are less severe than when you employ the WHERE clause. Your inner query, for example, can return numerous results. The following query sorts all of the results by the number of events recorded in a single day. It accomplishes this by collecting the total number of incidences per day in the inner query, then sorting the outer query using those values:

SELECT incidents.*,

sub.incidents AS incidents_that_day

FROM tutorial.sf_crime_incidents_2014_01 incidents

JOIN ( SELECT date,

COUNT(incidnt_num) AS incidents

FROM tutorial.sf_crime_incidents_2014_01

GROUP BY 1

) sub

ON incidents.date = sub.date

ORDER BY sub.incidents DESC, time

Subqueries can help you improve the performance of your queries significantly. Let's go over the Crunchbase Data again. Consider how you'd like to compile all of the companies that receive funding and those that are bought each month. If you wanted to, you could accomplish it without subqueries, but don't run it because it will take minutes to return:

SELECT COALESCE(acquisitions.acquired_month, investments.funded_month) AS month,

COUNT(DISTINCT acquisitions.company_permalink) AS companies_acquired,

COUNT(DISTINCT investments.company_permalink) AS investments

FROM tutorial.crunchbase_acquisitions acquisitions

FULL JOIN tutorial.crunchbase_investments investments

ON acquisitions.acquired_month = investments.funded_month

GROUP BY 1

Note that you must join on date fields to execute this effectively, which results in a big "data explosion." Essentially, you're linking every row in a given month from one table to every month in a given row on the other table, resulting in an enormous number of rows returned. COUNT(DISTINCT) must be used instead of COUNT to acquire accurate counts due to the multiplicative impact. This is what you'll see below:

7,414 rows are returned with the following query:

SELECT COUNT(*) FROM tutorial.crunchbase_acquisitions

There are 83,893 rows in the following query:

SELECT COUNT(*) FROM tutorial.crunchbase_investments

The query below returns 6,237,396 rows:

SELECT COUNT(*)

FROM tutorial.crunchbase_acquisitions acquisitions

FULL JOIN tutorial.crunchbase_investments investments

ON acquisitions.acquired_month = investments.funded_month

If you want to learn more about cartesian products, you can do some additional study. It's also worth noting that the FULL JOIN and COUNT operations above are rather quick—the it's COUNT(DISTINCT) that takes the longest. More on this in the query optimization lesson.

Of course, you could solve this much faster by aggregating the two tables individually, then merging them together to conduct the counts across much smaller datasets:

SELECT COALESCE(acquisitions.month, investments.month) AS month,

acquisitions.companies_acquired,

investments.companies_rec_investment

FROM (

SELECT acquired_month AS month,

COUNT(DISTINCT company_permalink) AS companies_acquired

FROM tutorial.crunchbase_acquisitions

GROUP BY 1

) acquisitions

FULL JOIN (

SELECT funded_month AS month,

COUNT(DISTINCT company_permalink) AS companies_rec_investment

FROM tutorial.crunchbase_investments

GROUP BY 1

)investments

ON acquisitions.month = investments.month

ORDER BY 1 DESC

Just in case one table had observations in a month that the other table didn't, we utilized a FULL JOIN above. When the acquisitions subquery didn't have any month entries, we utilised COALESCE to display them (presumably no acquisitions occurred in those months). To better understand how these elements work, we strongly advise you to re-run the query without them. You can also run each subquery separately to gain a better grasp of how they work.

Key takeaway

2.8 Join: Equi join, Simple join, Outer join, self join

To get data from various tables, SQL JOINS are needed. When two or more tables are listed in a SQL statement, a SQL JOIN is done.

Equi Join

EQUI JOIN creates a JOIN between two tables for equality or matching column(s) values. EQUI JOIN also creates a JOIN by using ON and then providing the names of the columns and their corresponding tables to check equality using the equal sign (=).

Syntax:

SELECT column_list

FROM table1, table2....

WHERE table1.column_name =

Table2.column_name;

Example –

SELECT student.name, student.id, record.class, record.city

FROM student, record

WHERE student.city = record.city;

Syntax :

SELECT column_list

FROM table1

JOIN table2

[ON (join_condition)]

Example –

SELECT student.name, student.id, record.class, record.city

FROM student

JOIN record

ON student.city = record.city;

Table name — Student

Id	Name	Class	City
3	Hina	3	Delhi
4	Megha	2	Delhi
6	Gouri	2	Delhi

Table name — Record

Id	Class	City
9	3	Delhi
10	2	Delhi
12	2	Delhi

Output after Equi join

Name	Id	Class	City
Hina	3	3	Delhi
Megha	4	3	Delhi
Gouri	6	3	Delhi
Hina	3	2	Delhi
Megha	4	2	Delhi
Gouri	6	2	Delhi
Hina	3	2	Delhi
Megha	4	2	Delhi
Gouri	6	2	Delhi

Simple Join

As long as the condition is met, the INNER JOIN keyword selects all rows from both tables. This keyword will generate a result-set by combining all rows from both tables that satisfy the requirement, i.e. the common field's value will be the same.

Syntax

SELECT columns

FROM table1

INNER JOIN table2

ON table1.column = table2.column;

Fig 2: Inner join

Outer Join

LEFT OUTER JOIN

This join retrieves all rows from the table on the left side of the join, as well as matching rows from the table on the right. The result-set will include null for the rows for which there is no matching row on the right side. LEFT OUTER JOIN is another name for LEFT JOIN.

Syntax

SELECT columns

FROM table1

LEFT [OUTER] JOIN table2

ON table1.column = table2.column;

Fig 3: Left outer join

RIGHT OUTER JOIN

The RIGHT JOIN function is analogous to the LEFT JOIN function. This join retrieves all rows from the table on the right side of the join, as well as matching rows from the table on the left. The result-set will include null for the rows for which there is no matching row on the left side. RIGHT OUTER JOIN is another name for RIGHT JOIN.

Syntax

SELECT columns

FROM table1

RIGHT [OUTER] JOIN table2

ON table1.column = table2.column;

Fig 4: Right outer join

FULL OUTER JOIN

The result-set of FULL JOIN is created by combining the results of both LEFT JOIN and RIGHT JOIN. All of the rows from both tables will be included in the result-set. The result-set will contain NULL values for the rows for which there is no match.

Syntax

SELECT columns

FROM table1

FULL [OUTER] JOIN table2

ON table1.column = table2.column;

Fig 5: Full outer join

Self Join

A self-join is a join that connects the table to itself. It's usually utilised when the entities have a hierarchical relationship (for example, employee-manager) or when you want to compare rows within the same table. The inner join or left outer join syntax is used. Within a query, table aliases are used to give the same table multiple names.

Syntax:

SELECT <columns>

FROM <table> AS <alias1>

JOIN <table> AS <alias2>

ON <alias1>.<columnA> = <alias2>.<columnB>

Example

Use the following query to get addresses in the same city:

SELECT a1.AddressID, a2.AddressID, a1.City

FROM Person.Address a1

JOIN Person.Address a2

ON a1.AddressID > a2.AddressID

WHERE a1.City = a2.City

2.9 Views

In SQL, a view is a virtual table based on the SQL statement result-set.

A view, much like a real table, includes rows and columns. Fields in a database view are fields from one or more individual database tables.

You can add SQL, WHERE, and JOIN statements to a view and show the details as if the data came from a single table.

Create a view

Syntax

CREATE VIEW view_name AS

SELECT column1, column2, ...

FROM table_name

WHERE condition;

Up-to-date data still displays a view! Any time a user queries a view, the database engine recreates the data, using the view's SQL statement.

Create a view example

A view showing all customers from India is provided by the following SQL.

CREATE VIEW [India Customers] AS

SELECT CustomerName, ContactName

FROM Customers

WHERE Country = 'India';

Updating a view

With the Build OR REPLACE VIEW command, the view can be changed.

Create or replace view syntax

CREATE OR REPLACE VIEW view_name AS

SELECT column1, column2, ...

FROM table_name

WHERE condition;

For the "City" view, the following SQL adds the "India Customers" column:

CREATE OR REPLACE VIEW [India Customers] AS

SELECT CustomerName, ContactName, City

FROM Customers

WHERE Country = 'India';

Dropping a view

With the DROP VIEW instruction, a view is removed.

Drop view syntax

DROP VIEW view_name;

"The following SQL drops the view of "India Customers":

DROP VIEW [India Customers];

Indexes

● Special lookup tables are indexes. It is used to very easily extract data from the database.

● To speed up select queries and where clauses are used, an Index is used. But it displays the data input with statements for insertion and update. Without affecting the data, indexes can be generated or dropped.

● An index is much like an index on the back of a book in a database.

For example, when you refer to all the pages in a book that addresses a certain subject, you must first refer to an index that lists all the subjects alphabetically, and then refer to one or more particular page numbers.

● Create index statement

It is used on a table to construct an index. It allows value to be duplicated

Syntax

CREATE INDEX index_name

ON table_name (column1, column2, ...);

Example

CREATE INDEX idx_name

ON Persons (LastName, FirstName);

● Unique index statement

It is used on a table to construct a specific index. Duplicate value does not make it.

Syntax

CREATE UNIQUE INDEX index_name

ON table_name (column1, column2, ...);

Example

CREATE UNIQUE INDEX websites_idx

ON websites (site_name);

● Drop index statement

It is used to delete a table's index.

Syntax

DROP INDEX index_name;

Example

DROP INDEX websites_idx;

Key takeaway:

It is used to very easily extract data from the database.
To speed up select queries and where clauses are used, an Index is used.

2.10 Indexes

Indexing is a technique for data structure that helps you to access information from a database file quickly. An Index is a small table of two columns only. The first column consists of a copy of a table's primary or candidate key. The second column includes a series of pointers to hold the disc block address where that particular key value is located.

Indexing is a way of maximizing a database's output by minimizing the amount of disc access needed when processing a question. It is a technique for data structure that is used to find and access the data in a database easily.

Using a few database columns, indexes are established.

The first column is the Search key, which includes a copy of the table's primary or candidate key. In order for the corresponding data to be accessed easily, these values are stored in sorted order.
The second column is the Data Reference or Pointer, which includes a series of pointers that hold the disc block address where you can find the particular key value.

Indexing has different attributes:

Access Types: This refers to the access type, such as value-based quest, access to the range, etc.
Access Time: This refers to the time taken to locate a single element of data or a collection of elements.
Insertion Time: It relates to the time it takes to find a suitable space and insert new data.
Deletion Time: It takes time to find and delete an object and to update the layout of the index.
Space Overhead: It refers to the extra space that the index takes.

Types of indexes

Fig 6: Types of database

Primary index

The Primary Index is an ordered file with two fields and a fixed length scale. The first field is the same as the primary key, and the second field is pointed to that particular block of information. In the primary index, the relationship between the entries in the index table is always one-to-one.

Main DBMS indexing is also further split into two forms:

● Dense

● Sparse

Dense index

In a dense index, for each search key valued in the database, a record is generated. This helps you to search more easily, however more space is required to store index records. Method records include the value of the search key in this index and point to the actual record on the disc.

Fig 7: Dense index

Sparse index

It is an index record that only exists in the file for some of the values. The Sparse Index helps you solve the problems of DBMS Dense Indexing. A number of index columns store the same data block address in this form of indexing technique, and the block address will be fetched when data needs to be retrieved.

However, for only those search-key values, sparse Index stores index records. It requires less space, less overhead maintenance for insertion, and deletions, but compared to the dense index for locating records, it is slower.

The database index below is an example of the Sparse Index

Fig 8: Sparse index

2. Clustering index

The records themselves are stored in the index in a clustered index, and not the pointers. On non-primary key columns, the Index is often generated, which may not be unique for each record.

You can group two or more columns in such a situation to get unique values and create an index called the Clustered Index. This helps you locate the record faster as well.

Example:

Suppose a business has many staff in each department. Suppose we use a clustering index, where all employees within a single cluster who belong to the same Dept ID are considered, and index pointers point to the entire cluster. Dept Id is a non-unique key here.

Fig 9: Example of cluster index

3. Secondary index

In DBMS, a secondary index can be created by a field that has a specific value for each record and should be a candidate key. It is also known as an index for non-clustering.

To reduce the mapping size of the first level, this two-level database indexing technique is used. A wide range of numbers is chosen for the first level because of this; the mapping size still remains small.

In secondary indexing, another degree of indexing is added to reduce the mapping scale. The huge range for the columns is initially chosen in this process, so that the mapping scale of the first level becomes small. Each range is then further split into smaller ranges.

The first level mapping is stored in the primary memory, so that the fetching of addresses is easier. In the secondary memory, the second level mapping and actual data are stored (hard disk).

Fig 10: Secondary index

Let’s understand this by an example:

If you want to find the record of roll 111 in the diagram, the first level index will look for the highest entry that is smaller than or equal to 111. At this amount, he'll get 100.
Then again, it does max (111) <= 111 at the second index stage and gets 110. It now goes to the data block using the address 110, and starts looking for each record until it gets 111.
This is how this approach is used to conduct a search. Often, adding, modifying or removing is performed the same way.

Key takeaway:

● Data can be stored in sorted order or may not be stored.

● The Primary Index is an ordered file with two fields and a fixed length scale.

● For this category, records which have similar characteristics are grouped and indexes are established in cluster index.

● In secondary indexing, another degree of indexing is added to reduce the mapping scale.

2.11 Sequence

The sequence generator generates a series of integers in a specific order. The sequence generator comes in handy when you need to create unique sequential ID numbers.

Individual sequence numbers that were generated and utilized in a transaction that was subsequently rolled back can be skipped.

For numeric columns in database tables, a sequence generates a serial list of unique numbers. Sequences make application programming easier by producing unique integer values for each row of a single table or several tables automatically.

Assume that two users are putting new employee entries into the EMP database at the same time. Neither user has to wait for the other to enter the next available employee number when utilizing a sequence to generate unique employee numbers for the EMPNO column. The sequence creates the correct values for each user automatically.

Tables are unrelated to sequence numbers, so the same sequence can be used for one or multiple tables. A sequence can be accessed by several people to generate actual sequence numbers once it is created.

A sequence is a set of integers, such as 1, 2, 3, and so on, that are generated and supported by some database systems in order to produce unique values on demand.

A sequence is a schema-bound user-defined object that generates a list of numeric values.
Many databases employ sequences because many applications need that each row in a table include a unique value, and sequences give an easy way to do so.
The sequence of numeric numbers is generated at defined intervals in ascending or descending order, and it can be adjusted to resume when max value is exceeded.

Syntax

CREATE SEQUENCE sequence_name

START WITH initial_value

INCREMENT BY increment_value

MINVALUE minimum value

MAXVALUE maximum value

CYCLE|NOCYCLE ;

Sequence_name: Name of the sequence.

Initial_value: starting value from where the sequence starts.

Initial_value should be greater than or equal

To minimum value and less than equal to maximum value.

Increment_value: Value by which sequence will increment itself.

Increment_value can be positive or negative.

Minimum_value: Minimum value of the sequence.

Maximum_value: Maximum value of the sequence.

Cycle: When sequence reaches its set_limit

It starts from the beginning.

Nocycle: An exception will be thrown

If the sequence exceeds its max_value.

Example

Make a table called students, with the columns id and name.

CREATE TABLE students

(

ID number(10),

NAME char(20)

);

Insert values into the table now.

INSERT into students VALUES(sequence_1.nextval,'Ramesh');

INSERT into students VALUES(sequence_1.nextval,'Suresh');

Where sequence_1.nextval will insert id’s in id column in a sequence as defined in sequence_1.

Output

______________________

| ID | NAME |

------------------------

| 1 | Ramesh |

| 2 | Suresh |

----------------------

Key takeaway

The sequence generator generates a series of integers in a specific order. The sequence generator comes in handy when you need to create unique sequential ID numbers.

References:

Database system concept- Korth Silberschatz.
SQL - PL/SQL by Ivan Bayross BPB Publications.
Structure query language-By Osborne
Learning MySQL by O’reilly

Sign Up

Index

Notes

Highlighted

Underlined

Browse by Topics

Notes

Highlighted

Underlined