Introduction Beginner’s Guide to SQL
This is Beginner’s Guide to SQL, Databases and their management systems have for some time now been an essential means of storing data. Pick up any software application that comes to the top of your head – be it a simple login page or a payment gateway portal. All these pick up or pack data into a systemic facility called a database. SQL (Structured Query Language) is a language developed to query and manipulate information in these database systems. Call it ‘Sequel’ or “ess-que-el”, there is no denying the importance of SQL in data storage and transformation.
SQL has become more and more prevalent and ubiquitous. It is one of the most sought-after skills, no longer only restricted to the IT industry. Whether it’s a myriad of rows of stock data in algo trading or a few tables of an emerging start-up, SQL is needed everywhere.
So how about picking up some essential components to add to your resume? This Beginner’s Guide to SQL surely put into perspective why SQL is so important and briefly explain some salient features. Read our latest blog on Top 20 SQL Interview Questions
Some important concepts
- Database – A database is a systematic collection of data. It enables data management and manipulation. Traditionally, data here is stored in the form of tables consisting of rows and columns.
- Relational Databases – This type of database defines database relationships in the form of tables and is connected by constraints. “Relations” here are between tables. It is also called Relational DBMS, which is also the most popular DBMS type in the market. Some popular RDBMS systems include Oracle, MySQL, PostgreSQL, etc.
Need for SQL in RDBMS
- Enables access to data within RDBMS.
- Allows defining relationships and attributes of given data.
- Perform table operations like creating, altering, and dropping tables.
- SQL also offers the creation of views, triggers, and stored procedures.
- Authentications can also be set on the tables by setting permissions.
- Other flavors of SQL –
NoSQL : NoSQL is a querying language that deals with databases that are not strictly relational in nature. Moreover, data here is not modeled in the traditional tabular format. They are popular in real-time web applications and big data models.
Before starting a Beginner’s Guide to SQL, let’s see SQL Features, Now let’s learn about some popular SQL queries that are mandatory to learn as a beginner. If you don’t have any SQL workbench installed, you can simply import the SQLite library in Python and proceed with learning SQL!
DDL (Data Definition Language)
DDL queries , as the name suggests, deal with data definition i.e tables and schema. It operates on the object level and is used to modify, add, or delete definitions of tables and other objects. These are permanent commands that cannot be revoked.
|Create||It is used to create the database or its objects like table, index, function, views, store procedure, and triggers).||CREATE TABLE table_name (|
|Alter||Used to alter the structure of the database.||ALTER TABLE table_name|
ADD column_name datatype;
|Drop||This command is used to delete DB objects.||DROP TABLE table_name;|
|Truncate||It is used to wipe out all data within a table. (Note – This is not explicitly present in SQLite, but is allowed in other popular DBMS systems)||TRUNCATE TABLE table_name;|
For example, I used the following query to create the table.
Data Query Language – The Select Query
The SELECT command is what data analysts use more than 95% of their time. It is used to retrieve information from your database.
Syntax – SELECT (column names, *) from table_name;
Filtering data is an important aspect of the select query. Real-world data has thousands of rows and a wide range of features. The WHERE clause is used to filter results in SQL. It contains one or more logical expressions that evaluate each row and return them if the condition is true.
The most commonly used comparison operators to use in
WHERE clauses are:
- = (equal to),
- <> or != (not equal to),
- < (less than), > (greater than),
- <= (less than or equal), >= (greater than or equal)
For instance, the following query returns data of employees whose salary is greater than 40000.
While this one returns those employees who live in Powai.
Logical operators, such as AND, OR, LIKE can also be used to test data against a condition. The LIKE clause is used to match various patterns of textual formatted data.
The following query returns those who live in Powai and are from IT department.
This query returns employees who live in cities that begin with ‘P’. We have used the LIKE clause to apply this filter.
DML (Data Manipulation Language)
As opposed to DDL, DML queries operate on the data level. It contains four revolutionary commands that will let you work and manipulate data with ease!
|Insert||This command is used to insert data into the table.||INSERT INTO table_name (column1, column2, column3, …)|
VALUES (value1, value2, value3, …);
|Update||It is used to modify data within a table.||UPDATE table_name|
SET column1 = value1, column2 = value2, …
|Delete||This is used to delete records from a table||DELETE FROM table_name WHERE condition;|
(Note – If WHERE is not mentioned, all records of your table gets deleted)
We inserted records into the database table with the following command.
Suppose an employee’s (ID no. 5) has been given a promotion. This implies an update in salary!
Let’s assume another employee Mark has left the company. We will have to delete his record.
As you can clearly see we are transforming table data here. While dealing with real-world data you need to be very precise and clear about the manipulations you make. Otherwise, you may end up losing vital information!
Aggregations and Grouping
Aggregate functions take a list of values and return a single value. The most widely used aggregate functions are AVG, MIN, MAX, SUM and COUNT.
To find total no. of employees –
To find average salary of employees within IT department –
Simply performing aggregations is not enough. What if we want to find the minimum and maximum salaries based on each department? Here is where the GROUP BY clause shows its magic. It is strictly used after the WHERE statement.
Sometimes you may also need to filter these groups as well. In that case, we use the HAVING clause. Here, we shall try to find the departments which have more than 1 employee.
Joining tables is one of the most fundamental parts of relational databases. Upto now, we have only been working with one table. But RDBMS allows us to link one table with another so as to query attributes from both at the same time. Some commonly used joins are –
INNER JOIN – Returns records that have matching values in both tables.
LEFT JOIN – Returns all records from the left table, and matching records from the right table
RIGHT JOIN – Returns all records from the right table, and matching records from the left table
OUTER JOIN – Returns merged records from left and right table
In our tutorial we have a second table containing contact info based on Person ID.
We first perform an INNER join on these tables.
We see that some employees may also have multiple contact information. Let’s find out those employees that did not provide any contact information.
Beginner’s Guide to SQL That’s all the basic information you need about SQL to start your data analysis journey! Of course, there are some intricate tricks to the trade when you are dealing with massive rows of data, but this pretty much covers the vital basics. Knowing SQL has become a mandatory prerequisite in data analyst and business analyst roles across all companies. So make sure to take SQL as a skill under your belt!