Summary of "SQL Course for Beginners: Learn 90% of SQL in 1 Hour"
High-level summary
This is a beginner-to-advanced, one-hour SQL tutorial using a real e‑commerce dataset. It covers:
- Setting up a database and loading CSV data into MySQL (using Python/pandas)
- Core SELECT queries (column selection, WHERE filters, operators)
- Sorting, limiting, and NULL handling
- Aggregate functions and GROUP BY / HAVING
- Text, numeric, and date/time functions
- Joins (inner / left / right / cross / self)
- Subqueries and Common Table Expressions (CTE)
- CASE expressions, window functions (cumulative sums, ranking), and views
The instructor demonstrates each concept using an e‑commerce dataset (customers, geolocation, order_items, orders, payments, products, sellers) available on Kaggle and in a GitHub repo.
Dataset and environment
- Dataset: Brazilian e‑commerce dataset (orders from 2016–2018), provided as CSV files: customers, geolocation, order_items, orders, payments, products, sellers.
- Tools:
  - MySQL (server + MySQL Workbench)
  - Python with pandas and mysql-connector-python (or another MySQL connector)
  - Any Python IDE / Jupyter Notebook / VS Code
- Repository: the author’s GitHub repo “E‑commerce SQL” and a Kaggle link are provided in the course.
Loading CSVs into MySQL (recommended workflow)
Recommended method: use Python + pandas + a MySQL connector to read CSVs and push them into MySQL.
Steps:
- Install required Python packages: pip install pandas and pip install mysql-connector-python (or an equivalent connector)
- Prepare CSV files in a folder and note the path (convert backslashes to forward slashes if needed).
- Write a Python script or notebook that:
  - imports pandas (import pandas as pd) and the MySQL connector module
  - reads each CSV into a DataFrame: pd.read_csv(path)
  - normalizes/cleans data (replace Python NaN/None with SQL NULL where needed)
  - connects to MySQL (host localhost, username, password, database)
  - creates the database if needed: CREATE DATABASE e_commerce;
  - writes DataFrames to MySQL tables (use to_sql equivalents or connector + INSERTs)
- Handle NULLs explicitly: convert NaN/None to SQL NULL or use DataFrame.fillna() / replace() before inserting.
- Verify in MySQL Workbench by refreshing the database and opening tables to confirm rows loaded.
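The loading steps above can be sketched roughly as follows. This is a minimal illustration, not the instructor's script: the helper names clean_row, build_insert, and load_dataframe are hypothetical, the target table is assumed to exist already with matching column names, and only float NaN is converted to NULL.

```python
import math


def clean_row(row):
    """Return the row as a tuple with float NaN replaced by None (SQL NULL)."""
    return tuple(
        None if isinstance(value, float) and math.isnan(value) else value
        for value in row
    )


def build_insert(table, columns):
    """Build a parameterized INSERT using %s placeholders (mysql-connector style)."""
    column_list = ", ".join(f"`{name}`" for name in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO `{table}` ({column_list}) VALUES ({placeholders})"


def load_dataframe(conn, table, df):
    """Insert every row of a pandas DataFrame into an existing table.

    Assumption: conn is an open mysql.connector connection and df came
    from pd.read_csv(); neither is created here.
    """
    sql = build_insert(table, list(df.columns))
    rows = [clean_row(r) for r in df.itertuples(index=False, name=None)]
    cursor = conn.cursor()
    cursor.executemany(sql, rows)
    conn.commit()
    cursor.close()
    return len(rows)
```

A call might look like load_dataframe(conn, "customers", pd.read_csv("customers.csv")), where conn comes from mysql.connector.connect(host="localhost", user=..., password=..., database="e_commerce") with your own credentials.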
Basic SELECT and syntax reminders
- Basic form: SELECT column_list FROM table;
- To return all columns: SELECT * FROM table;
- Example: SELECT customer_id, customer_city, customer_state FROM customers;
- Terminate SQL statements with a semicolon when running multiple queries in the same window.
Filtering: WHERE and logical operators
- Equality and strings: WHERE column = 'value'
- Logical operators: AND, OR, NOT
- Range: BETWEEN a AND b (inclusive; works for numbers and dates)
- Membership: IN (val1, val2, ...) and NOT IN (...)
- Pattern matching: LIKE with the % wildcard
  - col LIKE 'R%': starts with R
  - col LIKE '%r': ends with r
  - col LIKE '%d%': contains d
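To try these filters without a MySQL server, the sketch below runs them against an in-memory SQLite database from Python's standard library. SQLite is only a stand-in here, and the payments rows are invented for illustration; the WHERE clauses themselves use syntax that MySQL also accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (order_id TEXT, payment_type TEXT, payment_value REAL)"
)
# Invented sample rows, not taken from the real dataset.
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        ("o1", "credit_card", 120.0),
        ("o2", "boleto", 35.5),
        ("o3", "voucher", 80.0),
        ("o4", "debit_card", 200.0),
    ],
)


def order_ids(where_clause):
    """Return the order_ids matching a WHERE clause, sorted by order_id."""
    sql = f"SELECT order_id FROM payments WHERE {where_clause} ORDER BY order_id"
    return [row[0] for row in conn.execute(sql)]


between_ids = order_ids("payment_value BETWEEN 80 AND 120")  # both ends included
in_ids = order_ids("payment_type IN ('voucher', 'debit_card')")
like_ids = order_ids("payment_type LIKE '%card'")  # ends with "card"
combined_ids = order_ids("payment_type = 'credit_card' AND NOT payment_value < 100")
```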
Sorting and limiting results
- Order results: ORDER BY column [ASC|DESC] (default ASC)
- Order by multiple columns, with mixed directions
- Limit rows: LIMIT n [OFFSET m] (MySQL also supports LIMIT m, n)
  - Examples: LIMIT 5; or LIMIT 2, 3 (skip 2, take 3)
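A small sketch of mixed-direction sorting and the two LIMIT forms, again using in-memory SQLite as a stand-in for MySQL (SQLite accepts the same LIMIT offset, count shorthand). The customers rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id TEXT, customer_city TEXT, customer_state TEXT)"
)
# Invented sample rows.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("c1", "sao paulo", "SP"),
        ("c2", "belo horizonte", "MG"),
        ("c3", "campinas", "SP"),
        ("c4", "uberlandia", "MG"),
        ("c5", "curitiba", "PR"),
        ("c6", "rio de janeiro", "RJ"),
    ],
)


def ids(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


# State ascending, then city descending within each state.
mixed_sort = ids(
    "SELECT customer_id FROM customers ORDER BY customer_state ASC, customer_city DESC"
)
# Both forms skip 2 rows and return the next 3.
offset_form = ids("SELECT customer_id FROM customers ORDER BY customer_id LIMIT 3 OFFSET 2")
comma_form = ids("SELECT customer_id FROM customers ORDER BY customer_id LIMIT 2, 3")
```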
Aggregate functions and GROUP BY / HAVING
- Common aggregates: SUM(), AVG(), MIN(), MAX(), COUNT() (ROUND() is not an aggregate, but is often wrapped around one to tidy the result)
- Count distinct: COUNT(DISTINCT column)
- When mixing aggregates and non-aggregates, group by the non-aggregated columns: GROUP BY column
- Filter after aggregation with HAVING (use WHERE for row-level filtering before aggregation)
  - Example: SELECT payment_type, AVG(payment_value) FROM payments GROUP BY payment_type HAVING AVG(payment_value) > 100;
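The difference between WHERE and HAVING is easier to see when both appear in one query. The sketch below uses in-memory SQLite as a stand-in for MySQL with invented payment rows: WHERE drops individual rows first, then HAVING drops whole groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (payment_type TEXT, payment_value REAL)")
# Invented sample rows.
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [
        ("credit_card", 150.0),
        ("credit_card", 90.0),
        ("boleto", 60.0),
        ("boleto", 80.0),
        ("voucher", 30.0),
        ("debit_card", 210.0),
    ],
)

query = """
    SELECT payment_type,
           ROUND(AVG(payment_value), 2) AS avg_value,
           COUNT(*) AS n_payments
    FROM payments
    WHERE payment_value > 50            -- row-level filter, applied before grouping
    GROUP BY payment_type
    HAVING AVG(payment_value) > 100     -- group-level filter, applied after grouping
    ORDER BY payment_type
"""
high_value_types = conn.execute(query).fetchall()
```

Here the voucher row is removed by WHERE, and the boleto group (average 70) is removed by HAVING.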
Text (string) functions
- LENGTH(column): length including spaces (MySQL counts bytes here; CHAR_LENGTH() counts characters)
- TRIM(column): remove leading/trailing whitespace
- UPPER(column) / LOWER(column): case conversion
- REPLACE(column, 'old', 'new'): replace substrings
- CONCAT(col1, ' ', col2): combine columns; alias the result when needed
Date & time functions
- Extractors: DAY(date), MONTH(date), YEAR(date)
- Name functions: MONTHNAME(date), DAYNAME(date)
- Difference: DATEDIFF(date1, date2) returns date1 minus date2 in days, so swapping the arguments flips the sign
- Use these to group by month/year/quarter; use ABS() if you need absolute differences
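The argument-order point is easy to check outside the database. The Python sketch below mirrors the sign convention of MySQL's DATEDIFF (first argument minus second); the datediff helper and both dates are hypothetical and only illustrate the arithmetic.

```python
from datetime import date


def datediff(date1, date2):
    """Mirror MySQL DATEDIFF(date1, date2): date1 minus date2, in whole days."""
    return (date1 - date2).days


estimated = date(2018, 3, 20)  # hypothetical estimated delivery date
delivered = date(2018, 3, 15)  # hypothetical actual delivery date

early_by = datediff(estimated, delivered)       # positive: delivered before the estimate
reversed_args = datediff(delivered, estimated)  # same gap, opposite sign
gap = abs(reversed_args)                        # what ABS() would give in SQL
```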
Numeric functions
- ROUND(value, decimals): round to the given number of decimal places
- CEIL() / CEILING(): round up
- FLOOR(): round down
NULL handling
- Check NULLs with IS NULL and IS NOT NULL (do not use = NULL)
- Replace NaN/None in Python prior to insertion, or handle NULLs in queries with IS NULL / COALESCE() as needed
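Why = NULL fails is worth seeing once: a comparison with NULL is never true, so the filter matches nothing. The sketch below shows this with in-memory SQLite as a stand-in for MySQL; the orders rows and the delivered_date column name are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT, delivered_date TEXT)")
# Python None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("o1", "2018-01-05"), ("o2", None), ("o3", None)],
)

# Comparing with = NULL never evaluates to true, so no rows match.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE delivered_date = NULL"
).fetchone()[0]

# IS NULL is the correct test.
is_null = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE delivered_date IS NULL"
).fetchone()[0]

# COALESCE substitutes a fallback for NULL in the output.
labels = conn.execute(
    "SELECT order_id, COALESCE(delivered_date, 'not delivered') "
    "FROM orders ORDER BY order_id"
).fetchall()
```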
Joins
- Inner join: INNER JOIN returns matched rows only
- Left join: LEFT JOIN returns all left rows plus matched right rows
- Right join: RIGHT JOIN returns all right rows plus matched left rows
- Cross join: CROSS JOIN returns the Cartesian product
- Self join: join a table with itself using aliases (e.g., t1, t2)
- Use fully qualified references (table.column) to avoid ambiguity
- Example join chain for sales analysis:
  - products JOIN order_items ON products.product_id = order_items.product_id
  - order_items JOIN payments ON order_items.order_id = payments.order_id
  - This chain allows computing sales per product/category.
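The join chain above might be run as in the sketch below, which also contrasts INNER JOIN with LEFT JOIN. It uses in-memory SQLite as a stand-in for MySQL; the three tables are cut down to the join keys, the category column name is an assumption, and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (product_id TEXT, category TEXT);
    CREATE TABLE order_items (order_id TEXT, product_id TEXT);
    CREATE TABLE payments (order_id TEXT, payment_value REAL);
    INSERT INTO products VALUES ('p1', 'toys'), ('p2', 'books'), ('p3', 'garden');
    INSERT INTO order_items VALUES ('o1', 'p1'), ('o2', 'p1'), ('o3', 'p2');
    INSERT INTO payments VALUES ('o1', 50.0), ('o2', 70.0), ('o3', 20.0);
    """
)

# Inner join chain: only categories that were actually ordered appear.
sales_per_category = conn.execute(
    """
    SELECT p.category, ROUND(SUM(pay.payment_value), 2) AS sales
    FROM products AS p
    JOIN order_items AS oi ON p.product_id = oi.product_id
    JOIN payments AS pay ON oi.order_id = pay.order_id
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()

# Left join: every product category is kept, even with zero matching orders.
orders_per_category = conn.execute(
    """
    SELECT p.category, COUNT(oi.order_id) AS n_orders
    FROM products AS p
    LEFT JOIN order_items AS oi ON p.product_id = oi.product_id
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()
```

The garden category has no orders, so it is missing from the inner-join result but appears with a count of 0 in the left-join result.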
Subqueries and CTEs (WITH … AS)
- Subquery: a query nested inside another query (can appear in SELECT, FROM, WHERE, etc.)
- CTE: WITH alias AS (subquery), which makes complex logic readable and reusable
- Typical use: compute category sales in a CTE, order by sales DESC, LIMIT top N, then select category names from the derived table
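The typical use described above could look like the following sketch. It runs on in-memory SQLite as a stand-in for MySQL; the order_sales table is a hypothetical, already-joined table with invented rows, used so the example stays short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_sales (category TEXT, payment_value REAL)")
# Invented sample rows.
conn.executemany(
    "INSERT INTO order_sales VALUES (?, ?)",
    [("toys", 50.0), ("toys", 70.0), ("books", 20.0), ("garden", 90.0), ("garden", 5.0)],
)

query = """
    WITH category_sales AS (
        SELECT category, SUM(payment_value) AS sales
        FROM order_sales
        GROUP BY category
        ORDER BY sales DESC
        LIMIT 2                         -- keep only the top 2 categories
    )
    SELECT category
    FROM category_sales
    ORDER BY sales DESC                 -- outer ORDER BY makes the final order explicit
"""
top_categories = [row[0] for row in conn.execute(query)]
```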
CASE expressions
- Syntax: CASE WHEN condition THEN value [WHEN ...] ELSE value END AS alias
- Use to bucket values (e.g., sales buckets: LOW / MEDIUM / HIGH)
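A bucketing query might look like the sketch below, run on in-memory SQLite as a stand-in for MySQL. The thresholds (100 and 500) and the rows are invented; the course's actual cut-offs are not given in this summary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (order_id TEXT, payment_value REAL)")
# Invented sample rows.
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [("o1", 30.0), ("o2", 150.0), ("o3", 600.0)],
)

query = """
    SELECT order_id,
           CASE
               WHEN payment_value < 100 THEN 'LOW'      -- hypothetical threshold
               WHEN payment_value < 500 THEN 'MEDIUM'   -- hypothetical threshold
               ELSE 'HIGH'
           END AS bucket
    FROM payments
    ORDER BY order_id
"""
buckets = conn.execute(query).fetchall()
```

Conditions are checked top to bottom and the first match wins, which is why the second WHEN does not need a lower bound.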
Window functions
- Running total example: SUM(sales) OVER (ORDER BY order_date) gives a cumulative sum
- Ranking: RANK(), DENSE_RANK(), ROW_NUMBER(), each followed by OVER (ORDER BY metric DESC)
- Use windows for cumulative sums, running totals, ranking within partitions, and top‑N per group
- To filter by rank, wrap the windowed result in a subquery/CTE and apply WHERE rnk <= N (RANK is a reserved word in MySQL 8.0, so give the column an alias such as rnk)
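Both window patterns can be tried with the sketch below. It uses in-memory SQLite as a stand-in for MySQL and needs SQLite 3.25 or newer for window functions; the daily_sales table and its rows are invented, and the rank column is aliased as rnk because RANK is a reserved word in MySQL 8.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (order_date TEXT, sales INTEGER)")
# Invented sample rows; two days tie at 100.
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [("2018-01-01", 100), ("2018-01-02", 50), ("2018-01-03", 100), ("2018-01-04", 25)],
)

# Running total: each row sums all sales up to and including its date.
running_totals = conn.execute(
    """
    SELECT order_date, SUM(sales) OVER (ORDER BY order_date) AS cumulative_sales
    FROM daily_sales
    ORDER BY order_date
    """
).fetchall()

# A window function cannot be filtered in the same SELECT's WHERE,
# so the ranking is computed in a CTE and filtered outside it.
top_days = conn.execute(
    """
    WITH ranked AS (
        SELECT order_date, RANK() OVER (ORDER BY sales DESC) AS rnk
        FROM daily_sales
    )
    SELECT order_date, rnk
    FROM ranked
    WHERE rnk <= 3
    ORDER BY rnk, order_date
    """
).fetchall()
```

The two tied days share rank 1 and the next day gets rank 3, which is the gap that DENSE_RANK() would close.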
Views
- Create a view: CREATE VIEW view_name AS (SELECT ...);
- Views act like virtual tables and encapsulate repeated complex queries
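The "virtual table" behavior can be seen in the sketch below: the view stores the query, not the data, so it reflects rows inserted later. In-memory SQLite stands in for MySQL; the view name payment_totals and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (payment_type TEXT, payment_value REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [("credit_card", 100.0), ("boleto", 40.0)],
)

# The view saves the SELECT statement under a name.
conn.execute(
    """
    CREATE VIEW payment_totals AS
    SELECT payment_type, SUM(payment_value) AS total
    FROM payments
    GROUP BY payment_type
    """
)

select_view = "SELECT payment_type, total FROM payment_totals ORDER BY payment_type"
before = conn.execute(select_view).fetchall()

# A new base-table row shows up in the view without recreating it.
conn.execute("INSERT INTO payments VALUES ('credit_card', 25.0)")
after = conn.execute(select_view).fetchall()
```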
Practical tips emphasized
- Ensure the correct database is selected in MySQL Workbench (or prefix object names)
- Use semicolons to separate queries
- Clean and normalize CSVs before loading; replace NaN with SQL NULL when appropriate
- Use aliases to simplify queries and avoid ambiguous column references
- Use ORDER BY + LIMIT to inspect top/bottom results quickly
- Use CTEs and views to make complex logic modular and readable
- Use GROUP BY + HAVING when filtering aggregated results (HAVING applies after grouping)
Example analyses shown (brief)
- Filter customers by state: WHERE customer_state = 'MG'
- Filter payments by payment_type and payment_value using AND/OR
- Aggregate total revenue: SELECT ROUND(SUM(payment_value), 2) AS revenue FROM payments;
- Count unique cities: SELECT COUNT(DISTINCT customer_city) FROM customers;
- Compute date differences between estimated and actual delivery: DATEDIFF(...)
- Total sales per year: join orders and payments, GROUP BY YEAR(purchase_timestamp)
- Top product categories by total sales using subquery/CTE and LIMIT
- Create sales buckets using CASE
- Compute cumulative daily sales: SUM(...) OVER (ORDER BY order_date)
Errors and quirks to watch for
- Column name typos and schema mismatches: column names must match exactly
- Large datasets can cause MySQL client memory errors or long run times; during development use LIMIT, sampling, or optimize queries
- Python NaN vs SQL NULL requires explicit handling before insertion
Speakers and sources
- Instructor: Aayushi Jain (Dubs Cube / W Cube Tech channel)
- Course/mentorship: Dubs Cube (W Cube Tech)
- Dataset sources: Kaggle (e‑commerce Brazil dataset) and the instructor’s GitHub repo “E‑commerce SQL”
- Technologies/tools referenced: MySQL (MySQL Workbench), Python, pandas, mysql-connector-python
Category
Educational