Working with data stored in relational databases is a crucial skill in data analysis. MySQL is one of the most popular relational databases used worldwide. Python's Pandas library makes it easy to connect to MySQL, run SQL queries, and import data directly into DataFrames for analysis.
In this chapter, we'll cover how to:
- Install the required libraries
- Connect to a MySQL database
- Read data from MySQL into Pandas
- Write Pandas DataFrames back to MySQL
- Handle connection errors safely
🧩 Step 1: Install Required Libraries
You need the following Python libraries:
pip install pandas mysql-connector-python sqlalchemy
- pandas – for data manipulation and analysis
- mysql-connector-python – the official MySQL driver
- sqlalchemy – provides a clean interface between Pandas and SQL databases
🧩 Step 2: Create a Sample MySQL Database
Before connecting, let's assume you have a MySQL database named company_db with a table called employees:
CREATE DATABASE company_db;
USE company_db;
CREATE TABLE employees (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50),
    department VARCHAR(50),
    salary FLOAT,
    hire_date DATE
);
INSERT INTO employees (name, department, salary, hire_date)
VALUES
('Alice Johnson', 'Finance', 72000, '2020-05-15'),
('Bob Smith', 'Marketing', 65000, '2019-07-23'),
('Charlie Brown', 'IT', 85000, '2018-03-10'),
('Diana King', 'Finance', 78000, '2021-01-05'),
('Ethan Clark', 'HR', 56000, '2017-11-18');
🧩 Step 3: Connect MySQL to Pandas Using mysql.connector
import mysql.connector
import pandas as pd

# Establish connection
conn = mysql.connector.connect(
    host='localhost',
    user='root',
    password='your_password',
    database='company_db'
)

# Read table into DataFrame
query = "SELECT * FROM employees"
df = pd.read_sql(query, conn)
print(df.head())

conn.close()
✅ Explanation
- mysql.connector.connect() establishes the connection.
- pd.read_sql() runs the SQL and loads the result into a DataFrame.
- conn.close() closes the connection safely.
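The connect → query → close pattern above works with any DB-API driver, not just mysql.connector. As a minimal sketch that runs without a live MySQL server, here is the same pattern using Python's built-in sqlite3 module as a stand-in (the table name and columns mirror the Step 2 example):

```python
import sqlite3
import pandas as pd

# In-memory SQLite stands in for the MySQL server here;
# with MySQL you would call mysql.connector.connect(...) instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)",
    [("Alice Johnson", 72000), ("Bob Smith", 65000)],
)

# Same call as with MySQL: run SQL, get a DataFrame back
df = pd.read_sql("SELECT * FROM employees", conn)
print(df.shape)  # (2, 3)

conn.close()  # always release the connection
```

Only the connect call changes between databases; the pd.read_sql usage is identical.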
🧩 Step 4: Using SQLAlchemy for Easier Integration
SQLAlchemy provides a more flexible way to manage connections and is preferred for production code.
from sqlalchemy import create_engine
import pandas as pd
# Create SQLAlchemy engine
engine = create_engine("mysql+mysqlconnector://root:your_password@localhost/company_db")
# Read entire table
df = pd.read_sql("SELECT * FROM employees", con=engine)
print(df)
Benefits of SQLAlchemy
- Works with multiple databases (MySQL, PostgreSQL, SQLite, etc.)
- Can handle connection pooling
- Easier integration with to_sql() for writing DataFrames back to MySQL
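The multi-database benefit is concrete: only the engine URL changes between backends, while the Pandas calls stay the same. A small sketch, using an in-memory SQLite URL so it runs without a server (the MySQL URL shown in the comment is the one from Step 4):

```python
import pandas as pd
from sqlalchemy import create_engine

# Only the URL changes per backend; the Pandas code below does not.
# MySQL:  create_engine("mysql+mysqlconnector://root:your_password@localhost/company_db")
# SQLite (in-memory, used here so the sketch is self-contained):
engine = create_engine("sqlite://")

df = pd.DataFrame({"name": ["Alice"], "salary": [72000]})
df.to_sql("employees", con=engine, index=False)

out = pd.read_sql("SELECT name, salary FROM employees", con=engine)
print(out)
```

Swapping the URL back to the mysql+mysqlconnector form targets your MySQL database with no other code changes.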
🧩 Step 5: Writing Pandas DataFrame to MySQL
Suppose you have a new DataFrame:
new_data = {
    'name': ['Frank Miller', 'Grace Lee'],
    'department': ['IT', 'Marketing'],
    'salary': [88000, 63000],
    'hire_date': ['2022-06-15', '2023-02-10']
}
df_new = pd.DataFrame(new_data)
# Write to MySQL table
df_new.to_sql('employees', con=engine, if_exists='append', index=False)
print("New records inserted successfully!")
The if_exists parameter:
- 'fail' → raises an error if the table already exists
- 'replace' → drops the table and recreates it
- 'append' → adds new rows to the existing table
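The three modes are easy to compare side by side. A sketch using an in-memory SQLite engine as a stand-in for the MySQL engine, so the effect of each mode is visible:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite://")  # in-memory stand-in for the MySQL engine

df1 = pd.DataFrame({"name": ["Alice"], "salary": [72000]})
df2 = pd.DataFrame({"name": ["Bob"], "salary": [65000]})

df1.to_sql("employees", con=engine, index=False)                       # creates the table
df2.to_sql("employees", con=engine, if_exists="append", index=False)   # adds rows -> 2 total
df2.to_sql("employees", con=engine, if_exists="replace", index=False)  # drops + recreates -> 1 row

n = pd.read_sql("SELECT COUNT(*) AS n FROM employees", con=engine)["n"].iloc[0]
print(n)  # 1
```

Note that 'replace' discards the existing schema and data, so prefer 'append' for routine inserts.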
🧩 Step 6: Querying and Filtering Data
Once your data is loaded, you can query and filter it using SQL or Pandas directly.
Using SQL Query
query = "SELECT * FROM employees WHERE department = 'Finance'"
df_finance = pd.read_sql(query, con=engine)
print(df_finance)
Using Pandas
df_finance = df[df['department'] == 'Finance']
print(df_finance)
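Both routes yield the same rows; the difference is where the filtering happens (in the database for SQL, in memory for Pandas). A self-contained sketch, using an in-memory SQLite engine in place of MySQL, showing the two results agree:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite://")  # in-memory stand-in for MySQL
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Diana"],
    "department": ["Finance", "Marketing", "Finance"],
})
df.to_sql("employees", con=engine, index=False)

# SQL-side filter: only matching rows leave the database
sql_side = pd.read_sql(
    "SELECT * FROM employees WHERE department = 'Finance'", con=engine
)

# Pandas-side filter: the whole table is already in memory
pandas_side = df[df["department"] == "Finance"].reset_index(drop=True)

print(sql_side.equals(pandas_side))  # True
```

For large tables, filtering in SQL is usually preferable because only the matching rows are transferred.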
🧩 Step 7: Handling Errors Gracefully
When dealing with databases, always include exception handling.
import mysql.connector
from mysql.connector import Error
import pandas as pd

conn = None  # so the finally block is safe even if connect() fails
try:
    conn = mysql.connector.connect(
        host='localhost',
        user='root',
        password='your_password',
        database='company_db'
    )
    if conn.is_connected():
        print("✅ Connected to MySQL database")
        df = pd.read_sql("SELECT * FROM employees", conn)
        print(df.head())
except Error as e:
    print(f"❌ Error: {e}")
finally:
    if conn is not None and conn.is_connected():
        conn.close()
        print("🔌 Connection closed.")
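The try/finally dance can also be delegated to a context manager, which guarantees cleanup even if the query raises. A sketch of the pattern using contextlib.closing with sqlite3 as a stand-in for the MySQL connection:

```python
import sqlite3
from contextlib import closing

import pandas as pd

# closing() calls conn.close() on exit, success or failure,
# mirroring the finally block above (sqlite3 stands in for MySQL).
try:
    with closing(sqlite3.connect(":memory:")) as conn:
        conn.execute("CREATE TABLE employees (name TEXT)")
        conn.execute("INSERT INTO employees VALUES ('Alice')")
        df = pd.read_sql("SELECT * FROM employees", conn)
    print("Connection closed automatically")
except sqlite3.Error as e:
    print(f"Error: {e}")

print(len(df))  # 1
```

The same wrapper works around mysql.connector.connect(...), since its connection objects also expose close().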
🧩 Step 8: Writing Query Results to Excel or CSV
Once you fetch data from MySQL, you can easily export it.
df.to_excel('employees_data.xlsx', index=False)
df.to_csv('employees_data.csv', index=False)
Now, you have data from MySQL saved locally for further analysis or sharing.
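A quick round-trip check confirms nothing is lost on export. A sketch using a temporary directory and the CSV path (to_excel additionally requires the openpyxl package, so only the CSV route is shown here):

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["Alice", "Bob"], "salary": [72000, 65000]})

# Write to a temporary CSV and read it back to verify the round trip
path = os.path.join(tempfile.mkdtemp(), "employees_data.csv")
df.to_csv(path, index=False)
restored = pd.read_csv(path)

print(restored.equals(df))  # True
```

index=False keeps the DataFrame's row index out of the file, so the restored frame matches the original column-for-column.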
🧩 Step 9: Use Case – Analyzing Employee Salaries
You can perform analysis directly on the MySQL data loaded into Pandas:
avg_salary = df.groupby('department')['salary'].mean().reset_index()
print(avg_salary)
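With the five sample employees from Step 2, the groupby works out as in this self-contained sketch (Finance averages (72000 + 78000) / 2 = 75000; the other departments have one row each):

```python
import pandas as pd

# The five sample rows from Step 2
df = pd.DataFrame({
    "department": ["Finance", "Marketing", "IT", "Finance", "HR"],
    "salary": [72000, 65000, 85000, 78000, 56000],
})

# One row per department, with the mean salary of its employees
avg_salary = df.groupby("department")["salary"].mean().reset_index()
print(avg_salary)
```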
Or visualize it using Matplotlib:
import matplotlib.pyplot as plt
avg_salary.plot(kind='bar', x='department', y='salary', legend=False)
plt.title('Average Salary by Department')
plt.ylabel('Salary ($)')
plt.show()
🧩 Step 10: Summary
| Task | Function/Method | Description |
|---|---|---|
| Connect to MySQL | mysql.connector.connect() | Establishes the connection |
| Read SQL Query | pd.read_sql() | Loads a SQL result into a DataFrame |
| Write to MySQL | DataFrame.to_sql() | Inserts data back into a SQL table |
| Handle Errors | try-except block | Manages connection failures |
| Export Data | to_csv() / to_excel() | Saves data locally |
🧩 Step 11: Best Practices
- Use environment variables to store credentials securely.
- Close connections after every operation.
- Use SQLAlchemy for scalability and cleaner syntax.
- Avoid loading very large tables at once: use LIMIT, or chunking with pd.read_sql(query, con=engine, chunksize=10000).
- Always validate and sanitize input to prevent SQL injection.
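The chunking advice deserves a concrete look: passing chunksize makes pd.read_sql return an iterator of DataFrames instead of one large frame, so only one chunk is in memory at a time. A sketch using an in-memory SQLite engine (stand-in for MySQL) and a small chunk size to make the behavior visible:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite://")  # in-memory stand-in for MySQL
pd.DataFrame({"id": range(25)}).to_sql("big_table", con=engine, index=False)

# chunksize turns read_sql into an iterator of 10-row DataFrames (10, 10, 5)
total = 0
for chunk in pd.read_sql("SELECT * FROM big_table", con=engine, chunksize=10):
    total += len(chunk)

print(total)  # 25
```

Processing chunk by chunk keeps memory bounded regardless of table size, at the cost of holding the connection open for the duration of the loop.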
✅ Summary
Connecting Pandas with MySQL lets data professionals bridge data storage and analytics seamlessly. You can query live databases, analyze them in Pandas, visualize results, and write insights back, all within Python.
This integration forms the foundation for advanced workflows in ETL, data analytics, and machine learning pipelines.