Excel files (.xls, .xlsx) are one of the most common formats for sharing tabular data. Pandas makes it very simple to read from and write to Excel files.


1. Requirements

  • Install openpyxl (for .xlsx) or xlrd (for .xls):
pip install openpyxl xlrd

By default:

  • pandas ≥ 1.2 uses openpyxl to read/write .xlsx.
  • .xls (old format) requires xlrd (but newer versions of xlrd support only .xls, not .xlsx).

2. Reading Excel Files

Basic read

import pandas as pd

df = pd.read_excel("data.xlsx")   # reads first sheet by default
print(df.head())

Read a specific sheet

df = pd.read_excel("data.xlsx", sheet_name="Sales")
  • sheet_name can be a string (sheet name), int (sheet index, starting from 0), or list of names/indices.
sheets = pd.read_excel("data.xlsx", sheet_name=["Sales","Expenses"])
print(sheets["Sales"].head())

Read only selected columns

df = pd.read_excel("data.xlsx", usecols=["A","C","F"])  # by Excel columns
# or by column names
df = pd.read_excel("data.xlsx", usecols=["Name","Salary"])

Read only specific rows

df = pd.read_excel("data.xlsx", skiprows=2, nrows=10)
  • skiprows=2 → skip first 2 rows.
  • nrows=10 → read only 10 rows.

3. Writing to Excel

Write a DataFrame to Excel

df.to_excel("output.xlsx", index=False)
  • index=False → prevents writing row numbers.

Write to specific sheet

df.to_excel("output.xlsx", sheet_name="Summary", index=False)

Write multiple DataFrames to one Excel file

Use ExcelWriter:

with pd.ExcelWriter("report.xlsx") as writer:
    df1.to_excel(writer, sheet_name="Sales", index=False)
    df2.to_excel(writer, sheet_name="Expenses", index=False)

Append new sheet to an existing Excel file

with pd.ExcelWriter("report.xlsx", mode="a", engine="openpyxl") as writer:
    df3.to_excel(writer, sheet_name="NewData", index=False)

4. Controlling Output

Change column order and formatting

df.to_excel("output.xlsx", columns=["Name","Salary","Age"], index=False)

Add styling (optional with Styler)

styled = df.style.highlight_max(color="lightgreen")
styled.to_excel("styled.xlsx", engine="openpyxl")

5. Reading/Writing Excel with multiple sheets

  • Read all sheets into a dictionary:
sheets_dict = pd.read_excel("report.xlsx", sheet_name=None)
for sheet, data in sheets_dict.items():
    print(f"Sheet: {sheet}")
    print(data.head())
  • Write dictionary of DataFrames to multiple sheets:
dfs = {"Sales": df1, "Expenses": df2, "Summary": df3}
with pd.ExcelWriter("output.xlsx") as writer:
    for sheet, data in dfs.items():
        data.to_excel(writer, sheet_name=sheet, index=False)

6. Small End-to-End Example

import pandas as pd

# Load data
df = pd.read_excel("employees.xlsx", sheet_name="EmployeeData")

# Clean: fill missing salaries with average
df['Salary'].fillna(df['Salary'].mean(), inplace=True)

# Transform: add yearly salary
df['YearlySalary'] = df['Salary'] * 12

# Save cleaned data to new Excel
df.to_excel("employees_cleaned.xlsx", sheet_name="CleanedData", index=False)

Quick Exercises

FIles to use: https://github.com/slidescope/Pandas-Read-or-write-to-excel

  1. Read the Excel file sales.xlsx from the sheet 2024_Q1 and display the first 5 rows.
  2. Read only the Name and Revenue columns from sales.xlsx.
  3. Write a DataFrame of student marks to marks.xlsx under sheet Semester1.
  4. Write two DataFrames (sales and expenses) into the same file report.xlsx in separate sheets.
  5. Append new data (df_new) into an existing Excel file as a new sheet.

Excel files (.xls, .xlsx) are one of the most common formats for sharing tabular data. Pandas makes it very simple to read from and write to Excel files.


1. Requirements

  • Install openpyxl (for .xlsx) or xlrd (for .xls):
pip install openpyxl xlrd

By default:

  • pandas ≥ 1.2 uses openpyxl to read/write .xlsx.
  • .xls (old format) requires xlrd (but newer versions of xlrd support only .xls, not .xlsx).

2. Reading Excel Files

Basic read

import pandas as pd

df = pd.read_excel("data.xlsx")   # reads first sheet by default
print(df.head())

Read a specific sheet

df = pd.read_excel("data.xlsx", sheet_name="Sales")
  • sheet_name can be a string (sheet name), int (sheet index, starting from 0), or list of names/indices.
sheets = pd.read_excel("data.xlsx", sheet_name=["Sales","Expenses"])
print(sheets["Sales"].head())

Read only selected columns

df = pd.read_excel("data.xlsx", usecols=["A","C","F"])  # by Excel columns
# or by column names
df = pd.read_excel("data.xlsx", usecols=["Name","Salary"])

Read only specific rows

df = pd.read_excel("data.xlsx", skiprows=2, nrows=10)
  • skiprows=2 → skip first 2 rows.
  • nrows=10 → read only 10 rows.

3. Writing to Excel

Write a DataFrame to Excel

df.to_excel("output.xlsx", index=False)
  • index=False → prevents writing row numbers.

Write to specific sheet

df.to_excel("output.xlsx", sheet_name="Summary", index=False)

Write multiple DataFrames to one Excel file

Use ExcelWriter:

with pd.ExcelWriter("report.xlsx") as writer:
    df1.to_excel(writer, sheet_name="Sales", index=False)
    df2.to_excel(writer, sheet_name="Expenses", index=False)

Append new sheet to an existing Excel file

with pd.ExcelWriter("report.xlsx", mode="a", engine="openpyxl") as writer:
    df3.to_excel(writer, sheet_name="NewData", index=False)

4. Controlling Output

Change column order and formatting

df.to_excel("output.xlsx", columns=["Name","Salary","Age"], index=False)

Add styling (optional with Styler)

styled = df.style.highlight_max(color="lightgreen")
styled.to_excel("styled.xlsx", engine="openpyxl")

5. Reading/Writing Excel with multiple sheets

  • Read all sheets into a dictionary:
sheets_dict = pd.read_excel("report.xlsx", sheet_name=None)
for sheet, data in sheets_dict.items():
    print(f"Sheet: {sheet}")
    print(data.head())
  • Write dictionary of DataFrames to multiple sheets:
dfs = {"Sales": df1, "Expenses": df2, "Summary": df3}
with pd.ExcelWriter("output.xlsx") as writer:
    for sheet, data in dfs.items():
        data.to_excel(writer, sheet_name=sheet, index=False)

6. Small End-to-End Example

import pandas as pd

# Load data
df = pd.read_excel("employees.xlsx", sheet_name="EmployeeData")

# Clean: fill missing salaries with average
df['Salary'].fillna(df['Salary'].mean(), inplace=True)

# Transform: add yearly salary
df['YearlySalary'] = df['Salary'] * 12

# Save cleaned data to new Excel
df.to_excel("employees_cleaned.xlsx", sheet_name="CleanedData", index=False)

Quick Exercises

FIles to use: https://github.com/slidescope/Pandas-Read-or-write-to-excel

  1. Read the Excel file sales.xlsx from the sheet 2024_Q1 and display the first 5 rows.
  2. Read only the Name and Revenue columns from sales.xlsx.
  3. Write a DataFrame of student marks to marks.xlsx under sheet Semester1.
  4. Write two DataFrames (sales and expenses) into the same file report.xlsx in separate sheets.
  5. Append new data (df_new) into an existing Excel file as a new sheet.