Pandas DataFrame Introduction

A Pandas DataFrame is a two-dimensional, labeled, tabular data structure in Python, similar to an Excel spreadsheet or a SQL table. It consists of rows and columns; each column is a Pandas Series, and all columns share the same row index.
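To make the relationship between a DataFrame and its columns concrete, here is a minimal sketch. The table name `df_intro` and its values are illustrative only.

```python
import pandas as pd

# Illustrative sample data (hypothetical values)
df_intro = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# Selecting a single column returns a Series
ages = df_intro['Age']
print(type(ages).__name__)               # Series

# shape reports the size as (rows, columns)
print(df_intro.shape)                    # (2, 2)

# The column carries the same row index as the DataFrame
print(ages.index.equals(df_intro.index)) # True
```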

Creating a Pandas DataFrame

1. Creating a DataFrame from a Dictionary

import pandas as pd

# Creating a dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

# Creating DataFrame
df = pd.DataFrame(data)

print(df)

Output:

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago

2. Creating a DataFrame from a List of Lists

data = [['Alice', 25, 'New York'], ['Bob', 30, 'Los Angeles'], ['Charlie', 35, 'Chicago']]
df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])

print(df)

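This method is useful when the data is already available as nested lists. Each inner list becomes one row, and the `columns` argument supplies the column names; if it is omitted, Pandas numbers the columns 0, 1, 2.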

3. Creating a DataFrame from a List of Dictionaries

data = [
    {'Name': 'Alice', 'Age': 25, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 30, 'City': 'Los Angeles'},
    {'Name': 'Charlie', 'Age': 35, 'City': 'Chicago'}
]

df = pd.DataFrame(data)

print(df)
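One property of this method worth seeing in action: the dictionaries do not all need the same keys. The sketch below uses hypothetical records (the name `df_records` is illustrative) in which one record has no 'City' entry.

```python
import pandas as pd

# Hypothetical records; the second one has no 'City' key
records = [
    {'Name': 'Alice', 'Age': 25, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 30},
]

df_records = pd.DataFrame(records)

# The columns are the union of all keys, in order of first appearance
print(df_records.columns.tolist())         # ['Name', 'Age', 'City']

# The missing entry is filled with a missing-value marker (NaN)
print(df_records['City'].isna().tolist())  # [False, True]
```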

Accessing Data in a DataFrame

1. Accessing a Column

print(df['Name'])  # Accessing the 'Name' column

Output:

0      Alice
1        Bob
2    Charlie
Name: Name, dtype: object

2. Accessing Multiple Columns

print(df[['Name', 'Age']])

Output:

      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35

3. Accessing Rows Using `loc` and `iloc`

Using `loc[]` (Label-Based Indexing)

print(df.loc[1])  # Access row with index label 1

Output:

Name            Bob
Age              30
City    Los Angeles
Name: 1, dtype: object

Using `iloc[]` (Integer Position-Based Indexing)

print(df.iloc[0])  # Access the first row
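With the default 0, 1, 2 index, `loc` and `iloc` return the same rows, so the difference is easy to miss. The following sketch uses a hypothetical table, `df_people`, with custom index labels (10, 20, 30) to separate the two.

```python
import pandas as pd

# Hypothetical table with custom, non-sequential index labels
df_people = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=[10, 20, 30],
)

# loc finds the row whose index label is 20
print(df_people.loc[20, 'Name'])   # Bob

# iloc finds the row at position 0, whatever its label is
print(df_people.iloc[0, 0])        # Alice

# Slices differ too: loc includes the end label, iloc excludes the end position
print(len(df_people.loc[10:20]))   # 2
print(len(df_people.iloc[0:1]))    # 1
```

As a rule of thumb, use `loc` when you know the label and `iloc` when you know the position.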

Modifying Data in a DataFrame

1. Adding a New Column

df['Salary'] = [50000, 60000, 70000]
print(df)

2. Updating a Value

df.at[1, 'Age'] = 32  # Change Bob's age to 32
print(df)
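`at` changes one cell at a time. To change several rows at once, one common approach is to pass a boolean condition to `loc`. The sketch below assumes a small hypothetical table named `df_staff`.

```python
import pandas as pd

# Hypothetical sample data
df_staff = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
)

# Update a single cell by row label and column label
df_staff.at[1, 'Age'] = 32

# Update every row that matches a condition (here: Age greater than 30)
df_staff.loc[df_staff['Age'] > 30, 'Age'] += 1

print(df_staff['Age'].tolist())   # [25, 33, 36]
```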

Deleting Rows and Columns

1. Deleting a Column

df.drop(columns=['Salary'], inplace=True)

2. Deleting a Row

df.drop(index=1, inplace=True)  # Deletes the row with index 1
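Two details of `drop` are easy to overlook: without `inplace=True` it returns a new DataFrame, and the remaining rows keep their old index labels. A minimal sketch, using a hypothetical table named `df_rows`:

```python
import pandas as pd

# Hypothetical sample data
df_rows = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
)

# Without inplace=True, drop returns a new DataFrame; the original is unchanged
trimmed = df_rows.drop(index=1)
print(len(df_rows), len(trimmed))   # 3 2

# The remaining rows keep their labels, so label 1 is now missing
print(trimmed.index.tolist())       # [0, 2]

# reset_index(drop=True) renumbers the rows starting from 0
renumbered = trimmed.reset_index(drop=True)
print(renumbered.index.tolist())    # [0, 1]
```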

Importing and Exporting Data

1. Reading Data from a CSV File

df = pd.read_csv('data.csv')

2. Writing Data to a CSV File

df.to_csv('output.csv', index=False)
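The two calls above need files on disk. To try the round trip without any files, the sketch below writes to an in-memory text buffer instead; the names `df_out` and `df_in` are illustrative. Passing `index=False` keeps the row index out of the CSV, so it does not come back as an extra column.

```python
import io

import pandas as pd

# Hypothetical sample data
df_out = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# Write CSV text to an in-memory buffer instead of a file
buffer = io.StringIO()
df_out.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text.splitlines()[0])   # Name,Age

# Read the same text back into a new DataFrame
df_in = pd.read_csv(io.StringIO(csv_text))
print(df_in.columns.tolist())     # ['Name', 'Age']
print(df_in['Age'].tolist())      # [25, 30]
```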

Summary of Pandas DataFrame

✅ 2D tabular data structure
✅ Can be created from dictionaries, lists, NumPy arrays, etc.
✅ Supports data selection, filtering, and manipulation
✅ Easy import/export with CSV, Excel, SQL, JSON
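The summary mentions NumPy arrays and filtering, which the examples above do not show. As a brief sketch, assuming NumPy is installed and using hypothetical values:

```python
import numpy as np
import pandas as pd

# Hypothetical 3x2 array: one row per person, columns are age and salary
values = np.array([[25, 50000], [30, 60000], [35, 70000]])
df_np = pd.DataFrame(values, columns=['Age', 'Salary'])

# Filtering: keep only the rows where the condition is True
over_28 = df_np[df_np['Age'] > 28]
print(over_28['Salary'].tolist())   # [60000, 70000]
```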