1. Introduction

In real-world datasets, information is often stored in wide format, where each measured category (a subject, a month, a sensor) gets its own column. This layout is easy for people to read, but many analysis and plotting tools expect one observation per row.
The melt() function in pandas converts wide data into long (tidy) format, which makes the data easier to analyze and to pass to plotting libraries such as Seaborn or Matplotlib.


2. Syntax of melt()

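pd.melt(frame,
        id_vars=None,
        value_vars=None,
        var_name=None,
        value_name='value',
        col_level=None,
        ignore_index=True)

  • frame → DataFrame to reshape
  • id_vars → Columns to keep fixed (identifiers)
  • value_vars → Columns to unpivot (default: all columns not listed in id_vars)
  • var_name → Name of the “variable” column (default: the name of the column index if it has one, otherwise variable)
  • value_name → Name of the “value” column (default: value)
  • col_level → Level to melt when the columns are a MultiIndex
  • ignore_index → If True (the default), the result gets a fresh 0..n-1 index; if False, the original index is kept and repeated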
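The value_vars parameter is the easiest one to overlook, so here is a minimal sketch of it in use. The scores frame and the subset_long name are illustrative assumptions, not part of the dataset introduced later.

```python
import pandas as pd

# Illustrative data (an assumption for this sketch).
scores = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Math': [85, 90],
    'Science': [88, 92],
    'English': [80, 85],
})

# Unpivot only Math and Science. English is dropped from the result
# because it appears in neither id_vars nor value_vars.
subset_long = pd.melt(
    scores,
    id_vars=['Name'],
    value_vars=['Math', 'Science'],
    var_name='Subject',
    value_name='Score',
)
print(subset_long)
```

Columns that are named in neither id_vars nor value_vars do not appear in the output at all, which is worth checking when a melted frame has fewer rows than expected.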

3. Example Dataset

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [85, 90, 95],
    'Science': [88, 92, 96],
    'English': [80, 85, 89]
}

df = pd.DataFrame(data)
print(df)

Output:

      Name  Math  Science  English
0    Alice    85       88       80
1      Bob    90       92       85
2  Charlie    95       96       89

4. Applying melt()

(a) Basic Melt

df_melted = pd.melt(df, id_vars=['Name'])
print(df_melted)

Output:

      Name variable  value
0    Alice     Math     85
1      Bob     Math     90
2  Charlie     Math     95
3    Alice  Science     88
4      Bob  Science     92
5  Charlie  Science     96
6    Alice  English     80
7      Bob  English     85
8  Charlie  English     89
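A quick way to sanity-check a melt is to compare shapes: the long frame should contain one row per original row per unpivoted column. The sketch below rebuilds the example data and uses the DataFrame.melt method, which is the method form of pd.melt; the helper names n_value_cols and expected_rows are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [85, 90, 95],
    'Science': [88, 92, 96],
    'English': [80, 85, 89],
})

# Method form; equivalent to pd.melt(df, id_vars=['Name']).
df_melted = df.melt(id_vars=['Name'])

# Every column except 'Name' is unpivoted, so 3 rows x 3 columns = 9 rows.
n_value_cols = df.shape[1] - 1
expected_rows = len(df) * n_value_cols

print(df_melted.shape)                    # (9, 3)
print(len(df_melted) == expected_rows)    # True
```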

(b) Custom Column Names

df_melted = pd.melt(df, 
                    id_vars=['Name'], 
                    var_name='Subject', 
                    value_name='Score')
print(df_melted)

Output:

      Name  Subject  Score
0    Alice     Math     85
1      Bob     Math     90
2  Charlie     Math     95
3    Alice  Science     88
4      Bob  Science     92
5  Charlie  Science     96
6    Alice  English     80
7      Bob  English     85
8  Charlie  English     89

5. Why Use melt()?

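  • Makes datasets tidy (each row = 1 observation, each column = 1 variable).
  • Long data maps directly onto Seaborn's x, y, and hue arguments (for example in sns.barplot or sns.lineplot).
  • Simplifies grouped statistics: a single groupby on the variable column covers every measurement at once.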
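As a small illustration of the analysis benefit, the sketch below computes the mean score per subject with one groupby call on the melted data. The variable names wide, long_df, and mean_by_subject are assumptions made for this example.

```python
import pandas as pd

wide = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [85, 90, 95],
    'Science': [88, 92, 96],
    'English': [80, 85, 89],
})

long_df = pd.melt(wide, id_vars=['Name'],
                  var_name='Subject', value_name='Score')

# One groupby summarises every subject; no per-column code is needed,
# and adding a new subject column to `wide` requires no change here.
mean_by_subject = long_df.groupby('Subject')['Score'].mean()
print(mean_by_subject)
```

The same long_df could also be handed to a Seaborn call such as sns.barplot(data=long_df, x='Subject', y='Score', hue='Name'), assuming Seaborn is installed.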

6. Example with Multiple Identifiers

data = {
    'Student': ['Alice', 'Bob', 'Charlie'],
    'Class': ['A', 'B', 'A'],
    'Math': [85, 90, 95],
    'Science': [88, 92, 96]
}

df = pd.DataFrame(data)
df_melted = pd.melt(df, 
                    id_vars=['Student', 'Class'], 
                    var_name='Subject', 
                    value_name='Score')

print(df_melted)

Output:

   Student Class  Subject  Score
0    Alice     A     Math     85
1      Bob     B     Math     90
2  Charlie     A     Math     95
3    Alice     A  Science     88
4      Bob     B  Science     92
5  Charlie     A  Science     96
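Because both identifier columns are carried into every row, the long frame still holds everything needed to rebuild the wide one. The following is a sketch of that round trip using DataFrame.pivot; it assumes pandas 1.1 or later (where pivot accepts a list for index), and the names wide, long_df, and restored are illustrative.

```python
import pandas as pd

wide = pd.DataFrame({
    'Student': ['Alice', 'Bob', 'Charlie'],
    'Class': ['A', 'B', 'A'],
    'Math': [85, 90, 95],
    'Science': [88, 92, 96],
})

long_df = pd.melt(wide, id_vars=['Student', 'Class'],
                  var_name='Subject', value_name='Score')

# Reverse the melt: identifiers become the index, each Subject value
# becomes a column again, and reset_index turns the identifiers back
# into ordinary columns.
restored = (
    long_df.pivot(index=['Student', 'Class'],
                  columns='Subject', values='Score')
    .reset_index()
)
restored.columns.name = None  # drop the leftover 'Subject' axis label
print(restored)
```

Note that pivot requires each identifier combination to be unique per Subject; with duplicate rows, pivot_table with an aggregation function would be the usual alternative.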

7. Conclusion

The melt() function reshapes wide-format data into long-format data. This transformation is usually the first step toward tidy data, and it makes the result easier to use with grouping, statistical, and visualization tools.
