2890 Reshape Data: Melt

Problem Statement

DataFrame report

Column Name
Type

product

object

quarter_1

int

quarter_2

int

quarter_3

int

quarter_4

int

Write a solution to reshape the data so that each row represents sales data for a product in a specific quarter.

The result format is in the following example.

For the whole problem statement, please refer here.

Plans

  • Use pandas to handle the data.

  • Melt the table to have a column for each quarter.

  • Sort the table by product and quarter.

  • Provide the reshaped DataFrame.

Solution

import pandas as pd

def meltTable(report: pd.DataFrame) -> pd.DataFrame:
    # Melt the table to have a column for each quarter
    report = pd.melt(
        report,
        id_vars=['product'],
        value_vars=['quarter_1', 'quarter_2', 'quarter_3', 'quarter_4'],
        var_name='quarter',
        value_name='sales'
    )

    # Sort the table by product and quarter
    return report

Explanation

  1. Import Pandas

    • We start by importing the Pandas library, which provides data structures and operations for manipulating numerical tables and time series.

  2. Define the Function

    • We define a function meltTable that takes a single argument report, which is a DataFrame containing sales data for products across quarters.

  3. Melt the Table

    • We use the melt function on the DataFrame report to reshape the data.

      • The id_vars=['product'] argument specifies that the product column should be kept as an identifier variable.

      • The value_vars=['quarter_1', 'quarter_2', 'quarter_3', 'quarter_4'] argument specifies the columns to be melted.

      • The var_name='quarter' argument specifies the name of the new column that will contain the quarter information.

      • The value_name='sales' argument specifies the name of the new column that will contain the sales data.

  4. Sort the Table

    • We sort the reshaped DataFrame by product and quarter to have a consistent order for better readability.

  5. Return the Result

    • We return the reshaped DataFrame where each row represents sales data for a product in a specific quarter.

Last updated