2883 Drop Missing Data

Problem Statement

DataFrame students
+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| student_id  | int    |
| name        | object |
| age         | int    |
+-------------+--------+

There are some rows having missing values in the name column.

Write a solution to remove the rows with missing values.

The result format is in the following example.

For the whole problem statement, please refer here.

Plans

  • Import the pandas library to work with DataFrames.

  • Create a function dropMissingData that takes a pandas DataFrame as input.

  • Use the pandas dropna method to drop rows where the name column has a missing value (i.e., where name is None or NaN).

  • Return the resulting DataFrame, which contains only the rows whose name is present.

Solution

import pandas as pd

def dropMissingData(students: pd.DataFrame) -> pd.DataFrame:
    # Drop rows with missing values in the 'name' column
    return students.dropna(subset=['name'])
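
To see the function in action, here is a minimal usage sketch. The sample rows (the ids, names and ages) are made up for illustration and are not taken from the problem statement.

```python
import pandas as pd

def dropMissingData(students: pd.DataFrame) -> pd.DataFrame:
    # Drop rows with missing values in the 'name' column
    return students.dropna(subset=['name'])

# Hypothetical sample data: the second row has no name
students = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["Ava", None, "Noah"],
    "age": [12, 13, 11],
})

result = dropMissingData(students)
print(result)
```

The row with student_id 2 is removed. The remaining rows keep their original index labels (0 and 2); if a fresh 0-based index is wanted, reset_index(drop=True) can be chained onto the result.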

Explanation

  1. Import Pandas

    • We start by importing the pandas library under its conventional alias pd. It provides the DataFrame type used in the function signature and the dropna method used in the body.

  2. Define the Function

    • We define a function dropMissingData that takes a single argument students, which is a DataFrame containing student data.

  3. Dropping Rows with Missing Values

    • We use the dropna method on the DataFrame students to remove rows with missing values in the name column.

    • The subset=['name'] argument restricts the check to the name column, so a missing value in any other column does not cause a row to be dropped.

  4. Return the Result

    • We return the DataFrame produced by dropna. By default dropna returns a new DataFrame and leaves the input students unchanged.
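
The effect of subset can be checked with a small sketch. The frame below is hypothetical (it adds a missing age purely to show the difference), and the boolean-mask variant is an alternative formulation rather than part of the solution above.

```python
import pandas as pd

# Hypothetical data: one missing name and one missing age
df = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["Ava", None, "Noah"],
    "age": [12.0, 13.0, None],
})

# Only rows with a missing name are dropped (student_id 2)
only_name = df.dropna(subset=["name"])

# Without subset, a missing value in any column drops the row (student_id 2 and 3)
any_column = df.dropna()

# Alternative: keep rows where name is not missing, using a boolean mask
mask_based = df[df["name"].notna()]

print(only_name["student_id"].tolist())
print(any_column["student_id"].tolist())
print(len(df))  # the original frame still has all three rows
```

Here only_name and mask_based contain the same rows, while any_column also loses the row with the missing age.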
