Summary of "Handling Date and Time Variables | Day 34 | 100 Days of Machine Learning"

Handling date & time variables (Day 34, 100 Days of ML)

Main idea

Date and time columns are highly informative. Convert columns stored as strings/objects to pandas datetime (pd.to_datetime) so you can extract many useful features (year, month, day, weekday, quarter, time parts, differences, etc.). The original video is a code-focused walkthrough demonstrating how to convert columns, extract features, and compute differences; the accompanying notebook is intended as a future reference.

Key concepts & lessons

Step-by-step methodology

  1. Inspect dtypes

    • Check whether a column is object/string or already datetime:
      • df.info() or df['date'].dtype
  2. Convert string/object to datetime

    • Example:
      • df['date'] = pd.to_datetime(df['date'])
      • df['time'] = pd.to_datetime(df['time']) (if you have a separate time column; the result is a full timestamp with a placeholder date, so use it only for its time parts)
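Steps 1 and 2 can be sketched together as follows. This is a minimal illustration, not the notebook from the video: the sample rows are made up, and the explicit format strings are an assumption about how the data is laid out.

```python
import pandas as pd

# Hypothetical sample data; the column names 'date' and 'time' follow the text above.
df = pd.DataFrame({
    "date": ["2019-12-10", "2018-08-15", "2019-10-23"],
    "time": ["09:15:00", "23:40:05", "00:05:30"],
})

# Step 1: columns loaded from text are not datetime yet.
is_datetime_before = pd.api.types.is_datetime64_any_dtype(df["date"])

# Step 2: convert. An explicit format (assumed here) avoids day/month guessing.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")

# A time-only string still becomes a full timestamp with a placeholder date,
# which is fine when only the hour/minute/second parts are used later.
df["time"] = pd.to_datetime(df["time"], format="%H:%M:%S")

is_datetime_after = pd.api.types.is_datetime64_any_dtype(df["date"])
print(df.dtypes)
```

If some strings may be malformed, pd.to_datetime also accepts errors='coerce', which turns unparseable values into NaT instead of raising.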
  3. Extract calendar / date features (use .dt)

    • Examples:
      • Year: df['year'] = df['date'].dt.year
      • Month (numeric): df['month'] = df['date'].dt.month
      • Month name: df['month_name'] = df['date'].dt.month_name()
      • Day of month: df['day'] = df['date'].dt.day
      • Day of week (numeric): df['dayofweek'] = df['date'].dt.dayofweek (Mon=0..Sun=6)
      • Day name: df['day_name'] = df['date'].dt.day_name()
      • Is weekend: df['is_weekend'] = df['date'].dt.dayofweek.isin([5, 6])
      • Week number: df['week'] = df['date'].dt.isocalendar().week (ISO 8601 week number; .dt.week was deprecated in pandas 1.1 and removed in 2.0)
      • Day of year: df['day_of_year'] = df['date'].dt.dayofyear
      • Quarter: df['quarter'] = df['date'].dt.quarter
      • Semester (example logic): map quarters 1–2 -> semester 1, quarters 3–4 -> semester 2
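The calendar features listed in step 3 can be derived in one pass, as in the sketch below. The sample dates are invented for illustration, and the semester formula is just one way to express the quarter-to-semester mapping described above.

```python
import pandas as pd

# Hypothetical dates: a Tuesday and two Saturdays.
df = pd.DataFrame({
    "date": pd.to_datetime(["2019-12-10", "2018-08-18", "2020-02-29"]),
})

df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["day_name"] = df["date"].dt.day_name()
df["dayofweek"] = df["date"].dt.dayofweek            # Mon=0 .. Sun=6
df["is_weekend"] = df["date"].dt.dayofweek.isin([5, 6])
df["week"] = df["date"].dt.isocalendar().week.astype(int)  # ISO week number
df["quarter"] = df["date"].dt.quarter

# Quarters 1-2 map to semester 1, quarters 3-4 map to semester 2.
df["semester"] = (df["quarter"] + 1) // 2

print(df)
```

Most models expect numeric inputs, so the boolean is_weekend column can be cast with .astype(int), and name columns such as day_name usually need encoding before use.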
  4. Compute differences between dates (timedelta)

    • Example diffs:
      • Elapsed time since a date: diff = pd.to_datetime('today') - df['date']
      • Gap between two date columns: diff = df['date2'] - df['date1']
      • Days: diff.dt.days
      • Total seconds: diff.dt.total_seconds()
      • Convert to hours/minutes: diff.dt.total_seconds() / 3600 (hours) or / 60 (minutes)
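    • diff.dt.days returns only the whole-day component. For the other components (hours, minutes, seconds), use diff.dt.components. Note that diff.dt.seconds is the seconds component within the day (0 to 86399), not the total duration.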
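A short sketch of step 4, using two made-up timestamp columns named date1 and date2 as in the text above. It shows the difference between the whole-day component and the total duration.

```python
import pandas as pd

# Hypothetical start/end timestamps.
df = pd.DataFrame({
    "date1": pd.to_datetime(["2024-01-01 08:00:00", "2024-03-10 12:30:00"]),
    "date2": pd.to_datetime(["2024-01-03 20:00:00", "2024-03-10 14:00:00"]),
})

# Subtracting two datetime columns yields a timedelta Series.
diff = df["date2"] - df["date1"]

df["diff_days"] = diff.dt.days                       # whole days only
df["diff_hours"] = diff.dt.total_seconds() / 3600    # full duration in hours
df["diff_minutes"] = diff.dt.total_seconds() / 60    # full duration in minutes

# Per-component breakdown (days, hours, minutes, seconds, ...) as a DataFrame.
parts = diff.dt.components

print(df[["diff_days", "diff_hours", "diff_minutes"]])
print(parts[["days", "hours", "minutes"]])
```

The first row spans 2 days and 12 hours, so diff_days is 2 while diff_hours is 60.0; use total_seconds() whenever the fractional part matters.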
  5. Work with time-only parts

    • After conversion:
      • Hour: df['hour'] = df['time'].dt.hour
      • Minute: df['minute'] = df['time'].dt.minute
      • Second: df['second'] = df['time'].dt.second
      • Time object only: df['time_only'] = df['time'].dt.time (Python datetime.time objects, so the column has object dtype)
    • To compute time differences in seconds/minutes, use diff.dt.total_seconds() and divide as appropriate.
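Step 5 might look like the sketch below. The timestamps are invented, and the row-to-row gap via Series.diff() is only one possible way to obtain a time difference; subtracting two columns works the same way.

```python
import datetime as dt
import pandas as pd

# Hypothetical timestamps on the same day.
df = pd.DataFrame({
    "time": pd.to_datetime(["2024-05-01 09:15:00", "2024-05-01 23:40:05"]),
})

df["hour"] = df["time"].dt.hour
df["minute"] = df["time"].dt.minute
df["second"] = df["time"].dt.second

# .dt.time yields plain datetime.time objects (object dtype, no .dt accessor).
df["time_only"] = df["time"].dt.time

# Gap to the previous row, in seconds and minutes (first row has no predecessor).
gap = df["time"].diff()
df["gap_seconds"] = gap.dt.total_seconds()
df["gap_minutes"] = gap.dt.total_seconds() / 60

print(df)
```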

Notes & tips

Files / datasets used in the demo

Speakers / sources featured

Category: Educational

