Practical Program: Time Series Data Analytics in Python

This program demonstrates time series data analytics using pandas and Matplotlib. It covers building a sample time series, resampling it, smoothing it with a rolling mean, extracting seasonal subsets, flagging high values, and visualizing the results.
Scenario: Monthly Temperature Analysis
We work with a synthetic dataset of average monthly temperatures from 2015 to 2024, generated at random for demonstration. The goal is to compute yearly averages, smooth the series, and visualize the results.
Code Implementation
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
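# Create a sample time series dataset: 120 month-end dates (Jan 2015 to Dec 2024)
# periods=120 keeps the index the same length as the temperature array
# Note: pandas 2.2 and later prefer freq='ME'; 'M' is the older month-end alias
data = {
    'Date': pd.date_range(start='2015-01-31', periods=120, freq='M'),
    'Temperature': np.random.uniform(20, 35, size=120)  # Random temperatures in [20, 35) °C
}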
# Convert to a DataFrame
df = pd.DataFrame(data)
df.set_index('Date', inplace=True)
# Display the dataset
print("Original Time Series Data:")
print(df.head())
# --- 1. Time Series Visualization ---
plt.figure(figsize=(12, 6))
plt.plot(df.index, df['Temperature'], label='Monthly Avg Temp', color='blue', linewidth=1)
plt.title('Monthly Average Temperature (2015-2024)')
plt.xlabel('Year')
plt.ylabel('Temperature (°C)')
plt.grid(True)
plt.legend()
plt.show()
# --- 2. Resampling to Yearly Data ---
# 'Y' is the year-end alias; pandas 2.2 and later prefer 'YE'
yearly_avg = df.resample('Y').mean()
print("\nYearly Average Temperature:")
print(yearly_avg)
# Plot yearly average temperature
plt.figure(figsize=(10, 5))
plt.plot(yearly_avg.index, yearly_avg['Temperature'], marker='o', linestyle='-', color='green', label='Yearly Avg Temp')
plt.title('Yearly Average Temperature (2015-2024)')
plt.xlabel('Year')
plt.ylabel('Temperature (°C)')
plt.grid(True)
plt.legend()
plt.show()
# --- 3. Rolling Window Analysis ---
# The first 11 values are NaN because a full 12-month window is not yet available
df['Rolling Mean (12-month)'] = df['Temperature'].rolling(window=12).mean()
plt.figure(figsize=(12, 6))
plt.plot(df.index, df['Temperature'], label='Monthly Avg Temp', color='blue', alpha=0.5)
plt.plot(df.index, df['Rolling Mean (12-month)'], label='12-Month Rolling Avg', color='red', linewidth=2)
plt.title('Monthly Temperature with 12-Month Rolling Average')
plt.xlabel('Year')
plt.ylabel('Temperature (°C)')
plt.grid(True)
plt.legend()
plt.show()
# --- 4. Identify and Extract Seasonal Data (e.g., Summer Months) ---
summer_months = df[df.index.month.isin([6, 7, 8])]
print("\nSummer Months Temperature Data:")
print(summer_months.head())
# Seasonal Analysis: Average summer temperature per year
# ('Y' is the year-end alias; pandas 2.2 and later prefer 'YE')
summer_avg = summer_months.resample('Y').mean()
plt.figure(figsize=(10, 5))
plt.bar(summer_avg.index.year, summer_avg['Temperature'], color='orange', alpha=0.7)
plt.title('Average Summer Temperature (2015-2024)')
plt.xlabel('Year')
plt.ylabel('Temperature (°C)')
plt.grid(True)
plt.show()
# --- 5. Detect Trends and Anomalies ---
threshold = 30  # Fixed cutoff in °C; with uniform(20, 35) data about one third of months exceed it
high_temp_anomalies = df[df['Temperature'] > threshold]
print("\nHigh Temperature Anomalies:")
print(high_temp_anomalies)
# Highlight anomalies in the original time series plot
plt.figure(figsize=(12, 6))
plt.plot(df.index, df['Temperature'], label='Monthly Avg Temp', color='blue', linewidth=1)
plt.scatter(high_temp_anomalies.index, high_temp_anomalies['Temperature'], color='red', label='Anomalies', zorder=5)
plt.title('Monthly Temperature with Anomalies')
plt.xlabel('Year')
plt.ylabel('Temperature (°C)')
plt.grid(True)
plt.legend()
plt.show()
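
Section 5 of the program flags high values but does not estimate a trend. One minimal way to do so, sketched here under assumptions that are not part of the program above, is a least-squares straight-line fit with np.polyfit. The synthetic data (seeded normal noise plus an assumed warming of 0.02 °C per month) and the names months_elapsed and Trend are illustrative only, and month-start dates ('MS') are used for simplicity.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (an assumption, not the tutorial's dataset):
# a baseline of 25 °C, an assumed warming of 0.02 °C per month, plus noise.
rng = np.random.default_rng(0)
dates = pd.date_range(start='2015-01-01', periods=120, freq='MS')
months_elapsed = np.arange(len(dates))
temps = 25 + 0.02 * months_elapsed + rng.normal(0, 1.5, size=len(dates))
df = pd.DataFrame({'Temperature': temps}, index=dates)

# Fit temperature ≈ slope * months_elapsed + intercept by least squares
slope, intercept = np.polyfit(months_elapsed, df['Temperature'].to_numpy(), deg=1)

# Store the fitted line so it can be plotted next to the raw series
df['Trend'] = intercept + slope * months_elapsed

print(f"Estimated trend: {slope * 12:.2f} °C per year")
```

On the tutorial's uniform random data the fitted slope should come out close to zero, since no trend is built into that data.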

Explanation of Code
1. Original Data:
   - The dataset holds 120 monthly temperature values covering 2015 to 2024.
   - The values are drawn at random from a uniform distribution between 20 and 35 °C using NumPy, so they carry no real trend or seasonality.
2. Time Series Visualization:
   - Plots the raw monthly temperature data to observe trends and variability.
3. Resampling:
   - Aggregates the data to yearly frequency to compute yearly averages.
   - Useful for identifying long-term trends.
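4. Rolling Mean:
   - Applies a 12-month rolling window to smooth out short-term fluctuations.
   - The first 11 months have no rolling value because the window is not yet full.
   - Makes the overall level of the series easier to see than the raw monthly values.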
5. Seasonal Data Extraction:
   - Filters summer months (June, July, August) for seasonal analysis.
   - Aggregates seasonal data to compute yearly summer averages.
6. Trend and Anomaly Detection:
   - Flags months whose temperature exceeds a fixed threshold (30 °C). This is a simple rule, not a statistical test, and no trend is fitted.
   - Highlights the flagged months in the time series plot.
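
A fixed threshold such as 30 °C flags any warm month, whether or not it is unusual for the series. An alternative, shown here as a sketch rather than as part of the program above, is to flag values that lie far from the series mean in units of standard deviation (a z-score). The data, the two injected spikes, and the cutoff of 3 are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (an assumption): normal noise around 27 °C
rng = np.random.default_rng(1)
dates = pd.date_range(start='2015-01-01', periods=120, freq='MS')
temps = rng.normal(27, 1.0, size=len(dates))

# Inject two hypothetical heat spikes so there is something to find
temps[[30, 90]] += 8.0
series = pd.Series(temps, index=dates, name='Temperature')

# z-score: distance from the mean measured in standard deviations
z_scores = (series - series.mean()) / series.std()

# Flag points more than 3 standard deviations from the mean
# (3 is a common but arbitrary cutoff)
anomalies = series[z_scores.abs() > 3]
print(anomalies)
```

Note that uniformly distributed data, like that generated in the program, can never reach a z-score of 3, so this rule would flag nothing there.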

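Sample Output
Because the temperatures are random and unseeded, the values change on every run. The excerpts below are illustrative only and truncated; values shown for the same date may differ between excerpts.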
Original Time Series Data:
            Temperature
Date                   
2015-01-31    23.452198
2015-02-28    26.783215
2015-03-31    22.234872
2015-04-30    28.992125
2015-05-31    27.211201

Yearly Average Temperature:
            Temperature
Date                   
2015-12-31    27.192839
2016-12-31    26.574392
2017-12-31    27.784561

Summer Months Temperature Data:
            Temperature
Date                   
2015-06-30    29.392181
2015-07-31    30.872291
2015-08-31    28.137201
2016-06-30    27.111202
2016-07-31    30.581921

High Temperature Anomalies:
            Temperature
Date                   
2015-07-31    32.871293
2016-06-30    30.782129
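
The program draws random numbers without a seed, so its printed output cannot be reproduced exactly. The following sketch shows one way to make the data repeatable with a seeded NumPy generator. The helper make_data and the seed 42 are hypothetical, and the dates fall on month starts rather than month ends. It also computes yearly means by grouping on the index year, which avoids resample frequency aliases altogether.

```python
import numpy as np
import pandas as pd

def make_data(seed=42):
    """Build 120 months of random temperatures from a seeded generator."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range(start='2015-01-01', periods=120, freq='MS')
    temps = rng.uniform(20, 35, size=len(dates))
    return pd.DataFrame({'Temperature': temps}, index=dates)

# Two calls with the same seed produce identical data
df_a = make_data()
df_b = make_data()
print("Identical across calls:", df_a.equals(df_b))

# Yearly averages via groupby on the calendar year of the index
yearly_avg = df_a.groupby(df_a.index.year)['Temperature'].mean()
print(yearly_avg)
```

With a fixed seed, any sample output pasted into documentation stays consistent with what readers see when they run the code.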

Visuals Included
1. Monthly Temperature Plot:
   - Shows raw data with fluctuations.
2. Yearly Average Temperature Plot:
   - Shows one average value per year.
3. Rolling Average Plot:
   - Smoothed trends using a 12-month rolling window.
4. Summer Temperature Bar Chart:
   - Highlights average summer temperatures over the years.
5. Anomaly Detection:
   - Marks months with temperatures above the threshold.

Applications
Climate Analysis:
  - Study temperature trends, anomalies, and seasonality.
Sales Forecasting:
  - Analyze seasonal product sales (e.g., ice creams in summer).
Energy Usage:
  - Study temperature effects on energy demand.
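
As a rough illustration of the sales example, grouping by calendar month can summarize the seasonality of any monthly series. The figures below are synthetic and hypothetical (a cosine pattern assumed to peak in July, plus small noise); nothing here comes from real sales data, and the names sales, profile, and peak_month are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly unit sales over four years, assumed to peak in July
rng = np.random.default_rng(7)
dates = pd.date_range(start='2020-01-01', periods=48, freq='MS')
month = dates.month.to_numpy()
seasonal_pattern = 100 + 60 * np.cos(2 * np.pi * (month - 7) / 12)
sales = pd.Series(seasonal_pattern + rng.normal(0, 2, size=len(dates)),
                  index=dates, name='Units')

# Average sales for each calendar month across all years
profile = sales.groupby(sales.index.month).mean()

# Calendar month (1-12) with the highest average sales
peak_month = int(profile.idxmax())
print(profile.round(1))
print("Peak month:", peak_month)
```

The same grouping applied to the temperature data would give an average for each calendar month, which is a simple way to look for seasonality.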

This program provides foundational techniques for time series analysis, helping uncover insights from temporal data.
