datascientistshorya/Bike-Sharing-Regression-Prediction-Models

🚴 Bike Sharing Demand Prediction (OLS & Linear Regression)

📌 Project Overview

This project builds a regression framework to predict bike rental demand using the Bike Sharing dataset. The goal was a model that not only performs well but is also interpretable, giving clear insight into the factors that influence demand. It combines statistical modeling (OLS) with machine learning (Linear Regression) to balance explainability and predictive performance.

🎯 Problem Statement

Accurately predicting bike rental demand is critical for:

Optimizing inventory and availability

Improving operational efficiency

Enhancing customer experience

This project aims to model demand patterns using historical data and uncover key drivers behind rental behavior.

🧠 Workflow & Methodology

  1. Data Preprocessing

Converted and processed date features

Extracted meaningful variables (e.g., month, temporal indicators)

Identified categorical vs numerical features

Handled missing values and ensured data consistency
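The preprocessing steps above could look roughly like the following sketch. The column names (`dteday`, `weathersit`, `temp`, `cnt`) follow the public UCI daily file and are assumptions about this project's data; the rows are invented for illustration, and median imputation is just one possible way to handle missing values.

```python
import numpy as np
import pandas as pd

# Tiny illustrative frame. Column names are assumed (UCI-style); values are made up.
raw = pd.DataFrame({
    "dteday": ["2011-01-01", "2011-01-02", "2011-07-04", "2012-12-25"],
    "weathersit": [1, 2, 1, 3],
    "temp": [0.34, np.nan, 0.73, 0.29],
    "cnt": [985, 801, 6043, 1013],
})

df = raw.copy()

# Convert the date column and derive temporal features from it.
df["dteday"] = pd.to_datetime(df["dteday"])
df["month"] = df["dteday"].dt.month
df["is_weekend"] = (df["dteday"].dt.dayofweek >= 5).astype(int)

# Weather codes are labels, not quantities, so mark them as categorical.
df["weathersit"] = df["weathersit"].astype("category")

# Separate categorical from numerical predictors (the target "cnt" is excluded).
categorical_cols = df.select_dtypes(include="category").columns.tolist()
numerical_cols = df.select_dtypes(include="number").columns.drop("cnt").tolist()

# Fill missing numeric values with the column median (an assumed strategy).
df["temp"] = df["temp"].fillna(df["temp"].median())
```

Here `month` and `is_weekend` land in the numerical list only because of their dtype; whether to treat them as categories is a modeling choice.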

  2. Feature Engineering

Applied One-Hot Encoding to categorical variables → Enabled models to interpret non-numeric categories

Scaled numerical features → Ensured uniform feature contribution and improved model stability
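As a minimal sketch of these two steps, assuming pandas `get_dummies` for the encoding and scikit-learn's `StandardScaler` for the scaling (the README does not name the exact tools or columns used). Dropping the first dummy level is an added assumption here; it avoids perfectly collinear dummy columns when the model has an intercept.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Illustrative data only; "season", "temp" and "hum" are assumed column names.
df = pd.DataFrame({
    "season": ["spring", "summer", "fall", "winter", "summer"],
    "temp": [0.20, 0.60, 0.45, 0.15, 0.70],
    "hum": [0.80, 0.55, 0.60, 0.75, 0.50],
})

# One-hot encode the categorical column; drop_first removes one reference level.
encoded = pd.get_dummies(df, columns=["season"], drop_first=True, dtype=int)

# Standardize numeric columns to zero mean and unit variance.
scaler = StandardScaler()
encoded[["temp", "hum"]] = scaler.fit_transform(encoded[["temp", "hum"]])
```

In a real pipeline the scaler would normally be fitted on the training split only and then applied to the test split.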

  3. Train-Test Split

Split dataset into training and testing sets

Ensured unbiased model evaluation and generalization
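A short sketch of a hold-out split with scikit-learn follows. The 70/30 ratio, the random seed, and the synthetic data are assumptions, since the README does not state them. The scaler is fitted on the training rows only, so no information from the test set leaks into the features.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature matrix and target.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Hold out 30% of the rows for evaluation (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit scaling statistics on the training data, then reuse them on the test data.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```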

  4. OLS Regression (Statistical Modeling)

Built an initial Ordinary Least Squares (OLS) model

Used p-values to evaluate feature significance

Iteratively removed insignificant variables

Addressed multicollinearity (reduced condition number)

📊 Statistical Validation

Residual analysis performed to validate:

Linearity

Normality of residuals

Homoscedasticity

This step ensured the model was statistically sound and interpretable.

  5. Linear Regression (Machine Learning)

Built a sklearn Linear Regression model using selected features

Focused on predictive performance rather than inference

📈 Performance

R² score ≈ 0.76 → the model explains roughly 76% of the variance in demand

Evaluated using absolute error metrics

Model performs well for most observations, with some larger deviations
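A minimal sketch of fitting and scoring a scikit-learn `LinearRegression` follows, on synthetic data, so the numbers it produces are unrelated to the 0.76 reported above. Using mean absolute error as the "absolute error metric" is an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic features and target; the last feature has no effect.
rng = np.random.default_rng(2)
X = rng.normal(size=(400, 4))
y = X @ np.array([5.0, -3.0, 2.0, 0.0]) + rng.normal(scale=2.0, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Score on held-out data only.
r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)

# Indices of the five largest absolute errors, for inspecting big deviations.
abs_err = np.abs(y_test - y_pred)
worst_idx = np.argsort(abs_err)[-5:]

print(f"R^2: {r2:.3f}  MAE: {mae:.3f}")
```

Looking at the rows behind `worst_idx` is one way to find out which observations produce the larger deviations.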

🔍 Key Insights

Demand is strongly influenced by:

🌦️ Weather conditions

📅 Seasonality

🗓️ Temporal patterns (weekends vs working days)

Proper feature selection significantly improves model stability

Removing multicollinearity enhances interpretability

📉 Error Analysis

Residuals show:

Mild heteroscedasticity

Slight skewness

👉 Interpretation:

Model captures overall trends well

Some complex/non-linear patterns remain unmodeled

Larger errors occur in edge cases

⚖️ Model Strengths

✅ Interpretable (thanks to OLS)

✅ Good predictive performance (Linear Regression)

✅ Statistically validated

✅ Handles multicollinearity effectively

✅ Real-world aligned insights

⚠️ Limitations

Assumes linear relationships

Does not fully capture:

Non-linear patterns

Feature interactions

Some prediction errors persist for extreme cases

🚀 Future Improvements

Add interaction terms

Apply non-linear transformations

Experiment with advanced models:

Ridge Regression

Lasso Regression

Random Forest / Tree-based models

Incorporate additional features:

Holidays

External/environmental factors
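Several of these ideas can be compared side by side with cross-validation. The sketch below is not part of the project; the synthetic target deliberately contains an interaction term, and every hyperparameter is an arbitrary assumption. It only shows how interaction terms, regularized models, and a tree-based model could be plugged into the same evaluation loop.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic target that depends on an interaction between two features.
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 3))
y = X[:, 0] + 2.0 * X[:, 0] * X[:, 1] + rng.normal(scale=0.5, size=300)


def with_interactions(estimator):
    """Prepend pairwise interaction terms and scaling to an estimator."""
    return make_pipeline(
        PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),
        StandardScaler(),
        estimator,
    )


candidates = {
    "linear": LinearRegression(),
    "linear + interactions": with_interactions(LinearRegression()),
    "ridge + interactions": with_interactions(Ridge(alpha=1.0)),
    "lasso + interactions": with_interactions(Lasso(alpha=0.01)),
    "random forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated R^2 for each candidate.
scores = {
    name: cross_val_score(est, X, y, cv=5, scoring="r2").mean()
    for name, est in candidates.items()
}

for name, score in scores.items():
    print(f"{name:24s} R^2 = {score:.3f}")
```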

🏁 Conclusion

This project delivers a strong baseline model for predicting bike rental demand. It successfully combines:

Rigorous statistical analysis

Thoughtful feature engineering

Practical machine learning implementation

The result is a model that balances simplicity, interpretability, and performance, making it a solid foundation for further enhancements and real-world deployment.

🛠️ Tech Stack

Python 🐍

Pandas & NumPy

Matplotlib & Seaborn

Statsmodels (OLS)

Scikit-learn

https://www.linkedin.com/in/shorya-bisht-a20144349/

About

This project uses regression to forecast hourly bike rental demand from environmental and seasonal factors. Using the UCI Bike Sharing Dataset, I built and refined OLS and Linear Regression models, addressing multicollinearity along the way; non-linear patterns are only partially captured by these linear models.
