📊 Sales Data Analysis – Customer & Product Insights
This project analyzes a retail superstore's sales data to derive actionable insights into customer behavior, product trends, regional performance, and profit margins. It uses Python, SQL, and data visualization tools to uncover patterns that can inform business decisions.
📁 Project Structure
├── Cleaned_Superstore.csv               # Cleaned dataset used for analysis
├── data_cleaning_processing.ipynb       # Data cleaning & preprocessing using Python (Pandas)
├── EDA.ipynb                            # Exploratory Data Analysis and visualizations
├── SQLqury(project02).ipynb             # SQL queries for business insights
└── masai_project_presentation.pptx      # Final presentation slides
🛠️ Tools & Technologies Used
- Python: Data handling & visualization
- Pandas: Data manipulation
- Matplotlib & Seaborn: Visualizations
- SQL (via SQLite/PostgreSQL/MySQL): Querying and aggregating insights
- Jupyter Notebook: Coding environment
- PowerPoint: Final reporting and presentation
🔍 Key Analysis Performed
📌 Data Cleaning
- Converted date columns to datetime format
- Removed null values and duplicate records
- Derived new features such as Month-Year and Profit Margin
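The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`Order Date`, `Sales`, `Profit`), the helper `clean_sales`, and the toy rows are assumptions, and Profit Margin is assumed to mean profit divided by sales.

```python
import pandas as pd

# Toy rows standing in for the raw export (made-up values, assumed column names).
raw = pd.DataFrame({
    "Order Date": ["2017-11-08", "2017-11-08", "2017-06-12", None],
    "Sales": [261.96, 261.96, 14.62, 22.37],
    "Profit": [41.91, 41.91, 6.87, 2.52],
})


def clean_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop nulls and duplicates, parse dates, derive features."""
    out = df.dropna().drop_duplicates().copy()
    # Convert the date column from text to datetime.
    out["Order Date"] = pd.to_datetime(out["Order Date"])
    # Month-Year as a "YYYY-MM" string, convenient for monthly grouping.
    out["Month-Year"] = out["Order Date"].dt.to_period("M").astype(str)
    # Profit margin as a fraction of sales (assumes Sales is never zero).
    out["Profit Margin"] = out["Profit"] / out["Sales"]
    return out.reset_index(drop=True)


cleaned = clean_sales(raw)
print(cleaned)
```

Dropping rows before parsing dates keeps the example short; on real data it may be preferable to inspect the null rows first rather than discard them outright.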
📌 Exploratory Data Analysis
- Grouped data by Category, Sub-Category, Region
- Identified top customers and profitable segments
- Analyzed quantity, discount, and profit relationships
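A small sketch of how these EDA steps might look in Pandas. The column names and the toy values are assumptions made for illustration and are not taken from `Cleaned_Superstore.csv`; a correlation is only one possible way to look at the discount and profit relationship.

```python
import pandas as pd

# Toy data with assumed Superstore-style column names (values are made up).
df = pd.DataFrame({
    "Customer Name": ["A", "B", "A", "C"],
    "Category": ["Technology", "Furniture", "Technology", "Office Supplies"],
    "Region": ["West", "Central", "East", "West"],
    "Sales": [500.0, 300.0, 200.0, 50.0],
    "Discount": [0.0, 0.4, 0.1, 0.0],
    "Profit": [120.0, -45.0, 30.0, 10.0],
})

# Total sales and profit per category, most profitable first.
by_category = (
    df.groupby("Category")[["Sales", "Profit"]]
    .sum()
    .sort_values("Profit", ascending=False)
)

# Top customers by total sales (top 2 here; the real analysis may use more).
top_customers = df.groupby("Customer Name")["Sales"].sum().nlargest(2)

# Pearson correlation between discount and profit as a first look at their relationship.
discount_profit_corr = df["Discount"].corr(df["Profit"])

print(by_category)
print(top_customers)
print(f"Discount vs. profit correlation: {discount_profit_corr:.2f}")
```

The same `groupby` pattern extends to `Sub-Category` or `Region` by changing the grouping key.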
📌 SQL Insights
- Total sales by customer segment
- Average discount by region
- Region-wise profitability
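The three queries above might look something like the sketch below, run here against an in-memory SQLite database from Python. The table name `superstore`, its column names, and the sample rows are hypothetical; the project's notebook may use a different schema or database engine.

```python
import sqlite3

# In-memory database with a hypothetical schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE superstore ("
    "segment TEXT, region TEXT, sales REAL, discount REAL, profit REAL)"
)
rows = [
    ("Consumer", "West", 500.0, 0.0, 120.0),
    ("Corporate", "Central", 300.0, 0.4, -45.0),
    ("Consumer", "East", 200.0, 0.1, 30.0),
    ("Home Office", "West", 50.0, 0.2, 10.0),
]
conn.executemany("INSERT INTO superstore VALUES (?, ?, ?, ?, ?)", rows)

# Total sales by customer segment, largest first.
sales_by_segment = conn.execute(
    "SELECT segment, SUM(sales) AS total_sales "
    "FROM superstore GROUP BY segment ORDER BY total_sales DESC"
).fetchall()

# Average discount by region.
avg_discount_by_region = conn.execute(
    "SELECT region, AVG(discount) AS avg_discount "
    "FROM superstore GROUP BY region ORDER BY region"
).fetchall()

# Region-wise profitability, most profitable first.
profit_by_region = conn.execute(
    "SELECT region, SUM(profit) AS total_profit "
    "FROM superstore GROUP BY region ORDER BY total_profit DESC"
).fetchall()
conn.close()

print(sales_by_segment)
print(avg_discount_by_region)
print(profit_by_region)
```

These queries use only standard `GROUP BY` aggregates, so they should carry over to PostgreSQL or MySQL with little or no change.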
📌 Visualizations
- Bar Chart: Top 10 Products by Sales
- Line Chart: Monthly Sales Trend
- Pie Chart: Sales by Region
- Boxplot: Profit Distribution by Category
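The four charts listed above could be produced along the following lines. This is a sketch using plain Matplotlib on made-up data; the column names, the helper `build_overview`, and the 2x2 layout are assumptions, and the project's own notebook may style the charts differently (for example with Seaborn).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy data with assumed column names (values are made up).
df = pd.DataFrame({
    "Order Date": pd.to_datetime(["2017-10-03", "2017-11-08", "2017-11-21", "2017-12-05"]),
    "Product Name": ["Phone", "Chair", "Phone", "Binder"],
    "Category": ["Technology", "Furniture", "Technology", "Office Supplies"],
    "Region": ["West", "Central", "East", "West"],
    "Sales": [500.0, 300.0, 200.0, 50.0],
    "Profit": [120.0, -45.0, 30.0, 10.0],
})


def build_overview(data: pd.DataFrame):
    """Hypothetical helper: draw the four charts on one 2x2 figure."""
    fig, axes = plt.subplots(2, 2, figsize=(12, 8))

    # Bar chart: top 10 products by sales (largest at the top).
    top = data.groupby("Product Name")["Sales"].sum().nlargest(10)
    axes[0, 0].barh(top.index[::-1], top.values[::-1])
    axes[0, 0].set_title("Top 10 Products by Sales")

    # Line chart: monthly sales trend.
    monthly = data.groupby(data["Order Date"].dt.to_period("M"))["Sales"].sum()
    axes[0, 1].plot(monthly.index.astype(str), monthly.values, marker="o")
    axes[0, 1].set_title("Monthly Sales Trend")

    # Pie chart: share of sales by region.
    by_region = data.groupby("Region")["Sales"].sum()
    axes[1, 0].pie(by_region.values, labels=by_region.index, autopct="%1.0f%%")
    axes[1, 0].set_title("Sales by Region")

    # Boxplot: profit distribution by category.
    grouped = data.groupby("Category")["Profit"]
    names = [name for name, _ in grouped]
    axes[1, 1].boxplot([values.to_numpy() for _, values in grouped])
    axes[1, 1].set_xticklabels(names)
    axes[1, 1].set_title("Profit Distribution by Category")

    fig.tight_layout()
    return fig


fig = build_overview(df)
```

Calling `fig.savefig("overview.png")` afterwards would write the figure to disk; the file name here is only an example.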
💡 Key Business Insights
- Technology category is the most profitable
- West Region has the highest sales and profit
- Central Region has the lowest average profit
- High-discount products can result in negative profits
- Sales peak during November–December, indicating seasonality