
Python for Data Analysis

Get started with Python, pandas, and data analysis workflows

120 minutes
Updated January 15, 2026


You're about to learn the same tools used by data scientists at Netflix, Airbnb, and Google. By the end, you'll analyze real data, create visualizations, and actually understand what your code is doing.

What You'll Build
A complete analysis pipeline: load, clean, explore, visualize, and report on sales data

Prerequisites

Before starting, make sure you have:

  • Python installed (Mac or Windows)
  • VS Code with Claude Code extension
  • Basic terminal knowledge (cd, ls, mkdir)

Part 1: Project Setup

Every analysis starts with a clean, organized project. Let's build one.

1. Create Project Folder

    Bash
    mkdir sales-analysis
    cd sales-analysis
2. Initialize Git

    Bash
    git init

    Track your work from the start. You'll thank yourself later.

3. Create Virtual Environment

    Virtual environments isolate your project's packages. No more "it works on my machine" problems.
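
    A minimal setup using Python's built-in venv module (the activation command differs by OS):

    Bash
    python -m venv venv
    source venv/bin/activate    # Mac/Linux
    # On Windows: venv\Scripts\activate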

    You'll see (venv) in your prompt—that means it's active.

4. Install the Data Science Stack

    Bash
    pip install pandas numpy matplotlib jupyter seaborn
    Package | What It Does
    pandas | Data manipulation (your main tool)
    numpy | Numerical operations
    matplotlib | Basic plotting
    seaborn | Beautiful statistical plots
    jupyter | Interactive notebooks
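
    If you want to confirm everything installed into the active environment, a quick sanity check:

    Bash
    python -c "import pandas, numpy, matplotlib, seaborn; print(pandas.__version__)"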
5. Create Project Structure

    Bash
    mkdir data notebooks scripts outputs
    touch README.md CLAUDE.md .gitignore
6. Set Up .gitignore

    Don't commit things that shouldn't be committed. Add these lines to .gitignore:

    gitignore
    venv/
    __pycache__/
    *.pyc
    .ipynb_checkpoints/
    data/*.csv
    !data/sample.csv
    outputs/
    .env

Part 2: Loading Data

Let's load some data and see what we're working with.

The Core Workflow

Load the file, check what you have (row count, columns, types), then preview the first rows. The script below walks through each step.

Your First Script

Create scripts/load_data.py:

Python
import pandas as pd

# Load the data
df = pd.read_csv("data/sales.csv")

# What do we have?
print(f"Rows: {len(df):,}")
print(f"Columns: {list(df.columns)}")
print()

# First look
print(df.head())
print()

# Data types and missing values (info() prints directly; no print() needed)
df.info()

Run It

Bash
python scripts/load_data.py

Part 3: Cleaning Data

Real data is messy. Here's how to fix common problems.

Common Issues (And How to Fix Them)

Problem | Solution
Missing values | df.dropna() or df.fillna(value)
Wrong data types | pd.to_datetime(df['date'])
Duplicates | df.drop_duplicates()
Inconsistent text | df['name'].str.lower().str.strip()
Outliers | Filter or cap values (see the sketch below)
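
For outliers, one common approach is capping extreme values at a percentile rather than dropping rows. A minimal sketch (the 'revenue' column and the 99th-percentile cutoff are illustrative choices, not rules):

Python
# Cap revenue at the 99th percentile to limit the influence of extreme values
cap = df['revenue'].quantile(0.99)
df['revenue'] = df['revenue'].clip(upper=cap)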

Ask Claude to Clean Your Data

Prompt
I have sales data with these issues:
- Date column is a string like "Jan 15, 2024"
- Some revenue values are missing
- Product names have inconsistent capitalization
- There are duplicate rows
Write a cleaning function that fixes all of these.

Example Cleaning Script

Python
import pandas as pd

def clean_sales_data(df):
    """Clean the raw sales data."""
    # Work on a copy (don't modify the original)
    df = df.copy()

    # Fix dates
    df['date'] = pd.to_datetime(df['date'])

    # Handle missing revenue (fill with the median)
    df['revenue'] = df['revenue'].fillna(df['revenue'].median())

    # Standardize product names
    df['product'] = df['product'].str.lower().str.strip()

    # Remove duplicates
    df = df.drop_duplicates()

    # Remove obvious errors (zero or negative quantities)
    df = df[df['quantity'] > 0]

    return df

# Usage
df = pd.read_csv("data/sales.csv")
df_clean = clean_sales_data(df)
print(f"Rows before: {len(df)}, after: {len(df_clean)}")

Part 4: Exploring Data

Before you analyze, you need to understand. Here's the exploration toolkit.

Quick Summary

Python
# Overview
df.describe() # Statistics for numeric columns
df.info() # Data types, missing values
df.shape # (rows, columns)
# Specific columns
df['category'].value_counts() # Count by category
df['revenue'].mean() # Average revenue
df['date'].min(), df['date'].max() # Date range
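
df.info() reports non-null counts; to see missing values per column directly:

Python
# Count missing values per column, largest first
df.isna().sum().sort_values(ascending=False)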

Grouping and Aggregation

This is where pandas shines:

Python
# Total revenue by product
df.groupby('product')['revenue'].sum()

# Multiple aggregations
df.groupby('category').agg({
    'revenue': 'sum',
    'quantity': 'mean',
    'order_id': 'count'
})

# Monthly totals
df.groupby(df['date'].dt.month)['revenue'].sum()

Ask Claude for Exploration Help

Prompt
I have sales data with: date, product, category, quantity, price, region.
What are the most important things to explore first?
Write the pandas code for each.

Part 5: Visualizations

A good chart is worth a thousand .head() calls.

Choosing the Right Chart

Line charts show trends over time, bar charts compare categories, and histograms show the distribution of a single variable.

Basic Plots

Python
import matplotlib.pyplot as plt
import seaborn as sns

# Set style
plt.style.use('seaborn-v0_8-whitegrid')

# Line chart (trends)
df.groupby('date')['revenue'].sum().plot(kind='line')
plt.title('Daily Revenue')
plt.savefig('outputs/daily_revenue.png')
plt.show()

# Bar chart (comparisons)
df.groupby('category')['revenue'].sum().plot(kind='bar')
plt.title('Revenue by Category')
plt.savefig('outputs/category_revenue.png')
plt.show()

# Histogram (distribution)
df['revenue'].hist(bins=30)
plt.title('Revenue Distribution')
plt.savefig('outputs/revenue_dist.png')
plt.show()

Ask Claude for Better Charts

Prompt
Create a visualization showing monthly revenue trends.
Make it:
- Easy to read (larger fonts)
- Professional looking (clean style)
- Saved as PNG at 300 DPI
- Include a trend line

Part 6: Putting It Together

Let's build a complete analysis workflow.

The Full Script

Python
"""
Sales Analysis Pipeline
Run with: python scripts/analyze.py
"""
import pandas as pd
import matplotlib.pyplot as plt

# 1. Load
print("Loading data...")
df = pd.read_csv("data/sales.csv")

# 2. Clean
print("Cleaning data...")
df['date'] = pd.to_datetime(df['date'])
df = df.dropna()
df = df[df['quantity'] > 0]

# 3. Analyze
print("Analyzing...")
monthly = df.groupby(df['date'].dt.to_period('M')).agg({
    'revenue': 'sum',
    'quantity': 'sum',
    'order_id': 'count'
}).rename(columns={'order_id': 'num_orders'})

# 4. Visualize
print("Creating charts...")
fig, ax = plt.subplots(figsize=(10, 6))
monthly['revenue'].plot(ax=ax)
ax.set_title('Monthly Revenue', fontsize=14)
ax.set_xlabel('Month')
ax.set_ylabel('Revenue ($)')
plt.tight_layout()
plt.savefig('outputs/monthly_revenue.png', dpi=300)

# 5. Report
print("\n=== KEY FINDINGS ===")
print(f"Total Revenue: ${monthly['revenue'].sum():,.2f}")
print(f"Best Month: {monthly['revenue'].idxmax()}")
print(f"Average Monthly Revenue: ${monthly['revenue'].mean():,.2f}")
print("\nChart saved to outputs/monthly_revenue.png")

Commit Your Work

Bash
git add scripts/analyze.py outputs/
git commit -m "feat: add complete sales analysis pipeline"

Part 7: Jupyter Notebooks

For interactive exploration, Jupyter notebooks are your friend.

Start Jupyter

Bash
jupyter notebook

This opens a browser. Create a new Python 3 notebook.
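
A typical first cell handles imports and loads the data. A minimal sketch (the display option is a convenience, not a requirement, and the path assumes Jupyter was started from the project root):

Python
import pandas as pd
import matplotlib.pyplot as plt

# Show more columns when previewing wide DataFrames
pd.set_option('display.max_columns', 50)

df = pd.read_csv("data/sales.csv")
df.head()  # The last expression in a cell renders as a table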

Notebook Best Practices

Do | Don't
Use markdown headers | Run cells out of order
Keep cells small and focused | Put all code in one cell
Restart and run all before sharing | Leave broken cells
Include explanatory text | Assume code is self-explanatory

Common Patterns

Keep these in your back pocket:

Filter Rows

Python
# Single condition
high_revenue = df[df['revenue'] > 1000]
# Multiple conditions
q4_big_sales = df[(df['date'].dt.quarter == 4) & (df['revenue'] > 500)]
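
The same filters can be written with DataFrame.query, which some find easier to read (the thresholds here are illustrative):

Python
# Equivalent filters expressed as query strings
high_revenue = df.query("revenue > 1000")
big_sales = df.query("revenue > 500 and quantity > 2")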

Create New Columns

Python
# From a calculation
df['profit'] = df['revenue'] - df['cost']
df['profit_margin'] = df['profit'] / df['revenue']

# From a date
df['month'] = df['date'].dt.month
df['day_of_week'] = df['date'].dt.day_name()

# Into categories (numeric bins)
df['size'] = pd.cut(df['revenue'], bins=[0, 100, 500, float('inf')],
                    labels=['small', 'medium', 'large'])

Join DataFrames

Python
# Merge on common column
df_full = pd.merge(orders, customers, on='customer_id')
# Concatenate vertically
all_months = pd.concat([jan_df, feb_df, mar_df])
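
Note that pd.merge defaults to an inner join, which silently drops rows without a match; pass how= to control that:

Python
# Keep every order, even those without a matching customer record
df_full = pd.merge(orders, customers, on='customer_id', how='left')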

Troubleshooting

Problem | Fix
ModuleNotFoundError for pandas or another package | Activate the virtual environment (you should see (venv) in your prompt), then reinstall if needed
FileNotFoundError: data/sales.csv | Run scripts from the project root (sales-analysis/), not from inside scripts/
Chart fails to save | Make sure the outputs/ folder from Part 1 exists

Next Steps

You've got the fundamentals. Here's where to go deeper:

Want to... | Learn
Build dashboards | Streamlit, Dash
More statistics | scipy, statsmodels
Machine learning | scikit-learn
Bigger data | Dask, PySpark
Automate reports | Automation Track
