To get started in data science, you need to master data analysis. Everything else (machine learning, statistics, visualisation) builds on your ability to wrangle and understand data. Get that right, and the rest follows.

What is Minimal Viable Analytics?

Data analysis can feel overwhelming. There are hundreds of pandas functions, dozens of chart types, and countless tutorials pulling you in every direction. But here's the truth: you only need 6 core skills to handle the vast majority of real-world data tasks.

Minimal Viable Analytics (MVA) borrows from the Minimum Viable Product concept. Instead of learning everything, you learn the minimum set of data-handling skills needed to reach an outcome — a decision, an insight, an answer.

Think of it like building a house.

[Diagram: a house. Roof: Decisions & Insights. Pillars: Group & Aggregate, Filter & Slice, Sort, Merge, Create Columns, Create Graphs. Foundation: Your Data.]

The Foundation — Your Data

Every house needs a solid foundation. In data analysis, that foundation is your data — the raw material you'll work with to reach decisions.

This is your starting point. It could be a CSV export, a database table, or an API response. The quality of your analysis depends on knowing your data: What columns do you have? What does each row represent? How clean is it?

Before running any analysis, ask yourself: "What data do I have, and what question am I trying to answer?" Everything else flows from that.
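The questions above (which columns, what a row represents, how clean the data is) can be checked in a few lines. This is a minimal sketch in pandas, assuming a hypothetical CSV export; the column names and values are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV export; column names and values are made up for illustration.
csv_text = """order_id,region,amount
1,North,120.0
2,South,
3,North,80.5
3,North,80.5
"""

df = pd.read_csv(io.StringIO(csv_text))

# What columns do you have, and what type is each?
print(df.dtypes)

# What does each row represent? Look at the first few.
print(df.head())

# How clean is it? Count missing values per column and fully duplicated rows.
missing_per_column = df.isna().sum()
duplicate_rows = int(df.duplicated().sum())
print(missing_per_column)
print("duplicate rows:", duplicate_rows)
```

With a real file you would pass its path to `pd.read_csv` instead of the in-memory string used here.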

The 6 Pillars — Core Skills

The pillars hold up the roof. These are the 6 data-handling operations that cover almost everything you'll need. Each pillar has a dedicated interactive tutorial where you can edit and run real Python code directly in your browser — no setup required.

pandas

1. Grouping & Aggregation: Group rows by category and summarise with counts, sums, or averages. The backbone of any analysis.
2. Filtering & Slicing: Select subsets of data based on conditions. Zero in on exactly the rows you need.
3. Sorting: Order data by one or more columns to find top and bottom values quickly.
4. Merging: Combine datasets from different sources. Join tables together like pieces of a puzzle.
5. Creating Columns: Derive new columns with calculations, binning, and encoding. Transform raw data into useful features.
6. Creating Graphs: Visualise patterns and trends. A good graph makes the data speak for itself.
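To make the pandas pillars concrete, here is a minimal sketch that chains five of them on two small tables. The table names, column names, and the threshold of 100 are assumptions made up for this example. Plotting is left out so the snippet depends only on pandas.

```python
import pandas as pd

# Hypothetical data: table and column names are invented for this sketch.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "customer_id": [10, 11, 10, 12, 11],
    "amount": [120.0, 35.0, 80.0, 200.0, 15.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "region": ["North", "South", "North"],
})

# Pillar 4, merging: attach each customer's region to their orders.
merged = orders.merge(customers, on="customer_id", how="left")

# Pillar 5, creating columns: flag orders of 100 or more as large.
merged["is_large"] = merged["amount"] >= 100

# Pillar 2, filtering: keep only the rows for the North region.
north = merged[merged["region"] == "North"]

# Pillar 1, grouping and aggregation: order count and total amount per region.
by_region = (
    merged.groupby("region", as_index=False)
    .agg(n_orders=("order_id", "count"), total_amount=("amount", "sum"))
)

# Pillar 3, sorting: regions ranked by total amount, largest first.
ranked = by_region.sort_values("total_amount", ascending=False).reset_index(drop=True)
print(ranked)
```

The order of the steps follows the question being asked, not the pillar numbering: here the merge has to come first because the grouping column lives in the second table.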

Polars

1. Grouping & Aggregation: Group rows by category and summarise with group_by and agg expressions. No index headaches.
2. Filtering & Slicing: Select subsets of data with filter expressions, column selection, and slicing.
3. Sorting: Order data by columns, handle nulls, and find top/bottom values with top_k and bottom_k.
4. Joining: Combine datasets with inner, left, full, anti, and semi joins. Native anti/semi support.
5. Creating Columns: Derive new columns with expressions, conditional logic, binning, and one-hot encoding.
6. Creating Graphs: Visualise patterns and trends using matplotlib with Polars DataFrames.

SQL

1. Grouping & Aggregation: Collapse rows into categories with GROUP BY and aggregate with COUNT, SUM, AVG. Filter groups with HAVING.
2. Filtering & Slicing: Select subsets of data with WHERE, IN, BETWEEN, LIKE, and IS NULL. Paginate with LIMIT/OFFSET.
3. Sorting: Order results with ORDER BY, handle NULLs, and find top/bottom values with LIMIT.
4. Joining: Combine tables with INNER JOIN, LEFT JOIN, CROSS JOIN, and the anti-join pattern.
5. Creating Columns: Derive new columns with CASE WHEN, COALESCE, ROUND, CAST, and string functions.
6. Creating Graphs: Query data with SQL and visualise it with Chart.js (bar, line, pie, and doughnut charts).
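The SQL clauses listed above combine naturally in a single query. The sketch below uses Python's built-in `sqlite3` module with an in-memory database so it runs without any setup. The tables, columns, and thresholds are hypothetical, and other SQL engines may differ in small details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables and rows, invented for this sketch.
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES
    (10, 'North'), (11, 'South'), (12, 'North'), (13, 'West'), (14, 'East');
INSERT INTO orders VALUES
    (1, 10, 120.0), (2, 11, 35.0), (3, 10, 80.0),
    (4, 12, 200.0), (5, 11, 15.0), (6, 13, 25.0);
""")

# Join, filter rows (WHERE), group, derive a column (CASE WHEN),
# filter groups (HAVING), then sort and cap the result (ORDER BY, LIMIT).
region_totals = conn.execute("""
    SELECT c.region,
           COUNT(*) AS n_orders,
           ROUND(SUM(o.amount), 2) AS total_amount,
           CASE WHEN SUM(o.amount) >= 100 THEN 'high' ELSE 'low' END AS tier
    FROM orders AS o
    INNER JOIN customers AS c ON c.customer_id = o.customer_id
    WHERE o.amount BETWEEN 20 AND 500
    GROUP BY c.region
    HAVING SUM(o.amount) >= 30
    ORDER BY total_amount DESC
    LIMIT 5
""").fetchall()

# Anti-join pattern: customers with no orders (LEFT JOIN, keep unmatched rows).
no_orders = conn.execute("""
    SELECT c.customer_id
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    WHERE o.order_id IS NULL
""").fetchall()

conn.close()
print(region_totals)
print(no_orders)
```

Note the difference between the two filters: WHERE drops individual orders before grouping (the 15.0 order never reaches the sum), while HAVING drops whole groups afterwards (West, with a total of 25).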

PySpark

1. Grouping & Aggregation: Group rows with groupBy and aggregate with F.count, F.sum, F.avg. Filter groups with a chained .filter().
2. Filtering & Slicing: Select subsets with .filter(), F.col(), .isin(), .between(), .like(), and .isNull().
3. Sorting: Order data with .orderBy(), F.desc(), F.asc(), multi-column sort, and .limit() for top values.
4. Joining: Combine DataFrames with .join() (inner, left, left_anti) and .crossJoin(). Native anti-join support.
5. Creating Columns: Derive new columns with .withColumn(), F.when().otherwise(), F.coalesce(), F.round(), and .cast().
6. Creating Graphs: Convert to pandas with .toPandas() and visualise with matplotlib (bar, line, pie, and scatter charts).

The Roof — Decisions

The roof is what the house is for — shelter. In data analysis, the roof is the decision or insight you reach. The 6 pillars exist to get you there.

With these 6 skills you can answer questions like:

That's the whole point. Data analysis isn't about knowing every tool — it's about reaching a decision with confidence.

Why This Works

MVA is technology-agnostic — the 6 pillars apply whether you use pandas, SQL, Excel, or R. But pandas is a great place to start because:

The reductionist mindset is the key. You don't need to learn 200 pandas functions. You need to learn 6 operations well enough to reach your decisions.

How to Use This Learning Path

Work through the tutorials in order. Each one is interactive — you can edit the code and run it directly in your browser. No Python installation needed.

By the end, you'll have the skills to tackle real data analysis problems with confidence.

Ready to start?

Begin with the first pillar, grouping and aggregation, in the tool of your choice: pandas, Polars, SQL, or PySpark.

Start Pillar 1: Grouping →