To get started in data science, you first need to master data analysis. Everything else (machine learning, statistics, visualisation) builds on your ability to wrangle and understand data. Get that right, and the rest follows.
What is Minimal Viable Analytics?
Data analysis can feel overwhelming. There are hundreds of pandas functions, dozens of chart types, and countless tutorials pulling you in every direction. But here's the truth: you only need 6 core skills to handle the vast majority of real-world data tasks.
Minimal Viable Analytics (MVA) borrows from the Minimum Viable Product concept. Instead of learning everything, you learn the minimum set of data-handling skills needed to reach an outcome — a decision, an insight, an answer.
Think of it like building a house.
The Foundation — Your Data
Every house needs a solid foundation. In data analysis, that foundation is your data — the raw material you'll work with to reach decisions.
This is your starting point. It could be a CSV export, a database table, or an API response. The quality of your analysis depends on knowing your data: What columns do you have? What does each row represent? How clean is it?
Before running any analysis, ask yourself: "What data do I have, and what question am I trying to answer?" Everything else flows from that.
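As a rough sketch, a first look at a dataset in pandas could answer those questions like this. The inline CSV and its column names (store, category, revenue) are invented for illustration; in practice you would point pd.read_csv at your own export.

```python
import io

import pandas as pd

# Hypothetical CSV export; in practice use pd.read_csv("your_file.csv").
raw = io.StringIO(
    "store,category,revenue\n"
    "North,Toys,120.0\n"
    "North,Books,\n"
    "South,Toys,80.5\n"
)
df = pd.read_csv(raw)

# What columns do you have, and what type is each?
column_types = df.dtypes.astype(str).to_dict()

# How many rows are there, and how clean is the data?
row_count = len(df)
missing_per_column = df.isna().sum().to_dict()

print(column_types)
print(row_count, missing_per_column)
```

Here the empty revenue field in the second row shows up as one missing value, which is the kind of thing worth knowing before you aggregate anything.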
The 6 Pillars — Core Skills
The pillars hold up the roof. These are the 6 data-handling operations that cover almost everything you'll need. Each pillar has a dedicated interactive tutorial where you can edit and run real Python code directly in your browser — no setup required.
pandas
Grouping & Aggregation
Group rows by category and summarise with counts, sums, or averages. The backbone of any analysis.
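A minimal pandas sketch of grouping and aggregation, using a small made-up sales table (the column names are assumptions, not the tutorial dataset):

```python
import pandas as pd

# Invented sample data for illustration.
sales = pd.DataFrame({
    "category": ["Toys", "Books", "Toys", "Books", "Games"],
    "revenue": [120.0, 40.0, 80.0, 60.0, 200.0],
})

# One row per category, with a count, a sum, and an average of revenue.
summary = (
    sales.groupby("category")["revenue"]
    .agg(["count", "sum", "mean"])
    .reset_index()
)
print(summary)
```

Calling reset_index() turns the group key back into an ordinary column, which tends to make the result easier to filter, sort, or merge afterwards.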
Filtering & Slicing
Select subsets of data based on conditions. Zero in on exactly the rows you need.
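One way to sketch filtering and slicing in pandas, again with invented sample data:

```python
import pandas as pd

# Invented sample data for illustration.
sales = pd.DataFrame({
    "store": ["North", "South", "North", "East"],
    "category": ["Toys", "Toys", "Books", "Games"],
    "revenue": [120.0, 80.0, 40.0, 200.0],
})

# Keep rows that match a condition by indexing with a boolean mask.
big_sales = sales[sales["revenue"] >= 100]

# Combine conditions with & (and) or | (or); wrap each one in parentheses.
north_toys = sales[(sales["store"] == "North") & (sales["category"] == "Toys")]

# Select rows and columns in one step with .loc.
toy_revenue = sales.loc[sales["category"].isin(["Toys"]), ["store", "revenue"]]

print(big_sales)
print(north_toys)
print(toy_revenue)
```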
Sorting
Order data by one or more columns to find top and bottom values quickly.
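Sorting in pandas might look like the sketch below; the stores and figures are made up for the example.

```python
import pandas as pd

# Invented sample data for illustration.
sales = pd.DataFrame({
    "store": ["North", "South", "East", "West"],
    "revenue": [120.0, 80.0, 200.0, 80.0],
})

# Highest revenue first; ties are broken alphabetically by store.
ranked = sales.sort_values(["revenue", "store"], ascending=[False, True])

# Top values: take the first rows of the sorted frame.
top_two = ranked.head(2)

# Bottom values: nsmallest avoids sorting the whole frame yourself.
bottom_one = sales.nsmallest(1, "revenue")

print(ranked)
```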
Merging
Combine datasets from different sources. Join tables together like pieces of a puzzle.
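A small pandas sketch of merging two tables on a shared key. Both tables and the store_id key are invented for illustration.

```python
import pandas as pd

# Invented sample data: sales facts and a store lookup table.
sales = pd.DataFrame({"store_id": [1, 2, 3], "revenue": [120.0, 80.0, 200.0]})
stores = pd.DataFrame({"store_id": [1, 2, 4], "region": ["North", "South", "West"]})

# Left join: keep every sales row and attach the region where a match exists.
merged = sales.merge(stores, on="store_id", how="left")

# indicator=True adds a _merge column that records where each row matched.
checked = sales.merge(stores, on="store_id", how="left", indicator=True)
unmatched = checked[checked["_merge"] == "left_only"]

print(merged)
print(unmatched)
```

Checking the unmatched rows after a merge is a cheap way to catch keys that are missing from one side.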
Creating Columns
Derive new columns with calculations, binning, and encoding. Transform raw data into useful features.
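The three ideas named above (calculation, binning, encoding) could be sketched in pandas like this. The column names and bin edges are assumptions chosen for the example.

```python
import pandas as pd

# Invented sample data for illustration.
orders = pd.DataFrame({
    "units": [1, 5, 12],
    "unit_price": [10.0, 4.0, 2.5],
    "channel": ["web", "shop", "web"],
})

# Calculation: a new column derived from existing ones.
orders["revenue"] = orders["units"] * orders["unit_price"]

# Binning: turn a number into labelled ranges (0-2, 2-10, 10-100).
orders["size"] = pd.cut(
    orders["units"],
    bins=[0, 2, 10, 100],
    labels=["small", "medium", "large"],
)

# Encoding: one indicator column per channel value.
encoded = pd.get_dummies(orders, columns=["channel"])

print(encoded)
```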
Creating Graphs
Visualise patterns and trends. A good graph makes the data speak for itself.
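As a sketch, a bar chart of revenue by category with pandas and matplotlib might look like this. The data is invented, and the Agg backend is selected only so the example runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Invented sample data for illustration.
sales = pd.DataFrame({
    "category": ["Toys", "Books", "Toys", "Books", "Games"],
    "revenue": [120.0, 40.0, 80.0, 60.0, 150.0],
})

# Summarise first, then plot the summary.
totals = sales.groupby("category")["revenue"].sum().sort_values(ascending=False)

fig, ax = plt.subplots()
ax.bar(totals.index, totals.values)
ax.set_title("Revenue by category")
ax.set_xlabel("Category")
ax.set_ylabel("Revenue")

# Write the image to an in-memory buffer; pass a filename to save to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```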
Polars
Grouping & Aggregation
Group rows by category and summarise with group_by and agg expressions. No index headaches.
Filtering & Slicing
Select subsets of data with filter expressions, column selection, and slicing.
Sorting
Order data by columns, handle nulls, and find top/bottom values with top_k and bottom_k.
Joining
Combine datasets with inner, left, full, anti, and semi joins. Native anti/semi support.
Creating Columns
Derive new columns with expressions, conditional logic, binning, and one-hot encoding.
Creating Graphs
Visualise patterns and trends using matplotlib with Polars DataFrames.
SQL
Grouping & Aggregation
Collapse rows into categories with GROUP BY and aggregate with COUNT, SUM, AVG. Filter groups with HAVING.
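To try the SQL outside the browser, one option is Python's built-in sqlite3 module. The sketch below uses an invented sales table and an arbitrary HAVING threshold of 150.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Toys", 120.0), ("Books", 40.0), ("Toys", 80.0),
     ("Books", 60.0), ("Games", 200.0)],
)

# GROUP BY collapses rows per category; HAVING filters the groups afterwards.
query = """
SELECT category,
       COUNT(*)     AS n_sales,
       SUM(revenue) AS total,
       AVG(revenue) AS average
FROM sales
GROUP BY category
HAVING SUM(revenue) >= 150
ORDER BY category
"""
rows = conn.execute(query).fetchall()
conn.close()

print(rows)
```

WHERE filters rows before grouping, while HAVING filters the groups after aggregation, which is why the revenue threshold sits in HAVING here.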
Filtering & Slicing
Select subsets of data with WHERE, IN, BETWEEN, LIKE, and IS NULL. Paginate with LIMIT/OFFSET.
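The filters listed above can be sketched against an in-memory SQLite database. The table, its rows, and the helper name ids are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, store TEXT, category TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [(1, "North", "Toys", 120.0), (2, "South", "Toys", 80.0),
     (3, "North", "Books", None), (4, "East", "Games", 200.0),
     (5, "Northwest", "Books", 60.0)],
)


def ids(sql):
    """Hypothetical helper: run a query and return the id column as a list."""
    return [row[0] for row in conn.execute(sql)]


# IN and BETWEEN: a set of categories within a revenue range.
in_between = ids(
    "SELECT id FROM sales WHERE category IN ('Toys', 'Games') "
    "AND revenue BETWEEN 100 AND 250 ORDER BY id"
)
# LIKE: pattern match, % stands for any run of characters.
like_north = ids("SELECT id FROM sales WHERE store LIKE 'North%' ORDER BY id")
# IS NULL: rows with a missing value.
missing = ids("SELECT id FROM sales WHERE revenue IS NULL")
# LIMIT/OFFSET: second page of two rows.
page_two = ids("SELECT id FROM sales ORDER BY id LIMIT 2 OFFSET 2")
conn.close()

print(in_between, like_north, missing, page_two)
```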
Sorting
Order results with ORDER BY, handle NULLs, and find top/bottom values with LIMIT.
Joining
Combine tables with INNER JOIN, LEFT JOIN, CROSS JOIN, and the anti-join pattern.
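The anti-join pattern is commonly written as a LEFT JOIN followed by an IS NULL check on the right-hand table. A sketch with Python's sqlite3 module and two invented tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Ana"), (2, "Ben"), (3, "Cy")])
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 20.0), (1, 35.0), (3, 15.0)])

# LEFT JOIN keeps every customer; customers without an order get NULLs
# in the orders columns, so IS NULL selects exactly those customers.
query = """
SELECT c.name
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.customer_id
WHERE o.customer_id IS NULL
ORDER BY c.name
"""
no_orders = [row[0] for row in conn.execute(query)]
conn.close()

print(no_orders)
```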
Creating Columns
Derive new columns with CASE WHEN, COALESCE, ROUND, CAST, and string functions.
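A sketch of derived columns in SQL, run through Python's sqlite3 module. The orders table, the discount column, and the size thresholds are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (units INTEGER, unit_price REAL, discount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 9.99, None), (5, 4.5, 0.1), (12, 2.25, None)],
)

# COALESCE replaces a missing discount with 0, ROUND tidies the result,
# CASE WHEN bins units into labels, and CAST truncates the price to an integer.
query = """
SELECT units,
       ROUND(units * unit_price * (1 - COALESCE(discount, 0)), 2) AS revenue,
       CASE WHEN units >= 10 THEN 'large'
            WHEN units >= 3  THEN 'medium'
            ELSE 'small'
       END AS size,
       CAST(unit_price AS INTEGER) AS whole_price
FROM orders
ORDER BY units
"""
rows = conn.execute(query).fetchall()
conn.close()

print(rows)
```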
Creating Graphs
Query data with SQL and visualise it with Chart.js — bar, line, pie, and doughnut charts.
PySpark
Grouping & Aggregation
Group rows with groupBy and aggregate with F.count, F.sum, F.avg. Filter groups with a chained .filter().
Filtering & Slicing
Select subsets with .filter(), F.col(), .isin(), .between(), .like(), and .isNull().
Sorting
Order data with .orderBy(), F.desc(), F.asc(), multi-column sort, and .limit() for top values.
Joining
Combine DataFrames with .join() — inner, left, left_anti, and .crossJoin(). Native anti-join support.
Creating Columns
Derive new columns with .withColumn(), F.when().otherwise(), F.coalesce(), F.round(), and .cast().
Creating Graphs
Convert to pandas with .toPandas() and visualise with matplotlib — bar, line, pie, and scatter charts.
The Roof — Decisions
The roof is what the house is for — shelter. In data analysis, the roof is the decision or insight you reach. The 6 pillars exist to get you there.
With these 6 skills you can answer questions like:
- Which product category generates the most revenue?
- How has customer churn changed month over month?
- Which stores should we expand, and which should we consolidate?
- What's the trend in EV vs diesel car registrations?
That's the whole point. Data analysis isn't about knowing every tool — it's about reaching a decision with confidence.
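To make that concrete, here is a sketch of how the first question (which product category generates the most revenue?) could be answered in pandas by chaining four of the pillars. The data, the column names, and the rule of excluding returned items are all invented for illustration.

```python
import pandas as pd

# Invented sample data for illustration.
sales = pd.DataFrame({
    "category": ["Toys", "Books", "Toys", "Games", "Books"],
    "units": [3, 10, 2, 1, 4],
    "unit_price": [20.0, 8.0, 20.0, 150.0, 8.0],
    "returned": [False, False, True, False, False],
})

# Filtering: drop returned items (an assumed business rule).
kept = sales[~sales["returned"]]

# Creating columns: revenue per row.
kept = kept.assign(revenue=kept["units"] * kept["unit_price"])

# Grouping and sorting: total revenue per category, largest first.
by_category = (
    kept.groupby("category")["revenue"]
    .sum()
    .sort_values(ascending=False)
)

# The decision: the category at the top of the list.
top_category = by_category.index[0]
print(by_category)
print(top_category)
```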
Why This Works
MVA is technology-agnostic — the 6 pillars apply whether you use pandas, SQL, Excel, or R. But pandas is a great place to start because:
- It's free and runs everywhere (including in your browser with our tutorials)
- The syntax maps cleanly to the 6 pillar operations
- It handles small and medium datasets with ease
- It's the most widely used data analysis library in Python
The reductionist mindset is the key. You don't need to learn 200 pandas functions. You need to learn 6 operations well enough to reach your decisions.
How to Use This Learning Path
Work through the tutorials in order. Each one is interactive — you can edit the code and run it directly in your browser. No Python installation needed.
- Start with Pillar 1 (Grouping) — it introduces the dataset and core patterns
- Work through Pillars 2-5 — each builds on the same fundamentals
- Finish with Pillar 6 (Graphs) — bring your analysis to life visually
By the end, you'll have the skills to tackle real data analysis problems with confidence.
Ready to start?
Begin with the first pillar: grouping and aggregation in pandas.
Start Pillar 1: Grouping →
Begin with the first pillar: grouping and aggregation with Polars.
Start Pillar 1: Grouping →
Begin with the first pillar: grouping and aggregation with SQL.
Start Pillar 1: Grouping →
Begin with the first pillar: grouping and aggregation with PySpark.
Start Pillar 1: Grouping →