Filtering and slicing data is a critical skill to master in data science and data analysis. Every data professional needs to access individual data points, slice subsets, and filter rows based on conditions. Pandas provides two powerful methods for this: iloc (integer-location based) and loc (label-based). In this article you will learn:
- How to use iloc to slice data by position
- How to use loc to slice data by label
- How to conditionally filter DataFrames with iloc and loc
All examples use an Australian petrol station dataset that you can edit and run directly in your browser — no install required.
The dataset
We will use a petrol station dataset with fuel transactions recorded across Australia. Each row captures a station_id, station name, state, fuel_type, litres sold, and price_per_litre. Some values are intentionally missing to mirror real-world data quality issues. We set station_id as the index so we can demonstrate the difference between iloc and loc.
The DataFrame has five columns and a custom station_id index (F001 to F015). This index is important — iloc ignores it and uses integer positions, while loc uses these labels directly.
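The original data is not reproduced here, so the sketch below builds a small stand-in with the same shape: five columns and a station_id index running from F001 to F015. Every value in it is invented for illustration, including the third station name (BP Fremantle) and the fuel type labels.

```python
import pandas as pd

# Hypothetical stand-in for the article's dataset: all values are made up.
n = 15
stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
states = ["NSW", "QLD", "WA"]
fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None marks a missing value

df = pd.DataFrame({
    "station_id": [f"F{i + 1:03d}" for i in range(n)],
    "station": [stations[i % 3] for i in range(n)],
    "state": [states[i % 3] for i in range(n)],
    "fuel_type": [fuels[i % 5] for i in range(n)],
    "litres": [30.0 + i for i in range(n)],
    "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
}).set_index("station_id")  # station_id becomes the row labels

print(df.shape)                   # (15, 5)
print(df.index[0], df.index[-1])  # F001 F015
print(df.head())
```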
How to use iloc to slice data?
With iloc, every cell in the DataFrame is referenced by its integer position. Rows are numbered starting from zero at the top, and columns are numbered starting from zero on the left. Let's access the value at row 0, column 0 — that should be the first station name.
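A minimal sketch of that lookup, assuming a made-up stand-in for the petrol station data (the make_df helper and all of its values are hypothetical):

```python
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

# Row position 0, column position 0: the first station name.
first_cell = df.iloc[0, 0]
print(first_cell)  # Caltex Bondi (in this stand-in data)
```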
Note that indexing starts from zero, not one. Now let's slice a range. Say you want rows F006, F007, and F008 (integer positions 5, 6, 7) and columns station, state, and fuel_type (positions 0, 1, 2). With iloc, the end of the range is excluded — so you write 5:8 and 0:3.
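The slice described above could look like this. It is a sketch on invented stand-in data, with a hypothetical make_df helper rebuilt here so the snippet runs on its own:

```python
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

# Rows at positions 5, 6, 7 and columns at positions 0, 1, 2.
# The stop values (8 and 3) are excluded.
subset = df.iloc[5:8, 0:3]
print(list(subset.index))    # ['F006', 'F007', 'F008']
print(list(subset.columns))  # ['station', 'state', 'fuel_type']
```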
Be careful with ranges: the stop position is not included, so the row at position 8 and the column at position 3 are left out of the result. What if you don't want every row in a range, but specific rows and columns? You can pass lists of integer positions.
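As an illustration, the sketch below picks two rows and two columns by position. The positions chosen here (rows 5 and 7, columns 0 and 2) are arbitrary, and the data is an invented stand-in built by a hypothetical make_df helper:

```python
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

# Lists of integer positions select exactly those rows and columns.
picked = df.iloc[[5, 7], [0, 2]]
print(list(picked.index))    # ['F006', 'F008']
print(list(picked.columns))  # ['station', 'fuel_type']
```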
A lesser-known feature of iloc is that you can pass boolean lists instead of integer lists. The boolean list must have the same length as the number of rows (or columns). Let's reproduce the same result using True/False values.
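Here is one way to write that, again on invented stand-in data (make_df is a hypothetical helper). The masks are built to match an arbitrary selection of rows 5 and 7 and columns 0 and 2:

```python
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

# One boolean per row: True only at positions 5 and 7.
row_mask = [i in (5, 7) for i in range(len(df))]
# One boolean per column: keep station (0) and fuel_type (2).
col_mask = [True, False, True, False, False]

by_bool = df.iloc[row_mask, col_mask]
by_int = df.iloc[[5, 7], [0, 2]]
print(by_bool.equals(by_int))  # True
```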
The boolean approach is less common with iloc directly, but it becomes very useful when we get to conditional filtering later. Let's move on to loc.
How to use loc to slice data?
The loc method uses labels — index values and column names — instead of integer positions. Let's access the same first cell, but this time using the index label 'F001' and the column name 'station'.
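A short sketch of the label-based lookup, assuming the same kind of made-up stand-in data (make_df and its values are hypothetical):

```python
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

# Index label 'F001' and column name 'station' point at the same cell
# that iloc reaches with positions (0, 0).
by_label = df.loc["F001", "station"]
print(by_label)                   # Caltex Bondi (in this stand-in data)
print(by_label == df.iloc[0, 0])  # True
```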
Same result. Now let's slice a range with loc. One key difference: loc ranges are inclusive — both the start and end values are included in the result.
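The inclusive behaviour can be checked with a sketch like this one, run on invented stand-in data from a hypothetical make_df helper:

```python
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

# Label slices include both endpoints: F008 and fuel_type are in the result.
by_label = df.loc["F006":"F008", "station":"fuel_type"]
print(list(by_label.index))    # ['F006', 'F007', 'F008']
print(list(by_label.columns))  # ['station', 'state', 'fuel_type']

# The equivalent positional slice needs stop values one past the end.
print(by_label.equals(df.iloc[5:8, 0:3]))  # True
```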
Notice that F008 and fuel_type are both included — unlike iloc where the end of the range is excluded. You can also pass lists of specific labels and column names.
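For example, the sketch below selects two arbitrary labels and two columns by name. The data is an invented stand-in and make_df is a hypothetical helper:

```python
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

# Lists of labels pick exactly those rows and columns, in the order given.
picked = df.loc[["F006", "F008"], ["station", "fuel_type"]]
print(list(picked.index))    # ['F006', 'F008']
print(list(picked.columns))  # ['station', 'fuel_type']
```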
Most things iloc can do, loc can do too — just with labels instead of integers. Now let's look at the real power: conditional filtering.
How to conditionally filter the DataFrame?
Conditional filtering lets you pass a logical expression to loc or iloc to select only the rows that meet your criteria. Say the business wants all transactions from the Caltex Bondi station.
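A minimal sketch of that filter with loc, assuming invented stand-in data (in the hypothetical make_df helper, every third row belongs to Caltex Bondi):

```python
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

# The comparison yields a boolean Series aligned with the index;
# loc keeps the rows where it is True.
mask = df["station"] == "Caltex Bondi"
caltex = df.loc[mask]
print(list(caltex.index))  # ['F001', 'F004', 'F007', 'F010', 'F013']
```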
The same can be achieved with iloc. The only difference is that iloc does not accept a boolean Series as a mask, so you need to convert it to a list (or a NumPy array) first.
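Sketched on invented stand-in data (make_df is a hypothetical helper), the iloc version might look like this:

```python
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

mask = df["station"] == "Caltex Bondi"

# Convert the boolean Series to a plain list before handing it to iloc.
by_iloc = df.iloc[mask.tolist()]
print(list(by_iloc.index))             # ['F001', 'F004', 'F007', 'F010', 'F013']
print(by_iloc.equals(df.loc[mask]))    # True
```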
Now let's add a second condition. The business wants to find all Caltex Bondi transactions where fuel_type is missing (NaN). We combine conditions with & (and) and wrap each condition in parentheses.
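The combined condition could be written as below. This is a sketch on invented stand-in data, where the hypothetical make_df helper leaves fuel_type missing on a few rows, one of which belongs to Caltex Bondi:

```python
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

# Each condition sits in its own parentheses because & binds more
# tightly than == in Python.
mask = (df["station"] == "Caltex Bondi") & (df["fuel_type"].isna())
missing_fuel = df.loc[mask]
print(list(missing_fuel.index))  # ['F013']
```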
Updating filtered values
Filtering is not just for reading data; you can also update values. Say the Shell Fortitude Valley station has stopped selling fuel, and we want to set litres and price_per_litre to NaN for all of its transactions.
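A sketch of that update with loc, on invented stand-in data built by a hypothetical make_df helper:

```python
import numpy as np
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

mask = df["station"] == "Shell Fortitude Valley"

# Assign through a single loc call: rows chosen by the mask,
# columns chosen by name.
df.loc[mask, ["litres", "price_per_litre"]] = np.nan

print(df.loc[mask, ["station", "litres", "price_per_litre"]])
print(df["litres"].isna().sum())  # 5 rows were blanked in this stand-in data
```

Assigning through one loc call, rather than filtering first and assigning to the result, writes to the original DataFrame instead of a temporary copy.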
The same can be achieved with iloc — just replace column names with integer positions and convert the boolean condition to a list.
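The iloc variant might look like the sketch below. It assumes litres and price_per_litre sit at column positions 3 and 4, which holds for the invented stand-in data built by the hypothetical make_df helper:

```python
import numpy as np
import pandas as pd

def make_df(n=15):
    """Hypothetical stand-in for the article's dataset (all values invented)."""
    stations = ["Caltex Bondi", "Shell Fortitude Valley", "BP Fremantle"]
    states = ["NSW", "QLD", "WA"]
    fuels = ["Unleaded", "Diesel", None, "Premium", "E10"]  # None = missing
    return pd.DataFrame({
        "station_id": [f"F{i + 1:03d}" for i in range(n)],
        "station": [stations[i % 3] for i in range(n)],
        "state": [states[i % 3] for i in range(n)],
        "fuel_type": [fuels[i % 5] for i in range(n)],
        "litres": [30.0 + i for i in range(n)],
        "price_per_litre": [round(1.80 + 0.01 * i, 2) for i in range(n)],
    }).set_index("station_id")

df = make_df()

mask = df["station"] == "Shell Fortitude Valley"

# Rows: the boolean condition as a plain list.
# Columns: integer positions 3 (litres) and 4 (price_per_litre).
df.iloc[mask.tolist(), [3, 4]] = np.nan

print(df.loc[mask, ["station", "litres", "price_per_litre"]])
```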
Which method should you use?
Both iloc and loc can achieve the same results. The choice depends on your situation:
- iloc: best when you know the position of the data you want. Ideal for iterating through rows, selecting the first/last N rows, or working with DataFrames where the index is not meaningful.
- loc: best when you know the labels. More readable and less error-prone when columns have clear names. The go-to choice for conditional filtering.
- Conditional filtering: use loc for readability. You can combine multiple conditions with & (and) and | (or), and use .isna() to find missing values.
Try editing the code blocks above to experiment — filter by state, slice different column ranges, or add your own conditions to see how each method behaves.
References
- Original article: How to filter and slice data with Pandas? — Medium
- pandas documentation: pandas.DataFrame.iloc
- pandas documentation: pandas.DataFrame.loc
- pandas documentation: Indexing and selecting data