How do you filter rows in Polars?

Use df.filter(pl.col('column') == 'value') to filter rows in Polars. The filter() method accepts expression-based conditions built with pl.col(). You can combine multiple conditions with & (AND) and | (OR), and each condition must be wrapped in parentheses.

What is the Polars equivalent of pandas isin()?

The Polars equivalent of pandas isin() is is_in(). Use df.filter(pl.col('column').is_in(['value1', 'value2'])) to filter rows where a column value matches any item in a list.

How do you select specific columns in Polars?

Use df.select(['col1', 'col2', 'col3']) to select specific columns in Polars. This returns a new DataFrame containing only the requested columns. You can also use expressions inside select() for computed columns.

How do you slice rows by position in Polars?

Use df.slice(offset, length) or Python bracket notation df[0:3] to slice rows by position in Polars. For example, df.slice(0, 3) returns the first three rows. This is the equivalent of pandas df.iloc[0:3].

How to Filter and Slice Data with Polars — filter, select, slice Guide

Filtering is the most fundamental operation in data analysis. Whether you need transactions from a single state, rows matching multiple criteria, or just the first few records, Polars gives you a clean, expression-based API that avoids the quirks of pandas indexing.

If you have used pandas, you have probably written df.loc[df['col'] > 50] or wrestled with .iloc vs .loc distinctions. Polars replaces all of that with .filter() for row selection, .select() for column selection, and .slice() for positional access — no index to worry about.

This article covers seven patterns, each demonstrated on an interactive dataset you can edit and run directly in your browser:

filter() with a single condition — basic row filtering
filter() with multiple conditions using & — AND logic
filter() with is_in() — membership testing
select() — column selection
slice() and bracket indexing — row slicing by position
filter() with is_null() — finding missing values
Combining filter() and select() — rows and columns together

The dataset

We will use the same petrol station dataset from the Polars grouping tutorial. Each row represents a fuel transaction recorded at stations across Australia. The columns capture the station name, fuel type, litres sold, price per litre, and the state where the station is located. Some values are intentionally missing (null) to mirror real-world data quality issues.

Python — editable

import polars as pl

fuel = {
    "station": [
        'Caltex Bondi','Caltex Bondi','Caltex Bondi',
        'BP Southbank','BP Southbank','BP Southbank','BP Southbank',
        'Shell Fortitude Valley','Shell Fortitude Valley','Shell Fortitude Valley',
        'Caltex Bondi','Caltex Bondi',
        'BP Southbank','BP Southbank','BP Southbank'
    ],
    "fuel type": [
        'Unleaded','Diesel','Premium',
        'Unleaded','Unleaded','Diesel','Premium',
        'Diesel','Unleaded','Premium',
        'Diesel','Unleaded',
        'Diesel','Premium','Unleaded'
    ],
    "litres": [
        45.2, 60.0, 38.5,
        52.1, 47.8, None, 41.0,
        55.3, 44.9, None,
        58.7, 40.1,
        63.2, 35.6, 49.0
    ],
    "price per litre": [
        1.89, 1.95, 2.12,
        1.85, 1.85, 1.92, 2.09,
        1.93, None, 2.15,
        1.95, 1.89,
        1.92, 2.09, 1.85
    ],
    "state": [
        'NSW','NSW','NSW',
        'VIC','VIC','VIC','VIC',
        'QLD','QLD', None,
        'NSW','NSW',
        'VIC','VIC','VIC'
    ]
}

df = pl.DataFrame(fuel)
df

Figure 1: Fuel transactions — 15 rows, 5 columns.

The dataset has 15 transactions spread across three Australian petrol stations: Caltex Bondi (NSW), BP Southbank (VIC) and Shell Fortitude Valley (QLD). Let's start by filtering rows for a single state.

Basic filtering with `filter()`

The simplest filter selects rows where a column matches a value. Use pl.col() to reference the column and a comparison operator to define the condition:

Python — editable

df.filter(pl.col('state') == 'NSW')

Figure 2: Only NSW transactions — 5 rows returned.

This returns all rows where state equals 'NSW'. The syntax is similar to pandas (df[df['state'] == 'NSW']), but Polars wraps it in a dedicated .filter() method that takes an expression. The result is always a DataFrame — never a copy-on-write view or an ambiguous slice.

Multiple conditions with `&` (AND)

To combine conditions, use & for AND and | for OR. Each condition must be wrapped in parentheses:

Python — editable

df.filter(
    (pl.col('state') == 'VIC') & (pl.col('fuel type') == 'Diesel')
)

Figure 3: VIC Diesel transactions only.

The parentheses around each condition are required because Python's & operator has higher precedence than ==. Without them, Python tries to evaluate 'VIC' & pl.col('fuel type') first, and the filter fails with a confusing error. This is the same rule as pandas, but Polars makes it slightly more readable by keeping everything inside .filter() rather than using bracket notation.

You can also use | for OR logic. For example, df.filter((pl.col('state') == 'NSW') | (pl.col('state') == 'QLD')) would return rows from either state — though for multiple values, is_in() is cleaner.

Membership testing with `is_in()`

When you need to check if a column value is in a list of options, use .is_in() instead of chaining multiple OR conditions:

Python — editable

df.filter(pl.col('fuel type').is_in(['Diesel', 'Premium']))

Figure 4: Diesel and Premium transactions — Unleaded excluded.

This is the Polars equivalent of pandas df[df['fuel type'].isin(['Diesel', 'Premium'])]. The method name is is_in (with an underscore) rather than isin. It accepts a collection of values, such as a list or tuple, as well as a Polars Series.

Column selection with `select()`

To pick specific columns from a DataFrame, use .select(). Pass a list of column names:

Python — editable

df.select(['station', 'litres', 'price per litre'])

Figure 5: Three columns selected — station, litres, price per litre.

In pandas, df[['col1', 'col2']] is the usual way to select columns. Polars also accepts bracket indexing with column names, but the explicit .select() method is the idiomatic choice: it states the intent clearly, takes expressions as well as names, and avoids the ambiguity that plagues pandas when a single bracket could mean either rows or columns.

You can also use expressions inside .select() for computed columns, but for simple column picking, a list of names is all you need.

Row slicing with `slice()`

To get rows by position (like pandas .iloc), use .slice(offset, length) or Python's bracket notation:

Python — editable

# Both produce the same result: first 3 rows
print("Using df[0:3]:")
print(df[0:3])
print("\nUsing df.slice(0, 3):")
print(df.slice(0, 3))

Figure 6: First 3 rows using bracket slicing and .slice().

df.slice(0, 3) takes two arguments: the starting offset and the number of rows to return (a length, not an end position). The bracket notation df[0:3] works the same way as standard Python slicing, with a start and an exclusive end. Both are the Polars equivalent of pandas df.iloc[0:3].

Note that Polars does not have .iloc or .loc — it uses .slice() for positional access and .filter() for conditional access. This simpler model eliminates the common pandas confusion between label-based and position-based indexing.

Filtering for null values

Real-world data has missing values. Use .is_null() to find rows where a column is null, or .is_not_null() to exclude them:

Python — editable

df.filter(pl.col('litres').is_null())

Figure 7: Rows where litres is null — missing volume data.

This is the Polars equivalent of pandas df[df['litres'].isna()]. Polars uses null (not NaN) for missing data, and the method names use underscores: is_null() and is_not_null(). You can combine null checks with other conditions — for example, df.filter(pl.col('litres').is_not_null() & (pl.col('state') == 'VIC')) to get VIC rows with valid litre values.

Combining `filter()` and `select()`

The real power comes from chaining operations. Filter rows first, then select columns — or vice versa:

Python — editable

df.filter(
    (pl.col('state') == 'NSW') & (pl.col('litres').is_not_null())
).select(
    ['station', 'fuel type', 'litres']
)

Figure 8: NSW transactions with valid litres — 3 columns selected.

This is the Polars equivalent of the pandas pattern df.loc[mask, ['col1', 'col2']]. In Polars you chain .filter() for the row condition and .select() for the column list. The result is always a clean DataFrame, and the order of operations is explicit and readable.

Polars vs pandas: filtering compared

If you are coming from pandas, here is how the key filtering operations map:

df.loc[df['col'] > 50] → df.filter(pl.col('col') > 50)
df[df['col'].isin([...])] → df.filter(pl.col('col').is_in([...]))
df.loc[mask, ['col1','col2']] → df.filter(mask).select(['col1','col2'])
df.iloc[0:3] → df.slice(0, 3) or df[0:3]

The core advantage: Polars eliminates the .loc vs .iloc confusion. There is one method for conditional row selection (.filter()), one for column selection (.select()), and one for positional slicing (.slice()). No index to manage, no ambiguous bracket behaviour.

Try editing the code blocks above — change the filter column to price per litre, add an OR condition with |, or combine is_in() with select() to see how each pattern behaves.


Suhith Illesinghe

Curiosity is the first step to make a difference. I hope to inspire others to explore, build and champion collaborative growth.
