How do you create a new column in Polars?

Use df.with_columns() to create new columns in Polars. Pass an expression like pl.lit('value').alias('col_name') for constants, or (pl.col('a') * pl.col('b')).alias('result') for computed columns. with_columns always returns a new DataFrame without modifying the original.

What is the Polars equivalent of pandas df['new_col'] = value?

In Polars, use df.with_columns(pl.lit(value).alias('new_col')). Unlike pandas, Polars DataFrames are immutable, so with_columns returns a new DataFrame rather than modifying in place. Use pl.lit() to wrap scalar values into column expressions.

How do you create a conditional column in Polars?

Use pl.when(condition).then(value_if_true).otherwise(value_if_false).alias('col_name') inside with_columns. This is the Polars equivalent of numpy.where() or pandas apply with if/else. You can chain multiple .when().then() calls for multi-condition logic.

How do you one-hot encode columns in Polars?

Use df.to_dummies(columns=['column_name']) to one-hot encode categorical columns in Polars. This is equivalent to pandas pd.get_dummies(df, columns=['column_name']). Each unique value becomes a binary column with 1 or 0.

Can you create multiple columns at once in Polars?

Yes. Pass multiple expressions to a single with_columns() call: df.with_columns(expr1, expr2, expr3). Each expression creates or overwrites a column. This is more efficient than chaining multiple with_columns calls because Polars can optimise the expressions together.

How to Create Columns with Polars — with_columns, lit, when/then Guide

Creating new columns is one of the most fundamental operations in data analysis. Whether you are adding a constant label, computing a derived value from existing columns, or categorising rows based on conditions, column creation is something you will do in virtually every project.

If you have used pandas, you are probably used to assigning columns with df['new'] = value or df.assign(). Polars takes a different approach. Because Polars DataFrames are immutable, you use with_columns() to return a new DataFrame with the added columns. This might feel unusual at first, but it leads to cleaner, more composable code.

This article covers seven patterns for creating columns in Polars, each demonstrated on an interactive dataset you can edit and run directly in your browser:

pl.DataFrame() — create the dataset
pl.lit() — add a constant column
Arithmetic expressions — compute a column from existing columns
pl.when().then().otherwise() — conditional column
.cut() — bin continuous values into categories
.to_dummies() — one-hot encoding
Multiple columns in a single with_columns() call

The dataset

We will use a small petrol station dataset. Each row represents a fuel transaction recorded at stations across Australia. The columns capture the station name, fuel type, litres sold, price per litre, and the state where the station is located. Some values are intentionally missing (None) to mirror real-world data quality issues.

Python — editable

import polars as pl

fuel = {
    "station": [
        'Caltex Bondi','Caltex Bondi','Caltex Bondi',
        'BP Southbank','BP Southbank','BP Southbank','BP Southbank',
        'Shell Fortitude Valley','Shell Fortitude Valley','Shell Fortitude Valley',
        'Caltex Bondi','Caltex Bondi',
        'BP Southbank','BP Southbank','BP Southbank'
    ],
    "fuel type": [
        'Unleaded','Diesel','Premium',
        'Unleaded','Unleaded','Diesel','Premium',
        'Diesel','Unleaded','Premium',
        'Diesel','Unleaded',
        'Diesel','Premium','Unleaded'
    ],
    "litres": [
        45.2, 60.0, 38.5,
        52.1, 47.8, None, 41.0,
        55.3, 44.9, None,
        58.7, 40.1,
        63.2, 35.6, 49.0
    ],
    "price per litre": [
        1.89, 1.95, 2.12,
        1.85, 1.85, 1.92, 2.09,
        1.93, None, 2.15,
        1.95, 1.89,
        1.92, 2.09, 1.85
    ],
    "state": [
        'NSW','NSW','NSW',
        'VIC','VIC','VIC','VIC',
        'QLD','QLD', None,
        'NSW','NSW',
        'VIC','VIC','VIC'
    ]
}

df = pl.DataFrame(fuel)
df

Figure 1: Fuel transactions — 15 rows, 5 columns.

The dataset has 15 transactions spread across three Australian petrol stations: Caltex Bondi (NSW), BP Southbank (VIC) and Shell Fortitude Valley (QLD). Now let's start adding new columns.

Adding a constant column with `pl.lit()`

The simplest column creation is adding a constant value to every row. In Polars, you wrap scalar values with pl.lit() (short for "literal") and give the column a name with .alias():

Python — editable

df.with_columns(pl.lit('AUD').alias('currency'))

Figure 2: Every row now has a "currency" column set to "AUD".

The pl.lit('AUD') expression creates a column where every row contains the string "AUD", and .alias('currency') names it. This is the Polars equivalent of df['currency'] = 'AUD' in pandas — but instead of mutating in place, with_columns() returns a new DataFrame. The original df remains unchanged.

Computing a column from existing columns

You can create columns by combining existing columns with arithmetic expressions. Here we multiply litres by price per litre to calculate the total cost of each transaction:

Python — editable

df.with_columns(
    (pl.col('litres') * pl.col('price per litre')).alias('total_cost')
)

Figure 3: A new "total_cost" column computed from litres and price per litre.

Notice that rows where either litres or price per litre is null produce a null in total_cost. Polars propagates nulls through arithmetic automatically — no special handling needed. In pandas, you would write df['total_cost'] = df['litres'] * df['price per litre'], which looks simpler but mutates the DataFrame in place.

Conditional columns with `when/then/otherwise`

For if/else logic, Polars provides pl.when(), .then(), and .otherwise(). This is the equivalent of numpy.where(), and it replaces the row-wise pandas .apply() with a lambda that is often used for the same job. Because it runs as a native Polars expression, there is no per-row Python overhead:

Python — editable

df.with_columns(
    pl.when(pl.col('litres') > 50)
      .then(pl.lit('Large'))
      .otherwise(pl.lit('Small'))
      .alias('fill_size')
)

Figure 4: Transactions classified as "Large" or "Small" based on litres.

Rows where litres exceeds 50 are labeled "Large"; everything else is "Small". Rows with null litres also come out as "Small": null > 50 evaluates to null rather than true, and when() sends every row whose condition is not true to .otherwise(). If you need to handle nulls separately, you can chain an additional .when(pl.col('litres').is_null()).then(pl.lit('Unknown')) before .otherwise().

Binning with `cut()`

When you need to bucket continuous values into labeled categories, Polars offers .cut() on column expressions. This is the equivalent of pd.cut() in pandas:

Python — editable

df.with_columns(
    pl.col('litres')
      .cut([30, 45, 60], labels=['Small', 'Medium', 'Large', 'XL'])
      .alias('volume_band')
)

Figure 5: Litres binned into volume bands — Small, Medium, Large, XL.

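The three breaks [30, 45, 60] create four bins. Intervals are right-closed by default, so a value that sits exactly on a break falls into the lower bin: up to and including 30 (Small), above 30 up to 45 (Medium), above 45 up to 60 (Large), and above 60 (XL). The labels parameter assigns human-readable names to the bins and needs one more entry than there are breaks. Null values in litres will produce null in volume_band.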

One-hot encoding with `to_dummies()`

For machine learning pipelines, you often need to convert categorical columns into binary indicator columns. Polars provides to_dummies() for this:

Python — editable

df.to_dummies(columns=['fuel type'])

Figure 6: One-hot encoded fuel type columns — each unique value becomes a binary column.

Each unique value in fuel type becomes its own column (fuel type_Diesel, fuel type_Premium, fuel type_Unleaded) with a 1 where the row matches and 0 otherwise. This is the Polars equivalent of pd.get_dummies(df, columns=['fuel type']) in pandas.

Multiple columns in one `with_columns()` call

One of Polars' biggest strengths is creating multiple columns in a single with_columns() call. Each expression produces a column in the output, and Polars can optimise them together:

Python — editable

df.with_columns(
    pl.lit('AUD').alias('currency'),
    (pl.col('litres') * pl.col('price per litre')).alias('total_cost'),
    pl.when(pl.col('litres') > 50)
      .then(pl.lit('Large'))
      .otherwise(pl.lit('Small'))
      .alias('fill_size'),
    pl.col('litres')
      .cut([30, 45, 60], labels=['Small', 'Medium', 'Large', 'XL'])
      .alias('volume_band')
)

Figure 7: Four new columns created in a single with_columns call.

All four columns — currency, total_cost, fill_size, and volume_band — are added in one operation. This is more efficient than chaining multiple with_columns() calls because Polars can run the expressions in parallel internally. In pandas, you would typically need separate assignment statements or use df.assign() with multiple keyword arguments.

Polars vs pandas: creating columns compared

If you are coming from pandas, here is how the key column-creation operations map:

df['new'] = value → df.with_columns(pl.lit(value).alias('new'))
df.assign(new=...) → df.with_columns(...)
df['a'] * df['b'] → pl.col('a') * pl.col('b') inside with_columns
pd.cut(df['col'], bins) → pl.col('col').cut(breaks)
pd.get_dummies(df) → df.to_dummies()
np.where(cond, a, b) → pl.when(cond).then(a).otherwise(b)

The core difference: pandas often mutates DataFrames in place (or returns copies, depending on the method), while with_columns() and the other Polars methods shown here return a new DataFrame and leave the original untouched. This makes Polars code easier to reason about and debug, because you do not have to worry about accidental side effects or SettingWithCopyWarning.

Try editing the code blocks above — change the threshold in the when() condition, add new bin boundaries to cut(), or compute your own derived columns to see how each pattern behaves.


References

Polars documentation: polars.DataFrame.with_columns
Polars documentation: polars.lit
Polars documentation: polars.when
Polars documentation: polars.Expr.cut
Polars documentation: polars.DataFrame.to_dummies
Polars user guide: Column selections

Suhith Illesinghe

