📈 #76 Optimizing with Google OR-Tools: strategies and insights from three years of real-world experience

Leveraging boolean logic, interval variables, and decision strategies

Borja Menéndez
May 19, 2025

It was nearly the summer of 2020.

During that first summer of the pandemic, I was enrolled in a Proof of Concept for a goods distribution problem, during the hot days of Spain, without knowing where it would take me. Basically, a Capacitated VRP with Time Windows.

I wanted to develop the thing, including a user interface, as fast as possible.

A user interface?

A user interface.

So I mixed and matched Streamlit and the routing module of Google OR-Tools.

It was the first time I used OR-Tools and it was pretty easy. The docs for that module are decent, and you don’t need to spend time developing a mathematical model or complex algorithms to solve the problem. I saw a lot of value in that module, really.

But for a year and a half, I didn’t have the chance to use more features of OR-Tools.

Until I started solving planning problems.

I saw how to manage interval variables, the speed it achieves with lots of booleans, and the parameterization and ideas behind the CP-SAT solver.

Then I started using it to test ideas. Those ideas became proofs of concept. Those PoCs evolved into production systems.

I’ve been using it for more than three years now.

Today in Feasible, I want to share with you the good, the bad, and the ugly of it. We’ll see:

  • 🏆 Why Google OR-Tools (and not others)

  • 💡 Modeling ideas after using it for more than 3 years

  • 📚 On infeasibility, GAPs, and where to look for good documentation (and ask for help)

Are you ready? Let’s dive in… 🪂

🏆 Why Google OR-Tools (and not others)

I bet you read the latest State of Mathematical Optimization Report from Gurobi.

It says that most of the optimization problems tackled in the industry fall into planning, operational applications, or logistics.

I have the pleasure of working at the intersection of those, and in this area we typically have:

  • Lots of boolean decision variables.

  • Integer variables with small domains.

  • Logical constraints that usually relate them through nonlinear functions.

The CP-SAT solver from Google OR-Tools is super useful there, too.

CP-SAT marries constraint programming (propagation) with SAT solving (learned clauses) and sprinkles in cutting-plane tricks.

The hybrid engine has become the de facto flagship of OR-Tools and a benchmark predator. It swept all gold medals in the 2024 MiniZinc Challenge and did almost the same the year before.

Why is that?

It essentially translates CP constraints into a SAT formula (incrementally), and then uses a SAT solver to find a solution or prove unsatisfiability.

  1. Propagation (CP side) shrinks variable domains by local reasoning (all-different, cumulative, bounds …).

  2. Conflict-Driven Clause Learning (SAT side) analyzes a failure, extracts a nogood clause that cuts off all equivalent failing assignments, and backjumps.

  3. Lazy Clause Generation (LCG) glues them. Every propagator can explain why it shrank a domain; that explanation is pushed into the SAT core as a clause. Next time a similar partial assignment appears, the SAT engine prunes it before CP work is repeated.

This hybrid approach, invented by Stuckey et al., often gives CP-SAT one to two orders of magnitude fewer search nodes than plain CP and faster backjumps than plain SAT.

It’s particularly effective for hard combinatorial optimization and satisfaction problems.

Add a layer of heuristics with a portfolio of sub-solvers, several decision strategies that you can configure at will, Large-Neighborhood Search, and automatic restarts, and you get one of the fastest and friendliest solvers on the market.
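
To make that concrete, here is a minimal sketch of those knobs in the Python API; the model, bounds, and limits below are made up purely for illustration:

from ortools.sat.python import cp_model

model = cp_model.CpModel()

# Toy variables just to have something to search over.
x = [model.NewIntVar(0, 10, f"x_{i}") for i in range(5)]
model.AddAllDifferent(x)
model.Maximize(sum(x))

# Suggest a branching order and value choice for the search:
# branch on x, picking the first unfixed variable and trying its minimum value first.
model.AddDecisionStrategy(x, cp_model.CHOOSE_FIRST, cp_model.SELECT_MIN_VALUE)

solver = cp_model.CpSolver()
solver.parameters.max_time_in_seconds = 10.0  # wall-clock limit
solver.parameters.num_workers = 8             # portfolio of parallel sub-solvers
solver.parameters.log_search_progress = True  # watch what the portfolio is doing

status = solver.Solve(model)
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    print([solver.Value(v) for v in x])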

The project is also moving fast. The latest release, 9.12, from February, added Python 3.13 support, revamped hint handling, and sped up the linear expression layer. All under an Apache 2.0 license.

The downside of all this? Documentation. It’s terrible. You can read everywhere that docs are quite sparse, and Google engineers push newcomers to read the proto files or GitHub issues.

That’s why I wanted to give you some ideas today to add more efficiency to your CP-SAT models as well as where to find proper documentation.

💡 Modeling ideas after using it for more than 3 years

I’ll start with easy-to-digest modeling ideas and then enter the world of efficiency.

So let’s start!

AddHint ≠ warm-start

When reading about hints, the first thing you think is “this is a way to do warm-starts.”

But a hint must cover all the variables it touches and remain feasible after presolve, or it’s ignored... Silently.

So if the solver never finds the feasible solution you hinted at, chances are your hint was inconsistent.

However, this is improving over time.
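
Here is a minimal sketch of what hinting looks like (the model and values are made up for illustration). It only suggests a starting point to the search; it is not a MIP-style warm start:

from ortools.sat.python import cp_model

model = cp_model.CpModel()
x = model.NewIntVar(0, 100, "x")
y = model.NewIntVar(0, 100, "y")
model.Add(x + y == 50)
model.Minimize(x)

# Hint a value for every variable you touch; an inconsistent hint
# is silently dropped during presolve.
model.AddHint(x, 0)
model.AddHint(y, 50)

solver = cp_model.CpSolver()
status = solver.Solve(model)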

Use IntervalVars whenever possible

Let’s imagine you need to schedule tasks with their starting and ending times, as well as their duration.

You could attempt to model that with binary variables to select whether the task is in the solution or not, plus integer variables for the times, plus the specific constraints that tie them together.

But this would make the model harder to write and more verbose. Use IntervalVars for such problems and you’ll get an easy-to-code model plus built-in constraints like cumulative and no-overlap.
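
For instance, here is a small sketch (the task durations and horizon are invented) of scheduling three tasks on one machine with IntervalVars and the built-in no-overlap constraint:

from ortools.sat.python import cp_model

model = cp_model.CpModel()
horizon = 20
durations = [3, 5, 4]  # made-up task lengths

starts, ends, intervals = [], [], []
for i, d in enumerate(durations):
    start = model.NewIntVar(0, horizon, f"start_{i}")
    end = model.NewIntVar(0, horizon, f"end_{i}")
    # One IntervalVar ties start, duration, and end together.
    intervals.append(model.NewIntervalVar(start, d, end, f"task_{i}"))
    starts.append(start)
    ends.append(end)

# Built-in scheduling constraint: tasks cannot overlap on the machine.
model.AddNoOverlap(intervals)

# Minimize the makespan.
makespan = model.NewIntVar(0, horizon, "makespan")
model.AddMaxEquality(makespan, ends)
model.Minimize(makespan)

solver = cp_model.CpSolver()
if solver.Solve(model) == cp_model.OPTIMAL:
    print([solver.Value(s) for s in starts], solver.Value(makespan))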

OnlyEnforceIf or logical constraints?

I love one specific thing about CP-SAT: the way it handles if-then logic for constraints.

So maybe you want to express that a constraint is only enforced if something else happens.

This is conditional logic. It doesn’t add any non-linearity to the model as it’s just a way to write Big-M constraints.

For example:

b = model.NewBoolVar("b")
model.Add(x + y >= 10).OnlyEnforceIf(b)

That says: apply this constraint (x + y ≥ 10) only if b is true. It’s managed internally as:

x + y + M·(1 – b) ≥ 10

Where M is a bound that CP-SAT computes automatically.

You can also use this feature to create boolean variables with a meaningful interpretation.

Let’s say you need to know whether a worker has any assignment at all, so you can penalize using more workers. You create constraints to enforce that:

def _ct_work(self) -> None:
    for person in range(self.num_people):
        # Sum all assignment literals that belong to this person.
        all_assignments = sum(work_var
                              for (person_id, _), work_var in self.v01_assign.items()
                              if person_id == person)
        # v01_work[person] is true iff the person has at least one assignment.
        self.model.Add(all_assignments > 0).OnlyEnforceIf(self.v01_work[person])
        self.model.Add(all_assignments == 0).OnlyEnforceIf(self.v01_work[person].Not())

Pretty simple stuff: sum all the assignments. If the sum is greater than 0, there has been at least one assignment; otherwise there hasn’t been any. You store that fact in the boolean variable self.v01_work[person].

In any case, that comes with a cost. The extra Big-M constant can weaken propagation if the bound is very loose, and you may pay that in search nodes. So, if you can model the same logic without reification, that’s often faster. That way you’ll get the most out of the efficiency of this solver.

How to do that? For example, with native boolean constraints such as AddBoolOr and AddImplication.
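
One possible sketch of that idea, reusing the same hypothetical class attributes as the snippet above (this is an illustration, not necessarily the exact pattern the rest of the post shows): link each person’s work literal to their assignment literals with implications and one clause, so there is no reified sum and no Big-M bound at all.

def _ct_work_boolean(self) -> None:
    for person in range(self.num_people):
        literals = [work_var
                    for (person_id, _), work_var in self.v01_assign.items()
                    if person_id == person]
        work = self.v01_work[person]
        # Any assignment being true forces 'work' to be true.
        for lit in literals:
            self.model.AddImplication(lit, work)
        # If 'work' is true, at least one assignment must be true
        # (a pure clause: not(work) OR lit_1 OR ... OR lit_n).
        self.model.AddBoolOr(literals).OnlyEnforceIf(work)

Both directions together say that work[person] is true exactly when the person has at least one assignment, expressed purely as clauses the SAT core can propagate and learn from.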
