Common Expression Language (CEL) is an expression language created by Google. It allows to define constraints that can be used to validate input data.
This language is being used by some open source projects and products, like:
- Google Cloud Certificate Authority Service
- Envoy
- There’s even a Kubernetes Enhancement Proposal that would use CEL to validate Kubernetes’ CRDs.
I’ve been looking at CEL since some time, wondering how hard it would be to find a way to write Kubewarden validation policies using this expression language.
Some weeks ago SUSE Hackweek 21 took place, which gave me some time to play with this idea.
This blog post describes the first step of this journey. Two other blog posts will follow.
Picking a CEL runtime
Currently the only mature implementations of the CEL language are written in Go and C++.
Kubewarden policies are implemented using WebAssembly modules.
The official Go compiler isn’t yet capable of producing WebAssembly modules that can be run outside of the browser. TinyGo, an alternative implementation of the Go compiler, can produce WebAssembly modules targeting the WASI interface. Unfortunately TinyGo doesn’t yet support the whole Go standard library. Hence it cannot be used to compile cel-go.
Because of that, I was left with no other choice than to use the cel-cpp runtime.
C and C++ can be compiled to WebAssembly, so I thought everything would have been fine.
Spoiler alert: this didn’t turn out to be “fine”, but that’s for another blog post.
CEL and protobuf
CEL is built on top of protocol buffer types. That means CEL expects the input data (the one to be validated by the constraint) to be described using a protocol buffer type. In the context of Kubewarden this is a problem.
Some Kubewarden policies focus on a specific Kubernetes resource; for example,
all the ones implementing Pod Security Policies are inspecting only Pod
resources.
Others, like the ones looking at labels
or annotations
attributes, are instead
evaluating any kind of Kubernetes resource.
Forcing a Kubewarden policy author to provide a protocol buffer definition of the object to be evaluated would be painful. Luckily, CEL evaluation libraries are also capable of working against free-form JSON objects.
The grand picture
The long term goal is to have a CEL evaluator program compiled into a WebAssembly module.
At runtime, the CEL evaluator WebAssembly module would be instantiated and would receive as input three objects:
- The validation logic: a CEL constraint
- Policy settings (optional): these would provide a way to tune the constraint. They would be delivered as a JSON object
- The actual object to evaluate: this would be a JSON object
Having set the goals, the first step is to write a C++ program that takes as input a CEL constraint and applies that against a JSON object provided by the user.
There’s going to be no WebAssembly today.
Taking a look at the code
In this section I’ll go through the critical parts of the code. I’ll do that to help other people who might want to make a similar use of cel-cpp.
There’s basically zero documentation about how to use the cel-cpp library. I had to learn how to use it by looking at the excellent test suite. Moreover, the topic of validating a JSON object (instead of a protocol buffer type) isn’t covered by the tests. I just found some tips inside of the GitHub issues and then I had to connect the dots by looking at the protocol buffer documentation and other pieces of cel-cpp.
TL;DR The code of this POC can be found inside of this repository.
Parse the CEL constraint
The program receives a string containing the CEL constraint and has to
use it to create a CelExpression
object.
This is pretty straightforward, and is done inside of these lines
of the evaluate.cc
file.
As you will notice, cel-cpp makes use of the Abseil
library. A lot of cel-cpp APIs are returning absl::StatusOr
// invoke API
auto parse_status = cel_parser::Parse(constraint);
if (!parse_status.ok())
{
// handle error
std::string errorMsg = absl::StrFormat(
"Cannot parse CEL constraint: %s",
parse_status.status().ToString());
return EvaluationResult(errorMsg);
}
// Obtain the actual result
auto parsed_expr = parse_status.value();
Handle the JSON input
cel-cpp expects the data to be validated to be loaded into a CelValue
object.
As I said before, we want the final program to read a generic JSON object as input data. Because of that, we need to perform a series of transformations.
First of all, we need to convert the JSON data into a protobuf::Value
object.
This can be done using the protobuf::util::JsonStringToMessage
function.
This is done by these lines
of code.
Next, we have to convert the protobuf::Value
object into a CelValue
one.
The cel-cpp library doesn’t offer any helper. As a matter of fact, one of
the oldest open issue of cel-cpp
is exactly about that.
This last conversion is done using a series of helper functions I wrote inside
of the proto_to_cel.cc
file.
The code relies on the introspection capabilities of protobuf::Value
to
build the correct CelValue
.
Evaluate the constraint
Once the CEL expression object has been created, and the JSON data has been converted into a `CelValue, there’s only one last thing to do: evaluate the constraint against the input.
First of all we have to create a CEL Activation
object and insert the
CelValue
holding the input data into it. This takes just few lines of code.
Finally, we can use the Evaluate
method of the CelExpression
instance
and look at its result. This is done by these lines of code,
which include the usual pattern that handles absl::StatusOr<T>
objects.
The actual result of the evaluation is going to be a CelValue
that holds
a boolean type inside of itself.
Building
This project uses the Bazel build system. I never used Bazel before, which proved to be another interesting learning experience.
A recent C++ compiler is required by cel-cpp. You can use either gcc (version 9+) or clang (version 10+). Personally, I’ve been using clag 13.
Building the code can be done in this way:
CC=clang bazel build //main:evaluator
The final binary can be found under bazel-bin/main/evaluator
.
Usage
The program loads a JSON object called request
which is then embedded
into a bigger JSON object.
This is the input received by the CEL constraint:
{
"request": < JSON_OBJECT_PROVIDED_BY_THE_USER >
}
The idea is to later add another top level key called
settings
. This one would be used by the user to tune the behavior of the constraint.
Because of that, the CEL constraint must access the request values by
going through the request.
key.
This is easier to explain by using a concrete example:
./bazel-bin/main/evaluator \
--constraint 'request.path == "v1"' \
--request '{ "path": "v1", "token": "admin" }'
The CEL constraint is satisfied because the path
key of the request
is equal to v1
.
On the other hand, this evaluation fails because the constraint is not satisfied:
$ ./bazel-bin/main/evaluator \
--constraint 'request.path == "v1"' \
--request '{ "path": "v2", "token": "admin" }'
The constraint has not been satisfied
The constraint can be loaded from file. Create a file
named constraint.cel
with the following contents:
!(request.ip in ["10.0.1.4", "10.0.1.5", "10.0.1.6"]) &&
((request.path.startsWith("v1") && request.token in ["v1", "v2", "admin"]) ||
(request.path.startsWith("v2") && request.token in ["v2", "admin"]) ||
(request.path.startsWith("/admin") && request.token == "admin" &&
request.ip in ["10.0.1.1", "10.0.1.2", "10.0.1.3"]))
Then create a file named request.json
with the following contents:
{
"ip": "10.0.1.4",
"path": "v1",
"token": "admin",
}
Then run the following command:
./bazel-bin/main/evaluator \
--constraint_file constraint.cel \
--request_file request.json
This time the constraint will not be satisfied.
Note: I find the
_
symbols inside of the flags a bit weird. But this is what is done by the Abseil flags library that I experimented with. 🤷
Let’s evaluate a different kind of request:
./bazel-bin/main/evaluator \
--constraint_file constraint.cel \
--request '{"ip": "10.0.1.1", "path": "/admin", "token": "admin"}'
This time the constraint will be satisfied.
Summary
This has been a stimulating challenge.
Getting back to C++
I didn’t write big chunks of C++ code since a long time! Actually, I never had a chance to look at the latest C++ standards. I gotta say, lots of things changed for the better, but I still prefer to pick other programming languages 😅
Building the universe with Bazel
I had prior experience with autoconf
& friends, qmake
and cmake
, but I
never used Bazel before.
As a newcomer, I found the documentation of Bazel quite good. I appreciated
how easy it is to consume libraries that are using Bazel. I also like how
Bazel can solve the problem of downloading dependencies, something
you had to solve on your own with cmake
and similar tools.
The concept of building inside of a sandbox, with all the dependencies vendored, is interesting but can be kinda scary. Try building this project and you will see that Bazel seems to be downloading the whole universe. I’m not kidding, I’ve spotted a Java runtime, a Go compiler plus a lot of other C++ libraries.
Bazel build
command gives a nice progress bar. However, the number of tasks to
be done keeps growing during the build process. It kinda reminded me of the old
Windows progress bar!
I gotta say, I regularly have this feeling of “building the universe” with Rust, but Bazel took that to the next level! 🤯
Code spelunking
Finally, I had to do a lot of spelunking inside of different C++ code bases: envoy, protobuf’s c++ implementation, cel-cpp and Abseil to name a few. This kind of activity can be a bit exhausting, but it’s also a great way to learn from the others.
What’s next?
Well, in a couple of weeks I’ll blog about my next step of this journey: building C++ code to standalone WebAssembly!
Now I need to take some deserved vacation time 😊!
⛰️ 🚶👋