This time my personal project has been about building a unikernel that runs WebAssembly.
I wanted this blog post to contain all the details about this journey. However I realized this would have been too much for a single post. I hence decided to split everything into smaller chunks. I’ll update this section to keep track of all the posts.
In the meantime, you can find the code of the POC here.
There are multiple reasons why I did that, but I don’t want to repeat what I wrote inside of the project description. Learning and fun goals aside, I think there’s actually a good reason to mix unikernels and WebAssembly.
From the application developer POV, porting or writing an application to run on a unikernel is not an easy task. The application and all its dependencies have to support the target unikernel. Some patching might be required inside of the whole application stack to make it work.
From the unikernel maintainers POV, they have to invest quite some energy to ensure any kind of application can run seamlessly on top of their platform. They don’t know which system primitives user applications will leverage, which makes everything harder.
On the other hand, when targeting a WebAssembly platform (think of Spin or Spiderlightning), the application has a clear set of capabilities that have to be provided by the WebAssembly runtime.
If you look at the Spiderlightning scenario, an application might require Key/Value store capabilities at runtime. However, how these capabilities are implemented on the host side is not relevant to the application. That means the same .wasm module can be run by a runtime that implements the K/V store using Redis or using Azure Cosmos DB.
That would be totally transparent to the end user application.
You might see where I’m going with all that…
If we write a unikernel application that runs WebAssembly modules and supports a set of Spiderlightning APIs, then the same Spiderlightning application could be run both on top of the regular slight runtime and on top of this unikernel.
All of that without any additional work from the application developer. The Wasm module wouldn’t even realize that. The complexity would fall only on the unikernel developer who, however, would have a clear set of functionalities to implement (as opposed to “let’s try to make any kind of application work”).
Some time ago I stumbled upon the RustyHermit project, a unikernel written in Rust. I decided to use it as the foundation to write my unikernel application.
Building a RustyHermit application is pretty straightforward. Their documentation, even though it is a bit scattered, is good and their examples help a lot.
The cool thing is that RustyHermit is part of Rust nightly, which makes the whole developer experience great. It feels like writing a regular Rust application.
Obviously you cannot expect all kinds of Rust crates to just work with RustyHermit. You will see how that influenced the development of the POC.
The next sections go over some of the major challenges I faced during the last week. I’ll share more details inside of the upcoming blog posts (see the disclaimer section at the top of the page).
Unfortunately Wasmtime, my favorite WebAssembly runtime,
does not build on top of RustyHermit. Many of its dependencies expect libc
or other low level libraries to be around.
The same applies to wasmer.
I thought about using something like WebAssembly Micro Runtime (WAMR), but I preferred to stick with something written in Rust and have the “full RustyHermit experience”.
After some searching I found wasmi, a pure Rust WebAssembly runtime. It works fine on top of RustyHermit, and its design is inspired by that of Wasmtime, which allowed me to reuse a lot of my previous knowledge.
Spiderlightning leverages the WebAssembly Component Model proposal to offer capabilities to the WebAssembly guests and to allow the host to consume capabilities offered by the WebAssembly guest.
The communication between the host and the guest happens using types defined with the Wasm Interface Type (WIT) format.
To give some concrete examples, the demo I’m going to run leverages the WebAssembly Component Model in these ways:
- http-server types. In this case it’s the guest that leverages capabilities offered by the host.
- http-handler types.
- keyvalue types.

As you can see there are many WIT types involved. For each one of them we need code both inside of the guest (an SDK, basically) and on the host (the code that implements the guest SDK).
This code can be scaffolded by a CLI tool called wit-bindgen, which generates host/guest code starting from a .wit file.
In this case I only had to implement the host side of these interfaces inside of the unikernel.
The code generated by wit-bindgen performs low-level operations using the WebAssembly runtime. The code to be scaffolded depends on the programming language and on the WebAssembly runtime used on the host side.
Obviously the wasmi WebAssembly runtime was not supported by wit-bindgen, hence I had to extend wit-bindgen to handle it. The code can be found inside of this fork, under the wasmi branch.
With all of that in place, I scaffolded the host side of the Key/Value capability and made a simple implementation of the host traits. The host code was just emitting some debug information. I was then able to run the vanilla keyvalue-demo from the Spiderlightning project. 🥳
You made it to the bottom of this long post, kudos! I think you deserve a prize for that, so here we go…
This is a recording of the unikernel application running the Spiderlightning http-server demo.
I hope you enjoyed the read. Stay tuned for the next part of the journey, which will cover Rust async, Redis and some weird errors.
This language is being used by some open source projects and products.
I’ve been looking at CEL for some time, wondering how hard it would be to find a way to write Kubewarden validation policies using this expression language.
Some weeks ago SUSE Hackweek 21 took place, which gave me some time to play with this idea.
This blog post describes the first step of this journey. Two other blog posts will follow.
Currently the only mature implementations of the CEL language are written in Go and C++.
Kubewarden policies are implemented using WebAssembly modules.
The official Go compiler isn’t yet capable of producing WebAssembly modules that can be run outside of the browser. TinyGo, an alternative implementation of the Go compiler, can produce WebAssembly modules targeting the WASI interface. Unfortunately TinyGo doesn’t yet support the whole Go standard library. Hence it cannot be used to compile cel-go.
Because of that, I was left with no other choice than to use the cel-cpp runtime.
C and C++ can be compiled to WebAssembly, so I thought everything would be fine.
Spoiler alert: this didn’t turn out to be “fine”, but that’s for another blog post.
CEL is built on top of protocol buffer types. That means CEL expects the input data (the one to be validated by the constraint) to be described using a protocol buffer type. In the context of Kubewarden this is a problem.
Some Kubewarden policies focus on a specific Kubernetes resource; for example,
all the ones implementing Pod Security Policies are inspecting only Pod
resources.
Others, like the ones looking at the labels or annotations attributes, instead evaluate any kind of Kubernetes resource.
Forcing a Kubewarden policy author to provide a protocol buffer definition of the object to be evaluated would be painful. Luckily, CEL evaluation libraries are also capable of working against free-form JSON objects.
The long term goal is to have a CEL evaluator program compiled into a WebAssembly module.
At runtime, the CEL evaluator WebAssembly module would be instantiated and would receive as input three objects: the CEL constraint, the settings used to tune the policy and the Kubernetes object to be evaluated.
Having set the goals, the first step is to write a C++ program that takes as input a CEL constraint and applies that against a JSON object provided by the user.
There’s going to be no WebAssembly today.
In this section I’ll go through the critical parts of the code. I’ll do that to help other people who might want to make a similar use of cel-cpp.
There’s basically zero documentation about how to use the cel-cpp library. I had to learn how to use it by looking at the excellent test suite. Moreover, the topic of validating a JSON object (instead of a protocol buffer type) isn’t covered by the tests. I just found some tips inside of the GitHub issues and then I had to connect the dots by looking at the protocol buffer documentation and other pieces of cel-cpp.
TL;DR The code of this POC can be found inside of this repository.
The program receives a string containing the CEL constraint and has to use it to create a CelExpression object.
This is pretty straightforward, and is done inside of these lines of the evaluate.cc file.
As you will notice, cel-cpp makes use of the Abseil library. A lot of cel-cpp APIs return absl::StatusOr&lt;T&gt; objects, which have to be handled like this:
// invoke API
auto parse_status = cel_parser::Parse(constraint);
if (!parse_status.ok())
{
    // handle error
std::string errorMsg = absl::StrFormat(
"Cannot parse CEL constraint: %s",
parse_status.status().ToString());
return EvaluationResult(errorMsg);
}
// Obtain the actual result
auto parsed_expr = parse_status.value();
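The parsed expression then has to be compiled into an executable CelExpression through the builder API. Here is a minimal sketch of that step; it assumes the standard cel-cpp expression builder factory and built-in function registrar (exact header paths and names might differ slightly between cel-cpp versions):

#include "eval/public/builtin_func_registrar.h"
#include "eval/public/cel_expr_builder_factory.h"
#include "eval/public/cel_expression.h"

using namespace google::api::expr::runtime;

// Create the expression builder and register the CEL standard
// functions (==, in, startsWith, ...) so constraints can use them.
InterpreterOptions options;
std::unique_ptr<CelExpressionBuilder> builder = CreateCelExpressionBuilder(options);
auto register_status = RegisterBuiltinFunctions(builder->GetRegistry(), options);
if (!register_status.ok())
{
    // handle error
}

// Turn the parsed constraint into an executable expression.
auto expr_status = builder->CreateExpression(&parsed_expr.expr(),
                                             &parsed_expr.source_info());
if (!expr_status.ok())
{
    // handle error
}
std::unique_ptr<CelExpression> cel_expr = std::move(expr_status.value());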
cel-cpp expects the data to be validated to be loaded into a CelValue object.
As I said before, we want the final program to read a generic JSON object as input data. Because of that, we need to perform a series of transformations.
First of all, we need to convert the JSON data into a protobuf::Value object. This can be done using the protobuf::util::JsonStringToMessage function. This is done by these lines of code.
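As a rough sketch, assuming the user-provided JSON document is available inside of a std::string named request_json (a variable name I’m making up for this example), the conversion looks like this:

#include <string>
#include <google/protobuf/struct.pb.h>
#include <google/protobuf/util/json_util.h>

// Parse the user-provided JSON document into a protobuf::Value message.
google::protobuf::Value json_value;
auto json_status =
    google::protobuf::util::JsonStringToMessage(request_json, &json_value);
if (!json_status.ok())
{
    // handle error: the input string is not valid JSON
}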
Next, we have to convert the protobuf::Value object into a CelValue one.
The cel-cpp library doesn’t offer any helper for that. As a matter of fact, one of the oldest open issues of cel-cpp is exactly about that.
This last conversion is done using a series of helper functions I wrote inside of the proto_to_cel.cc file. The code relies on the introspection capabilities of protobuf::Value to build the correct CelValue.
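The overall approach of those helpers can be sketched like this. The function below only covers the scalar kinds of protobuf::Value; maps and lists need CelMap/CelList backed implementations and are handled by dedicated helpers in the real code. The function name is made up for this example:

#include <string>
#include <google/protobuf/struct.pb.h>
#include "eval/public/cel_value.h"

using google::api::expr::runtime::CelValue;

// Convert the scalar kinds of a protobuf::Value into the matching CelValue.
CelValue ProtoScalarToCelValue(const google::protobuf::Value& value,
                               google::protobuf::Arena* arena)
{
    switch (value.kind_case())
    {
    case google::protobuf::Value::kBoolValue:
        return CelValue::CreateBool(value.bool_value());
    case google::protobuf::Value::kNumberValue:
        return CelValue::CreateDouble(value.number_value());
    case google::protobuf::Value::kStringValue:
        // The string must outlive the CelValue, hence it's copied onto the arena.
        return CelValue::CreateString(
            google::protobuf::Arena::Create<std::string>(arena, value.string_value()));
    case google::protobuf::Value::kNullValue:
    default:
        return CelValue::CreateNull();
    }
}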
Once the CEL expression object has been created, and the JSON data has been converted into a CelValue, there’s only one last thing to do: evaluate the constraint against the input.
First of all we have to create a CEL Activation object and insert the CelValue holding the input data into it. This takes just a few lines of code.
Finally, we can use the Evaluate method of the CelExpression instance and look at its result. This is done by these lines of code, which include the usual pattern that handles absl::StatusOr&lt;T&gt; objects.
The actual result of the evaluation is going to be a CelValue
that holds
a boolean type inside of itself.
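Putting the last two steps together, a minimal sketch of the evaluation could look like the following; it assumes the cel_expr object and the request CelValue built in the previous steps (variable names are illustrative):

#include "eval/public/activation.h"

using google::api::expr::runtime::Activation;
using google::api::expr::runtime::CelValue;

// Expose the converted input to the constraint under the "request" name.
google::protobuf::Arena arena;
Activation activation;
activation.InsertValue("request", request_cel_value);

// Evaluate the compiled constraint against the activation.
auto eval_status = cel_expr->Evaluate(activation, &arena);
if (!eval_status.ok())
{
    // handle error
}

CelValue result = eval_status.value();
if (result.IsBool())
{
    bool satisfied = result.BoolOrDie();
    // report whether the constraint has been satisfied or not
}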
This project uses the Bazel build system. I never used Bazel before, which proved to be another interesting learning experience.
A recent C++ compiler is required by cel-cpp. You can use either gcc (version 9+) or clang (version 10+). Personally, I’ve been using clang 13.
Building the code can be done in this way:
CC=clang bazel build //main:evaluator
The final binary can be found under bazel-bin/main/evaluator
.
The program loads a JSON object called request
which is then embedded
into a bigger JSON object.
This is the input received by the CEL constraint:
{
  "request": < JSON_OBJECT_PROVIDED_BY_THE_USER >
}
The idea is to later add another top level key called settings. This one would be used by the user to tune the behavior of the constraint.
Because of that, the CEL constraint must access the request values by going through the top level request key.
This is easier to explain by using a concrete example:
./bazel-bin/main/evaluator \
--constraint 'request.path == "v1"' \
--request '{ "path": "v1", "token": "admin" }'
The CEL constraint is satisfied because the path key of the request is equal to v1.
On the other hand, this evaluation fails because the constraint is not satisfied:
$ ./bazel-bin/main/evaluator \
--constraint 'request.path == "v1"' \
--request '{ "path": "v2", "token": "admin" }'
The constraint has not been satisfied
The constraint can be loaded from a file. Create a file named constraint.cel with the following contents:
!(request.ip in ["10.0.1.4", "10.0.1.5", "10.0.1.6"]) &&
((request.path.startsWith("v1") && request.token in ["v1", "v2", "admin"]) ||
(request.path.startsWith("v2") && request.token in ["v2", "admin"]) ||
(request.path.startsWith("/admin") && request.token == "admin" &&
request.ip in ["10.0.1.1", "10.0.1.2", "10.0.1.3"]))
Then create a file named request.json
with the following contents:
{
  "ip": "10.0.1.4",
  "path": "v1",
  "token": "admin"
}
Then run the following command:
./bazel-bin/main/evaluator \
--constraint_file constraint.cel \
--request_file request.json
This time the constraint will not be satisfied.
Note: I find the _ symbols inside of the flags a bit weird, but this is what the Abseil flags library I experimented with does. 🤷
Let’s evaluate a different kind of request:
./bazel-bin/main/evaluator \
--constraint_file constraint.cel \
--request '{"ip": "10.0.1.1", "path": "/admin", "token": "admin"}'
This time the constraint will be satisfied.
This has been a stimulating challenge.
I hadn’t written big chunks of C++ code in a long time! Actually, I never had a chance to look at the latest C++ standards. I gotta say, lots of things changed for the better, but I still prefer to pick other programming languages 😅
I had prior experience with autoconf &amp; friends, qmake and cmake, but I had never used Bazel before.
As a newcomer, I found the documentation of Bazel quite good. I appreciated
how easy it is to consume libraries that are using Bazel. I also like how
Bazel can solve the problem of downloading dependencies, something
you had to solve on your own with cmake
and similar tools.
The concept of building inside of a sandbox, with all the dependencies vendored, is interesting but can be kinda scary. Try building this project and you will see that Bazel seems to be downloading the whole universe. I’m not kidding, I’ve spotted a Java runtime, a Go compiler plus a lot of other C++ libraries.
The bazel build command gives a nice progress bar. However, the number of tasks to be done keeps growing during the build process. It kinda reminded me of the old Windows progress bar!
I gotta say, I regularly have this feeling of “building the universe” with Rust, but Bazel took that to the next level! 🤯
Finally, I had to do a lot of spelunking inside of different C++ code bases: envoy, protobuf’s c++ implementation, cel-cpp and Abseil to name a few. This kind of activity can be a bit exhausting, but it’s also a great way to learn from the others.
Well, in a couple of weeks I’ll blog about my next step of this journey: building C++ code to standalone WebAssembly!
Now I need to take some deserved vacation time 😊!
⛰️ 🚶👋
You probably have already heard about WebAssembly, but chances are high that happened in the context of web application development. There’s however a new emerging trend that consists of using WebAssembly outside of the browser.
WebAssembly has many interesting properties that make it great for writing plugin systems or even distributing small computational units (think of FaaS).
WebAssembly is what is being used to power Kubewarden, a project I created almost two years ago at SUSE Rancher, with the help of Rafa and other awesome folks. This is where the majority of my “blogging energies” have been focused.
Now, let’s go back to the main focus of today’s blog entry: write kubectl plugins using WebAssembly.
As you all know, kubectl can be easily extended by writing external plugins.
These plugins are executables named kubectl-&lt;name of the plugin&gt; that, once put in your $PATH, can be invoked via kubectl &lt;name of the plugin&gt;. This is the same mechanism used to write git plugins.
These plugins can be managed via a tool called Krew.
The kubectl
tool is available for multiple operating systems and architectures,
which means these plugins must be available for many platforms.
I think writing kubectl plugins using WebAssembly has a number of advantages.
Last but not least, this sounds like a fun experiment!
krew-wasm
The idea about writing kubectl plugins with WebAssembly originated during a
brainstorming session I was doing with Rafa about our upcoming talk for
WasmDay EU 2022. The idea kinda “infected” me, I had to
hack on it ASAP!!! This is how the krew-wasm
project was created.
krew-wasm takes inspiration from Krew, but it does not aim to replace it. Quite the opposite: it’s a complementary tool that can be used alongside Krew.
The sole purpose of krew-wasm is to manage and execute kubectl plugins written using WebAssembly and WASI.
krew-wasm plugins are WebAssembly modules that are distributed using container registries, the same infra used to host container images.
krew-wasm can download kubectl WebAssembly plugins from a container registry and
make them discoverable to kubectl.
This is achieved by creating a symbolic link for each managed plugin. This symbolic
link is named kubectl-<name of the plugin>
but, instead of pointing to the
WebAssembly module, it points to the krew-wasm
executable.
Once invoked, krew-wasm determines its usage mode, which can either be a “direct invocation” (when the user invokes the krew-wasm binary to manage plugins) or a “wrapper invocation” done via kubectl.
When invoked in “wrapper mode”, krew-wasm takes care of loading the WebAssembly plugin and invoking it. krew-wasm works as a WebAssembly host, and takes care of setting up the WASI environment used by the plugin.
I’ll leave the technical details out of this post, but if you want you can find more on the GitHub project page.
The POC would not be complete without some plugins to run. Guess what, you can find one right here!
The kubectl decoder plugin dumps Kubernetes Secret objects to the standard output, decoding all the data that is base64-encoded. On top of that, when an x509 certificate is found inside of the Secret, a detailed output is shown rather than the not-so-helpful PEM-encoded representation of the certificate.
If you want to experiment with this idea, you can write your plugins using Rust and this SDK.
This has been a nice experiment. It proves the combination of WebAssembly and WASI can be used to produce working kubectl plugins.
What’s more interesting is the fact these technologies could be used to extend other Cloud Native projects. Did someone say helm? 😜
There are however some limitations, mostly caused by the freshness of WASI. These are documented here. However, I’m pretty sure things will definitely improve over the next months. After all the WebAssembly ecosystem is moving at a fast pace!
Note well: this blog post is part of a series, check out the previous episode about running containerized buildah on top of Kubernetes.
I have a small Kubernetes cluster running at home that is made of ARM64 and x86_64 nodes. I want to build multi-architecture images so that I can run them everywhere on the cluster, regardless of the node architecture. My plan is to leverage the same cluster to build these container images. That leads to an “Inception-style” scenario: building container images from within a container itself.
To achieve that I decided to rely on buildah to build the container images. I’ve shown how to run buildah in a containerized fashion without using a privileged container and with a tailor-made AppArmor profile to secure it.
The previous blog post also showed the definition of Kubernetes PODs that would build the actual images.
What I’m going to show today is how to automate the whole building process.
Given the references to the Git repository that provides a container image definition, I want to automate these steps:
1. build the container image for the x86_64 architecture
2. build the container image for the ARM64 architecture
3. create a multi-architecture manifest referencing both images and push it to the registry
Steps #1 and #2 can be done in parallel, while step #3 needs to wait for the previous ones to complete.
This kind of automation can be done using some pipeline solution.
There are many Continuous Integration and Continuous Delivery solutions that are available for Kubernetes. If you love to seek enlightenment by staring in front of beautiful logos, checkout this portion of the CNCF landscape dedicated to CI and CD solutions. 🤯
After some research I came up with two potential candidates: Argo and Tekton.
Both are valid projects with active communities. However I decided to settle on Argo. The main reason that led to this decision was the lack of ARM64 support from Tekton.
Interestingly enough, both Tekton and kaniko (which I discussed in the previous blog post of this series) use the same mechanism to build themselves, a mechanism that can produce only x86_64 container images and is not so easy to extend.
Argo is an umbrella of different projects, each one of them tackling specific problems: Argo Workflows, Argo CD, Argo Rollouts and Argo Events.
The projects above are just the mature ones, many others can be found under the Argo project labs GitHub organization. These projects are not yet considered production ready, but are super interesting.
My favourite ones are:
The majority of these projects don’t have ARM64 container images yet, but work is being done and this work is significantly simpler compared to the one of porting Tekton. Most important of all: the core projects I need have already been ported.
A pipeline can be created inside Argo by defining a Workflow resource.
Copying from the core concepts documentation page of Argo Workflow, these are the elements I’m going to use:
- Workflow: a Kubernetes resource defining the execution of one or more template.
- Template: a step, steps or dag.
Spoiler alert: I’m going to create multiple Argo Templates, each one of them focusing on one specific part of the problem. Then I’ll use a DAG to make the dependencies between all these Templates explicit. Finally, I’ll define an Argo Workflow to “wrap” all these objects.
I could show you the final result right away, but you would probably be overwhelmed by it. I’ll instead go step-by-step as I did. I’ll start with a small subset of the problem and then I’ll keep building on top of it.
By the end of the previous blog post, I was able to build a container image by using the following Kubernetes POD definition:
apiVersion: v1
kind: Pod
metadata:
  name: builder
  annotations:
    container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
spec:
  nodeSelector:
    kubernetes.io/arch: "amd64"
  containers:
  - name: main
    image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
    command: ["/bin/sh"]
    args: ["-c", "cd code; cd $(readlink checkout); buildah bud -t guestbook ."]
    volumeMounts:
    - name: code
      mountPath: /code
    resources:
      limits:
        github.com/fuse: 1
  initContainers:
  - name: git-sync
    image: k8s.gcr.io/git-sync/git-sync:v3.1.7
    args: [
      "--one-time",
      "--depth", "1",
      "--dest", "checkout",
      "--repo", "https://github.com/flavio/guestbook-go.git",
      "--branch", "master"]
    volumeMounts:
    - name: code
      mountPath: /tmp/git
  volumes:
  - name: code
    emptyDir:
      medium: Memory
These are the key points of this POD:
- the AppArmor annotation that applies the containerized_buildah profile to the main container
- the nodeSelector constraint that forces the build to happen on an x86_64 node
- the github.com/fuse resource limit that makes /dev/fuse available inside of the main container
- the git-sync init container that checks out the Git repository into the shared code volume
Starting from something like Argo’s “Hello world Workflow”, we can transpose the POD defined above to something like that:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: simple-build-
spec:
  entrypoint: buildah
  templates:
  - name: buildah
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
    nodeSelector:
      kubernetes.io/arch: "amd64"
    container:
      image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
      command: ["/bin/sh"]
      args: ["-c", "cd code; cd $(readlink checkout); buildah bud -t guestbook ."]
      volumeMounts:
      - name: code
        mountPath: /code
      resources:
        limits:
          github.com/fuse: 1
    initContainers:
    - name: git-sync
      image: k8s.gcr.io/git-sync/git-sync:v3.1.7
      args: [
        "--one-time",
        "--depth", "1",
        "--dest", "checkout",
        "--repo", "https://github.com/flavio/guestbook-go.git",
        "--branch", "master"]
      volumeMounts:
      - name: code
        mountPath: /tmp/git
    volumes:
    - name: code
      emptyDir:
        medium: Memory
As you can see the POD definition has been transformed into a Template
object. The contents of the POD spec
section have been basically copied and pasted under the Template.
The POD annotations have been moved straight under the template.metadata
section.
I have to admit this was pretty confusing to me in the beginning, but everything became clear once I started to look at the field documentation of the Argo resources.
The workflow can be submitted using the argo
cli tool:
$ argo submit workflow-simple-build.yaml
Name: simple-build-qk4t4
Namespace: argo
ServiceAccount: default
Status: Pending
Created: Wed Sep 30 15:45:20 +0200 (now)
This will be visible also from the Argo Workflow UI:
The previous Workflow definition can be cleaned up a bit, leading to the following YAML file:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: simple-build-
spec:
  entrypoint: buildah
  templates:
  - name: buildah
    inputs:
      parameters:
      - name: arch
      - name: repository
      - name: branch
      - name: image_name
      - name: image_tag
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
    nodeSelector:
      kubernetes.io/arch: "amd64"
    script:
      image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
      command: [bash]
      source: |
        set -xe
        cd /code/
        # needed to workaround protected_symlink - we can't just cd into /code/checkout
        cd $(readlink checkout)
        buildah bud -t {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}} .
        buildah push --cert-dir /certs {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}}
        echo Image built and pushed to remote registry
      volumeMounts:
      - name: code
        mountPath: /code
      - name: certs
        mountPath: /certs
        readOnly: true
      resources:
        limits:
          github.com/fuse: 1
    initContainers:
    - name: git-sync
      image: k8s.gcr.io/git-sync/git-sync:v3.1.7
      args: [
        "--one-time",
        "--depth", "1",
        "--dest", "checkout",
        "--repo", "{{inputs.parameters.repository}}",
        "--branch", "{{inputs.parameters.branch}}"]
      volumeMounts:
      - name: code
        mountPath: /tmp/git
    volumes:
    - name: code
      emptyDir:
        medium: Memory
    - name: certs
      secret:
        secretName: registry-cert
Compared to the previous definition, this one doesn’t have any hard-coded value inside of it. The details of the Git repository, the image name, the container registry,… all of that is now passed dynamically to the template by using the inputs.parameters map.
The main
container has also been rewritten to use an Argo Workflow specific field: script.source
. This
is really handy because it provides a nice way to write a bash script to be executed inside the
container.
The source
script has been also extended to perform a push
operation at the
end of the build process.
As you can see the architecture of the image is appended to the tag of the image.
This is a common pattern used when building multi-architecture container images.
One final note about the push operation. The destination registry is secured using a self-signed certificate. Because of that, either the CA that signed the certificate or the registry’s certificate has to be provided to buildah. This can be done by using the --cert-dir flag and by placing the certificates to be loaded under the specified path.
Note well, the certificate files must have the .crt file extension, otherwise they won’t be handled.
I “loaded” the certificate into Kubernetes by using a Kubernetes secret like this one:
apiVersion: v1
kind: Secret
metadata:
  name: registry-cert
  namespace: argo
type: Opaque
data:
  ca.crt: `base64 -w 0 actualcert.crt`
As you can see the main
container is now mounting the contents of the registry-cert
Kubernetes Secret under /certs
.
This time, when submitting the workflow, we must specify its parameters:
$ argo submit workflow-simple-build-2.yaml \
-p arch=amd64 \
-p repository=https://github.com/flavio/guestbook-go.git \
-p branch=master \
-p image_name=registry-testing.svc.lan/guestbook-go \
-p image_tag=0.0.1
Name: simple-build-npqdw
Namespace: argo
ServiceAccount: default
Status: Pending
Created: Wed Sep 30 15:52:06 +0200 (now)
Parameters:
arch: {1 0 amd64}
repository: {1 0 https://github.com/flavio/guestbook-go.git}
branch: {1 0 master}
image_name: {1 0 registry-testing.svc.lan/guestbook-go}
image_tag: {1 0 0.0.1}
The Workflow object defined so far is still hard-coded to be scheduled only
on x86_64 nodes (see the nodeSelector
constraint).
I could create a new Workflow definition by copying one shown before and then
change the nodeSelector
constraint to reference the
ARM64 architecture. However, this would violate the
DRY principle.
Instead, I will abstract the Workflow definition by leveraging a feature of
Argo Workflow called
loops.
I will define a parameter for the target architecture and then I will iterate
over two possible values: amd64
and arm64
.
This is the resulting Workflow definition:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: simple-build-
spec:
  entrypoint: build-images-arch-loop
  templates:
  - name: build-images-arch-loop
    inputs:
      parameters:
      - name: repository
      - name: branch
      - name: image_name
      - name: image_tag
    steps:
    - - name: build-image
        template: buildah
        arguments:
          parameters:
          - name: arch
            value: "{{item.arch}}"
          - name: repository
            value: "{{inputs.parameters.repository}}"
          - name: branch
            value: "{{inputs.parameters.branch}}"
          - name: image_name
            value: "{{inputs.parameters.image_name}}"
          - name: image_tag
            value: "{{inputs.parameters.image_tag}}"
        withItems:
        - { arch: 'amd64' }
        - { arch: 'arm64' }
  - name: buildah
    inputs:
      parameters:
      - name: arch
      - name: repository
      - name: branch
      - name: image_name
      - name: image_tag
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
    nodeSelector:
      kubernetes.io/arch: "{{inputs.parameters.arch}}"
    script:
      image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
      command: [bash]
      source: |
        set -xe
        cd /code/
        # needed to workaround protected_symlink - we can't just cd into /code/checkout
        cd $(readlink checkout)
        buildah bud -t {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}} .
        buildah push --cert-dir /certs {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}}
        echo Image built and pushed to remote registry
      volumeMounts:
      - name: code
        mountPath: /code
      - name: certs
        mountPath: /certs
        readOnly: true
      resources:
        limits:
          github.com/fuse: 1
    initContainers:
    - name: git-sync
      image: k8s.gcr.io/git-sync/git-sync:v3.1.7
      args: [
        "--one-time",
        "--depth", "1",
        "--dest", "checkout",
        "--repo", "{{inputs.parameters.repository}}",
        "--branch", "{{inputs.parameters.branch}}"]
      volumeMounts:
      - name: code
        mountPath: /tmp/git
    volumes:
    - name: code
      emptyDir:
        medium: Memory
    - name: certs
      secret:
        secretName: registry-cert
The workflow definition grew a bit. I’ve added a new template called build-images-arch-loop
, which is now
the entry point of the workflow. This template performs a loop over the
[ { arch: 'amd64' }, { arch: 'arm64' } ]
array, each time invoking the buildah
template with slightly different input parameters. The only parameter that changes
across the invocations is the arch
one, which is used to define the
nodeSelector
constraint.
Executing this workflow results in two steps being executed at the same time: one building the image on a random x86_64 node, the other doing the same thing on a random ARM64 node.
This can be clearly seen from the Argo Workflow UI:
When the workflow execution is over, the registry will contain two different images:
<image-name>:<image-tag>-amd64
<image-name>:<image-tag>-arm64
Now there’s just one last step to perform: create a multi-architecture container manifest referencing these two images.
The Image manifest Version 2, Schema 2
specification defines a new type of image manifest called “Manifest list”
(application/vnd.docker.distribution.manifest.list.v2+json
).
Quoting the official specification:
The manifest list is the “fat manifest” which points to specific image manifests for one or more platforms. Its use is optional, and relatively few images will use one of these manifests. A client will distinguish a manifest list from an image manifest based on the Content-Type returned in the HTTP response.
The creation of such a manifest is pretty easy and it can be done with docker, podman and buildah in a similar way.
I will still use buildah to create the manifest and push it to the registry where all the images are stored.
This is the Argo Template that takes care of that:
- name: create-manifest
  inputs:
    parameters:
    - name: image_name
    - name: image_tag
    - name: architectures
  metadata:
    annotations:
      container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
  volumes:
  - name: certs
    secret:
      secretName: registry-cert
  script:
    image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
    command: [bash]
    source: |
      set -xe
      image_name="{{inputs.parameters.image_name}}"
      image_tag="{{inputs.parameters.image_tag}}"
      architectures="{{inputs.parameters.architectures}}"
      target="${image_name}:${image_tag}"
      architectures_list=($(echo $architectures | tr "," "\n"))
      buildah manifest create ${target}
      # add each architecture-specific image to the manifest
      for arch in "${architectures_list[@]}"
      do
        arch_image="${image_name}:${image_tag}-${arch}"
        buildah pull --cert-dir /certs ${arch_image}
        buildah manifest add ${target} ${arch_image}
      done
      buildah manifest push --cert-dir /certs ${target} docker://${target}
      echo Manifest creation done
    volumeMounts:
    - name: certs
      mountPath: /certs
      readOnly: true
    resources:
      limits:
        github.com/fuse: 1
The template has an input parameter called architectures; this string is made of the architecture names joined by a comma, e.g. "amd64,arm64".
The script
creates a manifest with the name of the image and then, iterating
over the architectures, it adds the architecture-specific images to it.
Once this is done the manifest is pushed to the container registry.
To make a simple example, assuming the following scenario:
- the guestbook-go application with release v0.1.0
- the registry.svc.lan registry

The Argo Template that creates the manifest will pull the following images:
- registry.svc.lan/guestbook-go:v0.1.0-amd64: the x86_64 image
- registry.svc.lan/guestbook-go:v0.1.0-arm64: the ARM64 image

Finally, the Template will create and push a manifest named registry.svc.lan/guestbook-go:v0.1.0.
This image reference will always return the right container image to the node
requesting it.
Adding the container image to the manifest is done with the buildah manifest add command. This command doesn’t actually need to have the container image available locally; it would be enough to reach out to the registry hosting it to obtain the manifest digest.
In our case the images are stored on a registry secured with
a custom certificate. Unfortunately, the manifest add
command
was lacking some flags (like the cert
one); because of that I had
to introduce the workaround of pre-pulling all the images referenced
by the manifest. This has the side effect of wasting some time, bandwidth and
disk space.
I’ve submitted patches both to buildah
and to podman to enrich their
manifest add
commands; both pull requests have been merged into the master
branches. The next release of buildah will ship with my patch and the
manifest creation Template will be simpler and faster.
Argo allows you to define a workflow sequence with clear dependencies between each step. This is done by defining a DAG.
Our workflow will be made of one Argo Template of type DAG, which will have two tasks:
- build-images: invokes the build-images-arch-loop template to build the per-architecture images
- create-multi-arch-manifest: invokes the create-manifest template once all the images have been built
This is the Template definition:
- name: full-process
  dag:
    tasks:
    - name: build-images
      template: build-images-arch-loop
      arguments:
        parameters:
        - name: repository
          value: "{{workflow.parameters.repository}}"
        - name: branch
          value: "{{workflow.parameters.branch}}"
        - name: image_name
          value: "{{workflow.parameters.image_name}}"
        - name: image_tag
          value: "{{workflow.parameters.image_tag}}"
    - name: create-multi-arch-manifest
      dependencies: [build-images]
      template: create-manifest
      arguments:
        parameters:
        - name: image_name
          value: "{{workflow.parameters.image_name}}"
        - name: image_tag
          value: "{{workflow.parameters.image_tag}}"
        - name: architectures
          value: "{{workflow.parameters.architectures_string}}"
As you can see the Template takes the usual series of parameters we’ve already defined, and forwards them to the tasks.
This is the full definition of our Argo workflow, hold on… this is really long 🙀
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: build-multi-arch-image-
spec:
  ttlStrategy:
    secondsAfterCompletion: 60
  entrypoint: full-process
  arguments:
    parameters:
    - name: repository
      value: https://github.com/flavio/guestbook-go.git
    - name: branch
      value: master
    - name: image_name
      value: registry-testing.svc.lan/guestbook
    - name: image_tag
      value: 0.0.1
    - name: architectures_string
      value: "arm64,amd64"
  templates:
  - name: full-process
    dag:
      tasks:
      - name: build-images
        template: build-images-arch-loop
        arguments:
          parameters:
          - name: repository
            value: "{{workflow.parameters.repository}}"
          - name: branch
            value: "{{workflow.parameters.branch}}"
          - name: image_name
            value: "{{workflow.parameters.image_name}}"
          - name: image_tag
            value: "{{workflow.parameters.image_tag}}"
      - name: create-multi-arch-manifest
        dependencies: [build-images]
        template: create-manifest
        arguments:
          parameters:
          - name: image_name
            value: "{{workflow.parameters.image_name}}"
          - name: image_tag
            value: "{{workflow.parameters.image_tag}}"
          - name: architectures
            value: "{{workflow.parameters.architectures_string}}"
  - name: build-images-arch-loop
    inputs:
      parameters:
      - name: repository
      - name: branch
      - name: image_name
      - name: image_tag
    steps:
    - - name: build-image
        template: buildah
        arguments:
          parameters:
          - name: arch
            value: "{{item.arch}}"
          - name: repository
            value: "{{inputs.parameters.repository}}"
          - name: branch
            value: "{{inputs.parameters.branch}}"
          - name: image_name
            value: "{{inputs.parameters.image_name}}"
          - name: image_tag
            value: "{{inputs.parameters.image_tag}}"
        withItems:
        - { arch: 'amd64' }
        - { arch: 'arm64' }
  - name: buildah
    inputs:
      parameters:
      - name: arch
      - name: repository
      - name: branch
      - name: image_name
      - name: image_tag
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
    nodeSelector:
      kubernetes.io/arch: "{{inputs.parameters.arch}}"
    volumes:
    - name: code
      emptyDir:
        medium: Memory
    - name: certs
      secret:
        secretName: registry-cert
    script:
      image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
      command: [bash]
      source: |
        set -xe
        cd /code/
        # needed to workaround protected_symlink - we can't just cd into /code/checkout
        cd $(readlink checkout)
        buildah bud -t {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}} .
        buildah push --cert-dir /certs {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}}
        echo Image built and pushed to remote registry
      volumeMounts:
      - name: code
        mountPath: /code
      - name: certs
        mountPath: /certs
        readOnly: true
      resources:
        limits:
          github.com/fuse: 1
    initContainers:
    - name: git-sync
      image: k8s.gcr.io/git-sync/git-sync:v3.1.7
      args: [
        "--one-time",
        "--depth", "1",
        "--dest", "checkout",
        "--repo", "{{inputs.parameters.repository}}",
        "--branch", "{{inputs.parameters.branch}}"]
      volumeMounts:
      - name: code
        mountPath: /tmp/git
  - name: create-manifest
    inputs:
      parameters:
      - name: image_name
      - name: image_tag
      - name: architectures
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
    volumes:
    - name: certs
      secret:
        secretName: registry-cert
    script:
      image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
      command: [bash]
      source: |
        set -xe
        image_name="{{inputs.parameters.image_name}}"
        image_tag="{{inputs.parameters.image_tag}}"
        architectures="{{inputs.parameters.architectures}}"
        target="${image_name}:${image_tag}"
        architectures_list=($(echo $architectures | tr "," "\n"))
        buildah manifest create ${target}
        # add each architecture-specific image to the manifest
        for arch in "${architectures_list[@]}"
        do
          arch_image="${image_name}:${image_tag}-${arch}"
          buildah pull --cert-dir /certs ${arch_image}
          buildah manifest add ${target} ${arch_image}
        done
        buildah manifest push --cert-dir /certs ${target} docker://${target}
        echo Manifest creation done
      volumeMounts:
      - name: certs
        mountPath: /certs
        readOnly: true
      resources:
        limits:
          github.com/fuse: 1
That’s how life goes with Kubernetes, sometimes there’s just a lot of YAML…
Now we can submit the workflow to Argo:
$ argo submit build-pipeline-final.yml
Name: build-multi-arch-image-wndlr
Namespace: argo
ServiceAccount: default
Status: Pending
Created: Thu Oct 01 16:22:46 +0200 (now)
Parameters:
repository: {1 0 https://github.com/flavio/guestbook-go.git}
branch: {1 0 master}
image_name: {1 0 registry-testing.svc.lan/guestbook}
image_tag: {1 0 0.0.1}
architectures_string: {1 0 arm64,amd64}
The visual representation of the workflow is pretty nice:
As you might have noticed, I didn’t provide any parameter to argo submit
; the
Argo Workflow now has default values for all the input parameters.
Something worth noting: Argo Workflow leaves behind all the containers it creates. This is good to triage failures, but I don’t want to clutter my cluster with all these resources.
Argo provides cost optimization parameters to implement cleanup strategies. The one I’ve used above is the Workflow TTL Strategy.
You can see these lines at the top of the full Workflow definition:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: build-multi-arch-image-
spec:
  ttlStrategy:
    secondsAfterCompletion: 60
This triggers an automatic cleanup of all the PODs spawned by the Workflow 60 seconds after its completion, be it successful or not.
Today we have seen how to create a pipeline that builds container images for multiple architectures on top of an existing Kubernetes cluster.
Argo Workflow proved to be a good solution for this kind of automation. There’s quite some YAML involved, but I highly doubt other projects would have spared us from that.
What can we do next? Well, to me the answer is pretty clear. The definition of the container image is stored inside of a Git repository; hence I want to connect my Argo Workflow to the events happening inside of the Git repository.
Stay tuned for more updates! In the meantime feedback is always welcome.
The overall support of ARM inside of the container ecosystem improved a lot over the last years, with more container images made available for the armv7 and the arm64 architectures.
But what about my own container images? I’m running some homemade containerized applications on top of this cluster and I would like to have them scheduled both on the x86_64 nodes and on the ARM ones.
There are many ways to build ARM container images. You can go from something as simple, and tedious, as performing manual builds on real or emulated ARM machines, or you can do something more structured like using this GitHub Action, relying on something like the Open Build Service,…
My personal desire was to leverage my mixed Kubernetes cluster and perform the image building right on top of it.
Implementing this design has been a great learning experience, something IMHO worth sharing with others. The journey has been too long to fit into a single blog post; I’ll split my story into multiple posts.
Our journey begins with the challenge of building a container image from within a container.
The best-known way to build a container image is by using docker build. I didn’t want to use docker to build my images because the build process will take place right on top of Kubernetes, meaning the build will happen in a containerized way.
Some people are using docker as the container runtime of their Kubernetes clusters and are leveraging that to mount the docker socket inside of some of their containers. Once the docker socket is mounted, the containerized application has full access to the docker daemon that is running on the host. From there it’s game over: the container can perform actions such as building new images.
I’m a strong opponent of this approach because it’s highly insecure. Moreover I’m not using docker as container runtime and I guess many people will stop doing that in the near future once dockershim gets deprecated. Translated: the majority of future Kubernetes clusters will either have containerd, CRI-O or something similar instead of docker - hence bye bye to the docker socket hack.
There are however many other ways to build containers that are not based on
docker build
.
If you do a quick internet search about containerized image building you will definitely find kaniko. kaniko does exactly what I want: it performs containerized builds without using the docker daemon. There are also many examples covering image building on top of Kubernetes with kaniko. Unfortunately, at the time of writing, kaniko supports only the x86_64 architecture.
Our chances are not over yet because there’s another container building tool that can help us: buildah.
Buildah is part of the “libpod ecosystem”, which includes projects such as podman, skopeo and CRI-O. All these tools are available for multiple architectures: x86_64, aarch64 (aka ARM64), s390x and ppc64le.
Buildah can build container images starting from a Dockerfile
or in a
more interactive way. All of that without requiring any privileged daemon
running on your system.
During the last years the buildah developers spent quite some efforts to support the use case of “containerized buildah”. This is just the most recent blog post that discusses this scenario in depth.
Upstream even has a Dockerfile that can be used to create a buildah container image. This can be found here.
I took this Dockerfile, made some minor adjustments and uploaded it to this project on the Open Build Service. As a result I got a multi-architecture container image that can be pulled from registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest.
As some container veterans probably know, there are several types of storage drivers that can be used by container engines.
In case you’re not familiar with this topic, you can read Docker’s great documentation pages about storage drivers.
Note well: despite being written for the docker container engine, this applies also to podman, buildah, CRI-O and containerd.
The most portable and performant storage driver is the overlay
one.
This is the one we want to use when running buildah containerized.
The overlay
driver can be used in safe way even inside of a container by
leveraging fuse-overlay; this
is described by the buildah blog post I linked above.
However, using the overlay
storage driver inside of a container requires
Fuse to be enabled on the host and, most important of all, it requires the
/dev/fuse
device to be accessible by the container.
The share operation cannot be done by simply mounting /dev/fuse
as a volume
because there are some extra “low level” steps that must be done (like properly
instructing the cgroup device hierarchy).
These extra steps are automatically handled by docker and podman via the
--device
flag of the run
command:
$ podman run --rm -ti --device /dev/fuse buildahimage bash
This problem will need to be solved in a different way when buildah is run on top of Kubernetes.
Special host devices can be shared with containers running inside of a Kubernetes POD by using a recent feature called Kubernetes device plugins.
Quoting the upstream documentation:
Kubernetes provides a device plugin framework that you can use to advertise system hardware resources to the Kubelet.
Instead of customizing the code for Kubernetes itself, vendors can implement a device plugin that you deploy either manually or as a DaemonSet. The targeted devices include GPUs, high-performance NICs, FPGAs, InfiniBand adapters, and other similar computing resources that may require vendor specific initialization and setup.
This Kubernetes feature is commonly used to allow containerized machine learning workloads to access the GPU cards available on the host.
Luckily someone wrote a Kubernetes device plugin that exposes /dev/fuse
to
Kubernetes-managed containers:
fuse-device-plugin.
I’ve forked the project, made some minor fixes to its Dockerfile and created a GitHub action to build the container image for amd64, armv7 and arm64 (a PR is coming soon).
The images are available on the Docker Hub as flavio/fuse-device-plugin.
The fuse-device-plugin has to be deployed as a Kubernetes DaemonSet via this yaml file:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fuse-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fuse-device-plugin-ds
  template:
    metadata:
      labels:
        name: fuse-device-plugin-ds
    spec:
      hostNetwork: true
      containers:
      - image: flavio/fuse-device-plugin:latest
        name: fuse-device-plugin-ctr
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
        volumeMounts:
        - name: device-plugin
          mountPath: /var/lib/kubelet/device-plugins
      volumes:
      - name: device-plugin
        hostPath:
          path: /var/lib/kubelet/device-plugins
This is basically this file,
with the flavio/fuse-device-plugin
image being used instead of the
original one (which is built only for x86_64).
Once the DaemonSet PODs are running on all the nodes of the cluster, we can see
the Fuse device being exposed as an allocatable resource identified by the
github.com/fuse
key:
$ kubectl get nodes -o=jsonpath=$'{range .items[*]}{.metadata.name}: {.status.allocatable}\n{end}'
jam-2: map[cpu:4 ephemeral-storage:224277137028 github.com/fuse:5k memory:3883332Ki pods:110]
jam-1: map[cpu:4 ephemeral-storage:111984762997 github.com/fuse:5k memory:3883332Ki pods:110]
jolly: map[cpu:4 ephemeral-storage:170873316014 github.com/fuse:5k gpu.intel.com/i915:1 hugepages-1Gi:0 hugepages-2Mi:0 memory:16208280Ki pods:110]
The Fuse device can then be made available to a container by specifying a resource limit:
apiVersion: v1
kind: Pod
metadata:
  name: fuse-example
spec:
  containers:
  - name: main
    image: alpine
    command: ["ls", "-l", "/dev"]
    resources:
      limits:
        github.com/fuse: 1
If you look at the logs of this POD you will see something like that:
$ kubectl logs fuse-example
total 0
lrwxrwxrwx 1 root root 11 Sep 15 08:31 core -> /proc/kcore
lrwxrwxrwx 1 root root 13 Sep 15 08:31 fd -> /proc/self/fd
crw-rw-rw- 1 root root 1, 7 Sep 15 08:31 full
crw-rw-rw- 1 root root 10, 229 Sep 15 08:31 fuse
drwxrwxrwt 2 root root 40 Sep 15 08:31 mqueue
crw-rw-rw- 1 root root 1, 3 Sep 15 08:31 null
lrwxrwxrwx 1 root root 8 Sep 15 08:31 ptmx -> pts/ptmx
drwxr-xr-x 2 root root 0 Sep 15 08:31 pts
crw-rw-rw- 1 root root 1, 8 Sep 15 08:31 random
drwxrwxrwt 2 root root 40 Sep 15 08:31 shm
lrwxrwxrwx 1 root root 15 Sep 15 08:31 stderr -> /proc/self/fd/2
lrwxrwxrwx 1 root root 15 Sep 15 08:31 stdin -> /proc/self/fd/0
lrwxrwxrwx 1 root root 15 Sep 15 08:31 stdout -> /proc/self/fd/1
-rw-rw-rw- 1 root root 0 Sep 15 08:31 termination-log
crw-rw-rw- 1 root root 5, 0 Sep 15 08:31 tty
crw-rw-rw- 1 root root 1, 9 Sep 15 08:31 urandom
crw-rw-rw- 1 root root 1, 5 Sep 15 08:31 zero
Now that this problem is solved we can move to the next one. 😉
The source code of the “container image to be built” must be made available to the containerized buildah.
As many people do, I keep all my container definitions versioned inside of Git repositories. I had to find a way to clone the Git repository holding the definition of the “container image to be built” inside of the container running buildah.
I decided to settle for this POD layout:
- an init container that performs the git clone of the source code of the “container image to be built” before the main container is started
- a main container that runs buildah against the checked out sources
must be placed into a
directory that can be accessed later on by the main container.
I decided to use a Kubernetes volume of type emptyDir
to create a shared storage between the init and the main containers.
The emptyDir
volume is just perfect: it doesn’t need any fancy Kubernetes
Storage Class and it will automatically vanish once the build is done.
To checkout the Git repository I decided to settle on the official Kubernetes git-sync container.
Quoting its documentation:
git-sync is a simple command that pulls a git repository into a local directory. It is a perfect “sidecar” container in Kubernetes - it can periodically pull files down from a repository so that an application can consume them.
git-sync can pull one time, or on a regular interval. It can pull from the HEAD of a branch, from a git tag, or from a specific git hash. It will only re-pull if the target of the run has changed in the upstream repository. When it re-pulls, it updates the destination directory atomically. In order to do this, it uses a git worktree in a subdirectory of the –root and flips a symlink.
git-sync can pull over HTTP(S) (with authentication or not) or SSH.
This is just what I was looking for.
I will start git-sync with the following parameters:
- --one-time: this is needed to make git-sync exit once the checkout is done; otherwise it will keep running forever and it will periodically look for new commits inside of the repository. I don’t need that, plus this would cause the main container to wait indefinitely for the init container to exit.
- --depth 1: this is done to limit the checkout to the latest commit. I’m not interested in the history of the repository. This will make the checkout faster and use less bandwidth and disk space.
- --repo &lt;my-repo&gt;: the repo I want to checkout.
- --branch &lt;my-branch&gt;: the branch to checkout.
While waiting for the issue to be fixed I just rebuilt the container image on the Open Build Service. This is no longer needed, everybody can just use the official image.
It’s now time to perform a simple test run. We will define a simple Kubernetes POD that will:
- check out the Git repository holding the image definition using a git-sync init container
- build the container image using buildah

This is the POD definition:
apiVersion: v1
kind: Pod
metadata:
  name: builder-amd64
spec:
  nodeSelector:
    kubernetes.io/arch: "amd64"
  initContainers:
  - name: git-sync
    image: k8s.gcr.io/git-sync/git-sync:v3.1.7
    args: [
      "--one-time",
      "--depth", "1",
      "--dest", "checkout",
      "--repo", "https://github.com/flavio/guestbook-go.git",
      "--branch", "master"]
    volumeMounts:
    - name: code
      mountPath: /tmp/git
  volumes:
  - name: code
    emptyDir:
      medium: Memory
  containers:
  - name: main
    image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
    command: ["/bin/sh"]
    args: ["-c", "cd code; cd $(readlink checkout); buildah bud -t guestbook ."]
    volumeMounts:
    - name: code
      mountPath: /code
    resources:
      limits:
        github.com/fuse: 1
Let’s break it down into pieces.
The POD uses a Kubernetes node selector to ensure the build happens on a node with the x86_64 architecture. By doing that we will know the architecture of the final image.
As said earlier, the Git repository is checked out using an init container:
initContainers:
- name: git-sync
  image: k8s.gcr.io/git-sync/git-sync:v3.1.7
  args: [
    "--one-time",
    "--depth", "1",
    "--dest", "checkout",
    "--repo", "https://github.com/flavio/guestbook-go.git",
    "--branch", "master"]
  volumeMounts:
  - name: code
    mountPath: /tmp/git
The Git repository and the branch are currently hard-coded into the POD definition, this is going to be fixed later on. Right now that’s good enough to see if things are working (spoiler alert: they won’t 😅).
The git-sync container will run before the main
container and it will write
the source code of the “container image to be built” inside of a Kubernetes
volume named code
.
This is how the volume will look after git-sync has run:
$ ls -lh <root of the volume>
drwxr-xr-x 9 65533 65533 300 Sep 15 09:41 .git
lrwxrwxrwx 1 65533 65533 44 Sep 15 09:41 checkout -> rev-155a69b7f81d5b010c5468a2edfbe9228b758d64
drwxr-xr-x 6 65533 65533 280 Sep 15 09:41 rev-155a69b7f81d5b010c5468a2edfbe9228b758d64
The source code is stored under the rev-<git commit ID>
directory. There’s
a symlink named checkout
that points to it. As you will see later, this will
lead to a small twist.
The source code of our application is stored inside of a Kubernetes volume of
type emptyDir
:
volumes:
- name: code
  emptyDir:
    medium: Memory
I’ve also instructed Kubernetes to store the volume in memory. Behind the scene Kubelet will use tmpfs to do that.
The POD will have just one container running inside of it. This is called main
and its only purpose is to run buildah.
This is the definition of the container:
containers:
- name: main
  image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
  command: ["/bin/sh"]
  args: ["-c", "cd /code; cd $(readlink checkout); buildah bud -t guestbook ."]
  volumeMounts:
  - name: code
    mountPath: /code
  resources:
    limits:
      github.com/fuse: 1
As expected the container is mounting the code
Kubernetes volume too. Moreover,
the container is requesting one resource of type github.com/fuse
; as explained
above this is needed to make /dev/fuse
available inside of the container.
The container executes a simple bash script. The one-liner can be expanded to this:
cd /code
cd $(readlink checkout)
buildah bud -t guestbook .
There’s one interesting detail in there. As you can see I’m not “cd-ing” straight
into /code/checkout
, instead I’m moving into /code
and then resolving
the actual target of the checkout
symlink.
We can’t move straight into /code/checkout
because that would give us an
error:
builder:/ # cd /code/checkout
bash: cd: /code/checkout: Permission denied
This happens because /proc/sys/fs/protected_symlinks
is turned on by default.
As you can read here, this is a way to protect from specific types of exploits. Not even root inside of the container can jump straight into /code/checkout; this is why I'm doing this workaround.
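For reference, this is how you can check the setting on the host. Disabling it would also make the error go away, but I prefer to keep the protection enabled and use the readlink workaround (the sysctl write below is shown only for illustration):
$ cat /proc/sys/fs/protected_symlinks
1
$ sudo sysctl -w fs.protected_symlinks=0   # not recommended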
One last note, as you have probably noticed, buildah is just building the container image, it’s not pushing it to any registry. We don’t care about that right now.
Our journey is not over yet, there’s one last challenge ahead of us.
Before digging into the issue, let me provide some background. My local cluster was initially made of one x86_64 node running openSUSE Leap 15.2 and two ARM64 nodes running the beta ARM64 build of Raspberry Pi OS (formerly known as Raspbian).
I used the POD definition shown above to define two PODs:
- builder-amd64: the nodeSelector constraint targets the amd64 architecture
- builder-arm64: the nodeSelector constraint targets the arm64 architecture

That led to an interesting finding: the builds on the ARM64 nodes worked fine, while all the builds on the x86_64 node failed.
The failure was always the same and happened straight at the beginning of the process:
$ kubectl logs -f builder-amd64
mount /var/lib/containers/storage/overlay:/var/lib/containers/storage/overlay, flags: 0x1000: permission denied
level=error msg="exit status 125"
To me, that immediately smelled like a security feature blocking buildah.
I needed something faster than kubectl to iterate on this problem.
Luckily I was able to reproduce the same error while running buildah locally
using podman:
$ sudo podman run \
--rm \
--device /dev/fuse \
-v <path-to-container-image-sources>:/code \
registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest \
/bin/sh -c "cd /code; buildah bud -t foo ."
I was pretty sure the failure happened due to some tight security check. To prove my theory I ran the same container in privileged mode:
$ sudo podman run \
--rm \
--device /dev/fuse \
--privileged \
-v <path-to-container-image-sources>:/code \
registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest \
/bin/sh -c "cd /code; buildah bud -t foo ."
The build completed successfully. Running a container in privileged mode is bad and it pains me; it’s not a long-term solution, but at least it proved the build failure was definitely caused by some security constraint.
The next step was to identify the security measure at the origin of the failure. That could be something related either to seccomp or to AppArmor. I immediately ruled out SELinux as the root cause because it’s not used on openSUSE by default.
I then ran the container again, but this time I instructed podman to not apply any kind of seccomp profile; I basically disabled seccomp for my containerized workload.
This can be done by using the unconfined
mode for seccomp:
$ sudo podman run \
--rm \
--device /dev/fuse \
-v <path-to-container-image-sources>:/code \
--security-opt=seccomp=unconfined \
registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest \
/bin/sh -c "cd /code; buildah bud -t foo ."
The build failed again with the same error. That meant seccomp was not causing the failure. AppArmor was left as the main suspect.
Next, I ran the container again, but this time I instructed podman not to apply any kind of AppArmor profile; again, I basically disabled AppArmor for my containerized workload.
This can be done by using the unconfined
mode for AppArmor:
$ sudo podman run \
--rm \
--device /dev/fuse \
-v <path-to-container-image-sources>:/code \
--security-opt=apparmor=unconfined \
registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest \
/bin/sh -c "cd /code; buildah bud -t foo ."
This time the build completed successfully. Hence the issue was caused by the default AppArmor profile.
All the container engines (docker, podman, CRI-O, containerd) have an AppArmor profile that is applied to all the containerized workloads by default.
The containerized Buildah is probably doing something that is not allowed by this generic profile. I just had to identify the offending operation and create a new tailor-made AppArmor profile for buildah.
As a first step I had to obtain the default AppArmor profile. This is not as easy as it might seem. The profile is generated at runtime by all the container engines and is loaded into the kernel. Unfortunately there’s no way to dump the information stored into the kernel and have a human-readable AppArmor profile.
After some digging into the source code of podman and some reading on docker’s GitHub issues, I produced a quick PR that allowed me to print the default AppArmor profile to stdout.
This is the default AppArmor profile used by podman:
#include <tunables/global>
profile default flags=(attach_disconnected,mediate_deleted) {
#include <abstractions/base>
network,
capability,
file,
umount,
# Allow signals from privileged profiles and from within the same profile
signal (receive) peer=unconfined,
signal (send,receive) peer=default,
deny @{PROC}/* w, # deny write for all files directly in /proc (not in a subdir)
# deny write to files not in /proc/<number>/** or /proc/sys/**
deny @{PROC}/{[^1-9],[^1-9][^0-9],[^1-9s][^0-9y][^0-9s],[^1-9][^0-9][^0-9][^0-9]*}/** w,
deny @{PROC}/sys/[^k]** w, # deny /proc/sys except /proc/sys/k* (effectively /proc/sys/kernel)
deny @{PROC}/sys/kernel/{?,??,[^s][^h][^m]**} w, # deny everything except shm* in /proc/sys/kernel/
deny @{PROC}/sysrq-trigger rwklx,
deny @{PROC}/kcore rwklx,
deny mount,
deny /sys/[^f]*/** wklx,
deny /sys/f[^s]*/** wklx,
deny /sys/fs/[^c]*/** wklx,
deny /sys/fs/c[^g]*/** wklx,
deny /sys/fs/cg[^r]*/** wklx,
deny /sys/firmware/** rwklx,
deny /sys/kernel/security/** rwklx,
# suppress ptrace denials when using using 'ps' inside a container
ptrace (trace,read) peer=default,
}
A small aside: this AppArmor profile is the same one generated by all the other container engines. Some poor folks keep this file in sync manually, but there’s a discussion upstream to better organize things.
Back to the build failure caused by AppArmor… I saved the default profile into
a text file named containerized_buildah
and I changed this line
profile default flags=(attach_disconnected,mediate_deleted) {
to look like that:
profile containerized_buildah flags=(attach_disconnected,mediate_deleted,complain) {
This changes the name of the profile and, most importantly, switches the policy mode from enforcement to complain.
Quoting the AppArmor man page:
- enforcement - Profiles loaded in enforcement mode will result in enforcement of the policy defined in the profile as well as reporting policy violation attempts to syslogd.
- complain - Profiles loaded in “complain” mode will not enforce policy. Instead, it will report policy violation attempts. This mode is convenient for developing profiles.
I then loaded the policy by doing:
$ sudo apparmor_parser -r containerized_buildah
Invoking the aa-status command reports a list of all the profiles loaded, their policy mode and all the processes confined by AppArmor.
$ sudo aa-status
...
2 profiles are in complain mode.
containerized_buildah
...
One last operation had to be done before I could start debugging the containerized buildah: turn off “audit quieting”. Again, straight from AppArmor’s man page:
Turn off deny audit quieting
By default, operations that trigger “deny” rules are not logged. This is called deny audit quieting.
To turn off deny audit quieting, run:
echo -n noquiet >/sys/module/apparmor/parameters/audit
Before starting the container, I opened a new terminal to execute this process:
# tail -f /var/log/audit/audit.log | tee apparmor-build.log
On systems where auditd is running (like mine), all the AppArmor logs are sent to /var/log/audit/audit.log. This command allowed me to keep an eye on the live stream of audit logs and save them into a smaller file named apparmor-build.log.
Finally, I started the container using the custom AppArmor profile shown above:
$ sudo podman run \
--rm \
--device /dev/fuse \
-v <path-to-container-image-sources>:/code \
--security-opt=apparmor=containerized_buildah \
registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest \
/bin/sh -c "cd /code; buildah bud -t foo ."
The build completed successfully. Grepping for ALLOWED
inside of the audit file
returned a stream of entries like the following ones:
type=AVC msg=audit(1600172410.567:622): apparmor="ALLOWED" operation="mount" info="failed mntpnt match" error=-13 profile="containerized_buildah" name="/tmp/containers.o5iLtx" pid=25607 comm="exe" srcname="/usr/bin/buildah" flags="rw, bind"
type=AVC msg=audit(1600172410.567:623): apparmor="ALLOWED" operation="mount" info="failed mntpnt match" error=-13 profile="containerized_buildah" name="/tmp/containers.o5iLtx" pid=25607 comm="exe" flags="ro, remount, bind"
type=AVC msg=audit(1600172423.511:624): apparmor="ALLOWED" operation="mount" info="failed mntpnt match" error=-13 profile="containerized_buildah" name="/" pid=25629 comm="exe" flags="rw, rprivate"
...
As you can see all these entries are about mount
operations, with mount
being invoked with quite an assortment of flags.
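A quick way to get an overview of which operations show up in the log is to group the audit entries by operation; plain shell is enough for that:
$ grep ALLOWED apparmor-build.log | grep -o 'operation="[^"]*"' | sort | uniq -c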
The default AppArmor profile explicitly denies mount
operations:
...
deny mount,
...
All I had to do was to change the containerized_buildah
AppArmor profile to
that:
#include <tunables/global>
profile containerized_buildah flags=(attach_disconnected,mediate_deleted) {
#include <abstractions/base>
network,
capability,
file,
umount,
mount,
# Allow signals from privileged profiles and from within the same profile
signal (receive) peer=unconfined,
signal (send,receive) peer=default,
deny @{PROC}/* w, # deny write for all files directly in /proc (not in a subdir)
# deny write to files not in /proc/<number>/** or /proc/sys/**
deny @{PROC}/{[^1-9],[^1-9][^0-9],[^1-9s][^0-9y][^0-9s],[^1-9][^0-9][^0-9][^0-9]*}/** w,
deny @{PROC}/sys/[^k]** w, # deny /proc/sys except /proc/sys/k* (effectively /proc/sys/kernel)
deny @{PROC}/sys/kernel/{?,??,[^s][^h][^m]**} w, # deny everything except shm* in /proc/sys/kernel/
deny @{PROC}/sysrq-trigger rwklx,
deny @{PROC}/kcore rwklx,
deny /sys/[^f]*/** wklx,
deny /sys/f[^s]*/** wklx,
deny /sys/fs/[^c]*/** wklx,
deny /sys/fs/c[^g]*/** wklx,
deny /sys/fs/cg[^r]*/** wklx,
deny /sys/firmware/** rwklx,
deny /sys/kernel/security/** rwklx,
# suppress ptrace denials when using using 'ps' inside a container
ptrace (trace,read) peer=default,
}
The profile is now back to enforcement mode and, most important of all, it
allows any kind of mount
invocation.
I tried to be more granular and allow only the mount
flags
actually used by buildah, but the list was too long, there were too many
combinations and that seemed pretty fragile. The last thing I want to happen is
to have AppArmor break buildah in the future if a slightly different mount
operation is done.
Reloading the AppArmor profile via sudo apparmor_parser -r containerized_buildah
and restarting the build proved that the profile was doing its job also in
enforcement mode: the build successfully completed. 🎉🎉🎉
But is the journey over yet? Not quite…
Once I figured out the root cause of the x86_64 build failures there was one last mystery to be solved: why did the ARM64 builds work just fine? Why didn’t AppArmor cause any issue over there?
The answer was quite simple (and a bit shocking to me): it turned out the Raspberry Pi OS (formerly known as raspbian) ships a kernel that doesn’t have AppArmor enabled. I never realized that!
I didn’t find the idea of running containers without any form of Mandatory Access Control particularly thrilling. Hence I decided to change the operating system run on my Raspberry Pi nodes.
I initially picked Raspberry Pi OS because I wanted to have my Raspberry Pi 4 boot straight from an external USB disk instead of the internal memory card. At the time of writing, this feature requires a bleeding edge firmware and all the documentation points at Raspberry Pi OS. I just wanted to stick with what the community was using to reduce my chances of failure…
However, if you need AppArmor support, you’re left with two options: openSUSE and Ubuntu.
I installed openSUSE Leap 15.2 for aarch64 (aka ARM64) on one of my Raspberry Pi 4. The process of getting it to boot from USB was pretty straightforward. I added the node back into the Kubernetes cluster, forced some workloads to move on top of it and monitored its behaviour. Everything was great, I was ready to put openSUSE on my 2nd Raspberry Pi 4 when I noticed something strange: my room was quieter than the usual…
My Raspberry Pis are powered using the official PoE HAT. I love this hat, but I hate its built-in fan because it’s notoriously loud (yes, you can tune its thresholds, but it’s still damn noisy when it kicks in).
Well, my room was suddenly quieter because the fan of the PoE HAT was not spinning at all. That led the CPU temperature to reach more than 85 °C 😱
It turns out the PoE HAT needs a driver which is not part of the mainline kernel, and unfortunately nobody has added it to the openSUSE kernel yet. That means openSUSE doesn’t see the PoE HAT fan and never turns it on (not even at full speed).
I filed an enhancement bug report against openSUSE Tumbleweed to get the PoE HAT driver added to our kernel and moved over to Ubuntu. Unfortunately that was a blocking issue for me. What a pity 😢
On the other hand, the kernel of Ubuntu Server supports both the PoE HAT fan and AppArmor. After some testing I switched all my Raspberry Pi nodes to run Ubuntu 20.04 Server.
As a sanity check, I ran the builder-arm64 POD against the Ubuntu nodes using the default AppArmor profile. The build failed on ARM64 in the same way as it did on x86_64. What a relief 😅.
At this point I have a tailor-made AppArmor profile for buildah, plus all the nodes of my cluster have AppArmor support. It’s time to put all the pieces together!
The previous POD definition has to be extended to ensure the main container running buildah is using the tailor-made AppArmor profile instead of the default one.
Kubernetes’ AppArmor support is a bit primitive, but effective. The only requirement, when using custom profiles, is to ensure the profile is already known by the AppArmor system on each node of the cluster.
This can be done in an easy way: just copy the profile under /etc/apparmor.d and perform a systemctl reload apparmor. This has to be done only once; at the next boot the AppArmor service will automatically load all the profiles found inside of /etc/apparmor.d.
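Concretely, on each node that boils down to something like the following (using the profile file created earlier):
$ sudo cp containerized_buildah /etc/apparmor.d/
$ sudo systemctl reload apparmor
$ sudo aa-status | grep containerized_buildah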
This is how the final POD definition looks:
apiVersion: v1
kind: Pod
metadata:
name: builder-amd64
annotations:
container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
spec:
nodeSelector:
kubernetes.io/arch: "amd64"
containers:
- name: main
image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
command: ["/bin/sh"]
args: ["-c", "cd code; cd $(readlink checkout); buildah bud -t guestbook ."]
volumeMounts:
- name: code
mountPath: /code
resources:
limits:
github.com/fuse: 1
initContainers:
- name: git-sync
image: k8s.gcr.io/git-sync/git-sync:v3.1.7
args: [
"--one-time",
"--depth", "1",
"--dest", "checkout",
"--repo", "https://github.com/flavio/guestbook-go.git",
"--branch", "master"]
volumeMounts:
- name: code
mountPath: /tmp/git
volumes:
- name: code
emptyDir:
medium: Memory
This time the build will work fine also inside of Kubernetes, regardless of the node architecture! 🥳
First of all, congratulations for having made it to this point. It has been quite a long journey; I hope you enjoyed it.
The next step consists of taking this foundation (a Kubernetes POD that can run buildah to build new container images) and find a way to orchestrate that.
What I’ll show you in the next blog post is how to create a workflow that, given a GitHub repository with a Dockerfile, builds two container images (amd64 and arm64), pushes both of them to a container registry and then creates a multi-architecture manifest referencing them.
As always feedback is welcome, see you soon!
]]>For example a Node.js application relying on left-pad could force only certain
versions of this library to be used by specifying a constraint like
>= 1.1.0 < 1.2.0
. This would force npm to install the latest version of the
library that satisfies the constraint.
How does that translate to containers?
Imagine the following scenario: a developer deploys a containerized application that requires a Redis database. The developer deploys the latest version of the redis container (eg: redis:4.0.5), ensures his application works fine and then moves on to other things.
After some weeks a security issue/bug is found inside of Redis and a new patched release takes place. Suddenly the deployed container is outdated. How can the developer be aware that a new v4 release of Redis is available? Wouldn’t it be even better to have some automated tool taking care of this upgrade?
After some more weeks a new minor release of Redis is published (eg: 4.1.0). Is it safe to automatically update to a new minor release of Redis? Is the developer’s application going to keep working as expected?
Some container images have special tags like v4 or v4.1, and the developer could leverage them to loosely pin the redis container to a more limited set of versions. However, using these tags reduces reproducibility and debuggability.
Let’s imagine the redis image being deployed is redis:v4.1
and everything is
working as expected. Assume after some time the developer (or some automated tool)
pulls a new version of the redis:v4.1
image and suddenly the application
has some issues. How can the developer understand what really changed?
Wouldn’t it be great to be able to say something like “everything worked fine
with redis:4.1.0
but it broke when I upgraded to redis:4.1.9
”?
There are some tools that can be used to find and automatically update old container images: Watchtower and ouroboros. However none of them allows the flexibility I was looking for (in terms of checks), plus they are both tailored to work only against docker.
Because of that, during the 2020 edition of SUSE Hackweek, I spent some time working on a different solution to this use case.
fresh-container is a tool that can be used to see if a container can be updated to a more recent release.
fresh-container is different compared to Watchtower and ouroboros because it relies on semantic versioning to process container image tags.
Semantic versioning is used to express the version constraints a container version must satisfy. This gives more flexibility, for example take a look at the following scenarios:
- >= 4.0.0 < 5.0.0: take any 4.x release
- >= 4.1.0 < 4.2.0: stay on the 4.1 series, accepting only patch-level updates
- < 6.0.0: take anything older than 6.0.0
fresh-container
can be run as a standalone program:
$ fresh-container check --constraint ">= 1.9.0 < 1.10.0" nginx:1.9.0
The 'docker.io/library/nginx' container image can be upgraded from the '1.9.0' tag to the '1.9.15' one and still satisfy the '>= 1.9.0 < 1.10.0' constraint.
Behind the scenes fresh-container will query the container registry hosting the image to gather the list of all the available tags. The tags that do not respect semantic versioning will be ignored and finally the tool will evaluate the constraint provided by the user.
It can also generate computer parsable output by producing a JSON response:
$ fresh-container check -o json --constraint ">= 1.9.0 < 1.10.0" nginx:1.9.0
{
"image": "docker.io/library/nginx",
"constraint": ">= 1.9.0 < 1.10.0",
"current_version": "1.9.0",
"next_version": "1.9.15",
"stale": true
}
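Since the output is plain JSON it’s easy to consume from scripts; for example, extracting the next available tag with jq (the jq invocation is just an illustration):
$ fresh-container check -o json --constraint ">= 1.9.0 < 1.10.0" nginx:1.9.0 | jq -r .next_version
1.9.15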
Querying the remote container registries to fetch all the available tags of a container image is an expensive operation. That gets even worse when multiple containers have to be inspected on a regular basis.
The fresh-container binary can operate in a server mode to alleviate this issue:
$ fresh-container server
This will start a web server offering a simple REST API that can be used to perform queries. The remote tags of the container images are cached inside of an in-memory database to speed up constraint resolution.
It’s possible to run fresh-container check
against a fresh-container
server
to perform faster queries by using the --server <http://fresh-container-server>
flag.
fresh-container is a tool built to serve one specific use case: you provide some data as input and, as output, it will tell you if the container image can be updated to a more recent version.
Its main goal is to be leveraged by other tools to build something bigger like fresh-container-operator.
This is a Kubernetes operator that, once deployed inside of a Kubernetes cluster, looks at all the deployments running inside of it and finds the ones with stale containers.
The operator can also automatically update these outdated deployments to use the latest version of the container images that satisfy their requirements.
How does it work? First of all you have to enrich your deployment definition by adding some ad-hoc annotations.
For each container image used by the deployment you have to specify the semantic versioning constraint that has to be used to evaluate their “freshness”.
Take a look at the following example:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
annotations:
fresh-container.autopilot: "false"
spec:
selector:
matchLabels:
app: nginx
replicas: 1
template:
metadata:
labels:
app: nginx
annotations:
fresh-container.constraint/nginx: ">= 1.9.0 < 1.10.0"
spec:
containers:
- name: nginx
image: nginx:1.9.0
ports:
- containerPort: 80
In this case the operator will look at the version of the nginx container
in use and evaluate it against the >= 1.9.0 < 1.10.0
constraint.
Note well: deployments that do not have any fresh-container.constraint/<container name> annotation will be ignored by the operator.
The operator adds the special label fresh-container.hasOutdatedContainers=true
to all the deployments that have one or more stale containers inside of them.
This allows quick searches against all the deployments:
$ kubectl get deployments --all-namespaces -l fresh-container.hasOutdatedContainers=true
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
default nginx-deployment 1/1 1 1 19m
The details about the stale containers are added by the operator as annotations of the deployment:
kubectl describe deployments.apps nginx-deployment
Name: nginx-deployment
Namespace: default
CreationTimestamp: Thu, 27 Feb 2020 10:32:55 +0100
Labels: fresh-container.hasOutdatedContainers=true
Annotations: deployment.kubernetes.io/revision: 1
fresh-container.autopilot: false
fresh-container.lastChecked: 2020-02-27T09:45:07Z
fresh-container.nextTag/nginx: 1.9.15
For each stale container the operator adds an annotation with
fresh-container.nextTag/<container name>
as key and the tag of the most recent
container that satisfies the constraint as value.
In the example above you can see that the nginx container inside of the
deployment can be updated to the 1.9.15
tag while still satisfying the
>= 1.9.0 < 1.10.0
constraint.
The next step is to allow fresh-container-operator to update all the deployments that have stale containers.
This is not done by default, but it can be enabled on a per-deployment basis by adding the fresh-container.autopilot=true annotation inside of the deployment metadata.
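For a quick experiment the annotation can also be set with kubectl instead of editing the manifest (hypothetical example, using the deployment shown above):
$ kubectl annotate deployment nginx-deployment fresh-container.autopilot=true --overwrite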
As I stated in the beginning I created these projects during the 2020 edition of SUSE Hackweek. They are early prototypes that need more love.
I would be happy to hear what you think about them. Feel free to leave a comment below or open an issue on their GitHub projects:
]]>You might wonder why this is needed; after all, it’s already possible to run a docker distribution (aka registry) instance as a pull-through cache. While that’s true, this solution doesn’t address the needs of more “sophisticated” users.
Based on the feedback we got from a lot of SUSE customers it’s clear that a simple registry configured to act as a pull-through cache isn’t enough.
Let’s go step by step to understand the requirements we have.
First of all it should be possible to have a mirror of certain container images locally. This is useful to save time and bandwidth. For example there’s no reason to download the same image over and over on each node of a Kubernetes cluster.
A docker registry configured to act as a pull-through cache can help with that. There’s still a need to warm the cache; this can be left to the organic pull of images done by the cluster, or it could be done artificially by some script run by an operator.
Unfortunately a pull-through cache is not going to solve this problem for nodes running inside of an air-gapped environment. Nodes operated in such an environment are located in a completely segregated network, which would make it impossible for the pull-through registry to reach the external registry.
Cluster operators want to have control of the images available inside of the local mirror.
For example, assuming we are mirroring the Docker Hub, an operator might be
fine with having the library/mariadb
image but not the library/redis
one.
When operating a registry configured as a pull-through cache, all the images of
the upstream registry are within reach of all the users of the cluster. It’s
just a matter of doing a simple docker pull
to get the image cached into
the local pull-through cache and sneak it into all the nodes.
Moreover some operators want to grant the privilege of adding images to the local mirror only to trusted users.
The Docker Hub is certainly the best-known container registry. However there are also other registries being used: SUSE operates its own registry, there’s Quay.io, Google Container Registry (aka gcr) and there are even user-operated ones.
A docker registry configured to act as a pull-through cache can mirror only one registry. Which means that, if you are interested in mirroring both the Docker Hub and Quay.io, you will have to run two instances of docker registry pull-through caches: one for the Docker Hub, the other for Quay.io.
This is just overhead for the final operator.
During the last week I worked to build a PoC to demonstrate we can create a docker registry mirror solution that can satisfy all the requirements above.
I wanted to have a single box running the entire solution and I wanted all the different pieces of it to be containerized. I hence resorted to using a node powered by openSUSE Kubic.
I didn’t need all the different pieces of Kubernetes, I just needed the kubelet so that I could run it in disconnected mode. Disconnected means the kubelet process is not connected to a Kubernetes API server; instead it reads POD manifest files straight from a local directory.
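A minimal sketch of what that looks like; the flag name is the standard kubelet option for static PODs, and the exact invocation used by Kubic may differ slightly:
$ sudo kubelet --pod-manifest-path=/etc/kubernetes/manifests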
I created an openSUSE Kubic node and then I started by deploying a standard docker registry. This instance is not configured to act as a pull-through cache. However it is configured to use an external authorization service. This is needed to allow the operator to have full control of who can push/pull/delete images.
I configured the registry POD to store the registry data to a directory on the machine by using a Kubernetes hostPath volume.
On the very same node I deployed the authorization service needed by the docker registry. I chose Portus, an open source solution created at SUSE a long time ago.
Portus needs a database, hence I deployed a containerized instance of MariaDB
on the same node. Again I used a Kubernetes hostPath
to ensure the persistence
of the database contents. I placed both Portus and its MariaDB instance into the
same POD. I configured MariaDB to listen only to localhost, making it reachable
only by the Portus instance (that’s because they are in the same
Kubernetes POD).
I configured both the registry and Portus to bind to a local unix socket, then I deployed a container running HAProxy to expose both of them to the world.
The HAProxy is the only container that uses the host network. Meaning it’s actually listening on port 80 and port 443 of the openSUSE Kubic node.
I went ahead and created two new DNS entries inside of my local network:
- registry.kube.lan: this is the FQDN of the registry
- portus.kube.lan: this is the FQDN of Portus

I configured both names to resolve to the IP address of my container host.
I then used cfssl to generate a CA and
then a pair of certificates and keys for registry.kube.lan
and portus.kube.lan
.
Finally I configured HAProxy to expose both services to the outside, routing the incoming connections either to the registry or to Portus depending on the requested server name.
By having dedicated FQDNs for the registry and Portus and by using HAProxy’s SNI-based load balancing, we can leave the registry listening on a standard port (443) instead of using a different one (eg: 5000). In my opinion that’s a big win: based on my personal experience, having the registry listen on a non-standard port makes things more confusing both for the operators and the end users.
Once I was done with these steps I was able to log into https://portus.kube.lan and go through the usual Portus setup wizard.
We now have to mirror images from multiple registries into the local one, but how can we do that?
Some time ago I stumbled upon this tool, which can be used to copy images from multiple registries into a single one. While doing that it can change the namespace of the images, putting all the images coming from a certain registry into a specific namespace.
I wanted to use this tool, but I realized it relies on the docker open-source engine to perform the pull and push operations. That’s a blocking issue for me because I wanted to run the mirroring tool inside of a container without doing nasty tricks like mounting the docker socket of the host into the container.
Basically I wanted the mirroring tool to not rely on the docker open source engine.
At SUSE we are already using and contributing to skopeo, an amazing tool that allows interactions with container images and container registries without requiring any docker daemon.
The solution was clear: extend skopeo to provide mirroring capabilities.
I drafted a design proposal with my colleague Marco Vedovati, started coding and then ended up with this pull request.
While working on that I also uncovered a small glitch
inside of the containers/image
library used by skopeo.
Using a patched skopeo binary (which includes both the patches above) I then mirrored a bunch of images into my local registry:
$ skopeo sync --source docker://docker.io/busybox:musl --dest-creds="flavio:password" docker://registry.kube.lan
$ skopeo sync --source docker://quay.io/coreos/etcd --dest-creds="flavio:password" docker://registry.kube.lan
The first command mirrored only the busybox:musl
container image from the
Docker Hub to my local registry, while the second command mirrored all the
coreos/etcd
images from the quay.io registry to my local registry.
Since the local registry is protected by Portus I had to specify my credentials while performing the sync operation.
Running multiple sync commands is not really practical; that’s why we added a source-file flag. It allows an operator to write a configuration file indicating the images to mirror. More on that in a dedicated blog post.
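Just to give an idea of what it looks like, the configuration file is a simple YAML document listing, per source registry, the images (and optionally the tags) to mirror. The exact format is defined in the pull request and might have changed since, so treat this as a sketch:
$ cat images-to-mirror.yaml
docker.io:
  images:
    busybox:
      - musl
quay.io:
  images:
    coreos/etcd: []
$ skopeo sync --source-file images-to-mirror.yaml --dest-creds="flavio:password" docker://registry.kube.lan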
At this point my local registry had the following images:
- docker.io/busybox:musl
- quay.io/coreos/etcd:v3.1
- quay.io/coreos/etcd:v3.3
- quay.io/coreos/etcd:latest
- …and all the other quay.io/coreos/etcd images

As you can see the namespace of the mirrored images is changed to include the FQDN of the registry from which they have been downloaded. This avoids clashes between the images and makes it easier to track their origin.
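At this point the mirrored images can already be consumed directly, by pulling them through their new fully qualified name (after logging into the registry, since Portus protects it):
$ docker pull registry.kube.lan/quay.io/coreos/etcd:v3.1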
As I mentioned above I wanted to provide a solution that could be used also to run mirrors inside of air-gapped environments.
The only tricky part for such a scenario is how to get the images from the upstream registries into the local one.
This can be done in two steps by using the skopeo sync
command.
We start by downloading the images on a machine that is connected to the internet. But instead of storing the images into a local registry we put them on a local directory:
$ skopeo sync --source docker://quay.io/coreos/etcd dir:/media/usb-disk/mirrored-images
This is going to copy all the versions of the quay.io/coreos/etcd
image into
a local directory /media/usb-disk/mirrored-images
.
Let’s assume /media/usb-disk
is the mount point of an external USB drive.
We can then unmount the USB drive, scan its contents with some tool, and
plug it into a computer of the air-gapped network. From this computer we
can populate the local registry mirror by using the following command:
$ skopeo sync --source dir:/media/usb-disk/mirrored-images --dest-creds="username:password" docker://registry.secure.lan
This will automatically import all the images that have been previously downloaded to the external USB drive.
Now that we have all our images mirrored it’s time to start consuming them.
It might be tempting to just update all our Dockerfile
(s), Kubernetes
manifests, Helm charts, automation scripts, …
to reference the images from registry.kube.lan/<upstream registry FQDN>/<image>:<tag>
.
This however would be tedious and impractical.
As you might know the docker open source engine has a --registry-mirror option.
Unfortunately the docker open source engine can only be configured to mirror the Docker Hub; other external registries are not handled.
This annoying limitation led me and Valentin Rothberg to create this pull request against the Moby project.
Valentin is also porting the patch to libpod, which will make the same feature available inside of CRI-O and podman as well.
During my experiments I figured some little bits were missing from the original PR.
I built a docker engine with the full patch
applied and I created this /etc/docker/daemon.json
configuration file:
{
"registries": [
{
"Prefix": "quay.io",
"Mirrors": [
{
"URL": "https://registry.kube.lan/quay.io"
}
]
},
{
"Prefix": "docker.io",
"Mirrors": [
{
"URL": "https://registry.kube.lan/docker.io"
}
]
}
]
}
Then, on this node, I was able to issue commands like:
$ docker pull quay.io/coreos/etcd:v3.1
That resulted in the image being downloaded from registry.kube.lan/quay.io/coreos/etcd:v3.1; no communication happened against quay.io. Success!
Everything is working fine on nodes that are running this not-yet merged patch, but what about vanilla versions of docker or other container engines?
I think I have a solution for them as well, I’m going to experiment a bit with that during the next week and then provide an update.
This is a really long blog post. I’ll create a new one with all the configuration files and instructions of the steps I performed. Stay tuned!
In the meantime I would like to thank Marco Vedovati, Valentin Rothberg for their help with skopeo and the docker mirroring patch, plus Miquel Sabaté Solà for his help with Portus.
]]>During the last week I worked together with Marcus Schäfer (the author of KIWI) to reduce their size.
We fixed some obvious mistakes (like not installing man pages and documentation), but we also removed some useless packages.
These are the results of our work:
Just to make some comparisons, the Ubuntu image is around 188M while the Fedora one is about 186M. We obviously cannot compete with images like busybox or Alpine, but the situation has definitely improved!
Needless to say, the new images are already on the DockerHub.
Have fun!
]]>The Docker registry works like a charm, but it’s hard to have full control over the images you push to it. Also there’s no web interface that can provide a quick overview of registry’s contents.
So Artem, Federica and I created the Portus project (BTW “portus” is the Latin name for harbour).
The first goal of Portus is to allow users to have better control over the contents of their private registries. It makes it possible to write policies like:
This is done implementing the token based authentication system supported by the latest version of the Docker registry.
Portus listens to the notifications sent by the Docker registry and uses them to populate its own database.
Using this data Portus can be used to navigate through all the namespaces and the repositories that have been pushed to the registry.
We also worked on a client library that can be used to fetch extra information from the registry (i.e. repositories’ manifests) to extend Portus’ knowledge.
Right now Portus has just the concept of users. When you sign up into Portus a private namespace with your username will be created. You are the only one with push and pull rights over it; nobody else will be able to mess with it. Also pushing and pulling to the “global” namespace is currently not allowed.
The user interface is still a work in progress. Right now you can browse all the namespaces and the repositories available on your registry. However user’s permissions are not taken into account while doing that.
If you want to play with Portus you can use the development environment managed by Vagrant. In the near future we are going to publish a Portus appliance and obviously a Docker image.
Please keep in mind that Portus is just the result of one week of work. A lot of things are missing but the foundations are solid.
Portus can be found on this repository on GitHub. Contributions (not only code, also proposals, bugs,…) are welcome!
]]>In the beginning I looked into the kubernetes project. I found it really promising but AFAIK not yet ready to be used. It’s still in its early days and it’s in constant evolution. I will surely keep looking into it.
I also looked into other projects like consul and geard but then I focused on using etcd and fleet, two of the tools part of CoreOS.
I ended up creating a small testing environment that is capable of running a simple guestbook web application talking with a MongoDB database. Both the web application and the database are shipped as Docker images running on a small cluster.
The whole environment is created by Vagrant. That project proved to be also a nice excuse to play with this tool. I found Vagrant to be really useful.
You can find all the files and instructions required to reproduce my experiments inside of this repository on GitHub.
Happy hacking!
]]>For those who never heard about it, KIWI is a tool which creates Linux systems for both physical and virtual machines. It can create openSUSE, SUSE and other types of Linux distributions.
Update: I changed the required version of kiwi and the openSUSE 13.1 template. Kiwi just received some improvements which no longer force the image to include the lxc package.
As you might know Docker already has its own build system which provides a really easy way to create new images. However these images must be based on existing ones, which leads to the problem of creating the 1st parent image. That’s where KIWI comes to the rescue.
Indeed Kiwi can be used to build the openSUSE/SUSE/whatever docker images that are going to act as the foundation blocks of other ones.
Docker support has been added to KIWI 5.06.87. You can find this package inside of the Virtualization:Appliances project on OBS.
Install the kiwi
and the kiwi-doc
packages on your system. Then go to the
/usr/share/doc/packages/kiwi/examples/
directory where you will find a simple
openSUSE 13.1 template.
Just copy the whole /usr/share/doc/packages/kiwi/examples/suse-13.1/suse-docker-container
directory to another location and make your changes.
The heart of the whole image is the config.xml
file:
<?xml version="1.0" encoding="utf-8"?>
<image schemaversion="6.1" name="suse-13.1-docker-guest">
<description type="system">
<author>Flavio Castelli</author>
<contact>fcastelli@suse.com</contact>
<specification>openSUSE 13.1 docker image</specification>
</description>
<preferences>
<type image="docker" container="os131">
<machine>
<vmdisk/>
<vmnic interface="eth0" mode="veth"/>
</machine>
</type>
<version>1.0.0</version>
<packagemanager>zypper</packagemanager>
<rpm-check-signatures>false</rpm-check-signatures>
<rpm-force>true</rpm-force>
<locale>en_US</locale>
<keytable>us.map.gz</keytable>
<hwclock>utc</hwclock>
<timezone>US/Eastern</timezone>
</preferences>
<users group="root">
<user password="$1$wYJUgpM5$RXMMeASDc035eX.NbYWFl0" home="/root" name="root"/>
</users>
<repository type="yast2">
<source path="opensuse://13.1/repo/oss/"/>
</repository>
<packages type="image">
<package name="coreutils"/>
<package name="iputils"/>
</packages>
<packages type="bootstrap">
<package name="filesystem"/>
<package name="glibc-locale"/>
<package name="module-init-tools"/>
</packages>
</image>
This is a really minimal image which contains just a bunch of packages.
The first step is the creation of the image’s root system:
kiwi -p /usr/share/doc/packages/kiwi/examples/suse-13.1/suse-docker-container \
--root /tmp/myimage
The next step compresses the file system of the image into a single tarball:
kiwi --create /tmp/myimage --type docker -d /tmp/myimage-result
The tarball can be found under /tmp/myimage-result
. This can be imported
into docker using the following command:
docker import - myImage < /path/to/myimage.tbz
The image named myImage
is now ready to be used.
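To verify everything worked you can start an interactive shell inside of the freshly imported image (illustrative command):
docker run -t -i myImage /bin/bash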
In the next days I’ll make another blog post explaining how to build docker images using KIWI and the Open Build Service. This is a powerful combination which allows to achieve continuous delivery.
Stay tuned and have fun!
]]>First of all the Docker package has been moved from my personal OBS project to the more official Virtualization one. The next step is to get the Docker package into Factory :)
I’m going to drop the docker package from home:flavio_castelli:docker, so make sure to subscribe to the Virtualization repository to get the latest versions of Docker.
I have also submitted some openSUSE related documentation to the official Docker project. If you visit the “Getting started” page you will notice the familiar geeko logo. Click it to be redirected to the openSUSE’s installation instructions.
]]>You can read the full announcement here, but let me talk about the biggest change introduced by this release: storage drivers!
Docker has always used AUFS, a “unionfs-like” file system, to power its containers. Unfortunately AUFS is neither part of the official kernel nor of the openSUSE/SLE one.
In the past I had to build a custom patched kernel to run Docker on openSUSE. That proved to be a real pain both for me and for the end users.
Now with storage drivers Docker can still use AUFS, but it can also opt for something different. In our case Docker is going to use thin provisioning, a consolidated technology which has been part of the mainline kernel for quite some time.
Moreover Docker’s community is working on experimental drivers for BTRFS, ZFS, Gluster and Ceph.
Running Docker is incredibly simple now: just use the 1 click install and download it from the ‘home:flavio_castelli:docker’ project.
As I said earlier: no custom kernel is required. You are going to keep the one shipped by the official openSUSE repositories.
Just keep in mind that Docker does some initialization tasks on its very first
execution (it configures thin provisioning). So just wait a little before hitting its
API with the Docker cli tool (you will just get an error because docker.socket
is not found).
Right now Docker works fine on openSUSE 12.3 and 13.1 but not on SLE 11 SP3. During the next days I’m going to look into this issue. I want to have a stable and working package for SLE.
Once the package is proved to be stable enough I’ll submit it for inclusion inside of the Virtualization project on OBS.
So please, checkout Docker package and provide me your feedback!
]]>The project has been tracked on this page of hackweek’s wiki, this is a detailed report of what I achieved.
Docker has been packaged inside of this OBS project.
So installing it requires just two commands:
sudo zypper ar http://download.opensuse.org/repositories/home:/flavio_castelli:/docker/openSUSE_12.3 docker
sudo zypper in docker
There’s also a 1 Click Install for the lazy ones :)
Zypper will install docker and its dependencies, which are:
- lxc: docker’s “magic” is built on top of LXC.
- bridge-utils: used to set up the bridge interface used by docker’s containers.
- dnsmasq: used to start the dhcp server used by the containers.
- iptables: used to get containers’ networking to work.
- bsdtar: used by docker to compress/extract the containers.

The aufs3 kernel module is not part of the official kernel and is not available in the official repositories. Hence adding docker will trigger the installation of a new kernel package on your machine.
Note well: docker works only on 64bit hosts. That’s why there are no 32bit packages.
If you don’t want to install docker on your system or you are just curious and want to jump straight into action there’s a SUSE Studio appliance ready for you. You can find it here.
If you are not familiar with SUSE Gallery let me tell you two things about it: you can download the published appliances, and you can even test them directly from your browser thanks to Testdrive.
The latter option is really cool, because it will allow you to play with docker immediately. There’s just one thing to keep in mind about Testdrive: outgoing connections are disabled, so you won’t be able to install new stuff (or download new docker images). Fortunately this appliance comes with the busybox container bundled, so you will be able to play a bit with docker.
The docker daemon must be running in order to use your containers. The openSUSE package comes with a init script which can be used to manage it.
The script is /etc/init.d/docker
, but there’s also the usual symbolic link
called /usr/sbin/rcdocker
.
To start the docker daemon just type:
sudo /usr/sbin/rcdocker start
This will trigger the following actions:
- the docker0 bridge interface is created. This interface is bridged with eth0.
- a dnsmasq instance listening on the docker0 interface is started.

All the containers will get an IP on the 10.0.3.0/24 network.
Now it’s time to play with docker.
First of all you need to download an image:
docker pull base
This will fetch the official Ubuntu-based image created by the dotCloud guys.
You will be able to run the Ubuntu container on your openSUSE host without any problem, that’s LXC’s “magic” ;)
If you want to use only “green” products just pull the openSUSE 12.3 container I created for you:
docker pull flavio/opensuse-12-3
Please experiment a lot with this image and give me your feedback. The dotCloud guys proposed me to promote it to top-level base image, but I want to be sure everything works fine before doing that.
Now you can go through the examples reported on the official docker’s documentation.
I think it would be extremely cool to create docker’s images using SUSE Studio. As you might know I’m part of the SUSE Studio team, so I looked a bit into how to add support to this new format.
– personal opinion –
There are some technical challenges to solve, but I don’t think it would be hard to address them.
– personal opinion –
If you are interested in adding the docker format to SUSE Studio please create a new feature request on openFATE and vote it!
In the meantime there’s another way to create your custom docker images, just keep reading.
KIWI is the amazing tool at the heart of SUSE Studio and can be used to create LXC containers.
As said earlier docker runs LXC containers, so we are going to follow these instructions.
First of all install KIWI from the Virtualization:Appliances project on OBS:
sudo zypper ar http://download.opensuse.org/repositories/Virtualization:/Appliances/openSUSE_12.3 virtualization:appliances
sudo zypper in kiwi kiwi-doc
We are going to use the configuration files of a simple LXC container shipped with the kiwi-doc package:
cp -r /usr/share/doc/packages/kiwi/examples/suse-11.3/suse-lxc-guest ~/openSUSE_12_3_docker
The openSUSE_12_3_docker
directory contains two configuration files used by
KIWI (config.sh
and config.xml
) plus the root
directory.
The contents of this directory are going to be added to the resulting container.
It’s really important to create the /etc/resolv.conf
file inside of the
final image since docker is going to mount the resolv.conf
file of the host
system inside of the running guest. If the file is not found docker won’t be able
to start our container.
An empty file is enough:
touch ~/openSUSE_12_3_docker/root/etc/resolv.conf
Now we can create the rootfs of the container using KIWI:
sudo /usr/sbin/kiwi --prepare ~/openSUSE_12_3_docker --root /tmp/openSUSE_12_3_docker_rootfs
We can skip the next step reported in KIWI’s documentation: it’s not needed with docker and it would produce an invalid container. Just execute the following command:
sudo tar cvjpf openSUSE_12_3_docker.tar.bz2 -C /tmp/openSUSE_12_3_docker_rootfs/ .
This will produce a tarball containing the rootfs of your container.
Now you can import it inside of docker; there are two ways to achieve that:
Importing the image from a web server is really convenient if you ran KIWI on a different machine.
Just move the tarball to a directory which is exposed by the web server. If you don’t have one installed just move to the directory containing the tarball and type the following command:
python -m SimpleHTTPServer 8080
This will start a simple http server listening on port 8080 of your machine.
On the machine running docker just type:
docker import http://mywebserver/openSUSE_12_3_docker.tar.bz2 my_openSUSE_image latest
If the tarball is already on the machine running docker you just need to type:
cat ~/openSUSE_12_3_docker.tar.bz2 | docker import - my_openSUSE_image latest
Docker will download (just in the 1st case) and import the tarball. The resulting image will be named ‘my_openSUSE_image’ and it will have a tag named ‘latest’.
The name of the tag is really important since docker tries to run the image with the ‘latest’ tag unless you explicitly specify a different value.
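For example, thanks to the ‘latest’ tag, the image can be started without spelling out the tag explicitly (illustrative command; whether /bin/bash is available depends on the packages you put in the image):
docker run -i -t my_openSUSE_image /bin/bash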
Hackweek #9 has been both productive and fun for me. I hope you will have fun too using docker on openSUSE.
As usual, feedback is welcome.
]]>The previous 0.8.0 release broke ABI compatibility without changing the SOVERSION
.
I’m not entirely happy with some parts of QJson’s API. I addressed these issues inside of the 1_0_0 branch.
I would appreciate to hear your opinion before merging this branch into master and releasing QJson 1.0.0.
]]>So here we go, QJson 0.8.0 is out!
A lot of bugs have been smashed during this time; this new release will fix issues like this one and this in a nicer way.
QJson’s API is still backward compatible, while the ABI changed.
Some work has also been done to get QJson work on the Symbian platform. The development happened a long time before Symbian was declared dead.
Currently I do not offer any kind of support for the Symbian platform because IMHO Symbian development is a mess and given the current situation in the mobile world I don’t see any point in investing more efforts on that.
Obviously Symbian patches and documentation are still accepted, as long as they don’t cause issues to the other target platforms.
QJson always used cmake as build system but since some Windows developers
had problems with it I decided to add some .pro
files. That proved to be a
bad choice for me since I had to support two build systems. I prefer to invest
my time fixing bugs in the code and adding interesting features rather than
triaging qmake issues on Windows. Hence I decided to remove them from git.
If you are a nostalgic you can still grab these files from git. They have been removed with commit 66d10c44dd3b21.
I decided to move QJson’s code from Gitorious to Github. Github’s issue system is going to replace Sourceforge’s bug tracking system.
I currently use Github a lot, both for personal projects and for work, and I simply love it. I think it offers the best tools in the market and that’s really important to me.
QJson’s website and mailing lists are still going to be hosted on Sourceforge.
I think that’s all for now. If you want more details about the changes introduced, take a look at the changelog or check out QJson’s website.
Suppose you want to create a tailored SUSE Studio appliance to run a Ruby on Rails app, this is a list of things you have to take care of:
Dister is a command line tool similar to the one used by Heroku (one of the coolest ways to run your Rails app into the cloud). Within a few steps you will be able to create a SUSE Studio appliance running your rails application, download it and deploy wherever you want.
Dister is named after SUSE Studio robot. It has been created by Dominik Mayer and me during the latest hackweek.
We are going to create a SUSE Studio appliance running a rails application called “SUSE history”. The app uses bundler to handle its dependencies. This is its Gemfile file:
source 'http://rubygems.org'
gem 'rails', '3.0.5'
gem 'pg'
gem "flutie", "~> 1.1"
As you can see the app uses rails3, the flutie gem and PostgreSQL as database.
Move into the suse_history directory and execute the following command:
dister create suse_history
As you can see dister has already done a couple of things for you:
It’s time to upload suse_history code. This is done using the following command:
dister push
As you can see dister packaged the application source code and all its dependencies into a single archive. Then uploaded the archive to SUSE Studio as an overlay file. Dister uploaded also the configuration files required by Apache and by PostgreSQL setup.
It’s build time!
dister build
The appliance has automatically been built using the raw disk format. You can use different formats of course.
Testdrive is one of the coolest features of SUSE Studio. Unfortunately dister doesn’t support it yet. Just visit your appliance page and start testdrive from your browser. Just enable testdrive networking and connect to your appliance:
Your appliance is working flawlessly. Download it and deploy it wherever you want.
dister download
As you can see dister handles a simple Rails application pretty well, but there’s still room for improvement.
Here’s a small list of the things on my TODO list:
Bugs and enhancements are going to be tracked here.
Dister code can be found here on github, fork it and start contributing.
If you are a student you can work on dister during the next Google Summer of code, apply now!
]]>Thanks to Jump, you won’t have to type those long paths anymore.
You can find jump’s source code, detailed documentation and installation instructions here.
SUSE packages can be found here.
]]>Code can be downloaded from here. openSUSE packages are already available on the build service.
These are some screenshots illustrating fastuserswitch’s new features.
{% img /images/fast_user_switch/fastuserswitch011.png %} {% img /images/fast_user_switch/fastuserswitch021.png %} {% img /images/fast_user_switch/fastuserswitch03.png %}
Let me introduce my first plasmoid: the fast user switch plasmoid. It’s a simple icon in the panel that allows users to switch to another open session or to open a new login page. Here you can see the mandatory screenshots.
{% img /images/fast_user_switch/fastuserswitch02.png %} {% img /images/fast_user_switch/fastuserswitch01.png %}
You can find the source code here. Binary packages for openSUSE are already available on the build service.
I think that KDM should allow switching back to an already open session in a more transparent way. Right now, if a user who already has one session open goes back to the login screen and enters his credentials, a **new** session is started. I think that most users would expect to be switched back to their already running session. Starting a new session is just confusing for them.
]]>In order to run the unit tests of your rails application, basically you have these official possibilities:
rake test
: runs all unit, functional and integration tests.rake test:units
: runs all the unit tests.rake test:functionals
: runs all the functional tests.rake test:integration
: runs all the integration tests.
Each one of these commands requires some time and they are not the best
solution while developing a new feature or fixing a bug. In this circumstance
we just want to have a quick feedback from the unit test of the code we are
editing.Waiting for all the unit/functional tests to complete decreases our productivity, what we need is to execute just a single unit test. Fortunately there are different solutions for this problem, let’s go through them.
Most of the IDEs supporting Ruby allow you to run a single unit test. If you are using NetBeans, running a single unit test is really easy:
As you will notice, the summary window also contains some useful information, such as:
If you are not using Netbeans you can always rely on some command line tools.
These “tricks” don’t require additional gems, hence they will work out of the box.
The first solution is to call this rake task:
rake test TEST=path_to_test_file
So the final command should look like
rake test TEST=test/unit/invitation_test.rb
Unfortunately on my machine this command runs the same test three times; I hope you won’t see the same weird behavior on your systems…
Alternatively you can use the following command:
ruby -I"lib:test" path_to_test_file
It’s even possible to call a specific test method of your testcase:
ruby -I"lib:test" path_to_test_file -n name_of_the_method
So calling:
ruby -I"lib:test" test/unit/invitation_test.rb -n test_should_create_invitation
will execute only InvitationTest::test_should_create_invitation.
It’s also possible to execute only the test methods matching a regular expression. Look at this example:
ruby -I"lib:test" test/unit/invitation_test.rb -n /.*between.*/
This command will execute only the test methods matching the /.*between.*/ regexp.
If you want to avoid the awful syntax shown in the previous paragraph, there’s a gem that can help you: it’s called single_test.
The github page contains a nice documentation, but let’s go through the most common use cases.
You can install the gem as a rails plugin:
script/plugin install git://github.com/grosser/single_test.git
single_test will add new rake tasks to your rails project, but won’t override the original ones.
Suppose we want to execute the unit test of user.rb, just type the following command:
rake test:user
If you want to execute the functional test of User just call:
rake test:user_c
Appending "_c" to the class name will automatically execute its functional test (if it exists).
It’s still possible to execute a specific test method:
rake test:user_c:test_name
So calling:
rake test:user_c:test_update_user
will execute the test_update_user method defined inside test/functional/user_controller_test.rb.
It’s still possible to use regexp:
rake test:invitation:.*between.*
This syntax is equivalent to ruby -I"lib:test" test/unit/invitation_test.rb -n /.*between.*/.
When a single unit test is run, all the usual database initialization tasks are skipped. If your code relies on newly created migrations you will surely get lots of errors, because the new migrations have not been applied to the test database.
In order to fix these errors just execute:
rake db:test:prepare
before running your unit test.
]]>Since I’m not a Symbian developer, achieving that has been a little hard for me. I would like to thank Antti Luoma for his help.
There is also good news for Windows developers: building QJson under Windows is now easier. Check out the new installation instructions page.
I hope this will help all the Windows developers who want to use QJson.
]]>I’ll keep the code on KDE’s svn synchronized with the git repository.
]]>I refactored my latest changes a bit: I created a new class called QObjectHelper that provides the methods required to convert a QObject instance to a QVariantMap and vice versa.
This class can be used in conjunction with the Serializer and Parser classes to serialize and deserialize QObject instances to and from JSON.
Let me show a quick example, suppose the declaration of Person class looks like this:
{% codeblock [class definition] [lang:cpp ] %}
class Person : public QObject
{
  Q_OBJECT
  Q_PROPERTY(QString name READ name WRITE setName)
  Q_PROPERTY(int phoneNumber READ phoneNumber WRITE setPhoneNumber)
  Q_PROPERTY(Gender gender READ gender WRITE setGender)
  Q_PROPERTY(QDate dob READ dob WRITE setDob)
  Q_ENUMS(Gender)

 public:
  Person(QObject* parent = 0);
  ~Person();

  QString name() const;
  void setName(const QString& name);

  int phoneNumber() const;
  void setPhoneNumber(const int phoneNumber);

  enum Gender {Male, Female};
  void setGender(Gender gender);
  Gender gender() const;

  QDate dob() const;
  void setDob(const QDate& dob);

 private:
  QString m_name;
  int m_phoneNumber;
  Gender m_gender;
  QDate m_dob;
};
{% endcodeblock %}
The following code will serialize an instance of Person to JSON :
{% codeblock [From QObject to JSON] [lang:cpp ] %}
Person person;
person.setName("Flavio");
person.setPhoneNumber(123456);
person.setGender(Person::Male);
person.setDob(QDate(1982, 7, 12));

QVariantMap variant = QObjectHelper::qobject2qvariant(&person);

Serializer serializer;
qDebug() << serializer.serialize(variant);
{% endcodeblock %}
The generated output will be:
{% codeblock [JSON data] [lang:json ] %}
{
  "dob" : "1982-07-12",
  "gender" : 0,
  "name" : "Flavio",
  "phoneNumber" : 123456
}
{% endcodeblock %}
Suppose you have the following JSON data stored into a QString:
{% codeblock [JSON data] [lang:json ] %}
{
  "dob" : "1982-07-12",
  "gender" : 0,
  "name" : "Flavio",
  "phoneNumber" : 123456
}
{% endcodeblock %}
The following code will initialize an already allocated instance of Person using the JSON values:
{% codeblock [From JSON to QObject] [lang:cpp ] %}
Parser parser;
QVariant variant = parser.parse(json);
Person person;
QObjectHelper::qvariant2qobject(variant.toMap(), &person);
{% endcodeblock %}
These changes have been included inside the new release of QJson: 0.7.0.
Packages for openSUSE are building right now.
]]>This solution relies on the awesome Qt’s property system.
Suppose the declaration of Person class looks like this:
{% codeblock [class definition] [lang:cpp ] %}
class Person : public QObject
{
Q_OBJECT
Q_PROPERTY(QString name READ name WRITE setName)
Q_PROPERTY(int phoneNumber READ phoneNumber WRITE setPhoneNumber)
Q_PROPERTY(Gender gender READ gender WRITE setGender)
Q_PROPERTY(QDate dob READ dob WRITE setDob)
Q_ENUMS(Gender)
public:
Person(QObject* parent = 0);
~Person();
QString name() const;
void setName(const QString& name);
int phoneNumber() const;
void setPhoneNumber(const int phoneNumber);
enum Gender {Male, Female};
void setGender(Gender gender);
Gender gender() const;
QDate dob() const;
void setDob(const QDate& dob);
private:
QString m_name;
int m_phoneNumber;
Gender m_gender;
QDate m_dob;
};
{% endcodeblock %}
The following code will serialize an instance of Person to JSON:
{% codeblock [Serialize to JSON] [lang:cpp ] %}
Person person;
person.setName("Flavio");
person.setPhoneNumber(123456);
person.setGender(Person::Male);
person.setDob(QDate(1982, 7, 12));
Serializer serializer;
qDebug() << serializer.serialize(&person);
{% endcodeblock %}
The generated output will be:
{% codeblock [JSON data] [lang:json ] %}
{
  "dob" : "1982-07-12",
  "gender" : 0,
  "name" : "Flavio",
  "phoneNumber" : 123456
}
{% endcodeblock %}
I hope you will find this new feature useful. I’m also considering creating a similar method inside the Parser class.
As usual suggestions are welcome.
]]>In the meantime I kept working on kaveau, so let me show you what has changed:
Previously kaveau used rdiff-backup as its backup back-end. rdiff-backup is a great program but unfortunately it relies on the outdated librsync library. The latest release of librsync is dated 2004; it still has a couple of serious open bugs and, while rsync has reached version three, this library is still stuck at version one.
These are the reasons for the switch from rdiff-backup to rsync. This choice breaks compatibility with previous backups but introduces a lot of advantages. One of the most important improvements brought by the adoption of rsync is an easier restore procedure: now all the backups can be accessed using a standard file manager, while previously rdiff-backup was needed to access the old backups.
On the backup device everything is saved under the kaveau/hostname/username path.
The directory will have a similar structure:
drwxr-xr-x 3 flavio users 4096 2009-09-12 18:50 2009-09-12T18:50:19
drwxr-xr-x 3 flavio users 4096 2009-09-14 23:07 2009-09-14T23:07:46
drwxr-xr-x 3 flavio users 4096 2009-09-14 23:30 2009-09-14T23:30:36
lrwxrwxrwx 1 flavio users 19 2009-09-14 23:30 current -> 2009-09-14T23:30:36
As you can see there’s one directory per backup, plus a symlink called current pointing to the latest backup.
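For the curious, this layout matches the classic rsync snapshot scheme. The sketch below is my own minimal illustration (destination path, options and names are my assumptions, not necessarily kaveau’s exact invocation): unchanged files are hard-linked from the previous snapshot, so every directory looks like a full backup while only new or changed files consume extra space.

{% codeblock [rsync snapshot sketch] [lang:bash ] %}
# hypothetical destination, mirroring the kaveau/hostname/username layout
DEST="/media/disk/kaveau/$(hostname)/$USER"
NEW="$(date +%Y-%m-%dT%H:%M:%S)"
mkdir -p "$DEST"

# files identical to the previous snapshot are hard-linked instead of copied
rsync -a --delete --link-dest="$DEST/current" "$HOME/" "$DEST/$NEW/"

# point the "current" symlink at the freshly created snapshot
ln -snf "$NEW" "$DEST/current"
{% endcodeblock %}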
Nowadays big external storage devices are pretty cheap, but it’s always good to save some disk space. Now kaveau keeps:
Before starting to work on the restore user interface I will spend some time figuring out how to add support for network devices.
A lot of users requested this feature, hence I want to make them happy :) .
I’m planning to use avahi to discover network shares (nfs, samba) or network machines running ssh and use them as backup devices. Honestly, I want to achieve something similar to Apple’s time capsule.
As usual, feedback messages are really appreciated.
]]>The good news is that QJson can be successfully built under Windows, and I can show you proof ;)
{% img /images/qjson/qjson_windows_1.png %} {% img /images/qjson/qjson_windows_2.png %}
I have written the build instructions on QJson website: just take a look here.
One last note: if you have problems with QJson please subscribe to the developer mailing list and post a message.
]]>I decided to dedicate myself to an idea that has been obsessing me for a long time. Last December my brand new hard disk suddenly died, making it impossible to recover anything. Fortunately I had just synchronized the most important documents between my workstation and my laptop, so I didn’t lose anything important. This incident made me realize that I should perform backups regularly, and I immediately started looking for a good solution.
Personally I think that doing backups is pretty boring so I wanted something damned easy to setup and use. Something that once configured runs in the background and does the dirty job without bothering you. Let’s face the truth: I wanted Apple’s Time Machine for KDE.
After some searches I realized that nothing was fitting my requirements and I decided to create something new: kaveau.
Kaveau is a backup program that makes backup easy for the average user.
As you will see, while coding/planning kaveau I made some assumptions, so only a few things are configurable. I really think that sometimes “less is more”.
Current features:
Backups are performed using rdiff-backup because it’s damned easy to use, well tested (it’s also used in production environments) and packaged by all distributions.
Thanks to the awesome Solid library, interacting with the external disks is super easy.
I have been working on kaveau just for five days, so there’s still a lot of work to do.
A screenshot tour will give you the right idea of its status.
####Backup wizard - page 1
####Backup wizard - page 2
####Backup wizard - final page
####Backup operation in progress
####Backup completed
Right now the code is available in this git repository, but I don’t recommend trying it (unless you want to find and fix bugs ;) ).
I would really appreciate:
In fact we hacked a lot, doing lots of changes to QJson:
So it’s with a great pleasure that I announce the release of QJson 0.6.0.
Beware: since the API has changed, your application will probably break. I’m really sorry about that, but I guarantee it won’t happen again in the future (as I said, both the API and ABI interfaces can now be considered stable).
The QJson web site has been updated, reflecting all the changes made to the library. The openSUSE packages have been moved from my home repo to the KDE:Qt one.
One last note, if you have problems with QJson please contact me using the qjson-devel mailing list. You can subscribe here.
]]>On Thursday 9th July I’ll give a BoF about QJson.
During the talk I will show:
See you soon!
]]>As usual the data is provided by last.fm, which should also return the events “near” the specified city (don’t ask me to define a value for near :) ).
I have created new openSUSE packages, and this time everything should work fine. Just make sure to remove all qjson packages before installing this release (in fact all the previous problems were due to packaging errors of qjson; now I have created new packages called libqjson0).
Packages are available for openSUSE 11.0, 11.1 and Factory (both for i586 and x86_64 architectures).
One last news: rockmarble has also a new site, something slightly better than github wiki ;)
]]>After some hacking I created this small application: rockmarble…
If you have a last.fm account rockmarble will import your favourite artists list. Otherwise you can add one artist at a time.
The tour location will be displayed inside Marble, using openstreetmap.
In order to build/run it you will need:
You can grab the source code of rockmarble here.
If you are an openSUSE user you can use 1click install:
The geolocation data is provided by last.fm, so if you discover that Metallica are going to give a concert in the middle of the Pacific Ocean please don’t bother me :)
It seems that QJson doesn’t handle special characters properly. You might see some artists with a blank name. I’m going to fix this issue asap.
Visit rockmarble website
Who wants to integrate it into amarok’s context view? ;)
]]>I came across a couple of solutions but none of them made me happy. So last weekend I wrote my own library: QJson. The library is based on the Qt toolkit and converts JSON data to QVariant instances. JSON arrays are mapped to QVariantList instances, while JSON objects are mapped to QVariantMap. The JSON parser is generated with Bison, while the scanner has been coded by me.
Converting JSON’s data to QVariant instance is really simple:
{% codeblock [] [lang:cpp ] %}
// create a JSonDriver instance
JSonDriver driver;
bool ok;

// json is a QString containing the data to convert
QVariant result = driver.parse (json, &ok);
{% endcodeblock %}
Suppose you’re going to convert this JSON data:
{% codeblock [JSON data] [lang:json ] %}
{
  "encoding" : "UTF-8",
  "plug-ins" : [ "python", "c++", "ruby" ],
  "indent" : { "length" : 3, "use_space" : true }
}
{% endcodeblock %}
The following code would convert the JSON data and parse it:
{% codeblock [] [lang:cpp ] %}
JSonDriver driver;
bool ok;

QVariantMap result = driver.parse (json, &ok).toMap();
if (!ok) {
  qFatal("An error occurred during parsing");
  exit (1);
}

qDebug() << "encoding:" << result["encoding"].toString();
qDebug() << "plugins:";

foreach (QVariant plugin, result["plug-ins"].toList()) {
  qDebug() << "\t-" << plugin.toString();
}

QVariantMap nestedMap = result["indent"].toMap();
qDebug() << "length:" << nestedMap["length"].toInt();
qDebug() << "use_space:" << nestedMap["use_space"].toBool();
{% endcodeblock %}
The output would be:
encoding: "UTF-8"
plugins:
  - "python"
  - "c++"
  - "ruby"
length: 3
use_space: true
QJson requires:
Currently the QJson code is hosted on the KDE Subversion repository. You can download it using an svn client:
svn co svn://anonsvn.kde.org/home/kde/trunk/playground/libs/qjson
For more information visit the QJson site.
]]>While organizing the event somebody proposed to set up a local server with some music released under a CC license. He suggested downloading some albums from Jamendo (due to network issues we won’t be able to provide direct access to the website).
Since nobody wanted to download the albums by hand, last night I wrote a small ruby program that does the dirty job.
Ruby and the json gem have to be installed on your machine.
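If the json gem is missing, it can usually be installed with something like the following (you may need root privileges, depending on your Ruby setup):

gem install json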
Help:
./jamendo_downloader.rb --help
Download the top 10 rock albums:
./jamendo_downloader.rb -g rock -t 10
I think there’s nothing more to say… enjoy it!
{% gist 2437530 jamendo_downloader.rb %}
]]>After reading these lines:
{% blockquote %}
We are exploring new technologies, creating prototypes of future systems, and trying to find and shape some of the features that will be part of upcoming SUSE products and the ecosystem around that. It’s a fascinating job, challenging, fun, and always exciting. For somebody like me who loves to create new things and enjoys working with an awesome team of innovative people this is a dream job.
{% endblockquote %}
I realized the description matched with all my wishes!
I immediately applied for the position and now, after more than a month, I’m looking at a signed proposal.
I’ve waited a long time before spreading my joy to the rest of the world… but now I just want to say that starting from 1st August I’ll be part of Suse’s Incubation team! :D
I’m really happy, excited and honoured for this awesome opportunity.
I wish you’ll be able to realize your dreams too!
Just a funny note… While Cornelius was writing about the job offer on his blog, I was having dinner with another Novell employee: Massimiliano Mantione. I was listening to his personal story, wishing to find a dream job like his. I have to admit I was a bit jealous :)
During this time I’ve been working on the creation of XesamQLib. This is a Qt based library for accessing Xesam services. Its API is going to be similar to the Xesam-glib one and it will make life easier for developers who want to interact with programs exposing a Xesam service (who said Strigi? :D )
Right now I’m finishing cleaning up the code, in order to publish a first version of XesamQLib on the KDE repository. I’ll keep you updated.
]]>Currently I’m using Linux as my primary OS, but I’ll also give KDE4 on Mac a try ;)
The other new entry is Baguette, a three-month-old dachshund. She’s really lovely, running around the house eating random things like shoes (I think it’s a must for all pets) and USB cables (she’s already a geek-dog).
I think I’ll have to keep an eye on Baguette, otherwise I’ll find another bite on my Macbook pro apple ;)
]]>KDE integration: Flavio will coordinate the definition of interfaces over which KDE will handle searching and metadata. He can ask Aaron, Evgeny and Jos for help with the interface design. The interface will cover:
Querying via Xesam
Configuration of the Strigi daemon
Indexing and deindexing of data by passing it to the daemon (allowing for indexing for more than just files)
Controlling the daemon (starting, stopping, pausing)
Once this interface is ready, it will be easy to integrate Strigi functionalities inside KDE programs. This means (just to report the most relevant cases) that it will be possible to create a Strigi krunner, have metadata extraction inside Dolphin and Konqueror, interact with Akonadi…
Talking about Xesam, just in these days I got a mail from Fabrice Colin, the author of Pinot. Recently Fabrice made some improvements to Pinot’s XesamQueryLanguage parser (which is also used by Strigi). We’re now figuring out how to share our code in a more convenient way. Maybe we’ll use svn externals…
]]>During the meeting Strigi developers will discuss the future development of Strigi.
Special guest: Aaron Aseigo. You’re all welcome.
]]>Since Strigi’s analyzers work in a different way, a lot of code has to be ported. Unfortunately, after a good start, some relevant analyzers were still missing.
But in the last weeks Strigi gained support of:
I’ve also updated this summary page. As you can see there’s still some work to do, but don’t worry… I’ll try to do my best ;)
]]>So, by now, Strigi supports the following file system monitoring facilities:
You’re hacking on your local working copy and you want to keep it up-to-date but, since you have some uncommitted changes, git-svn rebase cannot be executed.
I was just thinking of writing something about this problem when I read a post on the digikam blog.
In this post Marcel proposes a workaround using a bash function. In fact there’s a “cleaner” solution, if you’re interested read the last part of my git-svn howto.
]]>Linux Day is an Italian manifestation that promotes Linux and FOSS. During this day different organizations (mostly Linux User Groups) arrange events with speeches, installation parties and more.
Since a lot of people requested it, I gave a speech about KDE 4 during the Linux Day organized by my LUG (BGLug).
The presentation covers the main changes and features introduced by KDE 4. I took inspiration from Troy’s “Road to KDE 4” articles (I like them really much).
People liked the speech and, most important of all, showed great interest for KDE 4.
This is a really small howto that describes how to work on a project versioned with svn (maybe taken from KDE repository ;) ) using git.
Since Git is a distributed revision control system (while svn is a centralized one) you can perform commits, branches, merges, … on your local working dir without being connected to the internet.
Next time you’ll be online, you will be able to “push” your changes back to the central svn server.
You’ve to:

mkdir strigi
cd strigi && git-svn init https://svn.kde.org/home/kde/trunk/kdesupport/strigi

The git-svn init command is followed by the address of the svn repository (in this case we point to Strigi’s repository).

git-svn fetch -rREVISION

Where REVISION is the number obtained before.

git-svn rebase
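If you don’t already have a revision number at hand, one possible way to obtain a recent one (this is just a suggestion of mine, not part of the original recipe) is to query the svn server, for example:

svn info https://svn.kde.org/home/kde/trunk/kdesupport/strigi

and look at the Revision field of the output.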
Now you’ll be able to work on your project using git as your revision control system. To keep your working copy up to date just run:
git-svn rebase
You can **commit your changes** to the svn server using the command:
git-svn dcommit
In this way each commit made with git will be “transformed” into a svn one.
While adding new cool features to your program, you may experience some problems when synchronizing with the main development tree. In fact you have to commit all local modifications (using the git-commit command) before invoking git-svn rebase.
Sometimes it isn’t reasonable since your changes are not yet ready to be committed (you haven’t finished/tested/improved your work). But don’t worry, git has a native solution also for this problem, just follow these steps:
- git-stash
- git-svn rebase (as usual)
- git-stash apply
- git-stash clear
After the first step all your uncommitted changes will disappear from the working copy, so you’ll be able to run the rebase command without problems. For further information read the git-stash man page.
That’s all.
A special mention goes to Thiago Macieira for his help.
]]>But in the end I tamed the beast and now Xesam support in Strigi is complete.
IMHO XesamUserSearchLanguage can be considered more important than XesamQueryLanguage, since common users will write queries in this way.
As reported on the project page: {% blockquote %} It is [XesamUserSearchLanguage] designed as an extended synthesis of Apple’s spotlight and Google’s search languages. {% endblockquote %}
These are some possible queries (examples taken from freedesktop site):
type:music hendrix
will return all music items related to hendrix.
type:image size>=1mb tag:flower africa
will return all pictures displaying a flower, bigger than 1 MB and related to Africa.
The Xesam’s UserSearchLanguage query –> Strigi::Query object conversion is made using a hand-written scanner and a C++ parser created by Bison.
You don’t have to worry if you don’t have Bison installed on your system, since all the parser generated code is already in svn. In these days, as soon as I have some spare time (when?!), I’ll write another post about open-source scanner and parser generators.
By now I would like to thank Andreas Pakulat (developer of KDevelop) for his help with parser generators.
]]>In the end yesterday I spent approximately four hours on the train. During this time I started the Xesam User Language parser :) During the travels I:
Now, after fixing some build errors, I’ll start writing Bison’s grammar rules. These rules will translate Xesam user language queries into Strigi::Query objects.
I hope it will work (both bathroom and Xesam parser ;) )
]]>Strigicmd is a command-line tool shipped with Strigi. It can perform different actions like:
So, if you want to try the new Xesam support you just have to use strigicmd with the xesamquery option. The command syntax is:

strigicmd xesamquery -t backend -d indexdir [-u xesam_user_language_file] [-q xesam_query_language_file]

As you can expect, you have to save your Xesam query to a file and point strigicmd at it.
This is a really small step-by-step guide:
strigicmd create -t clucene -d temp/ logs/
Create a simple file containing your Xesam query. You can find some example query on Xesam site or inside strigi tarball (complete path: strigi/src/streamanalyzer/xesam/testqueries/). This is a stupid and easy query:
{% codeblock [query] [lang:xml ] %}
Perform the search, just type:
strigicmd xesamquery -t clucene -d temp/ -q ~/irc_oever.xml
Enjoy the search results ;)
Remember that XesamUserLanguage query language isn’t yet supported.
]]>But why is this important? If you aren’t able to answer the previous question you probably don’t know what Xesam is. Here’s a short definition taken from the official site: Xesam is an umbrella project with the purpose of providing unified APIs and specs for desktop search and metadata services. Thanks to D-Bus and Xesam it will be possible to access the information indexed by Strigi (and by all the desktop search programs supporting these technologies) in a standard and easier way. Isn’t it cool?
I’ve to say a big “thank you” to Fabrice Colin (author of pinot) because my Xesam code relies upon his work.
My work isn’t finished yet. Xesam defines two kinds of queries:
For now I’m thinking of accomplishing this task using flex, but I’m still at a preliminary stage. Suggestions are welcome!
P.S. I’m really happy because this is my first post published on PlanetKDE. Hello to everybody!
]]>How to resist?! :)
After some clicks I made this beautiful avatar:
{% img /images/avatar_simpson.jpg %}
Isn’t it amazing? :D
]]>I’ll enable it again as soon as possible.
BTW: I don’t know who will care about this limitation ;)
]]>Many thanks to Roberto for the photos (made with his mobile phone!).
]]>These are the major changes introduced:
Actually I’m very happy of the first point, but I think the second one can still be improved…
svn status and svn diff, and I made some commits on Strigi trunk. Most of them are just code readability improvements (i.e. don’t exceed 80 characters per line).
I think that Jos will be really happy to see them ;)
]]>The most important of all is the last one. I’ve spent all my week-ends working on my new house. I’ve been fixing and improving it (wow… it seems like I’m talking about a piece of software ;) ) and now it’s almost finished.
I have to say a great “thank you” to my dad. He’s able to do everything (he’s even better than MacGyver) and helped me a lot (in fact he has done all the dirty jobs).
Prepare yourself, I’ll put some photos of the house really soon…
]]>During the upgrade I also installed a “real” gallery program (I’m referring to Gallery), you can navigate through it clicking here.
I hope you’ll like the new look and will enjoy it!
I also promise that I’ll keep this site more updated, stay tuned… in the next weeks I’ll start writing some good technical post ;)
Btw, during the upgrade I ported all previous contents and comments to drupal.
]]>This speech illustrates:
I’ve uploaded the slides, you can find them in the paper category.
]]>Departure from Italy. The plane landed at Charleroi airport, where we (Laura and I) took a bus to Bruxelles. Once in the city it was so late that there was no public transport, so we had to take a cab to reach our hotel.
Small tour of Bruxelles. As we thought, Bruxelles is a beautiful city with lots of things to see; in fact there are too many things to see for a 2-day trip! In the late morning we reached FOSDEM at the ULB university.
After a long tour across the different stands I found the KDE guys and especially Jos. I met him at the Nepomuk talk. There I discovered that, since Sebastian Trueg (author of the famous K3b burning program) was ill, Jos had taken his place. It was a really interesting talk that “warmed up” people, making them more curious about Strigi. Then it was my turn. Everything went right and in the end there were many questions.
The day ended with another small walk in Bruxelles and a dinner in one typical restaurant.
We went to FOSDEM for the last time to see Jos’ talk. As usual Jos did a great job and all the people liked it. After a small chat with him about some technical aspects of Strigi, we left FOSDEM and went for another small tour of Bruxelles.
In the early afternoon we had to take the bus to Charleroi airport. We came back home at 10pm with some bottles of good Belgian beer ;)
It has been a great experience, and I’ll try to come back to FOSDEM next year!
You can find my presentation in the papers section.
]]>Strigi desktop integration: how to access Strigi features.
Strigi, the fastest and smallest desktop search engine, provides fast searches and good metadata extraction capabilities. It will be used by the next KDE, but it can be easily integrated in other programs.
Modern humans are using more and more data every day. Keeping data organized is no longer sufficient, so finding and filtering documents have become key tasks in modern operating systems. The traditional Unix tools find, grep and locate no longer suffice for a number of reasons: the amounts of data have grown more than data access speeds, many document formats are too complex to handle with simple tools, and relations between different types of data are becoming more important for a convenient user experience. Strigi introduces a new way of looking at metadata and file formats that enables the creation of very efficient tools for improving the way users handle their data. It does so by using simple C++ code with very few dependencies.
Strigi’s features can be easily integrated into other programs using two different interfaces: socket and D-Bus. The latter can be considered the best choice since there are a lot of D-Bus bindings, which make it possible to interact with Strigi from many different programming languages. The D-Bus inter-process communication system will be used by KDE4 and is already used by GNOME. Thanks to this, Strigi can be easily integrated into different window managers and into programs written in languages other than C++.
KDE developers can also take advantage of Strigi’s JStreams: a set of classes for fast access to archive files. In fact Strigi defines a dedicated KIO slave for JStreams; this can easily be used by other programs, making archive access faster.
This presentation will show how the current Strigi clients interact with the daemon, how to communicate with Strigi and, most important of all, how other projects can benefit from accessing Strigi too.
FOSDEM stands for “Free Open Source Developers European Meeting”.
This is one of the most important happenings for the open-source scene in Europe. It takes place in Bruxelles, where for two days the university is full of talks and stands of open-source projects.
Big projects have a DevRoom, a place reserved for talks related to them. And here I come… Since Strigi is related to KDE, Jos van den Oever and I will give two presentations in the KDE DevRoom.
I’ll give a talk with the title “Strigi desktop integration”. I’ll talk about the different interfaces you can use for playing with Strigi. These interfaces are really easy to use, which makes it possible to integrate Strigi in different projects or to write some cool front-end without any hassle. Finally I’ll talk about the technologies that reside inside Strigi’s internals. Since they’re really useful, it can be smart to use them in different situations (also when information retrieval isn’t the main goal). Here you can get more detailed information about my talk.
]]>The other great thing is that, exactly a month after my graduation, I found a good job in an IT company. In the meantime I got lots of job interviews, and I’ve been really lucky because I had lots of choices. The main office is located in Milan, a great city near Bergamo (where I live). Since I don’t like living in Milan (for lots of reasons), I became a commuter. So every day I take a train, the subway and a tram in order to reach work. It isn’t as hard as you could think… I have lots of time to spend reading and maybe coding ;) In the end, for the last point regarding Strigi and FOSDEM, I’ll write a dedicated post.
]]>I’ve just contacted the klaptopdaemon maintainers and I’m going to commit it to KDE svn, so you’ll get everything in the next klaptopdaemon release :)
]]>I made this program just for fun and to learn something (it was the first time I used the Qt library), so don’t bother me if it isn’t fully comparable to HP’s ADK.
I don’t use my calculator much; the only thing I really needed was a decent note editor and that’s what HpCalc is going to be: an HP 39G note editor.
Currently the program is only an alpha version, but it’s quite usable for loading/editing/saving .not files.
As I said before, I don’t use my calculator much and this isn’t one of my primary interests, so I don’t know if I’ll ever update it.
HpCalc source code is released under GPL2 license, so you can grab it free from this github repository.
]]>This speech illustrates:
A couple of months have passed since I started seeking something interesting to challenge myself with. I found kat, an open-source information retrieval program for KDE. If you don’t know what an information retrieval program is, just think of a local Google: it’s simply a program that lets you search through your local files, just like Google does for the web. Other information retrieval programs are Beagle and Google Desktop. I started to study kat’s code and I also discovered that its maintainer is a nice Italian guy. Unfortunately kat has some ugly problems and Roberto (the maintainer) can’t fix them because he’s really busy now. I was going to investigate these problems in order to fix them when I discovered a similar project: strigi.
strigi is a really young project created by Jos van den Oever. It’s written in C++ using the STL and other external libraries. It runs as a daemon listening on a socket. In this way you just have to write your custom GUI using your favorite language, nice isn’t it? I contacted Jos and began to send patches and add new features to strigi, committing them straight into the KDE Subversion repository (cool, I’m a bit excited about it :) ). Recently I added support for the Linux kernel inotify interface, an essential component for strigi.
I really prefer strigi over kat because:
You can find more information about strigi here.
]]>I’m going to release it under the GPL on the BerliOS site. Currently I’ve registered the project (see http://developer.berlios.de/projects/qshapes/ ), committed some code to the svn repository and uploaded some screenshots. The program is “quite” stable (there are still some crashes) and can run under Linux, Mac OS and Windows. It’s written in C++ using Qt for the GUI, so it’s really portable.
I think I’ll use this program as a starting point for one of my dreams: a multiplatform open-source diagram creation tool like dia, kivio or Microsoft Visio®. I like dia and kivio but both lack some components/features. Since I don’t like “gnomish” software too much I’ll never improve dia. On the other hand kivio is quite pretty but poorer than dia in some situations. Most of all, kivio isn’t multiplatform and that’s a big problem for me.
But now I’m really busy so I’ll start working on this project after my second level thesis (I’ll tell you something about it really soon).
]]>have the following keymap: apple keyboard = alt gr
You’ve to edit your /etc/X11/xkb/keycodes/xfree86 file. Remember to make a backup copy of this file before editing it. These are the easy steps:
Restart X and keep your finger crossed ;)
]]>We just want this mapping:
apple key = alt gr
numpad return (the key near to left arrow) = canc
Simply download the file at the end of the page and load it using loadkeys
ibook-it.map.gz
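For example, assuming the file was saved in the current directory (loadkeys can usually read gzipped keymaps directly; otherwise gunzip it first), loading it should be as simple as:

loadkeys ibook-it.map.gz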
Simple, isn’t it?
I found this file over internet, the author is Dario Besseghini. Thank you Dario!
{% gist 2469779 %}
]]>Obviously I’m really happy but I’m also really busy because I’m working hard on strigi. I’ll let you know something more about it in these days…
]]>I just wanted to enable mouse button emulation on my iBook using the following shortcuts:
fn + ctrl = middle mouse button
fn + alt = left mouse button
You’ve to enable CONFIG_MAC_EMUMOUSEBTN in your kernel.
In order to enable this shortcut at every boot, add the following lines to your /etc/sysctl.conf:
dev.mac_hid.mouse_button_emulation = 1
dev.mac_hid.mouse_button2_keycode = 97
dev.mac_hid.mouse_button3_keycode = 100
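If you want the new values to take effect immediately, without rebooting, you can usually reload the file with (run as root; this is a generic sysctl tip, not something specific to this setup):

sysctl -p /etc/sysctl.conf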
Simple, isn’t it?
]]>The article focuses on the installation of Linux on some modern consoles like:
! file: for each file/directory you’ve removed without using the command svn del. svn-commit will prevent these errors because it will tell svn to:
- add all your unversioned files to the repository
- delete all the files you’ve removed from your working directory (be careful!!)
svn-commit requires:
svn-commit syntax: svn-commit directory
A simple example: svn-commit `pwd`
{% codeblock [svn-commit.py] [lang:python ] %}
#! /usr/bin/python

import os
from sys import argv

if __name__ == "__main__":
    dir = argv[1]
    # executes shell command
    pipe = os.popen('svn status %s' % (dir))
    result = [x.strip() for x in pipe.readlines()]
    pipe.close()

    to_add = []
    to_remove = []

    for x in result:
        if x.find('?') != -1:
            to_add += [x[x.find('/'):len(x)]]
        elif x.find('!') != -1:
            to_remove += [x[x.find('/'):len(x)]]

    print "To add\n"
    for x in to_add:
        print("Command svn add %s" % (x))
        pipe = os.popen("svn add \"%s\"" % (x))
        output = [y.strip() for y in pipe.readlines()]
        for y in output:
            print y
        pipe.close()

    print "To remove\n"
    for x in to_remove:
        print("Command svn delete \"%s\"" % (x))
        pipe = os.popen("svn delete \"%s\"" % (x))
        output = [y.strip() for y in pipe.readlines()]
        for y in output:
            print y
        pipe.close()
{% endcodeblock %}
]]>In this example you’ll find how you can match a regexp in a string.
{% codeblock [pattern matching] [lang:c++ ] %}
// Created by Flavio Castelli
#include <cstdio>
#include <string>
#include <boost/regex.hpp>

int main() {
  boost::regex pattern ("bg|olug", boost::regex_constants::icase|boost::regex_constants::perl);
  std::string stringa ("Searching for BsLug");

  if (boost::regex_search (stringa, pattern, boost::regex_constants::format_perl))
    printf ("found\n");
  else
    printf ("not found\n");

  return 0;
}
{% endcodeblock %}
In this example you’ll find how you can replace a string matching a pattern.
{% codeblock [substitutions] [lang:c++ ] %}
// Created by Flavio Castelli flavio.castelli@gmail.com
// distributed under GPL v2 license
#include <cstdio>
#include <string>
#include <boost/regex.hpp>

int main() {
  boost::regex pattern ("b.lug", boost::regex_constants::icase|boost::regex_constants::perl);
  std::string stringa ("Searching for bolug");
  std::string replace ("BgLug");
  std::string newString;

  newString = boost::regex_replace (stringa, pattern, replace);

  printf ("The new string is: |%s|\n", newString.c_str());

  return 0;
}
{% endcodeblock %}
In this example you'll find how you can tokenize a string with a pattern.

{% codeblock [split] [lang:c++ ] %}
// Created by Flavio Castelli flavio.castelli@gmail.com
// distributed under GPL v2 license
#include <cstdio>
#include <string>
#include <boost/regex.hpp>

int main() {
  boost::regex pattern ("\\D", boost::regex_constants::icase|boost::regex_constants::perl);

  std::string stringa ("26/11/2005 17:30");
  std::string temp;

  boost::sregex_token_iterator i(stringa.begin(), stringa.end(), pattern, -1);
  boost::sregex_token_iterator j;

  unsigned int counter = 0;

  while (i != j) {
    temp = *i;
    printf ("token %i = |%s|\n", ++counter, temp.c_str());
    i++;
  }

  return 0;
}
{% endcodeblock %}
In order to build these examples you'll need:
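As a rough illustration (the file name is made up, and I'm assuming g++ with a system-wide Boost installation), one of the examples above can be compiled by linking against the Boost regex library:

g++ pattern_matching.cpp -o pattern_matching -lboost_regex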
Requirements: in order to run, remove-svn requires Python.
Synopsis: remove-svn syntax is: remove-svn dir
In this way remove-svn will recursively remove all .svn directories found under dir.
An example: remove-svn `pwd`
UPDATE: a faster way of removing .svn directories is this simple bash command: find ./ -name "*svn*" | xargs rm -rf
The old script has been removed.
]]>This isn’t the final version of qtcanvas; it’s only a backport of the original classes shipped with Qt3.
So what’s the difference between this qtcanvas and qt3canvas (available only through Qt3 support with Qt4)? Simple: this version works with all open-source versions of Qt >= 4.1.0!!
In this way you can use qtcanvas also under windows (before it wasn’t possible with the open-source edition).
I’ve tried qtcanvas under Mac OS X Tiger, GNU/Linux and Windows XP and it works fine.
]]>In this way you can watch the status of one or more guides, keeping the translations updated.
gen-docheck requires:
gen-docheck syntax: gen-docheck [--help] [--man] [--config configuration_file]
For more information read the man page: gen-docheck --man
gen-docheck also supports configuration files.
This is an example:
#mail sender
sender = gentoo_doccheck@gentoo.org
#check only guides matching these names (use "." to match all, "," to separate names)
checkonly = diskless,macos
#checkonly = .
#send mail notify to translator
mailnotify = 0
#send all mail notify to this address
force_mail_destination = flavio.castelli@gmail.com
# smtp server
smtp = smtp.tiscali.it
# debug smtp commands
smtpdebug = 0
You can automate gen-docheck adding it to cron.
Here’s an example:
0 10 * * 0 /home/micron/gen-docheck/gen-docheck.pl --config /home/micron/gen-docheck/gen-docheck.conf
In this way gen-docheck will run every Sunday at 10:00 AM.
The code can be found inside of this git repository.
]]>id3medit is a simple script for tagging all mp3/ogg files present in a directory.
id3medit relies on id3v2, a command-line tool for editing id3v2 tags. File names must be in the format ‘## - trackname.ext’, where ## is the track number and ext is the file extension (mp3 or ogg, case insensitive).
id3medit syntax is: id3medit artist album year(*) genre(*), where (*) denotes optional arguments. You can obtain the genre identification number in this way (a full usage example follows the genre list below):
id3v2 -L | grep -i genre
id3v2 -L | grep -i rock
1: Classic Rock
17: Rock
40: Alt. Rock
47: Instrum. Rock
56: Southern Rock
78: Rock & Roll
79: Hard Rock
81: Folk/Rock
91: Gothic Rock
92: Progress. Rock
93: Psychadel. Rock
94: Symphonic Rock
95: Slow Rock
121: Punk Rock
141: Christian Rock
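Purely as an illustration (artist, album and year are made up, and 17 is the Rock genre id taken from the list above), tagging all the tracks in a directory might look like:

./id3medit "Metallica" "Ride the Lightning" 1984 17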
{% gist 2469919 %}
]]>This speech is about: