This time my personal project has been about building a unikernel that runs WebAssembly.
I wanted this blog post to contain all the details about this journey. However I realized this would have been too much for a single post. I hence decided to split everything into smaller chunks. I’ll update this section to keep track of all the posts.
In the meantime, you can find the code of the POC here.
There are multiple reasons why I did that, but I don’t want to repeat what I wrote inside of the project description. Learning and fun goals aside, I think there’s actually a good reason to mix unikernels and WebAssembly.
From the application developer POV, porting or writing an application to run on a unikernel is not an easy task. The application and all its dependencies have to support the target unikernel. Some patching might be required inside of the whole application stack to make it work.
From the unikernel maintainers POV, they have to invest quite some energy to ensure any kind of application can run seamlessly on top of their platform. They don’t know which system primitives user applications will leverage, which makes everything harder.
On the other hand, when targeting a WebAssembly platform (think of Spin or Spiderlightning), the application has a clear set of capabilities that have to be provided by the WebAssembly runtime.
If you look at the Spiderlightning scenario, an application might require Key/Value store capabilities at runtime. However, how these capabilities are implemented on the host side is not relevant to the application. That means the same .wasm module can be run by a runtime that implements the K/V store using Redis or using Azure Cosmos DB.
That would be totally transparent to the end user application.
You might see where I’m going with all that…
If we write a unikernel application that runs WebAssembly modules and supports a set of Spiderlightning APIs, then the same Spiderlightning application could be run both on top of the regular slight runtime and on top of this unikernel.
All of that without any additional work from the application developer. The Wasm module wouldn’t even realize that. The complexity would fall only on the unikernel developer who, however, would have a clear set of functionalities to implement (as opposed to “let’s try to make any kind of application work”).
Some time ago I stumbled upon the RustyHermit project, a unikernel written in Rust. I decided to use it as the foundation to write my unikernel application.
Building a RustyHermit application is pretty straightforward. Their documentation, even though it is a bit scattered, is good and their examples help a lot.
The cool thing is that RustyHermit is part of Rust nightly, which makes the whole developer experience great. It feels like writing a regular Rust application.
Obviously you cannot expect all kinds of Rust crates to just work with RustyHermit. You will see how that influenced the development of the POC.
The next sections go over some of the major challenges I faced during the last week. I’ll share more details inside of the upcoming blog posts (see the disclaimer section at the top of the page).
Unfortunately Wasmtime, my favorite WebAssembly runtime,
does not build on top of RustyHermit. Many of its dependencies expect libc
or other low level libraries to be around.
The same applies to wasmer.
I thought about using something like WebAssembly Micro Runtime (WAMR), but I preferred to stick with something written in Rust and have the “full RustyHermit experience”.
After some searching I found wasmi, a pure Rust WebAssembly runtime. It works fine on top of RustyHermit, and its design is inspired by that of Wasmtime, which allowed me to reuse a lot of my previous knowledge.
Spiderlightning leverages the WebAssembly Component Model proposal to offer capabilities to the WebAssembly guests and to allow the host to consume capabilities offered by the WebAssembly guest.
The communication between the host and the guest happens using types defined with the Wasm Interface Type (WIT) format.
To give some concrete examples, the demo I’m going to run leverages the WebAssembly Component Model in these ways:
- http-server types. In this case it’s the guest that leverages capabilities offered by the host.
- http-handler types.
- keyvalue types.

As you can see there are many WIT types involved. For each one of them we need code both inside of the guest (an SDK, basically) and on the host (the code that implements the guest SDK).
This code can be scaffolded by a CLI tool called wit-bindgen, which generates host/guest code starting from a .wit file.
In this case I only had to implement the host side of these interfaces inside of the unikernel.
The code generated by wit-bindgen performs low-level operations using the WebAssembly runtime. The code to be scaffolded depends on the programming language and on the WebAssembly runtime used on the host side.
Obviously the wasmi WebAssembly runtime was not supported by wit-bindgen, hence I had to extend wit-bindgen to handle it. The code can be found inside of this fork, under the wasmi branch.
With all of that in place, I scaffolded the host side of the Key/Value capability and made a simple implementation of the host traits. The host code was just emitting some debug information. I was then able to run the vanilla keyvalue-demo from the Spiderlightning project. 🥳
You made it to the bottom of this long post, kudos! I think you deserve a prize for that, so here we go…
This is a recording of the unikernel application running the Spiderlightning http-server demo.
I hope you enjoyed the read. Stay tuned for the next part of the journey, which will cover Rust async, Redis and some weird errors.
This language is being used by some open source projects and products.
I’ve been looking at CEL for some time, wondering how hard it would be to find a way to write Kubewarden validation policies using this expression language.
Some weeks ago SUSE Hackweek 21 took place, which gave me some time to play with this idea.
This blog post describes the first step of this journey. Two other blog posts will follow.
Currently the only mature implementations of the CEL language are written in Go and C++.
Kubewarden policies are implemented using WebAssembly modules.
The official Go compiler isn’t yet capable of producing WebAssembly modules that can be run outside of the browser. TinyGo, an alternative implementation of the Go compiler, can produce WebAssembly modules targeting the WASI interface. Unfortunately TinyGo doesn’t yet support the whole Go standard library. Hence it cannot be used to compile cel-go.
Because of that, I was left with no other choice than to use the cel-cpp runtime.
C and C++ can be compiled to WebAssembly, so I thought everything would be fine.
Spoiler alert: this didn’t turn out to be “fine”, but that’s for another blog post.
CEL is built on top of protocol buffer types. That means CEL expects the input data (the one to be validated by the constraint) to be described using a protocol buffer type. In the context of Kubewarden this is a problem.
Some Kubewarden policies focus on a specific Kubernetes resource; for example,
all the ones implementing Pod Security Policies are inspecting only Pod
resources.
Others, like the ones looking at the labels or annotations attributes, instead evaluate any kind of Kubernetes resource.
Forcing a Kubewarden policy author to provide a protocol buffer definition of the object to be evaluated would be painful. Luckily, CEL evaluation libraries are also capable of working against free-form JSON objects.
The long term goal is to have a CEL evaluator program compiled into a WebAssembly module.
At runtime, the CEL evaluator WebAssembly module would be instantiated and would receive as input three objects: the CEL constraint, the settings used to tune the policy and the Kubernetes object to be evaluated.
Having set the goals, the first step is to write a C++ program that takes as input a CEL constraint and applies that against a JSON object provided by the user.
There’s going to be no WebAssembly today.
In this section I’ll go through the critical parts of the code. I’ll do that to help other people who might want to make a similar use of cel-cpp.
There’s basically zero documentation about how to use the cel-cpp library. I had to learn how to use it by looking at the excellent test suite. Moreover, the topic of validating a JSON object (instead of a protocol buffer type) isn’t covered by the tests. I just found some tips inside of the GitHub issues and then I had to connect the dots by looking at the protocol buffer documentation and other pieces of cel-cpp.
TL;DR The code of this POC can be found inside of this repository.
The program receives a string containing the CEL constraint and has to use it to create a CelExpression object.
This is pretty straightforward, and is done inside of these lines of the evaluate.cc file.
As you will notice, cel-cpp makes use of the Abseil library. A lot of cel-cpp APIs return absl::StatusOr&lt;T&gt; objects, which have to be handled like this:
// invoke API
auto parse_status = cel_parser::Parse(constraint);
if (!parse_status.ok())
{
    // handle error
std::string errorMsg = absl::StrFormat(
"Cannot parse CEL constraint: %s",
parse_status.status().ToString());
return EvaluationResult(errorMsg);
}
// Obtain the actual result
auto parsed_expr = parse_status.value();
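The parsed expression then has to be compiled into an executable CelExpression through the builder API. Here is a minimal sketch of that step; it assumes the standard cel-cpp expression builder factory and built-in function registrar (exact header paths and names might differ slightly between cel-cpp versions):

#include "eval/public/builtin_func_registrar.h"
#include "eval/public/cel_expr_builder_factory.h"
#include "eval/public/cel_expression.h"

using namespace google::api::expr::runtime;

// Create the expression builder and register the CEL standard
// functions (==, in, startsWith, ...) so constraints can use them.
InterpreterOptions options;
std::unique_ptr<CelExpressionBuilder> builder = CreateCelExpressionBuilder(options);
auto register_status = RegisterBuiltinFunctions(builder->GetRegistry(), options);
if (!register_status.ok())
{
    // handle error
}

// Turn the parsed constraint into an executable expression.
auto expr_status = builder->CreateExpression(&parsed_expr.expr(),
                                             &parsed_expr.source_info());
if (!expr_status.ok())
{
    // handle error
}
std::unique_ptr<CelExpression> cel_expr = std::move(expr_status.value());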
cel-cpp expects the data to be validated to be loaded into a CelValue object.
As I said before, we want the final program to read a generic JSON object as input data. Because of that, we need to perform a series of transformations.
First of all, we need to convert the JSON data into a protobuf::Value object. This can be done using the protobuf::util::JsonStringToMessage function. This is done by these lines of code.
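As a rough sketch, assuming the user-provided JSON document is available inside of a std::string named request_json (a variable name I’m making up for this example), the conversion looks like this:

#include <string>
#include <google/protobuf/struct.pb.h>
#include <google/protobuf/util/json_util.h>

// Parse the user-provided JSON document into a protobuf::Value message.
google::protobuf::Value json_value;
auto json_status =
    google::protobuf::util::JsonStringToMessage(request_json, &json_value);
if (!json_status.ok())
{
    // handle error: the input string is not valid JSON
}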
Next, we have to convert the protobuf::Value object into a CelValue one.
The cel-cpp library doesn’t offer any helper for that. As a matter of fact, one of the oldest open issues of cel-cpp is exactly about that.
This last conversion is done using a series of helper functions I wrote inside of the proto_to_cel.cc file. The code relies on the introspection capabilities of protobuf::Value to build the correct CelValue.
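The overall approach of those helpers can be sketched like this. The function below only covers the scalar kinds of protobuf::Value; maps and lists need CelMap/CelList backed implementations and are handled by dedicated helpers in the real code. The function name is made up for this example:

#include <string>
#include <google/protobuf/struct.pb.h>
#include "eval/public/cel_value.h"

using google::api::expr::runtime::CelValue;

// Convert the scalar kinds of a protobuf::Value into the matching CelValue.
CelValue ProtoScalarToCelValue(const google::protobuf::Value& value,
                               google::protobuf::Arena* arena)
{
    switch (value.kind_case())
    {
    case google::protobuf::Value::kBoolValue:
        return CelValue::CreateBool(value.bool_value());
    case google::protobuf::Value::kNumberValue:
        return CelValue::CreateDouble(value.number_value());
    case google::protobuf::Value::kStringValue:
        // The string must outlive the CelValue, hence it's copied onto the arena.
        return CelValue::CreateString(
            google::protobuf::Arena::Create<std::string>(arena, value.string_value()));
    case google::protobuf::Value::kNullValue:
    default:
        return CelValue::CreateNull();
    }
}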
Once the CEL expression object has been created, and the JSON data has been converted into a CelValue, there’s only one last thing to do: evaluate the constraint against the input.
First of all we have to create a CEL Activation object and insert the CelValue holding the input data into it. This takes just a few lines of code.
Finally, we can use the Evaluate method of the CelExpression instance and look at its result. This is done by these lines of code, which include the usual pattern that handles absl::StatusOr&lt;T&gt; objects.
The actual result of the evaluation is going to be a CelValue
that holds
a boolean type inside of itself.
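Putting the last two steps together, a minimal sketch of the evaluation could look like the following; it assumes the cel_expr object and the request CelValue built in the previous steps (variable names are illustrative):

#include "eval/public/activation.h"

using google::api::expr::runtime::Activation;
using google::api::expr::runtime::CelValue;

// Expose the converted input to the constraint under the "request" name.
google::protobuf::Arena arena;
Activation activation;
activation.InsertValue("request", request_cel_value);

// Evaluate the compiled constraint against the activation.
auto eval_status = cel_expr->Evaluate(activation, &arena);
if (!eval_status.ok())
{
    // handle error
}

CelValue result = eval_status.value();
if (result.IsBool())
{
    bool satisfied = result.BoolOrDie();
    // report whether the constraint has been satisfied or not
}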
This project uses the Bazel build system. I never used Bazel before, which proved to be another interesting learning experience.
A recent C++ compiler is required by cel-cpp. You can use either gcc (version 9+) or clang (version 10+). Personally, I’ve been using clang 13.
Building the code can be done in this way:
CC=clang bazel build //main:evaluator
The final binary can be found under bazel-bin/main/evaluator
.
The program loads a JSON object called request
which is then embedded
into a bigger JSON object.
This is the input received by the CEL constraint:
{
  "request": < JSON_OBJECT_PROVIDED_BY_THE_USER >
}
The idea is to later add another top level key called settings. This one would be used by the user to tune the behavior of the constraint.
Because of that, the CEL constraint must access the request values by going through the top level request key.
This is easier to explain by using a concrete example:
./bazel-bin/main/evaluator \
--constraint 'request.path == "v1"' \
--request '{ "path": "v1", "token": "admin" }'
The CEL constraint is satisfied because the path key of the request is equal to v1.
On the other hand, this evaluation fails because the constraint is not satisfied:
$ ./bazel-bin/main/evaluator \
--constraint 'request.path == "v1"' \
--request '{ "path": "v2", "token": "admin" }'
The constraint has not been satisfied
The constraint can be loaded from a file. Create a file named constraint.cel with the following contents:
!(request.ip in ["10.0.1.4", "10.0.1.5", "10.0.1.6"]) &&
((request.path.startsWith("v1") && request.token in ["v1", "v2", "admin"]) ||
(request.path.startsWith("v2") && request.token in ["v2", "admin"]) ||
(request.path.startsWith("/admin") && request.token == "admin" &&
request.ip in ["10.0.1.1", "10.0.1.2", "10.0.1.3"]))
Then create a file named request.json
with the following contents:
{
  "ip": "10.0.1.4",
  "path": "v1",
  "token": "admin"
}
Then run the following command:
./bazel-bin/main/evaluator \
--constraint_file constraint.cel \
--request_file request.json
This time the constraint will not be satisfied.
Note: I find the _ symbols inside of the flags a bit weird, but this is what the Abseil flags library I experimented with does. 🤷
Let’s evaluate a different kind of request:
./bazel-bin/main/evaluator \
--constraint_file constraint.cel \
--request '{"ip": "10.0.1.1", "path": "/admin", "token": "admin"}'
This time the constraint will be satisfied.
This has been a stimulating challenge.
I hadn’t written big chunks of C++ code in a long time! Actually, I never had a chance to look at the latest C++ standards. I gotta say, lots of things changed for the better, but I still prefer to pick other programming languages 😅
I had prior experience with autoconf &amp; friends, qmake and cmake, but I had never used Bazel before.
As a newcomer, I found the documentation of Bazel quite good. I appreciated
how easy it is to consume libraries that are using Bazel. I also like how
Bazel can solve the problem of downloading dependencies, something
you had to solve on your own with cmake
and similar tools.
The concept of building inside of a sandbox, with all the dependencies vendored, is interesting but can be kinda scary. Try building this project and you will see that Bazel seems to be downloading the whole universe. I’m not kidding, I’ve spotted a Java runtime, a Go compiler plus a lot of other C++ libraries.
The bazel build command gives a nice progress bar. However, the number of tasks to be done keeps growing during the build process. It kinda reminded me of the old Windows progress bar!
I gotta say, I regularly have this feeling of “building the universe” with Rust, but Bazel took that to the next level! 🤯
Finally, I had to do a lot of spelunking inside of different C++ code bases: envoy, protobuf’s c++ implementation, cel-cpp and Abseil to name a few. This kind of activity can be a bit exhausting, but it’s also a great way to learn from the others.
Well, in a couple of weeks I’ll blog about my next step of this journey: building C++ code to standalone WebAssembly!
Now I need to take some deserved vacation time 😊!
⛰️ 🚶👋
You probably have already heard about WebAssembly, but chances are high that happened in the context of web application development. There’s however a new emerging trend that consists of using WebAssembly outside of the browser.
WebAssembly has many interesting properties that make it great for writing plugin systems or even distributing small computational units (think of FaaS).
WebAssembly is what is being used to power Kubewarden, a project I created almost two years ago at SUSE Rancher, with the help of Rafa and other awesome folks. This is where the majority of my “blogging energies” have been focused.
Now, let’s go back to the main focus of today’s blog entry: write kubectl plugins using WebAssembly.
As you all know, kubectl can be easily extended by writing external plugins.
These plugins are executables named kubectl-&lt;name of the plugin&gt; that, once put in your $PATH, can be invoked via kubectl &lt;name of the plugin&gt;. This is the same mechanism used to write git plugins.
These plugins can be managed via a tool called Krew.
The kubectl
tool is available for multiple operating systems and architectures,
which means these plugins must be available for many platforms.
I think writing kubectl plugins using WebAssembly has a number of advantages.
Last but not least, this sounds like a fun experiment!
krew-wasm
The idea about writing kubectl plugins with WebAssembly originated during a
brainstorming session I was doing with Rafa about our upcoming talk for
WasmDay EU 2022. The idea kinda “infected” me, I had to
hack on it ASAP!!! This is how the krew-wasm
project was created.
krew-wasm takes inspiration from Krew, but it does not aim to replace it. Quite the opposite: it’s a complementary tool that can be used alongside Krew.
The sole purpose of krew-wasm is to manage and execute kubectl plugins written using WebAssembly and WASI.
krew-wasm plugins are WebAssembly modules that are distributed using container registries, the same infra used to host container images.
krew-wasm can download kubectl WebAssembly plugins from a container registry and
make them discoverable to kubectl.
This is achieved by creating a symbolic link for each managed plugin. This symbolic
link is named kubectl-<name of the plugin>
but, instead of pointing to the
WebAssembly module, it points to the krew-wasm
executable.
Once invoked, krew-wasm determines its usage mode, which can either be a “direct invocation” (when the user invokes the krew-wasm binary to manage plugins) or a “wrapper invocation” done via kubectl.
When invoked in “wrapper mode”, krew-wasm takes care of loading the WebAssembly plugin and invoking it. krew-wasm works as a WebAssembly host, and takes care of setting up the WASI environment used by the plugin.
I’ll leave the technical details out of this post, but if you want you can find more on the GitHub project page.
The POC would not be complete without some plugins to run. Guess what, you can find one right here!
The kubectl decoder plugin dumps Kubernetes Secret objects to the standard output, decoding all the data that is base64-encoded. On top of that, when an x509 certificate is found inside of the Secret, a detailed output is shown rather than the not-so-helpful PEM-encoded representation of the certificate.
If you want to experiment with this idea, you can write your plugins using Rust and this SDK.
This has been a nice experiment. It proves the combination of WebAssembly and WASI can be used to produce working kubectl plugins.
What’s more interesting is the fact these technologies could be used to extend other Cloud Native projects. Did someone say helm? 😜
There are however some limitations, mostly caused by the freshness of WASI. These are documented here. However, I’m pretty sure things will definitely improve over the next months. After all the WebAssembly ecosystem is moving at a fast pace!
Note well: this blog post is part of a series, check out the previous episode about running containerized buildah on top of Kubernetes.
I have a small Kubernetes cluster running at home that is made of ARM64 and x86_64 nodes. I want to build multi-architecture images so that I can run them everywhere on the cluster, regardless of the node architecture. My plan is to leverage the same cluster to build these container images. That leads to an “Inception-style” scenario: building container images from within a container itself.
To achieve that I decided to rely on buildah to build the container images. I’ve shown how to run buildah in a containerized fashion without using a privileged container and with a tailor-made AppArmor profile to secure it.
The previous blog post also showed the definition of Kubernetes PODs that would build the actual images.
What I’m going to show today is how to automate the whole building process.
Given the references to the Git repository that provides a container image definition, I want to automate these steps:
1. build the container image for the x86_64 architecture
2. build the container image for the ARM64 architecture
3. create a multi-architecture manifest referencing both images and push it to the registry
Steps #1 and #2 can be done in parallel, while step #3 needs to wait for the previous ones to complete.
This kind of automation can be done using some pipeline solution.
There are many Continuous Integration and Continuous Delivery solutions that are available for Kubernetes. If you love to seek enlightenment by staring in front of beautiful logos, checkout this portion of the CNCF landscape dedicated to CI and CD solutions. 🤯
After some research I came up with two potential candidates: Argo and Tekton.
Both are valid projects with active communities. However I decided to settle on Argo. The main reason that led to this decision was the lack of ARM64 support from Tekton.
Interestingly enough, both Tekton and kaniko (which I discussed in the previous blog post of this series) use the same mechanism to build themselves, a mechanism that can produce only x86_64 container images and is not so easy to extend.
Argo is an umbrella of different projects, each one of them tackling specific problems: Argo Workflows, Argo CD, Argo Rollouts and Argo Events.
The projects above are just the mature ones, many others can be found under the Argo project labs GitHub organization. These projects are not yet considered production ready, but are super interesting.
My favourite ones are:
The majority of these projects don’t have ARM64 container images yet, but work is being done and this work is significantly simpler compared to the one of porting Tekton. Most important of all: the core projects I need have already been ported.
A pipeline can be created inside Argo by defining a Workflow resource.
Copying from the core concepts documentation page of Argo Workflow, these are the elements I’m going to use:
- Workflow: a Kubernetes resource defining the execution of one or more template.
- Template: a step, steps or dag.
Spoiler alert: I’m going to create multiple Argo Templates, each one of them focusing on one specific part of the problem. Then I’ll use a DAG to make the dependencies between all these Templates explicit. Finally, I’ll define an Argo Workflow to “wrap” all these objects.
I could show you the final result right away, but you would probably be overwhelmed by it. I’ll instead go step-by-step as I did. I’ll start with a small subset of the problem and then I’ll keep building on top of it.
By the end of the previous blog post, I was able to build a container image by using the following Kubernetes POD definition:
apiVersion: v1
kind: Pod
metadata:
  name: builder
  annotations:
    container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
spec:
  nodeSelector:
    kubernetes.io/arch: "amd64"
  containers:
  - name: main
    image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
    command: ["/bin/sh"]
    args: ["-c", "cd code; cd $(readlink checkout); buildah bud -t guestbook ."]
    volumeMounts:
    - name: code
      mountPath: /code
    resources:
      limits:
        github.com/fuse: 1
  initContainers:
  - name: git-sync
    image: k8s.gcr.io/git-sync/git-sync:v3.1.7
    args: [
      "--one-time",
      "--depth", "1",
      "--dest", "checkout",
      "--repo", "https://github.com/flavio/guestbook-go.git",
      "--branch", "master"]
    volumeMounts:
    - name: code
      mountPath: /tmp/git
  volumes:
  - name: code
    emptyDir:
      medium: Memory
These are the key points of this POD:
- the AppArmor annotation that applies the containerized_buildah profile to the main container
- the nodeSelector constraint that forces the build to happen on an x86_64 node
- the github.com/fuse resource limit that makes /dev/fuse available inside of the main container
- the git-sync init container that checks out the Git repository into the shared code volume
Starting from something like Argo’s “Hello world Workflow”, we can transpose the POD defined above to something like that:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: simple-build-
spec:
  entrypoint: buildah
  templates:
  - name: buildah
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
    nodeSelector:
      kubernetes.io/arch: "amd64"
    container:
      image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
      command: ["/bin/sh"]
      args: ["-c", "cd code; cd $(readlink checkout); buildah bud -t guestbook ."]
      volumeMounts:
      - name: code
        mountPath: /code
      resources:
        limits:
          github.com/fuse: 1
    initContainers:
    - name: git-sync
      image: k8s.gcr.io/git-sync/git-sync:v3.1.7
      args: [
        "--one-time",
        "--depth", "1",
        "--dest", "checkout",
        "--repo", "https://github.com/flavio/guestbook-go.git",
        "--branch", "master"]
      volumeMounts:
      - name: code
        mountPath: /tmp/git
    volumes:
    - name: code
      emptyDir:
        medium: Memory
As you can see the POD definition has been transformed into a Template
object. The contents of the POD spec
section have been basically copied and pasted under the Template.
The POD annotations have been moved straight under the template.metadata
section.
I have to admit this was pretty confusing to me in the beginning, but everything became clear once I started to look at the field documentation of the Argo resources.
The workflow can be submitted using the argo
cli tool:
$ argo submit workflow-simple-build.yaml
Name: simple-build-qk4t4
Namespace: argo
ServiceAccount: default
Status: Pending
Created: Wed Sep 30 15:45:20 +0200 (now)
This will be visible also from the Argo Workflow UI:
The previous Workflow definition can be cleaned up a bit, leading to the following YAML file:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: simple-build-
spec:
  entrypoint: buildah
  templates:
  - name: buildah
    inputs:
      parameters:
      - name: arch
      - name: repository
      - name: branch
      - name: image_name
      - name: image_tag
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
    nodeSelector:
      kubernetes.io/arch: "amd64"
    script:
      image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
      command: [bash]
      source: |
        set -xe
        cd /code/
        # needed to workaround protected_symlink - we can't just cd into /code/checkout
        cd $(readlink checkout)
        buildah bud -t {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}} .
        buildah push --cert-dir /certs {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}}
        echo Image built and pushed to remote registry
      volumeMounts:
      - name: code
        mountPath: /code
      - name: certs
        mountPath: /certs
        readOnly: true
      resources:
        limits:
          github.com/fuse: 1
    initContainers:
    - name: git-sync
      image: k8s.gcr.io/git-sync/git-sync:v3.1.7
      args: [
        "--one-time",
        "--depth", "1",
        "--dest", "checkout",
        "--repo", "{{inputs.parameters.repository}}",
        "--branch", "{{inputs.parameters.branch}}"]
      volumeMounts:
      - name: code
        mountPath: /tmp/git
    volumes:
    - name: code
      emptyDir:
        medium: Memory
    - name: certs
      secret:
        secretName: registry-cert
Compared to the previous definition, this one doesn’t have any hard-coded value inside of it. The details of the Git repository, the image name, the container registry,… all of that is now passed dynamically to the template by using the inputs.parameters map.
The main
container has also been rewritten to use an Argo Workflow specific field: script.source
. This
is really handy because it provides a nice way to write a bash script to be executed inside the
container.
The source
script has been also extended to perform a push
operation at the
end of the build process.
As you can see the architecture of the image is appended to the tag of the image.
This is a common pattern used when building multi-architecture container images.
One final note about the push operation. The destination registry is secured using a self-signed certificate. Because of that, either the CA that signed the certificate or the registry’s certificate has to be provided to buildah. This can be done by using the --cert-dir flag and by placing the certificates to be loaded under the specified path.
Note well, the certificate files must have the .crt file extension, otherwise they won’t be handled.
I “loaded” the certificate into Kubernetes by using a Kubernetes secret like this one:
apiVersion: v1
kind: Secret
metadata:
  name: registry-cert
  namespace: argo
type: Opaque
data:
  ca.crt: `base64 -w 0 actualcert.crt`
As you can see the main
container is now mounting the contents of the registry-cert
Kubernetes Secret under /certs
.
This time, when submitting the workflow, we must specify its parameters:
$ argo submit workflow-simple-build-2.yaml \
-p arch=amd64 \
-p repository=https://github.com/flavio/guestbook-go.git \
-p branch=master \
-p image_name=registry-testing.svc.lan/guestbook-go \
-p image_tag=0.0.1
Name: simple-build-npqdw
Namespace: argo
ServiceAccount: default
Status: Pending
Created: Wed Sep 30 15:52:06 +0200 (now)
Parameters:
arch: {1 0 amd64}
repository: {1 0 https://github.com/flavio/guestbook-go.git}
branch: {1 0 master}
image_name: {1 0 registry-testing.svc.lan/guestbook-go}
image_tag: {1 0 0.0.1}
The Workflow object defined so far is still hard-coded to be scheduled only
on x86_64 nodes (see the nodeSelector
constraint).
I could create a new Workflow definition by copying one shown before and then
change the nodeSelector
constraint to reference the
ARM64 architecture. However, this would violate the
DRY principle.
Instead, I will abstract the Workflow definition by leveraging a feature of
Argo Workflow called
loops.
I will define a parameter for the target architecture and then I will iterate
over two possible values: amd64
and arm64
.
This is the resulting Workflow definition:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: simple-build-
spec:
  entrypoint: build-images-arch-loop
  templates:
  - name: build-images-arch-loop
    inputs:
      parameters:
      - name: repository
      - name: branch
      - name: image_name
      - name: image_tag
    steps:
    - - name: build-image
        template: buildah
        arguments:
          parameters:
          - name: arch
            value: "{{item.arch}}"
          - name: repository
            value: "{{inputs.parameters.repository}}"
          - name: branch
            value: "{{inputs.parameters.branch}}"
          - name: image_name
            value: "{{inputs.parameters.image_name}}"
          - name: image_tag
            value: "{{inputs.parameters.image_tag}}"
        withItems:
        - { arch: 'amd64' }
        - { arch: 'arm64' }
  - name: buildah
    inputs:
      parameters:
      - name: arch
      - name: repository
      - name: branch
      - name: image_name
      - name: image_tag
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
    nodeSelector:
      kubernetes.io/arch: "{{inputs.parameters.arch}}"
    script:
      image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
      command: [bash]
      source: |
        set -xe
        cd /code/
        # needed to workaround protected_symlink - we can't just cd into /code/checkout
        cd $(readlink checkout)
        buildah bud -t {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}} .
        buildah push --cert-dir /certs {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}}
        echo Image built and pushed to remote registry
      volumeMounts:
      - name: code
        mountPath: /code
      - name: certs
        mountPath: /certs
        readOnly: true
      resources:
        limits:
          github.com/fuse: 1
    initContainers:
    - name: git-sync
      image: k8s.gcr.io/git-sync/git-sync:v3.1.7
      args: [
        "--one-time",
        "--depth", "1",
        "--dest", "checkout",
        "--repo", "{{inputs.parameters.repository}}",
        "--branch", "{{inputs.parameters.branch}}"]
      volumeMounts:
      - name: code
        mountPath: /tmp/git
    volumes:
    - name: code
      emptyDir:
        medium: Memory
    - name: certs
      secret:
        secretName: registry-cert
The workflow definition grew a bit. I’ve added a new template called build-images-arch-loop
, which is now
the entry point of the workflow. This template performs a loop over the
[ { arch: 'amd64' }, { arch: 'arm64' } ]
array, each time invoking the buildah
template with slightly different input parameters. The only parameter that changes
across the invocations is the arch
one, which is used to define the
nodeSelector
constraint.
Executing this workflow results in two steps being executed at the same time: one building the image on a random x86_64 node, the other doing the same thing on a random ARM64 node.
This can be clearly seen from the Argo Workflow UI:
When the workflow execution is over, the registry will contain two different images:
<image-name>:<image-tag>-amd64
<image-name>:<image-tag>-arm64
Now there’s just one last step to perform: create a multi-architecture container manifest referencing these two images.
The Image manifest Version 2, Schema 2
specification defines a new type of image manifest called “Manifest list”
(application/vnd.docker.distribution.manifest.list.v2+json
).
Quoting the official specification:
The manifest list is the “fat manifest” which points to specific image manifests for one or more platforms. Its use is optional, and relatively few images will use one of these manifests. A client will distinguish a manifest list from an image manifest based on the Content-Type returned in the HTTP response.
The creation of such a manifest is pretty easy and it can be done with docker, podman and buildah in a similar way.
I will still use buildah to create the manifest and push it to the registry where all the images are stored.
This is the Argo Template that takes care of that:
- name: create-manifest
  inputs:
    parameters:
    - name: image_name
    - name: image_tag
    - name: architectures
  metadata:
    annotations:
      container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
  volumes:
  - name: certs
    secret:
      secretName: registry-cert
  script:
    image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
    command: [bash]
    source: |
      set -xe
      image_name="{{inputs.parameters.image_name}}"
      image_tag="{{inputs.parameters.image_tag}}"
      architectures="{{inputs.parameters.architectures}}"
      target="${image_name}:${image_tag}"
      architectures_list=($(echo $architectures | tr "," "\n"))
      buildah manifest create ${target}
      # add each architecture-specific image to the manifest
      for arch in "${architectures_list[@]}"
      do
        arch_image="${image_name}:${image_tag}-${arch}"
        buildah pull --cert-dir /certs ${arch_image}
        buildah manifest add ${target} ${arch_image}
      done
      buildah manifest push --cert-dir /certs ${target} docker://${target}
      echo Manifest creation done
    volumeMounts:
    - name: certs
      mountPath: /certs
      readOnly: true
    resources:
      limits:
        github.com/fuse: 1
The template has an input parameter called architectures; this string is made of the architecture names joined by a comma, e.g. "amd64,arm64".
The script
creates a manifest with the name of the image and then, iterating
over the architectures, it adds the architecture-specific images to it.
Once this is done the manifest is pushed to the container registry.
To make a simple example, assuming the following scenario:
- the guestbook-go application with release v0.1.0
- the registry.svc.lan registry

The Argo Template that creates the manifest will pull the following images:
- registry.svc.lan/guestbook-go:v0.1.0-amd64: the x86_64 image
- registry.svc.lan/guestbook-go:v0.1.0-arm64: the ARM64 image

Finally, the Template will create and push a manifest named registry.svc.lan/guestbook-go:v0.1.0.
This image reference will always return the right container image to the node
requesting it.
Adding the container image to the manifest is done with the buildah manifest add command. This command doesn’t actually need to have the container image available locally; it would be enough to reach out to the registry hosting it to obtain the manifest digest.
In our case the images are stored on a registry secured with
a custom certificate. Unfortunately, the manifest add
command
was lacking some flags (like the cert
one); because of that I had
to introduce the workaround of pre-pulling all the images referenced
by the manifest. This has the side effect of wasting some time, bandwidth and
disk space.
I’ve submitted patches both to buildah
and to podman to enrich their
manifest add
commands; both pull requests have been merged into the master
branches. The next release of buildah will ship with my patch and the
manifest creation Template will be simpler and faster.
Argo allows you to define a workflow sequence with clear dependencies between each step. This is done by defining a DAG.
Our workflow will be made of one Argo Template of type DAG, which will have two tasks:
- build-images: invokes the build-images-arch-loop template to build the per-architecture images
- create-multi-arch-manifest: invokes the create-manifest template once all the images have been built
This is the Template definition:
- name: full-process
  dag:
    tasks:
    - name: build-images
      template: build-images-arch-loop
      arguments:
        parameters:
        - name: repository
          value: "{{workflow.parameters.repository}}"
        - name: branch
          value: "{{workflow.parameters.branch}}"
        - name: image_name
          value: "{{workflow.parameters.image_name}}"
        - name: image_tag
          value: "{{workflow.parameters.image_tag}}"
    - name: create-multi-arch-manifest
      dependencies: [build-images]
      template: create-manifest
      arguments:
        parameters:
        - name: image_name
          value: "{{workflow.parameters.image_name}}"
        - name: image_tag
          value: "{{workflow.parameters.image_tag}}"
        - name: architectures
          value: "{{workflow.parameters.architectures_string}}"
As you can see the Template takes the usual series of parameters we’ve already defined, and forwards them to the tasks.
This is the full definition of our Argo workflow, hold on… this is really long 🙀
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: build-multi-arch-image-
spec:
  ttlStrategy:
    secondsAfterCompletion: 60
  entrypoint: full-process
  arguments:
    parameters:
    - name: repository
      value: https://github.com/flavio/guestbook-go.git
    - name: branch
      value: master
    - name: image_name
      value: registry-testing.svc.lan/guestbook
    - name: image_tag
      value: 0.0.1
    - name: architectures_string
      value: "arm64,amd64"
  templates:
  - name: full-process
    dag:
      tasks:
      - name: build-images
        template: build-images-arch-loop
        arguments:
          parameters:
          - name: repository
            value: "{{workflow.parameters.repository}}"
          - name: branch
            value: "{{workflow.parameters.branch}}"
          - name: image_name
            value: "{{workflow.parameters.image_name}}"
          - name: image_tag
            value: "{{workflow.parameters.image_tag}}"
      - name: create-multi-arch-manifest
        dependencies: [build-images]
        template: create-manifest
        arguments:
          parameters:
          - name: image_name
            value: "{{workflow.parameters.image_name}}"
          - name: image_tag
            value: "{{workflow.parameters.image_tag}}"
          - name: architectures
            value: "{{workflow.parameters.architectures_string}}"
  - name: build-images-arch-loop
    inputs:
      parameters:
      - name: repository
      - name: branch
      - name: image_name
      - name: image_tag
    steps:
    - - name: build-image
        template: buildah
        arguments:
          parameters:
          - name: arch
            value: "{{item.arch}}"
          - name: repository
            value: "{{inputs.parameters.repository}}"
          - name: branch
            value: "{{inputs.parameters.branch}}"
          - name: image_name
            value: "{{inputs.parameters.image_name}}"
          - name: image_tag
            value: "{{inputs.parameters.image_tag}}"
        withItems:
        - { arch: 'amd64' }
        - { arch: 'arm64' }
  - name: buildah
    inputs:
      parameters:
      - name: arch
      - name: repository
      - name: branch
      - name: image_name
      - name: image_tag
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
    nodeSelector:
      kubernetes.io/arch: "{{inputs.parameters.arch}}"
    volumes:
    - name: code
      emptyDir:
        medium: Memory
    - name: certs
      secret:
        secretName: registry-cert
    script:
      image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
      command: [bash]
      source: |
        set -xe
        cd /code/
        # needed to workaround protected_symlink - we can't just cd into /code/checkout
        cd $(readlink checkout)
        buildah bud -t {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}} .
        buildah push --cert-dir /certs {{inputs.parameters.image_name}}:{{inputs.parameters.image_tag}}-{{inputs.parameters.arch}}
        echo Image built and pushed to remote registry
      volumeMounts:
      - name: code
        mountPath: /code
      - name: certs
        mountPath: /certs
        readOnly: true
      resources:
        limits:
          github.com/fuse: 1
    initContainers:
    - name: git-sync
      image: k8s.gcr.io/git-sync/git-sync:v3.1.7
      args: [
        "--one-time",
        "--depth", "1",
        "--dest", "checkout",
        "--repo", "{{inputs.parameters.repository}}",
        "--branch", "{{inputs.parameters.branch}}"]
      volumeMounts:
      - name: code
        mountPath: /tmp/git
  - name: create-manifest
    inputs:
      parameters:
      - name: image_name
      - name: image_tag
      - name: architectures
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
    volumes:
    - name: certs
      secret:
        secretName: registry-cert
    script:
      image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
      command: [bash]
      source: |
        set -xe
        image_name="{{inputs.parameters.image_name}}"
        image_tag="{{inputs.parameters.image_tag}}"
        architectures="{{inputs.parameters.architectures}}"
        target="${image_name}:${image_tag}"
        architectures_list=($(echo $architectures | tr "," "\n"))
        buildah manifest create ${target}
        # add each architecture-specific image to the manifest
        for arch in "${architectures_list[@]}"
        do
          arch_image="${image_name}:${image_tag}-${arch}"
          buildah pull --cert-dir /certs ${arch_image}
          buildah manifest add ${target} ${arch_image}
        done
        buildah manifest push --cert-dir /certs ${target} docker://${target}
        echo Manifest creation done
      volumeMounts:
      - name: certs
        mountPath: /certs
        readOnly: true
      resources:
        limits:
          github.com/fuse: 1
That’s how life goes with Kubernetes, sometimes there’s just a lot of YAML…
Now we can submit the workflow to Argo:
$ argo submit build-pipeline-final.yml
Name: build-multi-arch-image-wndlr
Namespace: argo
ServiceAccount: default
Status: Pending
Created: Thu Oct 01 16:22:46 +0200 (now)
Parameters:
repository: {1 0 https://github.com/flavio/guestbook-go.git}
branch: {1 0 master}
image_name: {1 0 registry-testing.svc.lan/guestbook}
image_tag: {1 0 0.0.1}
architectures_string: {1 0 arm64,amd64}
The visual representation of the workflow is pretty nice:
As you might have noticed, I didn’t provide any parameter to argo submit
; the
Argo Workflow now has default values for all the input parameters.
Something worth noting: Argo Workflow leaves behind all the containers it creates. This is good to triage failures, but I don’t want to clutter my cluster with all these resources.
Argo provides cost optimization parameters to implement cleanup strategies. The one I’ve used above is the Workflow TTL Strategy.
You can see these lines at the top of the full Workflow definition:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: build-multi-arch-image-
spec:
  ttlStrategy:
    secondsAfterCompletion: 60
This triggers an automatic cleanup of all the PODs spawned by the Workflow 60 seconds after its completion, be it successful or not.
Today we have seen how to create a pipeline that builds container images for multiple architectures on top of an existing Kubernetes cluster.
Argo Workflow proved to be a good solution for this kind of automation. There’s quite some YAML involved, but I highly doubt other projects would have spared us from that.
What can we do next? Well, to me the answer is pretty clear. The definition of the container image is stored inside of a Git repository; hence I want to connect my Argo Workflow to the events happening inside of the Git repository.
Stay tuned for more updates! In the meantime feedback is always welcome.
The overall support of ARM inside of the container ecosystem improved a lot over the last years, with more container images made available for the armv7 and the arm64 architectures.
But what about my own container images? I’m running some homemade containerized applications on top of this cluster and I would like to have them scheduled both on the x86_64 nodes and on the ARM ones.
There are many ways to build ARM container images. You can go from something as simple, and tedious, as performing manual builds on real or emulated ARM machines, or you can do something more structured like using this GitHub Action, relying on something like the Open Build Service,…
My personal desire was to leverage my mixed Kubernetes cluster and perform the image building right on top of it.
Implementing this design has been a great learning experience, something IMHO worth sharing with others. The journey has been too long to fit into a single blog post; I’ll split my story into multiple posts.
Our journey begins with the challenge of building a container image from within a container.
The best-known way to build a container image is by using docker build. I didn’t want to use docker to build my images because the build process will take place right on top of Kubernetes, meaning the build will happen in a containerized way.
Some people are using docker as the container runtime of their Kubernetes clusters and are leveraging that to mount the docker socket inside of some of their containers. Once the docker socket is mounted, the containerized application has full access to the docker daemon that is running on the host. From there it’s game over: the container can perform actions such as building new images.
I’m a strong opponent of this approach because it’s highly insecure. Moreover I’m not using docker as container runtime and I guess many people will stop doing that in the near future once dockershim gets deprecated. Translated: the majority of future Kubernetes clusters will either have containerd, CRI-O or something similar instead of docker - hence bye bye to the docker socket hack.
There are however many other ways to build containers that are not based on
docker build
.
If you do a quick internet search about containerized image building you will definitely find kaniko. kaniko does exactly what I want: it performs containerized builds without using the docker daemon. There are also many examples covering image building on top of Kubernetes with kaniko. Unfortunately, at the time of writing, kaniko supports only the x86_64 architecture.
Our chances are not over yet because there’s another container building tool that can help us: buildah.
Buildah is part of the “libpod ecosystem”, which includes projects such as podman, skopeo and CRI-O. All these tools are available for multiple architectures: x86_64, aarch64 (aka ARM64), s390x and ppc64le.
Buildah can build container images starting from a Dockerfile
or in a
more interactive way. All of that without requiring any privileged daemon
running on your system.
During the last years the buildah developers spent quite some efforts to support the use case of “containerized buildah”. This is just the most recent blog post that discusses this scenario in depth.
Upstream even has a Dockerfile that can be used to create a buildah container image. This can be found here.
I took this Dockerfile, made some minor adjustments and uploaded it to this project on the Open Build Service. As a result I got a multi-architecture container image that can be pulled from registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest.
As some container veterans probably know, there are several types of storage drivers that can be used by container engines.
In case you’re not familiar with this topic, you can read Docker’s great documentation pages about storage drivers.
Note well: despite being written for the docker container engine, this applies also to podman, buildah, CRI-O and containerd.
The most portable and performant storage driver is the overlay
one.
This is the one we want to use when running buildah containerized.
The overlay
driver can be used in safe way even inside of a container by
leveraging fuse-overlay; this
is described by the buildah blog post I linked above.
However, using the overlay
storage driver inside of a container requires
Fuse to be enabled on the host and, most important of all, it requires the
/dev/fuse
device to be accessible by the container.
The share operation cannot be done by simply mounting /dev/fuse
as a volume
because there are some extra “low level” steps that must be done (like properly
instructing the cgroup device hierarchy).
These extra steps are automatically handled by docker and podman via the
--device
flag of the run
command:
$ podman run --rm -ti --device /dev/fuse buildahimage bash
This problem will need to be solved in a different way when buildah is run on top of Kubernetes.
Special host devices can be shared with containers running inside of a Kubernetes POD by using a recent feature called Kubernetes device plugins.
Quoting the upstream documentation:
Kubernetes provides a device plugin framework that you can use to advertise system hardware resources to the Kubelet.
Instead of customizing the code for Kubernetes itself, vendors can implement a device plugin that you deploy either manually or as a DaemonSet. The targeted devices include GPUs, high-performance NICs, FPGAs, InfiniBand adapters, and other similar computing resources that may require vendor specific initialization and setup.
This Kubernetes feature is commonly used to allow containerized machine learning workloads to access the GPU cards available on the host.
Luckily someone wrote a Kubernetes device plugin that exposes /dev/fuse
to
Kubernetes-managed containers:
fuse-device-plugin.
I’ve forked the project, made some minor fixes to its Dockerfile and created a GitHub action to build the container image for amd64, armv7 and arm64 (a PR is coming soon).
The images are available on the Docker Hub as flavio/fuse-device-plugin.
The fuse-device-plugin has to be deployed as a Kubernetes DaemonSet via this yaml file:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fuse-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fuse-device-plugin-ds
  template:
    metadata:
      labels:
        name: fuse-device-plugin-ds
    spec:
      hostNetwork: true
      containers:
      - image: flavio/fuse-device-plugin:latest
        name: fuse-device-plugin-ctr
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
        volumeMounts:
        - name: device-plugin
          mountPath: /var/lib/kubelet/device-plugins
      volumes:
      - name: device-plugin
        hostPath:
          path: /var/lib/kubelet/device-plugins
This is basically this file,
with the flavio/fuse-device-plugin
image being used instead of the
original one (which is built only for x86_64).
Once the DaemonSet PODs are running on all the nodes of the cluster, we can see
the Fuse device being exposed as an allocatable resource identified by the
github.com/fuse
key:
$ kubectl get nodes -o=jsonpath=$'{range .items[*]}{.metadata.name}: {.status.allocatable}\n{end}'
jam-2: map[cpu:4 ephemeral-storage:224277137028 github.com/fuse:5k memory:3883332Ki pods:110]
jam-1: map[cpu:4 ephemeral-storage:111984762997 github.com/fuse:5k memory:3883332Ki pods:110]
jolly: map[cpu:4 ephemeral-storage:170873316014 github.com/fuse:5k gpu.intel.com/i915:1 hugepages-1Gi:0 hugepages-2Mi:0 memory:16208280Ki pods:110]
The Fuse device can then be made available to a container by specifying a resource limit:
apiVersion: v1
kind: Pod
metadata:
  name: fuse-example
spec:
  containers:
  - name: main
    image: alpine
    command: ["ls", "-l", "/dev"]
    resources:
      limits:
        github.com/fuse: 1
If you look at the logs of this POD you will see something like that:
$ kubectl logs fuse-example
total 0
lrwxrwxrwx 1 root root 11 Sep 15 08:31 core -> /proc/kcore
lrwxrwxrwx 1 root root 13 Sep 15 08:31 fd -> /proc/self/fd
crw-rw-rw- 1 root root 1, 7 Sep 15 08:31 full
crw-rw-rw- 1 root root 10, 229 Sep 15 08:31 fuse
drwxrwxrwt 2 root root 40 Sep 15 08:31 mqueue
crw-rw-rw- 1 root root 1, 3 Sep 15 08:31 null
lrwxrwxrwx 1 root root 8 Sep 15 08:31 ptmx -> pts/ptmx
drwxr-xr-x 2 root root 0 Sep 15 08:31 pts
crw-rw-rw- 1 root root 1, 8 Sep 15 08:31 random
drwxrwxrwt 2 root root 40 Sep 15 08:31 shm
lrwxrwxrwx 1 root root 15 Sep 15 08:31 stderr -> /proc/self/fd/2
lrwxrwxrwx 1 root root 15 Sep 15 08:31 stdin -> /proc/self/fd/0
lrwxrwxrwx 1 root root 15 Sep 15 08:31 stdout -> /proc/self/fd/1
-rw-rw-rw- 1 root root 0 Sep 15 08:31 termination-log
crw-rw-rw- 1 root root 5, 0 Sep 15 08:31 tty
crw-rw-rw- 1 root root 1, 9 Sep 15 08:31 urandom
crw-rw-rw- 1 root root 1, 5 Sep 15 08:31 zero
Now that this problem is solved we can move to the next one. 😉
The source code of the “container image to be built” must be made available to the containerized buildah.
As many people do, I keep all my container definitions versioned inside of Git repositories. I had to find a way to clone the Git repository holding the definition of the “container image to be built” inside of the container running buildah.
I decided to settle for this POD layout:
- an init container that performs the git clone of the source code of the “container image to be built” before the main container is started
- a main container that runs buildah against the checked out sources
must be placed into a
directory that can be accessed later on by the main container.
I decided to use a Kubernetes volume of type emptyDir
to create a shared storage between the init and the main containers.
The emptyDir
volume is just perfect: it doesn’t need any fancy Kubernetes
Storage Class and it will automatically vanish once the build is done.
To checkout the Git repository I decided to settle on the official Kubernetes git-sync container.
Quoting its documentation:
git-sync is a simple command that pulls a git repository into a local directory. It is a perfect “sidecar” container in Kubernetes - it can periodically pull files down from a repository so that an application can consume them.
git-sync can pull one time, or on a regular interval. It can pull from the HEAD of a branch, from a git tag, or from a specific git hash. It will only re-pull if the target of the run has changed in the upstream repository. When it re-pulls, it updates the destination directory atomically. In order to do this, it uses a git worktree in a subdirectory of the –root and flips a symlink.
git-sync can pull over HTTP(S) (with authentication or not) or SSH.
This is just what I was looking for.
I will start git-sync with the following parameters:
- --one-time: this is needed to make git-sync exit once the checkout is done; otherwise it will keep running forever and it will periodically look for new commits inside of the repository. I don’t need that, plus this would cause the main container to wait indefinitely for the init container to exit.
- --depth 1: this is done to limit the checkout to the latest commit. I’m not interested in the history of the repository. This will make the checkout faster and use less bandwidth and disk space.
- --repo &lt;my-repo&gt;: the repo I want to checkout.
- --branch &lt;my-branch&gt;: the branch to checkout.
While waiting for the issue to be fixed I just rebuilt the container image on the Open Build Service. This is no longer needed, everybody can just use the official image.
It’s now time to perform a simple test run. We will define a simple Kubernetes POD that will:
- check out the Git repository holding the image definition using a git-sync init container
- build the container image using buildah

This is the POD definition:
apiVersion: v1
kind: Pod
metadata:
  name: builder-amd64
spec:
  nodeSelector:
    kubernetes.io/arch: "amd64"
  initContainers:
  - name: git-sync
    image: k8s.gcr.io/git-sync/git-sync:v3.1.7
    args: [
      "--one-time",
      "--depth", "1",
      "--dest", "checkout",
      "--repo", "https://github.com/flavio/guestbook-go.git",
      "--branch", "master"]
    volumeMounts:
    - name: code
      mountPath: /tmp/git
  volumes:
  - name: code
    emptyDir:
      medium: Memory
  containers:
  - name: main
    image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
    command: ["/bin/sh"]
    args: ["-c", "cd code; cd $(readlink checkout); buildah bud -t guestbook ."]
    volumeMounts:
    - name: code
      mountPath: /code
    resources:
      limits:
        github.com/fuse: 1
Let’s break it down into pieces.
The POD uses a Kubernetes node selector to ensure the build happens on a node with the x86_64 architecture. By doing that we will know the architecture of the final image.
As said earlier, the Git repository is checked out using an init container:
initContainers:
- name: git-sync
  image: k8s.gcr.io/git-sync/git-sync:v3.1.7
  args: [
    "--one-time",
    "--depth", "1",
    "--dest", "checkout",
    "--repo", "https://github.com/flavio/guestbook-go.git",
    "--branch", "master"]
  volumeMounts:
  - name: code
    mountPath: /tmp/git
The Git repository and the branch are currently hard-coded into the POD definition, this is going to be fixed later on. Right now that’s good enough to see if things are working (spoiler alert: they won’t 😅).
The git-sync container will run before the main
container and it will write
the source code of the “container image to be built” inside of a Kubernetes
volume named code
.
This is how the volume will look after git-sync has run:
$ ls -lh <root of the volume>
drwxr-xr-x 9 65533 65533 300 Sep 15 09:41 .git
lrwxrwxrwx 1 65533 65533 44 Sep 15 09:41 checkout -> rev-155a69b7f81d5b010c5468a2edfbe9228b758d64
drwxr-xr-x 6 65533 65533 280 Sep 15 09:41 rev-155a69b7f81d5b010c5468a2edfbe9228b758d64
The source code is stored under the rev-<git commit ID>
directory. There’s
a symlink named checkout
that points to it. As you will see later, this will
lead to a small twist.
The source code of our application is stored inside of a Kubernetes volume of
type emptyDir
:
volumes:
- name: code
  emptyDir:
    medium: Memory
I’ve also instructed Kubernetes to store the volume in memory. Behind the scene Kubelet will use tmpfs to do that.
The POD will have just one container running inside of it. This is called main
and its only purpose is to run buildah.
This is the definition of the container:
containers:
- name: main
  image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
  command: ["/bin/sh"]
  args: ["-c", "cd /code; cd $(readlink checkout); buildah bud -t guestbook ."]
  volumeMounts:
  - name: code
    mountPath: /code
  resources:
    limits:
      github.com/fuse: 1
As expected the container is mounting the code
Kubernetes volume too. Moreover,
the container is requesting one resource of type github.com/fuse
; as explained
above this is needed to make /dev/fuse
available inside of the container.
The container executes a simple bash script. The one-liner can be expanded to this:
cd /code
cd $(readlink checkout)
buildah bud -t guestbook .
There’s one interesting detail in there. As you can see I’m not “cd-ing” straight
into /code/checkout
, instead I’m moving into /code
and then resolving
the actual target of the checkout
symlink.
We can’t move straight into /code/checkout
because that would give us an
error:
builder:/ # cd /code/checkout
bash: cd: /code/checkout: Permission denied
This happens because /proc/sys/fs/protected_symlinks
is turned on by default.
As you can read here, this is a way to protect from specific types of exploits. Not even root inside of the container can jump straight into /code/checkout; this is why I'm doing this workaround.
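For reference, this is how you can check the setting on the host. Disabling it would also make the error go away, but I prefer to keep the protection enabled and use the readlink workaround (the sysctl write below is shown only for illustration):
$ cat /proc/sys/fs/protected_symlinks
1
$ sudo sysctl -w fs.protected_symlinks=0   # not recommended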
One last note, as you have probably noticed, buildah is just building the container image, it’s not pushing it to any registry. We don’t care about that right now.
Our journey is not over yet, there’s one last challenge ahead of us.
Before digging into the issue, let me provide some background. My local cluster was initially made of one x86_64 node running openSUSE Leap 15.2 and two ARM64 nodes running the beta ARM64 build of Raspberry Pi OS (formerly known as Raspbian).
I used the POD definition shown above to define two PODs:
- builder-amd64: the nodeSelector constraint targets the amd64 architecture
- builder-arm64: the nodeSelector constraint targets the arm64 architecture

That led to an interesting finding: the builds on the ARM64 nodes worked fine, while all the builds on the x86_64 node failed.
The failure was always the same and happened straight at the beginning of the process:
$ kubectl logs -f builder-amd64
mount /var/lib/containers/storage/overlay:/var/lib/containers/storage/overlay, flags: 0x1000: permission denied
level=error msg="exit status 125"
To me, that immediately smelled like a security feature blocking buildah.
I needed something faster than kubectl to iterate on this problem.
Luckily I was able to reproduce the same error while running buildah locally
using podman:
$ sudo podman run \
--rm \
--device /dev/fuse \
-v <path-to-container-image-sources>:/code \
registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest \
/bin/sh -c "cd /code; buildah bud -t foo ."
I was pretty sure the failure happened due to some tight security check. To prove my theory I ran the same container in privileged mode:
$ sudo podman run \
--rm \
--device /dev/fuse \
--privileged \
-v <path-to-container-image-sources>:/code \
registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest \
/bin/sh -c "cd /code; buildah bud -t foo ."
The build completed successfully. Running a container in privileged mode is bad and it pains me; it’s not a long-term solution, but at least it proved the build failure was definitely caused by some security constraint.
The next step was to identify the security measure at the origin of the failure. That could be something related either to seccomp or to AppArmor. I immediately ruled out SELinux as the root cause because it’s not used on openSUSE by default.
I then ran the container again, but this time I instructed podman to not apply any kind of seccomp profile; I basically disabled seccomp for my containerized workload.
This can be done by using the unconfined
mode for seccomp:
$ sudo podman run \
--rm \
--device /dev/fuse \
-v <path-to-container-image-sources>:/code \
--security-opt=seccomp=unconfined \
registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest \
/bin/sh -c "cd /code; buildah bud -t foo ."
The build failed again with the same error. That meant seccomp was not causing the failure. AppArmor was left as the main suspect.
Next, I ran the container again, but this time I instructed podman not to apply any kind of AppArmor profile; again, I basically disabled AppArmor for my containerized workload.
This can be done by using the unconfined
mode for AppArmor:
$ sudo podman run \
--rm \
--device /dev/fuse \
-v <path-to-container-image-sources>:/code \
--security-opt=apparmor=unconfined \
registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest \
/bin/sh -c "cd /code; buildah bud -t foo ."
This time the build completed successfully. Hence the issue was caused by the default AppArmor profile.
All the container engines (docker, podman, CRI-O, containerd) have an AppArmor profile that is applied to all the containerized workloads by default.
The containerized Buildah is probably doing something that is not allowed by this generic profile. I just had to identify the offending operation and create a new tailor-made AppArmor profile for buildah.
As a first step I had to obtain the default AppArmor profile. This is not as easy as it might seem. The profile is generated at runtime by all the container engines and is loaded into the kernel. Unfortunately there’s no way to dump the information stored into the kernel and have a human-readable AppArmor profile.
After some digging into the source code of podman and some reading on docker’s GitHub issues, I produced a quick PR that allowed me to print the default AppArmor profile to stdout.
This is the default AppArmor profile used by podman:
#include <tunables/global>
profile default flags=(attach_disconnected,mediate_deleted) {
#include <abstractions/base>
network,
capability,
file,
umount,
# Allow signals from privileged profiles and from within the same profile
signal (receive) peer=unconfined,
signal (send,receive) peer=default,
deny @{PROC}/* w, # deny write for all files directly in /proc (not in a subdir)
# deny write to files not in /proc/<number>/** or /proc/sys/**
deny @{PROC}/{[^1-9],[^1-9][^0-9],[^1-9s][^0-9y][^0-9s],[^1-9][^0-9][^0-9][^0-9]*}/** w,
deny @{PROC}/sys/[^k]** w, # deny /proc/sys except /proc/sys/k* (effectively /proc/sys/kernel)
deny @{PROC}/sys/kernel/{?,??,[^s][^h][^m]**} w, # deny everything except shm* in /proc/sys/kernel/
deny @{PROC}/sysrq-trigger rwklx,
deny @{PROC}/kcore rwklx,
deny mount,
deny /sys/[^f]*/** wklx,
deny /sys/f[^s]*/** wklx,
deny /sys/fs/[^c]*/** wklx,
deny /sys/fs/c[^g]*/** wklx,
deny /sys/fs/cg[^r]*/** wklx,
deny /sys/firmware/** rwklx,
deny /sys/kernel/security/** rwklx,
# suppress ptrace denials when using using 'ps' inside a container
ptrace (trace,read) peer=default,
}
A small aside: this AppArmor profile is the same one generated by all the other container engines. Some poor folks keep this file in sync manually, but there’s a discussion upstream to better organize things.
Back to the build failure caused by AppArmor… I saved the default profile into
a text file named containerized_buildah
and I changed this line
profile default flags=(attach_disconnected,mediate_deleted) {
to look like that:
profile containerized_buildah flags=(attach_disconnected,mediate_deleted,complain) {
This changes the name of the profile and, most importantly, switches the policy mode from enforcement to complain.
Quoting the AppArmor man page:
- enforcement - Profiles loaded in enforcement mode will result in enforcement of the policy defined in the profile as well as reporting policy violation attempts to syslogd.
- complain - Profiles loaded in “complain” mode will not enforce policy. Instead, it will report policy violation attempts. This mode is convenient for developing profiles.
I then loaded the policy by doing:
$ sudo apparmor_parser -r containerized_buildah
Invoking the aa-status command reports a list of all the profiles loaded, their policy mode and all the processes confined by AppArmor.
$ sudo aa-status
...
2 profiles are in complain mode.
containerized_buildah
...
One last operation had to be done before I could start debugging the containerized buildah: turn off “audit quieting”. Again, straight from AppArmor’s man page:
Turn off deny audit quieting
By default, operations that trigger “deny” rules are not logged. This is called deny audit quieting.
To turn off deny audit quieting, run:
echo -n noquiet >/sys/module/apparmor/parameters/audit
Before starting the container, I opened a new terminal to execute this process:
# tail -f /var/log/audit/audit.log | tee apparmor-build.log
On systems where auditd is running (like mine), all the AppArmor logs are sent to /var/log/audit/audit.log. This command allowed me to keep an eye on the live stream of audit logs and save them into a smaller file named apparmor-build.log.
Finally, I started the container using the custom AppArmor profile shown above:
$ sudo podman run \
--rm \
--device /dev/fuse \
-v <path-to-container-image-sources>:/code \
--security-opt=apparmor=containerized_buildah \
registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest \
/bin/sh -c "cd /code; buildah bud -t foo ."
The build completed successfully. Grepping for ALLOWED
inside of the audit file
returned a stream of entries like the following ones:
type=AVC msg=audit(1600172410.567:622): apparmor="ALLOWED" operation="mount" info="failed mntpnt match" error=-13 profile="containerized_buildah" name="/tmp/containers.o5iLtx" pid=25607 comm="exe" srcname="/usr/bin/buildah" flags="rw, bind"
type=AVC msg=audit(1600172410.567:623): apparmor="ALLOWED" operation="mount" info="failed mntpnt match" error=-13 profile="containerized_buildah" name="/tmp/containers.o5iLtx" pid=25607 comm="exe" flags="ro, remount, bind"
type=AVC msg=audit(1600172423.511:624): apparmor="ALLOWED" operation="mount" info="failed mntpnt match" error=-13 profile="containerized_buildah" name="/" pid=25629 comm="exe" flags="rw, rprivate"
...
As you can see all these entries are about mount
operations, with mount
being invoked with quite an assortment of flags.
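A quick way to get an overview of which operations show up in the log is to group the audit entries by operation; plain shell is enough for that:
$ grep ALLOWED apparmor-build.log | grep -o 'operation="[^"]*"' | sort | uniq -c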
The default AppArmor profile explicitly denies mount
operations:
...
deny mount,
...
All I had to do was to change the containerized_buildah
AppArmor profile to
that:
#include <tunables/global>
profile containerized_buildah flags=(attach_disconnected,mediate_deleted) {
#include <abstractions/base>
network,
capability,
file,
umount,
mount,
# Allow signals from privileged profiles and from within the same profile
signal (receive) peer=unconfined,
signal (send,receive) peer=default,
deny @{PROC}/* w, # deny write for all files directly in /proc (not in a subdir)
# deny write to files not in /proc/<number>/** or /proc/sys/**
deny @{PROC}/{[^1-9],[^1-9][^0-9],[^1-9s][^0-9y][^0-9s],[^1-9][^0-9][^0-9][^0-9]*}/** w,
deny @{PROC}/sys/[^k]** w, # deny /proc/sys except /proc/sys/k* (effectively /proc/sys/kernel)
deny @{PROC}/sys/kernel/{?,??,[^s][^h][^m]**} w, # deny everything except shm* in /proc/sys/kernel/
deny @{PROC}/sysrq-trigger rwklx,
deny @{PROC}/kcore rwklx,
deny /sys/[^f]*/** wklx,
deny /sys/f[^s]*/** wklx,
deny /sys/fs/[^c]*/** wklx,
deny /sys/fs/c[^g]*/** wklx,
deny /sys/fs/cg[^r]*/** wklx,
deny /sys/firmware/** rwklx,
deny /sys/kernel/security/** rwklx,
# suppress ptrace denials when using using 'ps' inside a container
ptrace (trace,read) peer=default,
}
The profile is now back to enforcement mode and, most important of all, it
allows any kind of mount
invocation.
I tried to be more granular and allow only the mount
flags
actually used by buildah, but the list was too long, there were too many
combinations and that seemed pretty fragile. The last thing I want to happen is
to have AppArmor break buildah in the future if a slightly different mount
operation is done.
Reloading the AppArmor profile via sudo apparmor_parser -r containerized_buildah
and restarting the build proved that the profile was doing its job also in
enforcement mode: the build successfully completed. 🎉🎉🎉
But is the journey over yet? Not quite…
Once I figured out the root cause of the x86_64 build failures there was one last mystery to be solved: why did the ARM64 builds work just fine? Why didn’t AppArmor cause any issue over there?
The answer was quite simple (and a bit shocking to me): it turned out the Raspberry Pi OS (formerly known as raspbian) ships a kernel that doesn’t have AppArmor enabled. I never realized that!
I didn’t find the idea of running containers without any form of Mandatory Access Control particularly thrilling. Hence I decided to change the operating system run on my Raspberry Pi nodes.
I initially picked Raspberry Pi OS because I wanted to have my Raspberry Pi 4 boot straight from an external USB disk instead of the internal memory card. At the time of writing, this feature requires a bleeding edge firmware and all the documentation points at Raspberry Pi OS. I just wanted to stick with what the community was using to reduce my chances of failure…
However, if you need AppArmor support, you’re left with two options: openSUSE and Ubuntu.
I installed openSUSE Leap 15.2 for aarch64 (aka ARM64) on one of my Raspberry Pi 4. The process of getting it to boot from USB was pretty straightforward. I added the node back into the Kubernetes cluster, forced some workloads to move on top of it and monitored its behaviour. Everything was great, I was ready to put openSUSE on my 2nd Raspberry Pi 4 when I noticed something strange: my room was quieter than the usual…
My Raspberry Pis are powered using the official PoE HAT. I love this hat, but I hate its built-in fan because it’s notoriously loud (yes, you can tune its thresholds, but it’s still damn noisy when it kicks in).
Well, my room was suddenly quieter because the fan of the PoE HAT was not spinning at all. That led the CPU temperature to reach more than 85 °C 😱
It turns out the PoE HAT needs a driver which is not part of the mainline kernel, and unfortunately nobody has added it to the openSUSE kernel yet. That means openSUSE doesn’t see the PoE HAT fan and never turns it on (not even at full speed).
I filed an enhancement bug report against openSUSE Tumbleweed to get the PoE HAT driver added to our kernel and moved over to Ubuntu. Unfortunately that was a blocking issue for me. What a pity 😢
On the other hand, the kernel of Ubuntu Server supports both the PoE HAT fan and AppArmor. After some testing I switched all my Raspberry Pi nodes to run Ubuntu 20.04 Server.
As a sanity check, I ran the builder-arm64 POD against the Ubuntu nodes using the default AppArmor profile. The build failed on ARM64 in the same way as it did on x86_64. What a relief 😅.
At this point I have a tailor-made AppArmor profile for buildah, plus all the nodes of my cluster have AppArmor support. It’s time to put all the pieces together!
The previous POD definition has to be extended to ensure the main container running buildah is using the tailor-made AppArmor profile instead of the default one.
Kubernetes’ AppArmor support is a bit primitive, but effective. The only requirement, when using custom profiles, is to ensure the profile is already known by the AppArmor system on each node of the cluster.
This can be done in an easy way: just copy the profile under /etc/apparmor.d and perform a systemctl reload apparmor. This has to be done only once; at the next boot the AppArmor service will automatically load all the profiles found inside of /etc/apparmor.d.
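Concretely, on each node that boils down to something like the following (using the profile file created earlier):
$ sudo cp containerized_buildah /etc/apparmor.d/
$ sudo systemctl reload apparmor
$ sudo aa-status | grep containerized_buildah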
This is how the final POD definition looks:
apiVersion: v1
kind: Pod
metadata:
name: builder-amd64
annotations:
container.apparmor.security.beta.kubernetes.io/main: localhost/containerized_buildah
spec:
nodeSelector:
kubernetes.io/arch: "amd64"
containers:
- name: main
image: registry.opensuse.org/home/flavio_castelli/containers/containers/buildahimage:latest
command: ["/bin/sh"]
args: ["-c", "cd code; cd $(readlink checkout); buildah bud -t guestbook ."]
volumeMounts:
- name: code
mountPath: /code
resources:
limits:
github.com/fuse: 1
initContainers:
- name: git-sync
image: k8s.gcr.io/git-sync/git-sync:v3.1.7
args: [
"--one-time",
"--depth", "1",
"--dest", "checkout",
"--repo", "https://github.com/flavio/guestbook-go.git",
"--branch", "master"]
volumeMounts:
- name: code
mountPath: /tmp/git
volumes:
- name: code
emptyDir:
medium: Memory
This time the build will work fine also inside of Kubernetes, regardless of the node architecture! 🥳
First of all, congratulations for having made it to this point. It has been quite a long journey; I hope you enjoyed it.
The next step consists of taking this foundation (a Kubernetes POD that can run buildah to build new container images) and find a way to orchestrate that.
What I’ll show you in the next blog post is how to create a workflow that, given a GitHub repository with a Dockerfile, builds two container images (amd64 and arm64), pushes both of them to a container registry and then creates a multi-architecture manifest referencing them.
As always feedback is welcome, see you soon!
]]>For example a Node.js application relying on left-pad could force only certain
versions of this library to be used by specifying a constraint like
>= 1.1.0 < 1.2.0
. This would force npm to install the latest version of the
library that satisfies the constraint.
How does that translate to containers?
Imagine the following scenario: a developer deploys a containerized application that requires a Redis database. The developer deploys the latest version of the redis container (eg: redis:4.0.5), ensures his application works fine and then moves on to other things.
After some weeks a security issue/bug is found inside of Redis and a new patched release takes place. Suddenly the deployed container is outdated. How can the developer be aware that a new v4 release of Redis is available? Wouldn’t it be even better to have some automated tool taking care of this upgrade?
After some more weeks a new minor release of Redis is published (eg: 4.1.0). Is it safe to automatically update to a new minor release of Redis? Is the developer’s application going to keep working as expected?
Some container images have special tags like v4 or v4.1, and the developer could leverage them to loosely pin the redis container to a more limited set of versions. However, using these tags reduces reproducibility and debuggability.
Let’s imagine the redis image being deployed is redis:v4.1
and everything is
working as expected. Assume after some time the developer (or some automated tool)
pulls a new version of the redis:v4.1
image and suddenly the application
has some issues. How can the developer understand what really changed?
Wouldn’t it be great to be able to say something like “everything worked fine
with redis:4.1.0
but it broke when I upgraded to redis:4.1.9
”?
There are some tools that can be used to find and automatically update old container images: Watchtower and ouroboros. However none of them allows the flexibility I was looking for (in terms of checks), plus they are both tailored to work only against docker.
Because of that, during the 2020 edition of SUSE Hackweek, I spent some time working on a different solution to this use case.
fresh-container is a tool that can be used to see if a container can be updated to a more recent release.
fresh-container is different compared to Watchtower and ouroboros because it relies on semantic versioning to process container image tags.
Semantic versioning is used to express the version constraints a container version must satisfy. This gives more flexibility, for example take a look at the following scenarios:
- >= 4.0.0 < 5.0.0: take any 4.x release
- >= 4.1.0 < 4.2.0: stay on the 4.1 series, accepting only patch-level updates
- < 6.0.0: take anything older than 6.0.0
fresh-container
can be run as a standalone program:
$ fresh-container check --constraint ">= 1.9.0 < 1.10.0" nginx:1.9.0
The 'docker.io/library/nginx' container image can be upgraded from the '1.9.0' tag to the '1.9.15' one and still satisfy the '>= 1.9.0 < 1.10.0' constraint.
Behind the scenes fresh-container will query the container registry hosting the image to gather the list of all the available tags. The tags that do not respect semantic versioning will be ignored and finally the tool will evaluate the constraint provided by the user.
It can also generate computer parsable output by producing a JSON response:
$ fresh-container check -o json --constraint ">= 1.9.0 < 1.10.0" nginx:1.9.0
{
"image": "docker.io/library/nginx",
"constraint": ">= 1.9.0 < 1.10.0",
"current_version": "1.9.0",
"next_version": "1.9.15",
"stale": true
}
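Since the output is plain JSON it’s easy to consume from scripts; for example, extracting the next available tag with jq (the jq invocation is just an illustration):
$ fresh-container check -o json --constraint ">= 1.9.0 < 1.10.0" nginx:1.9.0 | jq -r .next_version
1.9.15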
Querying the remote container registries to fetch all the available tags of a container image is an expensive operation. That gets even worse when multiple containers have to be inspected on a regular basis.
The fresh-container binary can operate in a server mode to alleviate this issue:
$ fresh-container server
This will start a web server offering a simple REST API that can be used to perform queries. The remote tags of the container images are cached inside of an in-memory database to speed up constraint resolution.
It’s possible to run fresh-container check
against a fresh-container
server
to perform faster queries by using the --server <http://fresh-container-server>
flag.
fresh-container is a tool built to serve one specific use case: you provide some data as input and, as output, it will tell you if the container image can be updated to a more recent version.
Its main goal is to be leveraged by other tools to build something bigger like fresh-container-operator.
This is a Kubernetes operator that, once deployed inside of a Kubernetes cluster, looks at all the deployments running inside of it and finds the ones with stale containers.
The operator can also automatically update these outdated deployments to use the latest version of the container images that satisfy their requirements.
How does it work? First of all you have to enrich your deployment definition by adding some ad-hoc annotations.
For each container image used by the deployment you have to specify the semantic versioning constraint that has to be used to evaluate their “freshness”.
Take a look at the following example:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
annotations:
fresh-container.autopilot: "false"
spec:
selector:
matchLabels:
app: nginx
replicas: 1
template:
metadata:
labels:
app: nginx
annotations:
fresh-container.constraint/nginx: ">= 1.9.0 < 1.10.0"
spec:
containers:
- name: nginx
image: nginx:1.9.0
ports:
- containerPort: 80
In this case the operator will look at the version of the nginx container
in use and evaluate it against the >= 1.9.0 < 1.10.0
constraint.
Note well: deployments that do not have any fresh-container.constraint/<container name> annotation will be ignored by the operator.
The operator adds the special label fresh-container.hasOutdatedContainers=true
to all the deployments that have one or more stale containers inside of them.
This allows quick searches against all the deployments:
$ kubectl get deployments --all-namespaces -l fresh-container.hasOutdatedContainers=true
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
default nginx-deployment 1/1 1 1 19m
The details about the stale containers are added by the operator as annotations of the deployment:
kubectl describe deployments.apps nginx-deployment
Name: nginx-deployment
Namespace: default
CreationTimestamp: Thu, 27 Feb 2020 10:32:55 +0100
Labels: fresh-container.hasOutdatedContainers=true
Annotations: deployment.kubernetes.io/revision: 1
fresh-container.autopilot: false
fresh-container.lastChecked: 2020-02-27T09:45:07Z
fresh-container.nextTag/nginx: 1.9.15
For each stale container the operator adds an annotation with
fresh-container.nextTag/<container name>
as key and the tag of the most recent
container that satisfies the constraint as value.
In the example above you can see that the nginx container inside of the
deployment can be updated to the 1.9.15
tag while still satisfying the
>= 1.9.0 < 1.10.0
constraint.
The next step is to allow fresh-container-operator to update all the deployments that have stale containers.
This is not done by default, but it can be enabled on a per-deployment basis by adding the fresh-container.autopilot=true annotation inside of the deployment metadata.
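For a quick experiment the annotation can also be set with kubectl instead of editing the manifest (hypothetical example, using the deployment shown above):
$ kubectl annotate deployment nginx-deployment fresh-container.autopilot=true --overwrite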
As I stated in the beginning I created these projects during the 2020 edition of SUSE Hackweek. They are early prototypes that need more love.
I would be happy to hear what you think about them. Feel free to leave a comment below or open an issue on their GitHub projects:
]]>You might wonder why this is needed; after all, it’s already possible to run a docker distribution (aka registry) instance as a pull-through cache. While that’s true, this solution doesn’t address the needs of more “sophisticated” users.
Based on the feedback we got from a lot of SUSE customers it’s clear that a simple registry configured to act as a pull-through cache isn’t enough.
Let’s go step by step to understand the requirements we have.
First of all it should be possible to have a mirror of certain container images locally. This is useful to save time and bandwidth. For example there’s no reason to download the same image over and over on each node of a Kubernetes cluster.
A docker registry configured to act as a pull-through cache can help with that. There’s still a need to warm the cache; this can be left to the organic pull of images done by the cluster, or it could be done artificially by some script run by an operator.
Unfortunately a pull-through cache is not going to solve this problem for nodes running inside of an air-gapped environment. Nodes operated in such an environment are located in a completely segregated network, which would make it impossible for the pull-through registry to reach the external registry.
Cluster operators want to have control of the images available inside of the local mirror.
For example, assuming we are mirroring the Docker Hub, an operator might be
fine with having the library/mariadb
image but not the library/redis
one.
When operating a registry configured as a pull-through cache, all the images of
the upstream registry are within reach of all the users of the cluster. It’s
just a matter of doing a simple docker pull
to get the image cached into
the local pull-through cache and sneak it into all the nodes.
Moreover some operators want to grant the privilege of adding images to the local mirror only to trusted users.
The Docker Hub is certainly the best-known container registry. However there are also other registries being used: SUSE operates its own registry, there’s Quay.io, Google Container Registry (aka gcr) and there are even user-operated ones.
A docker registry configured to act as a pull-through cache can mirror only one registry. Which means that, if you are interested in mirroring both the Docker Hub and Quay.io, you will have to run two instances of docker registry pull-through caches: one for the Docker Hub, the other for Quay.io.
This is just overhead for the final operator.
During the last week I worked to build a PoC to demonstrate we can create a docker registry mirror solution that can satisfy all the requirements above.
I wanted to have a single box running the entire solution and I wanted all the different pieces of it to be containerized. I hence resorted to using a node powered by openSUSE Kubic.
I didn’t need all the different pieces of Kubernetes, I just needed the kubelet so that I could run it in disconnected mode. Disconnected means the kubelet process is not connected to a Kubernetes API server; instead it reads POD manifest files straight from a local directory.
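A minimal sketch of what that looks like; the flag name is the standard kubelet option for static PODs, and the exact invocation used by Kubic may differ slightly:
$ sudo kubelet --pod-manifest-path=/etc/kubernetes/manifests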
I created an openSUSE Kubic node and then I started by deploying a standard docker registry. This instance is not configured to act as a pull-through cache. However it is configured to use an external authorization service. This is needed to allow the operator to have full control of who can push/pull/delete images.
I configured the registry POD to store the registry data to a directory on the machine by using a Kubernetes hostPath volume.
On the very same node I deployed the authorization service needed by the docker registry. I chose Portus, an open source solution created at SUSE a long time ago.
Portus needs a database, hence I deployed a containerized instance of MariaDB
on the same node. Again I used a Kubernetes hostPath
to ensure the persistence
of the database contents. I placed both Portus and its MariaDB instance into the
same POD. I configured MariaDB to listen only to localhost, making it reachable
only by the Portus instance (that’s because they are in the same
Kubernetes POD).
I configured both the registry and Portus to bind to a local unix socket, then I deployed a container running HAProxy to expose both of them to the world.
The HAProxy is the only container that uses the host network. Meaning it’s actually listening on port 80 and port 443 of the openSUSE Kubic node.
I went ahead and created two new DNS entries inside of my local network:
- registry.kube.lan: this is the FQDN of the registry
- portus.kube.lan: this is the FQDN of Portus

I configured both names to resolve to the IP address of my container host.
I then used cfssl to generate a CA and
then a pair of certificates and keys for registry.kube.lan
and portus.kube.lan
.
Finally I configured HAProxy to expose both services to the outside, routing the incoming connections either to the registry or to Portus depending on the requested server name.
By having dedicated FQDNs for the registry and Portus and by using HAProxy’s SNI-based load balancing, we can leave the registry listening on a standard port (443) instead of using a different one (eg: 5000). In my opinion that’s a big win: based on my personal experience, having the registry listen on a non-standard port makes things more confusing both for the operators and the end users.
Once I was done with these steps I was able to log into https://portus.kube.lan and go through the usual Portus setup wizard.
We now have to mirror images from multiple registries into the local one, but how can we do that?
Some time ago I stumbled upon this tool, which can be used to copy images from multiple registries into a single one. While doing that it can change the namespace of the images, putting all the images coming from a certain registry into a specific namespace.
I wanted to use this tool, but I realized it relies on the docker open-source engine to perform the pull and push operations. That’s a blocking issue for me because I wanted to run the mirroring tool inside of a container without doing nasty tricks like mounting the docker socket of the host into the container.
Basically I wanted the mirroring tool to not rely on the docker open source engine.
At SUSE we are already using and contributing to skopeo, an amazing tool that allows interactions with container images and container registries without requiring any docker daemon.
The solution was clear: extend skopeo to provide mirroring capabilities.
I drafted a design proposal with my colleague Marco Vedovati, started coding and then ended up with this pull request.
While working on that I also uncovered a small glitch
inside of the containers/image
library used by skopeo.
Using a patched skopeo binary (which includes both the patches above) I then mirrored a bunch of images into my local registry:
$ skopeo sync --source docker://docker.io/busybox:musl --dest-creds="flavio:password" docker://registry.kube.lan
$ skopeo sync --source docker://quay.io/coreos/etcd --dest-creds="flavio:password" docker://registry.kube.lan
The first command mirrored only the busybox:musl
container image from the
Docker Hub to my local registry, while the second command mirrored all the
coreos/etcd
images from the quay.io registry to my local registry.
Since the local registry is protected by Portus I had to specify my credentials while performing the sync operation.
Running multiple sync commands is not really practical; that’s why we added a source-file flag. It allows an operator to write a configuration file indicating the images to mirror. More on that in a dedicated blog post.
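Just to give an idea of what it looks like, the configuration file is a simple YAML document listing, per source registry, the images (and optionally the tags) to mirror. The exact format is defined in the pull request and might have changed since, so treat this as a sketch:
$ cat images-to-mirror.yaml
docker.io:
  images:
    busybox:
      - musl
quay.io:
  images:
    coreos/etcd: []
$ skopeo sync --source-file images-to-mirror.yaml --dest-creds="flavio:password" docker://registry.kube.lan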
At this point my local registry had the following images:
- docker.io/busybox:musl
- quay.io/coreos/etcd:v3.1
- quay.io/coreos/etcd:v3.3
- quay.io/coreos/etcd:latest
- …and all the other quay.io/coreos/etcd images

As you can see the namespace of the mirrored images is changed to include the FQDN of the registry from which they have been downloaded. This avoids clashes between the images and makes it easier to track their origin.
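At this point the mirrored images can already be consumed directly, by pulling them through their new fully qualified name (after logging into the registry, since Portus protects it):
$ docker pull registry.kube.lan/quay.io/coreos/etcd:v3.1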
As I mentioned above I wanted to provide a solution that could be used also to run mirrors inside of air-gapped environments.
The only tricky part for such a scenario is how to get the images from the upstream registries into the local one.
This can be done in two steps by using the skopeo sync
command.
We start by downloading the images on a machine that is connected to the internet. But instead of storing the images into a local registry we put them on a local directory:
$ skopeo sync --source docker://quay.io/coreos/etcd dir:/media/usb-disk/mirrored-images
This is going to copy all the versions of the quay.io/coreos/etcd
image into
a local directory /media/usb-disk/mirrored-images
.
Let’s assume /media/usb-disk
is the mount point of an external USB drive.
We can then unmount the USB drive, scan its contents with some tool, and
plug it into a computer of the air-gapped network. From this computer we
can populate the local registry mirror by using the following command:
$ skopeo sync --source dir:/media/usb-disk/mirrored-images --dest-creds="username:password" docker://registry.secure.lan
This will automatically import all the images that have been previously downloaded to the external USB drive.
Now that we have all our images mirrored it’s time to start consuming them.
It might be tempting to just update all our Dockerfile
(s), Kubernetes
manifests, Helm charts, automation scripts, …
to reference the images from registry.kube.lan/<upstream registry FQDN>/<image>:<tag>
.
This however would be tedious and impractical.
As you might know the docker open source engine has a --registry-mirror option.
Unfortunately the docker open source engine can only be configured to mirror the Docker Hub; other external registries are not handled.
This annoying limitation led me and Valentin Rothberg to create this pull request against the Moby project.
Valentin is also porting the patch to libpod, which will make the same feature available inside of CRI-O and podman as well.
During my experiments I figured some little bits were missing from the original PR.
I built a docker engine with the full patch
applied and I created this /etc/docker/daemon.json
configuration file:
{
"registries": [
{
"Prefix": "quay.io",
"Mirrors": [
{
"URL": "https://registry.kube.lan/quay.io"
}
]
},
{
"Prefix": "docker.io",
"Mirrors": [
{
"URL": "https://registry.kube.lan/docker.io"
}
]
}
]
}
Then, on this node, I was able to issue commands like:
$ docker pull quay.io/coreos/etcd:v3.1
That resulted in the image being downloaded from registry.kube.lan/quay.io/coreos/etcd:v3.1; no communication happened against quay.io. Success!
Everything is working fine on nodes that are running this not-yet merged patch, but what about vanilla versions of docker or other container engines?
I think I have a solution for them as well, I’m going to experiment a bit with that during the next week and then provide an update.
This is a really long blog post. I’ll create a new one with all the configuration files and instructions of the steps I performed. Stay tuned!
In the meantime I would like to thank Marco Vedovati, Valentin Rothberg for their help with skopeo and the docker mirroring patch, plus Miquel Sabaté Solà for his help with Portus.
]]>During the last week I worked together with Marcus Schäfer (the author of KIWI) to reduce their size.
We fixed some obvious mistakes (like not installing man pages and documentation), but we also removed some useless packages.
These are the results of our work:
Just to make some comparisons, the Ubuntu image is around 188M while the Fedora one is about 186M. We obviously cannot compete with images like busybox or Alpine, but the situation has definitely improved!
Needless to say, the new images are already on the DockerHub.
Have fun!
]]>The Docker registry works like a charm, but it’s hard to have full control over the images you push to it. Also there’s no web interface that can provide a quick overview of registry’s contents.
So Artem, Federica and I created the Portus project (BTW “portus” is the Latin name for harbour).
The first goal of Portus is to allow users to have better control over the contents of their private registries. It makes it possible to write policies like:
This is done implementing the token based authentication system supported by the latest version of the Docker registry.
Portus listens to the notifications sent by the Docker registry and uses them to populate its own database.
Using this data Portus can be used to navigate through all the namespaces and the repositories that have been pushed to the registry.
We also worked on a client library that can be used to fetch extra information from the registry (i.e. repositories’ manifests) to extend Portus’ knowledge.
Right now Portus has just the concept of users. When you sign up into Portus a private namespace with your username will be created. You are the only one with push and pull rights over it; nobody else will be able to mess with it. Also pushing and pulling to the “global” namespace is currently not allowed.
The user interface is still a work in progress. Right now you can browse all the namespaces and the repositories available on your registry. However user’s permissions are not taken into account while doing that.
If you want to play with Portus you can use the development environment managed by Vagrant. In the near future we are going to publish a Portus appliance and obviously a Docker image.
Please keep in mind that Portus is just the result of one week of work. A lot of things are missing but the foundations are solid.
Portus can be found on this repository on GitHub. Contributions (not only code, also proposals, bugs,…) are welcome!
]]>In the beginning I looked into the kubernetes project. I found it really promising but AFAIK not yet ready to be used. It’s still in its early days and it’s in constant evolution. I will surely keep looking into it.
I also looked into other projects like consul and geard but then I focused on using etcd and fleet, two of the tools part of CoreOS.
I ended up creating a small testing environment that is capable of running a simple guestbook web application talking with a MongoDB database. Both the web application and the database are shipped as Docker images running on a small cluster.
The whole environment is created by Vagrant. That project proved to be also a nice excuse to play with this tool. I found Vagrant to be really useful.
You can find all the files and instructions required to reproduce my experiments inside of this repository on GitHub.
Happy hacking!
]]>For those who never heard about it, KIWI is a tool which creates Linux systems for both physical and virtual machines. It can create openSUSE, SUSE and other types of Linux distributions.
Update: I changed the required version of kiwi and the openSUSE 13.1 template. Kiwi just received some improvements which no longer force the image to include the lxc package.
As you might know Docker already has its own build system which provides a really easy way to create new images. However these images must be based on existing ones, which leads to the problem of creating the 1st parent image. That’s where KIWI comes to the rescue.
Indeed Kiwi can be used to build the openSUSE/SUSE/whatever docker images that are going to act as the foundation blocks of other ones.
Docker support has been added to KIWI 5.06.87. You can find this package inside of the Virtualization:Appliances project on OBS.
Install the kiwi
and the kiwi-doc
packages on your system. Then go to the
/usr/share/doc/packages/kiwi/examples/
directory where you will find a simple
openSUSE 13.1 template.
Just copy the whole /usr/share/doc/packages/kiwi/examples/suse-13.1/suse-docker-container
directory to another location and make your changes.
The heart of the whole image is the config.xml
file:
<?xml version="1.0" encoding="utf-8"?>
<image schemaversion="6.1" name="suse-13.1-docker-guest">
<description type="system">
<author>Flavio Castelli</author>
<contact>fcastelli@suse.com</contact>
<specification>openSUSE 13.1 docker image</specification>
</description>
<preferences>
<type image="docker" container="os131">
<machine>
<vmdisk/>
<vmnic interface="eth0" mode="veth"/>
</machine>
</type>
<version>1.0.0</version>
<packagemanager>zypper</packagemanager>
<rpm-check-signatures>false</rpm-check-signatures>
<rpm-force>true</rpm-force>
<locale>en_US</locale>
<keytable>us.map.gz</keytable>
<hwclock>utc</hwclock>
<timezone>US/Eastern</timezone>
</preferences>
<users group="root">
<user password="$1$wYJUgpM5$RXMMeASDc035eX.NbYWFl0" home="/root" name="root"/>
</users>
<repository type="yast2">
<source path="opensuse://13.1/repo/oss/"/>
</repository>
<packages type="image">
<package name="coreutils"/>
<package name="iputils"/>
</packages>
<packages type="bootstrap">
<package name="filesystem"/>
<package name="glibc-locale"/>
<package name="module-init-tools"/>
</packages>
</image>
This is a really minimal image which contains just a bunch of packages.
The first step is the creation of the image’s root system:
kiwi -p /usr/share/doc/packages/kiwi/examples/suse-13.1/suse-docker-container \
--root /tmp/myimage
The next step compresses the file system of the image into a single tarball:
kiwi --create /tmp/myimage --type docker -d /tmp/myimage-result
The tarball can be found under /tmp/myimage-result
. This can be imported
into docker using the following command:
docker import - myImage < /path/to/myimage.tbz
The image named myImage
is now ready to be used.
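To verify everything worked you can start an interactive shell inside of the freshly imported image (illustrative command):
docker run -t -i myImage /bin/bash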
In the next days I’ll make another blog post explaining how to build docker images using KIWI and the Open Build Service. This is a powerful combination which allows to achieve continuous delivery.
Stay tuned and have fun!
]]>First of all the Docker package has been moved from my personal OBS project to the more official Virtualization one. The next step is to get the Docker package into Factory :)
I’m going to drop the docker package from home:flavio_castelli:docker, so make sure to subscribe to the Virtualization repository to get the latest versions of Docker.
I have also submitted some openSUSE related documentation to the official Docker project. If you visit the “Getting started” page you will notice the familiar geeko logo. Click it to be redirected to the openSUSE’s installation instructions.
]]>You can read the full announcement here, but let me talk about the biggest change introduced by this release: storage drivers!
Docker has always used AUFS, a “unionfs-like” file system, to power its containers. Unfortunately AUFS is neither part of the official kernel nor of the openSUSE/SLE one.
In the past I had to build a custom patched kernel to run Docker on openSUSE. That proved to be a real pain both for me and for the end users.
Now with storage drivers Docker can still use AUFS, but it can also opt for something different. In our case Docker is going to use thin provisioning, a consolidated technology which has been part of the mainline kernel for quite some time.
Moreover Docker’s community is working on experimental drivers for BTRFS, ZFS, Gluster and Ceph.
Running Docker is incredibly simple now: just use the 1 click install and download it from the ‘home:flavio_castelli:docker’ project.
As I said earlier: no custom kernel is required. You are going to keep the one shipped by the official openSUSE repositories.
Just keep in mind that Docker does some initialization tasks on its very first
execution (it configures thin provisioning). So just wait a little before hitting its
API with the Docker cli tool (you will just get an error because docker.socket
is not found).
Right now Docker works fine on openSUSE 12.3 and 13.1 but not on SLE 11 SP3. During the next days I’m going to look into this issue. I want to have a stable and working package for SLE.
Once the package is proved to be stable enough I’ll submit it for inclusion inside of the Virtualization project on OBS.
So please, checkout Docker package and provide me your feedback!
]]>The project has been tracked on this page of hackweek’s wiki, this is a detailed report of what I achieved.
Docker has been packaged inside of this OBS project.
So installing it requires just two commands:
sudo zypper ar http://download.opensuse.org/repositories/home:/flavio_castelli:/docker/openSUSE_12.3 docker
sudo zypper in docker
There’s also a 1 Click Install for the lazy ones :)
Zypper will install docker and its dependencies, which are:
- lxc: docker’s “magic” is built on top of LXC.
- bridge-utils: used to set up the bridge interface used by docker’s containers.
- dnsmasq: used to start the dhcp server used by the containers.
- iptables: used to get containers’ networking to work.
- bsdtar: used by docker to compress/extract the containers.

The aufs3 kernel module is not part of the official kernel and is not available in the official repositories. Hence adding docker will trigger the installation of a new kernel package on your machine.
Note well: docker works only on 64bit hosts. That’s why there are no 32bit packages.
If you don’t want to install docker on your system or you are just curious and want to jump straight into action there’s a SUSE Studio appliance ready for you. You can find it here.
If you are not familiar with SUSE Gallery let me tell you two things about it: you can download the published appliances, and you can even test them directly from your browser thanks to Testdrive.
The latter option is really cool, because it will allow you to play with docker immediately. There’s just one thing to keep in mind about Testdrive: outgoing connections are disabled, so you won’t be able to install new stuff (or download new docker images). Fortunately this appliance comes with the busybox container bundled, so you will be able to play a bit with docker.
The docker daemon must be running in order to use your containers. The openSUSE package comes with a init script which can be used to manage it.
The script is /etc/init.d/docker
, but there’s also the usual symbolic link
called /usr/sbin/rcdocker
.
To start the docker daemon just type:
sudo /usr/sbin/rcdocker start
This will trigger the following actions:
- the docker0 bridge interface is created. This interface is bridged with eth0.
- a dnsmasq instance listening on the docker0 interface is started.

All the containers will get an IP on the 10.0.3.0/24 network.
Now it’s time to play with docker.
First of all you need to download an image:
docker pull base
This will fetch the official Ubuntu-based image created by the dotCloud guys.
You will be able to run the Ubuntu container on your openSUSE host without any problem, that’s LXC’s “magic” ;)
If you want to use only “green” products just pull the openSUSE 12.3 container I created for you:
docker pull flavio/opensuse-12-3
Please experiment a lot with this image and give me your feedback. The dotCloud guys proposed me to promote it to top-level base image, but I want to be sure everything works fine before doing that.
Now you can go through the examples reported on the official docker’s documentation.
I think it would be extremely cool to create docker’s images using SUSE Studio. As you might know I’m part of the SUSE Studio team, so I looked a bit into how to add support to this new format.
– personal opinion –
There are some technical challenges to solve, but I don’t think it would be hard to address them.
– personal opinion –
If you are interested in adding the docker format to SUSE Studio please create a new feature request on openFATE and vote it!
In the meantime there’s another way to create your custom docker images, just keep reading.
KIWI is the amazing tool at the heart of SUSE Studio and can be used to create LXC containers.
As said earlier docker runs LXC containers, so we are going to follow these instructions.
First of all install KIWI from the Virtualization:Appliances project on OBS:
sudo zypper ar http://download.opensuse.org/repositories/Virtualization:/Appliances/openSUSE_12.3 virtualization:appliances
sudo zypper in kiwi kiwi-doc
We are going to use the configuration files of a simple LXC container shipped with the kiwi-doc package:
cp -r /usr/share/doc/packages/kiwi/examples/suse-11.3/suse-lxc-guest ~/openSUSE_12_3_docker
The openSUSE_12_3_docker
directory contains two configuration files used by
KIWI (config.sh
and config.xml
) plus the root
directory.
The contents of this directory are going to be added to the resulting container.
It’s really important to create the /etc/resolv.conf
file inside of the
final image since docker is going to mount the resolv.conf
file of the host
system inside of the running guest. If the file is not found docker won’t be able
to start our container.
An empty file is enough:
touch ~/openSUSE_12_3_docker/root/etc/resolv.conf
Now we can create the rootfs of the container using KIWI:
sudo /usr/sbin/kiwi --prepare ~/openSUSE_12_3_docker --root /tmp/openSUSE_12_3_docker_rootfs
We can skip the next step reported in KIWI’s documentation: it’s not needed with docker and it would produce an invalid container. Just execute the following command:
sudo tar cvjpf openSUSE_12_3_docker.tar.bz2 -C /tmp/openSUSE_12_3_docker_rootfs/ .
This will produce a tarball containing the rootfs of your container.
Now you can import it inside of docker; there are two ways to achieve that:
Importing the image from a web server is really convenient if you ran KIWI on a different machine.
Just move the tarball to a directory which is exposed by the web server. If you don’t have one installed just move to the directory containing the tarball and type the following command:
python -m SimpleHTTPServer 8080
This will start a simple http server listening on port 8080 of your machine.
On the machine running docker just type:
docker import http://mywebserver/openSUSE_12_3_docker.tar.bz2 my_openSUSE_image latest
If the tarball is already on the machine running docker you just need to type:
cat ~/openSUSE_12_3_docker.tar.bz2 | docker import - my_openSUSE_image latest
Docker will download (just in the 1st case) and import the tarball. The resulting image will be named ‘my_openSUSE_image’ and it will have a tag named ‘latest’.
The name of the tag is really important since docker tries to run the image with the ‘latest’ tag unless you explicitly specify a different value.
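For example, thanks to the ‘latest’ tag, the image can be started without spelling out the tag explicitly (illustrative command; whether /bin/bash is available depends on the packages you put in the image):
docker run -i -t my_openSUSE_image /bin/bash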
Hackweek #9 has been both productive and fun for me. I hope you will have fun too using docker on openSUSE.
As usual, feedback is welcome.
]]>The previous 0.8.0 release broke ABI compatibility without changing the SOVERSION
.
I’m not entirely happy with some parts of QJson’s API. I addressed these issues inside of the 1_0_0 branch.
I would appreciate to hear your opinion before merging this branch into master and releasing QJson 1.0.0.
]]>So here we go, QJson 0.8.0 is out!
A lot of bugs have been smashed during this time; this new release will fix issues like this one and this in a nicer way.
QJson’s API is still backward compatible, while the ABI changed.
Some work has also been done to get QJson work on the Symbian platform. The development happened a long time before Symbian was declared dead.
Currently I do not offer any kind of support for the Symbian platform because IMHO Symbian development is a mess and given the current situation in the mobile world I don’t see any point in investing more efforts on that.
Obviously Symbian patches and documentation are still accepted, as long as they don’t cause issues to the other target platforms.
QJson always used cmake as build system but since some Windows developers
had problems with it I decided to add some .pro
files. That proved to be a
bad choice for me since I had to support two build systems. I prefer to invest
my time fixing bugs in the code and adding interesting features rather than
triaging qmake issues on Windows. Hence I decided to remove them from git.
If you are a nostalgic you can still grab these files from git. They have been removed with commit 66d10c44dd3b21.
I decided to move QJson’s code from Gitorious to Github. Github’s issue system is going to replace Sourceforge’s bug tracking system.
I currently use Github a lot, both for personal projects and for work, and I simply love it. I think it offers the best tools in the market and that’s really important to me.
QJson’s website and mailing lists are still going to be hosted on Sourceforge.
I think that’s all for now. If you want more details about the changes introduced, take a look at the changelog or check out QJson’s website.
Suppose you want to create a tailored SUSE Studio appliance to run a Ruby on Rails app, this is a list of things you have to take care of:
Dister is a command line tool similar to the one used by Heroku (one of the coolest ways to run your Rails app into the cloud). Within a few steps you will be able to create a SUSE Studio appliance running your rails application, download it and deploy wherever you want.
Dister is named after SUSE Studio robot. It has been created by Dominik Mayer and me during the latest hackweek.
We are going to create a SUSE Studio appliance running a rails application called “SUSE history”. The app uses bundler to handle its dependencies. This is its Gemfile file:
source 'http://rubygems.org'
gem 'rails', '3.0.5'
gem 'pg'
gem "flutie", "~> 1.1"
As you can see the app uses rails3, the flutie gem and PostgreSQL as database.
Move into the suse_history directory and execute the following command:
dister create suse_history
As you can see dister has already done a couple of things for you:
It’s time to upload suse_history code. This is done using the following command:
dister push
As you can see dister packaged the application source code and all its dependencies into a single archive. Then uploaded the archive to SUSE Studio as an overlay file. Dister uploaded also the configuration files required by Apache and by PostgreSQL setup.
It’s build time!
dister build
The appliance has automatically been built using the raw disk format. You can use different formats of course.
Testdrive is one of the coolest features of SUSE Studio. Unfortunately dister doesn’t support it yet. Just visit your appliance page and start testdrive from your browser. Just enable testdrive networking and connect to your appliance:
Your appliance is working flawlessly. Download it and deploy it wherever you want.
dister download
As you can see dister handles a simple Rails application pretty well, but there’s still room for improvement.
Here’s a small list of the things on my TODO list:
Bugs and enhancements are going to be tracked here.
Dister code can be found here on github, fork it and start contributing.
If you are a student you can work on dister during the next Google Summer of code, apply now!
]]>Thanks to Jump, you won’t have to type those long paths anymore.
You can find jump’s source code, detailed documentation and installation instructions here.
SUSE packages can be found here.
]]>Code can be downloaded from here. openSUSE packages are already available on the build service.
These are some screenshots illustrating fastuserswitch’s new features.
{% img /images/fast_user_switch/fastuserswitch011.png %} {% img /images/fast_user_switch/fastuserswitch021.png %} {% img /images/fast_user_switch/fastuserswitch03.png %}
Let me introduce my first plasmoid: the fast user switch plasmoid. It’s a simple icon in the panel that allows users to switch to another open session or to open a new login page. Here you can see the mandatory screenshots.
{% img /images/fast_user_switch/fastuserswitch02.png %} {% img /images/fast_user_switch/fastuserswitch01.png %}
You can find the source code here. Binary packages for openSUSE are already available on the build service.
I think that KDM should allow switching back to an already open session in a more transparent way. Right now, if a user who already has one session open goes back to the login screen and enters his credentials, a **new** session is started. I think that most users would expect to be switched back to their already running session. Starting a new session is just confusing for them.
]]>In order to run the unit tests of your rails application, basically you have these official possibilities:
rake test
: runs all unit, functional and integration tests.rake test:units
: runs all the unit tests.rake test:functionals
: runs all the functional tests.rake test:integration
: runs all the integration tests.
Each one of these commands requires some time and they are not the best
solution while developing a new feature or fixing a bug. In this circumstance
we just want to have a quick feedback from the unit test of the code we are
editing.Waiting for all the unit/functional tests to complete decreases our productivity, what we need is to execute just a single unit test. Fortunately there are different solutions for this problem, let’s go through them.
Most of the IDEs supporting Ruby allow you to run a single unit test. If you are using NetBeans, running a single unit test is really easy:
As you will notice, the summary window also contains some useful information, such as:
If you are not using Netbeans you can always rely on some command line tools.
These “tricks” don’t require additional gems, hence they will work out of the box.
The first solution is to call this rake task:
rake test TEST=path_to_test_file
So the final command should look like
rake test TEST=test/unit/invitation_test.rb
Unfortunately on my machine this command runs the same test three times; I hope you won’t see the same weird behavior on your systems…
Alternatively you can use the following command:
ruby -I"lib:test" path_to_test_file
It’s even possible to call a specific test method of your testcase:
ruby -I"lib:test" path_to_test_file -n name_of_the_method
So calling:
ruby -I"lib:test" test/unit/invitation_test.rb -n test_should_create_invitation
will execute only InvitationTest::test_should_create_invitation.
It’s also possible to execute only the test methods matching a regular expression. Look at this example:
ruby -I"lib:test" test/unit/invitation_test.rb -n /.*between.*/
This command will execute only the test methods matching the /.*between.*/ regexp.
If you want to avoid the awful syntax shown in the previous paragraph, there’s a gem that can help you: it’s called single_test.
The github page contains a nice documentation, but let’s go through the most common use cases.
You can install the gem as a rails plugin:
script/plugin install git://github.com/grosser/single_test.git
single_test will add new rake tasks to your rails project, but won’t override the original ones.
Suppose we want to execute the unit test of user.rb, just type the following command:
rake test:user
If you want to execute the functional test of User just call:
rake test:user_c
Appending "_c" to the class name will automatically execute its functional test (if it exists).
It’s still possible to execute a specific test method:
rake test:user_c:test_name
So calling:
rake test:user_c:test_update_user
will execute the test_update_user method defined inside test/functional/user_controller_test.rb.
It’s still possible to use regexp:
rake test:invitation:.*between.*
This syntax is equivalent to ruby -I"lib:test" test/unit/invitation_test.rb -n /.*between.*/.
When a single unit test is run, all the usual database initialization tasks are skipped. If your code relies on newly created migrations you will surely get lots of errors, because the new migrations have not been applied to the test database.
In order to fix these errors just execute:
rake db:test:prepare
before running your unit test.
]]>Since I’m not a Symbian developer, achieving that has been a little hard for me. I would like to thank Antti Luoma for his help.
There is also good news for Windows developers: building QJson under Windows is now easier. Check out the new installation instructions page.
I hope this will help all the Windows developers who want to use QJson.
]]>I’ll keep the code on KDE’s svn synchronized with the git repository.
]]>I refactored my latest changes a bit: I created a new class called QObjectHelper that provides the methods required to convert a QObject instance to a QVariantMap and vice versa.
This class can be used in conjunction with the Serializer and Parser classes to serialize and deserialize QObject instances to and from JSON.
Let me show a quick example, suppose the declaration of Person class looks like this:
{% codeblock [class definition] [lang:cpp ] %}
class Person : public QObject
{
  Q_OBJECT
  Q_PROPERTY(QString name READ name WRITE setName)
  Q_PROPERTY(int phoneNumber READ phoneNumber WRITE setPhoneNumber)
  Q_PROPERTY(Gender gender READ gender WRITE setGender)
  Q_PROPERTY(QDate dob READ dob WRITE setDob)
  Q_ENUMS(Gender)

 public:
  Person(QObject* parent = 0);
  ~Person();

  QString name() const;
  void setName(const QString& name);

  int phoneNumber() const;
  void setPhoneNumber(const int phoneNumber);

  enum Gender {Male, Female};
  void setGender(Gender gender);
  Gender gender() const;

  QDate dob() const;
  void setDob(const QDate& dob);

 private:
  QString m_name;
  int m_phoneNumber;
  Gender m_gender;
  QDate m_dob;
};
{% endcodeblock %}
The following code will serialize an instance of Person to JSON :
{% codeblock [From QObject to JSON] [lang:cpp ] %}
Person person;
person.setName("Flavio");
person.setPhoneNumber(123456);
person.setGender(Person::Male);
person.setDob(QDate(1982, 7, 12));

QVariantMap variant = QObjectHelper::qobject2qvariant(&person);

Serializer serializer;
qDebug() << serializer.serialize(variant);
{% endcodeblock %}
The generated output will be:
{% codeblock [JSON data] [lang:json ] %}
{
  "dob" : "1982-07-12",
  "gender" : 0,
  "name" : "Flavio",
  "phoneNumber" : 123456
}
{% endcodeblock %}
Suppose you have the following JSON data stored into a QString:
{% codeblock [JSON data] [lang:json ] %}
{
  "dob" : "1982-07-12",
  "gender" : 0,
  "name" : "Flavio",
  "phoneNumber" : 123456
}
{% endcodeblock %}
The following code will initialize an already allocated instance of Person using the JSON values:
{% codeblock [From JSON to QObject] [lang:cpp ] %}
Parser parser;
QVariant variant = parser.parse(json);
Person person;
QObjectHelper::qvariant2qobject(variant.toMap(), &person);
{% endcodeblock %}
These changes have been included inside the new release of QJson: 0.7.0.
Packages for openSUSE are building right now.
]]>This solution relies on the awesome Qt’s property system.
Suppose the declaration of Person class looks like this:
{% codeblock [class definition] [lang:cpp ] %}
class Person : public QObject
{
Q_OBJECT
Q_PROPERTY(QString name READ name WRITE setName)
Q_PROPERTY(int phoneNumber READ phoneNumber WRITE setPhoneNumber)
Q_PROPERTY(Gender gender READ gender WRITE setGender)
Q_PROPERTY(QDate dob READ dob WRITE setDob)
Q_ENUMS(Gender)
public:
Person(QObject* parent = 0);
~Person();
QString name() const;
void setName(const QString& name);
int phoneNumber() const;
void setPhoneNumber(const int phoneNumber);
enum Gender {Male, Female};
void setGender(Gender gender);
Gender gender() const;
QDate dob() const;
void setDob(const QDate& dob);
private:
QString m_name;
int m_phoneNumber;
Gender m_gender;
QDate m_dob;
};
{% endcodeblock %}
The following code will serialize an instance of Person to JSON:
{% codeblock [Serialize to JSON] [lang:cpp ] %}
Person person;
person.setName("Flavio");
person.setPhoneNumber(123456);
person.setGender(Person::Male);
person.setDob(QDate(1982, 7, 12));
Serializer serializer;
qDebug() << serializer.serialize(&person);
{% endcodeblock %}
The generated output will be:
{% codeblock [JSON data] [lang:json ] %}
{
  "dob" : "1982-07-12",
  "gender" : 0,
  "name" : "Flavio",
  "phoneNumber" : 123456
}
{% endcodeblock %}
I hope you will find this new feature useful. I’m also considering creating a similar method inside the Parser class.
As usual suggestions are welcome.
]]>In the meantime I kept working on kaveau, so let me show you what has changed:
Previously kaveau used rdiff-backup as its backup back-end. rdiff-backup is a great program but unfortunately it relies on the outdated librsync library. The latest release of librsync is dated 2004; it still has a couple of serious open bugs and, while rsync has reached version three, this library is still stuck at version one.
These are the reasons for the switch from rdiff-backup to rsync. This choice breaks compatibility with previous backups but introduces a lot of advantages. One of the most important improvements brought by the adoption of rsync is an easier restore procedure: now all the backups can be accessed using a standard file manager, while previously rdiff-backup was needed to access the old backups.
On the backup device everything is saved under the kaveau/hostname/username path.
The directory will have a similar structure:
drwxr-xr-x 3 flavio users 4096 2009-09-12 18:50 2009-09-12T18:50:19
drwxr-xr-x 3 flavio users 4096 2009-09-14 23:07 2009-09-14T23:07:46
drwxr-xr-x 3 flavio users 4096 2009-09-14 23:30 2009-09-14T23:30:36
lrwxrwxrwx 1 flavio users 19 2009-09-14 23:30 current -> 2009-09-14T23:30:36
As you can see there’s one directory per backup, plus a symlink called current pointing to the latest backup.
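For the curious, this layout matches the classic rsync snapshot scheme. The sketch below is my own minimal illustration (destination path, options and names are my assumptions, not necessarily kaveau’s exact invocation): unchanged files are hard-linked from the previous snapshot, so every directory looks like a full backup while only new or changed files consume extra space.

{% codeblock [rsync snapshot sketch] [lang:bash ] %}
# hypothetical destination, mirroring the kaveau/hostname/username layout
DEST="/media/disk/kaveau/$(hostname)/$USER"
NEW="$(date +%Y-%m-%dT%H:%M:%S)"
mkdir -p "$DEST"

# files identical to the previous snapshot are hard-linked instead of copied
rsync -a --delete --link-dest="$DEST/current" "$HOME/" "$DEST/$NEW/"

# point the "current" symlink at the freshly created snapshot
ln -snf "$NEW" "$DEST/current"
{% endcodeblock %}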
Nowadays big external storage devices are pretty cheap, but it’s always good to save some disk space. Now kaveau keeps:
Before starting to work on the restore user interface I will spend some time figuring out how to add support for network devices.
A lot of users requested this feature, hence I want to make them happy :) .
I’m planning to use avahi to discover network shares (nfs, samba) or network machines running ssh and use them as backup devices. Honestly, I want to achieve something similar to Apple’s time capsule.
As usual, feedback messages are really appreciated.
]]>The good news is that QJson can be successfully built under Windows, and I can show you proof ;)
{% img /images/qjson/qjson_windows_1.png %} {% img /images/qjson/qjson_windows_2.png %}
I have written the build instructions on QJson website: just take a look here.
One last note: if you have problems with QJson please subscribe to the developer mailing list and post a message.
]]>I decided to dedicate myself to an idea that has been obsessing me for a long time. Last December my brand new hard disk suddenly died, making it impossible to recover anything. Fortunately I had just synchronized the most important documents between my workstation and my laptop, so I didn’t lose anything important. This incident made me realize that I should perform backups regularly, and I immediately started looking for a good solution.
Personally I think that doing backups is pretty boring so I wanted something damned easy to setup and use. Something that once configured runs in the background and does the dirty job without bothering you. Let’s face the truth: I wanted Apple’s Time Machine for KDE.
After some searches I realized that nothing was fitting my requirements and I decided to create something new: kaveau.
Kaveau is a backup program that makes backup easy for the average user.
As you will see, while coding/planning kaveau I made some assumptions, so only a few things are configurable. I really think that sometimes “less is more”.
Current features:
Backups are performed using rdiff-backup because it’s damned easy to use, well tested (it’s also used in production environments) and packaged by all distributions.
Thanks to the awesome Solid library, interacting with the external disks is super easy.
I have been working on kaveau just for five days, so there’s still a lot of work to do.
A screenshot tour will give you the right idea of its status.
####Backup wizard - page 1
####Backup wizard - page 2
####Backup wizard - final page
####Backup operation in progress
####Backup completed
Right now the code is available in this git repository, but I don’t recommend trying it (unless you want to find and fix bugs ;) ).
I would really appreciate:
In fact we hacked a lot, doing lots of changes to QJson:
So it’s with a great pleasure that I announce the release of QJson 0.6.0.
Beware: since the API has changed, your application will probably break. I’m really sorry about that, but I guarantee it won’t happen again in the future (as I said, both the API and ABI interfaces can now be considered stable).
The QJson web site has been updated, reflecting all the changes made to the library. The openSUSE packages have been moved from my home repo to the KDE:Qt one.
One last note, if you have problems with QJson please contact me using the qjson-devel mailing list. You can subscribe here.
]]>On Thursday 9th July I’ll give a BoF about QJson.
During the talk I will show:
See you soon!
]]>As usual the data is provided by last.fm, which should also return the events “near” the specified city (don’t ask me to define a value for near :) ).
I have created new openSUSE packages, and this time everything should work fine. Just make sure to remove all qjson packages before installing this release (in fact all the previous problems were due to packaging errors of qjson; now I have created new packages called libqjson0).
Packages are available for openSUSE 11.0, 11.1 and Factory (both for i586 and x86_64 architectures).
One last news: rockmarble has also a new site, something slightly better than github wiki ;)
]]>After some hacking I created this small application: rockmarble…
If you have a last.fm account rockmarble will import your favourite artists list. Otherwise you can add one artist at a time.
The tour location will be displayed inside Marble, using openstreetmap.
In order to build/run it you will need:
You can grab the source code of rockmarble here.
If you are an openSUSE user you can use 1click install:
The geolocation data is provided by last.fm, so if you discover that Metallica are going to give a concert in the middle of the Pacific Ocean please don’t bother me :)
It seems that QJson doesn’t handle special characters properly. You might see some artists with a blank name. I’m going to fix this issue asap.
Visit rockmarble website
Who wants to integrate it into amarok’s context view? ;)
]]>I came across a couple of solutions but none of them made me happy. So last weekend I wrote my own library: QJson. The library is based on the Qt toolkit and converts JSON data to QVariant instances. JSON arrays are mapped to QVariantList instances, while JSON objects are mapped to QVariantMap. The JSON parser is generated with Bison, while the scanner has been coded by me.
Converting JSON’s data to QVariant instance is really simple:
{% codeblock [] [lang:cpp ] %}
// create a JSonDriver instance
JSonDriver driver;
bool ok;

// json is a QString containing the data to convert
QVariant result = driver.parse (json, &ok);
{% endcodeblock %}
Suppose you’re going to convert this JSON data:
{% codeblock [JSON data] [lang:json ] %}
{
  "encoding" : "UTF-8",
  "plug-ins" : [ "python", "c++", "ruby" ],
  "indent" : { "length" : 3, "use_space" : true }
}
{% endcodeblock %}
The following code would convert the JSON data and parse it:
{% codeblock [] [lang:cpp ] %}
JSonDriver driver;
bool ok;

QVariantMap result = driver.parse (json, &ok).toMap();
if (!ok) {
  qFatal("An error occurred during parsing");
  exit (1);
}

qDebug() << "encoding:" << result["encoding"].toString();
qDebug() << "plugins:";

foreach (QVariant plugin, result["plug-ins"].toList()) {
  qDebug() << "\t-" << plugin.toString();
}

QVariantMap nestedMap = result["indent"].toMap();
qDebug() << "length:" << nestedMap["length"].toInt();
qDebug() << "use_space:" << nestedMap["use_space"].toBool();
{% endcodeblock %}
The output would be:
encoding: "UTF-8"
plugins:
  - "python"
  - "c++"
  - "ruby"
length: 3
use_space: true
QJson requires:
Currently the QJson code is hosted on the KDE Subversion repository. You can download it using an svn client:
svn co svn://anonsvn.kde.org/home/kde/trunk/playground/libs/qjson
For more information visit the QJson site.
]]>While organizing the event somebody proposed to set up a local server with some music released under a CC license. He suggested downloading some albums from Jamendo (due to network issues we won’t be able to provide direct access to the website).
Since nobody wanted to download the albums by hand, last night I wrote a small ruby program that does the dirty job.
Ruby and the json gem have to be installed on your machine.
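If the json gem is missing, it can usually be installed with something like the following (you may need root privileges, depending on your Ruby setup):

gem install json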
Help:
./jamendo_downloader.rb --help
Download the top 10 rock albums:
./jamendo_downloader.rb -g rock -t 10
I think there’s nothing more to say… enjoy it!
{% gist 2437530 jamendo_downloader.rb %}
]]>After reading these lines:
{% blockquote %}
We are exploring new technologies, creating prototypes of future systems, and trying to find and shape some of the features that will be part of upcoming SUSE products and the ecosystem around that. It’s a fascinating job, challenging, fun, and always exciting. For somebody like me who loves to create new things and enjoys working with an awesome team of innovative people this is a dream job.
{% endblockquote %}
I realized the description matched with all my wishes!
I immediately applied for the position and now, after more than a month, I’m looking at a signed proposal.
I’ve waited a long time before spreading my joy to the rest of the world… but now I just want to say that starting from 1st August I’ll be part of Suse’s Incubation team! :D
I’m really happy, excited and honoured for this awesome opportunity.
I wish you’ll be able to realize your dreams too!
Just a funny note… While Cornelius was writing about the job offer on his blog, I was having dinner with another Novell employee: Massimiliano Mantione. I was listening to his personal story, wishing to find a dream job like his. I have to admit I was a bit jealous :)
During this time I’ve been working on the creation of XesamQLib. This is a Qt based library for accessing Xesam services. Its API is going to be similar to the Xesam-glib one and it will make life easier for developers who want to interact with programs exposing a Xesam service (who said Strigi? :D )
Right now I’m finishing cleaning up the code, in order to publish a first version of XesamQLib on the KDE repository. I’ll keep you updated.
]]>Currently I’m using Linux as my primary OS, but I’ll also give KDE4 on Mac a try ;)
The other new entry is Baguette, a three-month-old dachshund. She’s really lovely, running around the house eating random things like shoes (I think it’s a must for all pets) and USB cables (she’s already a geek-dog).
I think I’ll have to keep an eye on Baguette, otherwise I’ll find another bite on my Macbook pro apple ;)
]]>KDE integration: Flavio will coordinate the definition of interfaces over which KDE will handle searching and metadata. He can ask Aaron, Evgeny and Jos for help with the interface design. The interface will cover:
Querying via Xesam
Configuration of the Strigi daemon
Indexing and deindexing of data by passing it to the daemon (allowing for indexing for more than just files)
Controlling the daemon (starting, stopping, pausing)
Once this interface is ready, it will be easy to integrate Strigi functionalities inside KDE programs. This means (just to report the most relevant cases) that it will be possible to create a Strigi krunner, have metadata extraction inside Dolphin and Konqueror, interact with Akonadi…
Talking about Xesam, just in these days I got a mail from Fabrice Colin, the author of Pinot. Recently Fabrice made some improvements to Pinot’s XesamQueryLanguage parser (which is also used by Strigi). We’re now figuring out how to share our code in a more convenient way. Maybe we’ll use svn externals…
]]>During the meeting Strigi developers will discuss the future development of Strigi.
Special guest: Aaron Aseigo. You’re all welcome.
]]>Since Strigi’s analyzers work in a different way, a lot of code has to be ported. Unfortunately, after a good start, some relevant analyzers were still missing.
But in the last weeks Strigi gained support of:
I’ve also updated this summary page. As you can see there’s still some work to do, but don’t worry… I’ll try to do my best ;)
]]>So, by now, Strigi supports the following file system monitoring facilities:
You’re hacking on your local working copy and you want to keep it up-to-date but, since you have some uncommitted changes, git-svn rebase cannot be executed.
I was just thinking of writing something about this problem when I read a post on the digikam blog.
In this post Marcel proposes a workaround using a bash function. In fact there’s a “cleaner” solution, if you’re interested read the last part of my git-svn howto.
]]>Linux Day is an Italian manifestation that promotes Linux and FOSS. During this day different organizations (mostly Linux User Groups) arrange events with speeches, installation parties and more.
Since a lot of people requested it, I gave a speech about KDE 4 during the Linux Day organized by my LUG (BGLug).
The presentation covers the main changes and features introduced by KDE 4. I took inspiration from Troy’s “Road to KDE 4” articles (I like them really much).
People liked the speech and, most important of all, showed great interest for KDE 4.
This is a really small howto that describes how to work on a project versioned with svn (maybe taken from KDE repository ;) ) using git.
Since Git is a distributed revision control system (while svn is a centralized one) you can perform commits, branches, merges, … on your local working dir without being connected to the internet.
Next time you’ll be online, you will be able to “push” your changes back to the central svn server.
You’ve to:

mkdir strigi
cd strigi && git-svn init https://svn.kde.org/home/kde/trunk/kdesupport/strigi

The git-svn init command is followed by the address of the svn repository (in this case we point to Strigi’s repository).

git-svn fetch -rREVISION

Where REVISION is the number obtained before.

git-svn rebase
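If you don’t already have a revision number at hand, one possible way to obtain a recent one (this is just a suggestion of mine, not part of the original recipe) is to query the svn server, for example:

svn info https://svn.kde.org/home/kde/trunk/kdesupport/strigi

and look at the Revision field of the output.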
Now you’ll be able to work on your project using git as your revision control system. To keep your working copy up to date just run:
git-svn rebase
You can **commit your changes** to the svn server using the command:
git-svn dcommit
In this way each commit made with git will be “transformed” into a svn one.
While adding new cool features to your program, you may experience some problems when synchronizing with the main development tree. In fact you have to commit all local modifications (using the git-commit command) before invoking git-svn rebase.
Sometimes it isn’t reasonable since your changes are not yet ready to be committed (you haven’t finished/tested/improved your work). But don’t worry, git has a native solution also for this problem, just follow these steps:
- git-stash
- git-svn rebase (as usual)
- git-stash apply
- git-stash clear
After the first step all your uncommitted changes will disappear from the working copy, so you’ll be able to run the rebase command without problems. For further information read the git-stash man page.
That’s all.
A special mention goes to Thiago Macieira for his help.
]]>But in the end I tamed the beast and now Xesam support in Strigi is complete.
IMHO XesamUserSearchLanguage can be considered more important than XesamQueryLanguage, since common users will write queries in this way.
As reported on the project page: {% blockquote %} It is [XesamUserSearchLanguage] designed as an extended synthesis of Apple’s spotlight and Google’s search languages. {% endblockquote %}
These are some possible queries (examples taken from freedesktop site):
type:music hendrix
will return all music items related to hendrix.
type:image size>=1mb tag:flower africa
will return all pictures displaying a flower, bigger than 1 MB and related to Africa.
The Xesam’s UserSearchLanguage query –> Strigi::Query object conversion is made using a hand-written scanner and a C++ parser created by Bison.
You don’t have to worry if you don’t have Bison installed on your system, since all the parser generated code is already in svn. In these days, as soon as I have some spare time (when?!), I’ll write another post about open-source scanner and parser generators.
By now I would like to thank Andreas Pakulat (developer of KDevelop) for his help with parser generators.
]]>In the end yesterday I spent approximately four hours on the train. During this time I started the Xesam User Language parser :) During the travels I:
Now, after fixing some build errors, I’ll start writing Bison’s grammar rules. These rules will translate Xesam user language queries into Strigi::Query objects.
I hope it will work (both bathroom and Xesam parser ;) )
]]>Strigicmd is a command-line tool shipped with Strigi. It can perform different actions like:
So, if you want to try the new Xesam support you just have to use strigicmd with the xesamquery option. The command syntax is:

strigicmd xesamquery -t backend -d indexdir [-u xesam_user_language_file] [-q xesam_query_language_file]

As you can expect, you have to save your Xesam query to a file and point strigicmd at it.
This is a really small step-by-step guide:
strigicmd create -t clucene -d temp/ logs/
Create a simple file containing your Xesam query. You can find some example query on Xesam site or inside strigi tarball (complete path: strigi/src/streamanalyzer/xesam/testqueries/). This is a stupid and easy query:
{% codeblock [query] [lang:xml ] %}
Perform the search, just type:
strigicmd xesamquery -t clucene -d temp/ -q ~/irc_oever.xml
Enjoy the search results ;)
Remember that XesamUserLanguage query language isn’t yet supported.
]]>But why is this important? If you aren’t able to answer the previous question you probably don’t know what Xesam is. Here’s a short definition taken from the official site: Xesam is an umbrella project with the purpose of providing unified APIs and specs for desktop search and metadata services. Thanks to D-Bus and Xesam it will be possible to access the information indexed by Strigi (and by all the desktop search programs supporting these technologies) in a standard and easier way. Isn’t it cool?
I’ve to say a big “thank you” to Fabrice Colin (author of pinot) because my Xesam code relies upon his work.
My work isn’t finished yet. Xesam defines two kinds of queries:
For now I’m thinking of accomplishing this task using flex, but I’m still at a preliminary stage. Suggestions are welcome!
P.S. I’m really happy because this is my first post published on PlanetKDE. Hello to everybody!
]]>How to resist?! :)
After some clicks I made this beautiful avatar:
{% img /images/avatar_simpson.jpg %}
Isn’t it amazing? :D
]]>I’ll enable it again as soon as possible.
BTW: I don’t know who will care about this limitation ;)
]]>Many thanks to Roberto for the photos (made with his mobile phone!).
]]>These are the major changes introduced:
Actually I’m very happy of the first point, but I think the second one can still be improved…
svn status and svn diff, and I made some commits on Strigi trunk. Most of them are just code readability improvements (i.e. don’t exceed 80 characters per line).
I think that Jos will be really happy to see them ;)
]]>The most important of all is the last one. I’ve spent all my week-ends working on my new house. I’ve been fixing and improving it (wow… it seems like I’m talking about a piece of software ;) ) and now it’s almost finished.
I have to say a great “thank you” to my dad. He’s able to do everything (he’s even better than MacGyver) and helped me a lot (in fact he has done all the dirty jobs).
Prepare yourself, I’ll put some photos of the house really soon…
]]>During the upgrade I also installed a “real” gallery program (I’m referring to Gallery), you can navigate through it clicking here.
I hope you’ll like the new look and will enjoy it!
I also promise that I’ll keep this site more updated, stay tuned… in the next weeks I’ll start writing some good technical post ;)
Btw, during the upgrade I ported all previous contents and comments to drupal.
]]>This speech illustrates:
I’ve uploaded the slides, you can find them in the paper category.
]]>Departure from Italy. The plane landed at Charleroi airport, where we (Laura and I) took a bus to Bruxelles. Once in the city it was so late that there was no public transport, so we had to take a cab to reach our hotel.
Small tour of Bruxelles. As we thought, Bruxelles is a beautiful city with lots of things to see; in fact there are too many things to see for a 2-day trip! In the late morning we reached FOSDEM at the ULB university.
After a long tour across the different stands I found the KDE guys and especially Jos. I met him at the Nepomuk talk. There I discovered that, since Sebastian Trueg (author of the famous K3b burning program) was ill, Jos had taken his place. It was a really interesting talk that “warmed up” people, making them more curious about Strigi. Then it was my turn. Everything went right and in the end there were many questions.
The day ended with another small walk in Bruxelles and a dinner in one typical restaurant.
We went to FOSDEM for the last time to see Jos’ talk. As usual Jos did a great job and all the people liked it. After a small chat with him about some technical aspects of Strigi, we left FOSDEM and went for another small tour of Bruxelles.
In the early afternoon we had to take the bus to Charleroi airport. We came back home at 10pm with some bottles of good Belgian beer ;)
It has been a great experience, and I’ll try to come back to FOSDEM next year!
You can find my presentation in the papers section.
]]>Strigi desktop integration: how to access Strigi features.
Strigi, the fastest and smallest desktop search engine, provides fast searches and good metadata extraction capabilities. It will be used by the next KDE, but it can be easily integrated in other programs.
Modern humans are using more and more data every day. Keeping data organized is no longer sufficient, so finding and filtering documents have become key tasks in modern operating systems. The traditional Unix tools find, grep and locate no longer suffice for a number of reasons: the amounts of data have grown more than data access speeds, many document formats are too complex to handle with simple tools, and relations between different types of data are becoming more important for a convenient user experience. Strigi introduces a new way of looking at metadata and file formats that enables the creation of very efficient tools for improving the way users handle their data. It does so by using simple C++ code with very few dependencies.
Strigi’s features can be easily integrated into other programs using two different interfaces: socket and D-Bus. The latter can be considered the best choice since there are a lot of D-Bus bindings, which make it possible to interact with Strigi from many different programming languages. The D-Bus inter-process communication system will be used by KDE4 and is already used by GNOME. Thanks to this, Strigi can be easily integrated into different window managers and into programs written in languages other than C++.
KDE developers can also take advantage of Strigi’s JStreams: a set of classes for fast access to archive files. In fact Strigi defines a dedicated KIO slave for JStreams; this can easily be used by other programs, making archive access faster.
This presentation will show how the current Strigi clients interact with the daemon, how to communicate with Strigi and, most important of all, how other projects can benefit from accessing Strigi too.
FOSDEM stands for “Free Open Source Developers European Meeting”.
This is one of the most important happenings for the open-source scene in Europe. It takes place in Bruxelles, where for two days the university is full of talks and stands of open-source projects.
Big projects have a DevRoom, a place reserved for talks related to them. And here I come… Since Strigi is related to KDE, Jos van den Oever and I will give two presentations in the KDE DevRoom.
I’ll give a talk with the title “Strigi desktop integration”. I’ll talk about the different interfaces you can use for playing with Strigi. These interfaces are really easy to use, which makes it possible to integrate Strigi in different projects or to write some cool front-end without any hassle. Finally I’ll talk about the technologies that reside inside Strigi’s internals. Since they’re really useful, it can be smart to use them in different situations (also when information retrieval isn’t the main goal). Here you can get more detailed information about my talk.
]]>The other great thing is that, exactly a month after my graduation, I found a good job in an IT company. In the meantime I got lots of job interviews, and I’ve been really lucky because I had lots of choices. The main office is located in Milan, a great city near Bergamo (where I live). Since I don’t like living in Milan (for lots of reasons), I became a commuter. So every day I take a train, the subway and a tram in order to reach work. It isn’t as hard as you could think… I have lots of time to spend reading and maybe coding ;) In the end, for the last point regarding Strigi and FOSDEM, I’ll write a dedicated post.
]]>I’ve just contacted the klaptopdaemon maintainers and I’m going to commit it to KDE svn, so you’ll get everything in the next klaptopdaemon release :)
]]>I made this program just for fun and to learn something (it was the first time I used the Qt library), so don’t bother me if it isn’t fully comparable to HP’s ADK.
I don’t use my calculator much; the only thing I really needed was a decent note editor and that’s what HpCalc is going to be: an HP 39G note editor.
Currently the program is only an alpha version, but it’s quite usable for loading/editing/saving .not files.
As I said before, I don’t use my calculator much and this isn’t one of my primary interests, so I don’t know if I’ll ever update it.
HpCalc source code is released under GPL2 license, so you can grab it free from this github repository.
]]>This speech illustrates:
A couple of months have passed since I started seeking something interesting to challenge myself with. I found kat, an open-source information retrieval program for KDE. If you don’t know what an information retrieval program is, just think of a local Google: it’s simply a program that lets you search through your local files, just like Google does for the web. Other information retrieval programs are Beagle and Google Desktop. I started to study kat’s code and I also discovered that its maintainer is a nice Italian guy. Unfortunately kat has some ugly problems and Roberto (the maintainer) can’t fix them because he’s really busy now. I was going to investigate these problems in order to fix them when I discovered a similar project: strigi.
strigi is a really young project created by Jos van den Oever. It’s written in C++ using the STL and other external libraries. It runs as a daemon listening on a socket. In this way you just have to write your custom GUI using your favorite language, nice isn’t it? I contacted Jos and began to send patches and add new features to strigi, committing them straight into the KDE Subversion repository (cool, I’m a bit excited about it :) ). Recently I added support for the Linux kernel inotify interface, an essential component for strigi.
I really prefer strigi over kat because:
You can find more information about strigi here.
]]>I’m going to release it under the GPL on the BerliOS site. Currently I’ve registered the project (see http://developer.berlios.de/projects/qshapes/ ), committed some code to the svn repository and uploaded some screenshots. The program is “quite” stable (there are still some crashes) and can run under Linux, Mac OS and Windows. It’s written in C++ using Qt for the GUI, so it’s really portable.
I think I’ll use this program as a starting point for one of my dreams: a multiplatform open-source diagram creation tool like dia, kivio or Microsoft Visio®. I like dia and kivio but both lack some components/features. Since I don’t like “gnomish” software too much I’ll never improve dia. On the other hand kivio is quite pretty but poorer than dia in some situations. Most of all, kivio isn’t multiplatform and that’s a big problem for me.
But now I’m really busy so I’ll start working on this project after my second level thesis (I’ll tell you something about it really soon).
]]>have the following keymap: apple keyboard = alt gr
You’ve to edit your /etc/X11/xkb/keycodes/xfree86 file. Remember to make a backup copy of this file before editing it. These are the easy steps:
Restart X and keep your finger crossed ;)
]]>We just want this mapping:
apple key = alt gr
numpad return (the key near to left arrow) = canc
Simply download the file at the end of the page and load it using loadkeys
ibook-it.map.gz
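For example, assuming the file was saved in the current directory (loadkeys can usually read gzipped keymaps directly; otherwise gunzip it first), loading it should be as simple as:

loadkeys ibook-it.map.gz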
Simple, isn’t it?
I found this file over internet, the author is Dario Besseghini. Thank you Dario!
{% gist 2469779 %}
]]>Obviously I’m really happy but I’m also really busy because I’m working hard on strigi. I’ll let you know something more about it in these days…
]]>I just wanted to enable mouse button emulation on my iBook using the following shortcuts:
fn + ctrl = middle mouse button
fn + alt = left mouse button
You’ve to enable CONFIG_MAC_EMUMOUSEBTN in your kernel.
In order to enable this shortcut at every boot, add the following lines to your /etc/sysctl.conf:
dev.mac_hid.mouse_button_emulation = 1
dev.mac_hid.mouse_button2_keycode = 97
dev.mac_hid.mouse_button3_keycode = 100
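If you want the new values to take effect immediately, without rebooting, you can usually reload the file with (run as root; this is a generic sysctl tip, not something specific to this setup):

sysctl -p /etc/sysctl.conf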
Simple, isn’t it?
]]>The article focuses on the installation of Linux on some modern consoles like:
! file: for each file/directory you’ve removed without using the command svn del. svn-commit will prevent these errors because it will tell svn to:
- add all your unversioned files to the repository
- delete all the files you’ve removed from your working directory (be careful!!)
svn-commit requires:
svn-commit syntax: svn-commit directory
A simple example: svn-commit `pwd`
{% codeblock [svn-commit.py] [lang:python ] %}
#! /usr/bin/python

import os
from sys import argv

if __name__ == "__main__":
    dir = argv[1]
    # executes shell command
    pipe = os.popen('svn status %s' % (dir))
    result = [x.strip() for x in pipe.readlines()]
    pipe.close()

    to_add = []
    to_remove = []

    for x in result:
        if x.find('?') != -1:
            to_add += [x[x.find('/'):len(x)]]
        elif x.find('!') != -1:
            to_remove += [x[x.find('/'):len(x)]]

    print "To add\n"
    for x in to_add:
        print("Command svn add %s" % (x))
        pipe = os.popen("svn add \"%s\"" % (x))
        output = [y.strip() for y in pipe.readlines()]
        for y in output:
            print y
        pipe.close()

    print "To remove\n"
    for x in to_remove:
        print("Command svn delete \"%s\"" % (x))
        pipe = os.popen("svn delete \"%s\"" % (x))
        output = [y.strip() for y in pipe.readlines()]
        for y in output:
            print y
        pipe.close()
{% endcodeblock %}
]]>In this example you’ll find how you can match a regexp in a string.
{% codeblock [pattern matching] [lang:c++ ] %}
// Created by Flavio Castelli
#include <cstdio>
#include <string>
#include <boost/regex.hpp>

int main() {
  boost::regex pattern ("bg|olug", boost::regex_constants::icase|boost::regex_constants::perl);
  std::string stringa ("Searching for BsLug");

  if (boost::regex_search (stringa, pattern, boost::regex_constants::format_perl))
    printf ("found\n");
  else
    printf ("not found\n");

  return 0;
}
{% endcodeblock %}
In this example you’ll find how you can replace a string matching a pattern.
{% codeblock [substitutions] [lang:c++ ] %}
// Created by Flavio Castelli flavio.castelli@gmail.com
// distributed under GPL v2 license
#include <cstdio>
#include <string>
#include <boost/regex.hpp>

int main() {
  boost::regex pattern ("b.lug", boost::regex_constants::icase|boost::regex_constants::perl);
  std::string stringa ("Searching for bolug");
  std::string replace ("BgLug");
  std::string newString;

  newString = boost::regex_replace (stringa, pattern, replace);

  printf ("The new string is: |%s|\n", newString.c_str());

  return 0;
}
{% endcodeblock %}
In this example you'll find how you can tokenize a string with a pattern.

{% codeblock [split] [lang:c++ ] %}
// Created by Flavio Castelli flavio.castelli@gmail.com
// distributed under GPL v2 license
#include <cstdio>
#include <string>
#include <boost/regex.hpp>

int main() {
  boost::regex pattern ("\\D", boost::regex_constants::icase|boost::regex_constants::perl);

  std::string stringa ("26/11/2005 17:30");
  std::string temp;

  boost::sregex_token_iterator i(stringa.begin(), stringa.end(), pattern, -1);
  boost::sregex_token_iterator j;

  unsigned int counter = 0;

  while (i != j) {
    temp = *i;
    printf ("token %i = |%s|\n", ++counter, temp.c_str());
    i++;
  }

  return 0;
}
{% endcodeblock %}
In order to build these examples you'll need:
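As a rough illustration (the file name is made up, and I'm assuming g++ with a system-wide Boost installation), one of the examples above can be compiled by linking against the Boost regex library:

g++ pattern_matching.cpp -o pattern_matching -lboost_regex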
Requirements: in order to run, remove-svn requires Python.
Synopsis: remove-svn syntax is: remove-svn dir
In this way remove-svn will recursively remove all .svn directories found under dir.
An example: remove-svn `pwd`
UPDATE: a faster way of removing .svn directories is this simple bash command: find ./ -name "*svn*" | xargs rm -rf
The old script has been removed.
]]>This isn’t the final version of qtcanvas; it’s only a backport of the original classes shipped with Qt3.
So what’s the difference between this qtcanvas and qt3canvas (available only through Qt3 support with Qt4)? Simple: this version works with all open-source versions of Qt >= 4.1.0!!
In this way you can use qtcanvas also under windows (before it wasn’t possible with the open-source edition).
I’ve tried qtcanvas under Mac OS X Tiger, GNU/Linux and Windows XP and it works fine.
]]>In this way you can watch the status of one or more guides, keeping the translations updated.
gen-docheck requires:
gen-docheck syntax: gen-docheck [--help] [--man] [--config configuration_file]
For more information read the man page: gen-docheck --man
gen-docheck also supports configuration files.
This is an example:
#mail sender
sender = gentoo_doccheck@gentoo.org
#check only guides matching these names (use "." to match all, "," to separate names)
checkonly = diskless,macos
#checkonly = .
#send mail notify to translator
mailnotify = 0
#send all mail notify to this address
force_mail_destination = flavio.castelli@gmail.com
# smtp server
smtp = smtp.tiscali.it
# debug smtp commands
smtpdebug = 0
You can automate gen-docheck adding it to cron.
Here’s an example:
0 10 * * 0 /home/micron/gen-docheck/gen-docheck.pl --config /home/micron/gen-docheck/gen-docheck.conf
In this way gen-docheck will run every Sunday at 10:00 AM.
The code can be found inside of this git repository.
]]>id3medit is a simple script for tagging all mp3/ogg files present in a directory.
id3medit relies on id3v2, a command-line tool for editing id3v2 tags. File names must be in the format ‘## - trackname.ext’, where ## is the track number and ext is the file extension (mp3 or ogg, case insensitive).
id3medit syntax is: id3medit artist album year(*) genre(*), where (*) denotes optional arguments. You can obtain the genre identification number in this way (a full usage example follows the genre list below):
id3v2 -L | grep -i genre
id3v2 -L | grep -i rock
1: Classic Rock
17: Rock
40: Alt. Rock
47: Instrum. Rock
56: Southern Rock
78: Rock & Roll
79: Hard Rock
81: Folk/Rock
91: Gothic Rock
92: Progress. Rock
93: Psychadel. Rock
94: Symphonic Rock
95: Slow Rock
121: Punk Rock
141: Christian Rock
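Purely as an illustration (artist, album and year are made up, and 17 is the Rock genre id taken from the list above), tagging all the tracks in a directory might look like:

./id3medit "Metallica" "Ride the Lightning" 1984 17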
{% gist 2469919 %}
]]>This speech is about: