New article on woodpecker
nemunaire 2023-10-27 16:55:30 +02:00
parent 71cacaa029
commit b2e13e8985
3 changed files with 440 additions and 0 deletions

@ -0,0 +1,219 @@
---
title: "Unify HTTP requests and GRPC calls on a single domain for more flexible configuration: example with Woodpecker"
date: !!timestamp '2023-10-27 09:35:00'
image: /post/woodpecker-ci-mixing-http-grpc-on-one-domain-nginx/og.webp
tags:
- network
- hosting
- continuous integration
---
I installed the continuous integration service [Woodpecker](https://woodpecker-ci.org/), to replace [DroneCI](https://drone.io), which [the company that bought it decided to bury](https://github.com/harness/gitness#where-is-drone).
As Woodpecker is a fork of the latest free version of Drone, its use is broadly similar.
However, the teams have taken different directions on certain aspects, and communication with agents/*runners*, which used to be via websockets, is now carried out in Woodpecker using the GRPC protocol.
The solution proposed by the [Woodpecker documentation](https://woodpecker-ci.org/docs/administration/proxy#caddy) is to use 2 domains: one will be used for the web interface and the REST API, the second will be used for GRPC.
Is this really necessary?
<!-- more -->
## `nginx` configuration using 2 domains
Woodpecker exposes the web interface and GRPC on two different ports: 8000 and 9000 respectively.
The simplest approach to exposing these two services on the Internet is to use two separate domains.
Here's what our reverse-proxy configuration might look like:
```
# Web interface and REST API
server {
    listen 80;
    server_name woodpecker.example.com;

    location / {
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Host $http_host;
        proxy_pass http://127.0.0.1:8000;
        proxy_redirect off;
        proxy_http_version 1.1;
        proxy_buffering off;
        chunked_transfer_encoding off;
    }
}

# GRPC endpoint used by the agents
server {
    listen 80 http2;
    server_name woodpeckeragent.example.com;

    location / {
        grpc_pass grpc://127.0.0.1:9000;
    }
}
```
Here we're using `nginx`'s [GRPC module](https://nginx.org/en/docs/http/ngx_http_grpc_module.html), and in particular the [`grpc_pass`](https://nginx.org/en/docs/http/ngx_http_grpc_module.html#grpc_pass) directive.
This directive is similar to the [`proxy_pass`](https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_pass) directive: it forwards all GRPC requests arriving on the `woodpeckeragent.example.com` domain to port 9000 on the local machine.
However, I prefer to avoid declaring 2 domains, obtaining 2 certificates, and so on for a single service, especially when the second domain is one that you tend to forget and neglect.
So let's see if we can't do better.
## The GRPC protocol
Without going into too much detail, GRPC is a [protocol close to HTTP/2](https://grpc.io/docs/what-is-grpc/faq/#why-is-grpc-betterworse-than-rest).
As such, many reverse-proxies are capable of transmitting GRPC requests.
This is the case for [Caddy](https://woodpecker-ci.org/docs/administration/proxy#caddy), for which the Woodpecker documentation gives a brief example, and for [Traefik](https://woodpecker-ci.org/docs/administration/proxy#traefik).
But `nginx` also [supports proxying GRPC requests](https://www.nginx.com/blog/nginx-1-13-10-grpc/).
When the reverse-proxy receives a GRPC request, it sees an HTTP/2 request similar to:
```
POST /pkg.Service/Function HTTP/2.0
Host: grpc.example.com
User-Agent: grpc-go/1.21.0
[...]
```
At first glance, there's nothing confusing for a web server.
There must be something clever we can do to use these similarities to our advantage.
## Combine HTTP requests and GRPC calls on the same domain
The path of GRPC requests is fixed: it depends on the protobuf file describing the calls and structures.
Each *Service* is declared within a package (*pkg*), and the *Function* name completes the path.
All calls are `POST` requests.
So we need to extract the various HTTP routes used by each protobuf service.
Let's take a look at the following file for Woodpecker:
<https://github.com/woodpecker-ci/woodpecker/blob/main/pipeline/rpc/proto/woodpecker.proto>
```proto
[...]
package proto;
[...]
service Woodpecker {
  rpc Version       (Empty)                returns (VersionResponse) {}
  rpc Next          (NextRequest)          returns (NextResponse) {}
  rpc Init          (InitRequest)          returns (Empty) {}
  rpc Wait          (WaitRequest)          returns (Empty) {}
  rpc Done          (DoneRequest)          returns (Empty) {}
  rpc Extend        (ExtendRequest)        returns (Empty) {}
  rpc Update        (UpdateRequest)        returns (Empty) {}
  rpc Log           (LogRequest)           returns (Empty) {}
  rpc RegisterAgent (RegisterAgentRequest) returns (RegisterAgentResponse) {}
  rpc ReportHealth  (ReportHealthRequest)  returns (Empty) {}
}
[...]
service WoodpeckerAuth {
  rpc Auth (AuthRequest) returns (AuthResponse) {}
}
[...]
```
From this descriptive file, we have now determined all the routes that can be used by all clients using this version of the service.
The `proto` package contains 2 services: `Woodpecker` and `WoodpeckerAuth`.
This gives us 2 root routes:
- `/proto.Woodpecker/`
- `/proto.WoodpeckerAuth/`
Behind each, we find the functions described by each `rpc` line.
For example:
- `/proto.Woodpecker/Version`
- `/proto.Woodpecker/Done`
- `/proto.Woodpecker/Log`
- `/proto.WoodpeckerAuth/Auth`
- ...
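
The same list can be derived mechanically. Here is a minimal Python sketch, assuming a simple `.proto` file with a single `package` statement and flat `service` blocks like the excerpt above. The helper name `grpc_routes` and the shortened `PROTO` sample are hypothetical, and a regular expression is not a full protobuf parser.

```python
import re

# Shortened sample, in the same shape as the excerpt above.
PROTO = """
package proto;

service Woodpecker {
  rpc Version (Empty) returns (VersionResponse) {}
  rpc Next (NextRequest) returns (NextResponse) {}
}

service WoodpeckerAuth {
  rpc Auth (AuthRequest) returns (AuthResponse) {}
}
"""


def grpc_routes(proto_text):
    """Return the HTTP/2 paths (/package.Service/Function) declared in proto_text."""
    package = re.search(r"^\s*package\s+([\w.]+)\s*;", proto_text, re.MULTILINE)
    prefix = package.group(1) + "." if package else ""
    routes = []
    # Assumption: a service body holds no braces other than the empty "{}" of each rpc.
    for service in re.finditer(r"service\s+(\w+)\s*\{((?:[^{}]|\{\})*)\}", proto_text):
        name, body = service.group(1), service.group(2)
        for rpc in re.finditer(r"rpc\s+(\w+)\s*\(", body):
            routes.append(f"/{prefix}{name}/{rpc.group(1)}")
    return routes


if __name__ == "__main__":
    for route in grpc_routes(PROTO):
        print(route)
```

Only the part before the last `/` matters for the reverse-proxy configuration, which is why two root routes are enough here.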
## `nginx` configuration
So here's what our `nginx` configuration might look like, using a single domain:
```
server {
    listen 80;
    http2 on;
    server_name woodpecker.example.com;

    location / {
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Host $http_host;
        proxy_pass http://127.0.0.1:8000;
        proxy_redirect off;
        proxy_http_version 1.1;
        proxy_buffering off;
        chunked_transfer_encoding off;
    }

    location /proto.Woodpecker/ {
        grpc_pass grpc://127.0.0.1:9000;
    }

    location /proto.WoodpeckerAuth/ {
        grpc_pass grpc://127.0.0.1:9000;
    }
}
```
Note that, of course, this only works if HTTP/2 is enabled by means of [the `http2` directive](https://nginx.org/en/docs/http/ngx_http_v2_module.html#http2).
There's no need to list functions individually: each `location` is a prefix match, so `nginx` routes every call using only the package and service name.
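
To illustrate why two prefix locations are enough, here is a small Python sketch of how `nginx` chooses among prefix locations: the longest prefix matching the request path wins. This is a simplified model that ignores regular-expression and exact-match locations, and the names `LOCATIONS` and `pick_upstream` are hypothetical.

```python
# Prefix locations of the single-domain configuration, mapped to their upstream.
LOCATIONS = {
    "/": "http://127.0.0.1:8000",
    "/proto.Woodpecker/": "grpc://127.0.0.1:9000",
    "/proto.WoodpeckerAuth/": "grpc://127.0.0.1:9000",
}


def pick_upstream(path, locations=LOCATIONS):
    """Return the upstream of the longest prefix location matching path."""
    matching = [prefix for prefix in locations if path.startswith(prefix)]
    if not matching:
        raise LookupError(f"no location matches {path!r}")
    return locations[max(matching, key=len)]


if __name__ == "__main__":
    for path in ("/api/user", "/proto.Woodpecker/Next", "/proto.WoodpeckerAuth/Auth"):
        print(path, "->", pick_upstream(path))
```

Every path that does not start with one of the two GRPC prefixes falls back to `/`, that is, to the web interface and REST API.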
## Timeout on the `Next` function
Woodpecker declares a `Next` function that waits for information about the next job to be performed.
This waiting time varies a lot, and can be quite long if your CI is not constantly in use.
By default, `nginx` closes any request that has been inactive for more than 60 seconds.
This causes a large number of reconnections for the Woodpecker agent, visible in the agent logs:
{"level":"error","error":"rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 504 (Gateway Timeout); transport: received unexpected content-type \"text/html\"","time":"2023-10-27T11:28:51Z","message":"grpc error: wait(): code: Unavailable: rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 504 (Gateway Timeout); transport: received unexpected content-type \"text/html\""}
The agent reconnects by itself, but we can reduce how often this happens (the whole point of the call being to do nothing until the server wakes the agent up) by adding these lines to our `nginx` configuration:
```
location /proto.Woodpecker/Next {
    grpc_pass grpc://127.0.0.1:9000;
    grpc_read_timeout 3600s;
}
```
These few lines tell `nginx` to wait for up to 1 hour of inactivity before closing the connection, only for the `Next` function of the protocol.
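
As a rough order of magnitude, here is a short Python sketch of the effect of this setting. It assumes an agent that stays idle all day and reconnects immediately after each timeout; both are assumptions for illustration, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_reconnections_per_day(read_timeout_seconds):
    """Upper bound on timeouts per day for a connection that never receives data."""
    return SECONDS_PER_DAY // read_timeout_seconds


if __name__ == "__main__":
    print(max_reconnections_per_day(60))    # default grpc_read_timeout of 60s
    print(max_reconnections_per_day(3600))  # the 3600s configured for Next
```

Under these assumptions, an idle agent goes from at most 1440 forced reconnections per day to at most 24.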
## Filter IPs that can access the GRPC service
In addition to the [`grpc_pass`](https://nginx.org/en/docs/http/ngx_http_grpc_module.html#grpc_pass) directive, you can of course use all the other directives you're used to.
For example, we might want to filter the IPs that have access to GRPC.
We could do this with [the `allow` and `deny` directives](https://nginx.org/en/docs/http/ngx_http_access_module.html):
```
location /proto.Woodpecker/ {
    allow 192.168.0.0/24;
    deny all;
    grpc_pass grpc://127.0.0.1:9000;
}
```
Since only agents use the GRPC protocol, it would be perfectly legitimate to implement such a rule.
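
The `allow` and `deny` rules are checked in order, and the first rule that matches decides. The following Python sketch models that evaluation for the block above. It is only an illustration: `RULES` and `is_allowed` are hypothetical names, and `0.0.0.0/0` stands in for `deny all`, so only IPv4 clients are covered.

```python
import ipaddress

# Rules in the same order as in the location block: the first match wins.
RULES = [
    ("allow", ipaddress.ip_network("192.168.0.0/24")),
    ("deny", ipaddress.ip_network("0.0.0.0/0")),  # stands in for "deny all" (IPv4 only)
]


def is_allowed(client_ip, rules=RULES):
    """Return True if the first rule matching client_ip is an allow rule."""
    address = ipaddress.ip_address(client_ip)
    for action, network in rules:
        if address in network:
            return action == "allow"
    return True  # access is granted when no rule matches


if __name__ == "__main__":
    for ip in ("192.168.0.42", "192.168.1.1", "203.0.113.7"):
        print(ip, "allowed" if is_allowed(ip) else "denied")
```

Swapping the two rules would deny everyone, since `deny all` would then match first.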
---
Unifying HTTP requests and GRPC calls under a single domain offers many advantages.
Not only does this approach simplify reverse-proxy configuration, it also enables more centralized, streamlined management of the service.
By the way, GRPC and protobuf are an excellent solution for transmitting structured data over the network, so don't hesitate to [take a look](https://protobuf.dev/overview/).
