
Writing asm for eBPF programs with the Go eBPF library

I’ve recently been working on a Go program that uses the Cilium eBPF library. While doing that, I made extensive use of its asm package, so I thought I’d share my experience of writing programs with it.

In case you don’t know, the asm package has a Go API that lets you write eBPF programs one instruction at a time: each call in its notation is converted directly into an instruction of the RISC-like eBPF instruction set.

You can find more about the instruction set here:

The only detail you need to know to understand this post is that eBPF has ten general purpose registers (r0 through r9), that r0 holds the return value of the last helper call, and that there is an eleventh register, r10, which is read only and is also called the stack pointer or RFP.

With that knowledge in mind, let’s see what the two notations (the one printed by the disassembler and the Go one) look like.

Let’s look at a move between two registers:

dst = src

Which with real values becomes

r6 = r1

This can also be thought about as a mov instruction.

mov dst, src

Which in the Cilium eBPF instruction Go notation becomes

asm.Mov.Reg(asm.R6, asm.R1)
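To make "converted to eBPF instructions" a bit more concrete, here is a minimal sketch of what that single move looks like once encoded. It uses only the standard library and follows the basic 8-byte layout from the eBPF instruction set (opcode, registers, offset, immediate). The names encode and movReg are made up for this illustration and are not part of the Cilium library, which does the equivalent work for you.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Opcode building blocks, as defined by the eBPF instruction set.
const (
	classALU64 = 0x07 // 64-bit arithmetic instruction class
	sourceReg  = 0x08 // second operand is a register rather than an immediate
	opMov      = 0xb0 // move operation
)

// encode packs one basic eBPF instruction into its 8-byte little-endian form:
// opcode (1 byte), src/dst registers (4 bits each), offset (2 bytes),
// immediate (4 bytes).
func encode(opcode, dst, src uint8, off int16, imm int32) [8]byte {
	var insn [8]byte
	insn[0] = opcode
	insn[1] = src<<4 | dst&0x0f // high nibble: src, low nibble: dst
	binary.LittleEndian.PutUint16(insn[2:4], uint16(off))
	binary.LittleEndian.PutUint32(insn[4:8], uint32(imm))
	return insn
}

// movReg is a hypothetical helper that encodes `dst = src`.
func movReg(dst, src uint8) [8]byte {
	return encode(opMov|sourceReg|classALU64, dst, src, 0, 0)
}

func main() {
	insn := movReg(6, 1) // r6 = r1
	fmt.Printf("% x\n", insn[:]) // prints: bf 16 00 00 00 00 00 00
}
```

Those are the raw bytes llvm-objdump would show for `r6 = r1` if you dropped the --no-show-raw-insn flag.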

With this in mind, and looking at the spec, we can write a whole eBPF program this way. But wait: writing an entire program from scratch might not be the best idea if it’s your first time.

So how do you get some guidance? Clang can help! Let’s see how, taking this program called myprog.c as an example:

#include <linux/bpf.h>
#include <linux/ptrace.h>
#include <linux/version.h>
#include <bpf/bpf_helpers.h>

#define MAX_CPUS 128

struct {
  __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
  __uint(key_size, sizeof(int));
  __uint(value_size, sizeof(__u32));
  __uint(max_entries, MAX_CPUS);
} pids_map SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_read")
int my_program(void *ctx) {
  __u64 pid;
  pid = bpf_get_current_pid_tgid() >> 32;

  bpf_perf_event_output(ctx, &pids_map, BPF_F_CURRENT_CPU, &pid,
                        sizeof(pid));

  return 0;
}

char _license[] SEC("license") = "GPL";

This program writes the PID of the current process to the pids_map perf event array every time that process enters a read syscall.

To compile that program you need libbpf installed (it provides bpf/bpf_helpers.h); then you can compile it with clang:
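The `>> 32` in the C code is there because bpf_get_current_pid_tgid packs two IDs into one 64-bit value: the upper 32 bits hold the tgid (what user space calls the PID) and the lower 32 bits hold the kernel's per-thread ID. A small Go sketch of the same arithmetic, with a made-up function name and made-up sample values:

```go
package main

import "fmt"

// splitPidTgid splits the value returned by bpf_get_current_pid_tgid
// into its two halves. The function name is only for illustration.
func splitPidTgid(pidTgid uint64) (tgid, tid uint32) {
	tgid = uint32(pidTgid >> 32) // upper 32 bits: process ID as seen from user space
	tid = uint32(pidTgid)        // lower 32 bits: thread ID
	return tgid, tid
}

func main() {
	// Pretend the helper returned tgid 1234 and thread ID 5678.
	raw := uint64(1234)<<32 | 5678
	tgid, tid := splitPidTgid(raw)
	fmt.Println(tgid, tid) // prints: 1234 5678
}
```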

clang -O2 -target bpf -g -c myprog.c

This generates an ELF file called myprog.o. You can then disassemble it to get our familiar instructions:

llvm-objdump -d -S --no-show-raw-insn --symbolize-operands myprog.o

The result will be something like:

myprog.o:	file format elf64-bpf


Disassembly of section tracepoint/syscalls/sys_enter_read:

0000000000000000 <my_program>:
; int my_program(void *ctx) {
       0:	r6 = r1
;   pid = bpf_get_current_pid_tgid() >> 32;
       1:	call 14
       2:	r0 >>= 32
       3:	*(u64 *)(r10 - 8) = r0
       4:	r4 = r10
       5:	r4 += -8
;   bpf_perf_event_output(ctx, &pids_map, BPF_F_CURRENT_CPU, &pid,
       6:	r1 = r6
       7:	r2 = 0 ll
       9:	r3 = 4294967295 ll
      11:	r5 = 8
      12:	call 25
;   return 0;
      13:	r0 = 0
      14:	exit

Oh, amazing! Instruction 0 is exactly what we just wrote before. Let’s look at the others!

This is a bit involved. To make it easier to understand, above each Go instruction I added the corresponding notation as it comes out of llvm-objdump, as well as a comment explaining the important parts.

progSpec.Instructions = asm.Instructions{
	// Move the value of r1 to r6
	// this is done because r1 is `void *ctx` in the tracepoint.
	// ctx needs to be passed to the bpf_perf_event_output function later, so we want
	// to save it in r6.
	// r6 = r1
	asm.Mov.Reg(asm.R6, asm.R1),

	// Call helper number 14, `bpf_get_current_pid_tgid`.
	// Its return value lands in r0.
	// call 14
	asm.FnGetCurrentPidTgid.Call(),

	// The PID sits in the upper 32 bits of the return value,
	// so shift r0 right by 32 bits: this is `pid = r0 >> 32`.
	// r0 >>= 32
	asm.RSh.Imm(asm.R0, 32),

	// asm.RFP is the same as asm.R10, the stack pointer.
	// We want to store the value of the pid on the stack at r10 - 8.
	// DWord means that we want to store a 64-bit value.
	// *(u64 *)(r10 - 8) = r0
	asm.StoreMem(asm.RFP, -8, asm.R0, asm.DWord),

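	// Copy r10 into r4 and subtract 8, so that r4 points at the pid we just
	// stored on the stack. r4 is the fourth argument to bpf_perf_event_output.
	// r4 = r10
	asm.Mov.Reg(asm.R4, asm.RFP),

	// r4 += -8
	asm.Add.Imm(asm.R4, -8),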

	// move r6 back to r1 so that we can use `ctx` as first argument to bpf_perf_event_output
	// r1 = r6
	asm.Mov.Reg(asm.R1, asm.R6),

	// get the address of the map from Go and assign it to r2
	// so that we can use it as second argument to bpf_perf_event_output
	// r2 = 0 ll
	asm.LoadMapPtr(asm.R2, events.FD()),
	
	// load the magic constant 0xffffffff (bpfFCurrentCPU, BPF_F_CURRENT_CPU in C)
	// into r3, the third argument to bpf_perf_event_output. It tells the helper
	// to use the entry for the CPU the program is currently running on.
	// r3 = 4294967295 ll
	asm.LoadImm(asm.R3, bpfFCurrentCPU, asm.DWord),

	// argument 5 is the size of the entry
	// r5  = 8
	asm.Mov.Imm(asm.R5, 8),

	// call the 25th helper, `bpf_perf_event_output` with the arguments above.
	// call 25
	asm.FnPerfEventOutput.Call(),

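	// Set the return value to 0. Sym("exit") labels this instruction
	// so that jumps can target it by name.
	// r0 = 0
	asm.Mov.Imm(asm.R0, 0).Sym("exit"),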

	// exit
	asm.Return(),
}
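You may have noticed that the instruction numbers in the disassembly jump from 7 to 9 to 11. That is because the `ll` loads (`r2 = 0 ll` and `r3 = 4294967295 ll`) carry a 64-bit immediate, which does not fit in the 32-bit immediate field of one instruction, so each of them takes two 8-byte slots. Here is a rough standard-library sketch of that wide encoding; loadImm64 is a name I made up for the illustration, and asm.LoadImm with asm.DWord is what produces it for you in the library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// opLoadImm64 is the opcode for loading a 64-bit immediate,
// printed with an `ll` suffix by llvm-objdump.
const opLoadImm64 = 0x18

// loadImm64 encodes `dst = value ll` as two consecutive 8-byte slots.
func loadImm64(dst uint8, value uint64) [16]byte {
	var insn [16]byte
	insn[0] = opLoadImm64
	insn[1] = dst & 0x0f
	// First slot: the immediate field holds the low 32 bits.
	binary.LittleEndian.PutUint32(insn[4:8], uint32(value))
	// Second slot: opcode and registers stay zero,
	// the immediate field holds the high 32 bits.
	binary.LittleEndian.PutUint32(insn[12:16], uint32(value>>32))
	return insn
}

func main() {
	insn := loadImm64(3, 0xffffffff) // r3 = 4294967295 ll
	fmt.Printf("% x\n", insn[:8])    // prints: 18 03 00 00 ff ff ff ff
	fmt.Printf("% x\n", insn[8:])    // prints: 00 00 00 00 00 00 00 00
}
```

The map load (`r2 = 0 ll`) uses the same two-slot form. The immediate is 0 in the object file because the map's file descriptor is only known at load time, which is why the Go version passes events.FD() instead.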

To confirm the arguments of a helper when in doubt, you can grep the kernel’s source code for [helper_name]_proto, for example bpf_perf_event_output_proto.

You’ll find something like this:

static const struct bpf_func_proto bpf_perf_event_output_proto = {
	.func		= bpf_perf_event_output,
	.gpl_only	= true,
	.ret_type	= RET_INTEGER,
	.arg1_type	= ARG_PTR_TO_CTX,
	.arg2_type	= ARG_CONST_MAP_PTR,
	.arg3_type	= ARG_ANYTHING,
	.arg4_type	= ARG_PTR_TO_MEM,
	.arg5_type	= ARG_CONST_SIZE_OR_ZERO,
};
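Helper arguments are passed in registers r1 to r5, in order, so each argN_type above lines up with one of the registers we filled before `call 25`. The sketch below just writes that mapping down as data. The struct and function names are made up for the illustration, and the descriptions are my own summary of the program in this post.

```go
package main

import "fmt"

// helperArg pairs one register with the argument type the kernel expects in it.
type helperArg struct {
	Register string // register the argument is passed in
	ArgType  string // type taken from bpf_perf_event_output_proto
	Value    string // what the program in this post puts there
}

// perfEventOutputArgs lists the five arguments of bpf_perf_event_output.
func perfEventOutputArgs() []helperArg {
	return []helperArg{
		{"r1", "ARG_PTR_TO_CTX", "ctx, saved in r6 at the start"},
		{"r2", "ARG_CONST_MAP_PTR", "pointer to pids_map"},
		{"r3", "ARG_ANYTHING", "BPF_F_CURRENT_CPU (0xffffffff)"},
		{"r4", "ARG_PTR_TO_MEM", "r10 - 8, where the pid was stored"},
		{"r5", "ARG_CONST_SIZE_OR_ZERO", "8, the size of the pid in bytes"},
	}
}

func main() {
	for _, a := range perfEventOutputArgs() {
		fmt.Printf("%s  %-22s  %s\n", a.Register, a.ArgType, a.Value)
	}
}
```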

Now that we have the instructions explained, we can assemble everything in a Go program!

For this one, I took inspiration from the examples folder in the Cilium eBPF library.

package main

import (
	"encoding/binary"
	"fmt"
	"log"
	"os"
	"os/signal"
	"syscall"

	"golang.org/x/sys/unix"

	"github.com/cilium/ebpf"
	"github.com/cilium/ebpf/asm"
	"github.com/cilium/ebpf/link"
	"github.com/cilium/ebpf/perf"
)

const bpfFCurrentCPU = 0xffffffff

var progSpec = &ebpf.ProgramSpec{
	Name:    "my_program",
	Type:    ebpf.TracePoint,
	License: "GPL",
}

func main() {
	stopper := make(chan os.Signal, 1)
	signal.Notify(stopper, os.Interrupt, syscall.SIGTERM)

	if err := unix.Setrlimit(unix.RLIMIT_MEMLOCK, &unix.Rlimit{
		Cur: unix.RLIM_INFINITY,
		Max: unix.RLIM_INFINITY,
	}); err != nil {
		log.Fatalf("setting temporary rlimit: %s", err)
	}

	events, err := ebpf.NewMap(&ebpf.MapSpec{
		Type: ebpf.PerfEventArray,
		Name: "pids_map",
	})
	if err != nil {
		log.Fatalf("creating perf event array: %s", err)
	}

	defer events.Close()

	rd, err := perf.NewReader(events, os.Getpagesize())
	if err != nil {
		log.Fatalf("creating event reader: %s", err)
	}
	defer rd.Close()

	go func() {
		<-stopper
		rd.Close()
	}()

	progSpec.Instructions = asm.Instructions{
		// r6 = r1
		asm.Mov.Reg(asm.R6, asm.R1),

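		// Not part of the walkthrough above: load the 8 bytes at ctx + 16.
		// Assuming the usual sys_enter_read tracepoint layout, this is the
		// first syscall argument (the file descriptor).
		// r1 = *(u64 *)(r6 + 16)
		asm.LoadMem(asm.R1, asm.R6, 16, asm.DWord),

		// Jump to the "exit" label unless that value is 0,
		// so only reads from file descriptor 0 produce an event.
		// if r1 != 0 goto +11 <LBB0_2>
		asm.JNE.Imm(asm.R1, 0, "exit"),

		// call 14
		asm.FnGetCurrentPidTgid.Call(),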

		// r0 >>= 32
		asm.RSh.Imm(asm.R0, 32),

		// *(u64 *)(r10 - 8) = r0
		asm.StoreMem(asm.RFP, -8, asm.R0, asm.DWord),

		// r4 = r10
		asm.Mov.Reg(asm.R4, asm.RFP),

		// r4 += -8
		asm.Add.Imm(asm.R4, -8),

		// r1 = r6
		asm.Mov.Reg(asm.R1, asm.R6),

		// r2 = 0 ll
		asm.LoadMapPtr(asm.R2, events.FD()),

		// r3 = 4294967295 ll
		asm.LoadImm(asm.R3, bpfFCurrentCPU, asm.DWord),

		// r5  = 8
		asm.Mov.Imm(asm.R5, 8),

		// call 25
		asm.FnPerfEventOutput.Call(),

		// r0 = 0
		asm.Mov.Imm(asm.R0, 0).Sym("exit"),

		// exit
		asm.Return(),
	}

	prog, err := ebpf.NewProgram(progSpec)
	if err != nil {
		log.Fatalf("creating ebpf program: %s", err)
	}
	defer prog.Close()

	tp, err := link.Tracepoint("syscalls", "sys_enter_read", prog)
	if err != nil {
		log.Fatalf("opening tracepoint: %s", err)
	}
	defer tp.Close()

	log.Println("Waiting for events..")

	for {
		record, err := rd.Read()
		if err != nil {
			if perf.IsClosed(err) {
				log.Println("Received signal, exiting..")
				return
			}
			log.Fatalf("reading from reader: %s", err)
		}

		data := binary.LittleEndian.Uint64(record.RawSample)
		fmt.Println(data)
	}
}

Now save it as main.go in a folder you like and compile it.

go mod init github.com/fntlnz/bpf-go-asm-example
go mod tidy
go mod verify
go build -o example .

Once compiled, you will have a Go binary called example which prints the PID of processes doing a read syscall. Note that the full program starts with two instructions that are not in the walkthrough (the load from ctx + 16 and the jump to exit): as written, they skip every read whose first argument is not 0, so remove them if you want to see every read. You can also change sys_enter_read to any syscall tracepoint you like if you want to experiment a little bit.

I hope this was useful. My reason for doing this was to explore portability and writing small composable pieces purely in Go, without shipping other toolchains. When I started, I was not sure which instruction did what, so I thought that sharing my experience here would be useful to others!
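One thing to keep in mind about the read loop: it decodes record.RawSample directly, which panics if the sample is ever shorter than 8 bytes, and it assumes a little-endian machine. Here is a slightly more defensive sketch of that decoding step using only the standard library. decodePID is a name I made up, and the sample bytes in main are fabricated for the example; they include trailing bytes on purpose, since a raw sample can be longer than the 8 bytes we wrote.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodePID reads the 64-bit value our eBPF program wrote at the start
// of a raw sample and returns it as a PID. It assumes a little-endian host.
func decodePID(rawSample []byte) (uint32, error) {
	if len(rawSample) < 8 {
		return 0, fmt.Errorf("sample too short: got %d bytes, want at least 8", len(rawSample))
	}
	// Only the first 8 bytes are ours; anything after them is ignored.
	return uint32(binary.LittleEndian.Uint64(rawSample[:8])), nil
}

func main() {
	// A fabricated sample: PID 1234 (0x04d2) followed by four extra bytes.
	sample := []byte{0xd2, 0x04, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
	pid, err := decodePID(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(pid) // prints: 1234
}
```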

⚡ Follow me on Twitter @fntlnz