Nebula mesh VPN still disappointing after 4 years

Background

My homelab network follows a simple hub-and-spoke model: mobile clients route all traffic back home through a WireGuard tunnel. In the WireGuard client, I specify OPNsense as the DNS server (dnsmasq as recursive resolver). Every device, no matter where it is, gets ad blocking and custom domain name resolution.

This setup has served me well for years. With power and internet disruptions at home becoming more frequent, and the possibility of relocating in the future, I’ve begun considering a more robust solution, one that does not make my home router a single point of failure.

For the next iteration of my homelab network, I decided to go with an overlay mesh VPN. I know, I know, everyone is using Tailscale these days, but I'd like to use an open source solution with a self-hosted control plane (or discovery node, or whatever the name is for a particular project).

I decided to revisit a cool project called nebula. In 2021, before I first deployed a VPN for my homelab, I briefly tried it, but it didn't meet my basic requirements. It had the following limitations back then:

  • didn't exist in the repositories of Linux distributions; had to be manually installed (not a big deal)
  • didn't have built-in DNS support (nice to have but meh)
  • didn't have relay support (can be problematic for devices behind CGNAT)
  • didn't have iOS/Android mobile clients (a huge show-stopper)

Now in 2025, with v1.9.6, all of the above pain points have seemingly been addressed. I was excited to give it another GO (pun intended) 😂

Implementation

Setting it up by following the nebula documentation is pretty straightforward. The package now exists in Debian, Ubuntu and pretty much all Linux/BSD repositories.

All I need to do is install the package on all nodes, set up a CA, generate and distribute certificates, and put config.yml on all nodes (except for mobile clients, which still do not support the full config).
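The CA and certificate steps can be sketched as a small helper that only prints the nebula-cert commands and does not run them. This assumes the v1.9.x nebula-cert CLI (`ca -name`, `sign -name ... -ip ...`); the host names, overlay addresses and CA name are made up for illustration.

```python
import shlex

# Hypothetical inventory: host names and overlay addresses are made up.
HOSTS = {
    "lighthouse1": "10.10.0.1/24",
    "laptop": "10.10.0.5/24",
}


def ca_command(org_name: str) -> list[str]:
    """Command that creates ca.crt and ca.key in the current directory."""
    return ["nebula-cert", "ca", "-name", org_name]


def sign_command(host: str, overlay_cidr: str) -> list[str]:
    """Command that signs <host>.crt/<host>.key with the CA in the current directory."""
    return ["nebula-cert", "sign", "-name", host, "-ip", overlay_cidr]


def plan(org_name: str) -> list[str]:
    """Return the shell-quoted commands in the order they would be run."""
    commands = [ca_command(org_name)]
    commands += [sign_command(host, cidr) for host, cidr in HOSTS.items()]
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    for line in plan("Homelab CA"):
        print(line)
```

The printed lines are meant to be reviewed and run by hand on the machine that holds the CA key. Each resulting certificate and key pair still has to be copied to its node together with ca.crt.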

Differences between nebula and Tailscale

One major difference between nebula and Tailscale is that almost all configuration (including firewall settings) is applied on each node, not on the lighthouse (control plane). It also leaves certificate management to the administrator.

For a nebula site with a handful of servers/clients, this is still manageable by hand. But once the number grows, or you have complex firewall requirements, or you just want to follow best practice by rotating certificates more frequently, it becomes a nightmare trying to keep each node up to date.

From what I can tell, nebula was and still is catered to large-scale enterprise use cases. It scales great with automation, but for a homelab environment, it is positioned quite awkwardly. I don't want to and simply don't have time to run a private CA just for it. The best I can do is probably write an Ansible playbook that uses a Jinja2 template to generate config.yml for each host. Then again, since each node is slightly different, Ansible won't really save time, as it's merely a way to centrally manage configuration.
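To give an idea of what per-host templating would look like, here is a minimal sketch using Python's built-in string.Template as a stand-in for Jinja2, so it runs without Ansible. The inventory, overlay IPs and lighthouse address are hypothetical, and only a handful of config.yml keys (pki, static_host_map, lighthouse) are shown; a real config needs more.

```python
from string import Template

# Hypothetical inventory: names, overlay IPs and roles are made up.
HOSTS = {
    "lighthouse1": {"overlay_ip": "10.10.0.1", "is_lighthouse": True},
    "laptop": {"overlay_ip": "10.10.0.5", "is_lighthouse": False},
}
LIGHTHOUSE_IP = "10.10.0.1"
LIGHTHOUSE_PUBLIC = "lighthouse.example.com:4242"  # placeholder address

# Only a fragment of config.yml; listen, firewall, etc. are left out.
CONFIG_TEMPLATE = Template("""\
pki:
  ca: /etc/nebula/ca.crt
  cert: /etc/nebula/$name.crt
  key: /etc/nebula/$name.key
static_host_map:
  "$lighthouse_ip": ["$lighthouse_public"]
lighthouse:
  am_lighthouse: $am_lighthouse
  hosts: $lighthouse_hosts
""")


def render_config(name: str) -> str:
    """Render the config fragment for one host from the shared template."""
    is_lighthouse = HOSTS[name]["is_lighthouse"]
    return CONFIG_TEMPLATE.substitute(
        name=name,
        lighthouse_ip=LIGHTHOUSE_IP,
        lighthouse_public=LIGHTHOUSE_PUBLIC,
        am_lighthouse="true" if is_lighthouse else "false",
        # The hosts list stays empty on the lighthouse itself.
        lighthouse_hosts="[]" if is_lighthouse else f'["{LIGHTHOUSE_IP}"]',
    )


if __name__ == "__main__":
    print(render_config("laptop"))
```

Even in this toy version, every per-node difference (role, firewall rules, relay settings) turns into another variable in the inventory, which is exactly the bookkeeping I was hoping to avoid.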

It takes a relatively large amount of work to maintain a relatively small setup.

Limitations as of v1.9.6

Now, will it truly live up to its promised features? Well, let’s find out.

Packages

As I mentioned before, this is not an issue anymore, although the package-supplied systemd unit "nebula@.service" still lacks documentation.

DNS

The DNS feature the lighthouse provides is very primitive, as it only dynamically resolves nodes' names (not even the lighthouses themselves, see limitations).

Also, config.yml doesn't allow overriding the operating system's DNS resolver. This is less of an issue on Linux/BSD or even Windows/Mac, but can absolutely be a show-stopper on mobile clients.

On iOS, since I cannot customize the DNS server (well, I can do it for every Wi-Fi connection, but not for the cellular data connection), I must rely on the VPN client to override DNS. The WireGuard iOS client does this, Tailscale does too, and so does every major commercial VPN client. BUT THIS IS NOT POSSIBLE ON NEBULA!!! (issue#9)

iOS client

On my servers and Linux desktop, things are pretty smooth once set up. Not so much on iOS. First off, you cannot import a configuration, period. It has to be entered manually. Configuration options are also very limited compared to the desktop versions. The following options (that are important to me) are not supported on iOS:

You cannot upload an already signed device certificate (issue#20). You have to export the device-generated public key (not a CSR), sign it with the CA, then upload the device cert. Why does it have to be so quirky?
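For reference, the signing step for a device-generated public key can be sketched like this. It only builds the command line and does not run it, and it assumes the `-in-pub` flag of the v1.9.x nebula-cert CLI. The device name, overlay address and file name are hypothetical.

```python
import shlex


def sign_device_pubkey(name: str, overlay_cidr: str, pub_path: str) -> list[str]:
    """Build the nebula-cert command that signs an exported public key.

    With -in-pub, nebula-cert signs the given key instead of generating
    a new key pair, so no private key is written on the CA machine.
    """
    return [
        "nebula-cert", "sign",
        "-name", name,
        "-ip", overlay_cidr,
        "-in-pub", pub_path,
    ]


if __name__ == "__main__":
    # Hypothetical values: the key file is whatever the iOS app exported.
    print(shlex.join(sign_device_pubkey("iphone", "10.10.0.20/24", "iphone.pub")))
```

The resulting certificate then has to be brought back to the phone and loaded in the app.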

Oh, last but not least, you cannot make the VPN connection stay always-on (issue#49), ugh!

Conclusion

After spending two nights testing nebula v1.9.6 and its corresponding iOS client, I have to give it a hard pass, again, after 4 years.

It seems like this project is not aimed at small organisations and hobbyists. Maybe performance at very large scale is its selling point? I don't know, but it doesn't win me over as a potential user.

With overlay mesh VPNs having become so ubiquitous in 2025, I find it baffling that nebula still lacks many basic features compared to Tailscale/Headscale, ZeroTier, Netbird, you name it.

Anyway, moving forward, I will test Headscale and report back after some time.

OPNsense zpool upgrade

TLDR: I enabled ZFS feature flags on the boot pool of OPNsense (out of ignorance), and had to update the UEFI boot code in order not to "brick" it.
I want to document this unsettling experience for anyone who has walked the same path and is desperately searching for a remedy.

Background

I was doing an OPNsense major version upgrade from 25.1 to 25.7. Things went pretty smoothly and I did some post-upgrade checks. One of the checks was zpool status -v, and I discovered that there were new feature flags that could be enabled for the ZFS pool.

Story

Without thinking too much (read: at all), I went ahead and did zpool upgrade -a. Here is the output:

root@OPNsense:/home/ewon # zpool upgrade -a
This system supports ZFS pool feature flags.

Enabled the following features on 'zroot':
  edonr
  zilsaxattr
  head_errlog
  blake3
  block_cloning
  vdev_zaps_v2

Pool 'zroot' has the bootfs property set, you might need to update
the boot code. See gptzfsboot(8) and loader.efi(8) for details.
root@OPNsense:/home/ewon #

The seemingly casual sentence "you might need to update the boot code" caught my attention. I went searching for this and discovered this forum post. I felt a cold shiver run down my spine and broke into a sweat. If I hadn't caught this, the next reboot would have sent my home network to hell, literally.
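The warning is tied to the pool's bootfs property, so a check before upgrading would have saved me the scare. Below is a minimal sketch that parses the output of `zpool get -H -o name,value bootfs` (tab-separated, with "-" meaning the property is not set) and lists the pools that are boot pools. The sample text and the "tank" pool are made up for illustration.

```python
def pools_with_bootfs(zpool_get_output: str) -> list[str]:
    """Return the names of pools whose bootfs property is set.

    Expects the output of `zpool get -H -o name,value bootfs`:
    one pool per line, name and value separated by a tab.
    """
    boot_pools = []
    for line in zpool_get_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, value = line.split("\t", 1)
        if value.strip() != "-":  # "-" means bootfs is not set
            boot_pools.append(name)
    return boot_pools


if __name__ == "__main__":
    # Made-up sample: one boot pool and one hypothetical data pool.
    sample = "zroot\tzroot/ROOT/default\ntank\t-\n"
    for pool in pools_with_bootfs(sample):
        print(f"{pool}: bootfs is set, update the boot code if you upgrade it")
```

Any pool this reports should only be upgraded if you are ready to update the boot code right afterwards.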

Fix

Luckily, by following what people shared in the post and updating the UEFI boot code, I was able to avert a crisis.

cp /boot/loader.efi /boot/efi/efi/boot/bootx64.efi
cp /boot/loader.efi /boot/efi/efi/freebsd/loader.efi

If your machine boots in BIOS mode, run the following instead (adjust the partition index and disk name to match your system):

gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 2 da0

From now on, I won't do zpool upgrade on OPNsense. It should be left alone as a network appliance, not a storage server.