DIY Linux Router Part 3: PPPoE and Routing

The following is the third part of a multi-part series describing how I build my own Linux router (software, not hardware) from scratch, based on Debian 11.
- Part 1: Hardware
- Part 2: Interfaces, DHCP and VLAN
- Part 4: Firewall and Port Forwards
- Part 5: DNS with Unbound
- Part 6: WireGuard VPN
- Part 7: WiFi
- Part 8: NetFlow / IPFIX
Now that our interfaces are set up, we will connect the router to the DSL modem and enable it to route packets from our clients to the internet.
Modem
My DrayTek modem is configured in modem/bridge mode, which means it will not establish a PPPoE connection itself but instead bridges the DSL link through to the WAN interface of my router.

The modem also takes care of the VLAN tag (ID 7) that the DSL connection operates on, which means we don't need to set up VLAN 7 on the WAN interface of the router itself.

PPPoE
As mentioned in the last post, systemd-networkd cannot establish a PPPoE connection, and there are currently no plans to add support for it. But there is a commonly used piece of software for Linux called pppd that we will use. Under Debian we install the package pppoe, which pulls in pppd and the PPPoE support we need.
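On Debian this is a single package install (run as root, or prefix with sudo):
apt install pppoe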
Configuration is done through the /etc/ppp/peers/provider file. The name of the file can be anything you want; we just have to pass the correct name when starting pppd later. The content of this file is still a bit of a mystery to me, since I'm not perfectly familiar with all the DSL terminology. Currently it looks like this:
noipdefault
defaultroute
hide-password
lcp-echo-interval 20
lcp-echo-failure 3
connect /bin/true
noauth
persist
noaccomp
default-asyncmap
plugin rp-pppoe.so
nic-wan
user "XXXXXXXXXXXXXXXXXXXXXXXX#[email protected]"
nodetach
persist
debug
Using man pppd you can read about all of these options in more detail. The basics are as follows:
nic-wan: This configures the physical interface that pppd should use. In our case it's the interface we renamed to wan.
user: This is my username for my Telekom DSL provider ("Anschlusskennung" + "Zugangsnummer" + "0001" + "@t-online.de"). The password is stored in another file called chap-secrets inside the /etc/ppp/ directory, with one entry per line in the format "user" * "password":
"XXXXXXXXXXXXXXXXXXXXXXXX#[email protected]" * "YYYYYYYY"
A short note on this file's permissions follows the option list below.
defaultroute: After establishing the PPPoE connection, the daemon will create a default route that sends all traffic through the ppp0 interface.
persist: Do not exit the daemon after the connection is terminated; instead, keep trying to reopen it. We run pppd as a systemd service and want it to keep running, so when the DSL connection drops, the daemon will automatically try to reestablish it.
nodetach: Do not detach (fork) into the background, since we run pppd as a systemd service and want it to stay in the foreground.
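Since chap-secrets contains the plaintext password, the file should only be readable by root. The Debian ppp package should already install it with restrictive permissions, but it does not hurt to check:
chmod 600 /etc/ppp/chap-secrets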
Inside the /etc/ppp/ip-up.d/ directory there are a few scripts that are executed whenever pppd establishes a connection. They modify DNS settings, for example, which I don't want, so I delete all of them. I then add one simple script that restarts my firewall whenever a new DSL connection is made:
#!/bin/bash
systemctl restart firewall
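Debian runs the scripts in this directory via run-parts, which by default only executes files that are executable and whose names contain only letters, digits, underscores and hyphens (so no .sh extension). Assuming you saved the script as restart-firewall (a name chosen just for this example):
chmod +x /etc/ppp/ip-up.d/restart-firewall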
This is needed because long-running connections through the NAT seem to break and some websites / services stop working. Restarting my iptables firewall fixes these problems. The next blog post will go into detail about how I set up my firewall.
The last thing we need to do is run pppd. I created a systemd service for this with as much sandboxing as possible. Since pppd runs as root, restricting its permissions is a good idea.
Create a file called pppoe.service inside /etc/systemd/system/:
[Unit]
Description=Connect DSL
After=network.target
[Service]
Type=exec
ExecStart=/usr/sbin/pppd call provider
StandardOutput=null
Restart=always
RestartSec=10s
# filesystem access
ProtectSystem=strict
ReadWritePaths=/run/
PrivateTmp=true
ProtectControlGroups=true
ProtectKernelModules=true
ProtectKernelTunables=true
# network
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_PPPOX AF_PACKET AF_NETLINK
# misc
NoNewPrivileges=true
RestrictRealtime=true
MemoryDenyWriteExecute=true
ProtectKernelLogs=true
LockPersonality=true
ProtectHostname=true
RemoveIPC=true
RestrictSUIDSGID=true
RestrictNamespaces=true
# capabilities
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_RAW
SystemCallFilter=@system-service
SystemCallErrorNumber=EPERM
[Install]
WantedBy=multi-user.target
In the ExecStart command you can see that we call our file named provider; if you named your peers file differently, change it here. We disable standard output because pppd writes its logs both to standard output and to the journal, so without this every log message would show up twice.
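Whenever you create or change a unit file, systemd needs to reload its configuration before it sees the new service:
systemctl daemon-reload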
Now we enable and start the service with systemctl enable --now pppoe. If everything works, you should see something like this in the journal:
pppd[68655]: Plugin rp-pppoe.so loaded.
pppd[68655]: pppd 2.4.9 started by root, uid 0
pppd[68655]: PPP session is 2155
pppd[68655]: Using interface ppp0
pppd[68655]: Connect: ppp0 <--> wan
pppd[68655]: PAP authentication succeeded
pppd[68655]: peer from calling number DC:38:E1:F9:BD:CC authorized
pppd[68655]: local IP address 91.51.139.196
pppd[68655]: remote IP address 62.155.244.49
pppd[68655]: local LL address fe80::ac1e:bf86:945b:c2bf
pppd[68655]: remote LL address fe80::de38:e1ff:fef9:bdcc
And your newly created interface should look like this:

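You can also inspect the interface and its addresses directly with ip:
ip addr show ppp0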
MSS Clamping
When using a DSL connection it can happen that your clients send TCP packets that are bigger than your DSL provider's router allows. In a perfect world your provider would send you an ICMP message indicating that the packet was too big, and your client could adjust the packet size. This is called Path MTU Discovery, but since we do not live in a perfect world, it does not always work and can lead to hanging connections.
A fix for this is to reduce the packet size on the router and never send packets that are too big. This is called MSS Clamping. A nice explanation can be found here: https://samuel.kadolph.com/2015/02/mtu-and-tcp-mss-when-using-pppoe-2/
We achieve this by using the iptables TCPMSS target together with the --clamp-mss-to-pmtu option:
iptables -t mangle -I FORWARD -o ppp0 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
The --clamp-mss-to-pmtu option takes the interface's MTU (1492 bytes on ppp0), subtracts 40 bytes, and sets the MSS to the resulting value; in our case 1452 bytes.
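If you prefer a fixed value instead of deriving it from the MTU, the TCPMSS target also accepts --set-mss; a variant using the 1452 bytes calculated above:
iptables -t mangle -I FORWARD -o ppp0 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1452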
Your router should now be able to connect to the internet itself. For the clients behind the router we need some additional configuration.
Routing
We now need to allow the kernel to forward IP packets. Without this, packets that arrive at one of the Ethernet interfaces with a destination address that is not the router itself will be dropped.
To do this you only need to set one sysctl parameter. To make the change permanent, edit /etc/sysctl.conf and add the following line to it:
net.ipv4.ip_forward=1
Now either run sudo sysctl --system or reboot your system for the change to take effect.
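You can verify that forwarding is enabled; the following should print net.ipv4.ip_forward = 1:
sysctl net.ipv4.ip_forward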
NAT / Masquerade
Your router will now route traffic from your clients to your DSL provider, but the destinations your clients try to communicate with would receive packets with a source IP address from the 192.168.144.0/24 network and would not be able to send a reply back to that address. We need to masquerade the clients' local IP addresses behind the public IP address of the router's ppp0 interface.
We again use iptables for this:
iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE
The router will now replace the source address of outgoing packets with its own public IP address. When it receives an answer, it forwards it to the client that requested it.
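You can check that the rule is in place and watch its packet counters with:
iptables -t nat -L POSTROUTING -v -n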
Bufferbloat / congestion control
To get the most out of your internet connection, the following sysctl options are recommended.
net.core.default_qdisc=fq_codel
Using fq_codel for active queue management can help reduce bufferbloat. Bufferbloat can lead to increased latency when your connection is saturated. It is less of a problem for me with 200/45 Mbit/s, but it is essential for slower DSL connections. More about bufferbloat: https://www.bufferbloat.net/projects/
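You can check which qdisc an interface is actually using with tc:
tc qdisc show dev ppp0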
net.ipv4.tcp_congestion_control=bbr
net.ipv4.tcp_notsent_lowat=16384
The new bbr congestion control algorithm from Google can also decrease latency and increase throughput. There is a nice explanation on Cloudflare's blog about it, much better than what I could write: https://blog.cloudflare.com/http-2-prioritization-with-nginx/
In the Cloudflare blog post they use fq for active queue management, because bbr was not compatible with fq_codel at the time. Since kernel 4.13, bbr and fq_codel can be used together.
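On Debian, bbr ships as a kernel module (tcp_bbr). Loading it and listing the available congestion control algorithms confirms that it can be used:
modprobe tcp_bbr
sysctl net.ipv4.tcp_available_congestion_control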
You can test whether these options bring a meaningful performance improvement here: http://www.dslreports.com/speedtest
Up next: Firewall and Port Forwards