nftables

From ArchWiki

nftables is a netfilter project that aims to replace the existing {ip,ip6,arp,eb}tables framework. It provides a new packet filtering framework, a new user-space utility (nft), and a compatibility layer for {ip,ip6}tables. It uses the existing hooks, connection tracking system, user-space queueing component, and logging subsystem of netfilter.

It consists of three main components: a kernel implementation, the libnl netlink communication and the nftables user-space front-end. The kernel provides a netlink configuration interface, as well as run-time rule-set evaluation, libnl contains the low-level functions for communicating with the kernel, and the nftables front-end is what the user interacts with via nft.

You can also visit the official nftables wiki page for more information.

Installation

Tango-view-fullscreen.pngThis article or section needs expansion.Tango-view-fullscreen.png

Reason: Mention iptables-nft.[1] (Discuss in Talk:Nftables)

Install the userspace utilities package nftables or the git version nftables-gitAUR.

Tip: Most iptables front-ends feature no direct or indirect support of nftables, but may introduce it.[2] One graphical front-end that supports both, nftables and iptables, is firewalld.[3]

Usage

nftables makes no distinction between temporary rules made in the command line and permanent ones loaded from or saved to a file.

All rules have to be created or loaded using nft command line utility.

Refer to #Configuration section on how to use.

Current ruleset can be printed with:

# nft list ruleset

Simple firewall

nftables comes with a simple and secure firewall configuration stored in the /etc/nftables.conf file.

The nftables.service will load rules from that file when started or enabled.

Configuration

nftables user-space utility nft performs most of the rule-set evaluation before handing rule-sets to the kernel. Rules are stored in chains, which in turn are stored in tables. The following sections indicate how to create and modify these constructs.

To read input from a file use the -f/--file option:

# nft --file filename

Note that any rules already loaded are not automatically flushed.

See nft(8) for a complete list of all commands.

Tables

Tables hold #Chains. Unlike tables in iptables, there are no built-in tables in nftables. The number of tables and their names is up to the user. However, each table only has one address family and only applies to packets of this family. Tables can have one of five families specified:

nftables family iptables utility
ip iptables
ip6 ip6tables
inet iptables and ip6tables
arp arptables
bridge ebtables

ip (i.e. IPv4) is the default family and will be used if family is not specified.

To create one rule that applies to both IPv4 and IPv6, use inet. inet allows for the unification of the ip and ip6 families to make defining rules for both easier.

See nft(8) § ADDRESS FAMILIES for a complete description of address families.

In all of the following, family_type is optional, and if not specified is set to ip.

Create table

The following adds a new table:

# nft add table family_type table_name

List tables

To list all tables:

# nft list tables

List chains and rules in a table

To list all chains and rules of a specified table do:

# nft list table family_type table_name

For example, to list all the rules of the my_table table of the inet family:

# nft list table inet my_table

Delete table

To delete a table do:

# nft delete table family_type table_name

This will destroy all chains in the table.

Flush table

To flush all rules from a table do:

# nft flush table family_type table_name

Chains

The purpose of chains is to hold #Rules. Unlike chains in iptables, there are no built-in chains in nftables. This means that if no chain uses any types or hooks in the netfilter framework, packets that would flow through those chains will not be touched by nftables, unlike iptables.

Chains have two types. A base chain is an entry point for packets from the networking stack, where a hook value is specified. A regular chain may be used as a jump target for better organization.

In all of the following family_type is optional, and if not specified is set to ip.

Create chain

Regular chain

The following adds a regular chain named chain_name to the table named table_name:

# nft add chain family_type table_name chain_name

For example, to add a regular chain named my_tcp_chain to the my_table table of the inet address family do:

# nft add chain inet my_table my_tcp_chain
Base chain

To add a base chain specify hook and priority values:

# nft add chain family_type table_name chain_name '{ type chain_type hook hook_type priority priority_value ; }'

chain_type can be filter, route, or nat.

For IPv4/IPv6/Inet address families hook_type can be prerouting, input, forward, output, or postrouting. See nft(8) § ADDRESS FAMILIES for a list of hooks for other families.

priority_value takes either a priority name or an integer value. See nft(8) § CHAINS for a list of standard priority names and values. Chains with lower numbers are processed first and can be negative. [4]

For example, to add a base chain that filters input packets:

# nft add chain inet my_table my_chain '{ type filter hook input priority 0; }'

Replace add with create in any of the above to add a new chain but return an error if the chain already exists.

List rules

The following lists all rules of a chain:

# nft list chain family_type table_name chain_name

For example, the following lists the rules of the chain named my_output in the inet table named my_table:

# nft list chain inet my_table my_output

Edit a chain

To edit a chain, simply call it by its name and define the rules you want to change.

# nft chain family_type table_name chain_name '{ [ type chain_type hook hook_type device device_name priority priority_value ; policy policy_type ; ] }'

For example, to change the my_input chain policy of the default table from accept to drop

# nft chain inet my_table my_input '{ policy drop ; }'

Delete a chain

To delete a chain do:

# nft delete chain family_type table_name chain_name

The chain must not contain any rules or be a jump target.

Flush rules from a chain

To flush rules from a chain do:

# nft flush chain family_type table_name chain_name

Rules

Rules are either constructed from expressions or statements and are contained within chains.

Add rule

Tip: The iptables-translate utility translates iptables rules to nftables format.

To add a rule to a chain do:

# nft add rule family_type table_name chain_name handle handle_value statement

The rule is appended at handle_value, which is optional. If not specified, the rule is appended to the end of the chain.

To prepend the rule to the position do:

# nft insert rule family_type table_name chain_name handle handle_value statement

If handle_value is not specified, the rule is prepended to the chain.

Expressions

Typically a statement includes some expression to be matched and then a verdict statement. Verdict statements include accept, drop, queue, continue, return, jump chain_name, and goto chain_name. Other statements than verdict statements are possible. See nft(8) for more information.

There are various expressions available in nftables and, for the most part, coincide with their iptables counterparts. The most noticeable difference is that there are no generic or implicit matches. A generic match was one that was always available, such as --protocol or --source. Implicit matches were protocol-specific, such as --sport when a packet was determined to be TCP.

The following is an incomplete list of the matches available:

  • meta (meta properties, e.g. interfaces)
  • icmp (ICMP protocol)
  • icmpv6 (ICMPv6 protocol)
  • ip (IP protocol)
  • ip6 (IPv6 protocol)
  • tcp (TCP protocol)
  • udp (UDP protocol)
  • sctp (SCTP protocol)
  • ct (connection tracking)

The following is an incomplete list of match arguments (for a more complete list, see nft(8)):

meta:
  oif <output interface INDEX>
  iif <input interface INDEX>
  oifname <output interface NAME>
  iifname <input interface NAME>

  (oif and iif accept string arguments and are converted to interface indexes)
  (oifname and iifname are more dynamic, but slower because of string matching)

icmp:
  type <icmp type>

icmpv6:
  type <icmpv6 type>

ip:
  protocol <protocol>
  daddr <destination address>
  saddr <source address>

ip6:
  daddr <destination address>
  saddr <source address>

tcp:
  dport <destination port>
  sport <source port>

udp:
  dport <destination port>
  sport <source port>

sctp:
  dport <destination port>
  sport <source port>

ct:
  state <new | established | related | invalid>

Deletion

Individual rules can only be deleted by their handles. The nft --handle list command must be used to determine rule handles. Note the --handle switch, which tells nft to list handles in its output.

The following determines the handle for a rule and then deletes it. The --numeric argument is useful for viewing some numeric output, like unresolved IP addresses.

# nft --handle --numeric list chain inet my_table my_input
table inet my_table {
     chain input {
          type filter hook input priority 0;
          ip saddr 127.0.0.1 accept # handle 10
     }
}
# nft delete rule inet my_table my_input handle 10

All the chains in a table can be flushed with the nft flush table command. Individual chains can be flushed using either the nft flush chain or nft delete rule commands.

# nft flush table table_name
# nft flush chain family_type table_name chain_name
# nft delete rule family_type table_name chain_name

The first command flushes all of the chains in the ip table_name table. The second flushes the chain_name chain in the family_type table_name table. The third deletes all of the rules in chain_name chain in the family_type table_name table.

Sets

Sets are named or anonymous, and consist of one or more elements, separated by commas, enclosed by curly braces. Anonymous sets are embedded in rules and cannot be updated, you must delete and re-add the rule. E.g., you cannot just remove "http" from the dports set in the following:

# nft add rule ip6 filter input tcp dport {telnet, http, https} accept

Named sets can be updated, and can be typed and flagged. sshguard uses named sets for the IP addresses of blocked hosts.

table ip sshguard {
       set attackers {
               type ipv4_addr
               flags interval
               elements = { 1.2.3.4 }
       }

To add or delete elements from the set, use:

# nft add element ip sshguard attackers { 5.6.7.8/32 }
# nft delete element ip sshguard attackers { 1.2.3.4/32 }

Note the type ipv4_addr can include a CIDR netmask (the "/32" here is not necessary, but is included for completeness' sake). Note also, the set defined here by "TABLE ip sshguard { SET attackers }" is addressed as ip sshguard attackers.

Atomic reloading

Flush the current ruleset:

# echo "flush ruleset" > /tmp/nftables 

Dump the current ruleset:

# nft -s list ruleset >> /tmp/nftables

Now you can edit /tmp/nftables and apply your changes with:

# nft -f /tmp/nftables

Examples

Workstation

/etc/nftables.conf
flush ruleset

table inet my_table {
	set LANv4 {
		type ipv4_addr
		flags interval

		elements = { 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 169.254.0.0/16 }
	}
	set LANv6 {
		type ipv6_addr
		flags interval

		elements = { fd00::/8, fe80::/10 }
	}

	chain my_input_lan {
		udp sport 1900 udp dport >= 1024 meta pkttype unicast limit rate 4/second burst 20 packets accept comment "Accept UPnP IGD port mapping reply"

		udp sport netbios-ns udp dport >= 1024 meta pkttype unicast accept comment "Accept Samba Workgroup browsing replies"

	}

	chain my_input {
		type filter hook input priority filter; policy drop;

		iif lo accept comment "Accept any localhost traffic"
		ct state invalid drop comment "Drop invalid connections"
		ct state established,related accept comment "Accept traffic originated from us"

		meta l4proto ipv6-icmp accept comment "Accept ICMPv6"
		meta l4proto icmp accept comment "Accept ICMP"
		ip protocol igmp accept comment "Accept IGMP"

		udp dport mdns ip6 daddr ff02::fb accept comment "Accept mDNS"
		udp dport mdns ip daddr 224.0.0.251 accept comment "Accept mDNS"

		ip6 saddr @LANv6 jump my_input_lan comment "Connections from private IP address ranges"
		ip saddr @LANv4 jump my_input_lan comment "Connections from private IP address ranges"

		counter comment "Count any other traffic"
	}

	chain my_forward {
		type filter hook forward priority filter; policy drop;
		# Drop everything forwarded to us. We do not forward. That is routers job.
	}

	chain my_output {
		type filter hook output priority filter; policy accept;
		# Accept every outbound connection
	}

}

Server

/etc/nftables.conf
flush ruleset

table inet my_table {
	set LANv4 {
		type ipv4_addr
		flags interval

		elements = { 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 169.254.0.0/16 }
	}
	set LANv6 {
		type ipv6_addr
		flags interval

		elements = { fd00::/8, fe80::/10 }
	}

	chain my_input_lan {
		meta l4proto { tcp, udp } th dport 2049 accept comment "Accept NFS"

		udp dport netbios-ns accept comment "Accept NetBIOS Name Service (nmbd)"
		udp dport netbios-dgm accept comment "Accept NetBIOS Datagram Service (nmbd)"
		tcp dport netbios-ssn accept comment "Accept NetBIOS Session Service (smbd)"
		tcp dport microsoft-ds accept comment "Accept Microsoft Directory Service (smbd)"

		udp sport { bootpc, 4011 } udp dport { bootps, 4011 } accept comment "Accept PXE"
		udp dport tftp accept comment "Accept TFTP"
	}

	chain my_input {
		type filter hook input priority filter; policy drop;

		iif lo accept comment "Accept any localhost traffic"
		ct state invalid drop comment "Drop invalid connections"
		ct state established,related accept comment "Accept traffic originated from us"

		meta l4proto ipv6-icmp accept comment "Accept ICMPv6"
		meta l4proto icmp accept comment "Accept ICMP"
		ip protocol igmp accept comment "Accept IGMP"

		udp dport mdns ip6 daddr ff02::fb accept comment "Accept mDNS"
		udp dport mdns ip daddr 224.0.0.251 accept comment "Accept mDNS"

		ip6 saddr @LANv6 jump my_input_lan comment "Connections from private IP address ranges"
		ip saddr @LANv4 jump my_input_lan comment "Connections from private IP address ranges"

		tcp dport ssh accept comment "Accept SSH on port 22"

		tcp dport ipp accept comment "Accept IPP/IPPS on port 631"

		tcp dport { http, https, 8008, 8080 } accept comment "Accept HTTP (ports 80, 443, 8008, 8080)"

		udp sport bootpc udp dport bootps ip saddr 0.0.0.0 ip daddr 255.255.255.255 accept comment "Accept DHCPDISCOVER (for DHCP-Proxy)"
	}

	chain my_forward {
		type filter hook forward priority filter; policy drop;
		# Drop everything forwarded to us. We do not forward. That is routers job.
	}

	chain my_output {
		type filter hook output priority filter; policy accept;
		# Accept every outbound connection
	}

}

Limit rate

table inet my_table {
	chain my_input {
		type filter hook input priority filter; policy drop;

		iif lo accept comment "Accept any localhost traffic"
		ct state invalid drop comment "Drop invalid connections"

		meta l4proto icmp icmp type echo-request limit rate over 10/second burst 4 packets drop comment "No ping floods"
		meta l4proto ipv6-icmp icmpv6 type echo-request limit rate over 10/second burst 4 packets drop comment "No ping floods"

		ct state established,related accept comment "Accept traffic originated from us"

		meta l4proto ipv6-icmp icmpv6 type { destination-unreachable, packet-too-big, time-exceeded, echo-reply, parameter-problem, mld-listener-query, mld-listener-report, mld-listener-reduction, nd-router-solicit, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert, ind-neighbor-solicit, ind-neighbor-advert, mld2-listener-report } accept comment "Accept ICMPv6"
		meta l4proto icmp icmp type { destination-unreachable, router-solicitation, router-advertisement, time-exceeded, parameter-problem } accept comment "Accept ICMP"
		ip protocol igmp accept comment "Accept IGMP"

		tcp dport ssh ct state new limit rate 15/minute accept comment "Avoid brute force on SSH"

	}

}

Jump

When using jumps in configuration file, it is necessary to define the target chain first. Otherwise one could end up with Error: Could not process rule: No such file or directory.

table inet my_table {
    chain web {
        tcp dport http accept
        tcp dport 8080 accept
    }
    chain my_input {
        type filter hook input priority filter;
        ip saddr 10.0.2.0/24 jump web
        drop
    }
}

Different rules for different interfaces

If your box has more than one network interface, and you would like to use different rules for different interfaces, you may want to use a "dispatching" filter chain, and then interface-specific filter chains. For example, let us assume your box acts as a home router, you want to run a web server accessible over the LAN (interface enp3s0), but not from the public internet (interface enp2s0), you may want to consider a structure like this:

table inet my_table {
  chain my_input { # this chain serves as a dispatcher
    type filter hook input priority filter;

    iif lo accept comment "always accept loopback"
    iifname enp2s0 jump my_input_public
    iifname enp3s0 jump my_input_private

    reject with icmpx port-unreachable comment "refuse traffic from all other interfaces"
  }
  chain my_input_public { # rules applicable to public interface interface
    ct state {established,related} accept
    ct state invalid drop
    udp dport bootpc accept
    tcp dport bootpc accept
    reject with icmpx port-unreachable comment "all other traffic"
  }
  chain my_input_private {
    ct state {established,related} accept
    ct state invalid drop
    udp dport bootpc accept
    tcp dport bootpc accept
    tcp port http accept
    tcp port https accept
    reject with icmpx port-unreachable comment "all other traffic"
  }
  chain my_output { # we let everything out
    type filter hook output priority filter;
    accept
  }
}

Alternatively you could choose only one iifname statement, such as for the single upstream interface, and put the default rules for all other interfaces in one place, instead of dispatching for each interface.

Masquerading

nftables has a special keyword masquerade "where the source address is automagically set to the address of the output interface" (source). This is particularly useful for situations in which the IP address of the interface is unpredictable or unstable, such as the upstream interface of routers connecting to many ISPs. Without it, the Network Address Translation rules would have to be updated every time the IP address of the interface changed.

To use it:

  • make sure masquerading is enabled in the kernel (true if you use the default kernel), otherwise during kernel configuration, set CONFIG_NFT_MASQ=m.
  • the masquerade keyword can only be used in chains of type nat.
  • masquerading is a kind of source NAT, so only works in the output path.

Example for a machine with two interfaces: LAN connected to enp3s0, and public internet connected to enp2s0:

table inet my_nat {
  chain my_masquerade {
    type nat hook postrouting priority srcnat;
    oifname "enp2s0" masquerade
  }
}

Since the table type is inet both IPv4 and IPv6 packets will be masqueraded. If you want only ipv4 packets to be masqueraded (since extra adress space of IPv6 makes NAT not required) meta nfproto ipv4 expression can be used infront of oifname "enp2s0" masquerade or the table type can be changed to ip.

NAT with port forwarding

Tango-inaccurate.pngThe factual accuracy of this article or section is disputed.Tango-inaccurate.png

Reason: I think my_postrouting chain will cause the destination computer see that connections are made by router rather than from some global IP. Also this does not masquerade outbound traffic. (Discuss in Talk:Nftables)

This example will forward ports 22 and 80 to destination_ip. You will need to set net.ipv4.ip_forward and net.ipv4.conf.wan_interface.forwarding to 1 via sysctl.

table ip my_nat {
  chain my_prerouting {
    type nat hook prerouting priority dstnat;

    tcp dport { ssh, http } dnat to destination_ip
  }

  chain my_postrouting {
    type nat hook postrouting priority srcnat;

    ip daddr destination_ip masquerade
  }
}

Count new connections per IP

Use this snippet to count HTTPS connections:

/etc/nftables.conf
table inet filter {
    set https {
        type ipv4_addr;
        flags dynamic;
        size 65536;
        timeout 60m;
    }

    chain input {
        type filter hook input priority filter;
        ct state new tcp dport 443 update @https { ip saddr counter }
    }
}

To print the counters, run nft list set inet filter https.

Dynamic blackhole

Use this snippet to drop all HTTPS connections for 1 minute from a source IP that exceeds the limit of 10/second.

/etc/nftables.conf
table inet dev {
    set blackhole {
        type ipv4_addr;
        flags dynamic, timeout;
        size 65536;
    }

    chain input {
        ct state new tcp dport 443 \
                meter flood size 128000 { ip saddr timeout 10s limit rate over 10/second } \
                add @blackhole { ip saddr timeout 1m }

        ip saddr @blackhole counter drop
    }
}

To print the blackholed IPs, run nft list set inet dev blackhole.

Tips and tricks

Saving current rule set

The output of nft list ruleset command is a valid input file for it as well. Current rule set can be saved to file and later loaded back in.

$ nft -s list ruleset | tee filename
Note: nft list does not output variable definitions, if you had any in your original file they will be lost. Any variables used in rules will be replaced by their value.

Simple stateful firewall

Tango-inaccurate.pngThe factual accuracy of this article or section is disputed.Tango-inaccurate.png

Reason: This is not a very simple firewall. I would consider what Arch Linux ships in /etc/nftables.conf simple. Recommend replacing this section with that script and give some directions on how to expand it for specific needs. (Discuss in Talk:Nftables)

See Simple stateful firewall for more information.

Single machine

Flush the current ruleset:

# nft flush ruleset

Add a table:

# nft add table inet my_table

Add the input, forward, and output base chains. The policy for input and forward will be to drop. The policy for output will be to accept.

# nft add chain inet my_table my_input '{ type filter hook input priority 0 ; policy drop ; }'
# nft add chain inet my_table my_forward '{ type filter hook forward priority 0 ; policy drop ; }'
# nft add chain inet my_table my_output '{ type filter hook output priority 0 ; policy accept ; }'

Add two regular chains that will be associated with tcp and udp:

# nft add chain inet my_table my_tcp_chain
# nft add chain inet my_table my_udp_chain

Related and established traffic will be accepted:

# nft add rule inet my_table my_input ct state related,established accept

All loopback interface traffic will be accepted:

# nft add rule inet my_table my_input iif lo accept

Drop any invalid traffic:

# nft add rule inet my_table my_input ct state invalid drop

Accept ICMP and IGMP:

# nft add rule inet my_table my_input meta l4proto ipv6-icmp accept
# nft add rule inet my_table my_input meta l4proto icmp accept
# nft add rule inet my_table my_input ip protocol igmp accept

New udp traffic will jump to the UDP chain:

# nft add rule inet my_table my_input meta l4proto udp ct state new jump my_udp_chain

New tcp traffic will jump to the TCP chain:

# nft add rule inet my_table my_input 'meta l4proto tcp tcp flags & (fin|syn|rst|ack) == syn ct state new jump my_tcp_chain'

Reject all traffic that was not processed by other rules:

# nft add rule inet my_table my_input meta l4proto udp reject
# nft add rule inet my_table my_input meta l4proto tcp reject with tcp reset
# nft add rule inet my_table my_input counter reject with icmpx port-unreachable

At this point you should decide what ports you want to open to incoming connections, which are handled by the TCP and UDP chains. For example to open connections for a web server add:

# nft add rule inet my_table my_tcp_chain tcp dport 80 accept

To accept HTTPS connections for a webserver on port 443:

# nft add rule inet my_table my_tcp_chain tcp dport 443 accept

To accept SSH traffic on port 22:

# nft add rule inet my_table my_tcp_chain tcp dport 22 accept

To accept incoming DNS requests:

# nft add rule inet my_table my_tcp_chain tcp dport 53 accept
# nft add rule inet my_table my_udp_chain udp dport 53 accept

Be sure to make your changes permanent when satisifed.

Prevent brute-force attacks

Sshguard is program that can detect brute-force attacks and modify firewalls based on IP addresses it temporarily blacklists. See Sshguard#nftables on how to set up nftables to be used with it.

Logging traffic

You can log packets using the log action. The most simple rule to log all incoming traffic is:

# nft add rule inet filter input log

See nftables wiki for details.

Troubleshooting

Working with Docker

Note: With the following setup, you will not be able to use protocols like AF_BLUETOOTH inside containers even with --net host --privileged.

Using nftables can interfere with Docker networking (and probably other container runtimes as well). You can find various workarounds on the internet which either involve patching iptables rules and ensuring a defined service start order or disabling dockers iptables management completely which makes using docker very restrictive (think port forwarding or docker-compose).

A reliable method is letting docker run in a separate network namespace where it can do whatever it wants. It is probably best to not use iptables-nft to prevent docker from mixing nftables and iptables rules.

Use the following docker service drop-in file:

/etc/systemd/system/docker.service.d/netns.conf
[Service]
PrivateNetwork=yes

# cleanup
ExecStartPre=-nsenter -t 1 -n -- ip link delete docker0

# add veth
ExecStartPre=nsenter -t 1 -n -- ip link add docker0 type veth peer name docker0_ns
ExecStartPre=sh -c 'nsenter -t 1 -n -- ip link set docker0_ns netns "$$BASHPID" && true'
ExecStartPre=ip link set docker0_ns name eth0

# bring host online
ExecStartPre=nsenter -t 1 -n -- ip addr add 10.0.0.1/24 dev docker0
ExecStartPre=nsenter -t 1 -n -- ip link set docker0 up

# bring ns online
ExecStartPre=ip addr add 10.0.0.100/24 dev eth0
ExecStartPre=ip link set eth0 up
ExecStartPre=ip route add default via 10.0.0.1 dev eth0

Adjust the 10.0.0.* IP addresses if they are not appropriate for your setup.

Enable IP forwarding and set-up NAT for docker0 with the following postrouting rule:

iifname docker0 oifname eth0 masquerade

Then, ensure that kernel IP forwarding is enabled.

Now you can setup a firewall and port forwarding for the docker0 interface using nftables without any interference.

See also