In the previous recipe, we covered how Docker handles iptables rules for the most common container networking needs. However, there may be cases where you wish to extend the default iptables configuration to either allow more access or limit the scope of connectivity. In this recipe, we'll walk through a couple of examples of how to implement custom iptables rules. We'll focus on limiting the scope of sources connecting to services running in your containers, as well as allowing the Docker host itself to connect to those services.
We'll be using the same Docker host with the same configuration from the previous recipe. The Docker service should be configured with the --iptables=false service option, and there should be two containers defined, web1 and web2. If you are unsure how to get to this state, please see the previous recipe. In order to define a new iptables policy, we'll also need to flush all the existing iptables rules out of the NAT and filter tables. The easiest way to do this is to reboot the host.

If you prefer not to reboot, you can change the default filter policies back to ACCEPT and then flush the filter and NAT tables as follows:
sudo iptables -P INPUT ACCEPT
sudo iptables -P FORWARD ACCEPT
sudo iptables -P OUTPUT ACCEPT
sudo iptables -t filter -F
sudo iptables -t nat -F
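As a quick sanity check before proceeding (an aside not in the original walkthrough), you can list the rules in each table; after the flush, only the built-in chains with their default policies should appear:

```shell
# List the filter table; expect only the -P INPUT/FORWARD/OUTPUT ACCEPT policy lines
sudo iptables -S
# List the NAT table; likewise, only the built-in chain policies should remain
sudo iptables -t nat -S
```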
At this point, you should once again have a Docker host with two containers running and an empty default iptables policy. To begin, let's once again change the default filter policies to DROP while ensuring that we still allow our management connection over SSH:
user@docker1:~$ sudo iptables -A INPUT -i eth0 -p tcp --dport 22 -m state --state NEW,ESTABLISHED -j ACCEPT
user@docker1:~$ sudo iptables -A OUTPUT -o eth0 -p tcp --sport 22 -m state --state ESTABLISHED -j ACCEPT
user@docker1:~$ sudo iptables -P INPUT DROP
user@docker1:~$ sudo iptables -P FORWARD DROP
user@docker1:~$ sudo iptables -P OUTPUT DROP
Because we'll be focusing on the policy in the filter table, let's put the NAT policy back in place, unchanged from the previous recipe. These rules cover both the outbound masquerading and the inbound destination NAT for the published service in each container:
user@docker1:~$ sudo iptables -t nat -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
user@docker1:~$ sudo iptables -t nat -A PREROUTING ! -i docker0 -p tcp -m tcp --dport 32768 -j DNAT --to-destination 172.17.0.2:80
user@docker1:~$ sudo iptables -t nat -A PREROUTING ! -i docker0 -p tcp -m tcp --dport 32769 -j DNAT --to-destination 172.17.0.3:80
One of the items you might be interested in configuring is limiting the scope of what the containers can access on the outside network. You'll notice that, in previous examples, the containers were allowed to talk to anything externally, because the filter rule was rather generic:

sudo iptables -A FORWARD -i docker0 ! -o docker0 -j ACCEPT

This rule allows the containers to talk to anything out of any interface besides docker0. Rather than allowing this, we can specify only the ports we want to allow outbound. For instance, if we publish port 80, we can then define a reverse, or outbound, rule that only allows that specific return traffic. Let's first recreate the inbound rules we used in the previous example:
user@docker1:~$ sudo iptables -A FORWARD -d 172.17.0.2/32 ! -i docker0 -o docker0 -p tcp -m tcp --dport 80 -j ACCEPT
user@docker1:~$ sudo iptables -A FORWARD -d 172.17.0.3/32 ! -i docker0 -o docker0 -p tcp -m tcp --dport 80 -j ACCEPT
Now we can easily replace the more generic outbound rule with specific rules that only allow the return traffic on port 80. For example, let's put in a rule that allows the container web1 to return traffic only on port 80:
user@docker1:~$ sudo iptables -A FORWARD -s 172.17.0.2/32 -i docker0 ! -o docker0 -p tcp -m tcp --sport 80 -j ACCEPT
If we check, we should see that we can reach the service on web1 from the outside network. However, the container web1 is not able to talk to anything on the outside network except on port 80 at this point, because we didn't use the generic outbound rule:
user@docker1:~$ docker exec -it web1 ping 4.2.2.2 -c 2
PING 4.2.2.2 (4.2.2.2): 48 data bytes
user@docker1:~$
To fix this, we can add specific rules to allow things like ICMP sourced from the web1 container:
user@docker1:~$ sudo iptables -A FORWARD -s 172.17.0.2/32 -i docker0 ! -o docker0 -p icmp -j ACCEPT
The above rule, coupled with the state-aware return rule from the previous recipe, will allow the web1 container to initiate ICMP traffic and receive the return traffic:
user@docker1:~$ sudo iptables -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
user@docker1:~$ docker exec -it web1 ping 4.2.2.2 -c 2
PING 4.2.2.2 (4.2.2.2): 48 data bytes
56 bytes from 4.2.2.2: icmp_seq=0 ttl=50 time=33.892 ms
56 bytes from 4.2.2.2: icmp_seq=1 ttl=50 time=34.326 ms
--- 4.2.2.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 33.892/34.109/34.326/0.217 ms
user@docker1:~$
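When building a rule set incrementally like this, the per-rule packet counters are a handy way to confirm which rules the traffic is actually matching (a troubleshooting aside, not part of the original walkthrough):

```shell
# Show the FORWARD chain with packet/byte counters; the ICMP and port 80
# rules that matched traffic will show nonzero counts in the pkts column
sudo iptables -L FORWARD -v -n
```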
In the case of the web2 container, its web server still cannot be accessed from the outside network. If we wish to limit which sources can talk to the web server, we could do that either by altering the inbound port 80 rule or by specifying a destination in the outbound port 80 rule. For instance, we can limit the traffic to a single device on the outside network by specifying a destination in the egress rule:
user@docker1:~$ sudo iptables -A FORWARD -s 172.17.0.3/32 -d 10.20.30.13 -i docker0 ! -o docker0 -p tcp -m tcp --sport 80 -j ACCEPT
Now if we try from a lab device on the outside network with the IP address of 10.20.30.13, we should be able to access the web server:
[user@lab1 ~]# ip addr show dev eth0 | grep inet
    inet 10.20.30.13/24 brd 10.20.30.255 scope global eth0
[user@lab1 ~]# curl http://docker1.lab.lab:32769
<body>
  <html>
    <h1><span style="color:#FF0000;font-size:72px;">Web Server #2 - Running on port 80</span>
    </h1>
  </body>
  </html>
[user@lab1 ~]#
But if we try from a different lab server with a different IP address, the connection will fail:
[user@lab2 ~]# ip addr show dev eth0 | grep inet
    inet 10.20.30.14/24 brd 10.20.30.255 scope global eth0
[user@lab2 ~]# curl http://docker1.lab.lab:32769
[user@lab2 ~]#
Again, this rule could be implemented either as an inbound or outbound rule.
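As a sketch of the inbound alternative (an equivalent approach, not the one shown above), you would constrain the source address on the ingress rule instead and leave the egress rule without a destination match:

```shell
# Ingress variant: only allow 10.20.30.13 to reach web2's port 80
sudo iptables -A FORWARD -s 10.20.30.13/32 -d 172.17.0.3/32 ! -i docker0 -o docker0 -p tcp -m tcp --dport 80 -j ACCEPT
# The egress rule for web2's return traffic then omits the destination match
sudo iptables -A FORWARD -s 172.17.0.3/32 -i docker0 ! -o docker0 -p tcp -m tcp --sport 80 -j ACCEPT
```

Either placement yields the same effective policy; filtering on ingress simply rejects the unwanted traffic before it is forwarded to the container.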
When managing the iptables rules in this manner, you might have noticed that the Docker host itself no longer has connectivity to the containers and the services they are hosting:
user@docker1:~$ ping 172.17.0.2 -c 2
PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.
ping: sendmsg: Operation not permitted
ping: sendmsg: Operation not permitted
--- 172.17.0.2 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms
user@docker1:~$
This is because all of the rules we've been writing in the filter table have been in the FORWARD chain. The FORWARD chain only applies to traffic the host is forwarding, not traffic that is originated by or destined to the host itself. To fix this, we can put rules in the INPUT and OUTPUT chains of the filter table. To allow ICMP to and from the containers, we can add rules like this:
user@docker1:~$ sudo iptables -A OUTPUT -o docker0 -p icmp -m state --state NEW,ESTABLISHED -j ACCEPT
user@docker1:~$ sudo iptables -A INPUT -i docker0 -p icmp -m state --state ESTABLISHED -j ACCEPT
The rule added to the OUTPUT chain looks for traffic headed out of the docker0 bridge (toward the containers) that is of protocol ICMP and is a new or established flow. The rule added to the INPUT chain looks for traffic headed into the docker0 bridge (toward the host) that is of protocol ICMP and is an established flow. Since the traffic is originated by the Docker host, these rules will match and allow the ICMP traffic to the container to work:
user@docker1:~$ ping 172.17.0.2 -c 2
PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.
64 bytes from 172.17.0.2: icmp_seq=1 ttl=64 time=0.081 ms
64 bytes from 172.17.0.2: icmp_seq=2 ttl=64 time=0.021 ms
--- 172.17.0.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.021/0.051/0.081/0.030 ms
user@docker1:~$
However, this still will not allow the containers themselves to ping the default gateway. This is because the rule we added to the INPUT chain matching traffic coming into the docker0 bridge is only looking for established sessions. To make this work bidirectionally, you'd need to add the NEW flag to the second rule so that it also matches new flows generated by the containers toward the host:
user@docker1:~$ sudo iptables -A INPUT -i docker0 -p icmp -m state --state NEW,ESTABLISHED -j ACCEPT
Since the rule we added to the output chain already specifies new or established flows, ICMP connectivity from the containers to the host will now also work:
user@docker1:~$ docker exec -it web1 ping 172.17.0.1 -c 2
PING 172.17.0.1 (172.17.0.1): 48 data bytes
56 bytes from 172.17.0.1: icmp_seq=0 ttl=64 time=0.073 ms
56 bytes from 172.17.0.1: icmp_seq=1 ttl=64 time=0.079 ms
--- 172.17.0.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.073/0.076/0.079/0.000 ms
user@docker1:~$
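Keep in mind that rules added with iptables live only in kernel memory and are lost on reboot. One common way to persist them on Debian or Ubuntu systems is the iptables-persistent package; this is an assumption about your distribution, which may instead use a systemd unit, a boot script, or firewalld:

```shell
# Install the persistence package (Debian/Ubuntu)
sudo apt-get install -y iptables-persistent
# Save the current filter and NAT rules so they are restored at boot
sudo sh -c 'iptables-save > /etc/iptables/rules.v4'
```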