The first feature we'll be looking at is MacVLAN. In this recipe, we'll be implementing MacVLAN outside of Docker to gain a better understanding of how it works. Understanding how MacVLAN works outside of Docker will be critical in understanding how Docker consumes MacVLAN. In the next recipe, we'll walk through the Docker network driver implementation of MacVLAN.
In this recipe, we'll be using two Linux hosts (net1 and net2) to demonstrate MacVLAN functionality. Our lab topology will look as follows:
It is assumed that the hosts are in a base configuration and each host has two network interfaces. The eth0 interface will have a static IP address defined and carry each host's default route. The eth1 interface will be configured with no IP address. For reference, the network configuration file (/etc/network/interfaces) for each host follows:
net1.lab.lab
auto eth0
iface eth0 inet static
    address 172.16.10.2
    netmask 255.255.255.0
    gateway 172.16.10.1
    dns-nameservers 10.20.30.13
    dns-search lab.lab
auto eth1
iface eth1 inet manual
net2.lab.lab
auto eth0
iface eth0 inet static
    address 172.16.10.3
    netmask 255.255.255.0
    gateway 172.16.10.1
    dns-nameservers 10.20.30.13
    dns-search lab.lab
auto eth1
iface eth1 inet manual
While we'll cover all of the steps needed to create the topology in this recipe, you may wish to refer to Chapter 1, Linux Networking Constructs, if some of the steps aren't clear. Chapter 1, Linux Networking Constructs, covers the base Linux networking constructs and the CLI tools in much greater depth.
MacVLAN represents an entirely different way to configure interfaces from what we've seen up until this point. Earlier Linux networking configurations we examined relied on constructs that loosely mimicked physical network constructs. MacVLAN interfaces are logical in nature and are bound to an existing network interface. The interface supporting the MacVLAN interfaces is referred to as the parent interface and can support one or more MacVLAN logical interfaces. Let's look at a quick example of configuring a MacVLAN interface on one of our lab hosts.
Configuring MacVLAN type interfaces is done in a very similar manner to all other types of Linux network interfaces. Using the ip command-line tool, we can use the link subcommand to define the interface:
user@net1:~$ sudo ip link add macvlan1 link eth0 type macvlan
This syntax should be familiar to you from the first chapter of the book where we defined multiple different interface types. Once the interface is created, the next step is to configure it with an IP address. This is also done through the ip command:
user@net1:~$ sudo ip address add 172.16.10.5/24 dev macvlan1
And finally, we need to bring the interface up:
user@net1:~$ sudo ip link set dev macvlan1 up
The interface is now up and we can examine the configuration with the ip addr show command:
user@net1:~$ ip addr show
1: …<loopback interface configuration removed for brevity>…
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:2d:dd:79 brd ff:ff:ff:ff:ff:ff
    inet 172.16.10.2/24 brd 172.16.10.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe2d:dd79/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:2d:dd:83 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::20c:29ff:fe2d:dd83/64 scope link
       valid_lft forever preferred_lft forever
4: macvlan1@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether da:aa:c0:18:55:4a brd ff:ff:ff:ff:ff:ff
    inet 172.16.10.5/24 scope global macvlan1
       valid_lft forever preferred_lft forever
    inet6 fe80::d8aa:c0ff:fe18:554a/64 scope link
       valid_lft forever preferred_lft forever
user@net1:~$
There are a couple of interesting items to point out now that we have the interface configured. First, the name of the MacVLAN interface makes it easy to identify the interface's parent. Recall that each MacVLAN interface has to be associated with a parent interface. In this case, we can tell that the parent of this MacVLAN interface is eth0 by looking at the name listed after macvlan1@ in the MacVLAN interface name. Second, the IP address assigned to the MacVLAN interface is in the same subnet as the parent interface (eth0). This is intentional to allow external connectivity. Let's define a second MacVLAN interface on the same parent interface to demonstrate what sort of connectivity is allowed:
user@net1:~$ sudo ip link add macvlan2 link eth0 type macvlan
user@net1:~$ sudo ip address add 172.16.10.6/24 dev macvlan2
user@net1:~$ sudo ip link set dev macvlan2 up
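The name@parent convention described above can also be read programmatically. The following is a minimal sketch, not part of the original recipe: the function name list_link_parents is hypothetical, and it assumes the one-line output format of ip -o link show, where the second field is the interface name followed by a colon.

```shell
#!/bin/sh
# Hypothetical helper: print "<interface> <parent>" for every link whose
# name has the form name@parent. Reads `ip -o link show` output on stdin.
list_link_parents() {
  awk '{
    name = $2
    sub(/:$/, "", name)              # drop the trailing colon
    n = split(name, parts, "@")      # name@parent -> parts[1], parts[2]
    if (n == 2) print parts[1], parts[2]
  }'
}

# Assumed usage on a live host:
#   ip -o link show | list_link_parents
```

For the interfaces shown earlier, the macvlan1@eth0 entry would be reported as macvlan1 eth0. Interfaces without an @ in their name, such as eth0 itself, are skipped.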
Our network topology is as follows:
We have two MacVLAN interfaces bound to net1's eth0 interface. If we try to reach either interface from an external subnet, the connectivity should work as expected:
user@test_server:~$ ip addr show dev eth0 |grep inet
    inet 10.20.30.13/24 brd 10.20.30.255 scope global eth0
user@test_server:~$ ping 172.16.10.5 -c 2
PING 172.16.10.5 (172.16.10.5) 56(84) bytes of data.
64 bytes from 172.16.10.5: icmp_seq=1 ttl=63 time=0.423 ms
64 bytes from 172.16.10.5: icmp_seq=2 ttl=63 time=0.458 ms
--- 172.16.10.5 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.423/0.440/0.458/0.027 ms
user@test_server:~$ ping 172.16.10.6 -c 2
PING 172.16.10.6 (172.16.10.6) 56(84) bytes of data.
64 bytes from 172.16.10.6: icmp_seq=1 ttl=63 time=0.510 ms
64 bytes from 172.16.10.6: icmp_seq=2 ttl=63 time=0.532 ms
--- 172.16.10.6 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.510/0.521/0.532/0.011 ms
In the preceding output, I attempted to reach both 172.16.10.5 and 172.16.10.6 from a test server that lives off subnet from the net1 host. In both cases, we were able to reach the IP address of the MacVLAN interface, implying that routing is working as expected. This is why we gave the MacVLAN interfaces IP addresses within the existing subnet of the server's eth0 interface. Since the multilayer switch knew that 172.16.10.0/24 lives out of VLAN 10, it simply has to issue an ARP request for the new IP addresses on VLAN 10 to get their MAC addresses. The Linux host already has a default route pointing back to the switch that allows the return traffic to reach the test server. However, this is by no means a requirement of MacVLAN interfaces. I could have easily chosen another IP subnet to use for the interfaces, but that would have prevented external routing from inherently working.
Another item to point out is that the parent interface does not need to have an IP address associated with it. For instance, let's extend the topology by building two more MacVLAN interfaces, one on the host net1 and another on the host net2:
user@net1:~$ sudo ip link add macvlan3 link eth1 type macvlan
user@net1:~$ sudo ip address add 192.168.10.5/24 dev macvlan3
user@net1:~$ sudo ip link set dev macvlan3 up
user@net2:~$ sudo ip link add macvlan4 link eth1 type macvlan
user@net2:~$ sudo ip address add 192.168.10.6/24 dev macvlan4
user@net2:~$ sudo ip link set dev macvlan4 up
Our topology is as follows:
Despite not having an IP address defined on the physical interface, the hosts now see the 192.168.10.0/24 network as being defined and believe the network to be locally connected:
user@net1:~$ ip route
default via 172.16.10.1 dev eth0
172.16.10.0/24 dev eth0 proto kernel scope link src 172.16.10.2
172.16.10.0/24 dev macvlan1 proto kernel scope link src 172.16.10.5
172.16.10.0/24 dev macvlan2 proto kernel scope link src 172.16.10.6
192.168.10.0/24 dev macvlan3 proto kernel scope link src 192.168.10.5
user@net1:~$
This means that the two hosts can reach each other directly through their associated IP addresses on that subnet:
user@net1:~$ ping 192.168.10.6 -c 2
PING 192.168.10.6 (192.168.10.6) 56(84) bytes of data.
64 bytes from 192.168.10.6: icmp_seq=1 ttl=64 time=0.405 ms
64 bytes from 192.168.10.6: icmp_seq=2 ttl=64 time=0.432 ms
--- 192.168.10.6 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.405/0.418/0.432/0.024 ms
user@net1:~$
At this point, you might be wondering why you would use a MacVLAN interface type. From the looks of it, it doesn't appear much different from just creating logical subinterfaces. The real difference is in how the interface is built. Typically, subinterfaces all use the same MAC address as the parent interface. You might have noted in the earlier output and diagrams that the MacVLAN interfaces have different MAC addresses than their associated parent interface. We can validate this on the upstream multilayer switch (gateway) as well:
switch# show ip arp vlan 10
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  172.16.10.6             8   a2b1.0cd4.4e73  ARPA   Vlan10
Internet  172.16.10.5             8   4e19.f07f.33e0  ARPA   Vlan10
Internet  172.16.10.2             0   000c.292d.dd79  ARPA   Vlan10
Internet  172.16.10.3            62   000c.2959.caca  ARPA   Vlan10
Internet  172.16.10.1             -   0021.d7c5.f245  ARPA   Vlan10
In testing, you might find that the Linux host is presenting the same MAC address for each IP address in your configuration. Depending on what operating system you are running, you may need to change the following kernel parameters in order to prevent the host from presenting the same MAC address:
echo 1 | sudo tee /proc/sys/net/ipv4/conf/all/arp_ignore
echo 2 | sudo tee /proc/sys/net/ipv4/conf/all/arp_announce
echo 2 | sudo tee /proc/sys/net/ipv4/conf/all/rp_filter
Keep in mind that applying these settings in this manner won't persist through a reboot.
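If the settings need to survive a reboot, one common approach is a sysctl configuration fragment. The sketch below is an assumption rather than part of the original recipe: the function name and the file name 90-macvlan-arp.conf are hypothetical, and it assumes a distribution that reads /etc/sysctl.d/*.conf at boot and provides sysctl --system.

```shell
#!/bin/sh
# Hypothetical helper: emit a sysctl fragment carrying the same three
# values that the echo commands above write under /proc/sys.
macvlan_sysctl_fragment() {
  cat <<'EOF'
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.all.rp_filter = 2
EOF
}

# Assumed usage (the file name is only an example):
#   macvlan_sysctl_fragment | sudo tee /etc/sysctl.d/90-macvlan-arp.conf
#   sudo sysctl --system
```

Keeping the values in a function makes it easy to review them before anything is written to disk.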
Looking at the MAC addresses, we can see that the parent interface (172.16.10.2) and both MacVLAN interfaces (172.16.10.5 and 172.16.10.6) have different MAC addresses. MacVLAN allows you to present multiple interfaces using different MAC addresses. The net result is that you can have multiple IP interfaces, each with their own unique MAC address, that all use the same physical interface.
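When comparing the switch's ARP table with the output of ip addr show, it helps to have both MAC addresses in the same notation. The following is a small sketch, with a hypothetical function name, that converts the dotted format used in the switch output into the colon-separated format that Linux prints.

```shell
#!/bin/sh
# Hypothetical helper: convert a dotted MAC (xxxx.xxxx.xxxx) into
# lowercase colon notation (xx:xx:xx:xx:xx:xx).
cisco_mac_to_linux() {
  printf '%s\n' "$1" |
    tr -d '.' |                      # 000c.292d.dd79 -> 000c292ddd79
    tr 'A-F' 'a-f' |                 # normalise to lowercase
    sed 's/../&:/g; s/:$//'          # insert colons, drop the trailing one
}

# Example: the entry for 172.16.10.2 in the ARP table above
cisco_mac_to_linux 000c.292d.dd79    # prints 00:0c:29:2d:dd:79
```

The result matches the link/ether value shown for eth0 on net1 earlier in the recipe.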
Since the parent interface is responsible for multiple MAC addresses, it needs to be in promiscuous mode. The host should automatically put an interface into promiscuous mode when it's chosen as a parent interface. You can verify this by checking the ip link details:
user@net2:~$ ip -d link
…<output removed for brevity>…
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 00:0c:29:59:ca:d4 brd ff:ff:ff:ff:ff:ff promiscuity 1
…<output removed for brevity>…
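To check a single parent without scanning the full listing, the promiscuity counter can be pulled out of the detailed output. This is a minimal sketch with a hypothetical function name. It assumes the promiscuity keyword is followed by its value on the same line, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print the promiscuity counter found in
# `ip -d link show dev <iface>` output supplied on stdin.
link_promiscuity() {
  awk '{
    for (i = 1; i < NF; i++)
      if ($i == "promiscuity") { print $(i + 1); exit }
  }'
}

# Assumed usage on a live host:
#   ip -d link show dev eth1 | link_promiscuity
```

A value of 0 on an interface that is expected to be a MacVLAN parent would be worth investigating.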
As with other Linux interface types we've seen, MacVLAN interfaces are also namespace aware. This can lead to some interesting configuration options. Let's now look at deploying MacVLAN interfaces within unique network namespaces.
Let's start by deleting all of our existing MacVLAN interfaces:
user@net1:~$ sudo ip link del macvlan1
user@net1:~$ sudo ip link del macvlan2
user@net1:~$ sudo ip link del macvlan3
user@net2:~$ sudo ip link del macvlan4
Much like we did in Chapter 1, Linux Networking Constructs, we can create an interface and then move it into a namespace. We start by creating the namespace:
user@net1:~$ sudo ip netns add namespace1
Then, we create the MacVLAN interface:
user@net1:~$ sudo ip link add macvlan1 link eth0 type macvlan
Next, we move the interface into the newly created network namespace:
user@net1:~$ sudo ip link set macvlan1 netns namespace1
And finally, from within the namespace, we assign it an IP address and bring it up:
user@net1:~$ sudo ip netns exec namespace1 ip address add 172.16.10.5/24 dev macvlan1
user@net1:~$ sudo ip netns exec namespace1 ip link set dev macvlan1 up
Let's also create a second interface within a second namespace for testing purposes:
user@net1:~$ sudo ip netns add namespace2
user@net1:~$ sudo ip link add macvlan2 link eth0 type macvlan
user@net1:~$ sudo ip link set macvlan2 netns namespace2
user@net1:~$ sudo ip netns exec namespace2 ip address add 172.16.10.6/24 dev macvlan2
user@net1:~$ sudo ip netns exec namespace2 ip link set dev macvlan2 up
As you play around with different configurations, it's common to create and delete the same interface a number of times. In doing so, you'll likely generate interfaces with the same IP address, but different MAC addresses. Since we're presenting these MAC addresses to the upstream physical network, always make sure that the upstream device or gateway has the most recent ARP entry for the IP you are trying to reach. It's common for many switches and routers to have long ARP timeout values during which they won't ARP for the newer MAC entry.
At this point, we have a topology that looks something like this:
The parent interface (eth0) has an IP address as before, but this time, the MacVLAN interfaces live within their own unique namespaces. Despite being in separate namespaces, they still share the same parent, since each interface was bound to eth0 when it was created, before it was moved into its namespace.
At this point, you should note that external hosts can no longer ping all of the IP addresses. Rather, you can only reach the eth0 IP address of 172.16.10.2. The reason for this is simple. As you'll recall, namespaces are comparable to a Virtual Routing and Forwarding (VRF) instance and have their own routing table. If you examine the routing table of both namespaces, you'll see that neither of them has a default route:
user@net1:~$ sudo ip netns exec namespace1 ip route
172.16.10.0/24 dev macvlan1 proto kernel scope link src 172.16.10.5
user@net1:~$ sudo ip netns exec namespace2 ip route
172.16.10.0/24 dev macvlan2 proto kernel scope link src 172.16.10.6
user@net1:~$
In order for these interfaces to be reachable off network, we'll need to give each namespace a default route pointing to the gateway on that subnet (172.16.10.1). Again, this is the benefit of addressing the MacVLAN interfaces in the same subnet as the parent interface. The routing is already there on the physical network. Add the routes and test again:
user@net1:~$ sudo ip netns exec namespace1 ip route add 0.0.0.0/0 via 172.16.10.1
user@net1:~$ sudo ip netns exec namespace2 ip route add 0.0.0.0/0 via 172.16.10.1
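Before testing from the external host, it can be useful to confirm that each namespace actually picked up its default route. The following is a small sketch, with a hypothetical function name, that relies on ip route printing a 0.0.0.0/0 route as a line beginning with the word default.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the `ip route` output supplied on stdin
# contains a default route, fail otherwise.
has_default_route() {
  grep -q '^default '
}

# Assumed usage on net1, looping over the two namespaces from this recipe:
#   for ns in namespace1 namespace2; do
#     sudo ip netns exec "$ns" ip route | has_default_route ||
#       echo "$ns: no default route"
#   done
```

With the routes above in place, the loop should print nothing.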
From the external test host (some output removed for brevity):
user@test_server:~$ ping 172.16.10.2 -c 2
PING 172.16.10.2 (172.16.10.2) 56(84) bytes of data.
64 bytes from 172.16.10.2: icmp_seq=1 ttl=63 time=0.459 ms
64 bytes from 172.16.10.2: icmp_seq=2 ttl=63 time=0.441 ms
user@test_server:~$ ping 172.16.10.5 -c 2
PING 172.16.10.5 (172.16.10.5) 56(84) bytes of data.
64 bytes from 172.16.10.5: icmp_seq=1 ttl=63 time=0.521 ms
64 bytes from 172.16.10.5: icmp_seq=2 ttl=63 time=0.528 ms
user@test_server:~$ ping 172.16.10.6 -c 2
PING 172.16.10.6 (172.16.10.6) 56(84) bytes of data.
64 bytes from 172.16.10.6: icmp_seq=1 ttl=63 time=0.524 ms
64 bytes from 172.16.10.6: icmp_seq=2 ttl=63 time=0.551 ms
So while external connectivity appears to be working as expected, you'll note that none of the interfaces can talk to each other:
user@net1:~$ sudo ip netns exec namespace2 ping 172.16.10.5
PING 172.16.10.5 (172.16.10.5) 56(84) bytes of data.
--- 172.16.10.5 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 0ms
user@net1:~$ sudo ip netns exec namespace2 ping 172.16.10.2
PING 172.16.10.2 (172.16.10.2) 56(84) bytes of data.
--- 172.16.10.2 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 0ms
user@net1:~$
This seems odd because they all share the same parent interface. The problem is in how the MacVLAN interfaces were configured. The MacVLAN interface type supports four different modes: private, VEPA, bridge, and passthru.
While not easy to discern without knowing where to look, our MacVLAN interfaces are of type VEPA, which is the default. We can see this by passing the details (-d) flag to the ip command:
user@net1:~$ sudo ip netns exec namespace1 ip -d link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0
20: macvlan1@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether 36:90:37:f6:08:cc brd ff:ff:ff:ff:ff:ff promiscuity 0
    macvlan mode vepa
user@net1:~$
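Since the mode is buried in the detailed output, a small filter can make it easier to spot. This is a sketch only: the function name is hypothetical, and it assumes the words macvlan mode are immediately followed by the mode name, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print the MacVLAN mode found in
# `ip -d link show dev <iface>` output supplied on stdin.
macvlan_mode() {
  awk '{
    for (i = 1; i < NF; i++)
      if ($i == "macvlan" && $(i + 1) == "mode") { print $(i + 2); exit }
  }'
}

# Assumed usage for the interface in namespace1:
#   sudo ip netns exec namespace1 ip -d link show dev macvlan1 | macvlan_mode
```

For the interface shown above, this would print vepa.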
In our case, the VEPA mode is what's preventing the two namespace interfaces from talking directly to each other. More commonly, MacVLAN interfaces are defined as type bridge to allow communication between interfaces on the same parent. However, even in this mode, the child interfaces are not allowed to communicate directly with the IP address assigned directly to the parent interface (in this case 172.16.10.2).
To change the mode, let's start by deleting the existing namespaces, which also removes the MacVLAN interfaces they contain:
user@net1:~$ sudo ip netns del namespace1
user@net1:~$ sudo ip netns del namespace2
Now we can recreate both interfaces specifying the bridge mode for each MacVLAN interface:
user@net1:~$ sudo ip netns add namespace1
user@net1:~$ sudo ip link add macvlan1 link eth0 type macvlan mode bridge
user@net1:~$ sudo ip link set macvlan1 netns namespace1
user@net1:~$ sudo ip netns exec namespace1 ip address add 172.16.10.5/24 dev macvlan1
user@net1:~$ sudo ip netns exec namespace1 ip link set dev macvlan1 up
user@net1:~$ sudo ip netns add namespace2
user@net1:~$ sudo ip link add macvlan2 link eth0 type macvlan mode bridge
user@net1:~$ sudo ip link set macvlan2 netns namespace2
user@net1:~$ sudo ip netns exec namespace2 ip address add 172.16.10.6/24 dev macvlan2
user@net1:~$ sudo ip netns exec namespace2 ip link set dev macvlan2 up
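The same five steps are repeated for every namespace, so they lend themselves to a small wrapper. The following is a sketch under stated assumptions, not part of the original recipe: the function names run and macvlan_in_netns and the DRY_RUN variable are hypothetical, and the default of bridge mode is a choice made for this helper rather than the kernel default.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with sudo, or just print it
# when DRY_RUN=1 is set.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    sudo "$@"
  fi
}

# Hypothetical helper.
# usage: macvlan_in_netns <namespace> <ifname> <parent> <cidr> [mode]
macvlan_in_netns() {
  ns=$1 ifname=$2 parent=$3 cidr=$4 mode=${5:-bridge}
  # Each step runs only if the previous one succeeded.
  run ip netns add "$ns" &&
  run ip link add "$ifname" link "$parent" type macvlan mode "$mode" &&
  run ip link set "$ifname" netns "$ns" &&
  run ip netns exec "$ns" ip address add "$cidr" dev "$ifname" &&
  run ip netns exec "$ns" ip link set dev "$ifname" up
}

# Assumed usage: preview the commands first, then run them for real.
#   DRY_RUN=1 macvlan_in_netns namespace1 macvlan1 eth0 172.16.10.5/24
#   macvlan_in_netns namespace2 macvlan2 eth0 172.16.10.6/24
```

Printing the commands first is a cheap way to double-check the parent interface and mode before touching the host's network configuration.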
After specifying the bridge mode, we can verify that the two interfaces can talk directly to one another:
user@net1:~$ sudo ip netns exec namespace1 ping 172.16.10.6 -c 2
PING 172.16.10.6 (172.16.10.6) 56(84) bytes of data.
64 bytes from 172.16.10.6: icmp_seq=1 ttl=64 time=0.041 ms
64 bytes from 172.16.10.6: icmp_seq=2 ttl=64 time=0.030 ms
--- 172.16.10.6 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.030/0.035/0.041/0.008 ms
user@net1:~$
However, we also note that we still cannot reach the host's IP address defined on the parent interface (eth0):
user@net1:~$ sudo ip netns exec namespace1 ping 172.16.10.2 -c 2
PING 172.16.10.2 (172.16.10.2) 56(84) bytes of data.
--- 172.16.10.2 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1008ms
user@net1:~$