Docker

Wed 30 September 2015

In tech.

Networking

By default, when the docker daemon starts, it creates a bridge named docker0 and assigns it an IP address. Every container created afterwards gets two interfaces:

- one inside the container, with an IP chosen by docker inside the subnet assigned to docker0;
- one outside the container, attached to docker0.

That means that all containers can be accessed via the docker0 bridge.
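
You can see this on the host by looking at the bridge and the veth interfaces attached to it (brctl comes from the bridge-utils package; ip works just as well):

# ip addr show docker0
# brctl show docker0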

You can change the subnet and IP used on docker0 with the --bip option.
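
For example, with a recent enough docker, the daemon can be started with something like this (10.10.0.1/16 is just a hypothetical subnet; on most distributions the option rather goes into the daemon startup options, e.g. DOCKER_OPTS in /etc/default/docker):

# docker daemon --bip=10.10.0.1/16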

Create an image

Create a file named Dockerfile:

FROM debian

RUN apt-get update && apt-get install -y iperf

EXPOSE 5001

ENTRYPOINT ["iperf"]
CMD ["-s"]

Then build the image:

docker build -t iperf .

The image is then available:

# docker images
REPOSITORY                       TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
iperf                            latest              7321c6c74780        19 seconds ago      135.1 MB

So we can run it:

# docker run -d iperf
9028998301c1185a8ac46f3cb79351b03cb33e65c8c640987535f9c07ba85eda

The -d option allows the container to run in the background so that we get our prompt back. Without it, we would see all the output of the running container in our terminal.
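
If we later want the output of a detached container, docker logs can retrieve it (the ID is the one returned by docker run):

# docker logs 9028998301c1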

By default, the ENTRYPOINT command is run with the optional CMD arguments appended. If I want to run iperf as a client, I can do:

# docker run iperf -c 10.2.0.115

Here the ENTRYPOINT is still run, but CMD is replaced by the arguments given on the command line. (I removed the -d option because I need the output of the iperf client; this is unrelated to the fact that I pass other arguments.)
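
The ENTRYPOINT itself can also be overridden with the --entrypoint option, for example to get a shell in the image instead of iperf (bash is present in the debian base image):

# docker run -it --entrypoint /bin/bash iperf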

We can see our container running:

# docker ps
CONTAINER ID    IMAGE    COMMAND                  CREATED              STATUS              PORTS        NAMES
9028998301c1    iperf    "/bin/sh -c 'iperf -u"   About a minute ago   Up About a minute   5001/tcp     kickass_sinoussi

We have a container that is running, but we don't know much about it... unless we use:

# docker inspect 9028998301c1

This gives us useful information such as the assigned IP address, the state, the command that is running, and so on.
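
docker inspect also accepts a Go template to extract a single field, which is handy in scripts. For example, to get only the IP address (the exact template path may differ between Docker versions):

# docker inspect -f '{{ .NetworkSettings.IPAddress }}' 9028998301c1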

The previous command gives some information, but maybe we want more. We can also run commands inside the running container to get specific information or perform specific actions.

# docker exec -it 9028998301c1 ip a show dev eth0
193: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:0a:02:00:5f brd ff:ff:ff:ff:ff:ff
    inet 10.2.0.95/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:aff:fe02:5f/64 scope link 
       valid_lft forever preferred_lft forever

# docker exec -it 9028998301c1 ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   4328   624 ?        Ss   19:13   0:00 /bin/sh -c iperf -u -s
root         6  0.0  0.0  96700  1432 ?        Sl   19:13   0:00 iperf -u -s
root        30  0.0  0.0  17492  1996 ?        Rs+  19:20   0:00 ps aux
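
We can also open an interactive shell inside the running container the same way:

# docker exec -it 9028998301c1 /bin/bash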

Network performance

All the network traffic is bridged between containers. Let's see how much throughput we can expect in our test environment.

First, I use the default Linux behavior, which applies filtering on the bridge:

# sysctl -a | grep bridge
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-filter-pppoe-tagged = 0
net.bridge.bridge-nf-filter-vlan-tagged = 0
net.bridge.bridge-nf-pass-vlan-input-dev = 0

With a UDP test, I get about 800 MB/s.
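
The UDP run uses something like the following command; iperf needs -u on the client, plus the -b flag to push more than the 1 Mbit/s UDP default (the exact -b value here is only illustrative):

# docker run iperf -u -b 1000M -c 10.2.0.115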

With TCP I get:

# docker run iperf -i 2 -c 10.2.0.115
------------------------------------------------------------
Client connecting to 10.2.0.115, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 10.2.0.117 port 52744 connected with 10.2.0.115 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 2.0 sec  4.19 GBytes  18.0 Gbits/sec
[  3]  2.0- 4.0 sec  4.63 GBytes  19.9 Gbits/sec
[  3]  4.0- 6.0 sec  4.50 GBytes  19.3 Gbits/sec
[  3]  6.0- 8.0 sec  4.55 GBytes  19.5 Gbits/sec
[  3]  8.0-10.0 sec  4.53 GBytes  19.4 Gbits/sec
[  3]  0.0-10.0 sec  22.4 GBytes  19.2 Gbits/sec

Now I disable the filtering on the bridge:

# sysctl -w net.bridge.bridge-nf-call-arptables=0
net.bridge.bridge-nf-call-arptables = 0

# sysctl -w net.bridge.bridge-nf-call-ip6tables=0
net.bridge.bridge-nf-call-ip6tables = 0

# sysctl -w net.bridge.bridge-nf-call-iptables=0
net.bridge.bridge-nf-call-iptables = 0

# sysctl -a | grep bridge
net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-filter-pppoe-tagged = 0
net.bridge.bridge-nf-filter-vlan-tagged = 0
net.bridge.bridge-nf-pass-vlan-input-dev = 0
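
To make these settings persistent across reboots, they could also be put in /etc/sysctl.conf or in a file under /etc/sysctl.d/, for example:

net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0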

With UDP I get the same result: 800 MB/s.

With TCP I get:

# docker run iperf -i 2 -c 10.2.0.115
------------------------------------------------------------
Client connecting to 10.2.0.115, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 10.2.0.119 port 51078 connected with 10.2.0.115 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 2.0 sec  4.86 GBytes  20.9 Gbits/sec
[  3]  2.0- 4.0 sec  5.05 GBytes  21.7 Gbits/sec
[  3]  4.0- 6.0 sec  5.53 GBytes  23.8 Gbits/sec
[  3]  6.0- 8.0 sec  5.16 GBytes  22.2 Gbits/sec
[  3]  0.0-10.0 sec  25.8 GBytes  22.1 Gbits/sec

I get roughly 3 Gbits/sec more without bridge filtering.

The limit seems to be the iperf process using 100% of a CPU. I don't see any kernel thread consuming a lot of CPU.
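
One way to check where the CPU time goes is to watch per-process and per-core usage while the test runs, for example with the sysstat tools (pidstat and mpstat):

# pidstat -p $(pgrep -d, iperf) 2
# mpstat -P ALL 2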