====== Forwarding performance lab of an IBM eServer 306m with Intel 82546GB ======
{{description>Forwarding performance lab of a single-core IBM server and a dual-port gigabit Intel NIC}}
===== Hardware detail =====
This lab will test an [[IBM eServer xSeries 306m]] with **one** core (Intel Pentium 4 at 3.00GHz, hyper-threading disabled) and a dual-port Intel 82546GB NIC connected to the PCI-X bus.
Overloading a single-core server should be easy.
===== Lab set-up =====
The lab is detailed here: [[documentation:examples:Setting up a forwarding performance benchmark lab]].
BSDRP-amd64 v1.4 (FreeBSD 9.1) is used on the router.
==== Diagram ====
+-------------------+ +----------------------------------------+ +-------------------+
| Packet generator | | Device under Test | | Packet receiver |
| em0: 2.2.2.2 |=====>| em1: 2.2.2.3 em0: 1.1.1.3 |=====>| em0: 1.1.1.1 |
| 00:1b:21:d5:66:15 | | 00:0e:0c:de:45:df 00:0e:0c:de:45:de | | 00:1b:21:d5:66:0e |
+-------------------+ +----------------------------------------+ +-------------------+
Generator will use this command:
pkt-gen -i em0 -t 0 -l 42 -d 1.1.1.1 -D 00:0e:0c:de:45:df -s 2.2.2.2 -w 10
Receiver will use this command:
pkt-gen -i em0 -w 10
===== Configuring =====
For this small lab, we only need to configure the router.
==== Disabling Ethernet flow-control ====
First, disable Ethernet flow-control:
echo "hw.em.0.fc=0" >> /etc/sysctl.conf
echo "hw.em.1.fc=0" >> /etc/sysctl.conf
sysctl hw.em.0.fc=0
sysctl hw.em.1.fc=0
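To confirm the change, read the values back; both OIDs (the same hw.em.<unit>.fc OIDs used above) should now report 0:
[root@BSDRP]~# sysctl hw.em.0.fc hw.em.1.fc
hw.em.0.fc: 0
hw.em.1.fc: 0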
==== Static ARP entries ====
Here are the modified values of the default BSDRP /etc/rc.conf for static ARP entries:
ifconfig_em0="inet 1.1.1.3/24"
ifconfig_em1="inet 2.2.2.3/24"
static_arp_pairs="receiver generator"
static_arp_receiver="1.1.1.1 00:1b:21:d5:66:0e"
static_arp_generator="2.2.2.2 00:1b:21:d5:66:15"
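After a reboot (or a restart of the network scripts), the two entries should appear as permanent in the ARP table; the output below is a sketch of what to expect:
[root@BSDRP]~# arp -an
? (1.1.1.1) at 00:1b:21:d5:66:0e on em0 permanent [ethernet]
? (2.2.2.2) at 00:1b:21:d5:66:15 on em1 permanent [ethernet]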
===== em(4) driver tuning with the 82546GB =====
==== Default FreeBSD values ====
Default FreeBSD NIC parameters are used for this first test.
Edit BSDRP's /boot/loader.conf.local and comment out all NIC tuning entries, then reboot.
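The commented-out section should look roughly like this (a sketch: the tunable names are assumed from the sysctl output just below):
# BSDRP em(4) tuning disabled for the FreeBSD-default baseline
#hw.em.rxd="4096"
#hw.em.txd="4096"
#hw.em.rx_process_limit="500"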
[root@BSDRP]~# sysctl hw.em.
hw.em.rx_process_limit: 100
hw.em.txd: 1024
hw.em.rxd: 1024
The generator will push packets at about 1.4Mpps:
root@generator:~ # pkt-gen -i em0 -t 0 -l 42 -d 1.1.1.1 -D 00:0e:0c:de:45:df -s 2.2.2.2 -w 10
main [832] ether_aton(00:0e:0c:de:45:df) gives 0x800f9b292
main [900] map size is 334980 Kb
main [922] mmapping 334980 Kbytes
Sending on em0: 1 queues, 1 threads and 1 cpus.
2.2.2.2 -> 1.1.1.1 (00:1b:21:d5:66:15 -> 00:0e:0c:de:45:df)
main [975] Wait 10 secs for phy reset
main [977] Ready...
sender_body [479] start
main [1085] 1405293 pps
main [1085] 1406363 pps
main [1085] 1406409 pps
main [1085] 1406509 pps
main [1085] 1406560 pps
main [1085] 1406439 pps
main [1085] 1405242 pps
...
Meanwhile, the receiver shows this:
root@receiver:~ # pkt-gen -i em0 -w 10
main [900] map size is 334980 Kb
main [922] mmapping 334980 Kbytes
Receiving from em0: 1 queues, 1 threads and 1 cpus.
main [975] Wait 10 secs for phy reset
main [977] Ready...
receiver_body [621] waiting for initial packets, poll returns 0 0
main [1085] 0 pps
receiver_body [621] waiting for initial packets, poll returns 0 0
main [1085] 0 pps
main [1085] 159098 pps
main [1085] 400260 pps
main [1085] 400247 pps
...
main [1085] 400198 pps
main [1085] 400287 pps
main [1085] 400240 pps
main [1085] 400235 pps
main [1085] 400245 pps
main [1085] 400232 pps
main [1085] 215381 pps
main [1085] 0 pps
Received 30859400 packets, in 77.13 seconds.
Speed: 400.07Kpps.
Receiver results in Kpps for 5 tests (with a reboot between them):
400.15
400.10
400.09
399.88
399.93
=> The receiver measures about 400Kpps.
Now, what about the router stats:
[root@BSDRP]~# netstat -ihw 1
input (Total) output
packets errs idrops bytes packets errs bytes colls
387k 983k 0 22M 387k 0 15M 0
389k 981k 0 22M 389k 0 16M 0
390k 982k 0 22M 390k 0 16M 0
390k 982k 0 22M 390k 0 16M 0
390k 983k 0 22M 390k 0 16M 0
390k 979k 0 22M 390k 0 16M 0
390k 982k 0 22M 390k 0 16M 0
390k 982k 0 22M 390k 0 16M 0
390k 983k 0 22M 390k 0 16M 0
[root@BSDRP]~# vmstat -i | head -1 ; vmstat -i | grep em
interrupt total rate
irq17: em0 bge1 2555941 2745
irq18: em1 uhci2 2555879 2745
[root@BSDRP]~# top -nCHSIzs1
last pid: 1334; load averages: 0.00, 0.02, 0.05 up 0+00:15:38 10:17:26
76 processes: 2 running, 58 sleeping, 16 waiting
Mem: 9984K Active, 8432K Inact, 64M Wired, 17M Buf, 894M Free
Swap:
PID USERNAME PRI NICE SIZE RES STATE TIME CPU COMMAND
0 root -92 0 0K 176K - 9:30 93.65% kernel{em1 taskq}
0 root -92 0 0K 176K - 0:27 3.76% kernel{em0 taskq}
11 root -92 - 0K 256K WAIT 0:07 0.68% intr{irq17: em0 bge1}
11 root -92 - 0K 256K WAIT 0:06 0.59% intr{irq18: em1 uhci2}
[root@BSDRP]~# vmstat -z | head -1 ; vmstat -z | grep -i mbuf
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
mbuf_packet: 256, 0, 1024, 128,263652366, 0, 0
mbuf: 256, 0, 1, 137, 137, 0, 0
mbuf_cluster: 2048, 262144, 1152, 6, 1152, 0, 0
mbuf_jumbo_page: 4096, 12800, 0, 0, 0, 0, 0
mbuf_jumbo_9k: 9216, 6400, 0, 0, 0, 0, 0
mbuf_jumbo_16k: 16384, 3200, 0, 0, 0, 0, 0
mbuf_ext_refcnt: 4, 0, 0, 0, 0, 0, 0
=> The router is still responding very well, but it only reports a forwarding rate of 390Kpps.
The receiver measured 400Kpps, so there is a 10Kpps gap between the two.
We need to check the switch stats to break the tie:
switch>sh int Gi0/10
GigabitEthernet0/10 is up, line protocol is up
Hardware is Gigabit Ethernet, address is 000c.307c.208a (bia 000c.307c.208a)
Description: Receiver-em0
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 52/255, rxload 1/255
(etc...)
30 second input rate 0 bits/sec, 0 packets/sec
30 second output rate 204464000 bits/sec, 399348 packets/sec
=> The switch stats confirm the 400Kpps received: FreeBSD's own counters miss about 10Kpps in this case.
We need to use the receiver stats, not the router stats.
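As a side note, when the switch counters are used for cross-checking, clearing them between runs avoids mixing traffic from previous tests into the totals (Cisco IOS, privileged mode):
switch#clear counters GigabitEthernet0/10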
==== Default BSDRP values ====
The default BSDRP NIC parameters are used for this second test:
* The maximum number of received packets to process at a time is increased to 500
* The number of transmit/receive descriptors per queue is increased to its maximum (4096)
[root@BSDRP]~# sysctl hw.em.
hw.em.rx_process_limit: 500
hw.em.txd: 4096
hw.em.rxd: 4096
Receiver results in Kpps for 5 tests (with a reboot between them):
405.29
405.23
406.25
405.20
404.98
=> Throughput increases to 405Kpps: a small 5Kpps gain.
Some router counters:
[root@BSDRP]~# vmstat -i | head -1 ; vmstat -i | grep em
interrupt total rate
irq17: em0 bge1 181885 301
irq18: em1 uhci2 181884 301
[root@BSDRP]~# top -nCHSIzs1
last pid: 1344; load averages: 0.00, 0.05, 0.07 up 0+00:10:19 10:31:13
76 processes: 2 running, 58 sleeping, 16 waiting
Mem: 9976K Active, 8300K Inact, 85M Wired, 17M Buf, 873M Free
Swap:
PID USERNAME PRI NICE SIZE RES STATE TIME CPU COMMAND
0 root -92 0 0K 176K - 3:39 93.07% kernel{em1 taskq}
0 root -92 0 0K 176K - 0:11 4.20% kernel{em0 taskq}
[root@BSDRP]~# vmstat -z | head -1 ; vmstat -z | grep -i mbuf
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
mbuf_packet: 256, 0, 8704, 512,110372794, 0, 0
mbuf: 256, 0, 1, 128, 135, 0, 0
mbuf_cluster: 2048, 262144, 9216, 6, 9216, 0, 0
mbuf_jumbo_page: 4096, 12800, 0, 0, 0, 0, 0
mbuf_jumbo_9k: 9216, 6400, 0, 0, 0, 0, 0
mbuf_jumbo_16k: 16384, 3200, 0, 0, 0, 0, 0
mbuf_ext_refcnt: 4, 0, 0, 0, 0, 0, 0
=> The server is still responding very well.
There is no IRQ storm nor mbuf problem: it is the em taskq that consumes all the CPU resources.
==== Removing rx limit ====
For this test, we completely remove the limit on the maximum number of received packets to process at a time (setting it to -1):
[root@BSDRP]~# sysctl hw.em.
hw.em.rx_process_limit: -1
hw.em.txd: 4096
hw.em.rxd: 4096
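One way to get this value is through /boot/loader.conf.local, followed by a reboot (a sketch, assuming the tunable is set at boot like the txd/rxd ones):
# -1 removes the per-pass RX packet processing limit entirely
hw.em.rx_process_limit="-1"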
Receiver results in Kpps for 5 tests (with a reboot between them):
410.52
409.65
410.31
408.92
410.52
=> Performance increased to 410Kpps (a 10Kpps gain over the default values, and a 5Kpps gain over the tuned-but-still-limited values).
What about the router stats now:
[root@BSDRP3]~# vmstat -i | head -1 ; vmstat -i | grep em
interrupt total rate
irq17: em0 bge1 81956 84
irq18: em1 uhci2 81928 84
[root@BSDRP3]~# top -nCHSIzs1
last pid: 1343; load averages: 3.05, 2.29, 1.49 up 0+00:19:57 10:56:06
80 processes: 5 running, 59 sleeping, 16 waiting
Mem: 10M Active, 8348K Inact, 93M Wired, 17M Buf, 865M Free
Swap:
PID USERNAME PRI NICE SIZE RES STATE TIME CPU COMMAND
0 root -92 0 0K 176K - 17:02 100.00% kernel{em1 taskq}
=> The router is very slow to respond, almost unusable: receiving packets consumes all of its CPU.
Disabling the limit on the maximum number of received packets to process at a time is a bad idea on this server.
==== Results ====
Ministat graphs:
x hw.em.txd-rxd=1024.hw.em.rx_proc_lim=100
+ hw.em.txd-rxd=4096.hw.em.rx_proc_lim=500
* hw.em.txd-rxd=4096.hw.em.rx_proc_lim=-1
+----------------------------------------------------------------------------------------------------+
|x x + * |
|x x + ++ + * * * * |
||AM |__M_A___| |______A__M__||
+----------------------------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 5 399.88 400.15 400.09 400.03 0.11768602
+ 5 404.98 406.25 405.23 405.39 0.4948232
Difference at 95.0% confidence
5.36 +/- 0.524533
1.3399% +/- 0.131123%
(Student's t, pooled s = 0.359653)
* 5 408.92 410.52 410.31 409.984 0.69363535
Difference at 95.0% confidence
9.954 +/- 0.725551
2.48831% +/- 0.181374%
(Student's t, pooled s = 0.497484)
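For reference, a graph like the one above can be reproduced by feeding one result file per configuration to ministat(1); the file names here are hypothetical:
ministat -w 100 txd-rxd-1024_lim-100.txt txd-rxd-4096_lim-500.txt txd-rxd-4096_lim-none.txt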
===== Firewall impact =====
==== IPFW ====
Now we will test the impact of enabling a simple IPFW ruleset:
cat > /etc/ipfw.rules <<'EOF'
#!/bin/sh
fwcmd="/sbin/ipfw"
# Flush out the list before we begin.
${fwcmd} -f flush
${fwcmd} add 3000 allow ip from any to any
EOF
echo 'firewall_enable="YES"' >> /etc/rc.conf
echo 'firewall_script="/etc/ipfw.rules"' >> /etc/rc.conf
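After a reboot, a quick ipfw(8) check that the rule is loaded:
[root@BSDRP]~# ipfw list 3000
03000 allow ip from any to any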
Receiver results in Kpps for 5 tests (with a reboot between them):
320.63
320.13
320.79
320.18
320.52
=> Throughput is reduced to 320Kpps: enabling ipfw costs about 80Kpps on this server. The router still responds perfectly on the CLI.
==== PF ====
And now the same test with a minimal PF ruleset:
cat > /etc/pf.conf <<'EOF'
set skip on lo0
pass
EOF
echo 'pf_enable="YES"' >> /etc/rc.conf
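Again, a quick pfctl(8) check after the reboot that PF is enabled and the trivial ruleset is loaded:
[root@BSDRP]~# pfctl -si | head -1
[root@BSDRP]~# pfctl -sr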
Receiver results in Kpps for 5 tests (with a reboot between them):
272.40
274.78
272.56
275.65
274.51
=> Very big performance impact here! Throughput drops to 274Kpps and the router is not responsive at all: if the watchdog is enabled, it will trigger a reboot of the router.
==== Results ====
Ministat graphs:
x ipfw
+ pf
+----------------------------------------------------------------------------------------------------+
|+ xx|
|+ ++ + xx|
||__AM_| A||
+----------------------------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 5 320.13 320.79 320.52 320.45 0.28644371
+ 5 272.4 275.65 274.51 273.98 1.4337538
Difference at 95.0% confidence
-46.47 +/- 1.50781
-14.5015% +/- 0.47053%
(Student's t, pooled s = 1.03385)