====== Forwarding performance lab of an IBM eServer 306m with Intel 82546GB ======

{{description>Forwarding performance lab of a single-core IBM server with a dual-port gigabit Intel NIC}}

===== Hardware detail =====

This lab tests an [[IBM eServer xSeries 306m]] with **one** core (Intel Pentium 4 3.00GHz, hyper-threading disabled) and a dual-port Intel 82546GB NIC connected to the PCI-X bus.
Overloading a single-core server should be easy.

===== Lab set-up =====

The lab is detailed here: [[documentation:examples:Setting up a forwarding performance benchmark lab]].
BSDRP-amd64 v1.4 (FreeBSD 9.1) is used on the router.

==== Diagram ====

  +-------------------+      +----------------------------------------+      +-------------------+
  | Packet generator  |      |            Device under Test           |      |  Packet receiver  |
  | em0: 2.2.2.2      |=====>| em1: 2.2.2.3        em0: 1.1.1.3       |=====>| em0: 1.1.1.1      |
  | 00:1b:21:d5:66:15 |      | 00:0e:0c:de:45:df    00:0e:0c:de:45:de |      | 00:1b:21:d5:66:0e |
  +-------------------+      +----------------------------------------+      +-------------------+

The generator will use this command:

  pkt-gen -i em0 -t 0 -l 42 -d 1.1.1.1 -D 00:0e:0c:de:45:df -s 2.2.2.2 -w 10

The receiver will use this command:

  pkt-gen -i em0 -w 10

===== Configuring =====

For this small lab, we only configure the router.

==== Disabling Ethernet flow-control ====

First, disable Ethernet flow-control:

  echo "hw.em.0.fc=0" >> /etc/sysctl.conf
  echo "hw.em.1.fc=0" >> /etc/sysctl.conf
  sysctl hw.em.0.fc=0
  sysctl hw.em.1.fc=0

==== Static ARP entries ====

Here are the values changed from the default BSDRP /etc/rc.conf to use static ARP entries:

  ifconfig_em0="inet 1.1.1.3/24"
  ifconfig_em1="inet 2.2.2.3/24"
  static_arp_pairs="receiver generator"
  static_arp_receiver="1.1.1.1 00:1b:21:d5:66:0e"
  static_arp_generator="2.2.2.2 00:1b:21:d5:66:15"

===== em(4) driver tuning with the 82546GB =====

==== Default FreeBSD values ====

The default FreeBSD NIC parameters are used for this first test: edit the BSDRP /boot/loader.conf.local, comment out all the NIC tuning, then reboot.

  [root@BSDRP]~# sysctl hw.em.
  hw.em.rx_process_limit: 100
  hw.em.txd: 1024
  hw.em.rxd: 1024

The generator pushes packets at about 1.4Mpps:

  root@generator:~ # pkt-gen -i em0 -t 0 -l 42 -d 1.1.1.1 -D 00:0e:0c:de:45:df -s 2.2.2.2 -w 10
  main [832] ether_aton(00:0e:0c:de:45:df) gives 0x800f9b292
  main [900] map size is 334980 Kb
  main [922] mmapping 334980 Kbytes
  Sending on em0: 1 queues, 1 threads and 1 cpus.
  2.2.2.2 -> 1.1.1.1 (00:1b:21:d5:66:15 -> 00:0e:0c:de:45:df)
  main [975] Wait 10 secs for phy reset
  main [977] Ready...
  sender_body [479] start
  main [1085] 1405293 pps
  main [1085] 1406363 pps
  main [1085] 1406409 pps
  main [1085] 1406509 pps
  main [1085] 1406560 pps
  main [1085] 1406439 pps
  main [1085] 1405242 pps
  ...

Meanwhile the receiver shows this:

  root@receiver:~ # pkt-gen -i em0 -w 10
  main [900] map size is 334980 Kb
  main [922] mmapping 334980 Kbytes
  Receiving from em0: 1 queues, 1 threads and 1 cpus.
  main [975] Wait 10 secs for phy reset
  main [977] Ready...
  receiver_body [621] waiting for initial packets, poll returns 0 0
  main [1085] 0 pps
  receiver_body [621] waiting for initial packets, poll returns 0 0
  main [1085] 0 pps
  main [1085] 159098 pps
  main [1085] 400260 pps
  main [1085] 400247 pps
  ...
  main [1085] 400198 pps
  main [1085] 400287 pps
  main [1085] 400240 pps
  main [1085] 400235 pps
  main [1085] 400245 pps
  main [1085] 400232 pps
  main [1085] 215381 pps
  main [1085] 0 pps
  Received 30859400 packets, in 77.13 seconds. Speed: 400.07Kpps.
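As a rough sanity check on these figures (not part of the original lab output): pkt-gen is sending minimum-size frames here (-l 42 covers only the Ethernet, IP and UDP headers, and the NIC pads each frame to the 64-byte Ethernet minimum on the wire), so a gigabit link tops out at roughly 1.488Mpps:

  # Theoretical line rate for minimum-size frames on gigabit Ethernet:
  # 64-byte frame + 8-byte preamble + 12-byte inter-frame gap = 84 bytes per packet
  echo $(( 1000000000 / (84 * 8) ))
  1488095

The generator is therefore running close to line rate, and the 400Kpps measured by the receiver is the limit of the single-core router, not of the generator or of the link.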
Receiver results in Kpps for 5 tests (with a reboot between them):

  400.15
  400.10
  400.09
  399.88
  399.93

=> The receiver measures about 400Kpps.

Now what about the router statistics:

  [root@BSDRP]~# netstat -ihw 1
              input        (Total)           output
     packets  errs idrops      bytes    packets  errs      bytes colls
        387k  983k     0        22M       387k     0        15M     0
        389k  981k     0        22M       389k     0        16M     0
        390k  982k     0        22M       390k     0        16M     0
        390k  982k     0        22M       390k     0        16M     0
        390k  983k     0        22M       390k     0        16M     0
        390k  979k     0        22M       390k     0        16M     0
        390k  982k     0        22M       390k     0        16M     0
        390k  982k     0        22M       390k     0        16M     0
        390k  983k     0        22M       390k     0        16M     0

  [root@BSDRP]~# vmstat -i | head -1 ; vmstat -i | grep em
  interrupt                          total       rate
  irq17: em0 bge1                  2555941       2745
  irq18: em1 uhci2                 2555879       2745

  [root@BSDRP]~# top -nCHSIzs1
  last pid:  1334;  load averages:  0.00,  0.02,  0.05  up 0+00:15:38  10:17:26
  76 processes:  2 running, 58 sleeping, 16 waiting
  Mem: 9984K Active, 8432K Inact, 64M Wired, 17M Buf, 894M Free
  Swap:

    PID USERNAME PRI NICE   SIZE    RES STATE   TIME    CPU COMMAND
      0 root     -92    0     0K   176K -       9:30 93.65% kernel{em1 taskq}
      0 root     -92    0     0K   176K -       0:27  3.76% kernel{em0 taskq}
     11 root     -92    -     0K   256K WAIT    0:07  0.68% intr{irq17: em0 bge1}
     11 root     -92    -     0K   256K WAIT    0:06  0.59% intr{irq18: em1 uhci2}

  [root@BSDRP]~# vmstat -z | head -1 ; vmstat -z | grep -i mbuf
  ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
  mbuf_packet:            256,      0,    1024,     128,263652366,   0,   0
  mbuf:                   256,      0,       1,     137,     137,   0,   0
  mbuf_cluster:          2048, 262144,    1152,       6,    1152,   0,   0
  mbuf_jumbo_page:       4096,  12800,       0,       0,       0,   0,   0
  mbuf_jumbo_9k:         9216,   6400,       0,       0,       0,   0,   0
  mbuf_jumbo_16k:       16384,   3200,       0,       0,       0,   0,   0
  mbuf_ext_refcnt:          4,      0,       0,       0,       0,   0,   0

=> The router is still responding very well, but it reports a forwarding rate of only 390Kpps while the receiver measured 400Kpps: there is a 10Kpps gap between them. Let's check the switch statistics to break the tie:

  switch>sh int Gi0/10
  GigabitEthernet0/10 is up, line protocol is up
    Hardware is Gigabit Ethernet, address is 000c.307c.208a (bia 000c.307c.208a)
    Description: Receiver-em0
    MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
       reliability 255/255, txload 52/255, rxload 1/255
  (etc...)
    30 second input rate 0 bits/sec, 0 packets/sec
    30 second output rate 204464000 bits/sec, 399348 packets/sec

=> The switch statistics confirm that about 400Kpps are received: the FreeBSD counters miss about 10Kpps in this case. We therefore use the receiver statistics rather than the router statistics.

==== Default BSDRP values ====

The default BSDRP NIC parameters are used for this test:

  * The maximum number of received packets to process at a time is increased to 500
  * The number of transmit/receive descriptors per queue is increased to its maximum (4096)

  [root@BSDRP]~# sysctl hw.em.
  hw.em.rx_process_limit: 500
  hw.em.txd: 4096
  hw.em.rxd: 4096

Receiver results in Kpps for 5 tests (with a reboot between them):

  405.29
  405.23
  406.25
  405.20
  404.98

=> Throughput increases to 405Kpps: a small 5Kpps gain.
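For reference (this is not shown in the lab output): hw.em.rxd, hw.em.txd and hw.em.rx_process_limit are boot-time loader tunables, so on BSDRP they would typically be set in /boot/loader.conf.local and need a reboot to take effect. A minimal sketch of the corresponding entries:

  # em(4) loader tunables matching the BSDRP defaults used in this test
  hw.em.rxd="4096"
  hw.em.txd="4096"
  hw.em.rx_process_limit="500"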
Some router counters:

  [root@BSDRP]~# vmstat -i | head -1 ; vmstat -i | grep em
  interrupt                          total       rate
  irq17: em0 bge1                   181885        301
  irq18: em1 uhci2                  181884        301

  [root@BSDRP]~# top -nCHSIzs1
  last pid:  1344;  load averages:  0.00,  0.05,  0.07  up 0+00:10:19  10:31:13
  76 processes:  2 running, 58 sleeping, 16 waiting
  Mem: 9976K Active, 8300K Inact, 85M Wired, 17M Buf, 873M Free
  Swap:

    PID USERNAME PRI NICE   SIZE    RES STATE   TIME    CPU COMMAND
      0 root     -92    0     0K   176K -       3:39 93.07% kernel{em1 taskq}
      0 root     -92    0     0K   176K -       0:11  4.20% kernel{em0 taskq}

  [root@BSDRP]~# vmstat -z | head -1 ; vmstat -z | grep -i mbuf
  ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
  mbuf_packet:            256,      0,    8704,     512,110372794,   0,   0
  mbuf:                   256,      0,       1,     128,     135,   0,   0
  mbuf_cluster:          2048, 262144,    9216,       6,    9216,   0,   0
  mbuf_jumbo_page:       4096,  12800,       0,       0,       0,   0,   0
  mbuf_jumbo_9k:         9216,   6400,       0,       0,       0,   0,   0
  mbuf_jumbo_16k:       16384,   3200,       0,       0,       0,   0,   0
  mbuf_ext_refcnt:          4,      0,       0,       0,       0,   0,   0

=> The server is still responding very well. There is no IRQ storm and no mbuf problem: it is the em taskq thread that consumes all the CPU resources.

==== Removing rx limit ====

For this test, the limit on the maximum number of received packets to process at a time is completely disabled (-1):

  [root@BSDRP]~# sysctl hw.em.
  hw.em.rx_process_limit: -1
  hw.em.txd: 4096
  hw.em.rxd: 4096

Receiver results in Kpps for 5 tests (with a reboot between them):

  410.52
  409.65
  410.31
  408.92
  410.52

=> Performance increases to 410Kpps: a 10Kpps gain over the default values, and a 5Kpps gain over the tuned-but-still-limited values.

What about the router stats now:

  [root@BSDRP3]~# vmstat -i | head -1 ; vmstat -i | grep em
  interrupt                          total       rate
  irq17: em0 bge1                    81956         84
  irq18: em1 uhci2                   81928         84

  [root@BSDRP3]~# top -nCHSIzs1
  last pid:  1343;  load averages:  3.05,  2.29,  1.49  up 0+00:19:57  10:56:06
  80 processes:  5 running, 59 sleeping, 16 waiting
  Mem: 10M Active, 8348K Inact, 93M Wired, 17M Buf, 865M Free
  Swap:

    PID USERNAME PRI NICE   SIZE    RES STATE   TIME     CPU COMMAND
      0 root     -92    0     0K   176K -      17:02 100.00% kernel{em1 taskq}

=> The router is now very slow to respond, almost unusable: receiving packets consumes all of its CPU. Disabling the limit on the maximum number of received packets to process at a time is a bad idea on this server.

==== Results ====

Ministat graphs:

  x hw.em.txd-rxd=1024.hw.em.rx_proc_lim=100
  + hw.em.txd-rxd=4096.hw.em.rx_proc_lim=500
  * hw.em.txd-rxd=4096.hw.em.rx_proc_lim=-1
  +----------------------------------------------------------------------------------------------------+
  |x x                                              +                                                 *|
  |x x                                         + ++        +                        *      *     * *  |
  ||AM                                         |__M_A___|                             |______A__M__|   |
  +----------------------------------------------------------------------------------------------------+
      N           Min           Max        Median           Avg        Stddev
  x   5        399.88        400.15        400.09        400.03    0.11768602
  +   5        404.98        406.25        405.23        405.39     0.4948232
  Difference at 95.0% confidence
          5.36 +/- 0.524533
          1.3399% +/- 0.131123%
          (Student's t, pooled s = 0.359653)
  *   5        408.92        410.52        410.31       409.984    0.69363535
  Difference at 95.0% confidence
          9.954 +/- 0.725551
          2.48831% +/- 0.181374%
          (Student's t, pooled s = 0.497484)
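These graphs are produced with ministat(8) from the FreeBSD base system. A minimal sketch of the invocation, assuming each set of five receiver results was saved, one value per line, in a file named after the corresponding legend entry:

  # Each input file holds the five Kpps results of one tuning, one value per line
  ministat -w 100 hw.em.txd-rxd=1024.hw.em.rx_proc_lim=100 \
      hw.em.txd-rxd=4096.hw.em.rx_proc_lim=500 \
      hw.em.txd-rxd=4096.hw.em.rx_proc_lim=-1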
===== Firewall impact =====

==== IPFW ====

Now we will test the impact of enabling a simple IPFW rule set:

  cat > /etc/ipfw.rules <<'EOF'
  #!/bin/sh
  fwcmd="/sbin/ipfw"
  # Flush out the list before we begin.
  ${fwcmd} -f flush
  ${fwcmd} add 3000 allow ip from any to any
  EOF
  echo 'firewall_enable="YES"' >> /etc/rc.conf
  echo 'firewall_script="/etc/ipfw.rules"' >> /etc/rc.conf

Receiver results in Kpps for 5 tests (with a reboot between them):

  320.63
  320.13
  320.79
  320.18
  320.52

=> Throughput is reduced to 320Kpps: enabling ipfw costs about 80Kpps on this server. The router still responds perfectly on the CLI.

==== PF ====

  cat > /etc/pf.conf <<'EOF'
  set skip on lo0
  pass
  EOF
  echo 'pf_enable="YES"' >> /etc/rc.conf

Receiver results in Kpps for 5 tests (with a reboot between them):

  272.40
  274.78
  272.56
  275.65
  274.51

=> Very big performance impact here! Throughput drops to 274Kpps and the router is not responsive at all: if the watchdog is enabled, it will trigger a reboot of the router.

==== Results ====

Ministat graphs:

  x ipfw
  + pf
  +----------------------------------------------------------------------------------------------------+
  |+                                                                                                  xx|
  |+   ++ +                                                                                           xx|
  ||__AM_|                                                                                            A||
  +----------------------------------------------------------------------------------------------------+
      N           Min           Max        Median           Avg        Stddev
  x   5        320.13        320.79        320.52        320.45    0.28644371
  +   5         272.4        275.65        274.51        273.98     1.4337538
  Difference at 95.0% confidence
          -46.47 +/- 1.50781
          -14.5015% +/- 0.47053%
          (Student's t, pooled s = 1.03385)
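As a side note (not part of the measurements above), the firewall counters on the router are a quick way to confirm which rule set actually processed the test traffic during each run:

  # IPFW: list rules together with their packet/byte counters
  ipfw -a list
  # PF: per-rule counters and global state/counters
  pfctl -v -s rules
  pfctl -s info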