Forwarding performance lab of an IBM eServer 306m with Intel 82546GB

Hardware detail

This lab tests an IBM eServer xSeries 306m with a single core (Intel Pentium 4 3.00GHz, hyper-threading disabled) and a dual-port Intel 82546GB NIC connected to the PCI-X bus. Overloading a single-core server should be easy.

Lab set-up

The lab is detailed here: Setting up a forwarding performance benchmark lab.

BSDRP-amd64 v1.4 (FreeBSD 9.1) is used on the router.

Diagram

+-------------------+      +----------------------------------------+      +-------------------+
| Packet generator  |      |            Device under Test           |      |  Packet receiver  |
|    em0: 2.2.2.2   |=====>| em1: 2.2.2.3              em0: 1.1.1.3 |=====>|    em0: 1.1.1.1   |
| 00:1b:21:d5:66:15 |      | 00:0e:0c:de:45:df    00:0e:0c:de:45:de |      | 00:1b:21:d5:66:0e |
+-------------------+      +----------------------------------------+      +-------------------+

The generator uses this command:

pkt-gen -i em0 -t 0 -l 42 -d 1.1.1.1 -D 00:0e:0c:de:45:df -s 2.2.2.2 -w 10

The receiver uses this command:

pkt-gen -i em0 -w 10

Configuring

For this small lab, we will configure the router.

Disabling Ethernet flow-control

First, disable Ethernet flow-control:

echo "hw.em.0.fc=0" >> /etc/sysctl.conf
echo "hw.em.1.fc=0" >> /etc/sysctl.conf
sysctl hw.em.0.fc=0
sysctl hw.em.1.fc=0

Static ARP entries

Here are the modified values in the default BSDRP /etc/rc.conf for static ARP:

ifconfig_em0="inet 1.1.1.3/24"
ifconfig_em1="inet 2.2.2.3/24"
static_arp_pairs="receiver generator"
static_arp_receiver="1.1.1.1 00:1b:21:d5:66:0e"
static_arp_generator="2.2.2.2 00:1b:21:d5:66:15"

em(4) driver tuning with 82546GB

Default FreeBSD values

Default FreeBSD NIC parameters are used for this first test.

Edit BSDRP's /boot/loader.conf.local, comment out all NIC tuning, then reboot.

[root@BSDRP]~# sysctl hw.em.
hw.em.rx_process_limit: 100
hw.em.txd: 1024
hw.em.rxd: 1024

The generator pushes packets at 1.4Mpps:

root@generator:~ # pkt-gen -i em0 -t 0 -l 42 -d 1.1.1.1 -D 00:0e:0c:de:45:df -s 2.2.2.2 -w 10
main [832] ether_aton(00:0e:0c:de:45:df) gives 0x800f9b292
main [900] map size is 334980 Kb
main [922] mmapping 334980 Kbytes
Sending on em0: 1 queues, 1 threads and 1 cpus.
2.2.2.2 -> 1.1.1.1 (00:1b:21:d5:66:15 -> 00:0e:0c:de:45:df)
main [975] Wait 10 secs for phy reset
main [977] Ready...
sender_body [479] start
main [1085] 1405293 pps
main [1085] 1406363 pps
main [1085] 1406409 pps
main [1085] 1406509 pps
main [1085] 1406560 pps
main [1085] 1406439 pps
main [1085] 1405242 pps
...
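
As a point of reference, the generator's 1.4Mpps is close to the theoretical limit of a Gigabit Ethernet link for minimum-size frames. Here is a minimal sketch of that arithmetic, assuming the 42-byte packets requested with -l 42 go on the wire as 64-byte frames (CRC included) plus 20 bytes of preamble and inter-frame gap. The function name line_rate_pps is only illustrative:

```shell
# Theoretical frames per second on a link.
# $1: link speed in bit/s, $2: frame size in bytes (CRC included).
# Each frame costs 20 extra bytes on the wire:
# preamble + start-of-frame delimiter (8) and inter-frame gap (12).
line_rate_pps() {
    awk -v bps="$1" -v frame="$2" \
        'BEGIN { printf "%d\n", bps / ((frame + 20) * 8) }'
}

line_rate_pps 1000000000 64
```

This prints 1488095, so the measured 1.406Mpps is roughly 94% of the theoretical line rate.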

Meanwhile, the receiver shows this:

root@receiver:~ # pkt-gen -i em0 -w 10
main [900] map size is 334980 Kb
main [922] mmapping 334980 Kbytes
Receiving from em0: 1 queues, 1 threads and 1 cpus.
main [975] Wait 10 secs for phy reset
main [977] Ready...
receiver_body [621] waiting for initial packets, poll returns 0 0
main [1085] 0 pps
receiver_body [621] waiting for initial packets, poll returns 0 0
main [1085] 0 pps
main [1085] 159098 pps
main [1085] 400260 pps
main [1085] 400247 pps
...
main [1085] 400198 pps
main [1085] 400287 pps
main [1085] 400240 pps
main [1085] 400235 pps
main [1085] 400245 pps
main [1085] 400232 pps
main [1085] 215381 pps
main [1085] 0 pps
Received 30859400 packets, in 77.13 seconds.
Speed: 400.07Kpps.

Receiver results in Kpps for 5 tests (with a reboot between them):

400.15
400.10
400.09
399.88
399.93

⇒ The receiver measures 400Kpps.

Now, what about the router stats?

[root@BSDRP]~# netstat -ihw 1
            input        (Total)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
      387k  983k     0        22M       387k     0        15M     0
      389k  981k     0        22M       389k     0        16M     0
      390k  982k     0        22M       390k     0        16M     0
      390k  982k     0        22M       390k     0        16M     0
      390k  983k     0        22M       390k     0        16M     0
      390k  979k     0        22M       390k     0        16M     0
      390k  982k     0        22M       390k     0        16M     0
      390k  982k     0        22M       390k     0        16M     0
      390k  983k     0        22M       390k     0        16M     0
[root@BSDRP]~# vmstat -i | head -1 ; vmstat -i | grep em
interrupt                          total       rate
irq17: em0 bge1                  2555941       2745
irq18: em1 uhci2                 2555879       2745
[root@BSDRP]~# top -nCHSIzs1
last pid:  1334;  load averages:  0.00,  0.02,  0.05  up 0+00:15:38    10:17:26
76 processes:  2 running, 58 sleeping, 16 waiting

Mem: 9984K Active, 8432K Inact, 64M Wired, 17M Buf, 894M Free
Swap:


  PID USERNAME PRI NICE   SIZE    RES STATE    TIME    CPU COMMAND
    0 root     -92    0     0K   176K -        9:30 93.65% kernel{em1 taskq}
    0 root     -92    0     0K   176K -        0:27  3.76% kernel{em0 taskq}
   11 root     -92    -     0K   256K WAIT     0:07  0.68% intr{irq17: em0 bge1}
   11 root     -92    -     0K   256K WAIT     0:06  0.59% intr{irq18: em1 uhci2}

[root@BSDRP]~# vmstat -z | head -1 ; vmstat -z | grep -i mbuf
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
mbuf_packet:            256,      0,    1024,     128,263652366,   0,   0
mbuf:                   256,      0,       1,     137,     137,   0,   0
mbuf_cluster:          2048, 262144,    1152,       6,    1152,   0,   0
mbuf_jumbo_page:       4096,  12800,       0,       0,       0,   0,   0
mbuf_jumbo_9k:         9216,   6400,       0,       0,       0,   0,   0
mbuf_jumbo_16k:       16384,   3200,       0,       0,       0,   0,   0
mbuf_ext_refcnt:          4,      0,       0,       0,       0,   0,   0

⇒ The router is still very responsive, but it displays a forwarding rate of 390Kpps. The receiver measured 400Kpps, so there is a 10Kpps gap between them.

We need to check the switch stats as a tie-breaker:

switch>sh int Gi0/10
GigabitEthernet0/10 is up, line protocol is up
  Hardware is Gigabit Ethernet, address is 000c.307c.208a (bia 000c.307c.208a)
  Description: Receiver-em0
  MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
     reliability 255/255, txload 52/255, rxload 1/255
  (etc...)
  30 second input rate 0 bits/sec, 0 packets/sec
  30 second output rate 204464000 bits/sec, 399348 packets/sec

⇒ The switch stats confirm the 400Kpps measured by the receiver: FreeBSD's own counters miss about 10Kpps in this case.

We therefore need to use the receiver stats, not the router stats.
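
The switch counters also allow a quick sanity check of the frame size: dividing the output bit rate by the packet rate gives the bits per frame. This is a small sketch using the two values from the switch output above. frame_bytes is a hypothetical helper name:

```shell
# Average frame size in bytes from an interface's rate counters.
# $1: bits per second, $2: packets per second.
frame_bytes() {
    awk -v bps="$1" -v pps="$2" \
        'BEGIN { printf "%.1f\n", bps / pps / 8 }'
}

frame_bytes 204464000 399348
```

This prints 64.0, which is consistent with minimum-size Ethernet frames being forwarded.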

Default BSDRP values

BSDRP NIC parameters are used for this test:

  • The maximum number of received packets to process at a time is increased to 500
  • The number of transmit/receive descriptors per queue is increased to its maximum (4096)

[root@BSDRP]~# sysctl hw.em.
hw.em.rx_process_limit: 500
hw.em.txd: 4096
hw.em.rxd: 4096
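
These hw.em values are boot-time tunables. The exact lines in BSDRP's /boot/loader.conf.local are not shown on this page, so the following is only a sketch of how such values could be appended to a loader configuration file. The helper name set_em_tunables and its arguments are hypothetical:

```shell
# Append em(4) tunables to a loader configuration file.
# $1: target file, $2: rx_process_limit, $3: number of RX/TX descriptors.
# Example (then reboot):
#   set_em_tunables /boot/loader.conf.local 500 4096
set_em_tunables() {
    cat >> "$1" <<EOF
hw.em.rx_process_limit="$2"
hw.em.rxd="$3"
hw.em.txd="$3"
EOF
}
```

After the reboot, sysctl hw.em. should report the new values, as in the output above.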

Receiver results in Kpps for 5 tests (with a reboot between them):

405.29
405.23
406.25
405.20
404.98

⇒ Throughput increases to 405Kpps: a small 5Kpps gain.

Some router counters:

[root@BSDRP]~# vmstat -i | head -1 ; vmstat -i | grep em
interrupt                          total       rate
irq17: em0 bge1                   181885        301
irq18: em1 uhci2                  181884        301

[root@BSDRP]~# top -nCHSIzs1
last pid:  1344;  load averages:  0.00,  0.05,  0.07  up 0+00:10:19    10:31:13
76 processes:  2 running, 58 sleeping, 16 waiting

Mem: 9976K Active, 8300K Inact, 85M Wired, 17M Buf, 873M Free
Swap:

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME    CPU COMMAND
    0 root     -92    0     0K   176K -        3:39 93.07% kernel{em1 taskq}
    0 root     -92    0     0K   176K -        0:11  4.20% kernel{em0 taskq}

[root@BSDRP]~# vmstat -z | head -1 ; vmstat -z | grep -i mbuf
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
mbuf_packet:            256,      0,    8704,     512,110372794,   0,   0
mbuf:                   256,      0,       1,     128,     135,   0,   0
mbuf_cluster:          2048, 262144,    9216,       6,    9216,   0,   0
mbuf_jumbo_page:       4096,  12800,       0,       0,       0,   0,   0
mbuf_jumbo_9k:         9216,   6400,       0,       0,       0,   0,   0
mbuf_jumbo_16k:       16384,   3200,       0,       0,       0,   0,   0
mbuf_ext_refcnt:          4,      0,       0,       0,       0,   0,   0

⇒ The server is still very responsive.

There is neither an IRQ storm nor an mbuf problem: the em taskq consumes all the CPU resources.

Removing rx limit

For this test, we completely remove the limit on the number of received packets processed at a time (-1):

[root@BSDRP]~# sysctl hw.em.
hw.em.rx_process_limit: -1
hw.em.txd: 4096
hw.em.rxd: 4096

Receiver results in Kpps for 5 tests (with a reboot between them):

410.52
409.65
410.31
408.92
410.52

⇒ Performance increases to 410Kpps (a 10Kpps gain over the default values, and a 5Kpps gain over the tuned-but-still-limited values).

What about the router stats now?

[root@BSDRP3]~# vmstat -i | head -1 ; vmstat -i | grep em
interrupt                          total       rate
irq17: em0 bge1                    81956         84
irq18: em1 uhci2                   81928         84
[root@BSDRP3]~# top -nCHSIzs1
last pid:  1343;  load averages:  3.05,  2.29,  1.49  up 0+00:19:57    10:56:06
80 processes:  5 running, 59 sleeping, 16 waiting

Mem: 10M Active, 8348K Inact, 93M Wired, 17M Buf, 865M Free
Swap:

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME    CPU COMMAND
    0 root     -92    0     0K   176K -       17:02 100.00% kernel{em1 taskq}

⇒ The router is very slow to respond, almost unusable. Receiving packets consumes all of its CPU.

Removing the limit on the number of received packets processed at a time is a bad idea on this server.

Results

Ministat graphs:

x hw.em.txd-rxd=1024.hw.em.rx_proc_lim=100
+ hw.em.txd-rxd=4096.hw.em.rx_proc_lim=500
* hw.em.txd-rxd=4096.hw.em.rx_proc_lim=-1
+----------------------------------------------------------------------------------------------------+
|x x                                              +                                                * |
|x x                                            + ++       +                        *      *     * * |
||AM                                           |__M_A___|                              |______A__M__||
+----------------------------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5        399.88        400.15        400.09        400.03    0.11768602
+   5        404.98        406.25        405.23        405.39     0.4948232
Difference at 95.0% confidence
        5.36 +/- 0.524533
        1.3399% +/- 0.131123%
        (Student's t, pooled s = 0.359653)
*   5        408.92        410.52        410.31       409.984    0.69363535
Difference at 95.0% confidence
        9.954 +/- 0.725551
        2.48831% +/- 0.181374%
        (Student's t, pooled s = 0.497484)
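
The Avg and Stddev columns can be cross-checked without ministat. Here is a minimal sketch in awk, assuming one measurement per line on standard input and using the sample standard deviation (divisor N-1). The function name mean_stddev is illustrative:

```shell
# Print the mean and the sample standard deviation of the numbers
# read from standard input (one value per line).
mean_stddev() {
    awk '{ v[NR] = $1; sum += $1 }
         END {
             mean = sum / NR
             for (i = 1; i <= NR; i++) ss += (v[i] - mean) ^ 2
             printf "%.2f %.4f\n", mean, sqrt(ss / (NR - 1))
         }'
}

# Results with the default FreeBSD values
printf '%s\n' 400.15 400.10 400.09 399.88 399.93 | mean_stddev
```

This prints 400.03 0.1177, matching the first row of the table. The graph itself can be produced by saving each series to its own file and passing the files to ministat, for example ministat default.txt bsdrp.txt nolimit.txt (hypothetical file names).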

Firewall impact

IPFW

Now we will test the impact of enabling a simple IPFW ruleset:

cat > /etc/ipfw.rules <<'EOF'
#!/bin/sh
fwcmd="/sbin/ipfw"
# Flush out the list before we begin.
${fwcmd} -f flush
${fwcmd} add 3000 allow ip from any to any
EOF

echo 'firewall_enable="YES"' >> /etc/rc.conf
echo 'firewall_script="/etc/ipfw.rules"' >> /etc/rc.conf

Receiver results in Kpps for 5 tests (with a reboot between them):

320.63
320.13
320.79
320.18
320.52

⇒ Throughput is reduced to 320Kpps: enabling ipfw costs about 80Kpps on this server. The router still responds perfectly on the CLI.

PF

cat > /etc/pf.conf <<'EOF'
set skip on lo0
pass
EOF

echo 'pf_enable="YES"' >> /etc/rc.conf

Receiver results in Kpps for 5 tests (with a reboot between them):

272.40
274.78
272.56
275.65
274.51

⇒ Very big performance impact here! Throughput drops to 274Kpps and the router is not responsive at all: if the watchdog is enabled, it will trigger a reboot of the router.

Results

ministat graphs:

x ipfw
+ pf
+----------------------------------------------------------------------------------------------------+
|+                                                                                                 xx|
|+   ++ +                                                                                          xx|
||__AM_|                                                                                           A||
+----------------------------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5        320.13        320.79        320.52        320.45    0.28644371
+   5         272.4        275.65        274.51        273.98     1.4337538
Difference at 95.0% confidence
        -46.47 +/- 1.50781
        -14.5015% +/- 0.47053%
        (Student's t, pooled s = 1.03385)
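
The relative difference reported by ministat is the change in the averages expressed as a percentage of the first series. A short sketch of that calculation, with pct_diff as a hypothetical helper name:

```shell
# Relative difference between two averages, in percent of the first.
# $1: baseline average, $2: average to compare.
pct_diff() {
    awk -v base="$1" -v cur="$2" \
        'BEGIN { printf "%.2f\n", (cur - base) / base * 100 }'
}

# ipfw average vs pf average
pct_diff 320.45 273.98
```

This prints -14.50, in line with the -14.5015% shown above.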
documentation/examples/ibm_eserver_306m_with_intel_82546gb.txt · Last modified: 2013/04/11 06:13 (external edit)