documentation:examples:ibm_eserver_306m_with_intel_82546gb
— | documentation:examples:ibm_eserver_306m_with_intel_82546gb [2013/04/11 06:13] (current) – created - external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== Forwarding performance lab of an IBM eServer 306m with Intel 82546GB ====== | ||
+ | {{description> | ||
+ | ===== Hardware detail ===== | ||
+ | |||
+ | This lab tests an [[IBM eServer xSeries 306m]] with **one** core (Intel Pentium 4 3.00GHz, hyper-threading disabled) and a dual-port Intel 82546GB NIC connected to the PCI-X bus. | ||
+ | Overloading a single-core server should be easy. | ||
+ | |||
+ | ===== Lab set-up ===== | ||
+ | |||
+ | The lab is detailed here: [[documentation: | ||
+ | |||
+ | BSDRP-amd64 v1.4 (FreeBSD 9.1) is used on the router. | ||
+ | |||
+ | ==== Diagram ==== | ||
+ | |||
+ | < | ||
+ | +-------------------+ | ||
+ | | Packet generator | ||
+ | | em0: 2.2.2.2 | ||
+ | | 00: | ||
+ | +-------------------+ | ||
+ | </ | ||
+ | |||
+ | The generator will use this command: | ||
+ | < | ||
+ | pkt-gen -i em0 -t 0 -l 42 -d 1.1.1.1 -D 00: | ||
+ | </ | ||
+ | The receiver will use this command: | ||
+ | < | ||
+ | pkt-gen -i em0 -w 10 | ||
+ | </ | ||
+ | ===== Configuring ===== | ||
+ | |||
+ | For this small lab, we will configure the router. | ||
+ | |||
+ | ==== Disabling Ethernet flow-control ==== | ||
+ | |||
+ | First, disable Ethernet flow-control: | ||
+ | < | ||
+ | echo " | ||
+ | echo " | ||
+ | sysctl hw.em.0.fc=0 | ||
+ | sysctl hw.em.1.fc=0 | ||
+ | </ | ||
+ | |||
+ | ==== Static ARP entries ==== | ||
+ | |||
+ | Here is the modified value of the default BSDRP / | ||
+ | < | ||
+ | ifconfig_em0=" | ||
+ | ifconfig_em1=" | ||
+ | static_arp_pairs=" | ||
+ | static_arp_receiver=" | ||
+ | static_arp_generator=" | ||
+ | </ | ||
+ | |||
+ | ===== em(4) driver tuning with 82546GB ===== | ||
+ | |||
+ | ==== Default FreeBSD values ==== | ||
+ | |||
+ | Default FreeBSD NIC parameters are used for this first test. | ||
+ | |||
+ | Edit BSDRP / | ||
+ | |||
+ | < | ||
+ | [root@BSDRP]~# | ||
+ | hw.em.rx_process_limit: | ||
+ | hw.em.txd: 1024 | ||
+ | hw.em.rxd: 1024 | ||
+ | </ | ||
+ | |||
+ | The generator pushes packets at about 1.4Mpps: | ||
+ | |||
+ | < | ||
+ | root@generator: | ||
+ | main [832] ether_aton(00: | ||
+ | main [900] map size is 334980 Kb | ||
+ | main [922] mmapping 334980 Kbytes | ||
+ | Sending on em0: 1 queues, 1 threads and 1 cpus. | ||
+ | 2.2.2.2 -> 1.1.1.1 (00: | ||
+ | main [975] Wait 10 secs for phy reset | ||
+ | main [977] Ready... | ||
+ | sender_body [479] start | ||
+ | main [1085] 1405293 pps | ||
+ | main [1085] 1406363 pps | ||
+ | main [1085] 1406409 pps | ||
+ | main [1085] 1406509 pps | ||
+ | main [1085] 1406560 pps | ||
+ | main [1085] 1406439 pps | ||
+ | main [1085] 1405242 pps | ||
+ | ... | ||
+ | </ | ||
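To put the generator's 1.4Mpps in context, it helps to compare it with the theoretical maximum of a gigabit link. The sketch below assumes minimum-size 64-byte frames (FCS included) plus the standard 20 bytes of preamble and inter-frame gap per frame; the function name `line_rate_pps` is only illustrative.

```shell
#!/bin/sh
# Theoretical maximum packet rate of an Ethernet link.
# $1 = link speed in bit/s, $2 = frame size in bytes including the 4-byte FCS.
# Each frame also costs 8 bytes of preamble/SFD and 12 bytes of inter-frame gap.
line_rate_pps() {
    awk -v bps="$1" -v frame="$2" \
        'BEGIN { printf "%d\n", bps / ((frame + 20) * 8) }'
}

# Gigabit Ethernet with minimum-size frames.
line_rate_pps 1000000000 64    # prints 1488095
```

Under these assumptions the generator runs at roughly 94% of the gigabit line rate, so the link between the generator and the router is close to saturation.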
+ | |||
+ | Meanwhile the receiver shows this: | ||
+ | < | ||
+ | root@receiver: | ||
+ | main [900] map size is 334980 Kb | ||
+ | main [922] mmapping 334980 Kbytes | ||
+ | Receiving from em0: 1 queues, 1 threads and 1 cpus. | ||
+ | main [975] Wait 10 secs for phy reset | ||
+ | main [977] Ready... | ||
+ | receiver_body [621] waiting for initial packets, poll returns 0 0 | ||
+ | main [1085] 0 pps | ||
+ | receiver_body [621] waiting for initial packets, poll returns 0 0 | ||
+ | main [1085] 0 pps | ||
+ | main [1085] 159098 pps | ||
+ | main [1085] 400260 pps | ||
+ | main [1085] 400247 pps | ||
+ | ... | ||
+ | main [1085] 400198 pps | ||
+ | main [1085] 400287 pps | ||
+ | main [1085] 400240 pps | ||
+ | main [1085] 400235 pps | ||
+ | main [1085] 400245 pps | ||
+ | main [1085] 400232 pps | ||
+ | main [1085] 215381 pps | ||
+ | main [1085] 0 pps | ||
+ | Received 30859400 packets, in 77.13 seconds. | ||
+ | Speed: 400.07Kpps. | ||
+ | </ | ||
+ | |||
+ | Receiver results in Kpps for 5 tests (with a reboot between them): | ||
+ | < | ||
+ | 400.15 | ||
+ | 400.10 | ||
+ | 400.09 | ||
+ | 399.88 | ||
+ | 399.93 | ||
+ | </ | ||
+ | |||
+ | => The receiver measures 400Kpps. | ||
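The 400Kpps figure is simply the mean of the five runs listed above. A minimal sketch of that averaging, using a hypothetical helper named `mean`:

```shell
#!/bin/sh
# Print the arithmetic mean of the values given as arguments, to two decimals.
mean() {
    printf '%s\n' "$@" |
        awk '{ sum += $1; n++ } END { if (n) printf "%.2f\n", sum / n }'
}

# The five receiver results (Kpps) obtained with the default FreeBSD values.
mean 400.15 400.10 400.09 399.88 399.93    # prints 400.03
```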
+ | |||
+ | Now let's look at the router stats: | ||
+ | |||
+ | < | ||
+ | [root@BSDRP]~# | ||
+ | input (Total) | ||
+ | | ||
+ | 387k 983k | ||
+ | 389k 981k | ||
+ | 390k 982k | ||
+ | 390k 982k | ||
+ | 390k 983k | ||
+ | 390k 979k | ||
+ | 390k 982k | ||
+ | 390k 982k | ||
+ | 390k 983k | ||
+ | [root@BSDRP]~# | ||
+ | interrupt | ||
+ | irq17: em0 bge1 2555941 | ||
+ | irq18: em1 uhci2 | ||
+ | [root@BSDRP]~# | ||
+ | last pid: 1334; load averages: | ||
+ | 76 processes: | ||
+ | |||
+ | Mem: 9984K Active, 8432K Inact, 64M Wired, 17M Buf, 894M Free | ||
+ | Swap: | ||
+ | |||
+ | |||
+ | PID USERNAME PRI NICE | ||
+ | 0 root | ||
+ | 0 root | ||
+ | 11 root | ||
+ | 11 root | ||
+ | |||
+ | [root@BSDRP]~# | ||
+ | ITEM | ||
+ | mbuf_packet: | ||
+ | mbuf: | ||
+ | mbuf_cluster: | ||
+ | mbuf_jumbo_page: | ||
+ | mbuf_jumbo_9k: | ||
+ | mbuf_jumbo_16k: | ||
+ | mbuf_ext_refcnt: | ||
+ | |||
+ | </ | ||
+ | |||
+ | => The router still responds very well, but it displays a forwarding rate of only 390Kpps. | ||
+ | The receiver measured 400Kpps, so there is a 10Kpps gap between the two. | ||
+ | |||
+ | We need to check the switch stats as a tie-breaker: | ||
+ | |||
+ | < | ||
+ | switch> | ||
+ | GigabitEthernet0/ | ||
+ | Hardware is Gigabit Ethernet, address is 000c.307c.208a (bia 000c.307c.208a) | ||
+ | Description: | ||
+ | MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec, | ||
+ | | ||
+ | (etc...) | ||
+ | 30 second input rate 0 bits/sec, 0 packets/sec | ||
+ | 30 second output rate 204464000 bits/sec, 399348 packets/sec | ||
+ | </ | ||
+ | |||
+ | => The switch stats confirm the 400Kpps measured by the receiver: FreeBSD's own counters under-report by about 10Kpps in this case. | ||
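The switch counters can also be cross-checked against each other: dividing the output bit rate by the output packet rate gives the average frame size, which should be close to the 64-byte Ethernet minimum for this test. The sketch below assumes the switch includes the 4-byte FCS in its bit counter; `bytes_per_packet` is an illustrative name.

```shell
#!/bin/sh
# Average frame size in bytes, from a bit rate ($1) and a packet rate ($2).
bytes_per_packet() {
    awk -v bps="$1" -v pps="$2" 'BEGIN { printf "%.1f\n", bps / pps / 8 }'
}

# 30 second output rate reported by the switch.
bytes_per_packet 204464000 399348    # prints 64.0
```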
+ | |||
+ | < | ||
+ | |||
+ | ==== Default BSDRP values ==== | ||
+ | |||
+ | BSDRP NIC parameters are used for this second test: | ||
+ | * Maximum number of received packets to process at a time is increased to 500 | ||
+ | * Number of transmit/ | ||
+ | |||
+ | < | ||
+ | [root@BSDRP]~# | ||
+ | hw.em.rx_process_limit: | ||
+ | hw.em.txd: 4096 | ||
+ | hw.em.rxd: 4096 | ||
+ | </ | ||
+ | |||
+ | Receiver results in Kpps for 5 tests (with a reboot between them): | ||
+ | < | ||
+ | 405.29 | ||
+ | 405.23 | ||
+ | 406.25 | ||
+ | 405.20 | ||
+ | 404.98 | ||
+ | </ | ||
+ | |||
+ | => Throughput increases to 405Kpps: a small 5Kpps gain. | ||
+ | |||
+ | Some router counters: | ||
+ | |||
+ | < | ||
+ | [root@BSDRP]~# | ||
+ | interrupt | ||
+ | irq17: em0 bge1 | ||
+ | irq18: em1 uhci2 181884 | ||
+ | |||
+ | [root@BSDRP]~# | ||
+ | last pid: 1344; load averages: | ||
+ | 76 processes: | ||
+ | |||
+ | Mem: 9976K Active, 8300K Inact, 85M Wired, 17M Buf, 873M Free | ||
+ | Swap: | ||
+ | |||
+ | PID USERNAME PRI NICE | ||
+ | 0 root | ||
+ | 0 root | ||
+ | |||
+ | [root@BSDRP]~# | ||
+ | ITEM | ||
+ | mbuf_packet: | ||
+ | mbuf: | ||
+ | mbuf_cluster: | ||
+ | mbuf_jumbo_page: | ||
+ | mbuf_jumbo_9k: | ||
+ | mbuf_jumbo_16k: | ||
+ | mbuf_ext_refcnt: | ||
+ | |||
+ | </ | ||
+ | |||
+ | => The server still responds very well. | ||
+ | |||
+ | There is neither an IRQ storm nor an mbuf problem: the em taskq consumes all the CPU resources. | ||
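Since the CPU is the bottleneck, a rough per-packet cycle budget gives a feel for the cost of forwarding. This is only a back-of-the-envelope sketch: it assumes the nominal 3.00GHz clock and that the single core is entirely spent on forwarding; `cycles_per_packet` is an illustrative name.

```shell
#!/bin/sh
# Approximate CPU cycles available per forwarded packet.
# $1 = CPU clock in Hz, $2 = forwarding rate in packets per second.
cycles_per_packet() {
    awk -v hz="$1" -v pps="$2" 'BEGIN { printf "%d\n", hz / pps }'
}

# 3.00GHz core forwarding about 405Kpps.
cycles_per_packet 3000000000 405000    # prints 7407
```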
+ | |||
+ | ==== Removing rx limit ==== | ||
+ | |||
+ | For this test we remove the limit on the number of received packets processed at a time (-1): | ||
+ | < | ||
+ | [root@BSDRP]~# | ||
+ | hw.em.rx_process_limit: | ||
+ | hw.em.txd: 4096 | ||
+ | hw.em.rxd: 4096 | ||
+ | </ | ||
+ | |||
+ | Receiver results in Kpps for 5 tests (with a reboot between them): | ||
+ | < | ||
+ | 410.52 | ||
+ | 409.65 | ||
+ | 410.31 | ||
+ | 408.92 | ||
+ | 410.52 | ||
+ | </ | ||
+ | |||
+ | => Performance increases to 410Kpps (a 10Kpps gain over the default FreeBSD values, and a 5Kpps gain over the tuned-but-still-limited BSDRP values). | ||
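The gains quoted here can be recomputed from the per-test means. A minimal sketch, where `gain` is a hypothetical helper and 400.03 and 405.39 are the means of the five-run tables for the default FreeBSD and default BSDRP values:

```shell
#!/bin/sh
# Absolute and relative gain of a new mean ($2) over a baseline mean ($1).
gain() {
    awk -v base="$1" -v new="$2" 'BEGIN {
        diff = new - base
        printf "%.2f Kpps (%.2f%%)\n", diff, diff / base * 100
    }'
}

# Default FreeBSD values versus default BSDRP values.
gain 400.03 405.39    # prints 5.36 Kpps (1.34%)
```

The same helper applied to the mean of the unlimited test gives a gain of a little under 10Kpps, in line with the rounded figures above.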
+ | |||
+ | What about the router stats now: | ||
+ | < | ||
+ | [root@BSDRP3]~# | ||
+ | interrupt | ||
+ | irq17: em0 bge1 81956 84 | ||
+ | irq18: em1 uhci2 | ||
+ | [root@BSDRP3]~# | ||
+ | last pid: 1343; load averages: | ||
+ | 80 processes: | ||
+ | |||
+ | Mem: 10M Active, 8348K Inact, 93M Wired, 17M Buf, 865M Free | ||
+ | Swap: | ||
+ | |||
+ | PID USERNAME PRI NICE | ||
+ | 0 root | ||
+ | |||
+ | </ | ||
+ | |||
+ | => The router is very slow to respond, almost unusable. Receiving packets consumes all its CPU. | ||
+ | |||
+ | < | ||
+ | |||
+ | ==== Results ==== | ||
+ | |||
+ | Ministat graphs: | ||
+ | |||
+ | < | ||
+ | x hw.em.txd-rxd=1024.hw.em.rx_proc_lim=100 | ||
+ | + hw.em.txd-rxd=4096.hw.em.rx_proc_lim=500 | ||
+ | * hw.em.txd-rxd=4096.hw.em.rx_proc_lim=-1 | ||
+ | +----------------------------------------------------------------------------------------------------+ | ||
+ | |x x + * | | ||
+ | |x x + ++ | ||
+ | ||AM | ||
+ | +----------------------------------------------------------------------------------------------------+ | ||
+ | N | ||
+ | x | ||
+ | + | ||
+ | Difference at 95.0% confidence | ||
+ | 5.36 +/- 0.524533 | ||
+ | 1.3399% +/- 0.131123% | ||
+ | (Student' | ||
+ | * | ||
+ | Difference at 95.0% confidence | ||
+ | 9.954 +/- 0.725551 | ||
+ | 2.48831% +/- 0.181374% | ||
+ | (Student' | ||
+ | </ | ||
+ | |||
+ | ===== Firewall impact ===== | ||
+ | |||
+ | ==== IPFW ==== | ||
+ | Now we will test the impact of enabling a simple IPFW ruleset: | ||
+ | |||
+ | < | ||
+ | cat > / | ||
+ | #!/bin/sh | ||
+ | fwcmd="/ | ||
+ | # Flush out the list before we begin. | ||
+ | ${fwcmd} -f flush | ||
+ | ${fwcmd} add 3000 allow ip from any to any | ||
+ | ' | ||
+ | |||
+ | echo ' | ||
+ | echo ' | ||
+ | |||
+ | </ | ||
+ | |||
+ | Receiver results in Kpps for 5 tests (with a reboot between them): | ||
+ | < | ||
+ | 320.63 | ||
+ | 320.13 | ||
+ | 320.79 | ||
+ | 320.18 | ||
+ | 320.52 | ||
+ | </ | ||
+ | |||
+ | => Throughput reduced to 320Kpps: | ||
+ | |||
+ | ==== PF ==== | ||
+ | |||
+ | < | ||
+ | cat >/ | ||
+ | set skip on lo0 | ||
+ | pass | ||
+ | ' | ||
+ | |||
+ | echo ' | ||
+ | </ | ||
+ | |||
+ | Receiver results in Kpps for 5 tests (with a reboot between them): | ||
+ | < | ||
+ | 272.40 | ||
+ | 274.78 | ||
+ | 272.56 | ||
+ | 275.65 | ||
+ | 274.51 | ||
+ | </ | ||
+ | |||
+ | => Very big performance impact here! Throughput drops to 274Kpps and the router is not responsive at all: if the watchdog is enabled, it will trigger a reboot of the router. | ||
+ | |||
+ | ==== Results ==== | ||
+ | |||
+ | Ministat graphs: | ||
+ | |||
+ | < | ||
+ | x ipfw | ||
+ | + pf | ||
+ | +----------------------------------------------------------------------------------------------------+ | ||
+ | |+ xx| | ||
+ | |+ ++ + xx| | ||
+ | ||__AM_| | ||
+ | +----------------------------------------------------------------------------------------------------+ | ||
+ | N | ||
+ | x | ||
+ | + | ||
+ | Difference at 95.0% confidence | ||
+ | -46.47 +/- 1.50781 | ||
+ | -14.5015% +/- 0.47053% | ||
+ | (Student' | ||
+ | </ |
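A comparison like the one above can be reproduced by feeding the two result sets to ministat, one value per line. This is a sketch under assumptions: the file names `ipfw` and `pf` are chosen to match the legend of the graph, the temporary directory is arbitrary, and ministat is invoked without options, so the graph width may differ from the one shown.

```shell
#!/bin/sh
# Store each firewall's five receiver results (Kpps) in its own file,
# one value per line.
workdir=$(mktemp -d)
printf '%s\n' 320.63 320.13 320.79 320.18 320.52 > "$workdir/ipfw"
printf '%s\n' 272.40 274.78 272.56 275.65 274.51 > "$workdir/pf"

# ministat ships with FreeBSD; skip the comparison where it is not installed.
if command -v ministat >/dev/null 2>&1; then
    (cd "$workdir" && ministat ipfw pf)
else
    echo "ministat not found, data files are in $workdir"
fi
```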
documentation/examples/ibm_eserver_306m_with_intel_82546gb.txt · Last modified: 2013/04/11 06:13 by 127.0.0.1