documentation:technical_docs:performance

Differences

This shows you the differences between two versions of the page.

documentation:technical_docs:performance [2012/05/20 01:59]
127.0.0.1 external edit
documentation:technical_docs:performance [2013/05/17 14:13] (current)
olivier [Guides]
Line 1: Line 1:
-====== FreeBSD ​networking ​Performance ====== +====== FreeBSD ​forwarding ​Performance ====== 
-{{description>​Tips and information about forwarding performance}} +{{description>​Tips and information about FreeBSD ​forwarding performance}} 
-FreeBSD network architecture [[http://lists.freebsd.org/pipermail/freebsd-current/2009-September/011148.html|is limited to 700kpps (700 000 paquet per second)]]: This is the big difference with [[http://www.vyatta.com|the linux kernel used in Vyatta]] that don't have this limitation. +There are lots of guides about [[http://serverfault.com/questions/64356/freebsd-performance-tuning-sysctls-loader-conf-kernel|tuning FreeBSD TCP performance]] (where the FreeBSD host is an end-point of the TCP session), but this is not the same as tuning forwarding performance (where the FreeBSD host does not have to read the TCP information of the packet being forwarded).
-There are lot's of guide about [[http://http://fasterdata.es.net/TCP-tuning/​FreeBSD.html|tuning FreeBSD TCP performance]] (where the FreeBSD host is an end-point of the TCP session), but it's not the same that tunig forwarding performance (where the FreeBSD host don't have to read the TCP information of the packet ​that is forwarding).+
  
 ===== Concept ===== ===== Concept =====
  
-Basics:+==== How to bench a router ==== 
 + 
 +Benchmarking a router is not the same as measuring the maximum bandwidth crossing the router:
 +  * [[http://​www.ietf.org/​rfc/​rfc2544.txt|RFC2544:​ Benchmarking Methodology for Network Interconnect Devices]] 
 +  * [[http://​tools.ietf.org/​html/​rfc3222|RFC3222:​ Terminology for Forwarding Information Base (FIB) based Router Performance]] 
 + 
 +==== Definition ==== 
 + 
 +A clear definition of these metrics and of how they relate to each other is mandatory:
   * [[http://​www.cisco.com/​web/​about/​security/​intelligence/​network_performance_metrics.html|Bandwidth,​ Packets Per Second, and Other Network Performance Metrics]]   * [[http://​www.cisco.com/​web/​about/​security/​intelligence/​network_performance_metrics.html|Bandwidth,​ Packets Per Second, and Other Network Performance Metrics]]
-  * [[http://​www.erg.abdn.ac.uk/​~gorry/​eg3567/​lan-pages/​enet-calc.html|Ethernet frame calculation]]+ 
 +===== Benchmarks ===== 
 + 
 +==== Cisco or Linux ==== 
   * [[http://​www.cisco.com/​web/​partners/​downloads/​765/​tools/​quickreference/​routerperformance.pdf|Routing performance of Cisco routers]] (PDF)   * [[http://​www.cisco.com/​web/​partners/​downloads/​765/​tools/​quickreference/​routerperformance.pdf|Routing performance of Cisco routers]] (PDF)
-  * [[http://​wiki.networksecuritytoolkit.org/​nstwiki/​index.php/​LAN_Ethernet_Maximum_Rates,​_Generation,​_Capturing_%26_Monitoring|LAN Ethernet Maximum Rates, Generation, Capturing & Monitoring]]+  ​* [[http://​www.telematica.polito.it/​oldsite/​courmayeur06/​papers/​06-A.2.1.pdf|RFC2544 Performance Evaluation for a Linux Based Open Router]] 
 +  * [[http://data.guug.de/slides/lk2008/10G_preso_lk2008.pdf|Towards 10Gb/s open-source routing]]: Includes a hardware comparison between a "real" router and a PC.
 +==== FreeBSD ==== 
 + 
 +Here are some benchmarks of the network forwarding performance of FreeBSD:
 +  * [[http://lists.freebsd.org/pipermail/freebsd-net/2012-July/032832.html|FreeBSD as 10 Gigabit router-on-a-stick]]: About 1Mpps; this thread has lots of very useful tips.
 +  * [[http://​www.net.t-labs.tu-berlin.de/​papers/​SWF-PCCH10GEE-07.pdf|Packet capture in 10-Gigabit environments using Contemporary Commodity Hardware 
 +]] (pdf) 
 + 
 +===== Bench lab ===== 
 + 
 +The bench lab should make it possible to measure the pps. To obtain accurate results, [[http://www.ietf.org/rfc/rfc2544.txt|RFC 2544 (Benchmarking Methodology for Network Interconnect Devices)]] is a good reference.
 + 
 +==== Packet generator ==== 
 + 
 +Some packet generators:
 +  ​* [[http://​wiki.networksecuritytoolkit.org/​nstwiki/​index.php/​LAN_Ethernet_Maximum_Rates,​_Generation,​_Capturing_%26_Monitoring|LAN Ethernet Maximum Rates, Generation, Capturing & Monitoring]] ​... on GNU/Linux 
 +  * pkt-gen from the netmap suite 
 + 
 +===== Tuning FreeBSD ===== 
 + 
 +==== Guides ====
  
 Here is a list of sources about optimizing forwarding performance under FreeBSD. Here is a list of sources about optimizing forwarding performance under FreeBSD.
Line 16: Line 48:
 How to bench or tune the network stack: How to bench or tune the network stack:
   * [[http://wiki.freebsd.org/NetworkPerformanceTuning |  FreeBSD Network Performance Tuning]]: What needs to be done to tune the networking stack   * [[http://wiki.freebsd.org/NetworkPerformanceTuning |  FreeBSD Network Performance Tuning]]: What needs to be done to tune the networking stack
-  * [[https://​calomel.org/​network_performance.html | Calomel.org advice for 10Giga tunning]]: A simple and rapid guide+  * [[https://​calomel.org/​network_performance.html | Calomel.org advice for 10Giga tunning]]: ​simple and rapid guide
   * [[http://​www.freebsd.org/​projects/​netperf/​index.html|FreeBSD Network Performance Project (netperf)]]   * [[http://​www.freebsd.org/​projects/​netperf/​index.html|FreeBSD Network Performance Project (netperf)]]
   * [[http://​www.watson.org/​~robert/​freebsd/​netperf/​20051027-eurobsdcon2005-netperf.pdf|Introduction to Multithreading and Multiprocessing in the FreeBSD SMPng Network Stack]], EuroBSDCon 2005 (PDF)   * [[http://​www.watson.org/​~robert/​freebsd/​netperf/​20051027-eurobsdcon2005-netperf.pdf|Introduction to Multithreading and Multiprocessing in the FreeBSD SMPng Network Stack]], EuroBSDCon 2005 (PDF)
Line 22: Line 54:
   * [[http://​wwwx.cs.unc.edu/​~krishnan/​classes/​spring_07/​os_impl/​report.pdf|Improving Memory and Interrupt Processing in FreeBSD Network Stack]] (PDF)   * [[http://​wwwx.cs.unc.edu/​~krishnan/​classes/​spring_07/​os_impl/​report.pdf|Improving Memory and Interrupt Processing in FreeBSD Network Stack]] (PDF)
   * [[http://​conferences.sigcomm.org/​sigcomm/​2009/​workshops/​presto/​papers/​p37.pdf|Optimizing the BSD Routing System for Parallel Processing]] (PDF)   * [[http://​conferences.sigcomm.org/​sigcomm/​2009/​workshops/​presto/​papers/​p37.pdf|Optimizing the BSD Routing System for Parallel Processing]] (PDF)
-  * [[https://​people.sunyit.edu/​~sengupta/​CSC521/​systemperformance.pp|Using netstat and vmstat for performance analysis]] (Powerpoint))+  * [[https://​people.sunyit.edu/​~sengupta/​CSC521/​systemperformance.ppt|Using netstat and vmstat for performance analysis]] (Powerpoint))
   * [[http://www.freebsd.org/cgi/man.cgi?query=polling&sektion=4|polling man page]] (Warning: enabling polling is not a good idea with the new generation of Ethernet controllers, which include interrupt management)   * [[http://www.freebsd.org/cgi/man.cgi?query=polling&sektion=4|polling man page]] (Warning: enabling polling is not a good idea with the new generation of Ethernet controllers, which include interrupt management)
   * [[http://​info.iet.unipi.it/​~luigi/​polling/​|Device Polling support for FreeBSD ]], the original presentation of polling implementation   * [[http://​info.iet.unipi.it/​~luigi/​polling/​|Device Polling support for FreeBSD ]], the original presentation of polling implementation
Line 29: Line 61:
 FreeBSD Experimental high-performance network stacks: FreeBSD Experimental high-performance network stacks:
   * [[http://​info.iet.unipi.it/​~luigi/​netmap/​|Netmap - memory mapping of network devices]] //"​(...)a single core running at 1.33GHz can generate the 14.8Mpps that saturate a 10GigE interface."//​   * [[http://​info.iet.unipi.it/​~luigi/​netmap/​|Netmap - memory mapping of network devices]] //"​(...)a single core running at 1.33GHz can generate the 14.8Mpps that saturate a 10GigE interface."//​
-  * [[http://​wiki.freebsd.org/​AlexandreFiveg/​RingmapOverview | ringmap ]] : Packet capturing in high-speed networks  ​ 
-===== Existing benchmarks ===== 
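
The 14.8Mpps figure quoted for netmap is the theoretical line rate of 10GigE with minimum-size frames. A minimal sketch of that arithmetic, assuming standard Ethernet framing (64-byte minimum frame plus 20 bytes of preamble, start-of-frame delimiter and inter-frame gap); the function name `max_pps` is only illustrative:

```python
# Bytes sent on the wire for every frame but not counted in the frame size:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def max_pps(link_bps, frame_bytes=64):
    """Theoretical maximum number of frames per second on an Ethernet link."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


if __name__ == "__main__":
    for name, bps in (("1GigE", 10**9), ("10GigE", 10**10)):
        print(f"{name}: {max_pps(bps) / 1e6:.2f} Mpps with 64-byte frames")
```

With larger frames the packet rate drops quickly, which is why a bandwidth figure alone says little about forwarding performance.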
  
-Here are some benchs regarding network forwarding performance of FreeBSD: +==== Where is the bottleneck ? ====
-  * [[http://​www.net.t-labs.tu-berlin.de/​papers/​SWF-PCCH10GEE-07.pdf|Packet capture in 10-Gigabit environments using Contemporary Commodity Hardware +
-]] (pdf) +
-  * [[http://​www.tancsa.com/​blast.html|BSD firewall / Router Performance]] : Very interesting test measuring ​the pps (packet per second) between all BSDs and Linux +
-  * [[http://​wiki.freebsd.org/​WrapTest|PC Engines WRAP Test]] : Easy to to test, but measuring bandwith only, between differents FreeBSD release on a PC Engines WRAP.+
  
-Hardware advice+Tools
-  * [[https://calomel.org/network_performance.html|Network Tuning and Performance Guide (OpenBSD)]]: Very good guide for choosing hardware.+  * [[http://www.freebsd.org/cgi/man.cgi?​query=netstat|netstat]]: show network status 
 +  * [[http://​www.freebsd.org/​cgi/​man.cgi?​query=vmstat|vmstat]]:​ report virtual memory statistics 
 +  * [[http://​www.freebsd.org/​cgi/​man.cgi?​query=top|top]]:​ display and update information about the top cpu processes
  
-===== Bench method ===== 
  
-The bench lab should permit to measure the pps. For obtaining accurate result the [[http://​www.ietf.org/​rfc/​rfc2544.txt|RFC 2544 (Benchmarking Methodology for Network Interconnect Devices)]] is a good reference.+=== Packet traffic ===
  
-===== Routing Daemon Performance =====+Display packet traffic information, refreshed every second.
  
-Here are some links about some comparaisons between routing daemon: +Here is a first example:
-  * [[http://​www.nanog.org/​meetings/​nanog48/​presentations/​Monday/​Jasinska_RouteServer_N48.pdf | As route server (Quagga vs Openbgp vs bird)]]+
  
 +<​code>​
 +[root@BSDRP3]~#​ netstat -i -h -w 1
 +            input        (Total) ​          ​output
 +   ​packets ​ errs idrops ​     bytes    packets ​ errs      bytes colls
 +      370k     ​0 ​    ​0 ​       38M       ​370k ​    ​0 ​       38M     0
 +      369k     ​0 ​    ​0 ​       38M       ​368k ​    ​0 ​       38M     0
 +      370k     ​0 ​    ​0 ​       38M       ​370k ​    ​0 ​       38M     0
 +      373k     ​0 ​    ​0 ​       38M       ​376k ​    ​0 ​       38M     0
 +      370k     ​0 ​    ​0 ​       38M       ​368k ​    ​0 ​       38M     0
 +      368k     ​0 ​    ​0 ​       38M       ​368k ​    ​0 ​       38M     0
 +      368k     ​0 ​    ​0 ​       38M       ​369k ​    ​0 ​       38M     0
 +</​code>​
  
-===== Find the bootleneck =====+=> This system is forwarding 370Kpps (in and out) without any in/out errs (the packet generator used netblast with a 64B packet size at 370Kpps).
  
-Tools: +Here is a second example:
-  * [[http://​www.freebsd.org/​cgi/​man.cgi?​query=netstat&​apropos=0&​sektion=0&​manpath=FreeBSD+7.2-RELEASE&​format=html|netstat]]:​ show network status +
-  * [[http://​www.freebsd.org/​cgi/​man.cgi?​query=vmstat&​apropos=0&​sektion=0&​manpath=FreeBSD+7.2-RELEASE&​format=html|vmstat]]:​ report virtual memory statistics +
-  * [[http://​www.freebsd.org/​cgi/​man.cgi?​query=iostat&​apropos=0&​sektion=0&​manpath=FreeBSD+7.2-RELEASE&​format=html|iostat]]:​ report I/O statistics+
  
- 
-==== Statistics ==== 
- 
-Display the information regarding packet traffic, with refresh each second 
 <​code>​ <​code>​
-netstat --I nfe0 +[root@BSDRP3]~# netstat -ihw 1
- +            input        (Total)           ​output 
- +   ​packets ​ errs idrops ​     ​bytes ​   packets ​ errs      bytes colls 
-            input         ​(vge0)           ​output +      ​399k ​ 915k     0        ​25M       ​395k ​    0        ​24M ​    0 
-   ​packets ​ errs      bytes    packets ​ errs      bytes colls +      ​398k ​ 914k     0        ​24M       ​398k ​    0        ​24M ​    0 
-       303     0      ​21364 ​       711     0    ​1059721 ​    0 +      ​399k ​ 915k     ​0 ​       25M       ​399k     0        ​25M     0 
-       233     0      ​15276 ​       545     0     816149 ​    0 +      398k  915k     ​0        24M       ​397k ​    ​0 ​       24M     0 
-       188     0      ​14346 ​       508     0     752803 ​    0 +      ​399k ​ 914k     ​0 ​       25M       ​398k     0        ​24M ​    0 
-       298     0      ​19451        692     0    ​1040904 ​    0 +      398k  914k     ​0 ​       24M       ​400k ​    0        ​25M ​    0 
-       263     0      ​18366 ​       531     0     779491 ​    0 +      ​398k ​ 915k     ​0 ​       24M       ​396k     0        ​24M     0 
-       239     0      ​19480 ​       535     0     785283 ​    0 +      400k  915k     ​0        25M       ​401k ​    ​0 ​       25M     0 
-       236     0      ​27161 ​       488     0     715859 ​    0+      ​397k ​ 914k     ​0 ​       24M       ​397k     0        ​24M     0 
 +      398k  914k     ​0        24M       ​399k ​    ​0 ​       25M     0 
 +      ​400k ​ 914k     ​0 ​       25M       ​401k     0        ​25M     0 
 +      398k  914k     ​0        24M       ​397k ​    ​0 ​       24M     0
 </​code>​ </​code>​
  
-Or more cool using systat: +=> This system is forwarding about 400Kpps (in and out), but it is overloaded: it drops (errs) about 914Kpps (the generator used netmap pkt-gen with a 64B packet size at a rate of 1.34Mpps).
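
The share of offered traffic lost on input can be estimated from one line of this output. The sketch below assumes that the errs column is not included in the packets column (consistent with 399k + 915k being close to the generator rate) and that `netstat -h` uses the k/M/G suffixes seen above; the helper names are hypothetical:

```python
# Multipliers for the humanized suffixes printed by "netstat -h" (assumed).
SUFFIXES = {"k": 1e3, "M": 1e6, "G": 1e9}


def parse_humanized(value):
    """Convert a value such as '399k' or '25M' to a plain number."""
    if value and value[-1] in SUFFIXES:
        return float(value[:-1]) * SUFFIXES[value[-1]]
    return float(value)


def input_loss_ratio(line):
    """Return errs / (packets + errs) for one 'netstat -ihw 1' data line.

    The first two columns are the input packets and the input errs.
    """
    fields = line.split()
    packets = parse_humanized(fields[0])
    errs = parse_humanized(fields[1])
    offered = packets + errs
    return errs / offered if offered else 0.0


if __name__ == "__main__":
    sample = "399k  915k     0        25M       395k     0        24M     0"
    print(f"input loss: {input_loss_ratio(sample):.1%}")
```

A ratio close to zero, as in the first example, means the router keeps up with the offered load.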
-<code> +
-systat ​-ifstat+
  
-                    /0   /​1 ​  /​2 ​  /​3 ​  /​4 ​  /​5 ​  /​6 ​  /​7 ​  /​8 ​  /​9 ​  /10 
-     Load Average ​  | 
  
-      Interface ​          ​Traffic ​              ​Peak ​               Total +=== Interrupt usage ===
-            lo0  in      0.121 KB/s          3.923 KB/s            3.693 GB +
-                 ​out ​    0.121 KB/s          3.923 KB/s            3.693 GB+
  
-           ​vge0 ​ in     31.579 KB/s         ​35.850 KB/s            1.039 GB +Report on the number of interrupts taken by each device since system startup.
-                 ​out ​    1.424 MB/s          1.593 MB/s            3.906 GB+
  
 +Here is a first example:
 +<​code>​
 +[root@BSDRP3]~#​ vmstat -i
 +interrupt ​                         total       rate
 +irq4: uart0                         ​6670 ​         5
 +irq14: ata0                            5          0
 +irq16: bge0                           ​27 ​         0
 +irq17: em0 bge1                  5209668 ​      4510
 +cpu0:​timer ​                      ​1299291 ​      1124
 +irq256: ahci0                       ​1172 ​         1
 +Total                            6516833 ​      5642
 </​code>​ </​code>​
  
-Display system-wide statistics for each network protocol+=> Notice that em0 and bge1 are sharing the same IRQ. This is not good news.
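
Shared IRQs can also be spotted automatically. This is a minimal sketch that assumes the `vmstat -i` layout shown above (name, device list, total, rate); the function name `shared_irqs` is hypothetical:

```python
# Excerpt of the "vmstat -i" output shown above.
SAMPLE = """\
interrupt                          total       rate
irq4: uart0                         6670          5
irq14: ata0                            5          0
irq16: bge0                           27          0
irq17: em0 bge1                  5209668       4510
cpu0:timer                       1299291       1124
irq256: ahci0                       1172          1
Total                            6516833       5642
"""


def shared_irqs(vmstat_output):
    """Map each IRQ used by more than one device to its list of devices."""
    shared = {}
    for line in vmstat_output.splitlines():
        if not line.startswith("irq"):
            continue  # skip the header, the timer lines and the total
        irq, _, rest = line.partition(":")
        tokens = rest.split()
        devices = tokens[:-2]  # the last two columns are total and rate
        if len(devices) > 1:
            shared[irq] = devices
    return shared


if __name__ == "__main__":
    for irq, devices in shared_irqs(SAMPLE).items():
        print(f"{irq} is shared by: {', '.join(devices)}")
```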
  
-<​code>​ +Here is a second example:
-netstat -s +
- +
-tcp: +
-        549838287 packets sent +
-                497401134 data packets (223809603 bytes) +
-                8557796 data packets (3621832021 bytes) retransmitted +
-                122752 data packets unnecessarily retransmitted +
-                4 resends initiated by MTU discovery +
-                30644686 ack-only packets (2815507 delayed) +
-                0 URG only packets +
-                6026 window probe packets +
-                12653388 window update packets +
-                575517 control packets +
-        313868917 packets received +
-                223261175 acks (for 221487405 bytes) +
-                23944038 duplicate acks +
-                0 acks for unsent data +
-                63662409 packets (558708967 bytes) received in-sequence +
-                511546 completely duplicate packets (100621861 bytes) +
-                2604 old duplicate packets +
-                10140 packets with some dup. data (1622759 bytes duped) +
-                3539217 out-of-order packets (373067493 bytes) +
-                2659 packets (2513 bytes) of data after window +
-                2513 window probes +
-                4846447 window update packets +
-                26645 packets received after close +
-                1415 discarded for bad checksums +
-                0 discarded for bad header offset fields +
-                0 discarded because packet too short +
-                22580 discarded due to memory problems +
-        123764 connection requests +
-        353961 connection accepts +
-        0 bad connection attempts +
-        1904 listen queue overflows +
-        1973 ignored RSTs in the windows +
-        456412 connections established (including accepts) +
-        477764 connections closed (including 10055 drops) +
-                76872 connections updated cached RTT on close +
-                77503 connections updated cached RTT variance on close +
-                17518 connections updated cached ssthresh on close +
-        3341 embryonic connections dropped +
-        155166651 segments updated rtt (of 142193749 attempts) +
-        1633302 retransmit timeouts +
-                2642 connections dropped by rexmit timeout +
-        10747 persist timeouts +
-                7 connections dropped by persist timeout +
-        0 Connections (fin_wait_2) dropped because of timeout +
-        9 keepalive timeouts +
-                2 keepalive probes sent +
-                7 connections dropped by keepalive +
-        21436112 correct ACK header predictions +
-        55502379 correct data packet header predictions +
-        355472 syncache entries added +
-                4169 retransmitted +
-                1251 dupsyn +
-                0 dropped +
-                353961 completed +
-                0 bucket overflow +
-                0 cache overflow +
-                139 reset +
-                461 stale +
-                1904 aborted +
-                0 badack +
-                2 unreach +
-                0 zone failures +
-        355472 cookies sent +
-        995 cookies received +
-        2302476 SACK recovery episodes +
-        5464810 segment rexmits in SACK recovery episodes +
-        3538359759 byte rexmits in SACK recovery episodes +
-        24257592 SACK options (SACK blocks) received +
-        6659181 SACK options (SACK blocks) sent +
-        0 SACK scoreboard overflow +
-udp: +
-        34558 datagrams received +
-        0 with incomplete header +
-        0 with bad data length field +
-        0 with bad checksum +
-        0 with no checksum +
-        6663 dropped due to no socket +
-        0 broadcast/​multicast datagrams undelivered +
-        0 dropped due to full socket buffers +
-        0 not for hashed pcb +
-        27895 delivered +
-        27925 datagrams output +
-        0 times multicast source filter matched +
-(etc...)+
  
 +<​code>​
 +[root@BSDRP3]#​ vmstat -i
 +interrupt ​                         total       rate
 +irq4: uart0                        17869          0
 +irq14: ata0                            5          0
 +irq16: bge0                            1          0
 +irq17: em0 bge1                        2          0
 +cpu0:​timer ​                    ​214331752 ​      1125
 +irq256: ahci0                       ​1725 ​         0
 +Total                          214351354 ​      1126
 </​code>​ </​code>​
  
-==== Memory Buffer ====+=> Near-zero rates and counters for the NIC IRQs mean that polling is enabled. The interrupt management of current NICs makes polling unnecessary.
 + 
 +=== Memory Buffer ===
  
 Show statistics recorded by the memory management routines. The network manages a private pool of memory buffers. Show statistics recorded by the memory management routines. The network manages a private pool of memory buffers.
  
 <​code>​ <​code>​
-netstat -m +[root@BSDRP3]~# ​netstat -m 
- +5220/810/6030 mbufs in use (current/​cache/​total) 
- +5219/675/5894/512000 ​mbuf clusters in use (current/​cache/​total/​max) 
-925/575/1500 mbufs in use (current/​cache/​total) +5219/669 mbuf+clusters out of packet secondary zone in use (current/​cache) 
-679/385/1064/25600 mbuf clusters in use (current/​cache/​total/​max) +0/0/0/256000 ​4k (page size) jumbo clusters in use (current/​cache/​total/​max) 
-679/226 mbuf+clusters out of packet secondary zone in use (current/​cache) +0/0/0/128000 ​9k jumbo clusters in use (current/​cache/​total/​max) 
-244/54/298/12800 4k (page size) jumbo clusters in use (current/​cache/​total/​max) +0/0/0/64000 16k jumbo clusters in use (current/​cache/​total/​max) 
-0/0/0/6400 9k jumbo clusters in use (current/​cache/​total/​max) +11743K/1552K/13295K ​bytes allocated to network (current/​cache/​total)
-0/0/0/3200 16k jumbo clusters in use (current/​cache/​total/​max) +
-2565K/1129K/3695K bytes allocated to network (current/​cache/​total)+
 0/0/0 requests for mbufs denied (mbufs/​clusters/​mbuf+clusters) 0/0/0 requests for mbufs denied (mbufs/​clusters/​mbuf+clusters)
 0/0/0 requests for jumbo clusters denied (4k/9k/16k) 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
-0/5/6656 sfbufs in use (current/​peak/​max)+0/0/sfbufs in use (current/​peak/​max)
 0 requests for sfbufs denied 0 requests for sfbufs denied
 0 requests for sfbufs delayed 0 requests for sfbufs delayed
 0 requests for I/O initiated by sendfile 0 requests for I/O initiated by sendfile
 0 calls to protocol drain routines 0 calls to protocol drain routines
 +</​code>​
  
 +Or more verbose:
 +
 +<​code>​
 +[root@BSDRP3]~#​ vmstat -z | head -1 ; vmstat -z | grep -i mbuf
 +ITEM                   ​SIZE ​ LIMIT     ​USED ​    ​FREE ​     REQ FAIL SLEEP
 +mbuf_packet: ​           256,      0,    5221,     ​667,​414103198, ​  ​0, ​  0
 +mbuf:                   ​256, ​     0,       ​1, ​    ​141, ​    ​135, ​  ​0, ​  0
 +mbuf_cluster: ​         2048, 512000, ​   5888,       ​6, ​   5888,   ​0, ​  0
 +mbuf_jumbo_page: ​      4096, 256000, ​      ​0, ​      ​0, ​      ​0, ​  ​0, ​  0
 +mbuf_jumbo_9k: ​        9216, 128000, ​      ​0, ​      ​0, ​      ​0, ​  ​0, ​  0
 +mbuf_jumbo_16k: ​      ​16384, ​ 64000, ​      ​0, ​      ​0, ​      ​0, ​  ​0, ​  0
 +mbuf_ext_refcnt: ​         4,      0,       ​0, ​      ​0, ​      ​0, ​  ​0, ​  0
 </​code>​ </​code>​
  
-==== Interrupt usage ====+=> No failures here (the FAIL column is 0 everywhere).
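
The FAIL column can be checked with a few lines of code. A minimal sketch, assuming the comma-separated `vmstat -z` layout shown above (SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP); the function name is hypothetical:

```python
# Excerpt of the "vmstat -z" output shown above.
SAMPLE = """\
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
mbuf_packet:            256,      0,    5221,     667,414103198,   0,   0
mbuf:                   256,      0,       1,     141,     135,   0,   0
mbuf_cluster:          2048, 512000,    5888,       6,    5888,   0,   0
"""


def zones_with_failures(vmstat_z_output):
    """Return {zone name: FAIL count} for every zone with failed requests."""
    failing = {}
    for line in vmstat_z_output.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # the header line has no "name:" prefix
        values = [v.strip() for v in rest.split(",")]
        # FAIL is the sixth value after the zone name.
        if len(values) < 6 or not values[5].isdigit():
            continue
        fail = int(values[5])
        if fail:
            failing[name.strip()] = fail
    return failing


if __name__ == "__main__":
    print(zones_with_failures(SAMPLE) or "no failed mbuf requests")
```

A non-zero FAIL count on mbuf_cluster would suggest that the cluster limit is too low for the offered load.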
  
-Report on the number of interrupts taken by each device since system startup:+=== CPU / NIC === 
 + 
 +top can give very useful information regarding the CPU/NIC affinity:
  
 <​code>​ <​code>​
-vmstat ​-i+[root@BSDRP3]~#​ top -nCHSIzs1 
 +last pid:  1392;  load averages: ​ 0.15,  0.48,  0.33  up 0+00:​22:​06 ​   15:44:26 
 +75 processes: ​ 2 running, 57 sleeping, 16 waiting
  
-interrupt                          total       rate +Mem: 10M Active, 8752K Inact, 79M Wired, 272K Cache, 17M Buf, 878M Free
-irq18: vge0                    517365969       1041 +Swap:
-irq20: atapci0                  34811697         70 + 
-cpu0: timer                    994124321 ​      2000 + 
-Total                         ​1546301987 ​      ​3112+  PID USERNAME PRI NICE   ​SIZE ​   RES STATE    TIME    CPU COMMAND 
 +    0 root     ​-92 ​   0     ​0K ​  176K -       ​16:24 96.39% kernel{em0 taskq} 
 +   11 root     ​-92 ​   -     ​0K ​  256K WAIT     ​1:​01 ​ 5.76% intr{irq17: em0 bge1}
 </​code>​ </​code>​
 +
 +=> Not a very interesting output: there is only one CPU here.
 +
 +Here is another example, on a two-core computer:
 +
 +<​code>​
 +[root@BSDRP2]~#​ top -nCHSIzs1 | awk '$5 ~ /(K|SIZE)/ { printf "%7s %2s %6s %10s %15s %s\n", $7, $8, $9, $10, $11, $12}'
 +  STATE  C   ​TIME ​       CPU         ​COMMAND
 +   ​CPU0 ​ 0   ​7:​23 ​    ​99.76% ​     kernel{em0 rxq}
 +    RUN  1   ​0:​44 ​     6.40%    intr{irq260:​ bce1}
 + ​istorm ​ 1   ​4:​18 ​     4.05%    intr{irq256:​ em0:rx
 +    RUN  1   ​0:​04 ​     0.68%    intr{irq258:​ em0:link}
 +</​code>​
 +
 +=> em0 is under an interrupt storm, and its receive queue thread uses almost 100% of the first CPU (CPU0).
 +
 +=== Drivers ===
 +
 +Depending on the NIC driver used, some counters are available:
 +
 +<​code>​
 +[root@BSDRP3]~#​ sysctl dev.em.0.mac_stats. | grep -v ': 0'
 +dev.em.0.mac_stats.missed_packets:​ 221189883
 +dev.em.0.mac_stats.recv_no_buff:​ 94987654
 +dev.em.0.mac_stats.total_pkts_recvd:​ 351270928
 +dev.em.0.mac_stats.good_pkts_recvd:​ 130081045
 +dev.em.0.mac_stats.bcast_pkts_recvd:​ 1
 +dev.em.0.mac_stats.rx_frames_64:​ 2
 +dev.em.0.mac_stats.rx_frames_65_127:​ 130081043
 +dev.em.0.mac_stats.good_octets_recvd:​ 14308901524
 +dev.em.0.mac_stats.good_octets_txd:​ 892
 +dev.em.0.mac_stats.total_pkts_txd:​ 10
 +dev.em.0.mac_stats.good_pkts_txd:​ 10
 +dev.em.0.mac_stats.bcast_pkts_txd:​ 2
 +dev.em.0.mac_stats.mcast_pkts_txd:​ 5
 +dev.em.0.mac_stats.tx_frames_64:​ 2
 +dev.em.0.mac_stats.tx_frames_65_127:​ 8
 +</​code>​
 +
 +=> Notice the high values of missed_packets and recv_no_buff.
 +This points to a performance problem with the NIC or its driver (in this example, the packet generator sends packets at a rate of about 1.38Mpps).
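
The share of frames missed by the NIC follows directly from these counters. A minimal sketch using three of the values above (in this output, good_pkts_recvd plus missed_packets equals total_pkts_recvd); the function names are hypothetical:

```python
# Three of the em(4) counters from the sysctl output shown above.
SAMPLE = """\
dev.em.0.mac_stats.missed_packets: 221189883
dev.em.0.mac_stats.total_pkts_recvd: 351270928
dev.em.0.mac_stats.good_pkts_recvd: 130081045
"""


def parse_mac_stats(sysctl_output):
    """Return {counter name: value}, keyed by the last OID component."""
    stats = {}
    for line in sysctl_output.splitlines():
        key, sep, value = line.rpartition(":")
        if sep and value.strip().isdigit():
            stats[key.strip().rsplit(".", 1)[-1]] = int(value)
    return stats


def missed_ratio(stats):
    """Fraction of the frames seen by the NIC that were missed."""
    total = stats["total_pkts_recvd"]
    return stats["missed_packets"] / total if total else 0.0


if __name__ == "__main__":
    stats = parse_mac_stats(SAMPLE)
    print(f"missed: {missed_ratio(stats):.1%} of received frames")
```

Here roughly 63% of the frames arriving at em0 are missed before the driver ever sees them.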
 +
documentation/technical_docs/performance.1337471945.txt.gz · Last modified: 2013/01/07 15:26 (external edit)