documentation:technical_docs:performance

Differences

This shows you the differences between two versions of the page.

documentation:technical_docs:performance [2019/03/26 00:22]
127.0.0.1 external edit
documentation:technical_docs:performance [2020/01/18 01:04] (current)
olivier [Choosing good Hardware]
Line 27: Line 27:
   * [[http://www.telematica.polito.it/oldsite/courmayeur06/papers/06-A.2.1.pdf|RFC2544 Performance Evaluation for a Linux Based Open Router]] (2006, June)
   * [[http://data.guug.de/slides/lk2008/10G_preso_lk2008.pdf|Towards 10Gb/s open-source routing]] (2008): Includes a hardware comparison between a "real" router and a PC.
 +  * [[https://wiki.fd.io/images/7/7b/Performance_Consideration_for_packet_processing_on_Intel_Architecture.pptx|Performance consideration for packet processing on Intel Architecture (ppt)]]
 ==== FreeBSD ====
  
Line 86: Line 87:
 Beware of configurations that prevent multi-queue use, like GRE, GIF, IPSec tunnels or PPPoE (= same source/destination address). If PPPoE is mandatory on your Gigabit Internet link, small hardware, like a 4-core AMD GX (PC Engines APU2), will not reach Gigabit speed.
 </​note>​ </​note>​
-==== Choosing good Hardware ====
+==== Choosing hardware ====
 === CPU ===
 
 Avoid NUMA architectures: prefer a single-package CPU with the maximum number of cores (8 or 16).
-If you are using NUMA, check that inbound/outbound NIC queues are correctly mapping to the same package.
+If you are using NUMA, you need to check that inbound/outbound NIC queues are correctly bound to their local package, to avoid unnecessary QPI crossings.
  
 === Network Interface Card ===
Line 104: Line 105:
 ==== Choosing good FreeBSD release ====
 
-Before tuning, you need to use the good FreeBSD version.
-This mean a FreeBSD -head version older than r309257 (Andrey V. Elsukov 's improvement: Rework ip_tryforward() to use FIB4 KPI) backported to FreeBSD 11-stable r310771 (MFC to stable).
+Before tuning, you need to use the right FreeBSD version: this means a recent FreeBSD -head.
  
-BSDRP since version 1.70 is using a FreeBSD 11-stable (r312663) that includes this improvement.
+BSDRP currently follows the FreeBSD 12-stable branch, as a compromise between recent features and stability.
+==== Disabling Hyper Threading (on specific CPU only) ====
  
-{{documentation:technical_docs:2016-performance-evolution.png|2016 Forwarding performance evolution of FreeBSD -head on a 8 core Atom}}
-
-For better (and linear scale) performance there is the [[https://svnweb.freebsd.org/base/projects/routing/|projects/routing]] too [[http://blog.cochard.me/2015/09/receipt-for-building-10mpps-freebsd.html|that still give better performance]].
-
-==== Disabling Hyper Threading ====
-
-Disable Hyper Threading (HT): By default, lot's of multi-queue NIC drivers create one queue per core.
-But "logical" cores didn't help at all for managing interrupts generated by high speed NIC.
+By default, multi-queue NIC drivers create one queue per core, logical cores included.
+But on some older CPUs (like the Xeon E5-2650 v1), those logical cores do not help at all with managing the interrupts generated by a high-speed NIC.
  
 HT can be disabled with this command:
Line 123: Line 118:
 </​code>​ </​code>​
  
-Here is an example on a 8cores x hardware threads Intel CPU and 10G Chelsio NIC:
+Here is an example on a Xeon E5-2650 (8c, 16t) with a 10G Chelsio NIC, where disabling HT improves performance:
  
 <​code>​ <​code>​
-x HT-enabled-8rxq(default).packets-per-seconds
-+ HT-enabled-16rxq.packets-per-seconds
-* HT-disabled.packets-per-seconds
+x HT-enabled-8rxq(default): inet packets-per-second forwarded
++ HT-enabled-16rxq: inet packets-per-second forwarded
+* HT-disabled-8rxq: inet packets-per-seconds forwarded
 +--------------------------------------------------------------------------+
 |                                                                        **|
Line 150: Line 145:
 </​code>​ </​code>​
  
-There is a benefit of about 24% to disable hyper threading.
+Disabling hyper threading gives a benefit of about 24% on this old CPU.
+
+But here is another example where there is a benefit to keeping it enabled (with the NIC configured to use all the threads), on a Xeon E5 2650L (10c, 20t):
 + 
 +<​code>​ 
 +x HT on, 8q (default): inet4 packets-per-second forwarded 
 ++ HT off, 8q: inet4 packets-per-second forwarded 
 +* HT on, 16q: inet4 packets-per-second forwarded 
 ++--------------------------------------------------------------------------+ 
 +|x x              ++                                                   ​* ​ *| 
 +|x xx            +++                                                 * *  *| 
 +||AM|            |A_|                                                |_MA_|| 
 ++--------------------------------------------------------------------------+ 
 +    N           ​Min ​          ​Max ​       Median ​          ​Avg ​       Stddev 
 +x   ​5 ​      ​4265579 ​    ​4433699.5 ​    ​4409249.5 ​    ​4359580.3 ​      ​81559.4 
 ++   ​5 ​      ​5257621 ​      ​5443012 ​      ​5372493 ​    ​5372693.5 ​    ​73316.243 
 +Difference at 95.0% confidence 
 +        1.01311e+06 +/- 113098 
 +        23.2388% +/- 2.94299% 
 +        (Student'​s t, pooled s = 77547.4) 
 +*   ​5 ​      ​8566972 ​      ​8917315 ​    ​8734750.5 ​    ​8769616.1 ​    ​147186.74 
 +Difference at 95.0% confidence 
 +        4.41004e+06 +/- 173536 
 +        101.157% +/- 5.21388% 
 +        (Student'​s t, pooled s = 118987) 
 +</​code>​
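
The "Difference at 95.0% confidence" lines in the output above can be re-derived from the Avg and Stddev columns alone. The following is a minimal sketch, not the tool's own code: the function name compare and the variable names are hypothetical, and the critical value 2.306 is an assumption taken from a standard Student's t table (two-sided 95%, 8 degrees of freedom for two series of 5 samples).

```python
import math

# Assumption: Student's t critical value for a two-sided 95% interval with
# 8 degrees of freedom (5 + 5 - 2), read from a standard t table.
T_CRIT_95_DF8 = 2.306


def compare(avg_a, sd_a, avg_b, sd_b, n=5, t_crit=T_CRIT_95_DF8):
    """Compare series b against baseline a (both with n samples).

    Returns (difference, margin, percent, pooled_s).
    t_crit must match the degrees of freedom 2 * n - 2.
    """
    diff = avg_b - avg_a
    # Pooled standard deviation for two samples of equal size.
    pooled_s = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    # Half-width of the confidence interval on the difference of the means.
    margin = t_crit * pooled_s * math.sqrt(2 / n)
    percent = 100 * diff / avg_a
    return diff, margin, percent, pooled_s


# (Avg, Stddev) pairs copied from the output above.
ht_on_8q = (4359580.3, 81559.4)
ht_off_8q = (5372693.5, 73316.243)
ht_on_16q = (8769616.1, 147186.74)

for label, series in (("HT off, 8q", ht_off_8q), ("HT on, 16q", ht_on_16q)):
    diff, margin, percent, pooled_s = compare(*ht_on_8q, *series)
    print(f"{label}: {diff:.6g} +/- {margin:.0f} pps "
          f"({percent:.2f}%), pooled s = {pooled_s:.1f}")
```

The printed differences, margins and pooled standard deviations should land close to the values reported above; small deviations come from the rounded critical value.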
  
 ==== fastforwarding ====
Line 235: Line 255:
 {{documentation:technical_docs:entropy_source_impact.png|Impact of disabling some entropy source on FreeBSD forwarding performance}}
  
-==== Polling mode ==== 
  
-Polling can be used in 2 cases: 
-   * On **old hardware only** (where Ethernet card doesn'​t support **Intelligent interrupt management**),​ using the polling mode can improve performance by reducing CPU interrupt 
-   * When used [[http://​lists.freebsd.org/​pipermail/​freebsd-net/​2013-May/​035626.html|for usage in a Virtual Machine]] but don't forgot to [[https://​lists.freebsd.org/​pipermail/​freebsd-net/​2015-March/​041657.html|overwrite the default HZ value in this case too]]. 
-For enabling polling mode: 
-  - Edit /​etc/​rc.conf.misc and replace //​polling_enable="​NO"//​ by //​polling_enable="​YES"//​ 
-  - Execute: service polling start 
- 
-=== NIC drivers compatibility matrix === 
- 
-BSDRP can use some special features on somes NIC: 
-  * [[http://​www.freebsd.org/​cgi/​man.cgi?​query=polling|Ethernet device polling]] for high performance with Ethernet controllers that didn't include interrupt management feature or [[http://​info.iet.unipi.it/​~luigi/​papers/​20130520-rizzo-vm.pdf|for usage in a VM]]. 
-  * [[http://​www.freebsd.org/​cgi/​man.cgi?​query=altq|ALTQ]] for queuing, but try to use [[http://​www.freebsd.org/​cgi/​man.cgi?​query=dummynet|dummynet]] in place 
- 
-And only theses devices support these modes: 
- 
-^ name      ^ Description ​   ^ Polling ​  ^ ALTQ       ^ 
-| ae  | Attansic/​Atheros L2 FastEthernet controller driver | no | yes |  
-| age | Attansic/​Atheros L1 Gigabit Ethernet driver | no | yes | 
-| alc | Atheros AR813x/​AR815x Gigabit/​Fast Ethernet driver | no | yes | 
-| ale | Atheros AR8121/​AR8113/​AR8114 Gigabit/​Fast Ethernet driver | no | yes | 
-| bce | Broadcom NetXtreme II (BCM5706/​5708/​5709/​5716) PCI/PCIe Gigabit Ethernet adapter driver | no | yes | 
-| bfe | Broadcom BCM4401 Ethernet Device Driver | no | yes | 
-| bge | Broadcom BCM570x/​5714/​5721/​5722/​5750/​5751/​5752/​5789 PCI Gigabit Ethernet adapter driver | yes | yes | 
-| cas | Sun Cassini/​Cassini+ and National Semiconductor DP83065 Saturn Gigabit Ethernet driver ​ | no | yes | 
-| cxgbe | Chelsio T4 and T5 based 40Gb, 10Gb, and 1Gb Ethernet adapter driver | no | yes | 
-| dc | DEC/Intel 21143 and clone 10/100 Ethernet driver | yes | yes | 
-| de | DEC DC21x4x Ethernet device driver | no | yes | 
-| ed | NE-2000 and WD-80x3 Ethernet driver | no | yes | 
-| em | Intel(R) PRO/1000 Gigabit Ethernet adapter driver | yes | yes | 
-| et | Agere ET1310 10/​100/​Gigabit Ethernet driver | no | yes | 
-| ep | Ethernet driver for 3Com Etherlink III (3c5x9) interfaces | no | yes | 
-| fxp | Intel EtherExpress PRO/100 Ethernet device driver | yes | yes | 
-| gem | ERI/​GEM/​GMAC Ethernet device driver | no | yes | 
-| hme | Sun Microelectronics STP2002-STQ Ethernet interfaces device driver | no | yes | 
-| igb | Intel(R) PRO/1000 PCI Express Gigabit Ethernet adapter driver | yes | needs IGB_LEGACY_TX | 
-| ixgb(e) | Intel(R) 10Gb Ethernet driver | yes | needs IGB_LEGACY_TX | 
-| jme | JMicron Gigabit/​Fast Ethernet driver | no | yes | 
-| le | AMD Am7900 LANCE and Am79C9xx ILACC/PCnet Ethernet interface driver | no | yes | 
-| msk | Marvell/​SysKonnect Yukon II Gigabit Ethernet adapter driver | no | yes | 
-| mxge | Myricom Myri10GE 10 Gigabit Ethernet adapter driver | no | yes | 
-| my | Myson Technology Ethernet PCI driver | no | yes | 
-| nfe | NVIDIA nForce MCP Ethernet driver | yes | yes | 
-| nge | National Semiconductor PCI Gigabit Ethernet adapter driver | yes | no | 
-| nve | NVIDIA nForce MCP Networking Adapter device driver | no | yes | 
-| qlxgb | QLogic 10 Gigabit Ethernet & CNA Adapter Driver | no | yes | 
-| re | RealTek 8139C+/​8169/​816xS/​811xS/​8101E PCI/PCIe Ethernet adapter driver | yes | yes | 
-| rl | RealTek 8129/8139 Fast Ethernet device driver | yes | yes | 
-| sf | Adaptec AIC‐6915 "​Starfire"​ PCI Fast Ethernet adapter driver | yes | yes | 
-| sge | Silicon Integrated Systems SiS190/191 Fast/​Gigabit Ethernet driver | no | yes | 
-| sis | SiS 900, SiS 7016 and NS DP83815/​DP83816 Fast Ethernet device driver | yes | yes | 
-| sk | SysKonnect SK-984x and SK-982x PCI Gigabit Ethernet adapter driver | yes | yes | 
-| ste | Sundance Technologies ST201 Fast Ethernet device driver | no | yes | 
-| stge | Sundance/​Tamarack TC9021 Gigabit Ethernet adapter driver | yes | yes | 
-| ti | Alteon Networks Tigon I and Tigon II Gigabit Ethernet driver | no | yes | 
-| txp | 3Com 3XP Typhoon/​Sidewinder (3CR990) Ethernet interface | no | yes | 
-| vge | VIA Networking Technologies VT6122 PCI Gigabit Ethernet adapter driver | yes | yes | 
-| vr | VIA Technologies Rhine I/II/III Ethernet device driver | yes | yes | 
-| xl | 3Com Etherlink XL and Fast Etherlink XL Ethernet device driver | yes | yes | 
- 
-Using others NIC will works too :-) 
 ==== NIC drivers tuning ==== ==== NIC drivers tuning ====
  
Line 746: Line 705:
 </​code>​ </​code>​
  
-On this case the bootleneck is just the network stack.
+In this case the bottleneck is just the network stack (most of the time is spent in the function ip_findroute, called by ip_tryforward).
  
 == CPU cycles spent ==
Line 761: Line 720:
 <​code>​ <​code>​
 pmcstat -z 50 -S cpu_clk_unhalted.thread -l 20 -O /data/pmc.out
 +pmcstat -R /​data/​pmc.out -z50 -G /​data/​pmc.stacks
 +less /​data/​pmc.stacks
 </​code>​ </​code>​
  
-Then analyses the output with:
-<code>
-fetch http://BSDRP-release-debug
-tar xzfv BSDRP-release-debug.tar.xz
-pmcannotate /data/pmc.out /data/debug/boot/kernel/kernel.symbols
-</code>
+=== Lock contention source ===
+
+To identify the source of lock contention (for example, if the functions lock_delay or __mtx_lock_sleep rank high in the pmc output), you can use lockstat to find which lock is contended and why.
+
+You can generate 2 outputs:
+  * contended locks broken down by type: <code>lockstat -x aggsize=4m sleep 10 > lock-type.txt</code>
+  * stacks associated with the lock contention, to identify the source: <code>lockstat -x aggsize=4m -s 10 sleep 10 > lock-stacks.txt</code>
documentation/technical_docs/performance.1553556169.txt.gz · Last modified: 2019/03/26 00:22 by 127.0.0.1