Differences

This shows you the differences between two versions of the page.

--- documentation:technical_docs:performance [2019/03/26 00:22] – external edit 127.0.0.1
+++ documentation:technical_docs:performance [2019/11/21 17:16] – [Choosing good Hardware] olivier
@@ Line 90: / Line 90: @@
 Avoid NUMA architectures; prefer a single-package CPU with as many cores as possible (8 or 16).
-If you are using NUMA, check that inbound/outbound NIC queues are correctly mapping to the same package.
+If you are using NUMA, check that inbound/outbound NIC queues are correctly bound to their local package.
 === Network Interface Card ===
@@ Line 746: / Line 746: @@
 </code>
-On this case the bootleneck is just the network stack.
+In this case the bottleneck is just the network stack (most of the time is spent in the function ip_findroute, called by ip_tryforward).
 == CPU cycles spent ==
@@ Line 761: / Line 761: @@
 <code>
 pmcstat -z 50 -S cpu_clk_unhalted.thread -l 20 -O /data/pmc.out
+pmcstat -R /data/pmc.out -z50 -G /data/pmc.stacks
+less /data/pmc.stacks
 </code>
-Then analyses the output with:
+=== Lock contention source ===
-<code>
-fetch http://BSDRP-release-debug
+To identify the source of lock contention (for example, if the functions lock_delay or __mtx_lock_sleep rank high in the pmc output), you can use lockstat to find which lock is contended and why.
-tar xzfv BSDRP-release-debug.tar.xz
-pmcannotate /data/pmc.out /data/debug/boot/kernel/kernel.symbols
+You can generate two outputs:
-</code>
+  * contended locks broken down by type: <code>lockstat -x aggsize=4m sleep 10 > lock-type.txt</code>
+  * stacks associated with the lock contention, to identify the source: <code>lockstat -x aggsize=4m -s 10 sleep 10 > lock-stacks.txt</code>