[documentation] Mainly update the benchmarking document; plus minor other changes.

Ciprian Dorin Craciun 2021-12-23 19:50:15 +02:00
parent 55a0aaa396
commit ef7c360d27
6 changed files with 429 additions and 177 deletions
@@ -18,29 +18,102 @@ Benchmarks
--------
Summary
-------
.. important::
Last updated in December 2021!

Bottom line (**even on my 6 years old laptop**),
using only 1 core with 2 hyperthreads
(one core for the server, and a separate core for the load generator),
with HTTP Keep-Alive capped at 256k requests per connection:
* under normal conditions (16 concurrent connections),
I get around 105k requests / second,
at about 0.4ms latency for 99% of the requests;
* under normal conditions (64 concurrent connections),
I get around 107k requests / second,
at about 1.5ms latency for 99% of the requests;
* under light stress conditions (128 concurrent connections),
I get around 110k requests / second,
at about 3.0ms latency for 99% of the requests;
* under medium stress conditions (512 concurrent connections),
I get around 104k requests / second,
at about 9.3ms latency for 99% of the requests
(meanwhile the average is under 5.0ms);
* under high stress conditions (2048 concurrent connections),
I get around 103k requests / second,
at about 240ms latency for 99% of the requests
(meanwhile the average is under 20ms);
* under extreme stress conditions (16384 concurrent connections)
(i.e. someone tries to DDOS the server),
I get around 90k requests / second,
at about 3.1s latency for 99% of the requests
(meanwhile the average is under 200ms);
* **the performance is at least on-par with NGinx**;
however, especially for real world scenarios
(i.e. thousands of small files, accessed in random patterns),
I believe ``kawipiko`` fares much better;
(not to mention how simple it is to configure and deploy ``kawipiko`` as compared to NGinx,
which took a lot of time, fiddling, and trial and error to get it right;)
Regarding HTTPS, my initial benchmarks
(only covering plain HTTPS with HTTP/1)
seem to indicate that ``kawipiko`` is at least on-par with NGinx.
Regarding HTTP/2, my initial benchmarks
seem to indicate that ``kawipiko`` achieves only about one sixth of its plain HTTPS with HTTP/1 performance
(mainly due to the unoptimized Go ``net/http`` implementation).
In this regard NGinx fares much better, its HTTP/2 performance being similar to that of plain HTTPS with HTTP/1.
Regarding HTTP/3, given that the QUIC library is still experimental,
my initial benchmarks seem to indicate that ``kawipiko``'s performance is quite poor
(at about 5k requests / second).
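
For illustration, the HTTPS runs use the same ``wrk`` invocation as for plain HTTP, only with an ``https`` target URL; the ``127.0.0.1:8443`` TLS endpoint below is an assumption, standing in for whatever TLS bind address the server was given: ::

wrk \
--threads 2 \
--connections 128 \
--timeout 1s \
--duration 30s \
--latency \
https://127.0.0.1:8443/index.txt \
#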
--------

Performance
-----------

.. important::
Last updated in August 2018!
The results are based on an older version of ``kawipiko``;
the current version is at least 10% more efficient.
The methodology used is described in a `dedicated section <#methodology>`__.

Performance values
..................
.. note ::

@@ -48,7 +121,8 @@ Results values

Please note that the values under *Thread Stats* are reported per thread.
Therefore it is best to look at the first two values, i.e. *Requests/sec*.

* ``kawipiko``, 16 connections / 2 server threads / 2 ``wrk`` threads: ::

Requests/sec: 111720.73
Transfer/sec: 18.01MB

@@ -63,7 +137,8 @@ Results values

90% 212.00us    99% 261.00us
3362742 requests in 30.10s, 541.98MB read

* ``kawipiko``, 128 connections / 2 server threads / 2 ``wrk`` threads: ::

Requests/sec: 118811.41
Transfer/sec: 19.15MB

@@ -78,7 +153,8 @@ Results values

90% 1.89ms    99% 2.42ms
3564527 requests in 30.00s, 574.50MB read

* ``kawipiko``, 512 connections / 2 server threads / 2 ``wrk`` threads: ::

Requests/sec: 106698.89
Transfer/sec: 17.20MB

@@ -94,7 +170,8 @@ Results values

3206540 requests in 30.05s, 516.80MB read
Socket errors: connect 0, read 105, write 0, timeout 0

* ``kawipiko``, 2048 connections / 2 server threads / 2 ``wrk`` threads: ::

Requests/sec: 100296.65
Transfer/sec: 16.16MB

@@ -110,7 +187,8 @@ Results values

3015868 requests in 30.07s, 486.07MB read
Socket errors: connect 0, read 128, write 0, timeout 86

* ``kawipiko``, 4096 connections / 2 server threads / 2 ``wrk`` threads: ::

Requests/sec: 95628.34
Transfer/sec: 15.41MB

@@ -126,7 +204,8 @@ Results values

2871404 requests in 30.03s, 462.79MB read
Socket errors: connect 0, read 27, write 0, timeout 4449

* ``kawipiko``, 16384 connections / 2 server threads / 2 ``wrk`` threads: ::

Requests/sec: 53548.52
Transfer/sec: 8.63MB

@@ -143,22 +222,49 @@ Results values

Socket errors: connect 0, read 115, write 0, timeout 2288
Performance notes
.................

* the machine was my personal laptop,
with an Intel Core i7 3667U (2 physical cores times 2 hyper-threads each),
and with 8 GiB of RAM;
* the ``kawipiko-server`` was started with ``--processes 1 --threads 2``;
(i.e. 2 threads handling the requests;)
(the full invocation is sketched after these notes;)
* the ``kawipiko-server`` was started with ``--archive-inmem``;
(i.e. the CDB database file was preloaded into memory, thus no disk IO;)
* the ``kawipiko-server`` was started with ``--security-headers-disable``;
(because these headers are not set by default by other HTTP servers;)
* the ``kawipiko-server`` was started with ``--timeout-disable``;
(because, due to a known Go issue, using ``net.Conn.SetDeadline`` has an impact of about 20% on the raw performance;
thus the reported values above might be about 10%-15% smaller when used with timeouts;)
* the benchmarking tool was ``wrk``;
* both ``kawipiko-server`` and ``wrk`` tools were run on the same machine;
* both ``kawipiko-server`` and ``wrk`` tools were pinned on different physical cores;
* the benchmark was run over loopback networking (i.e. ``127.0.0.1``);
* the served file contains ``Hello World!``;
* the protocol was HTTP (i.e. no TLS), with keep-alive;
* both the CDB and the NGinx folder were put on ``tmpfs``
(which implies that the disk is not a limiting factor);
(in fact ``kawipiko`` performs quite well even on spinning disks due to careful storage management;)
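
For reference, a reconstruction of the full server invocation implied by the notes above
(the bind address and archive path are taken from the methodology section below): ::

kawipiko-server \
--bind 127.0.0.1:8080 \
--archive ./hello-world.cdb \
--archive-inmem \
--processes 1 \
--threads 2 \
--security-headers-disable \
--timeout-disable \
#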
--------
@ -167,10 +273,23 @@ Comparisons
----------- -----------
.. important::
Last updated in August 2019!
The results are based on an older version of ``kawipiko``;
the current version is at least 10% more efficient.
The methodology used is described in a `dedicated section <#methodology>`__.
Comparisons with NGinx
......................

* NGinx, 512 connections / 2 worker processes / 2 ``wrk`` threads: ::

Requests/sec: 79816.08
Transfer/sec: 20.02MB

@@ -185,7 +304,8 @@ Comparisons with NGinx

90% 8.51ms    99% 11.10ms
2399069 requests in 30.06s, 601.73MB read

* NGinx, 2048 connections / 2 worker processes / 2 ``wrk`` threads: ::

Requests/sec: 78211.46
Transfer/sec: 19.62MB

@@ -201,7 +321,8 @@ Comparisons with NGinx

2351933 requests in 30.07s, 589.90MB read
Socket errors: connect 0, read 0, write 0, timeout 11

* NGinx, 4096 connections / 2 worker processes / 2 ``wrk`` threads: ::

Requests/sec: 75970.82
Transfer/sec: 19.05MB

@@ -217,7 +338,8 @@ Comparisons with NGinx

2282223 requests in 30.04s, 572.42MB read
Socket errors: connect 0, read 0, write 0, timeout 187

* NGinx, 16384 connections / 2 worker processes / 2 ``wrk`` threads: ::

Requests/sec: 43909.67
Transfer/sec: 11.01MB

@@ -233,21 +355,42 @@ Comparisons with NGinx

1320562 requests in 30.07s, 331.22MB read
Socket errors: connect 0, read 12596, write 34, timeout 1121
* the NGinx configuration file can be found in the `examples folder <../examples/nginx>`__;
the configuration was obtained after many experiments to squeeze out of NGinx as much performance as possible,
given the targeted use-case, namely many small files;
(a sketch of such a configuration is given after this list;)
* moreover NGinx seems to be quite sensitive to the actual path requested:

* if one requests ``http://127.0.0.1:8080/``,
and one has configured NGinx to look for ``index.txt``,
and that file actually exists,
the performance is quite a bit lower than just asking for that file;
(perhaps it issues more syscalls searching for the index file;)
* if one requests ``http://127.0.0.1:8080/index.txt``,
as mentioned above, it achieves higher performance;
(perhaps it issues fewer syscalls;)
* if one requests ``http://127.0.0.1:8080/does-not-exist``,
it seems to achieve the best performance;
(perhaps it issues the fewest syscalls;)
(however this is not an actually useful corner-case;)

* it must be noted that ``kawipiko`` doesn't exhibit this behaviour,
the same performance being achieved regardless of the path variant;
* therefore the benchmarks above use ``/index.txt`` as opposed to ``/``,
in order not to disfavour NGinx;
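
Purely as an illustration of the kind of tuning involved
(this is a hypothetical sketch, not the actual configuration from the examples folder),
such an NGinx setup might look like: ::

worker_processes 2;

events {
    worker_connections 16384;
}

http {
    access_log off;
    sendfile on;
    tcp_nopush on;
    keepalive_requests 262144;
    open_file_cache max=16384 inactive=60s;

    server {
        listen 127.0.0.1:8080 backlog=16384;
        root /tmp/nginx-www;  # hypothetical tmpfs-backed content root
        index index.txt;
    }
}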
Comparisons with others
.......................

* ``darkhttpd``, 512 connections / 1 server process / 2 ``wrk`` threads: ::

Requests/sec: 38191.65
Transfer/sec: 8.74MB
@@ -265,123 +408,7 @@ Comparisons with others

--------
@@ -390,47 +417,87 @@ OpenStreetMap tiles

-------------------
.. important::
Last updated in August 2019!
The results are based on an older version of ``kawipiko``;
the current version is at least 10% more efficient.
The methodology used is described in a `dedicated section <#methodology>`__.
Scenario notes
..............

As a benchmark much closer to the "real world" use-cases for ``kawipiko`` I've done the following:

* downloaded from OpenStreetMap servers all tiles for my home town
(from zoom level 0 to zoom level 19), which resulted in:

* around ~250k PNG files totaling ~330 MiB;
* with an average of 1.3 KiB and a median of 103 B;
(i.e. lots of extremely small files;)
* occupying actually around 1.1 GiB of storage (on Ext4) due to file-system overheads;

* created a CDB archive (see the sketch after this list), which resulted in:

* a single file totaling ~376 MiB (both "apparent" and "occupied" storage);
(i.e. no storage space wasted;)
* which contains only ~100k PNG files, due to elimination of duplicate PNG files;
(i.e. at higher zoom levels, the tiles start to repeat;)

* listed all the available tiles, and benchmarked both ``kawipiko`` and NGinx,
with 16k concurrent connections;
* the methodology is the same one described above,
with the following changes:

* the machine used was my desktop,
with an Intel Core i7 4770 (4 physical cores times 2 hyper-threads each),
and with 32 GiB of RAM;
* the files (both CDB and tiles folder) were put in ``tmpfs``;
* both ``kawipiko``, NGinx and ``wrk``
were configured to use 8 threads,
and were pinned on two separate physical cores each;
* (the machine had almost nothing running on it except the minimal required services;)
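
The CDB archive creation itself is a single ``kawipiko-archiver`` invocation;
a sketch of it, where the ``./tiles`` source folder and the ``./tiles.cdb`` output path are assumptions for illustration: ::

kawipiko-archiver \
--sources ./tiles \
--archive ./tiles.cdb \
#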
Results notes
.............

Based on my benchmark, the following are my findings:

* ``kawipiko`` outperformed NGinx by ~25% in requests / second;
* ``kawipiko`` outperformed NGinx by ~29% in average response latency;
* ``kawipiko`` outperformed NGinx by ~40% in 90-percentile response latency;
* ``kawipiko`` used ~6% less CPU while serving requests for 2 minutes;
* ``kawipiko`` used ~25% less CPU per request;
* NGinx used the least amount of RAM,
meanwhile ``kawipiko`` used around 1 GiB of RAM
(due to either in-RAM loading or ``mmap`` usage);
Results values
..............

* ``kawipiko`` with ``--archive-inmem`` and ``--index-all`` (1 process, 8 threads): ::

Requests/sec: 238499.86

@@ -447,6 +514,7 @@ Results values

28640139 requests in 2.00m, 44.98GB read
Socket errors: connect 0, read 0, write 0, timeout 7032

* ``kawipiko`` with ``--archive-mmap`` (1 process, 8 threads): ::

Requests/sec: 237239.35

@@ -463,6 +531,7 @@ Results values

28489533 requests in 2.00m, 44.77GB read
Socket errors: connect 0, read 0, write 0, timeout 10730

* ``kawipiko`` with ``--archive-mmap`` (8 processes, 1 thread): ::

Requests/sec: 248266.83

@@ -479,7 +548,8 @@ Results values

29816650 requests in 2.00m, 46.83GB read
Socket errors: connect 0, read 0, write 0, timeout 15244

* NGinx (8 worker processes): ::

Requests/sec: 188255.32
Transfer/sec: 302.88MB

@@ -495,3 +565,160 @@ Results values

22607727 requests in 2.00m, 35.52GB read
Socket errors: connect 0, read 109, write 0, timeout 16833
--------
Methodology
-----------
* get the ``kawipiko`` executables (either `download <./installation.rst#download-prebuilt-executables>`__ or `build <./installation.rst#build-from-sources>`__ them);
* get the ``hello-world.cdb`` (from the `examples <../examples>`__ folder inside the repository);
* install NGinx and ``wrk`` from the distribution packages;
Single process / single threaded
................................
* this scenario will yield a base-line performance per core;
* execute the server (in-memory and indexed)
(i.e. the *best case scenario*): ::
kawipiko-server \
--bind 127.0.0.1:8080 \
--archive ./hello-world.cdb \
--archive-inmem \
--index-all \
--processes 1 \
--threads 1 \
#
* execute the server (memory mapped)
(i.e. the *recommended scenario*): ::
kawipiko-server \
--bind 127.0.0.1:8080 \
--archive ./hello-world.cdb \
--archive-mmap \
--processes 1 \
--threads 1 \
#
Single process / two threads
............................
* this scenario is the usual setup;
configure ``--threads`` to equal the number of logical cores
(i.e. the number of physical cores multiplied by
the number of hyper-threads per physical core;
e.g. 2 physical cores times 2 hyper-threads yields ``--threads 4``);
* execute the server (memory mapped): ::
kawipiko-server \
--bind 127.0.0.1:8080 \
--archive ./hello-world.cdb \
--archive-mmap \
--processes 1 \
--threads 2 \
#
Load generators
...............
* ``wrk``, 512 concurrent connections, handled by 2 threads: ::
wrk \
--threads 2 \
--connections 512 \
--timeout 1s \
--duration 30s \
--latency \
http://127.0.0.1:8080/index.txt \
#
* ``wrk``, 4096 concurrent connections, handled by 2 threads: ::
wrk \
--threads 2 \
--connections 4096 \
--timeout 1s \
--duration 30s \
--latency \
http://127.0.0.1:8080/index.txt \
#
Methodology notes
.................
* the number of threads for the server plus for ``wrk`` shouldn't be larger than the number of available cores;
(or use different machines for the server and the client;)
* also take into account that by default the limit on the number of file descriptors
on most UNIX / Linux systems is 1024,
therefore if you want to try with more than 1000 connections,
you need to raise this limit;
(see below;)
* additionally, you can try to pin the server and ``wrk`` to specific cores,
and increase various priorities (scheduling, IO, etc.);
(given that Intel processors feature hyper-threading, with each hyper-thread appearing to the OS as an individual core,
you should make sure that you pin each process on cores belonging to the same physical processor / core;)
* pinning the server (cores ``0`` and ``1`` are mapped on the physical core ``1``): ::
sudo -u root -n -E -P -- \
\
taskset -c 0,1 \
nice -n -19 -- \
ionice -c 2 -n 0 -- \
chrt -r 10 \
prlimit -n262144 -- \
\
sudo -u "${USER}" -n -E -P -- \
\
kawipiko-server \
... \
#
* pinning the client (cores ``2`` and ``3`` are mapped on the physical core ``2``): ::
sudo -u root -n -E -P -- \
\
taskset -c 2,3 \
nice -n -19 -- \
ionice -c 2 -n 0 -- \
chrt -r 10 \
prlimit -n262144 -- \
\
sudo -u "${USER}" -n -E -P -- \
\
wrk \
... \
#
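
* alternatively, if one only needs to raise the file descriptor limit
(without the pinning and priority wrappers above),
a plain ``ulimit`` in the current shell should suffice
(provided the hard limit allows it): ::

ulimit -n 262144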
@@ -18,6 +18,11 @@ Installation

--------

Download prebuilt executables
-----------------------------

@@ -31,6 +36,11 @@ Download prebuilt executables

--------

Build from sources
------------------

@@ -55,7 +65,7 @@ Install the prerequisites

* other Linux / FreeBSD / OpenBSD / OSX:

* fetch and install Go from: `<https://golang.org/dl>`__
* add ``/usr/local/go/bin`` to your ``PATH``;
* install Git;
@@ -39,6 +39,11 @@ kawipiko -- blazingly fast static HTTP server

--------

Flags
.....

@@ -130,8 +135,8 @@ Ignored files

Wildcard files
..............

By placing a file whose name matches ``_wildcard.*`` (i.e. with the prefix ``_wildcard.`` and any other suffix), it will be used to respond to any request whose URL fails to find a "better" match.
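
For example (assuming ``./content`` is a hypothetical website content root passed to ``kawipiko-archiver``),
a catch-all page for unmatched URLs could be provided as: ::

printf '<h1>not found</h1>' > ./content/_wildcard.html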
@@ -57,6 +57,11 @@ kawipiko -- blazingly fast static HTTP server

--------

Flags
.....
@@ -14,8 +14,8 @@ supporting both **HTTP/1 (with or without TLS), HTTP/2 and HTTP/3 (over QUIC)**;

available as a **single statically linked executable** without any other dependencies.

However, *simple* doesn't imply *dumb* or *limited*,
instead it implies *efficient* through the removal of superfluous features,
thus being in line with UNIX's old philosophy of
"`do one thing and do it well <https://en.wikipedia.org/wiki/Unix_philosophy#Do_One_Thing_and_Do_It_Well>`__".

Therefore, it supports only ``GET`` requests,
@@ -43,7 +43,7 @@ which is what this project provides.

For those familiar with Netlify (or competitors like CloudFlare Pages, GitHub Pages, etc.),
``kawipiko`` is a *host-it-yourself* alternative featuring:

* self-contained deployment with simple configuration;
(i.e. just `fetch the executable <#installation>`__ and use the `proper flags <#kawipiko-server>`__;)
@@ -130,8 +130,8 @@ Unlike most (if not all) other servers out-there,

in which you just point your web server to the folder holding the static website content root,
``kawipiko`` takes a radically different approach:
in order to serve the static content,
one has to first *archive* the content into the CDB archive through ``kawipiko-archiver``,
and then one can *serve* it from the CDB archive through ``kawipiko-server``.

This two-step process also presents a few opportunities:
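
For example (with ``./content`` as a hypothetical website root),
the two steps would look like: ::

kawipiko-archiver \
--sources ./content \
--archive ./content.cdb \
#

kawipiko-server \
--bind 127.0.0.1:8080 \
--archive ./content.cdb \
--archive-mmap \
#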
@@ -66,6 +66,11 @@ References

* `bbolt <https://github.com/etcd-io/bbolt>`__ (project @GitHub, forked pure Go implementation, used by ``kawipiko``);
.. [dpbench]
* `dpbench <https://github.com/dpbench/dpbench>`__ (project @GitHub);
* describes a standard methodology for testing HTTP proxies; (it can also be applied to ``kawipiko``;)
* `HAProxy Forwards Over 2 Million HTTP Requests per Second <https://www.haproxy.com/blog/haproxy-forwards-over-2-million-http-requests-per-second-on-a-single-aws-arm-instance/>`__ (article);
.. [wrk]

* `wrk <https://github.com/wg/wrk>`__ (project @GitHub);
* modern HTTP benchmarking tool;