153 lines
6.6 KiB
ReStructuredText
153 lines
6.6 KiB
ReStructuredText
|
|
||
|
|
||
|
#############################################
|
||
|
kawipiko -- blazingly fast static HTTP server
|
||
|
#############################################
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
``kawipiko-archiver``
|
||
|
---------------------
|
||
|
|
||
|
|
||
|
::
|
||
|
|
||
|
>> kawipiko-archiver --help
|
||
|
|
||
|
::
|
||
|
|
||
|
--sources <path>
|
||
|
|
||
|
--archive <path>
|
||
|
|
||
|
--compress <gzip | zopfli | brotli | identity>
|
||
|
--compress-level <number>
|
||
|
--compress-cache <path>
|
||
|
|
||
|
--exclude-index
|
||
|
--exclude-strip
|
||
|
--exclude-cache
|
||
|
--include-etag
|
||
|
|
||
|
--exclude-file-listing
|
||
|
--include-folder-listing
|
||
|
|
||
|
--progress
|
||
|
--debug
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
Flags
|
||
|
.....
|
||
|
|
||
|
``--sources``
|
||
|
|
||
|
The path to the source folder that is the root of the static website content.
|
||
|
|
||
|
``--archive``
|
||
|
|
||
|
The path to the target CDB file that contains the archived static content.
|
||
|
|
||
|
``--compress``, and ``--compress-level``
|
||
|
|
||
|
Each individual file (and consequently of the corresponding HTTP response body) is compressed with either ``gzip``, ``zopfli`` or ``brotli``; by default (or alternatively with ``identity``) no compression is used.
|
||
|
|
||
|
Even if compression is explicitly requested, if the compression ratio is bellow a certain threshold (depending on the uncompressed size), the file is stored without any compression.
|
||
|
(It's senseless to force the client to spend time and decompress the response body if that time is not recovered during network transmission.)
|
||
|
|
||
|
The compression level can be chosen, the value depending on the algorithm:
|
||
|
|
||
|
* ``gzip`` -- ``-1`` for algorithm default, ``-2`` for Huffman only, ``0`` to ``9`` for fast to slow;
|
||
|
* ``zopfli`` -- ``-1`` for algorithm default, ``0`` to ``30`` iterations for fast to slow;
|
||
|
* ``brotli`` -- ``-1`` for algorithm default, ``0`` to ``9`` for fast to slow, ``-2`` for extreme;
|
||
|
* (by "algorithm default", it is meant "what that algorithm considers the recommended default compression level";)
|
||
|
* ``kawipiko`` by default uses the maximum compression level for each algorithm; (i.e. ``9`` for ``gzip``, ``30`` for ``zopfli``, and ``-2`` for ``brotli``;)
|
||
|
|
||
|
``--sources-cache <path>``, and ``--compress-cache <path>``
|
||
|
|
||
|
At the given path a single file is created (that is an BBolt database), that will be used to cache the following information:
|
||
|
|
||
|
* in case of ``--sources-cache``, the fingerprint of each file contents is stored, so that if the file was not changed, re-reading it shouldn't be attempted unless it is absolutely necessary; also if the file is small enough, its contents is stored in this database (deduplicated by its fingerprint);
|
||
|
* in case of ``--compress-cache`` the compression outcome of each file contents is stored (deduplicated by its fingerprint), so that compression is done only once over multiple runs;
|
||
|
|
||
|
Each of these caches can be safely reused between multiple related archives, especially when they have many files in common.
|
||
|
Each of these caches can be independently used (or shared).
|
||
|
|
||
|
Using these caches allows one to very quickly rebuild an archive when only a couple of files have been changed, without even touching the file-system for the unchanged ones.
|
||
|
|
||
|
``--exclude-index``
|
||
|
|
||
|
Disables using ``index.*`` files (where ``.*`` is one of ``.html``, ``.htm``, ``.xhtml``, ``.xht``, ``.txt``, ``.json``, and ``.xml``) to respond to a request whose URL path ends in ``/`` (corresponding to the folder wherein ``index.*`` file is located).
|
||
|
(This can be used to implement "slash" blog style URL's like ``/blog/whatever/`` which maps to ``/blog/whatever/index.html``.)
|
||
|
|
||
|
``--exclude-strip``
|
||
|
|
||
|
Disables using a file with the suffix ``.html``, ``.htm``, ``.xhtml``, ``.xht``, and ``.txt`` to respond to a request whose URL does not exactly match an existing file.
|
||
|
(This can be used to implement "suffix-less" blog style URL's like ``/blog/whatever`` which maps to ``/blog/whatever.html``.)
|
||
|
|
||
|
``--exclude-cache``
|
||
|
|
||
|
Disables adding an ``Cache-Control: public, immutable, max-age=3600`` header that forces the browser (and other intermediary proxies) to cache the response for an hour (the ``public`` and ``max-age=3600`` arguments), and furthermore not request it even on reloads (the ``immutable`` argument).
|
||
|
|
||
|
``--include-etag``
|
||
|
|
||
|
Enables adding an ``ETag`` response header that contains the SHA256 of the response body.
|
||
|
|
||
|
By not including the ``ETag`` header (i.e. the default), and because identical headers are stored only one, if one has many files of the same type (that in turn without ``ETag`` generates the same headers), this can lead to significant reduction in stored headers blocks, including reducing RAM usage.
|
||
|
(At this moment it does not support HTTP conditional requests, i.e. the ``If-None-Match``, ``If-Modified-Since`` and their counterparts; however this ``ETag`` header might be used in conjuction with ``HEAD`` requests to see if the resource has changed.)
|
||
|
|
||
|
``--exclude-file-listing``
|
||
|
|
||
|
Disables the creation of an internal list of files that can be used in conjunction with the ``--index-all`` flag of the ``kawipiko-server``.
|
||
|
|
||
|
``--include-folder-listing``
|
||
|
|
||
|
Enables the creation of an internal list of folders. (Currently not used by the ``kawipiko-server`` tool.)
|
||
|
|
||
|
``--progress``
|
||
|
|
||
|
Enables periodic reporting of various metrics.
|
||
|
|
||
|
``--debug``
|
||
|
|
||
|
Enables verbose logging.
|
||
|
It will log various information about the archived files (including compression statistics).
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
Ignored files
|
||
|
.............
|
||
|
|
||
|
* any file with the following prefixes: ``.``, ``#``;
|
||
|
* any file with the following suffixes: ``~``, ``#``, ``.log``, ``.tmp``, ``.temp``, ``.lock``;
|
||
|
* any file that contains the following: ``#``;
|
||
|
* any file that exactly matches the following: ``Thumbs.db``, ``.DS_Store``;
|
||
|
* (at the moment these rules are not configurable through flags;)
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
``_wildcard.*`` files
|
||
|
.....................
|
||
|
|
||
|
|
||
|
By placing a file whose name matches ``_wildcard.*`` (i.e. with the prefix ``_wildcard.`` and any other suffix), it will be used to respond to any request whose URL fails to find a "better" match.
|
||
|
|
||
|
These wildcard files respect the folder hierarchy, in that wildcard files in (direct or transitive) subfolders override the wildcard file in their parents (direct or transitive).
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
Symlinks, hardlinks, loops, and duplicated files
|
||
|
................................................
|
||
|
|
||
|
You freely use symlinks (including pointing outside of the content root) and they will be crawled during archival respecting the "logical" hierarchy they introduce.
|
||
|
(Any loop that you introduce into the hierarchy will be ignored and a warning will be issued.)
|
||
|
|
||
|
You can safely symlink or hardlink the same file (or folder) in multiple places (within the content hierarchy), and its data will be stored only once.
|
||
|
(The same applies to duplicated files that have exactly the same data.)
|
||
|
|