2021-12-23 18:35:36 +00:00
|
|
|
KAWIPIKO-ARCHIVER(1) kawipiko KAWIPIKO-ARCHIVER(1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
NAME
|
|
|
|
kawipiko -- blazingly fast static HTTP server - kawipiko-archiver
|
|
|
|
|
|
|
|
>> kawipiko-archiver --help
|
2021-12-23 18:39:06 +00:00
|
|
|
>> kawipiko-archiver --man
|
2021-12-23 18:35:36 +00:00
|
|
|
|
|
|
|
--sources <path>
|
|
|
|
|
|
|
|
--archive <path>
|
|
|
|
|
|
|
|
--compress <gzip | zopfli | brotli | identity>
|
|
|
|
--compress-level <number>
|
|
|
|
--compress-cache <path>
|
|
|
|
|
|
|
|
--exclude-index
|
|
|
|
--exclude-strip
|
|
|
|
--exclude-cache
|
|
|
|
--include-etag
|
|
|
|
|
|
|
|
--include-folder-listing
|
|
|
|
|
2022-08-26 11:15:31 +00:00
|
|
|
--exclude-paths-index
|
|
|
|
|
2021-12-23 18:39:06 +00:00
|
|
|
--progress --debug
|
|
|
|
|
|
|
|
--help (show this short help)
|
|
|
|
--man (show the full manual)
|
2021-12-23 18:35:36 +00:00
|
|
|
|
|
|
|
|
|
|
|
----
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
FLAGS
|
|
|
|
--sources
|
|
|
|
The path to the source folder that is the root of the static website
|
|
|
|
content.
|
|
|
|
|
|
|
|
--archive
|
|
|
|
The path to the target CDB file that contains the archived static
|
|
|
|
content.
|
|
|
|
|
|
|
|
--compress, and --compress-level
|
|
|
|
Each individual file (and consequently of the corresponding HTTP
|
|
|
|
response body) is compressed with either gzip, zopfli or brotli; by
|
|
|
|
default (or alternatively with identity) no compression is used.
|
|
|
|
|
|
|
|
Even if compression is explicitly requested, if the compression
|
|
|
|
ratio is bellow a certain threshold (depending on the uncompressed
|
|
|
|
size), the file is stored without any compression. (It's senseless
|
|
|
|
to force the client to spend time and decompress the response body
|
|
|
|
if that time is not recovered during network transmission.)
|
|
|
|
|
|
|
|
The compression level can be chosen, the value depending on the
|
|
|
|
algorithm:
|
|
|
|
|
|
|
|
• gzip -- -1 for algorithm default, -2 for Huffman only, 0 to 9 for
|
|
|
|
fast to slow;
|
|
|
|
|
|
|
|
• zopfli -- -1 for algorithm default, 0 to 30 iterations for fast to
|
|
|
|
slow;
|
|
|
|
|
|
|
|
• brotli -- -1 for algorithm default, 0 to 9 for fast to slow, -2
|
|
|
|
for extreme;
|
|
|
|
|
|
|
|
• (by "algorithm default", it is meant "what that algorithm
|
|
|
|
considers the recommended default compression level";)
|
|
|
|
|
|
|
|
• kawipiko by default uses the maximum compression level for each
|
|
|
|
algorithm; (i.e. 9 for gzip, 30 for zopfli, and -2 for brotli;)
|
|
|
|
|
|
|
|
--sources-cache <path>, and --compress-cache <path>
|
|
|
|
At the given path a single file is created (that is an BBolt
|
|
|
|
database), that will be used to cache the following information:
|
|
|
|
|
|
|
|
• in case of --sources-cache, the fingerprint of each file contents
|
|
|
|
is stored, so that if the file was not changed, re-reading it
|
|
|
|
shouldn't be attempted unless it is absolutely necessary; also if
|
|
|
|
the file is small enough, its contents is stored in this database
|
|
|
|
(deduplicated by its fingerprint);
|
|
|
|
|
|
|
|
• in case of --compress-cache the compression outcome of each file
|
|
|
|
contents is stored (deduplicated by its fingerprint), so that
|
|
|
|
compression is done only once over multiple runs;
|
|
|
|
|
|
|
|
Each of these caches can be safely reused between multiple related
|
|
|
|
archives, especially when they have many files in common. Each of
|
|
|
|
these caches can be independently used (or shared).
|
|
|
|
|
|
|
|
Using these caches allows one to very quickly rebuild an archive
|
|
|
|
when only a couple of files have been changed, without even touching
|
|
|
|
the file-system for the unchanged ones.
|
|
|
|
|
|
|
|
--exclude-index
|
|
|
|
Disables using index.* files (where .* is one of .html, .htm,
|
|
|
|
.xhtml, .xht, .txt, .json, and .xml) to respond to a request whose
|
|
|
|
URL path ends in / (corresponding to the folder wherein index.* file
|
|
|
|
is located). (This can be used to implement "slash" blog style
|
|
|
|
URL's like /blog/whatever/ which maps to /blog/whatever/index.html.)
|
|
|
|
|
|
|
|
--exclude-strip
|
|
|
|
Disables using a file with the suffix .html, .htm, .xhtml, .xht, and
|
|
|
|
.txt to respond to a request whose URL does not exactly match an
|
|
|
|
existing file. (This can be used to implement "suffix-less" blog
|
|
|
|
style URL's like /blog/whatever which maps to /blog/whatever.html.)
|
|
|
|
|
|
|
|
--exclude-cache
|
|
|
|
Disables adding an Cache-Control: public, immutable, max-age=3600
|
|
|
|
header that forces the browser (and other intermediary proxies) to
|
|
|
|
cache the response for an hour (the public and max-age=3600
|
|
|
|
arguments), and furthermore not request it even on reloads (the
|
|
|
|
immutable argument).
|
|
|
|
|
|
|
|
--include-etag
|
|
|
|
Enables adding an ETag response header that contains the SHA256 of
|
|
|
|
the response body.
|
|
|
|
|
|
|
|
By not including the ETag header (i.e. the default), and because
|
|
|
|
identical headers are stored only one, if one has many files of the
|
|
|
|
same type (that in turn without ETag generates the same headers),
|
|
|
|
this can lead to significant reduction in stored headers blocks,
|
|
|
|
including reducing RAM usage. (At this moment it does not support
|
|
|
|
HTTP conditional requests, i.e. the If-None-Match, If-Modified-Since
|
|
|
|
and their counterparts; however this ETag header might be used in
|
|
|
|
conjuction with HEAD requests to see if the resource has changed.)
|
|
|
|
|
|
|
|
--include-folder-listing
|
2022-08-26 11:20:18 +00:00
|
|
|
Enables the creation of an internal list of folders.
|
|
|
|
|
|
|
|
--exclude-paths-index
|
|
|
|
Disables the creation of an internal list of references that can be
|
|
|
|
used in conjunction with the --index-all flag of the
|
|
|
|
kawipiko-server.
|
2021-12-23 18:35:36 +00:00
|
|
|
|
|
|
|
--progress
|
|
|
|
Enables periodic reporting of various metrics.
|
|
|
|
|
|
|
|
--debug
|
|
|
|
Enables verbose logging. It will log various information about the
|
|
|
|
archived files (including compression statistics).
|
|
|
|
|
|
|
|
IGNORED FILES
|
|
|
|
|
|
|
|
• any file with the following prefixes: ., #;
|
|
|
|
|
|
|
|
• any file with the following suffixes: ~, #, .log, .tmp, .temp, .lock;
|
|
|
|
|
|
|
|
• any file that contains the following: #;
|
|
|
|
|
|
|
|
• any file that exactly matches the following: Thumbs.db, .DS_Store;
|
|
|
|
|
|
|
|
• (at the moment these rules are not configurable through flags;)
|
|
|
|
|
|
|
|
WILDCARD FILES
|
|
|
|
By placing a file whose name matches _wildcard.* (i.e. with the prefix
|
|
|
|
_wildcard. and any other suffix), it will be used to respond to any
|
|
|
|
request whose URL fails to find a "better" match.
|
|
|
|
|
|
|
|
These wildcard files respect the folder hierarchy, in that wildcard
|
|
|
|
files in (direct or transitive) subfolders override the wildcard file
|
|
|
|
in their parents (direct or transitive).
|
|
|
|
|
|
|
|
SYMLINKS, HARDLINKS, LOOPS, AND DUPLICATED FILES
|
|
|
|
You freely use symlinks (including pointing outside of the content
|
|
|
|
root) and they will be crawled during archival respecting the "logical"
|
|
|
|
hierarchy they introduce. (Any loop that you introduce into the
|
|
|
|
hierarchy will be ignored and a warning will be issued.)
|
|
|
|
|
|
|
|
You can safely symlink or hardlink the same file (or folder) in
|
|
|
|
multiple places (within the content hierarchy), and its data will be
|
|
|
|
stored only once. (The same applies to duplicated files that have
|
|
|
|
exactly the same data.)
|
|
|
|
|
|
|
|
|
|
|
|
|
2022-08-26 11:20:18 +00:00
|
|
|
volution.ro 2022-08-26 KAWIPIKO-ARCHIVER(1)
|