ClamAV 1.5.0 includes the following improvements and changes:
Major changes
- Added checks to determine if an OLE2-based Microsoft Office document is
  encrypted.

- Added the ability to record URIs found in HTML if the generate-JSON-metadata
  feature is enabled.
  Also adds an option to disable this in case you want the JSON metadata
  feature but do not want to record HTML URIs.
  The ClamScan command-line option is --json-store-html-uris=no.
  The clamd.conf config option is JsonStoreHTMLURIs no.
  The libclamav general scan option is CL_SCAN_GENERAL_STORE_HTML_URIS.

- Added the ability to record URIs found in PDFs if the generate-JSON-metadata
  feature is enabled.
  Also adds an option to disable this in case you want the JSON metadata
  feature but do not want to record PDF URIs.
  The ClamScan command-line option is --json-store-pdf-uris=no.
  The clamd.conf config option is JsonStorePDFURIs no.
  The libclamav general scan option is CL_SCAN_GENERAL_STORE_PDF_URIS.

- Added regex support for the clamd.conf OnAccessExcludePath config option.
  This change courtesy of GitHub user b1tg.

- Added CVD signing/verification with external .sign files.

  Freshclam will now attempt to download external signature files to accompany
  existing .cvd databases and .cdiff patch files. Sigtool now has commands
  to sign and verify using the external signatures.

  ClamAV now installs a 'certs' directory in the app config directory
  (e.g., <prefix>/etc/certs). The install path is configurable.
  The CMake option to configure the CVD certs directory is
  -D CVD_CERTS_DIRECTORY=PATH

  New options to set an alternative CVD certs directory:
  - The command-line option for Freshclam, ClamD, ClamScan, and Sigtool is
    --cvdcertsdir PATH
  - The environment variable for Freshclam, ClamD, ClamScan, and Sigtool is
    CVD_CERTS_DIR
  - The config option for Freshclam and ClamD is
    CVDCertsDirectory PATH

  Added two new APIs to the public clamav.h header:

      cl_error_t cl_cvdverify_ex(
          const char *file,
          const char *certs_directory,
          uint32_t dboptions);

      cl_error_t cl_cvdunpack_ex(
          const char *file,
          const char *dir,
          const char *certs_directory,
          uint32_t dboptions);

  The original cl_cvdverify and cl_cvdunpack are deprecated.

  Added a cl_engine_field enum option CL_ENGINE_CVDCERTSDIR.
  You may set this option with cl_engine_set_str and get it with
  cl_engine_get_str, to override the compiled in default CVD certs directory.

  Thank you to Mark Carey at SAP for inspiring work on this feature with an
  initial proof of concept for external-signature FIPS compliant CVD signing.

- Freshclam, ClamD, ClamScan, and Sigtool: Added an option to enable FIPS-like
  limits disabling MD5 and SHA1 from being used for verifying digital signatures
  or for being used to trust a file when checking for false positives (FPs).

  For freshclam.conf and clamd.conf set this config option:

      FIPSCryptoHashLimits yes

  For clamscan and sigtool use this command-line option:

      --fips-limits

  For libclamav: Enable FIPS-limits for a ClamAV engine like this:

      cl_engine_set_num(engine, CL_ENGINE_FIPS_LIMITS, 1);

  ClamAV will also attempt to detect if FIPS-mode is enabled. If so, it will
  automatically enable the FIPS-limits feature.

  This change mitigates safety concerns over the use of MD5 and SHA1 algorithms
  to trust files and is required to enable ClamAV to operate legitimately in
  FIPS-mode enabled environments.

  Note: ClamAV may still calculate MD5 or SHA1 hashes as needed for detection
  purposes or for informational purposes in FIPS-enabled environments and when
  the FIPS-limits option is enabled.

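The rule described in the FIPS-limits entry, where MD5 and SHA1 remain usable for detection and informational purposes but not for trust decisions, can be modeled in a few lines of C. This is only an illustrative sketch: the `hash_purpose` enum and the `hash_allowed()` helper are hypothetical names and are not part of libclamav.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical: the reason a hash is being computed. Not a libclamav type. */
typedef enum {
    HASH_FOR_TRUST,     /* verifying a signature or trusting a file (FP check) */
    HASH_FOR_DETECTION, /* matching hash-based detection signatures */
    HASH_FOR_INFO       /* logging or metadata only */
} hash_purpose;

/* Returns true if `alg` may be used for `purpose`.
 * Models the rule in the notes: with FIPS-like limits enabled,
 * "md5" and "sha1" are refused only for trust decisions. */
bool hash_allowed(const char *alg, hash_purpose purpose, bool fips_limits)
{
    bool weak = (strcmp(alg, "md5") == 0) || (strcmp(alg, "sha1") == 0);

    if (!fips_limits || !weak) {
        return true;
    }
    return purpose != HASH_FOR_TRUST;
}
```

Under this model, `hash_allowed("md5", HASH_FOR_TRUST, true)` is false, while the same call with `HASH_FOR_DETECTION` is true.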
- Upgraded the clean-file scan cache to use SHA2-256 (prior versions use MD5).
  The clean-file cache algorithm is not configurable.

  This change resolves safety concerns over the use of MD5 to trust files and
  is required to enable ClamAV to operate legitimately in FIPS-mode enabled
  environments.

- ClamD: Added an option to disable select administrative commands including
  SHUTDOWN, RELOAD, STATS and VERSION.

  The new clamd.conf options are:

      EnableShutdownCommand yes
      EnableReloadCommand yes
      EnableStatsCommand yes
      EnableVersionCommand yes

  This change courtesy of GitHub user ChaoticByte.

- libclamav: Added extended hashing functions with a "flags" parameter that
  allows the caller to choose if they want to bypass FIPS hash algorithm limits:

      cl_error_t cl_hash_data_ex(
          const char *alg,
          const uint8_t *data,
          size_t data_len,
          uint8_t **hash,
          size_t *hash_len,
          uint32_t flags);

      cl_error_t cl_hash_init_ex(
          const char *alg,
          uint32_t flags,
          cl_hash_ctx_t **ctx_out);

      cl_error_t cl_update_hash_ex(
          cl_hash_ctx_t *ctx,
          const uint8_t *data,
          size_t length);

      cl_error_t cl_finish_hash_ex(
          cl_hash_ctx_t *ctx,
          uint8_t **hash,
          size_t *hash_len,
          uint32_t flags);

      void cl_hash_destroy(void *ctx);

      cl_error_t cl_hash_file_fd_ex(
          const char *alg,
          int fd,
          size_t offset,
          size_t length,
          uint8_t **hash,
          size_t *hash_len,
          uint32_t flags);

- ClamScan: Improved the precision of the bytes-scanned and bytes-read counters.
  The ClamScan scan summary will now report exact counts in "GiB", "MiB", "KiB",
  or "B" as appropriate. Previously, it always reported "MB".

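To illustrate the unit selection described for the scan summary, here is a minimal sketch that picks the largest binary unit that fits a byte count. The function name `format_byte_count()` and the two-decimal output format are assumptions made for this example; ClamScan's actual summary formatting may differ.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats `bytes` into `buf` using binary units (1 KiB = 1024 B).
 * Hypothetical helper: the two-decimal format is an assumption.
 * Returns the value returned by snprintf(). */
int format_byte_count(uint64_t bytes, char *buf, size_t buf_len)
{
    const uint64_t KiB = 1024ULL;
    const uint64_t MiB = KiB * 1024ULL;
    const uint64_t GiB = MiB * 1024ULL;

    if (bytes >= GiB) {
        return snprintf(buf, buf_len, "%.2f GiB", (double)bytes / (double)GiB);
    }
    if (bytes >= MiB) {
        return snprintf(buf, buf_len, "%.2f MiB", (double)bytes / (double)MiB);
    }
    if (bytes >= KiB) {
        return snprintf(buf, buf_len, "%.2f KiB", (double)bytes / (double)KiB);
    }
    /* Small counts are printed exactly, with no fractional part. */
    return snprintf(buf, buf_len, "%" PRIu64 " B", bytes);
}
```

With these assumptions, 512 bytes renders as "512 B" and 1536 bytes as "1.50 KiB".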
- ClamScan: Add hash & file-type in/out CLI options:
  - --hash-hint: The file hash so that libclamav does not need to calculate
    it. The type of hash must match the --hash-alg.
  - --log-hash: Print the file hash after each file scanned. The type of hash
    printed will match the --hash-alg.
  - --hash-alg: The hashing algorithm used for either --hash-hint or
    --log-hash. Supported algorithms are "md5", "sha1", "sha2-256".
    If not specified, the default is "sha2-256".
  - --file-type-hint: The file type hint so that libclamav can optimize
    scanning (e.g., "pe", "elf", "zip", etc.). You may also use ClamAV type names
    such as "CL_TYPE_PE". ClamAV will ignore the hint if it is not familiar with
    the specified type.
    See also: https://docs.clamav.net/appendix/FileTypes.html#file-types
  - --log-file-type: Print the file type after each file scanned.

  We will not be adding this for ClamDScan, as we do not have a mechanism in the
  ClamD socket API to receive scan options or a way for ClamD to include scan
  metadata in the response.

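Because a hash hint must match the selected algorithm, a caller may want to sanity-check the hint before passing it along. The sketch below assumes the hint is a hex-encoded digest, which the notes do not state explicitly. Both helper names are hypothetical, and the check accepts only the three algorithm names listed above, which may be stricter than ClamAV itself.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns the hex-string length for a supported algorithm name, or 0 if the
 * name is not one of the three listed in the notes. NULL selects the
 * documented default, "sha2-256". */
size_t hash_hex_len(const char *alg)
{
    if (alg == NULL || strcmp(alg, "sha2-256") == 0) {
        return 64; /* 32-byte digest */
    }
    if (strcmp(alg, "sha1") == 0) {
        return 40; /* 20-byte digest */
    }
    if (strcmp(alg, "md5") == 0) {
        return 32; /* 16-byte digest */
    }
    return 0;
}

/* Hypothetical pre-flight check: true if `hint` looks like a hex digest of
 * the right length for `alg`. Assumes the hint is hex-encoded. */
bool hash_hint_matches_alg(const char *hint, const char *alg)
{
    size_t expected = hash_hex_len(alg);
    size_t i;

    if (hint == NULL || expected == 0 || strlen(hint) != expected) {
        return false;
    }
    for (i = 0; i < expected; i++) {
        if (!isxdigit((unsigned char)hint[i])) {
            return false;
        }
    }
    return true;
}
```

A 32-character hex string passes for "md5" but is rejected for "sha1" or for the default algorithm, since the lengths differ.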
- libclamav: Added new scan functions that provide additional functionality:

      cl_error_t cl_scanfile_ex(
          const char *filename,
          cl_verdict_t *verdict_out,
          const char **last_alert_out,
          uint64_t *scanned_out,
          const struct cl_engine *engine,
          struct cl_scan_options *scanoptions,
          void *context,
          const char *hash_hint,
          char **hash_out,
          const char *hash_alg,
          const char *file_type_hint,
          char **file_type_out);

      cl_error_t cl_scandesc_ex(
          int desc,
          const char *filename,
          cl_verdict_t *verdict_out,
          const char **last_alert_out,
          uint64_t *scanned_out,
          const struct cl_engine *engine,
          struct cl_scan_options *scanoptions,
          void *context,
          const char *hash_hint,
          char **hash_out,
          const char *hash_alg,
          const char *file_type_hint,
          char **file_type_out);

      cl_error_t cl_scanmap_ex(
          cl_fmap_t *map,
          const char *filename,
          cl_verdict_t *verdict_out,
          const char **last_alert_out,
          uint64_t *scanned_out,
          const struct cl_engine *engine,
          struct cl_scan_options *scanoptions,
          void *context,
          const char *hash_hint,
          char **hash_out,
          const char *hash_alg,
          const char *file_type_hint,
          char **file_type_out);

  The older cl_scan*() functions are now deprecated and may be removed in a
  future release. See clamav.h for more details.

- libclamav: Added a new engine option to toggle temp directory recursion.
  Temp directory recursion is the idea that each object scanned in ClamAV's
  recursive extract/scan process will get a new temp subdirectory, mimicking
  the nesting structure of the file.

  Temp directory recursion was introduced in ClamAV 0.103 and is enabled
  whenever --leave-temps / LeaveTemporaryFiles is enabled.

  In ClamAV 1.5, an application linking to libclamav can separately enable temp
  directory recursion if they wish.
  For ClamScan and ClamD, it will remain tied to --leave-temps /
  LeaveTemporaryFiles options.

  The new temp directory recursion option can be enabled with:

      cl_engine_set_num(engine, CL_ENGINE_TMPDIR_RECURSION, 1);

- libclamav: Added a class of scan callback functions that can be added with the
  following API function:

      void cl_engine_set_scan_callback(
          struct cl_engine *engine,
          clcb_scan callback,
          cl_scan_callback_t location);

  The scan callback location may be configured using the following five values:
  - CL_SCAN_CALLBACK_PRE_HASH: Occurs just after basic file-type detection and
    before any hashes have been calculated either for the cache or the gen-json
    metadata.
  - CL_SCAN_CALLBACK_PRE_SCAN: Occurs before parser modules run and before
    pattern matching.
  - CL_SCAN_CALLBACK_POST_SCAN: Occurs after pattern matching and after
    running parser modules. A.k.a. the scan is complete for this layer.
  - CL_SCAN_CALLBACK_ALERT: Occurs each time an alert (detection) would be
    triggered during a scan.
  - CL_SCAN_CALLBACK_FILE_TYPE: Occurs each time the file type determination
    is refined. This may happen more than once per layer.

  Each callback may alter scan behavior using the following return codes:

  - CL_BREAK: Scan aborted by callback. The rest of the scan is skipped.
    This does not mark the file as clean or infected, it just skips the rest of
    the scan.

  - CL_SUCCESS / CL_CLEAN: File scan will continue.

    For CL_SCAN_CALLBACK_ALERT: This means you want to ignore this specific
    alert and keep scanning.

    This is different than CL_VERIFIED because it does not affect prior or
    future alerts. Return CL_VERIFIED instead if you want to remove prior
    alerts for this layer and skip the rest of the scan for this layer.

  - CL_VIRUS: This means you do not trust the file. A new alert will be added.

    For CL_SCAN_CALLBACK_ALERT: This means you agree with the alert and no
    extra alert is needed.

  - CL_VERIFIED: Layer explicitly trusted by the callback and previous alerts
    removed for THIS layer. You might want to do this if you trust the hash or
    verified a digital signature. The rest of the scan will be skipped for THIS
    layer. For contained files, this does NOT mean that the parent or adjacent
    layers are trusted.

  Each callback is given a pointer to the current scan layer from which they can
  get previous layers, can get the layer's fmap, and then various attributes of
  the layer and of the fmap. To make this possible, there are new APIs to
  query scan-layer details and fmap details:

      cl_error_t cl_fmap_set_name(cl_fmap_t *map, const char *name);
      cl_error_t cl_fmap_get_name(cl_fmap_t *map, const char **name_out);
      cl_error_t cl_fmap_set_path(cl_fmap_t *map, const char *path);
      cl_error_t cl_fmap_get_path(cl_fmap_t *map, const char **path_out, size_t *offset_out, size_t *len_out);
      cl_error_t cl_fmap_get_fd(const cl_fmap_t *map, int *fd_out, size_t *offset_out, size_t *len_out);
      cl_error_t cl_fmap_get_size(const cl_fmap_t *map, size_t *size_out);
      cl_error_t cl_fmap_set_hash(const cl_fmap_t *map, const char *hash_alg, char hash);
      cl_error_t cl_fmap_have_hash(const cl_fmap_t *map, const char *hash_alg, bool *have_hash_out);
      cl_error_t cl_fmap_will_need_hash_later(const cl_fmap_t *map, const char *hash_alg);
      cl_error_t cl_fmap_get_hash(const cl_fmap_t *map, const char *hash_alg, char **hash_out);
      cl_error_t cl_fmap_get_data(const cl_fmap_t *map, size_t offset, size_t len, const uint8_t **data_out, size_t *data_len_out);
      cl_error_t cl_scan_layer_get_fmap(cl_scan_layer_t *layer, cl_fmap_t **fmap_out);
      cl_error_t cl_scan_layer_get_parent_layer(cl_scan_layer_t *layer, cl_scan_layer_t **parent_layer_out);
      cl_error_t cl_scan_layer_get_type(cl_scan_layer_t *layer, const char **type_out);
      cl_error_t cl_scan_layer_get_recursion_level(cl_scan_layer_t *layer, uint32_t *recursion_level_out);
      cl_error_t cl_scan_layer_get_object_id(cl_scan_layer_t *layer, uint64_t *object_id_out);
      cl_error_t cl_scan_layer_get_last_alert(cl_scan_layer_t *layer, const char **alert_name_out);
      cl_error_t cl_scan_layer_get_attributes(cl_scan_layer_t *layer, uint32_t *attributes_out);

  This deprecates, but does not immediately remove, the existing scan callbacks:

      void cl_engine_set_clcb_pre_cache(struct cl_engine *engine, clcb_pre_cache callback);
      void cl_engine_set_clcb_file_inspection(struct cl_engine *engine, clcb_file_inspection callback);
      void cl_engine_set_clcb_pre_scan(struct cl_engine *engine, clcb_pre_scan callback);
      void cl_engine_set_clcb_post_scan(struct cl_engine *engine, clcb_post_scan callback);
      void cl_engine_set_clcb_virus_found(struct cl_engine *engine, clcb_virus_found callback);
      void cl_engine_set_clcb_hash(struct cl_engine *engine, clcb_hash callback);

  There is an interactive test program to demonstrate the new callbacks.
  See: examples/ex_scan_callbacks.c

- Signature names that start with "Weak." will no longer alert.
  Instead, they will be tracked internally and can be found in scan metadata
  JSON. This is a step towards enabling alerting signatures to depend on prior
  Weak indicator matches in the current layer or in child layers.

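An application that post-processes signature names (for example, when reading the metadata JSON) could recognize such names with a simple prefix test. This is a sketch only: `is_weak_indicator_name()` is a hypothetical helper, and the case-sensitive comparison is an assumption, since the notes only say the name must start with "Weak.".

```c
#include <stdbool.h>
#include <string.h>

/* True if a signature name begins with the "Weak." prefix.
 * The comparison is case-sensitive (an assumption for this sketch). */
bool is_weak_indicator_name(const char *sig_name)
{
    static const char prefix[] = "Weak.";

    /* sizeof(prefix) - 1 excludes the terminating NUL from the comparison. */
    return sig_name != NULL &&
           strncmp(sig_name, prefix, sizeof(prefix) - 1) == 0;
}
```

Note that the trailing dot is part of the prefix, so a name that is exactly "Weak" does not qualify.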
- For the "Generate Metadata JSON" feature:

  - The "Viruses" array of alert names has been replaced by two new arrays that
    include additional details beyond just signature name:
    - "Indicators" records three types of indicators:
      - Strong indicators are for traditional alerting signature matches and
        will halt the scan, except in all-match mode.
      - Potentially Unwanted indicators will only cause an alert at the end of
        the scan unless a Strong indicator is found. They are treated the same
        as Strong indicators in all-match mode.
      - Weak indicators do not alert and will be leveraged in a future version
        as a condition for logical signature matches.
    - "Alerts" records only alerting indicators. Events that trust a file, such
      as false positive signatures, will remove affected indicators, and mark
      them as "Ignored" in the "Indicators" array.

  - Add new option to calculate and record additional hash types when the
    "generate metadata JSON" feature is enabled:
    - libclamav option: CL_SCAN_GENERAL_STORE_EXTRA_HASHES
    - ClamScan option: --json-store-extra-hashes (default off)
    - clamd.conf option: JsonStoreExtraHashes (default 'no')

  - The file hash is now stored as "sha2-256" instead of "FileMD5". If you
    enable the "extra hashes" option, then it will also record "md5" and "sha1".

  - Each object scanned now has a unique "Object ID".

- Sigtool: Renamed the sigtool option --sha256 to --sha2-256.
  The original option is still functional but is deprecated.

Other improvements
- Set a limit on the max-recursion config option. Users will no longer be
  able to set max-recursion higher than 100.
  This change prevents errors on start up or crashes if encountering
  a file with that many layers of recursion.

- Build system: CMake improvements to support compiling for the AIX platform.
  This change is courtesy of GitHub user KamathForAIX.

- Improve support for extracting malformed zip archives.
  This change is courtesy of Frederick Sell.

- Windows: Code quality improvement for the ClamScan and ClamDScan --move
  and --remove options.
  This change is courtesy of Maxim Suhanov.

- Added file type recognition for an initial set of AI model file types.

  The file type is accessible to applications using libclamav via the scan
  callback functions and as an optional output parameter to the scan functions:
  cl_scanfile_ex(), cl_scanmap_ex(), and cl_scandesc_ex().

  When scanning these files, type will now show "CL_TYPE_AI_MODEL" instead of
  "CL_TYPE_BINARY_DATA".

- Added support for inline comments in ClamAV configuration files.
  This change is courtesy of GitHub user userwiths.

- Disabled the MyDoom hardcoded/heuristic detection because of false positives.

- Sigtool: Added support for creating .cdiff and .script patch files for
  CVDs that have underscores in the CVD name.
  Also improved support for relative paths with the --diff command.

- Windows: Improved support for file names with UTF-8 characters not found in
  the ANSI or OEM code pages when printing scan results or showing activity in
  the ClamDTOP monitoring utility.
  Fixed a bug with opening files with such names with the Sigtool utility.

- Improved the code quality of the ZIP module. Added inline documentation.

- Always run scan callbacks for embedded files. Embedded files are found within
  other files through signature matches instead of by parsing. They will now
  be processed the same way and then they can trigger application callbacks
  (e.g., "pre-scan", "post-scan", etc.).

  This change will impact scans with both the "leave-temps" feature and the
  "force-to-disk" feature enabled, resulting in additional temporary files.

- Added DevContainer templates to the ClamAV Git repository in order to make it
  easier to set up AlmaLinux or Debian development environments.

Bug fixes
- Reduced email multipart message parser complexity.

- Fixed possible undefined behavior in inflate64 module.
  The inflate64 module is a modified version of the zlib library, taken from
  version 1.2.3 with some customization and with some cherry-picked fixes.
  This adds one additional fix from zlib 1.2.9.
  Thank you to TITAN Team for reporting this issue.

- Fixed a bug in ClamD that broke reporting of memory usage on Linux.
  The STATS command can be used to monitor ClamD directly or through ClamDTOP.
  The memory stats feature does not work on all platforms (e.g., Windows).

- Windows: Fixed a build issue when the same library dependency is found in
  two different locations.

- Fixed an infinite loop when scanning some email files in debug-mode.
  This fix is courtesy of Yoann Lecuyer.

- Fixed a stack buffer overflow bug in the phishing signature load process.
  This fix is courtesy of GitHub user Shivam7-1.

- Fixed a race condition in the Freshclam feature tests.
  This fix is courtesy of GitHub user rma-x.

- Windows: Fixed a 5-byte heap buffer overread in the Windows unit tests.
  This fix is courtesy of GitHub user Sophie0x2E.

- Fix double-extraction of OOXML-based office documents.

- ClamBC: Fixed crashes on startup.

Acknowledgments
Special thanks to the following people for code contributions and bug reports:
- b1tg
- ChaoticByte
- Frederick Sell
- KamathForAIX
- Mark Carey at SAP
- Maxim Suhanov
- rma-x
- Shivam7-1
- Sophie0x2E
- TITAN Team
- userwiths
- Yoann Lecuyer