ClamAV 1.5.0-rc

Pre-release, 11 days ago

ClamAV 1.5.0 includes the following improvements and changes:

Major changes

  • Added checks to determine if an OLE2-based Microsoft Office document is
    encrypted.

    GitHub pull request

  • Added the ability to record URIs found in HTML if the generate-JSON-metadata
    feature is enabled.
    Also added an option to disable this in case you want the JSON metadata
    feature but do not want to record HTML URIs.
    The ClamScan command-line option is --json-store-html-uris=no.
    The clamd.conf config option is JsonStoreHTMLURIs no.
    The libclamav general scan option is CL_SCAN_GENERAL_STORE_HTML_URIS.

    GitHub pull request #1

    GitHub pull request #2

    GitHub pull request #3

  • Added the ability to record URIs found in PDFs if the generate-JSON-metadata
    feature is enabled.
    Also added an option to disable this in case you want the JSON metadata
    feature but do not want to record PDF URIs.
    The ClamScan command-line option is --json-store-pdf-uris=no.
    The clamd.conf config option is JsonStorePDFURIs no.
    The libclamav general scan option is CL_SCAN_GENERAL_STORE_PDF_URIS.

    GitHub pull request #1

    GitHub pull request #2

  • Added regex support for the clamd.conf OnAccessExcludePath config option.
    This change courtesy of GitHub user b1tg.

    GitHub pull request
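
The regex support for OnAccessExcludePath noted above can be pictured with a
small sketch. The notes do not say which regex dialect or matching mode ClamAV
uses, so this assumes POSIX extended regular expressions with search (not
whole-string) matching. The helper name path_is_excluded is hypothetical and is
not part of ClamAV.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not a ClamAV function.
 * Returns true when `path` matches the POSIX extended regex `pattern`.
 * Assumption: an invalid pattern is treated as "no match". */
bool path_is_excluded(const char *pattern, const char *path)
{
    regex_t re;
    bool matched;

    /* REG_NOSUB: only a yes/no answer is needed, no capture groups. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return false;
    }

    matched = (regexec(&re, path, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

Under these assumptions, a pattern such as ^/home/[^/]+/\.cache/ would exclude
every user's cache directory with a single config line.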

  • Added CVD signing/verification with external .sign files.

    Freshclam will now attempt to download external signature files to accompany
    existing .cvd databases and .cdiff patch files. Sigtool now has commands
    to sign and verify using the external signatures.

    ClamAV now installs a 'certs' directory in the app config directory
    (e.g., <prefix>/etc/certs). The install path is configurable.
    The CMake option to configure the CVD certs directory is
    -D CVD_CERTS_DIRECTORY=PATH

    New options to set an alternative CVD certs directory:

    • The command-line option for Freshclam, ClamD, ClamScan, and Sigtool is
      --cvdcertsdir PATH
    • The environment variable for Freshclam, ClamD, ClamScan, and Sigtool is
      CVD_CERTS_DIR
    • The config option for Freshclam and ClamD is
      CVDCertsDirectory PATH

    Added two new APIs to the public clamav.h header:

    cl_error_t cl_cvdverify_ex(
        const char *file,
        const char *certs_directory,
        uint32_t dboptions);
    
    cl_error_t cl_cvdunpack_ex(
        const char *file,
        const char *dir,
        const char *certs_directory,
        uint32_t dboptions);

    The original cl_cvdverify and cl_cvdunpack are deprecated.

    Added a cl_engine_field enum option CL_ENGINE_CVDCERTSDIR.
    You may set this option with cl_engine_set_str to override the compiled-in
    default CVD certs directory, and read it back with cl_engine_get_str.

    Thank you to Mark Carey at SAP for inspiring work on this feature with an
    initial proof of concept for external-signature, FIPS-compliant CVD signing.

    GitHub pull request #1

    GitHub pull request #2

    GitHub pull request #3

    GitHub pull request #4

  • Freshclam, ClamD, ClamScan, and Sigtool: Added an option to enable FIPS-like
    limits that prevent MD5 and SHA1 from being used to verify digital signatures
    or to trust a file when checking for false positives (FPs).

    For freshclam.conf and clamd.conf set this config option:

    FIPSCryptoHashLimits yes
    

    For clamscan and sigtool use this command-line option:

    --fips-limits
    

    For libclamav: Enable FIPS-limits for a ClamAV engine like this:

    cl_engine_set_num(engine, CL_ENGINE_FIPS_LIMITS, 1);

    ClamAV will also attempt to detect if FIPS-mode is enabled. If so, it will
    automatically enable the FIPS-limits feature.

    This change mitigates safety concerns over the use of MD5 and SHA1 algorithms
    to trust files and is required to enable ClamAV to operate legitimately in
    FIPS-mode enabled environments.

    Note: ClamAV may still calculate MD5 or SHA1 hashes as needed for detection
    purposes or for informational purposes in FIPS-enabled environments and when
    the FIPS-limits option is enabled.

    GitHub pull request

  • Upgraded the clean-file scan cache to use SHA2-256 (prior versions use MD5).
    The clean-file cache algorithm is not configurable.

    This change resolves safety concerns over the use of MD5 to trust files and
    is required to enable ClamAV to operate legitimately in FIPS-mode enabled
    environments.

    GitHub pull request

  • ClamD: Added an option to disable select administrative commands including
    SHUTDOWN, RELOAD, STATS and VERSION.

    The new clamd.conf options are:

    EnableShutdownCommand yes
    EnableReloadCommand yes
    EnableStatsCommand yes
    EnableVersionCommand yes
    

    This change courtesy of GitHub user ChaoticByte.

    GitHub pull request

  • libclamav: Added extended hashing functions with a "flags" parameter that
    allows the caller to choose if they want to bypass FIPS hash algorithm limits:

    cl_error_t cl_hash_data_ex(
        const char *alg,
        const uint8_t *data,
        size_t data_len,
        uint8_t **hash,
        size_t *hash_len,
        uint32_t flags);
    
    cl_error_t cl_hash_init_ex(
        const char *alg,
        uint32_t flags,
        cl_hash_ctx_t **ctx_out);
    
    cl_error_t cl_update_hash_ex(
        cl_hash_ctx_t *ctx,
        const uint8_t *data,
        size_t length);
    
    cl_error_t cl_finish_hash_ex(
        cl_hash_ctx_t *ctx,
        uint8_t **hash,
        size_t *hash_len,
        uint32_t flags);
    
    void cl_hash_destroy(void *ctx);
    
    cl_error_t cl_hash_file_fd_ex(
        const char *alg,
        int fd,
        size_t offset,
        size_t length,
        uint8_t **hash,
        size_t *hash_len,
        uint32_t flags);

    GitHub pull request

  • ClamScan: Improved the precision of the bytes-scanned and bytes-read counters.
    The ClamScan scan summary will now report exact counts in "GiB", "MiB", "KiB",
    or "B" as appropriate. Previously, it always reported "MB".

    GitHub pull request
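
The unit selection described above for the ClamScan scan summary can be
sketched as follows. This is not ClamScan's code: the helper name
format_byte_count and the two-decimal output format are assumptions made for
illustration. Only the choice among "GiB", "MiB", "KiB", and "B" comes from the
notes.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not ClamScan's implementation.
 * Writes `bytes` into `out` using the largest binary unit whose value is
 * at least 1. The "%.2f" precision is an assumption. */
void format_byte_count(uint64_t bytes, char *out, size_t out_len)
{
    const uint64_t kib = 1024ULL;
    const uint64_t mib = kib * 1024ULL;
    const uint64_t gib = mib * 1024ULL;

    if (bytes >= gib) {
        snprintf(out, out_len, "%.2f GiB", (double)bytes / (double)gib);
    } else if (bytes >= mib) {
        snprintf(out, out_len, "%.2f MiB", (double)bytes / (double)mib);
    } else if (bytes >= kib) {
        snprintf(out, out_len, "%.2f KiB", (double)bytes / (double)kib);
    } else {
        /* Small counts are printed exactly, without a fraction. */
        snprintf(out, out_len, "%" PRIu64 " B", bytes);
    }
}
```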

  • ClamScan: Added hash and file-type input/output CLI options:

    • --hash-hint: The file hash so that libclamav does not need to calculate
      it. The type of hash must match the --hash-alg.
    • --log-hash: Print the file hash after each file scanned. The type of hash
      printed will match the --hash-alg.
    • --hash-alg: The hashing algorithm used for either --hash-hint or
      --log-hash. Supported algorithms are "md5", "sha1", "sha2-256".
      If not specified, the default is "sha2-256".
    • --file-type-hint: The file type hint so that libclamav can optimize
      scanning (e.g., "pe", "elf", "zip", etc.). You may also use ClamAV type names
      such as "CL_TYPE_PE". ClamAV will ignore the hint if it is not familiar with
      the specified type.
      See also: https://docs.clamav.net/appendix/FileTypes.html#file-types
    • --log-file-type: Print the file type after each file scanned.

    We will not be adding this for ClamDScan, as we do not have a mechanism in the
    ClamD socket API to receive scan options or a way for ClamD to include scan
    metadata in the response.

    GitHub pull request
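
Because the --hash-hint value must match --hash-alg, a caller may want to
sanity-check the hint before passing it to ClamScan. The sketch below assumes
the hint is a hex-encoded digest, which the notes do not state. The helper
names are hypothetical and are not part of ClamAV. The digest lengths are those
of the three supported algorithms.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: hex-string length for a supported algorithm name,
 * or 0 if the name is not one of "md5", "sha1", "sha2-256". */
size_t expected_hex_len(const char *alg)
{
    if (strcmp(alg, "md5") == 0) return 32;      /* 128-bit digest */
    if (strcmp(alg, "sha1") == 0) return 40;     /* 160-bit digest */
    if (strcmp(alg, "sha2-256") == 0) return 64; /* 256-bit digest */
    return 0;
}

/* Hypothetical helper: true when `hint` has the right length for `alg`
 * and contains only hexadecimal digits (assumed encoding). */
bool hash_hint_matches_alg(const char *hint, const char *alg)
{
    size_t expected = expected_hex_len(alg);
    size_t i;

    if (expected == 0 || strlen(hint) != expected) {
        return false;
    }
    for (i = 0; i < expected; i++) {
        if (!isxdigit((unsigned char)hint[i])) {
            return false;
        }
    }
    return true;
}
```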

  • libclamav: Added new scan functions that provide additional functionality:

    cl_error_t cl_scanfile_ex(
        const char *filename,
        cl_verdict_t *verdict_out,
        const char **last_alert_out,
        uint64_t *scanned_out,
        const struct cl_engine *engine,
        struct cl_scan_options *scanoptions,
        void *context,
        const char *hash_hint,
        char **hash_out,
        const char *hash_alg,
        const char *file_type_hint,
        char **file_type_out);
    
    cl_error_t cl_scandesc_ex(
        int desc,
        const char *filename,
        cl_verdict_t *verdict_out,
        const char **last_alert_out,
        uint64_t *scanned_out,
        const struct cl_engine *engine,
        struct cl_scan_options *scanoptions,
        void *context,
        const char *hash_hint,
        char **hash_out,
        const char *hash_alg,
        const char *file_type_hint,
        char **file_type_out);
    
    cl_error_t cl_scanmap_ex(
        cl_fmap_t *map,
        const char *filename,
        cl_verdict_t *verdict_out,
        const char **last_alert_out,
        uint64_t *scanned_out,
        const struct cl_engine *engine,
        struct cl_scan_options *scanoptions,
        void *context,
        const char *hash_hint,
        char **hash_out,
        const char *hash_alg,
        const char *file_type_hint,
        char **file_type_out);

    The older cl_scan*() functions are now deprecated and may be removed in a
    future release. See clamav.h for more details.

    GitHub pull request

  • libclamav: Added a new engine option to toggle temp directory recursion.

    Temp directory recursion is the idea that each object scanned in ClamAV's
    recursive extract/scan process will get a new temp subdirectory, mimicking
    the nesting structure of the file.

    Temp directory recursion was introduced in ClamAV 0.103 and is enabled
    whenever --leave-temps / LeaveTemporaryFiles is enabled.

    In ClamAV 1.5, an application linking to libclamav can enable temp directory
    recursion separately if it wishes.
    For ClamScan and ClamD, it will remain tied to the --leave-temps /
    LeaveTemporaryFiles options.

    The new temp directory recursion option can be enabled with:

    cl_engine_set_num(engine, CL_ENGINE_TMPDIR_RECURSION, 1);

    GitHub pull request

  • libclamav: Added a class of scan callback functions that can be added with the
    following API function:

    void cl_engine_set_scan_callback(struct cl_engine *engine, clcb_scan callback, cl_scan_callback_t location);

    The scan callback location may be configured using the following five values:

    • CL_SCAN_CALLBACK_PRE_HASH: Occurs just after basic file-type detection and
      before any hashes have been calculated either for the cache or the gen-json
      metadata.
    • CL_SCAN_CALLBACK_PRE_SCAN: Occurs before parser modules run and before
      pattern matching.
    • CL_SCAN_CALLBACK_POST_SCAN: Occurs after pattern matching and after
      running parser modules. That is, the scan is complete for this layer.
    • CL_SCAN_CALLBACK_ALERT: Occurs each time an alert (detection) would be
      triggered during a scan.
    • CL_SCAN_CALLBACK_FILE_TYPE: Occurs each time the file type determination
      is refined. This may happen more than once per layer.

    Each callback may alter scan behavior using the following return codes:

    • CL_BREAK: Scan aborted by callback. The rest of the scan is skipped.
      This does not mark the file as clean or infected, it just skips the rest of
      the scan.

    • CL_SUCCESS / CL_CLEAN: File scan will continue.

      For CL_SCAN_CALLBACK_ALERT: This means you want to ignore this specific
      alert and keep scanning.

      This is different from CL_VERIFIED because it does not affect prior or
      future alerts. Return CL_VERIFIED instead if you want to remove prior
      alerts for this layer and skip the rest of the scan for this layer.

    • CL_VIRUS: This means you do not trust the file. A new alert will be added.

      For CL_SCAN_CALLBACK_ALERT: This means you agree with the alert and no
      extra alert is needed.

    • CL_VERIFIED: Layer explicitly trusted by the callback and previous alerts
      removed for THIS layer. You might want to do this if you trust the hash or
      verified a digital signature. The rest of the scan will be skipped for THIS
      layer. For contained files, this does NOT mean that the parent or adjacent
      layers are trusted.
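
As an illustration of the return codes above, here is one way an application
might choose a result inside a CL_SCAN_CALLBACK_ALERT callback. The sketch does
not include clamav.h: the demo_verdict_t enum is a local stand-in for the
libclamav return codes, and decide_on_alert with its parameters is
hypothetical. A real callback would return the corresponding CL_* value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local stand-ins for the libclamav return codes (assumption: these are
 * NOT the real definitions or numeric values from clamav.h). */
typedef enum {
    DEMO_CONTINUE, /* stands in for CL_SUCCESS / CL_CLEAN */
    DEMO_VIRUS,    /* stands in for CL_VIRUS */
    DEMO_BREAK,    /* stands in for CL_BREAK */
    DEMO_VERIFIED  /* stands in for CL_VERIFIED */
} demo_verdict_t;

/* Hypothetical decision logic for an alert-location callback. */
demo_verdict_t decide_on_alert(const char *alert_name,
                               bool layer_is_trusted,
                               const char *ignored_prefix)
{
    /* Trusted layer (e.g., known hash or verified signature): drop prior
     * alerts for this layer and skip the rest of this layer. */
    if (layer_is_trusted) {
        return DEMO_VERIFIED;
    }

    /* Ignore only this alert and keep scanning. */
    if (ignored_prefix != NULL &&
        strncmp(alert_name, ignored_prefix, strlen(ignored_prefix)) == 0) {
        return DEMO_CONTINUE;
    }

    /* Agree with the alert; no extra alert is added at this location. */
    return DEMO_VIRUS;
}
```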

    Each callback is given a pointer to the current scan layer. From that
    pointer, the callback can reach parent layers, get the layer's fmap, and
    read various attributes of the layer and of the fmap. To make this
    possible, there are new APIs to query scan-layer details and fmap details:

      cl_error_t cl_fmap_set_name(cl_fmap_t *map, const char *name);
      cl_error_t cl_fmap_get_name(cl_fmap_t *map, const char **name_out);
      cl_error_t cl_fmap_set_path(cl_fmap_t *map, const char *path);
      cl_error_t cl_fmap_get_path(cl_fmap_t *map, const char **path_out, size_t *offset_out, size_t *len_out);
      cl_error_t cl_fmap_get_fd(const cl_fmap_t *map, int *fd_out, size_t *offset_out, size_t *len_out);
      cl_error_t cl_fmap_get_size(const cl_fmap_t *map, size_t *size_out);
      cl_error_t cl_fmap_set_hash(const cl_fmap_t *map, const char *hash_alg, char hash);
      cl_error_t cl_fmap_have_hash(const cl_fmap_t *map, const char *hash_alg, bool *have_hash_out);
      cl_error_t cl_fmap_will_need_hash_later(const cl_fmap_t *map, const char *hash_alg);
      cl_error_t cl_fmap_get_hash(const cl_fmap_t *map, const char *hash_alg, char **hash_out);
      cl_error_t cl_fmap_get_data(const cl_fmap_t *map, size_t offset, size_t len, const uint8_t **data_out, size_t *data_len_out);
      cl_error_t cl_scan_layer_get_fmap(cl_scan_layer_t *layer, cl_fmap_t **fmap_out);
      cl_error_t cl_scan_layer_get_parent_layer(cl_scan_layer_t *layer, cl_scan_layer_t **parent_layer_out);
      cl_error_t cl_scan_layer_get_type(cl_scan_layer_t *layer, const char **type_out);
      cl_error_t cl_scan_layer_get_recursion_level(cl_scan_layer_t *layer, uint32_t *recursion_level_out);
      cl_error_t cl_scan_layer_get_object_id(cl_scan_layer_t *layer, uint64_t *object_id_out);
      cl_error_t cl_scan_layer_get_last_alert(cl_scan_layer_t *layer, const char **alert_name_out);
      cl_error_t cl_scan_layer_get_attributes(cl_scan_layer_t *layer, uint32_t *attributes_out);

    This deprecates, but does not immediately remove, the existing scan callbacks:

      void cl_engine_set_clcb_pre_cache(struct cl_engine *engine, clcb_pre_cache callback);
      void cl_engine_set_clcb_file_inspection(struct cl_engine *engine, clcb_file_inspection callback);
      void cl_engine_set_clcb_pre_scan(struct cl_engine *engine, clcb_pre_scan callback);
      void cl_engine_set_clcb_post_scan(struct cl_engine *engine, clcb_post_scan callback);
      void cl_engine_set_clcb_virus_found(struct cl_engine *engine, clcb_virus_found callback);
      void cl_engine_set_clcb_hash(struct cl_engine *engine, clcb_hash callback);

    There is an interactive test program to demonstrate the new callbacks.
    See: examples/ex_scan_callbacks.c

    GitHub pull request

  • Signature names that start with "Weak." will no longer alert.
    Instead, they will be tracked internally and can be found in scan metadata
    JSON. This is a step towards enabling alerting signatures to depend on prior
    Weak indicator matches in the current layer or in child layers.

    GitHub pull request
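
For applications that post-process signature names, the "Weak." rule above
amounts to a prefix check. This is a minimal sketch, assuming the match is
case-sensitive and applies to the very start of the name. The helper name
is_weak_indicator_name is hypothetical and is not a libclamav function.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of libclamav.
 * Returns true when the signature name begins with "Weak."
 * (assumed to be a case-sensitive prefix match). */
bool is_weak_indicator_name(const char *signature_name)
{
    static const char prefix[] = "Weak.";

    /* sizeof(prefix) - 1 excludes the terminating NUL byte. */
    return strncmp(signature_name, prefix, sizeof(prefix) - 1) == 0;
}
```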

  • For the "Generate Metadata JSON" feature:

    • The "Viruses" array of alert names has been replaced by two new arrays that
      include additional details beyond just signature name:

      • "Indicators" records three types of indicators:
        • Strong indicators are for traditional alerting signature matches and
          will halt the scan, except in all-match mode.
        • Potentially Unwanted indicators will only cause an alert at the end of
          the scan unless a Strong indicator is found. They are treated the same
          as Strong indicators in all-match mode.
        • Weak indicators do not alert and will be leveraged in a future version
          as a condition for logical signature matches.
      • "Alerts" records only alerting indicators. Events that trust a file, such
        as false positive signatures, will remove affected indicators, and mark
        them as "Ignored" in the "Indicators" array.
    • Added a new option to calculate and record additional hash types when the
      "generate metadata JSON" feature is enabled:

      • libclamav option: CL_SCAN_GENERAL_STORE_EXTRA_HASHES
      • ClamScan option: --json-store-extra-hashes (default off)
      • clamd.conf option: JsonStoreExtraHashes (default 'no')
    • The file hash is now stored as "sha2-256" instead of "FileMD5". If you
      enable the "extra hashes" option, then it will also record "md5" and "sha1".

    • Each object scanned now has a unique "Object ID".

    GitHub pull request

  • Sigtool: Renamed the sigtool option --sha256 to --sha2-256.
    The original option is still functional but is deprecated.

    GitHub pull request

Other improvements

  • Set a limit on the max-recursion config option. Users will no longer be
    able to set max-recursion higher than 100.
    This change prevents errors on startup, or crashes when encountering
    a file with that many layers of recursion.

    GitHub pull request

  • Build system: CMake improvements to support compiling for the AIX platform.
    This change is courtesy of GitHub user KamathForAIX.

    GitHub pull request

  • Improved support for extracting malformed zip archives.
    This change is courtesy of Frederick Sell.

    GitHub pull request

  • Windows: Code quality improvement for the ClamScan and ClamDScan --move
    and --remove options.
    This change is courtesy of Maxim Suhanov.

    GitHub pull request

  • Added file type recognition for an initial set of AI model file types.

    The file type is accessible to applications using libclamav via the scan
    callback functions and as an optional output parameter to the scan functions:
    cl_scanfile_ex(), cl_scanmap_ex(), and cl_scandesc_ex().

    When scanning these files, type will now show "CL_TYPE_AI_MODEL" instead of
    "CL_TYPE_BINARY_DATA".

    GitHub pull request

  • Added support for inline comments in ClamAV configuration files.
    This change is courtesy of GitHub user userwiths.

    GitHub pull request
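
The notes above do not describe the inline comment syntax. The sketch below
assumes that a '#' character starts a comment that runs to the end of the
line, and it does not handle a '#' inside a quoted value. The helper name
strip_inline_comment is hypothetical, and this is not ClamAV's config parser.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not ClamAV's parser.
 * Truncates `line` in place at the first '#' (assumed comment marker),
 * then trims trailing whitespace left before the comment. */
void strip_inline_comment(char *line)
{
    char *hash = strchr(line, '#');
    size_t len;

    if (hash != NULL) {
        *hash = '\0';
    }

    len = strlen(line);
    while (len > 0 && isspace((unsigned char)line[len - 1])) {
        line[--len] = '\0';
    }
}
```

Under that assumption, a line such as "LogTime yes  # timestamps in the log"
would be read as "LogTime yes".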

  • Disabled the MyDoom hardcoded/heuristic detection because of false positives.

    GitHub pull request

  • Sigtool: Added support for creating .cdiff and .script patch files for
    CVDs that have underscores in the CVD name.
    Also improved support for relative paths with the --diff command.

    GitHub pull request

  • Windows: Improved support for file names with UTF-8 characters not found in
    the ANSI or OEM code pages when printing scan results or showing activity in
    the ClamDTOP monitoring utility.
    Fixed a bug with opening files with such names with the Sigtool utility.

    GitHub pull request #1

    GitHub pull request #2

  • Improved the code quality of the ZIP module. Added inline documentation.

    GitHub pull request #1

    GitHub pull request #2

  • Always run scan callbacks for embedded files. Embedded files are found within
    other files through signature matches instead of by parsing. They will now
    be processed the same way and then they can trigger application callbacks
    (e.g., "pre-scan", "post-scan", etc.).

    This change will impact scans with both the "leave-temps" feature and the
    "force-to-disk" feature enabled, resulting in additional temporary files.

    GitHub pull request

  • Added DevContainer templates to the ClamAV Git repository in order to make it
    easier to set up AlmaLinux or Debian development environments.

    GitHub pull request

Bug fixes

  • Reduced email multipart message parser complexity.

    GitHub pull request

  • Fixed possible undefined behavior in inflate64 module.
    The inflate64 module is a modified version of the zlib library, taken from
    version 1.2.3 with some customization and with some cherry-picked fixes.
    This adds one additional fix from zlib 1.2.9.
    Thank you to TITAN Team for reporting this issue.

    GitHub pull request

  • Fixed a bug in ClamD that broke reporting of memory usage on Linux.
    The STATS command can be used to monitor ClamD directly or through ClamDTOP.
    The memory stats feature does not work on all platforms (e.g., Windows).

    GitHub pull request

  • Windows: Fixed a build issue when the same library dependency is found in
    two different locations.

    GitHub pull request

  • Fixed an infinite loop when scanning some email files in debug-mode.
    This fix is courtesy of Yoann Lecuyer.

    GitHub pull request

  • Fixed a stack buffer overflow bug in the phishing signature load process.
    This fix is courtesy of GitHub user Shivam7-1.

    GitHub pull request

  • Fixed a race condition in the Freshclam feature tests.
    This fix is courtesy of GitHub user rma-x.

    GitHub pull request

  • Windows: Fixed a 5-byte heap buffer overread in the Windows unit tests.
    This fix is courtesy of GitHub user Sophie0x2E.

    GitHub pull request

  • Fixed double-extraction of OOXML-based office documents.

    GitHub pull request

  • ClamBC: Fixed crashes on startup.

    GitHub pull request

Acknowledgments

Special thanks to the following people for code contributions and bug reports:

  • b1tg
  • ChaoticByte
  • Frederick Sell
  • KamathForAIX
  • Mark Carey at SAP
  • Maxim Suhanov
  • rma-x
  • Shivam7-1
  • Sophie0x2E
  • TITAN Team
  • userwiths
  • Yoann Lecuyer
