github GoogleCloudPlatform/gcsfuse v2.4.0
Gcsfuse v2.4.0

  • Parallel download:
    • Accelerates reads of large files by using the file cache directory as a prefetch buffer, with multiple workers downloading a large file in parallel.
    • This feature is useful for single threaded read scenarios that load large (>1GiB) files, such as model serving use cases and checkpoint restores.
    • This feature is disabled by default. To enable this feature:
      • Enabling the file cache feature (GKE instructions) is a prerequisite for using the parallel download feature, which uses the cache directory as a prefetch buffer. Although a cache is typically associated with repeat reads, with parallel downloads even first reads of large files are accelerated.
        • The file being read must fit within the file cache directory's available capacity, which can be controlled by max-size-mb. A value of "-1" allows the cache to use the volume's entire capacity; otherwise, specify a limit in MiB.
        • If the same file will be read multiple times, increase the ttl-secs value. A value of "-1" bypasses TTL expiration and serves the file from the cache if it's available.
      • Set enable-parallel-downloads:true to enable parallel downloads. The default is false.
      • Additional optional parameters:
        • parallel-downloads-per-file: The maximum number of workers to spawn per file to download the object from GCS into the file cache. Default is 16.
        • max-parallel-downloads: The maximum number of workers that can be spawned at any given time across all file download jobs. The default is 2x the number of CPU cores on the machine. A value of -1 means no limit.
        • download-chunk-size-mb: The size in MiB of each read request that each goroutine makes to GCS when downloading the object into the file cache. Default is 50. A parallel download is only triggered if the file being read is >= this value.
      • Note: If your application performs highly parallel reads (>8 threads), enabling this feature may cause a slight performance degradation. High read parallelism is typical of training workloads, so this feature is not recommended for training; it is recommended only for model serving and checkpoint restores, which typically read large files from a single thread.
    • Addresses #1300
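      Putting the settings above together, a gcsfuse config file enabling parallel downloads could look like the sketch below. The cache directory path is an illustrative assumption; adjust it for your environment.

      ```yaml
      # Illustrative sketch: gcsfuse config enabling parallel downloads.
      # cache-dir path is an assumption; pick a volume with enough capacity.
      cache-dir: /mnt/gcsfuse-cache
      file-cache:
        max-size-mb: -1                    # -1 = use the cache volume's entire capacity
        enable-parallel-downloads: true    # disabled by default
        parallel-downloads-per-file: 16    # max workers per file (default 16)
        max-parallel-downloads: -1         # -1 = no global worker limit
        download-chunk-size-mb: 50         # per-request chunk size in MiB (default 50)
      ```

      The file can then be passed at mount time with `gcsfuse --config-file=<path> <bucket> <mountpoint>`.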
  • Kernel-List-Cache
    • List responses, returned as part of a readdir operation, are cached in the kernel page cache. This can significantly speed up AI/ML training runs, which perform a full directory listing first, by serving repeat ListObjects calls locally from the kernel page cache. Due to potential coherency/consistency issues, this feature is recommended only for read-only volumes, specifically for serving and training.
    • This feature is disabled by default. To enable this feature:
      • Control cache invalidation via the --kernel-list-cache-ttl-secs cli flag or file-system:kernel-list-cache-ttl-secs config flag, where a value of:
        • 0 disables the cache. This is the default value.
        • A positive value sets the TTL (in seconds) for keeping the directory list response in the kernel page cache.
        • -1 bypasses TTL expiration and serves the list response from the cache whenever it is available.
    • Addresses #184
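    Using the flag names from the list above, a read-only serving volume could keep list responses cached indefinitely; both forms below are equivalent (bucket and mount point are placeholders):

    ```yaml
    # Config-file form: cache directory listings until the kernel evicts them.
    file-system:
      kernel-list-cache-ttl-secs: -1
    ```

    Or on the command line: `gcsfuse --kernel-list-cache-ttl-secs=-1 my-bucket /mnt/my-bucket`.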
  • CLI flags take precedence over config, a behavior change for logging flags: Going forward, command-line flags always take precedence for all settings. This changes the behavior of the logging flags (--log-file & --log-format), where the config file previously took precedence over the CLI flags. This affects you only if the same setting is specified in both the CLI and the config. - #2077
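  As an illustration of the new precedence rule (bucket and mount point names are placeholders):

  ```shell
  # Suppose config.yaml contains:
  #   logging:
  #     format: text
  # From v2.4.0 on, the CLI flag below wins, so logs are emitted as JSON.
  gcsfuse --config-file=config.yaml --log-format=json my-bucket /mnt/my-bucket
  ```

  Previously, the config file's logging settings would have silently overridden the CLI flags.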

Dependency Upgrades / CVE fixes:
