What's Changed

ASH v3 Release by @scrthq, @awsmadi, @awsntheule, @rafaelpereyra and many more in #117

Full Changelog: v2.0.1...v3.0.0

ASH v3 Release

This PR includes the work comprising the next major version release of the Automated Security Helper.

Drivers
Breaking Changes
- aggregated_results.{txt,json} Structure
- Migration from git-secrets to detect-secrets
New Features / Enhancements
- SARIF as primary data structure for SAST reports
- CycloneDX as primary data structure for SBOM reports
- JSON output from ASHARP model for aggregated results
- Configuration Support
- Plugin Support / Extensibility

Feature Parity - Various Item Tracker

Offline mode (in progress)
aggregated_results.txt exists (in progress)
Documentation updates
- ASH configuration
  - Referencing environment variables from the config
  - Securely referencing protected values (e.g. scanner API keys) without exposing them in artifacts (WIP)
- Installing ASH in Python
- New command-line arguments
- Using previous ash_aggregated_results.json results to generate new report formats
- Customizing ASH with Plugins
  - Using the inspect outputs to identify mapping gaps (WIP)
- ash_defaults built-in plugin modules
- AWS access during ASH invocation (e.g. custom Inspector scanner or custom S3 reporter)

Drivers

The core drivers for the changes in this release are:

Standardization of ASH results data structure:
- ASH should produce machine-readable outputs by default so the outputs can be better leveraged by users and organizations integrating ASH into their SDLC processes.
Support for industry standard output formats:
- ASH should be able to produce reports from its standardized data structure that align with industry standards for security scanning and test reporting, e.g. SARIF, CycloneDX, JUnitXML.
ASH reports should be easily actionable:
- Reviewing an ASH report and identifying the issues that need to be acted on should be simple.
- ASH should support producing formats optimized for human-readability, e.g. HTML reports or text reports that display the findings in a way that focuses on what is important from the scan.
Extensibility and an overall better developer experience:
- ASH has historically been written mostly as shell scripts, with small amounts of other languages introduced over time as needs arose. This has made extending, developing, and testing ASH harder than it would be in a single language better suited to development, such as Python.
- Extending or customizing ASH has also been difficult without a deep understanding of ASH, often requiring internalization and additional administrative overhead.
Configurability:
- A feature request we've received often has been to surface a mechanism to configure ASH, e.g. providing custom path exclusions or providing configuration to underlying scanners.

Breaking Changes

The following changes in this release could impact how you currently use ASH.

`aggregated_results.{txt,json}` Structure

One of the primary goals of this release has been to improve how ASH collects, processes, and formats the outputs it produces across the suite of scanners it employs. Up until this release, the output format has been raw stdout/stderr redirection from the scanners themselves. This makes scan result processing manual, and the results often include a large amount of "noise" because all of the scanner output is captured.

This release changes the output format for the aggregated results to a standardized data model named the "ASHARP" model (ASH Aggregated Results Parser). This model is emitted as a JSON file to the output directory named aggregated_results.json.

(If you are not currently parsing the aggregated_results.{txt,json} output of ASH, you are unlikely to be impacted by this change.)

The output model JSON schema is available at src/automated_security_helper/schemas/ASHARPModel.json
The Pydantic model that generates the JSON schema is available at src/automated_security_helper/models/asharp_model.py
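
For consumers that do parse the aggregated results, a minimal sketch of loading the file might look like the following. Only the file name comes from these notes; the helper names `load_aggregated_results` and `top_level_keys` are hypothetical, and no field names are assumed here because the authoritative field list is the JSON schema referenced above.

```python
import json
from pathlib import Path


def load_aggregated_results(output_dir: str) -> dict:
    """Load aggregated_results.json from an ASH output directory.

    Hypothetical helper for illustration; it is not part of ASH.
    """
    results_path = Path(output_dir) / "aggregated_results.json"
    with results_path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    # The aggregated results are expected to be a single JSON object.
    if not isinstance(data, dict):
        raise ValueError(f"{results_path} does not contain a JSON object")
    return data


def top_level_keys(data: dict) -> list:
    """Return the sorted top-level keys, useful for a first look at the model."""
    return sorted(data)
```

Inspecting the top-level keys first, then comparing them against the published schema, is one low-risk way to update an existing parser.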

Migration from `git-secrets` to `detect-secrets`

detect-secrets currently provides a full Python interface and can have the version pinned within our pyproject.toml.
detect-secrets provides the ability to baseline a directory or file so acknowledged findings do not continue to raise false positives.
Within our testing, git-secrets found far fewer findings than detect-secrets, with a sample directory showing 2 secrets detected by git-secrets (an AWS key pair) vs 157 by detect-secrets (including the AWS key pair that git-secrets found).
- git-secrets only matches AWS credentials without custom rule/pattern authoring
- detect-secrets supports a large variety of predefined rules that greatly increase overall secret-type detection support:

$ detect-secrets scan --list-all-plugins
ArtifactoryDetector
AWSKeyDetector
AzureStorageKeyDetector
BasicAuthDetector
CloudantDetector
DiscordBotTokenDetector
GitHubTokenDetector
GitLabTokenDetector
Base64HighEntropyString
HexHighEntropyString
IbmCloudIamDetector
IbmCosHmacDetector
IPPublicDetector
JwtTokenDetector
KeywordDetector
MailchimpDetector
NpmDetector
OpenAIDetector
PrivateKeyDetector
PypiTokenDetector
SendGridDetector
SlackDetector
SoftlayerDetector
SquareOAuthDetector
StripeDetector
TelegramBotTokenDetector
TwilioKeyDetector

New Features / Enhancements

SARIF as primary data structure for SAST reports

The Static Analysis Results Interchange Format (SARIF) defines a standard format for the output of static analysis tools. ASH uses the SARIF 2.1.0 schema specification as an intermediary data format for SAST scanner results to emit reports from.

Along with being open source itself, SARIF has been chosen for ASH's SAST data format due to its broad ecosystem and existing integration support with common enterprise tooling.
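
As an illustration of what a SARIF 2.1.0 log looks like to a consumer, the sketch below builds a small sample document and tallies its results by level. The scanner name, rule IDs, and messages are made up for the example, and `count_results_by_level` is a hypothetical helper, not an ASH function.

```python
from collections import Counter

# Hypothetical sample data following the SARIF 2.1.0 top-level layout:
# a log holds "runs", each run names its tool and lists "results".
sample_sarif = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "example-scanner"}},
            "results": [
                {
                    "ruleId": "EX001",
                    "level": "error",
                    "message": {"text": "Hard-coded password"},
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {"uri": "app/settings.py"},
                                "region": {"startLine": 12},
                            }
                        }
                    ],
                },
                {
                    "ruleId": "EX002",
                    "level": "warning",
                    "message": {"text": "Use of assert in production code"},
                },
                {
                    # No explicit level on this result.
                    "ruleId": "EX003",
                    "message": {"text": "Broad exception handler"},
                },
            ],
        }
    ],
}


def count_results_by_level(sarif: dict) -> Counter:
    """Tally results across all runs by their "level" property."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Assumption for this sketch: a result without an explicit
            # level is counted as "warning".
            counts[result.get("level", "warning")] += 1
    return counts


if __name__ == "__main__":
    for level, count in sorted(count_results_by_level(sample_sarif).items()):
        print(f"{level}: {count}")
```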

CycloneDX as primary data structure for SBOM reports

Similar to SARIF, OWASP CycloneDX is a full-stack Bill of Materials (BOM) standard that provides advanced supply chain capabilities for cyber risk reduction.

Links:

OWASP CycloneDX home page
specification
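
For orientation, a CycloneDX JSON BOM can be read with nothing more than a JSON parser. The sketch below uses a small hand-written sample; the component entries and the `list_components` helper are hypothetical and only illustrate the general shape of the format.

```python
# Hypothetical sample BOM using the CycloneDX JSON layout: a "bomFormat"
# marker, a "specVersion", and a list of "components".
sample_bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [
        {
            "type": "library",
            "name": "requests",
            "version": "2.31.0",
            "purl": "pkg:pypi/requests@2.31.0",
        },
        {
            "type": "library",
            "name": "left-pad",
            "version": "1.3.0",
            "purl": "pkg:npm/left-pad@1.3.0",
        },
    ],
}


def list_components(bom: dict) -> list:
    """Return (name, version) pairs for every component, sorted by name."""
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX BOM")
    pairs = [
        (component["name"], component.get("version", ""))
        for component in bom.get("components", [])
    ]
    return sorted(pairs)


if __name__ == "__main__":
    for name, version in list_components(sample_bom):
        print(f"{name} {version}")
```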

JSON output from ASHARP model for aggregated results

The ASHARP model is a lightweight metadata wrapper that allows collection of all relevant data from a scan necessary to produce scan reports.

Configuration Support

ASH now has a local configuration format backed by the ASHConfig model JSON schema. The configuration can be authored in either JSON or YAML. If an explicit configuration path is not provided, ASH by default looks for a configuration file at the following locations:

The ASH_CONFIG environment variable, if set to a valid path to an ASH configuration file.
An ash.yaml or ash.yml in the root of the source directory of the scan.
An ash.json in the root of the source directory of the scan.
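
The lookup described above can be sketched roughly as follows. This is not ASH's actual implementation: the function name `resolve_config_path` is hypothetical, and the precedence shown (explicit path, then ASH_CONFIG, then the files in the source directory in the order listed) is an assumption based on the order of the list.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# File names taken from the list above; the search order is assumed.
CANDIDATE_NAMES = ("ash.yaml", "ash.yml", "ash.json")


def resolve_config_path(
    source_dir: str,
    explicit_path: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Path]:
    """Return the configuration file to use, or None if none is found."""
    env = os.environ if env is None else env

    # 1. An explicitly provided path skips discovery entirely.
    if explicit_path:
        return Path(explicit_path)

    # 2. ASH_CONFIG, if it points at an existing file.
    env_path = env.get("ASH_CONFIG")
    if env_path and Path(env_path).is_file():
        return Path(env_path)

    # 3. Well-known file names in the root of the source directory.
    for name in CANDIDATE_NAMES:
        candidate = Path(source_dir) / name
        if candidate.is_file():
            return candidate

    return None
```

Passing `env` as a parameter rather than reading `os.environ` directly keeps the sketch easy to exercise in tests.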

Plugin Support / Extensibility

ASH v3 introduces support for custom plugins in the form of Python modules extending the following module namespaces:

automated_security_helper.converters
- Converters are responsible for converting unscannable file formats into scannable ones.
- ASH currently includes the following ConverterPlugin implementations as of this release (checked means implemented, tested and ready to release):
  - ArchiveConverter: Identifies zip, tar, and tar.gz files in the source directory, searches for scannable files within the archive, and extracts the scannable files into the temporary working directory of the scan.
  - JupyterNotebookConverter: Identifies Jupyter Notebook (.ipynb) files and converts them to Python using nbconvert, outputting the converted Python files to the temporary working directory of the scan.
automated_security_helper.scanners
- Scanners are the core of ASH and are the integration point for SAST and SBOM scanners.
- ASH currently includes the following ScannerPlugin implementations as of this release (checked means implemented, tested and ready to release):
  - BanditScanner: Runs bandit to perform SAST scanning against Python files.
  - CdkNagScanner: Evaluates rendered CloudFormation YAML/JSON templates against CDK Nag's provided NagPacks. Defaults to including the AWS Solutions NagPack, but allows enabling any other CDK NagPack: HIPAA Security, NIST 800-53 rev 4, NIST 800-53 rev 5, and PCI DSS 3.2.1 NagPacks.
  - CfnNagScanner: Runs cfn-nag against rendered CloudFormation templates for IaC analysis.
  - CheckovScanner: Runs checkov to perform IaC/SAST scanning against applicable content in the source directory.
  - DetectSecretsScanner: Runs detect-secrets tool against scannable files in the source directory to identify secrets in code. Replaces git-secrets in ASH's scanner stack.
  - NpmAuditScanner: Runs npm/yarn/pnpm audit based on which package lock(s) are discovered in the source directory.
  - SemgrepScanner: Runs semgrep to perform SAST scans.
  - GrypeScanner: Runs grype to perform SAST scans.
  - SyftScanner: Runs syft to perform SBOM scans.
  - CustomScanner: Configuration-driven implementation that allows easy integration of custom scanner tools that emit SARIF and/or CycloneDX outputs.
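
As one illustration of the scanner behavior listed above, the lock-file discovery that NpmAuditScanner is described as performing could be sketched like this. The mapping uses the standard lock file names for npm, yarn, and pnpm; the function `detect_audit_tools` is hypothetical and does not reflect how ASH actually implements the scanner.

```python
from pathlib import Path

# Standard lock file names for each package manager.
LOCKFILE_TO_TOOL = {
    "package-lock.json": "npm",
    "yarn.lock": "yarn",
    "pnpm-lock.yaml": "pnpm",
}


def detect_audit_tools(source_dir: str) -> list:
    """Return the package managers whose lock files exist under source_dir."""
    found = set()
    for path in Path(source_dir).rglob("*"):
        if path.is_file() and path.name in LOCKFILE_TO_TOOL:
            found.add(LOCKFILE_TO_TOOL[path.name])
    return sorted(found)
```

A real scanner would likely also skip vendored directories such as node_modules; that filtering is left out here to keep the sketch short.
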
automated_security_helper.reporters
- Reporters are responsible for ingesting the ASHARPModel and outputting the data into different formats or data stores, e.g. to file or to a centralized security finding aggregation service like Amazon Security Hub.
- ASH currently includes the following ReporterPlugin implementations as of this release (checked means implemented, tested and ready to release):
  - ASFFReporter: Converts report to ASFF (Amazon Security Findings Format), saves as ash.asff in the output directory.
  - CSVReporter: Converts report to simple CSV format, saves as ash.csv in the output directory.
  - CycloneDXReporter: Converts SBOM report to CycloneDX JSON format, saves as ash.cdx.json in the output directory.
  - HTMLReporter: Converts report to simple HTML format, saves as ash.html in the output directory.
  - JSONReporter: Converts report to simple JSON format, saves as ash.json in the output directory.
  - JUnitXMLReporter: Converts report to JUnitXML format, saves as ash.junit.xml in the output directory.
  - MarkdownReporter: Converts report to Markdown format, saves as ash.md in the output directory. Provides useful top-level information around the scan results, including a list of file locations ranked by finding count to identify hotspots to focus on.
  - OCSFReporter: Converts report to OCSF (Open Cybersecurity Schema Framework) format, saves as ash.ocsf in the output directory.
  - SARIFReporter: Converts report to SARIF format, saves as ash.sarif in the output directory.
  - SPDXReporter: Converts SBOM report to SPDX JSON format, saves as ash.spdf.json in the output directory.
  - TextReporter: Converts report to a simple text-based report, saves as ash.txt in the output directory.
  - YAMLReporter: Converts report to simple YAML format, saves as ash.yaml in the output directory.
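
To give a feel for the reporter concept, here is a deliberately simplified, standalone sketch of a CSV-style reporter. It does not use ASH's ReporterPlugin base class or the ASHARPModel; the class `MinimalCSVReporter`, its `report` method, and the finding fields are all hypothetical stand-ins chosen for the example.

```python
import csv
import io
from typing import Dict, List


class MinimalCSVReporter:
    """Hypothetical stand-in for a reporter: findings in, CSV text out."""

    # Illustrative column names, not fields of the ASHARP model.
    fieldnames = ("rule_id", "severity", "file", "message")

    def report(self, findings: List[Dict[str, str]]) -> str:
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer,
            fieldnames=self.fieldnames,
            extrasaction="ignore",  # drop keys that are not report columns
        )
        writer.writeheader()
        for finding in findings:
            # Missing keys are written as empty cells by DictWriter.
            writer.writerow(finding)
        return buffer.getvalue()


if __name__ == "__main__":
    sample_findings = [
        {
            "rule_id": "EX001",
            "severity": "HIGH",
            "file": "app/settings.py",
            "message": "Hard-coded password",
        },
    ]
    print(MinimalCSVReporter().report(sample_findings))
```

An actual plugin would extend the corresponding module namespace listed above and receive the aggregated model rather than a plain list of dictionaries.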

awslabs/automated-security-helper v3.0.0 ASH v3.0.0 Release on GitHub