Changelog
This project adheres to Semantic Versioning.
Unreleased
0.96.0 - 2022-06-03
Added
- Generic mode: new option `generic_ellipsis_max_span` for controlling
  how many lines an ellipsis can match (#5211)
- Generic mode: new option `generic_comment_style` for ignoring
  comments that follow the specified syntax (C style, C++ style, or
  Shell style) (#3428)
- Metrics now include a list of features used during an execution.
  Examples of such features are: languages scanned, CLI options passed, keys used in rules, or certain code paths reached, such as using an `:include`
  instruction in a `.semgrepignore` file.
  These strings will NOT include user data or specific settings. As an example, with `semgrep scan --output=secret.txt`
  we might send `"option/output"` but will NOT send `"option/output=secret.txt"`.
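To illustrate the "feature name without the value" idea from the metrics entry above, here is a minimal sketch. The helper `feature_tag` is hypothetical and is not Semgrep's actual implementation; it only shows how a CLI argument could be reduced to a tag that carries no user data.

```python
def feature_tag(cli_arg: str) -> str:
    """Reduce a CLI argument to a value-free feature tag (hypothetical helper)."""
    # Strip the leading dashes, then keep only the option name:
    # everything after the first '=' is user data and is discarded.
    name = cli_arg.lstrip("-").split("=", 1)[0]
    return f"option/{name}"


print(feature_tag("--output=secret.txt"))  # option/output
```

The value `secret.txt` never makes it into the tag, which matches the behaviour the entry describes.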
Changed
- The output summarizing a scan's results has been simplified.
0.95.0 - 2022-06-02
Added
- SARIF output format now includes `fixes` section
- Rust: added support for method chaining patterns.
- `r2c-internal-project-depends-on`: support for poetry and gradle lockfiles
- M1 Mac support added to PyPI
- Accept `SEMGREP_BASELINE_REF` as alias for `SEMGREP_BASELINE_COMMIT`
- `r2c-internal-project-depends-on`:
  - pretty printing for SCA results
  - support for poetry and gradle lockfiles
- taint-mode: Taint tracking will now analyze lambdas in their surrounding context.
  Previously, if a variable became tainted outside a lambda, and this variable was
  used inside the lambda causing the taint to reach a sink, this was not being
  detected because any nested lambdas were "opaque" to the analysis. (Taint tracking
  looked at lambdas but as isolated functions.) Now lambdas are simply analyzed as if
  they were statement blocks. However, taint tracking still does not follow the flow
  of taint through the lambda's arguments!
- Metrics now include an anonymous Event ID. This is an ID generated at send-time
  and will be used to de-duplicate events that potentially get duplicated during transmission.
- Metrics now include an anonymous User ID. This ID is stored in the `~/.semgrep/settings.yml` file. If the ID disappears, the next run will generate a new one randomly. See the Anonymous User ID section in PRIVACY.md for more details.
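The taint-mode entry above distinguishes two ways taint can reach a lambda. The Python target below is a sketch of both. `get_user_input` and `run_query` are hypothetical stand-ins for whatever a rule declares as its source and sink, and what is actually reported depends on the rule.

```python
# Hypothetical source and sink; a real taint rule would name its own.
def get_user_input() -> str:
    return "user-controlled"


sink_calls = []


def run_query(query: str) -> None:
    sink_calls.append(query)


def handler_captured():
    data = get_user_input()  # taint is introduced outside the lambda
    # `data` is captured from the enclosing scope. With the lambda body
    # analyzed like a statement block, this flow to the sink is visible.
    callback = lambda: run_query(data)
    callback()


def handler_argument():
    # Here the taint arrives through the lambda's own parameter, the case
    # the entry says taint tracking still does not follow.
    callback = lambda query: run_query(query)
    callback(get_user_input())
```

At runtime both handlers send the same string to the sink; the difference is only in what the static analysis can see.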
Fixed
- M1 Mac installed via pip now links tree-sitter properly
- Restore `--sca`
Changed
- The `ci` CLI command will now include ignored matches in output formats
  that dictate they should always be included
- Previously, you could use `$X` in a message to interpolate the variable captured
  by a metavariable named `$X`, but there was no way to access the underlying value.
  However, sometimes that value is more important than the captured variable.
  Now you can use the syntax `value($X)` to interpolate the underlying
  propagated value if it exists (if not, it will just use the variable name).

  Example:

  Take a target file that looks like

      x = 42
      log(x)

  Now take a rule to find that log command:

      - id: example_log
        message: "Logged $SECRET: value($SECRET)"
        pattern: log(42)
        languages: [python]

  Before, this would have given you the message `Logged x: value(x)`. Now, it
  will give the message `Logged x: 42`.
- A parameter pattern without a default value can now match a parameter
  with a default value (#5021)
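The interpolation semantics of the `value($X)` entry above can be sketched in a few lines of Python. This is an illustrative model only, not Semgrep's code: `render_message` and the shape of `bindings` (metavariable name mapped to a pair of captured text and propagated value, or `None` when no value is known) are assumptions made for the example.

```python
import re

# A metavariable such as $X or $SECRET.
_METAVAR = r"\$[A-Z_][A-Z0-9_]*"
# Match value($X) first, falling back to a bare $X.
_PATTERN = re.compile(rf"value\(({_METAVAR})\)|({_METAVAR})")


def render_message(template: str, bindings: dict) -> str:
    """Interpolate $X and value($X) in a rule message (hypothetical model)."""

    def substitute(match: re.Match) -> str:
        if match.group(1):
            captured, value = bindings[match.group(1)]
            # Use the propagated value if known, else the captured text.
            return value if value is not None else captured
        return bindings[match.group(2)][0]

    return _PATTERN.sub(substitute, template)


print(render_message("Logged $SECRET: value($SECRET)", {"$SECRET": ("x", "42")}))
```

With the binding from the changelog example, `$SECRET` renders as the captured variable `x` and `value($SECRET)` as `42`. If no value has been propagated, both forms render as `x`.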
Fixed
- Numerous improvements to PHP parsing by switching to tree-sitter-php
  to parse PHP target code. Huge shoutout to Sjoerd Langkemper for most
  of the heavy lifting work
  (#3941, #2648, #2650, #3590, #3588, #3587, #3576, #3848, #3978, #4589)
- TS: support number and boolean typed metavariables (#5350)
- When a rule from the registry fails to parse, suggest that the user upgrade to
  the latest version of semgrep
- Scala: correctly handle `return` for taint analysis (#4975)
- PHP: correctly handle namespace use declarations when they don't rename
  the imported name (#3964)
- Constant propagation is now faster and more memory efficient when analyzing
  large functions with lots of variables.