From 21bd3f302edf10a46871f2d3360c3be9d5bfdc3d Mon Sep 17 00:00:00 2001
From: Martin Jambon
Date: Tue, 13 May 2025 21:09:49 -0700
Subject: [PATCH 1/3] Update documentation for paths.include/exclude

---
 docs/writing-rules/rule-syntax.md | 56 ++++++++++++++++++-------------
 1 file changed, 33 insertions(+), 23 deletions(-)

diff --git a/docs/writing-rules/rule-syntax.md b/docs/writing-rules/rule-syntax.md
index 07b546a9de..561f06a96b 100644
--- a/docs/writing-rules/rule-syntax.md
+++ b/docs/writing-rules/rule-syntax.md
@@ -1154,19 +1154,10 @@ Provide a category for users of the rule. For example: `best-practice`, `correct
 ### Excluding a rule in paths
 
 To ignore a specific rule on specific files, set the `paths:` key with
-one or more filters. The patterns apply to the full file paths
-relative to the project root.
-
-
+one or more filters. These filters are made of glob patterns that
+apply to the file paths relative to the project root. This works
+like the command-line filters `--exclude` and
+`--include` but in a rule-specific manner.
 
 Example:
 
@@ -1176,21 +1167,40 @@ rules:
     pattern: $X == $X
     paths:
       exclude:
-        - "src/**/*.jinja2"
-        - "*_test.go"
-        - "project/tests"
-        - project/static/*.js
+        - "*.jinja2"
+        - "**/backend/*_test.go"
+        - "/src/tests"
+        - /src/static/*.js
 ```
 
-When invoked with `semgrep -f rule.yaml project/`, the above rule runs on files inside `project/`, but no results are returned for:
+When invoked with `semgrep -f rule.yaml src` from the Git project
+root, the above rule runs
+on file paths inside `src/`, but no results are returned for:
+
+- any file with a `.jinja2` file extension;
+- any file whose name ends in `_test.go` that exists in a folder named
+  `backend` such as the file `src/backend/server_test.go`;
+- any file inside `src/tests/` or its subdirectories, or a regular file
+  named `src/tests`;
+- any file matching the `src/static/*.js` glob pattern such as
+  `src/static/hello.js` but not `old/src/static/hello.js`.
 
-- any file with a `.jinja2` file extension
-- any file whose name ends in `_test.go`, such as `project/backend/server_test.go`
-- any file inside `project/tests` or its subdirectories
-- any file matching the `project/static/*.js` glob pattern
+The selected set of files to scan with the rule
+is independent of the current work
+folder. The commands `semgrep -f rule.yaml src` and
+`(cd src; semgrep -f rule.yaml .)` will therefore scan the same files.
 
 :::note
-The glob syntax is from [Python's `wcmatch`](https://pypi.org/project/wcmatch/) and is used to match against the given file and all its parent directories.
+The glob syntax conforms to the
+[Semgrepignore v2](https://semgrep.dev/docs/semgrepignore-v2-reference)
+and Gitignore specifications. Patterns are
+matched against the normalized file path relative to the project root as
+well as all its parent directories. Beware that the presence of
+a leading slash (e.g. `/*.c`) or a slash in the middle of the pattern
+(e.g. `a/*.c`) anchors the pattern to the project root. `a/*.c` is
+equivalent to `/a/*.c` and will match the path `/a/b.c` but won't
+match `/x/a/b.c`. All other patterns are unanchored e.g.
+`*.c` matches all of `/b.c`, `/a/b.c`, and `/x/a/b.c`.
 :::
 
 ### Limiting a rule to paths

From 951b08f100f8b34a3676e2ba986b300cd60e4ad9 Mon Sep 17 00:00:00 2001
From: Martin Jambon
Date: Wed, 14 May 2025 13:59:05 -0700
Subject: [PATCH 2/3] Update docs/writing-rules/rule-syntax.md

Co-authored-by: s-santillan
---
 docs/writing-rules/rule-syntax.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/writing-rules/rule-syntax.md b/docs/writing-rules/rule-syntax.md
index 561f06a96b..38ae08839f 100644
--- a/docs/writing-rules/rule-syntax.md
+++ b/docs/writing-rules/rule-syntax.md
@@ -1196,7 +1196,7 @@ The glob syntax conforms to the
 and Gitignore specifications. Patterns are
 matched against the normalized file path relative to the project root as
 well as all its parent directories. Beware that the presence of
-a leading slash (e.g. `/*.c`) or a slash in the middle of the pattern
+a leading slash (such as `/*.c`) or a slash in the middle of the pattern
 (e.g. `a/*.c`) anchors the pattern to the project root. `a/*.c` is
 equivalent to `/a/*.c` and will match the path `/a/b.c` but won't
 match `/x/a/b.c`. All other patterns are unanchored e.g.

From 50fe97426e68167402dca2ea3e01df62779a50c4 Mon Sep 17 00:00:00 2001
From: Martin Jambon
Date: Wed, 14 May 2025 13:59:32 -0700
Subject: [PATCH 3/3] Update docs/writing-rules/rule-syntax.md

Co-authored-by: s-santillan
---
 docs/writing-rules/rule-syntax.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/writing-rules/rule-syntax.md b/docs/writing-rules/rule-syntax.md
index 38ae08839f..5c14fba4e9 100644
--- a/docs/writing-rules/rule-syntax.md
+++ b/docs/writing-rules/rule-syntax.md
@@ -1197,7 +1197,7 @@ and Gitignore specifications. Patterns are
 matched against the normalized file path relative to the project root as
 well as all its parent directories. Beware that the presence of
 a leading slash (such as `/*.c`) or a slash in the middle of the pattern
-(e.g. `a/*.c`) anchors the pattern to the project root. `a/*.c` is
+(such as `a/*.c`) anchors the pattern to the project root. `a/*.c` is
 equivalent to `/a/*.c` and will match the path `/a/b.c` but won't
 match `/x/a/b.c`. All other patterns are unanchored e.g.
 `*.c` matches all of `/b.c`, `/a/b.c`, and `/x/a/b.c`.
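
The anchoring rule stated in the note can be illustrated with a rough Python sketch. This is not Semgrep's implementation, and it models only what the note says: a leading or middle slash anchors a pattern to the project root, other patterns match at any depth, and a pattern is tried against the path and each of its parent directories. The helper names `is_anchored`, `match_segments`, and `excluded` are hypothetical, and negation, directory-only trailing slashes, and escaping are deliberately left out.

```python
from fnmatch import fnmatchcase


def is_anchored(pattern: str) -> bool:
    """A leading slash or a slash in the middle anchors the pattern.

    Assumption: a trailing slash alone does not anchor (as in Gitignore).
    """
    return "/" in pattern.rstrip("/")


def match_segments(pat: list[str], segs: list[str]) -> bool:
    """Match pattern segments against path segments, one segment at a time."""
    if not pat:
        return not segs
    if pat[0] == "**":
        # '**' consumes zero or more whole path segments.
        return any(match_segments(pat[1:], segs[i:]) for i in range(len(segs) + 1))
    return (
        bool(segs)
        and fnmatchcase(segs[0], pat[0])
        and match_segments(pat[1:], segs[1:])
    )


def excluded(pattern: str, path: str) -> bool:
    """Return True if `pattern` selects `path` or one of its parent directories.

    `path` is taken relative to the project root; a leading slash is ignored.
    """
    pat = [s for s in pattern.split("/") if s]
    parts = [s for s in path.split("/") if s]
    anchored = is_anchored(pattern)
    # Try the full path first, then each parent directory in turn.
    for end in range(len(parts), 0, -1):
        target = parts[:end]
        if anchored:
            if match_segments(pat, target):
                return True
        elif fnmatchcase(target[-1], pat[0]):
            # Unanchored patterns have a single segment and match at any depth.
            return True
    return False
```

Under these assumptions, `excluded("a/*.c", "/a/b.c")` holds while `excluded("a/*.c", "/x/a/b.c")` does not, and `excluded("*.c", path)` holds for each of `/b.c`, `/a/b.c`, and `/x/a/b.c`, which mirrors the cases listed in the note.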