Skip to content

Commit e8950e6

Browse files
authored
feat(sandbox): add L7 query parameter matchers (#617)
* feat(sandbox): add L7 query parameter matchers Signed-off-by: John Myers <9696606+johntmyers@users.noreply.github.com> * fix(sandbox): decode + as space in query params and validate glob syntax Three improvements from PR #617 review: 1. Decode + as space in query string values per the application/x-www-form-urlencoded convention. This matches Python's urllib.parse, JavaScript's URLSearchParams, Go's url.ParseQuery, and most HTTP frameworks. Literal + should be sent as %2B. 2. Add glob pattern syntax validation (warnings) for query matchers. Checks for unclosed brackets and braces in glob/any patterns. These are warnings (not errors) because OPA's glob.match is forgiving, but they surface likely typos during policy loading. 3. Add missing test cases: empty query values, keys without values, unicode after percent-decoding, empty query strings, and literal + via %2B encoding. * fix(sandbox): add missing query_params field in forward proxy L7 request info * style(sandbox): fix formatting in proxy L7 query param parsing --------- Signed-off-by: John Myers <9696606+johntmyers@users.noreply.github.com> Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
1 parent 8c4b172 commit e8950e6

15 files changed

Lines changed: 1202 additions & 20 deletions

File tree

architecture/sandbox.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -968,7 +968,7 @@ flowchart LR
968968
| `EnforcementMode` | `Audit`, `Enforce` | What to do on L7 deny (log-only vs block) |
969969
| `L7EndpointConfig` | `{ protocol, tls, enforcement }` | Per-endpoint L7 configuration |
970970
| `L7Decision` | `{ allowed, reason, matched_rule }` | Result of L7 evaluation |
971-
| `L7RequestInfo` | `{ action, target }` | HTTP method + path for policy evaluation |
971+
| `L7RequestInfo` | `{ action, target, query_params }` | HTTP method, path, and decoded query multimap for policy evaluation |
972972

973973
### Access presets
974974

@@ -1047,7 +1047,7 @@ This enables credential injection on all HTTPS endpoints automatically, without
10471047

10481048
Implements `L7Provider` for HTTP/1.1:
10491049

1050-
- **`parse_request()`**: Reads up to 16 KiB of headers, parses the request line (method, path), determines body framing from `Content-Length` or `Transfer-Encoding: chunked` headers. Returns `L7Request` with raw header bytes (may include overflow body bytes).
1050+
- **`parse_request()`**: Reads up to 16 KiB of headers, parses the request line (method, path), decodes query parameters into a multimap, determines body framing from `Content-Length` or `Transfer-Encoding: chunked` headers. Returns `L7Request` with raw header bytes (may include overflow body bytes).
10511051

10521052
- **`relay()`**: Forwards request headers and body to upstream (handling Content-Length, chunked, and no-body cases), then reads and relays the full response back to the client.
10531053

@@ -1060,7 +1060,7 @@ Implements `L7Provider` for HTTP/1.1:
10601060
`relay_with_inspection()` in `crates/openshell-sandbox/src/l7/relay.rs` is the main relay loop:
10611061

10621062
1. Parse one HTTP request from client via the provider
1063-
2. Build L7 input JSON with `request.method`, `request.path`, plus the CONNECT-level context (host, port, binary, ancestors, cmdline)
1063+
2. Build L7 input JSON with `request.method`, `request.path`, `request.query_params`, plus the CONNECT-level context (host, port, binary, ancestors, cmdline)
10641064
3. Evaluate `data.openshell.sandbox.allow_request` and `data.openshell.sandbox.request_deny_reason`
10651065
4. Log the L7 decision (tagged `L7_REQUEST`)
10661066
5. If allowed (or audit mode): relay request to upstream and response back to client, then loop

architecture/security-policy.md

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -467,9 +467,14 @@ rules:
467467
- allow:
468468
method: GET
469469
path: "/repos/**"
470+
query:
471+
per_page: "1*"
470472
- allow:
471473
method: POST
472474
path: "/repos/*/issues"
475+
query:
476+
labels:
477+
any: ["bug*", "p1*"]
473478
```
474479

475480
#### `L7Allow`
@@ -479,8 +484,9 @@ rules:
479484
| `method` | `string` | HTTP method: `GET`, `HEAD`, `POST`, `PUT`, `DELETE`, `PATCH`, `OPTIONS`, or `*` (any). Case-insensitive matching. |
480485
| `path` | `string` | URL path glob pattern: `**` matches everything, otherwise `glob.match` with `/` delimiter. |
481486
| `command` | `string` | SQL command: `SELECT`, `INSERT`, `UPDATE`, `DELETE`, or `*` (any). Case-insensitive matching. For `protocol: sql` endpoints. |
487+
| `query` | `map` | Optional REST query rules keyed by decoded query param name. Value is either a glob string (for example, `tag: "foo-*"`) or `{ any: ["foo-*", "bar-*"] }`. |
482488

483-
Method and command fields use `*` as wildcard for "any". Path patterns use `**` for "match everything" and standard glob patterns with `/` as a delimiter otherwise. See `sandbox-policy.rego` -- `method_matches()`, `path_matches()`, `command_matches()`.
489+
Method and command fields use `*` as wildcard for "any". Path patterns use `**` for "match everything" and standard glob patterns with `/` as a delimiter otherwise. Query matching is case-sensitive and evaluates decoded values; when duplicate keys are present in the request, every value for that key must match the configured matcher. See `sandbox-policy.rego` -- `method_matches()`, `path_matches()`, `command_matches()`, `query_params_match()`.
484490

485491
#### Access Presets
486492

crates/openshell-policy/src/lib.rs

Lines changed: 92 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -15,8 +15,8 @@ use std::path::Path;
1515

1616
use miette::{IntoDiagnostic, Result, WrapErr};
1717
use openshell_core::proto::{
18-
FilesystemPolicy, L7Allow, L7Rule, LandlockPolicy, NetworkBinary, NetworkEndpoint,
19-
NetworkPolicyRule, ProcessPolicy, SandboxPolicy,
18+
FilesystemPolicy, L7Allow, L7QueryMatcher, L7Rule, LandlockPolicy, NetworkBinary,
19+
NetworkEndpoint, NetworkPolicyRule, ProcessPolicy, SandboxPolicy,
2020
};
2121
use serde::{Deserialize, Serialize};
2222

@@ -120,6 +120,22 @@ struct L7AllowDef {
120120
path: String,
121121
#[serde(default, skip_serializing_if = "String::is_empty")]
122122
command: String,
123+
#[serde(default, skip_serializing_if = "BTreeMap::is_empty")]
124+
query: BTreeMap<String, QueryMatcherDef>,
125+
}
126+
127+
#[derive(Debug, Serialize, Deserialize)]
128+
#[serde(untagged)]
129+
enum QueryMatcherDef {
130+
Glob(String),
131+
Any(QueryAnyDef),
132+
}
133+
134+
#[derive(Debug, Serialize, Deserialize)]
135+
#[serde(deny_unknown_fields)]
136+
struct QueryAnyDef {
137+
#[serde(default, skip_serializing_if = "Vec::is_empty")]
138+
any: Vec<String>,
123139
}
124140

125141
#[derive(Debug, Serialize, Deserialize)]
@@ -176,6 +192,23 @@ fn to_proto(raw: PolicyFile) -> SandboxPolicy {
176192
method: r.allow.method,
177193
path: r.allow.path,
178194
command: r.allow.command,
195+
query: r
196+
.allow
197+
.query
198+
.into_iter()
199+
.map(|(key, matcher)| {
200+
let proto = match matcher {
201+
QueryMatcherDef::Glob(glob) => {
202+
L7QueryMatcher { glob, any: vec![] }
203+
}
204+
QueryMatcherDef::Any(any) => L7QueryMatcher {
205+
glob: String::new(),
206+
any: any.any,
207+
},
208+
};
209+
(key, proto)
210+
})
211+
.collect(),
179212
}),
180213
})
181214
.collect(),
@@ -275,6 +308,20 @@ fn from_proto(policy: &SandboxPolicy) -> PolicyFile {
275308
method: a.method,
276309
path: a.path,
277310
command: a.command,
311+
query: a
312+
.query
313+
.into_iter()
314+
.map(|(key, matcher)| {
315+
let yaml_matcher = if !matcher.any.is_empty() {
316+
QueryMatcherDef::Any(QueryAnyDef {
317+
any: matcher.any,
318+
})
319+
} else {
320+
QueryMatcherDef::Glob(matcher.glob)
321+
};
322+
(key, yaml_matcher)
323+
})
324+
.collect(),
278325
},
279326
}
280327
})
@@ -754,6 +801,49 @@ network_policies:
754801
assert_eq!(rule.binaries[0].path, "/usr/bin/curl");
755802
}
756803

804+
#[test]
805+
fn parse_l7_query_matchers_and_round_trip() {
806+
let yaml = r#"
807+
version: 1
808+
network_policies:
809+
query_test:
810+
name: query_test
811+
endpoints:
812+
- host: api.example.com
813+
port: 8080
814+
protocol: rest
815+
rules:
816+
- allow:
817+
method: GET
818+
path: /download
819+
query:
820+
slug: "my-*"
821+
tag:
822+
any: ["foo-*", "bar-*"]
823+
binaries:
824+
- path: /usr/bin/curl
825+
"#;
826+
let proto = parse_sandbox_policy(yaml).expect("parse failed");
827+
let allow = proto.network_policies["query_test"].endpoints[0].rules[0]
828+
.allow
829+
.as_ref()
830+
.expect("allow");
831+
assert_eq!(allow.query["slug"].glob, "my-*");
832+
assert_eq!(allow.query["slug"].any, Vec::<String>::new());
833+
assert_eq!(allow.query["tag"].any, vec!["foo-*", "bar-*"]);
834+
assert!(allow.query["tag"].glob.is_empty());
835+
836+
let yaml_out = serialize_sandbox_policy(&proto).expect("serialize failed");
837+
let proto_round_trip = parse_sandbox_policy(&yaml_out).expect("re-parse failed");
838+
let allow_round_trip = proto_round_trip.network_policies["query_test"].endpoints[0].rules
839+
[0]
840+
.allow
841+
.as_ref()
842+
.expect("allow");
843+
assert_eq!(allow_round_trip.query["slug"].glob, "my-*");
844+
assert_eq!(allow_round_trip.query["tag"].any, vec!["foo-*", "bar-*"]);
845+
}
846+
757847
#[test]
758848
fn parse_rejects_unknown_fields() {
759849
let yaml = "version: 1\nbogus_field: true\n";

crates/openshell-sandbox/data/sandbox-policy.rego

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -208,6 +208,7 @@ request_allowed_for_endpoint(request, endpoint) if {
208208
rule.allow.method
209209
method_matches(request.method, rule.allow.method)
210210
path_matches(request.path, rule.allow.path)
211+
query_params_match(request, rule)
211212
}
212213

213214
# --- L7 rule matching: SQL command ---
@@ -235,6 +236,55 @@ path_matches(actual, pattern) if {
235236
glob.match(pattern, ["/"], actual)
236237
}
237238

239+
# Query matching:
240+
# - If no query rules are configured, allow any query params.
241+
# - For configured keys, all request values for that key must match.
242+
# - Matcher shape supports either `glob` or `any`.
243+
query_params_match(request, rule) if {
244+
query_rules := object.get(rule.allow, "query", {})
245+
not query_mismatch(request, query_rules)
246+
}
247+
248+
query_mismatch(request, query_rules) if {
249+
some key
250+
matcher := query_rules[key]
251+
not query_key_matches(request, key, matcher)
252+
}
253+
254+
query_key_matches(request, key, matcher) if {
255+
request_query := object.get(request, "query_params", {})
256+
values := object.get(request_query, key, null)
257+
values != null
258+
count(values) > 0
259+
not query_value_mismatch(values, matcher)
260+
}
261+
262+
query_value_mismatch(values, matcher) if {
263+
some i
264+
value := values[i]
265+
not query_value_matches(value, matcher)
266+
}
267+
268+
query_value_matches(value, matcher) if {
269+
is_string(matcher)
270+
glob.match(matcher, [], value)
271+
}
272+
273+
query_value_matches(value, matcher) if {
274+
is_object(matcher)
275+
glob_pattern := object.get(matcher, "glob", "")
276+
glob_pattern != ""
277+
glob.match(glob_pattern, [], value)
278+
}
279+
280+
query_value_matches(value, matcher) if {
281+
is_object(matcher)
282+
any_patterns := object.get(matcher, "any", [])
283+
count(any_patterns) > 0
284+
some i
285+
glob.match(any_patterns[i], [], value)
286+
}
287+
238288
# SQL command matching: "*" matches any; otherwise case-insensitive.
239289
command_matches(_, "*") if true
240290

0 commit comments

Comments
 (0)