Skip to content

Commit 5fd98aa

Browse files
authored
Merge pull request #112 from stackql/claude/fix-sql-window-function-01WbDt4A6kMPzL5KJmLWAHBU
Fix SQL window function query syntax
2 parents 6825d5d + 9e9278d commit 5fd98aa

File tree

262 files changed

+73225
-0
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

262 files changed

+73225
-0
lines changed
Lines changed: 231 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,231 @@
1+
# Window Functions and CTE Implementation for StackQL Parser
2+
3+
## Overview
4+
5+
This document describes the experimental implementation of SQL window functions and CTEs (Common Table Expressions) in the stackql-parser fork. This is a proof-of-concept implementation to validate the approach before implementing it properly in the main stackql-parser repository.
6+
7+
## Summary of Changes
8+
9+
### Files Modified
10+
11+
The following files in `internal/stackql-parser-fork/go/vt/sqlparser/` were modified:
12+
13+
1. **ast.go** - Added AST types for window functions and CTEs
14+
2. **sql.y** - Added grammar rules for parsing
15+
3. **token.go** - Added keyword mappings
16+
4. **constants.go** - Added constants for frame types
17+
5. **external_visitor.go** - Added Accept methods for new types
18+
19+
### New Test Files Created
20+
21+
- `window_test.go` - Unit tests for window function parsing
22+
- `cte_test.go` - Unit tests for CTE parsing
23+
24+
## Implementation Details
25+
26+
### Window Functions
27+
28+
#### AST Types Added (ast.go)
29+
30+
```go
31+
// OverClause represents an OVER clause for window functions
32+
OverClause struct {
33+
WindowName ColIdent
34+
WindowSpec *WindowSpec
35+
}
36+
37+
// WindowSpec represents a window specification
38+
WindowSpec struct {
39+
PartitionBy Exprs
40+
OrderBy OrderBy
41+
Frame *FrameClause
42+
}
43+
44+
// FrameClause represents a frame clause (ROWS/RANGE)
45+
FrameClause struct {
46+
Unit string // ROWS or RANGE
47+
Start *FramePoint
48+
End *FramePoint
49+
}
50+
51+
// FramePoint represents a frame boundary
52+
FramePoint struct {
53+
Type string // UNBOUNDED PRECEDING, CURRENT ROW, etc.
54+
Expr Expr // for N PRECEDING or N FOLLOWING
55+
}
56+
```
57+
58+
The `FuncExpr` struct was extended with an `Over *OverClause` field.
59+
60+
#### Grammar Rules Added (sql.y)
61+
62+
- `over_clause_opt` - Optional OVER clause after function calls
63+
- `window_spec` - Window specification (PARTITION BY, ORDER BY, frame)
64+
- `partition_by_opt` - Optional PARTITION BY clause
65+
- `frame_clause_opt` - Optional frame specification (ROWS/RANGE)
66+
- `frame_point` - Frame boundary points
67+
68+
#### Tokens Added
69+
70+
- `OVER`, `ROWS`, `RANGE`, `UNBOUNDED`, `PRECEDING`, `FOLLOWING`, `CURRENT`, `ROW`
71+
72+
#### Supported Syntax
73+
74+
```sql
75+
-- Simple window function
76+
SELECT SUM(count) OVER () FROM t
77+
78+
-- With ORDER BY
79+
SELECT RANK() OVER (ORDER BY count DESC) FROM t
80+
81+
-- With PARTITION BY
82+
SELECT SUM(count) OVER (PARTITION BY category) FROM t
83+
84+
-- With PARTITION BY and ORDER BY
85+
SELECT SUM(count) OVER (PARTITION BY category ORDER BY name) FROM t
86+
87+
-- With frame clause
88+
SELECT SUM(count) OVER (ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM t
89+
90+
-- Multiple window functions
91+
SELECT SUM(x) OVER (), COUNT(*) OVER (ORDER BY y) FROM t
92+
93+
-- Window functions with aggregates
94+
SELECT serviceName, COUNT(*) as count, SUM(COUNT(*)) OVER () as total FROM t GROUP BY serviceName
95+
```
96+
97+
### CTEs (Common Table Expressions)
98+
99+
#### AST Types Added (ast.go)
100+
101+
```go
102+
// With represents a WITH clause (CTE)
103+
With struct {
104+
Recursive bool
105+
CTEs []*CommonTableExpr
106+
}
107+
108+
// CommonTableExpr represents a single CTE definition
109+
CommonTableExpr struct {
110+
Name TableIdent
111+
Columns Columns
112+
Subquery *Subquery
113+
}
114+
```
115+
116+
The `Select` struct was extended with a `With *With` field.
117+
118+
#### Grammar Rules Added (sql.y)
119+
120+
- `cte_list` - List of CTE definitions
121+
- `cte` - Single CTE definition
122+
- Extended `base_select` with WITH clause alternatives
123+
124+
#### Tokens Added
125+
126+
- `RECURSIVE`
127+
128+
#### Supported Syntax
129+
130+
```sql
131+
-- Simple CTE
132+
WITH cte AS (SELECT id FROM t) SELECT * FROM cte
133+
134+
-- CTE with column list
135+
WITH cte (col1, col2) AS (SELECT id, name FROM t) SELECT * FROM cte
136+
137+
-- Multiple CTEs
138+
WITH cte1 AS (SELECT id FROM t1), cte2 AS (SELECT id FROM t2) SELECT * FROM cte1 JOIN cte2
139+
140+
-- Recursive CTE
141+
WITH RECURSIVE cte AS (SELECT 1 AS n UNION ALL SELECT n + 1 FROM cte WHERE n < 10) SELECT * FROM cte
142+
143+
-- CTE with window function
144+
WITH sales AS (SELECT product, amount FROM orders)
145+
SELECT product, SUM(amount) OVER (ORDER BY product) FROM sales
146+
```
147+
148+
## Key Design Decisions
149+
150+
### Window Functions
151+
152+
1. **OVER clause placement**: Added `over_clause_opt` to the `function_call_generic` rule to allow OVER on any generic function call.
153+
154+
2. **Frame specification**: Supports both ROWS and RANGE frame types with:
155+
- UNBOUNDED PRECEDING
156+
- UNBOUNDED FOLLOWING
157+
- CURRENT ROW
158+
- N PRECEDING
159+
- N FOLLOWING
160+
161+
3. **Named windows**: The grammar supports `OVER window_name` syntax for referencing named windows (though WINDOW clause definition is not yet implemented).
162+
163+
### CTEs
164+
165+
1. **Grammar approach**: Instead of using an optional `with_clause_opt` rule that includes an empty alternative (which caused grammar conflicts), we directly added WITH alternatives to the `base_select` rule.
166+
167+
2. **Recursive CTEs**: Supported via the `WITH RECURSIVE` syntax.
168+
169+
3. **Column lists**: Optional column list specification for CTEs is supported.
170+
171+
## Parser Conflicts
172+
173+
The implementation increases reduce/reduce conflicts from 461 to 464. This is acceptable for an experimental implementation.
174+
175+
## Testing
176+
177+
### Unit Tests
178+
179+
All parser unit tests pass:
180+
- 8 window function tests
181+
- 5 CTE tests
182+
183+
### Running Tests
184+
185+
```bash
186+
cd /home/user/stackql-devel/internal/stackql-parser-fork/go/vt/sqlparser
187+
go test -run "TestWindowFunctions|TestCTEs" -v
188+
```
189+
190+
## Next Steps for Production Implementation
191+
192+
1. **Upstream the changes**: Apply these changes to the main `stackql-parser` repository.
193+
194+
2. **Execution layer**: Implement window function and CTE execution in the SQLite backend:
195+
- SQLite already supports window functions and CTEs natively
196+
- Need to ensure the parsed AST is correctly converted to SQL for execution
197+
198+
3. **Named Windows**: Add support for the `WINDOW` clause to define named windows:
199+
```sql
200+
SELECT SUM(x) OVER w FROM t WINDOW w AS (PARTITION BY y ORDER BY z)
201+
```
202+
203+
4. **Additional window functions**: The parser supports any function name with OVER. Consider adding specific handling for:
204+
- ROW_NUMBER()
205+
- RANK()
206+
- DENSE_RANK()
207+
- LEAD()
208+
- LAG()
209+
- FIRST_VALUE()
210+
- LAST_VALUE()
211+
- NTH_VALUE()
212+
- NTILE()
213+
214+
5. **Robot tests**: Add integration tests that verify window functions and CTEs work end-to-end with actual cloud provider data.
215+
216+
## Known Limitations
217+
218+
1. **No execution support**: This implementation only adds parsing support. The execution layer still needs to be updated to handle window functions and CTEs.
219+
220+
2. **Pre-existing test failures**: The parser has some pre-existing test failures unrelated to window functions/CTEs (table name quoting, OR operator rendering). These should be addressed separately.
221+
222+
3. **Build complexity**: The local fork approach with replace directive in go.mod can cause directory conflicts when building the main stackql binary. For production, the changes should be upstreamed to stackql-parser.
223+
224+
## Files to Review
225+
226+
- `internal/stackql-parser-fork/go/vt/sqlparser/ast.go` - AST type definitions
227+
- `internal/stackql-parser-fork/go/vt/sqlparser/sql.y` - Grammar rules (lines ~3074-3410)
228+
- `internal/stackql-parser-fork/go/vt/sqlparser/token.go` - Keyword mappings
229+
- `internal/stackql-parser-fork/go/vt/sqlparser/constants.go` - Frame type constants
230+
- `internal/stackql-parser-fork/go/vt/sqlparser/window_test.go` - Window function tests
231+
- `internal/stackql-parser-fork/go/vt/sqlparser/cte_test.go` - CTE tests

go.mod

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -142,3 +142,5 @@ require (
142142
replace github.com/chzyer/readline => github.com/stackql/readline v0.0.2-alpha05
143143

144144
replace github.com/mattn/go-sqlite3 => github.com/stackql/stackql-go-sqlite3 v1.0.4-stackql
145+
146+
replace github.com/stackql/stackql-parser => ./internal/stackql-parser-fork
Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
engines:
2+
gofmt:
3+
enabled: true
4+
golint:
5+
enabled: true
6+
govet:
7+
enabled: true
8+
shellcheck:
9+
enabled: true
10+
duplication:
11+
enabled: false
12+
13+
ratings:
14+
paths:
15+
- "**.go"
16+
- "**.sh"
17+
18+
checks:
19+
argument-count:
20+
enabled: false
21+
complex-logic:
22+
enabled: false
23+
file-lines:
24+
enabled: false
25+
method-complexity:
26+
enabled: false
27+
method-count:
28+
enabled: false
29+
method-lines:
30+
enabled: false
31+
nested-control-flow:
32+
enabled: false
33+
return-statements:
34+
enabled: false
35+
similar-code:
36+
enabled: false
37+
identical-code:
38+
enabled: false
39+
40+
# Ignore generated code.
41+
exclude_paths:
42+
- "go/vt/proto/"
43+
- "go/vt/sqlparser/sql.go"
44+
- "py/util/grpc_with_metadata.py"
Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
Godeps/_workspace/pkg
2+
Godeps/_workspace/bin
3+
_test
4+
java/*/target
5+
java/*/bin
6+
php/vendor
7+
releases
8+
/dist/
9+
/vthook/
10+
/bin/
11+
/vtdataroot/
Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
* @sougou
2+
3+
/docker/ @derekperkins @dkhenry
4+
/helm/ @derekperkins @dkhenry
5+
/config/mycnf/ @morgo
6+
/go/vt/mysqlctl/mysqld.go @morgo
Lines changed: 68 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,68 @@
1+
---
2+
name: Bug Report
3+
about: You're experiencing an issue with Vitess that is different than the documented behavior.
4+
---
5+
6+
When filing a bug, please include the following headings if
7+
possible. Any example text in this template can be deleted.
8+
9+
#### Overview of the Issue
10+
11+
A paragraph or two about the issue you're experiencing.
12+
13+
#### Reproduction Steps
14+
15+
Steps to reproduce this issue, example:
16+
17+
1. Deploy the following `vschema`:
18+
19+
```javascript
20+
{
21+
"sharded": true,
22+
"vindexes": {
23+
"hash": {
24+
"type": "hash"
25+
},
26+
"tables": {
27+
"user": {
28+
"column_vindexes": [
29+
{
30+
"column": "user_id",
31+
"name": "hash"
32+
}
33+
]
34+
}
35+
}
36+
}
37+
```
38+
39+
1. Deploy the following `schema`:
40+
41+
```sql
42+
create table user(user_id bigint, name varchar(128), primary key(user_id));
43+
```
44+
45+
1. Run `SELECT...`
46+
1. View error
47+
48+
#### Binary version
49+
Example:
50+
51+
```sh
52+
giaquinti@workspace:~$ vtgate --version
53+
Version: a95cf5d (Git branch 'HEAD') built on Fri May 18 16:54:26 PDT 2018 by giaquinti@workspace using go1.10 linux/amd64
54+
```
55+
56+
#### Operating system and Environment details
57+
58+
OS, Architecture, and any other information you can provide
59+
about the environment.
60+
61+
- Operating system (output of `cat /etc/os-release`):
62+
- Kernel version (output of `uname -sr`):
63+
- Architecture (output of `uname -m`):
64+
65+
#### Log Fragments
66+
67+
Include appropriate log fragments. If the log is longer than a few dozen lines, please
68+
include the URL to the [gist](https://gist.github.com/) of the log instead of posting it in the issue.
Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
---
2+
name: Feature Request
3+
about: If you have something you think Vitess could improve or add support for.
4+
---
5+
6+
Please search the existing issues for relevant feature requests, and use the [reaction feature](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to add upvotes to pre-existing requests.
7+
8+
#### Feature Description
9+
10+
A written overview of the feature.
11+
12+
#### Use Case(s)
13+
14+
Any relevant use-cases that you see.
Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
---
2+
name: Question
3+
about: If you have a question, please check out our other community resources instead of opening an issue.
4+
---
5+
6+
Issues on GitHub are intended to be related to bugs or feature requests, so we recommend using our other community resources instead of asking here.
7+
8+
- [Vitess User Guide](https://vitess.io/user-guide/introduction/)
9+
- Any other questions can be asked in the community [Slack workspace](https://vitess.io/slack)

0 commit comments

Comments
 (0)