buster/api/libs/sql_analyzer/CLAUDE.md

81 lines
3.0 KiB
Markdown
Raw Normal View History

2025-03-22 13:44:49 +08:00
# SQL Analyzer Library
## Purpose
The SQL Analyzer library provides functionality to parse and analyze SQL queries, extracting tables, columns, joins, and CTEs with lineage tracing. It's designed for integration with a Tokio-based web server.
## Key Features
- Extracts tables with database/schema identifiers and aliases
- Links columns to their tables, deduplicating per table
- Identifies joins with lineage to base tables
- Recursively analyzes CTEs, tracking their lineage
- Flags vague references (unqualified columns or tables without schema)
- Provides non-blocking parsing for web server compatibility
## Internal Organization
### Module Structure
- **lib.rs**: Main entry point and public API
- **types.rs**: Data structures for tables, joins, CTEs, and query summaries
- **errors.rs**: Custom error types for SQL analysis
- **utils/mod.rs**: Core analysis logic using sqlparser
### Key Components
- **QuerySummary**: The main output structure containing tables, joins, and CTEs
- **TableInfo**: Information about tables including identifiers and columns
- **JoinInfo**: Describes joins between tables with conditions
- **CteSummary**: Contains a CTE's query summary and column mappings
- **SqlAnalyzerError**: Custom errors for parsing issues and vague references
- **QueryAnalyzer**: The worker class that implements the Visitor pattern
## Usage Patterns
### Basic Usage
```rust
use sql_analyzer::analyze_query;
let sql = "SELECT u.id FROM schema.users u JOIN schema.orders o ON u.id = o.user_id";
match analyze_query(sql.to_string()).await {
Ok(summary) => {
println!("Tables: {:?}", summary.tables);
println!("Joins: {:?}", summary.joins);
},
Err(e) => eprintln!("Error: {}", e),
}
```
### Working with CTEs
```rust
let sql = "WITH users_cte AS (SELECT id FROM schema.users)
SELECT o.* FROM schema.orders o JOIN users_cte ON o.user_id = users_cte.id";
let summary = analyze_query(sql.to_string()).await?;
// Access the CTE information
for cte in &summary.ctes {
println!("CTE: {}", cte.name);
println!("CTE base tables: {:?}", cte.summary.tables);
println!("Column mappings: {:?}", cte.column_mappings);
}
```
## Dependencies
- **sqlparser**: SQL parsing functionality
- **tokio**: Async runtime and blocking task management
- **anyhow**: Error handling
- **serde**: Serialization for query summary structures
- **thiserror**: Custom error definitions
## Testing
- Test cases should cover various SQL constructs
- Ensure tests for error cases (vague references)
- Test CTE lineage tracing and complex join scenarios
## Code Navigation Tips
- Start in lib.rs for the public API
- The core analysis logic is in utils/mod.rs
- The QueryAnalyzer struct implements the sqlparser Visitor trait
- The parsing of object names and column references is key to understanding the lineage tracking
## Common Pitfalls
- Vague references in SQL will cause errors by design
- Complex subqueries may need special handling
- Non-standard SQL dialects might not parse correctly with the generic dialect