Design Overview
SQLStream is a lightweight, modular SQL query engine. It follows the classic database pipeline (parse, plan, optimize, execute) but queries files directly instead of managing its own storage.
Components
- Parser: Converts SQL strings into an Abstract Syntax Tree (AST).
- Planner: Converts the AST into a Logical Plan.
- Optimizer: Applies optimizations like predicate pushdown to the Logical Plan.
- Executor: Converts the optimized plan into a Physical Plan and executes it.
  - Volcano Executor: Pure Python, streaming, iterator-based.
  - Pandas Executor: Vectorized, in-memory execution.
- Readers: Abstractions for reading data from various sources (CSV, Parquet, S3).
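
To make the "streaming iterator-based" idea behind the Volcano Executor concrete, here is a minimal sketch using Python generators. It is not SQLStream's actual code: the operator names `scan_csv`, `filter_rows`, and `project` are hypothetical and chosen only for illustration. Each operator pulls one row at a time from its child, so no intermediate result is fully materialized.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]


def scan_csv(lines: Iterable[str]) -> Iterator[Row]:
    """Leaf operator (hypothetical): yield one row per CSV record."""
    yield from csv.DictReader(lines)


def filter_rows(child: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Pass through only the rows for which the predicate holds."""
    for row in child:
        if predicate(row):
            yield row


def project(child: Iterator[Row], columns: List[str]) -> Iterator[Row]:
    """Keep only the requested columns of each row."""
    for row in child:
        yield {name: row[name] for name in columns}


# Roughly: SELECT name FROM people WHERE age >= 18
people = io.StringIO("name,age\nalice,30\nbob,17\ncarol,42\n")
plan = project(
    filter_rows(scan_csv(people), lambda row: int(row["age"]) >= 18),
    ["name"],
)
result = list(plan)  # rows are pulled through the operators only here
```

Because the operators are generators, nothing is read until the consumer iterates over `plan`. A vectorized executor such as the Pandas one would instead load the data into memory and apply each step to whole columns.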
Data Flow
graph LR
SQL[SQL Query] --> Parser
Parser --> AST
AST --> Planner
Planner --> LogicalPlan
LogicalPlan --> Optimizer
Optimizer --> OptimizedPlan
OptimizedPlan --> Executor
Executor --> Results
Readers --> Executor
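
The Optimizer step in the flow above can be illustrated with a small predicate pushdown sketch. The plan node classes (`Scan`, `Filter`, `Project`) and the function `push_down_predicates` are assumptions made for this example and are not taken from SQLStream's source. The idea is that a filter sitting directly above a scan is folded into the scan, so the reader can drop rows as early as possible.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Tuple, Union

Predicate = Callable[[dict], bool]


@dataclass(frozen=True)
class Scan:
    """Hypothetical logical node: read rows from a source."""
    source: str
    predicate: Optional[Predicate] = None  # filter applied while reading


@dataclass(frozen=True)
class Filter:
    """Hypothetical logical node: keep rows matching a predicate."""
    child: "Plan"
    predicate: Predicate


@dataclass(frozen=True)
class Project:
    """Hypothetical logical node: keep a subset of columns."""
    child: "Plan"
    columns: Tuple[str, ...]


Plan = Union[Scan, Filter, Project]


def push_down_predicates(plan: Plan) -> Plan:
    """Fold a Filter into the Scan directly beneath it, where possible."""
    if isinstance(plan, Project):
        return replace(plan, child=push_down_predicates(plan.child))
    if isinstance(plan, Filter):
        child = push_down_predicates(plan.child)
        if isinstance(child, Scan) and child.predicate is None:
            # The reader now evaluates the predicate; the Filter node disappears.
            return replace(child, predicate=plan.predicate)
        return replace(plan, child=child)
    return plan  # a Scan has no children to rewrite


def is_adult(row: dict) -> bool:
    return int(row["age"]) >= 18


# Roughly: SELECT name FROM 'people.csv' WHERE age >= 18
logical_plan = Project(Filter(Scan("people.csv"), is_adult), ("name",))
optimized_plan = push_down_predicates(logical_plan)
```

After the rewrite, `optimized_plan` is a `Project` over a single `Scan` that carries the predicate. A real optimizer would also need to handle stacked filters, joins, and predicates that a given reader cannot evaluate.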