title | author | date | status | parent_prd | ticket |
---|---|---|---|---|---|
CLI Deployment Logic | Gemini Assistant | 2024-07-26 | Draft | semantic_layer_refactor_overview.md | N/A |
CLI Deployment Logic
Parent Project
This is a sub-PRD of the Semantic Layer and Deployment Refactor project. Please refer to the parent PRD for the overall project context, goals, and implementation plan.
Problem Statement
Once model files are discovered (as per prd_cli_config_and_discovery.md), the CLI needs to:
- Parse these YAML files into the new Rust structs defined in api/libs/semantic_layer/src/models.rs.
- Resolve the definitive data_source_name, database, and schema for each model. This involves checking the model file itself, then the ProjectConfig (if the model belongs to a project), and finally the global BusterConfig.
- Construct the payload for the /deploy_datasets API endpoint, which will now expect an array of these semantic models.
- Handle any pre-flight validations that make sense to perform on the CLI side before sending data to the API.
Current behavior:
- The CLI parses YAML into its own local structs (cli/cli/src/commands/deploy.rs's Model, Entity, etc.).
- Configuration resolution is less structured, primarily relying on global buster.yml settings or model-level overrides without a clear project-level intermediate step.
- The API request (DeployDatasetsRequest) is constructed based on these local CLI structs, not a shared semantic layer definition.
Expected behavior:
- Each discovered .yml model file will be parsed using serde_yaml::from_str into the semantic_layer::Model struct (or Vec<semantic_layer::Model> if a single file can contain multiple model definitions under a models: key).
- For each parsed semantic_layer::Model:
  - The database, schema, and data_source_name will be determined by checking, in order:
    - Values present directly within the parsed semantic_layer::Model (i.e., defined in the YAML itself).
    - Values from the ProjectConfig associated with the model's path (if applicable).
    - Values from the global BusterConfig.
  - If any of these (especially data_source_name, schema) are still None after this process, it should be an error, and the model should not be deployed.
  - The resolved database and schema will be populated back into the Option<String> fields on the semantic_layer::Model instance before sending to the API.
- The CLI will construct a JSON array of these fully resolved semantic_layer::Model objects to send to the /deploy_datasets API.
- Basic validations (e.g., presence of model name, required configuration) will be performed by the CLI.
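The lookup order described above can be sketched as a small helper. This is only an illustration under assumptions: the function names resolve_field and require_field are hypothetical, and the three arguments stand in for the values read from the model file, the ProjectConfig, and the global BusterConfig.

```rust
/// Pick the first value that is set, in precedence order:
/// model file, then project config, then global config.
/// (Hypothetical helper; not part of the existing CLI.)
fn resolve_field(
    model_value: Option<&str>,
    project_value: Option<&str>,
    global_value: Option<&str>,
) -> Option<String> {
    model_value
        .or(project_value)
        .or(global_value)
        .map(str::to_owned)
}

/// Same lookup, but a value that is missing at every level becomes an
/// error naming the model and the field. Intended for settings that must
/// be present before deploying, such as data_source_name and schema.
fn require_field(
    model_name: &str,
    field: &str,
    model_value: Option<&str>,
    project_value: Option<&str>,
    global_value: Option<&str>,
) -> Result<String, String> {
    resolve_field(model_value, project_value, global_value)
        .ok_or_else(|| format!("Model '{model_name}': {field} could not be resolved."))
}
```

A model-level value wins over a project-level one, which in turn wins over the global default. An optional setting such as database would go through resolve_field only, so it can stay None.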
Goals
- Implement YAML parsing for model files into semantic_layer::Model structs.
- Implement the configuration inheritance logic (Model File > ProjectConfig > Global BusterConfig) for data_source_name, database, and schema for each model.
- Ensure that database and schema are resolved and set on the semantic_layer::Model structs before they are sent to the API.
- Construct the API request payload as a JSON array of semantic_layer::Model objects.
- Perform essential CLI-side validations (e.g., model name presence, resolved configurations).
Non-Goals
- Deep validation of SQL expressions or business logic within the model (this is primarily the API/backend's responsibility).
- The actual HTTP call to the API (can be mocked or assumed for this PRD, focus is on payload construction).
Implementation Plan
Phase 1: Parsing and Configuration Resolution
Technical Design
1. Parsing Model Files:
- After all_model_files_with_context: Vec<(PathBuf, Option<ProjectConfig>)> is obtained from the discovery phase, iterate through this vector. For each (yml_file_path, option_project_config):
  - Read the content of yml_file_path.
  - Attempt to parse it. A key decision: Does a single .yml file define one Model or a Vec<Model> (e.g., under a top-level models: key, like the current BusterModel wrapper in CLI)?
    - Assumption for now: A single .yml file maps to one semantic_layer::Model. If it can map to Vec<Model>, the parsing and iteration logic will adjust accordingly.
    - let parsed_model: semantic_layer::Model = serde_yaml::from_str(&yml_content)?;
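The read-and-parse loop above might look roughly like the following sketch. To keep it dependency-free, the parser is passed in as a closure that stands in for serde_yaml::from_str; the function name parse_model_files and the String error type are assumptions made for illustration, not existing CLI code.

```rust
use std::fs;
use std::path::PathBuf;

/// Read each discovered file and parse it, keeping the project context
/// paired with the parsed value. Stops at the first failure and reports
/// which file caused it. `parse` is a stand-in for the real YAML parser.
fn parse_model_files<M, C, E, F>(
    files: Vec<(PathBuf, Option<C>)>,
    parse: F,
) -> Result<Vec<(M, Option<C>)>, String>
where
    E: std::fmt::Display,
    F: Fn(&str) -> Result<M, E>,
{
    let mut parsed = Vec::with_capacity(files.len());
    for (path, project_config) in files {
        // Attach the file path to I/O errors so the user can locate the problem.
        let content = fs::read_to_string(&path)
            .map_err(|e| format!("Failed to read {}: {e}", path.display()))?;
        // Same for parse errors; the parser's own message is appended.
        let model = parse(&content)
            .map_err(|e| format!("Failed to parse {}: {e}", path.display()))?;
        parsed.push((model, project_config));
    }
    Ok(parsed)
}
```

In the CLI, the closure would presumably wrap serde_yaml::from_str for semantic_layer::Model. If files turn out to hold several models, the closure could return a Vec and the loop would extend the output instead of pushing a single item.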
2. Configuration Resolution Logic:
- For each parsed_model and its option_project_config and the global buster_config:
// Conceptual logic in cli/cli/src/commands/deploy.rs
// ... assume parsed_model: semantic_layer::Model
// ... assume option_project_config: Option<ProjectConfig>
// ... assume global_buster_config: BusterConfig
fn resolve_model_configurations(
    models_with_context: Vec<(semantic_layer::Model, Option<ProjectConfig>)>,
    global_buster_config: &BusterConfig,
) -> Result<Vec<semantic_layer::Model>, anyhow::Error> {
    let mut resolved_models = Vec::new();

    for (mut model, proj_config_opt) in models_with_context {
        // CLI validation: check the name first, because the error messages below reference it.
        if model.name.is_empty() {
            return Err(anyhow::anyhow!("Found a model with an empty name."));
        }

        // Resolve data_source_name: model file, then project config, then global config.
        let resolved_ds_name = model.data_source_name.clone()
            .or_else(|| proj_config_opt.as_ref().and_then(|pc| pc.data_source_name.clone()))
            .or_else(|| global_buster_config.data_source_name.clone());

        // Resolve schema
        let resolved_schema = model.schema.clone()
            .or_else(|| proj_config_opt.as_ref().and_then(|pc| pc.schema.clone()))
            .or_else(|| global_buster_config.schema.clone());

        // Resolve database
        let resolved_database = model.database.clone()
            .or_else(|| proj_config_opt.as_ref().and_then(|pc| pc.database.clone()))
            .or_else(|| global_buster_config.database.clone());

        // Validation: schema and data_source_name are essential for API processing
        if resolved_ds_name.is_none() {
            return Err(anyhow::anyhow!(
                "Model '{}': data_source_name could not be resolved.", model.name
            ));
        }
        if resolved_schema.is_none() {
            return Err(anyhow::anyhow!(
                "Model '{}': schema could not be resolved.", model.name
            ));
        }

        // Database is also important for the API to correctly identify/qualify table names,
        // especially for type inference. Because semantic_layer::Model has database as
        // Option<String>, the API is assumed to handle None, so a missing database is not
        // an error here. It is still resolved whenever a value is available.
        model.data_source_name = resolved_ds_name;
        model.schema = resolved_schema;
        model.database = resolved_database; // This is already Option<String> on semantic_layer::Model

        // Any other simple, fast validations...

        resolved_models.push(model);
    }

    Ok(resolved_models)
}
Self-correction: The semantic_layer::Model from api/libs/semantic_layer/src/models.rs does not currently have data_source_name. This field is part of the DeployDatasetsRequest in the API and BusterConfig in the CLI. The API request will need a top-level data_source_name or each model in the array needs to carry its own. Given the project structure, it makes sense for data_source_name to be resolvable per model too and be part of the semantic_layer::Model struct sent to the API. This requires adding data_source_name: Option<String> to semantic_layer::Model (see prd_semantic_model_definition.md).
3. API Payload Construction:
- The resolved_models: Vec<semantic_layer::Model> is then serialized to JSON. This JSON array is the body of the request to /deploy_datasets.
// Conceptual: Sending to API
// let api_client = BusterClient::new(...);
// let response = api_client.deploy_datasets_raw(resolved_models).await?;
// The BusterClient method would take Vec<semantic_layer::Model> and serialize it.
Implementation Steps
- For each discovered model file path (and its associated Option<ProjectConfig>):
  a. Read file content.
  b. Parse YAML content into semantic_layer::Model (or Vec<semantic_layer::Model> if files can contain multiple). Handle parsing errors.
- Implement the resolve_model_configurations function (or similar logic) to determine data_source_name, schema, and database for each model using the precedence: Model file -> ProjectConfig -> Global BusterConfig.
  - (Depends on data_source_name: Option<String> being added to semantic_layer::Model)
- Ensure the resolved data_source_name, schema, and database are populated onto the semantic_layer::Model instances.
- Perform CLI-side validations (e.g., model name is not empty, required configurations like data_source_name and schema are present after resolution).
- Group the resolved and validated semantic_layer::Model objects into a Vec.
- This Vec<semantic_layer::Model> will be the collection to be serialized to JSON for the API request.
Tests
- Unit Tests for Parsing: ✅
  - Valid YAML model file parses correctly into semantic_layer::Model.
  - Invalid YAML structure results in an error.
  - File with multiple models (if supported) parses into Vec<semantic_layer::Model>.
- Unit Tests for Configuration Resolution: ✅
  - Model gets config from its own file.
  - Model gets config from ProjectConfig when not in file.
  - Model gets config from Global BusterConfig when not in file or ProjectConfig.
  - Error if required config (data_source_name, schema) is missing after all checks.
  - Correct precedence is followed.
- Integration-like Tests (can use deploy command with mocked API client): ✅
  - End-to-end flow: discovery -> parsing -> config resolution -> payload for API is correctly formed.
  - Scenario with buster.yml and projects structure.
  - Scenario with model-level overrides.
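One way the mocked API client for the integration-like tests could be set up is sketched below. Everything here is an assumption for illustration: the DeployClient trait, the MockClient type, the deploy function, and the trimmed-down Model struct are hypothetical stand-ins, not the real BusterClient or semantic_layer::Model.

```rust
/// Minimal stand-in for the shared model type; only the fields used here.
#[derive(Debug, Clone, PartialEq)]
struct Model {
    name: String,
    data_source_name: Option<String>,
    schema: Option<String>,
    database: Option<String>,
}

/// Hypothetical abstraction over the API client so tests can swap in a mock.
trait DeployClient {
    fn deploy_datasets(&mut self, models: &[Model]) -> Result<(), String>;
}

/// Mock that records every payload instead of making an HTTP call.
#[derive(Default)]
struct MockClient {
    received: Vec<Vec<Model>>,
}

impl DeployClient for MockClient {
    fn deploy_datasets(&mut self, models: &[Model]) -> Result<(), String> {
        self.received.push(models.to_vec());
        Ok(())
    }
}

/// Final deploy step: reject models that are missing a name or required
/// configuration, then hand the whole payload to the client in one call.
/// Returns the number of models sent.
fn deploy(client: &mut dyn DeployClient, models: Vec<Model>) -> Result<usize, String> {
    for m in &models {
        if m.name.is_empty() {
            return Err("Found a model with an empty name.".to_string());
        }
        if m.data_source_name.is_none() || m.schema.is_none() {
            return Err(format!(
                "Model '{}': required configuration is missing.",
                m.name
            ));
        }
    }
    client.deploy_datasets(&models)?;
    Ok(models.len())
}
```

A test can then assert on what the mock received, and also that nothing was sent when validation fails.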
Success Criteria
- CLI can parse valid YAML model files into semantic_layer::Model structs.
- Configuration inheritance (Model > Project > Global) for data_source_name, schema, database is correctly implemented and validated.
- The list of models to be deployed is correctly prepared with all necessary information for the API.
- CLI performs basic validations before attempting to send to API.
- All tests pass.
Dependencies on Other Components
- prd_semantic_model_definition.md: Critical dependency for the semantic_layer::Model struct definition, including the data_source_name, database, and schema optional fields.
- prd_cli_config_and_discovery.md: For obtaining the list of model files and their associated ProjectConfig context.
- prd_api_request_handling.md: The API endpoint must be ready to accept Vec<semantic_layer::Model> as its payload.
Security Considerations
- Error handling during file reading and parsing should be robust to prevent crashes or information leakage from malformed files.
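As one possible shape for such error handling, the sketch below defines an error type that carries only the file path plus a coarse reason, and never the file's contents. The ModelFileError type and its variants are hypothetical names chosen for illustration; the real CLI may prefer anyhow with added context.

```rust
use std::fmt;
use std::io;
use std::path::PathBuf;

/// Errors for reading and parsing model files. Deliberately stores only
/// the path and a coarse reason, not the raw file content, so messages
/// shown to the user cannot echo back data from a malformed file.
#[derive(Debug)]
enum ModelFileError {
    Read { path: PathBuf, kind: io::ErrorKind },
    Parse { path: PathBuf, line: Option<usize> },
}

impl ModelFileError {
    /// Build a Read error from an I/O error, keeping only its kind.
    fn from_io(path: PathBuf, err: &io::Error) -> Self {
        ModelFileError::Read { path, kind: err.kind() }
    }
}

impl fmt::Display for ModelFileError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ModelFileError::Read { path, kind } => {
                write!(f, "could not read {}: {:?}", path.display(), kind)
            }
            ModelFileError::Parse { path, line: Some(line) } => {
                write!(f, "could not parse {} at line {}", path.display(), line)
            }
            ModelFileError::Parse { path, line: None } => {
                write!(f, "could not parse {}", path.display())
            }
        }
    }
}

impl std::error::Error for ModelFileError {}
```

Returning a value like this instead of panicking keeps a single malformed file from crashing the deploy command.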
References
- api/libs/semantic_layer/src/models.rs
- cli/cli/src/commands/deploy.rs (for existing deployment logic)
- serde_yaml documentation.