refactor: deploy datasets much simpler now

dal 2025-02-05 14:05:12 -07:00
parent cace5d438a
commit 66d9528619
No known key found for this signature in database
GPG Key ID: 16F4B0E1E9F61122
7 changed files with 1572 additions and 33 deletions

2
cli/.gitignore vendored

@@ -16,4 +16,4 @@ Cargo.lock
 Makefile
 .vscode/
-/prds
+/prds

@@ -3,6 +3,10 @@ name = "buster-cli"
 version = "0.1.0"
 edition = "2021"
+[lib]
+name = "buster_cli"
+path = "src/lib.rs"
 # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
 [dependencies]
@@ -21,3 +25,6 @@ serde_yaml = "0.9.34"
 thiserror = "2.0.3"
 tokio = { version = "1.36.0", features = ["full"] }
 uuid = { version = "1.7.0", features = ["v4", "serde"] }
+[dev-dependencies]
+tempfile = "3.16.0"

@@ -1,30 +1,281 @@
# buster-cli
# Buster CLI
A CLI tool for creating and managing your semantic model in Buster.
A powerful command-line interface for managing semantic models in Buster. Deploy and manage your data models with ease, whether they're standalone or part of a dbt project.
This tool is two-way compatible with your dbt projects as well. We like dbt and think its a great tool,
## Features
- 🚀 Deploy semantic models directly from YAML files
- 🔄 Two-way compatibility with dbt projects
- 🎯 Simple configuration with smart defaults
- 📊 Support for dimensions, measures, and relationships
- 🛠️ Automatic SQL view generation
- 📝 Clear error reporting and validation
## Installation
TODO
## How does it work?
You can imagine Buster as a layer on top of your dbt project that allows you to create and manage semantic models.
```bash
cargo install buster-cli
```
## Quick Start
1. Obtain your Buster API key. You can create one [here](https://platform.buster.so/app/settings/api-keys).
1. **Authentication**
Initialize your project by running:
Get your API key from [Buster Platform](https://platform.buster.so/app/settings/api-keys) and authenticate:
```bash
buster init
```bash
buster auth
```
2. **Project Setup**
Initialize a new project:
```bash
buster init
```
This will:
- Create a `buster.yml` configuration file
- Set up the recommended directory structure
- Configure your data source connection
3. **Deploy Models**
```bash
buster deploy
```
## Project Structure
```
your-project/
├── buster.yml # Global configuration
├── models/ # Your semantic model definitions
│ ├── customers.yml
│ ├── orders.yml
│ └── products.yml
└── sql/ # SQL definitions (optional)
├── customers.sql
├── orders.sql
└── products.sql
```
This command will go through the following steps:
## Configuration
1. Authenticate with your Buster API key.
2. Checks to see if you have an existing dbt project. If you do, you will be prompted to use the existing project or create a new one.
### Global Configuration (buster.yml)
- If you choose to use the existing project, Buster will use the existing project to create semantic model files.
```yaml
# buster.yml
data_source_name: "my_warehouse" # Your default data source
schema: "analytics" # Default schema for models
```
### Model Definition (models/*.yml)
```yaml
# models/customers.yml
version: 1
models:
- name: customers
description: "Core customer data model"
data_source_name: "my_warehouse" # Optional, overrides global
schema: "analytics" # Optional, overrides global
# Define entities (for relationships)
entities:
- name: customer_id
expr: "id"
type: "primary"
description: "Primary customer identifier"
# Define dimensions
dimensions:
- name: email
expr: "email"
type: "string"
description: "Customer email address"
- name: signup_date
expr: "created_at::date"
type: "date"
description: "Date when customer signed up"
stored_values: true # Enable value caching
# Define measures
measures:
- name: total_customers
expr: "customer_id"
agg: "count_distinct"
description: "Total number of unique customers"
```
### SQL Definition (sql/*.sql)
```sql
-- sql/customers.sql
SELECT
id as customer_id,
email,
created_at
FROM raw.customers
WHERE deleted_at IS NULL
```
## Commands
### `buster deploy`
Deploy your semantic models to Buster.
```bash
# Deploy all models in the current directory
buster deploy
# Deploy a specific model
buster deploy models/customers.yml
# Deploy models in a specific directory
buster deploy models/
```
The deploy command will:
1. Find and validate all model files
2. Locate corresponding SQL files
3. Generate default SQL if none exists
4. Deploy to Buster with proper error handling
### `buster auth`
Manage your Buster authentication.
```bash
# Set up authentication
buster auth
# View current auth status
buster auth status
# Clear authentication
buster auth clear
```
### `buster init`
Initialize a new Buster project.
```bash
# Initialize in current directory
buster init
# Initialize in a specific directory
buster init my-project/
```
## Model Features
### Entity Relationships
Link models together using entity relationships:
```yaml
# models/orders.yml
models:
- name: orders
entities:
- name: customers # References customers.yml
expr: "customer_id"
type: "foreign"
description: "Link to customer"
```
### Stored Values
Enable value caching for better performance:
```yaml
dimensions:
- name: country
expr: "country_code"
type: "string"
stored_values: true # Values will be cached
```
### Default SQL Generation
If no SQL file exists, Buster generates a default SELECT statement:
```sql
SELECT * FROM schema.model_name
```
## Error Handling
The CLI provides clear error messages for common issues:
- Missing required fields
- Invalid relationships
- SQL syntax errors
- API communication issues
- Authentication problems
Example error output:
```
❌ Error processing customers.yml: data_source_name is required
⚠️ Warning: No SQL file found for 'customers', using default SELECT
```
## Best Practices
1. **Organization**
- Keep YAML files in `models/`
- Keep SQL files in `sql/`
- Use `buster.yml` for shared settings
2. **Naming**
- Use descriptive model names
- Match SQL and YAML file names
- Use lowercase with underscores
3. **Documentation**
- Add descriptions to all models
- Document dimensions and measures
- Explain relationships
4. **SQL**
- Keep SQL simple and focused
- Use CTEs for complex logic
- Add comments for clarity
## Known Limitations
- SQL files must be one directory up from YAML files
- Environment is fixed to "dev"
- No automatic relationship creation
- Simple SQL fallback may not suit all cases
## Troubleshooting
### Common Issues
1. **"Data source not found"**
- Verify data source exists in Buster
- Check data_source_name in config
- Ensure env='dev' is set
2. **"SQL file not found"**
- Check SQL file location
- Verify file naming matches
- Consider using default SQL
3. **"Invalid relationship"**
- Verify referenced model exists
- Check entity name matches
- Ensure proper file structure
## Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
## License
MIT License - see [LICENSE](LICENSE) for details.
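
The README text in this diff says that a model's `data_source_name` and `schema` are optional and override the global values in `buster.yml`, that a missing `data_source_name` is reported as an error, and that a model with no SQL file falls back to `SELECT * FROM schema.model_name`. A minimal sketch of that resolution order follows. The type and function names (`GlobalConfig`, `ModelConfig`, `resolve`, `default_sql`) are hypothetical, the "schema is required" message is an assumption, and none of this is taken from the actual `deploy_v2` implementation, which this page does not show.

```rust
/// Hypothetical stand-in for the settings read from `buster.yml`.
#[derive(Debug, Default, Clone)]
pub struct GlobalConfig {
    pub data_source_name: Option<String>,
    pub schema: Option<String>,
}

/// Hypothetical stand-in for one entry under `models:` in a model file.
#[derive(Debug, Default, Clone)]
pub struct ModelConfig {
    pub name: String,
    pub data_source_name: Option<String>,
    pub schema: Option<String>,
}

/// The values a deploy step would use after applying the overrides.
#[derive(Debug, PartialEq)]
pub struct Resolved {
    pub data_source_name: String,
    pub schema: String,
}

/// Model-level values win; otherwise fall back to the global ones.
/// If neither level sets a value, return an error message.
pub fn resolve(model: &ModelConfig, global: &GlobalConfig) -> Result<Resolved, String> {
    let data_source_name = model
        .data_source_name
        .clone()
        .or_else(|| global.data_source_name.clone())
        .ok_or_else(|| "data_source_name is required".to_string())?;
    // Assumption: schema is treated the same way as data_source_name.
    let schema = model
        .schema
        .clone()
        .or_else(|| global.schema.clone())
        .ok_or_else(|| "schema is required".to_string())?;
    Ok(Resolved { data_source_name, schema })
}

/// Fallback SQL used when no matching SQL file exists for a model.
pub fn default_sql(schema: &str, model_name: &str) -> String {
    format!("SELECT * FROM {}.{}", schema, model_name)
}
```

With a global `schema` of `analytics` and a model that sets `schema: staging`, `resolve` would pick `staging`, and `default_sql` would then build the fallback query against `staging.<model name>`.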

File diff suppressed because it is too large

@@ -3,9 +3,11 @@ mod deploy;
 mod generate;
 mod import;
 mod init;
+mod deploy_v2;
 pub use auth::auth;
 pub use deploy::deploy;
 pub use generate::generate;
 pub use import::import;
 pub use init::init;
+pub use deploy_v2::deploy_v2;

4
cli/src/lib.rs Normal file

@@ -0,0 +1,4 @@
+pub mod commands;
+pub mod utils;
+pub mod types;
+pub mod error;

@@ -4,7 +4,7 @@ mod types;
 mod utils;
 use clap::{Parser, Subcommand};
-use commands::{auth, deploy, generate, import, init};
+use commands::{auth, deploy, deploy_v2, generate, import, init};
 pub const APP_NAME: &str = "buster";
@@ -16,16 +16,8 @@ pub enum Commands {
     Generate,
     Import,
     Deploy {
-        #[arg(long)]
-        skip_dbt: bool,
         #[arg(long)]
         path: Option<String>,
-        #[arg(long)]
-        data_source_name: Option<String>,
-        #[arg(long)]
-        schema: Option<String>,
-        #[arg(long)]
-        env: Option<String>,
     },
 }
@@ -45,13 +37,7 @@ async fn main() {
         Commands::Auth => auth().await,
         Commands::Generate => generate().await,
         Commands::Import => import().await,
-        Commands::Deploy {
-            skip_dbt,
-            path,
-            data_source_name,
-            schema,
-            env,
-        } => deploy(skip_dbt, path.as_deref(), data_source_name.as_deref(), schema.as_deref(), env.as_deref()).await,
+        Commands::Deploy { path } => deploy_v2(path.as_deref()).await,
     };
     if let Err(e) = result {
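
The new `Commands::Deploy { path }` arm passes `path.as_deref()`, which borrows the owned `Option<String>` parsed by clap as an `Option<&str>` without cloning it. A minimal sketch of how a callee might consume that value is below. `resolve_deploy_path` and the fallback to the current directory are hypothetical; the body of `deploy_v2` is not visible on this page, so this only illustrates the calling convention.

```rust
use std::path::PathBuf;

/// Hypothetical helper: turn the optional `--path` value into a concrete
/// path, assuming the current directory is used when no path is given.
pub fn resolve_deploy_path(path: Option<&str>) -> PathBuf {
    PathBuf::from(path.unwrap_or("."))
}
```

A caller holding `path: Option<String>` would write `resolve_deploy_path(path.as_deref())`, mirroring the call shape of `deploy_v2(path.as_deref())` in the hunk above.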