Commit Graph

129 Commits

Author SHA1 Message Date
dal c6d70b62dc
ddl on query 2025-02-12 11:07:59 -07:00
dal 09ab45bbb5
cli release code 2025-02-12 09:00:02 -07:00
dal 37854342da
ok now generating descriptions ayo 2025-02-12 08:41:17 -07:00
dal 0c597e7b0f
snowflake type mapping correct 2025-02-12 08:19:38 -07:00
dal 59054cfefa
tweak to the diffing bug 2025-02-12 08:14:38 -07:00
dal 422b4c5da5
some temp fixes while we prep the new chat 2025-02-12 06:12:56 -07:00
dal 361c48fade
database_identifier during creation 2025-02-12 05:27:08 -07:00
dal ffaa373ec3
Merge remote-tracking branch 'origin/staging' into dal/simplify-deploy-endpoint 2025-02-12 05:25:09 -07:00
dal 61b6cfeffc
database mapping correct 2025-02-12 05:22:36 -07:00
dal c7a631de40
tweak on database from modelfiles 2025-02-12 05:04:34 -07:00
dal 2805d7ed70
implemented database name for snowflake warehouse. temp fix 2025-02-12 04:52:32 -07:00
dal 45739b73f2
get rid of unused code 2025-02-11 17:04:11 -07:00
dal 89912fed59
ITS WORKING 2025-02-11 16:57:44 -07:00
dal 29900149dd
almost there 2025-02-11 16:38:46 -07:00
dal d17cb7e49a
bulk work on deploy 2025-02-11 15:43:07 -07:00
dal e7fedd0a59
working except for bulk in the validation 2025-02-11 15:21:53 -07:00
dal 3ec059a99e
feat(prompts): Enhance SQL generation prompt with database identifier guidance
- Add instruction about paying attention to database identifier in SQL generation prompt
- Clarify potential cross-database referencing considerations
- Update SQL generation guidelines to improve clarity and flexibility
2025-02-06 16:21:53 -07:00
dal 33d5990907
feat(datasets): Add database_identifier support for dataset creation and deployment
- Extend Dataset model and schema to include optional database_identifier field
- Update dataset creation and deployment routes to handle new database_identifier parameter
- Modify dataset DDL generation to use database_identifier for schema resolution when available
2025-02-06 15:18:40 -07:00
dal c7a6e2788a
Merge branch 'staging' into dal/simplify-deploy-endpoint 2025-02-06 15:00:23 -07:00
dal 2ec0b7743f
feat(snowflake): Improve timestamp handling and JSON processing
- Add support for parsing Snowflake timestamp structs with epoch and fraction fields
- Implement handling of Snowflake timestamp logical types (with and without timezone)
- Enhance JSON value processing to detect and convert Snowflake timestamp objects
- Add error handling and logging for timestamp parsing
2025-02-06 14:09:39 -07:00
dal ef685d87a4
Fix x axis intervals (#105)
* xAxisTickinerval

* fix(visualization): Add x-axis time unit configuration for bar and combo charts

- Extend chart configuration to support optional x-axis time unit
- Update modify visualization agent to dynamically set x-axis time interval
- Modify bar line and combo chart prompts to include x_axis_time_unit parameter

* only use valid time units

* refactor(visualization): Simplify x-axis time unit configuration

- Modify modify_visualization_agent to extract and remove x-axis time unit more efficiently
- Update global styling result structure for x-axis time interval
- Adjust format_label_prompt comment to clarify date format default behavior

---------

Co-authored-by: Nate Kelley <nate@buster.so>
2025-02-06 11:58:25 -08:00
dal c27c27a7e8
fix: Improve column reference validation in model expression checks
- Update column validation to use model-defined columns instead of dataset columns
- Enhance error message to clarify column reference context
- Refine validation logic for expression column references
2025-02-06 10:46:12 -07:00
dal 4e2e9795b6
refactor: Simplify dataset deployment and validation process
- Restructure deploy_datasets function to separate concerns
- Improve column and relationship validation in dataset deployment
- Enhance error handling and validation result generation
- Add support for more comprehensive column and relationship checks
- Refactor validation logic to handle multiple error types
2025-02-05 23:42:51 -07:00
dal 08ecb44de1
Merge branch 'staging' into dal/simplify-deploy-endpoint 2025-02-05 21:56:45 -07:00
dal e144377ada
Tweaked the fix sql to return a json output so we don't get parse errors. 2025-02-05 21:55:19 -07:00
dal b872cf63a4
feat(prompts): Enhance SQL query generation and error handling instructions
- Update failed SQL fix prompt to emphasize query output format
- Add clarification to dataset selector prompt about selecting multiple datasets
2025-02-05 18:24:34 -07:00
dal fa480f6797
feat: enhance column metadata retrieval across database sources
- Add support for capturing source type (table, view, materialized view)
- Improve column metadata queries for Postgres, MySQL, BigQuery, and Snowflake
- Include more comprehensive column information during dataset import
- Extend DatasetColumnRecord to include source_type field
2025-02-05 18:21:40 -07:00
dal 6e5c299389
feat: improve dataset column validation and deployment process
- Add comprehensive column validation before dataset deployment
- Validate existence of all required columns in source database
- Simplify column type and nullability retrieval
- Enhance error reporting for missing columns
- Update deployment logic to use pre-validated column information
2025-02-05 17:20:11 -07:00
dal f081f3e16e
feat: enhance dataset validation and deployment error handling
- Add detailed validation error logging in CLI
- Improve type compatibility checks in dataset validation
- Modify deployment process to handle and report validation errors more comprehensively
- Add Hash derive for Verification enum
- Update API and CLI to support more informative validation results
2025-02-05 17:04:13 -07:00
dal fb75c1f554
fix(search): Add organization_id filter to semantic and terms search queries 2025-02-05 16:23:24 -07:00
dal 7bcd7d81bc
make sure the output of fix sql is delimited 2025-02-05 16:19:23 -07:00
dal 3c82ac0774
feat: add dataset validation and improved deployment process 2025-02-05 15:00:52 -07:00
dal 2c7ef16956
fix the search value table 2025-02-05 12:30:51 -07:00
dal d9973a13dd
bugfix(datasets): add delete dataset route
Implement a new DELETE route for removing datasets by their ID
2025-02-05 11:58:43 -07:00
dal 2be7383656
you learn something new every day... a schema in pg can't start with a number. 2025-02-05 11:36:31 -07:00
dal 960c89ab84
fix: janky check for values 2025-02-04 17:26:57 -07:00
dal 87a6225f1d
remove the limit query bc mixing things up 2025-02-04 17:18:02 -07:00
dal 0fde90b848
refactor(snowflake_query): optimize Arrow data processing with explicit row collection (#92) 2025-02-04 15:57:35 -08:00
dal d0ff21e10d
Merge pull request #90 from buster-so/dal/stored_values_enum_push_to_description
feat(stored_values): enhance column value processing with enum detect…
2025-02-04 15:10:54 -08:00
dal d4825c0ffe
bugfix(snowflake_query): add data processing helpers for query results (#88)
- Introduce helper functions for processing string and JSON values
- Implement case-insensitive string and JSON value transformations
- Add robust timestamp parsing with error handling
- Enhance Snowflake query result processing with consistent data normalization
2025-02-04 14:32:23 -08:00
dal 59049b5604
refactor(stored_values): improve background processing and error handling for stored column values (#85)
- Refactor stored values processing in dataset deployment to use background task
- Add `StoredValueColumn` struct to encapsulate column processing details
- Implement `process_stored_values_background` for parallel and resilient value storage
- Add logging for successful and failed stored value processing
- Update CLI to handle optional SQL definitions and improve file processing
2025-02-04 11:30:45 -08:00
dal daf4ec794f
Upgrade: Updated to o3-mini models
upgrade(ai): update OpenAI model configurations and add support for O…
2025-02-04 07:50:19 -08:00
dal 158f5ba0a9
Dal/stored values fix (#81)
* fix: add stored values support for dataset columns

This commit introduces stored values functionality for dataset columns, including:
- Adding a `stored_values` flag to column deployment requests
- Implementing a mechanism to store column values during dataset deployment
- Updating data analyst and SQL generation agents to leverage stored values
- Creating a new utility module for stored values search and management

* refactor(stored_values): improve stored values implementation and schema management

This commit enhances the stored values functionality with several key improvements:
- Update schema and table creation to use organization ID as schema name
- Modify stored values storage to include column ID
- Improve value extraction and embedding generation process
- Remove unnecessary distance calculation in search results
- Clean up unused values_engine module
2025-02-04 07:15:34 -08:00
dal f11bfa9941
fix(invite_users): simplify and streamline user invitation process (#78) 2025-02-03 12:41:00 -08:00
dal 2a183ca711
fix(dashboard): improve dashboard access and permission handling (#76) 2025-02-03 12:19:59 -08:00
dal 27016df995
chore: cleanup limit insertion 2025-01-28 12:50:31 -07:00
dal d7087e8cd5
chore: snowflake warehouse specification 2025-01-28 11:45:37 -07:00
dal 48447d5bc5
Snowflake limit to prevent memory issues 2025-01-28 11:24:37 -07:00
dal 6271e00b70
chore: Enhance SQL agent generation with data source type context (#69)
- Added support for retrieving and passing data source type during SQL agent generation
- Updated SQL generation prompts to include data source type in system messages
- Modified `generate_sql_agent` function to fetch and utilize data source type information
- Improved SQL generation context by dynamically incorporating data source type details
2025-01-28 07:35:57 -08:00
dal 5202438fa8
Dal/cli-updates-skip-dbt (#67)
* feat: enhance deploy command with skip_dbt option

- Updated the Deploy command to accept a `skip_dbt` boolean argument, allowing users to bypass the dbt run during deployment.
- Refactored the deploy function to conditionally execute the dbt command based on the `skip_dbt` flag, improving deployment flexibility.

* Refactor query engine and CLI commands for improved functionality and error handling

- Updated `get_bigquery_columns` and `get_snowflake_columns` functions to enhance column name handling and ensure proper error reporting.
- Modified `get_snowflake_client` to accept a database ID for better connection management.
- Enhanced the `deploy` command in the CLI to include additional parameters (`path`, `data_source_name`, `schema`, `env`) for more flexible deployments.
- Improved error handling and reporting in the `deploy` function, including detailed summaries of deployment errors and successful file processing.
- Updated `get_model_files` to accept a directory path and added checks for file existence, enhancing robustness.
- Adjusted model file structures to include schema information and refined the upload process to handle optional parameters more effectively.

These changes collectively improve the usability and reliability of the query engine and deployment process.

* Update dataset DDL generation to include optional YML file content

- Modified `generate_sql_agent` to append optional YML file content to dataset DDL
- Ensures more comprehensive dataset representation during SQL agent generation
- Handles cases where YML file might be present or absent gracefully
2025-01-24 16:00:38 -08:00