Commit Graph

253 Commits

Author SHA1 Message Date
dal e7fedd0a59
working except for bulk in the validation 2025-02-11 15:21:53 -07:00
dal bf24b249d4
Refactor message and thread handling with explicit UUID references 2025-02-11 12:46:21 -07:00
dal 8e8140246d
Update message transformation to include chat and message IDs 2025-02-11 12:34:06 -07:00
dal 8b96ec01fb
Refactor message transformation with new container types and UUIDs 2025-02-11 11:53:06 -07:00
dal bf05c7f06b
Add new thread events for chat generation progress tracking 2025-02-11 11:21:57 -07:00
dal e7b96d9bd5
make a chart card 2025-02-11 10:58:25 -07:00
dal 41a7146a8d
Add initial chat response for new thread creation 2025-02-11 10:05:20 -07:00
dal 6fcfc4fba6
ids matching for stream messages 2025-02-11 09:57:19 -07:00
dal 2376153459
ids and initial message repeat handled 2025-02-11 09:36:28 -07:00
dal 973e9b41ce
ids are set 2025-02-11 09:21:08 -07:00
dal fcc15a0e6f
fix tool name transform path 2025-02-11 08:23:58 -07:00
dal 8940fbf3b6
using name as id for file type reasoning message 2025-02-11 08:21:32 -07:00
dal 8de08323fa
consistent message id for text stream 2025-02-11 08:10:58 -07:00
dal bcf1ac1a65
file stream is working 👍🏼 for create file. 2025-02-11 07:49:22 -07:00
dal afb40c90be
ok restructured the transformer to get vecs back 2025-02-11 07:40:07 -07:00
dal 90931fc029
Add support for chat routes and update thread route handling 2025-02-11 07:27:56 -07:00
dal d75931dcb0
We did it joe 2025-02-11 07:14:06 -07:00
dal 43e2cf44f4
it's working, but not sure if it's how I want it... 2025-02-10 17:09:01 -07:00
dal 61153020ba
save point 2025-02-10 16:08:08 -07:00
dal 8c6c1d83ee
still working on the darn stream 2025-02-10 14:49:22 -07:00
dal fd65408e30
clean up to try different impl 2025-02-10 14:22:13 -07:00
dal 2593536efa
So close, just a few more tweaks to clean up stream. 2025-02-10 13:35:57 -07:00
dal 118ef9c691
ok streaming it back 2025-02-10 12:29:13 -07:00
dal 8c8372b50e
ok everything sending back except create and modify 2025-02-10 12:15:21 -07:00
dal d0400b5226
transforms for events 2025-02-10 11:53:19 -07:00
dal 233b580e1c
data catalog search transformation 2025-02-10 11:29:08 -07:00
dal 6a67931667
message transform 2025-02-10 10:47:16 -07:00
dal 93206fcf5c
added in duration and passed back req params 2025-02-10 09:21:07 -07:00
dal 9169a1c9e8
small tweaks 2025-02-10 08:26:25 -07:00
dal c849a22b4f
added in progress on tools 2025-02-10 08:24:56 -07:00
dal 8bd14e0ee7
added in progress on stream messages 2025-02-10 08:17:14 -07:00
dal 789b22fe1e
Improve message streaming and tool call processing
- Enhance message streaming with more precise content and tool call handling
- Add logic to only send and store meaningful assistant messages and tool calls
- Prevent sending empty or redundant messages during stream processing
- Improve tool call and content update tracking in agent stream method
- Optimize message inclusion in recursive thread generation
2025-02-10 07:30:52 -07:00
dal 641527114c
refactor(agent): Enhance LLM request handling with improved tool call processing
- Add support for initial non-tool response and subsequent tool-enabled processing
- Implement `PendingToolCall` to manage incremental tool call updates
- Update `Delta` and related types to support more flexible tool call streaming
- Modify agent stream processing to handle tool calls with improved state management
- Add robust handling for tool call deltas and function call arguments
2025-02-10 07:17:14 -07:00
dal 373a83efea
refactor(agent): Extract agent prompt to constant and update date
- Move hardcoded agent prompt to a constant `AGENT_PROMPT` in the `agent_thread.rs` file
- Update the prompt's date from January 27, 2025 to February 7, 2025
- Simplify thread initialization by referencing the new constant
- Maintain existing prompt structure and guidelines
2025-02-07 14:18:02 -07:00
dal 4b743fe5ec
feat(agent): Add recursion depth limit to prevent infinite processing
- Implement a maximum recursion depth of 30 for agent thread processing
- Add recursion depth tracking to prevent potential infinite loops
- Provide user-friendly message when maximum recursion depth is reached
- Update debug logging to include current recursion depth
- Modify both synchronous and streaming thread processing methods
2025-02-07 11:35:13 -07:00
dal 8b51618afd
refactor(agent): Implement recursive stream processing with improved tool execution
- Refactor agent stream processing to use a recursive approach for handling tool calls
- Enhance tool execution with more robust error handling and result tracking
- Improve stream chunk processing with detailed state management
- Add support for recursive thread generation based on tool call results
- Implement cloning for LiteLLMClient to support stream processing tasks
2025-02-07 11:22:43 -07:00
dal bb4e4ca9d8
refactor(ai_tools): Standardize search tool response format and requirements
- Update search data catalog and file search tools to use consistent JSON response structure
- Enforce strict response format with mandatory "results" key and comprehensive field requirements
- Add explicit guidelines for result composition and handling of empty result sets
- Improve response predictability and parsing reliability for AI-powered search tools
2025-02-07 11:12:08 -07:00
dal 8054bedf1a
fix(ai_tools): Update LLM request parameters and improve response handling
- Add `stream: Some(false)` to file search and data catalog tools
- Make `json_schema` optional in `ResponseFormat` serialization
- Enhance logging in search file tool with debug and warning messages
- Improve error context when parsing LLM JSON responses
2025-02-07 11:05:39 -07:00
dal 864257bc24
refactor(agent): Improve thread-safe tool management with Arc
- Wrap agent tools in an Arc for safe concurrent access
- Modify tool addition methods to work with Arc-wrapped HashMap
- Ensure thread-safe tool registration and retrieval
- Update stream processing to use Arc-cloned tools reference
2025-02-07 10:53:26 -07:00
dal 2bf27a9eda
feat(agent): Enhance agent thread processing with improved debugging and user-specific streaming
- Add comprehensive debug logging for agent thread processing
- Modify agent thread to use user ID for streaming subscription
- Update stream processing to include detailed tool call and content logging
- Improve error handling and visibility in stream processing
- Add user context to agent thread initialization
- Remove redundant tool execution handling in stream processing
2025-02-07 10:39:15 -07:00
dal a2f3433555
refactor(tools): Implement ValueToolExecutor for generic tool output conversion
- Add `ValueToolExecutor` to convert tool outputs to `serde_json::Value`
- Introduce `IntoValueTool` trait for easy value type conversion
- Update agent tool addition methods to use new value conversion mechanism
- Simplify tool registration by automatically converting tool outputs
- Remove previous manual boxing and type conversion logic
2025-02-07 10:15:55 -07:00
dal 2090e0b7d7
refactor(post_thread): Simplify thread creation with agent-based approach
- Remove complex AI-driven thread generation modules
- Introduce a new `AgentThreadHandler` to manage thread creation
- Streamline post thread logic by delegating to agent handler
- Remove deprecated SQL generation and AI-related modules
- Reduce code complexity and improve maintainability
2025-02-07 09:15:58 -07:00
dal adba6d6954
feat(data_catalog): Implement AI-powered dataset search tool
- Add comprehensive dataset search functionality using LLM for intelligent dataset matching
- Implement search across datasets with relevance ranking based on YML content
- Create structured search result output with dataset metadata
- Add robust error handling, logging, and parsing for search operations
- Include test coverage for search result validation
- Enhance tool with flexible query parameter support and detailed response messages
2025-02-07 08:08:37 -07:00
dal 372694bf1f
feat(file_search): Implement advanced AI-powered file search tool
- Add comprehensive file search functionality using LLM for intelligent file matching
- Implement search across metric and dashboard files with relevance ranking
- Create structured search result output with file metadata
- Add robust error handling and logging for search operations
- Include test coverage for search result parsing
2025-02-07 07:47:35 -07:00
dal 94c1635a34
refactor(file_tools): Enhance file modification and creation processes
- Implement robust line-based content modification for metric and dashboard files
- Add comprehensive error handling and validation for file modifications
- Improve modification tracking with detailed modification results
- Optimize file processing with batch insertion and validation
- Add extensive test coverage for modification and validation logic
2025-02-07 01:29:49 -07:00
dal 0e9075ca2c
Added Line Number Formatting on File outputs 2025-02-07 00:29:11 -07:00
dal 900eb28a67
removed bool on create_file tool 2025-02-06 23:54:11 -07:00
dal 711bbe899a
refactor(tools): Enhance ToolExecutor trait and file-related tools
- Add generic Output type to ToolExecutor trait
- Update file tools to use strongly-typed output structs
- Modify agent and tool implementations to support generic output
- Improve error handling and result reporting in file-related tools
- Add more detailed status messages for file operations
2025-02-06 23:45:48 -07:00
dal 4ec6e78648
refactor(database): Update metric and dashboard file models and migrations
- Rename file type modules from `metric_file` and `dashboard_file` to `metric_yml` and `dashboard_yml`
- Modify metric files migration to use a verification enum instead of boolean
- Update messages_to_files junction table with UUID primary key and additional timestamp columns
- Adjust database models to support new file and message structures
- Refactor file creation utility to use new model structures
2025-02-06 17:07:52 -07:00
dal b0e40007b0
Merge branch 'staging' into dal/goat-chat 2025-02-06 16:23:12 -07:00
dal 3ec059a99e
feat(prompts): Enhance SQL generation prompt with database identifier guidance
- Add instruction about paying attention to database identifier in SQL generation prompt
- Clarify potential cross-database referencing considerations
- Update SQL generation guidelines to improve clarity and flexibility
2025-02-06 16:21:53 -07:00
dal 33d5990907
feat(datasets): Add database_identifier support for dataset creation and deployment
- Extend Dataset model and schema to include optional database_identifier field
- Update dataset creation and deployment routes to handle new database_identifier parameter
- Modify dataset DDL generation to use database_identifier for schema resolution when available
2025-02-06 15:18:40 -07:00
dal c7a6e2788a
Merge branch 'staging' into dal/simplify-deploy-endpoint 2025-02-06 15:00:23 -07:00
dal 2ec0b7743f
feat(snowflake): Improve timestamp handling and JSON processing
- Add support for parsing Snowflake timestamp structs with epoch and fraction fields
- Implement handling of Snowflake timestamp logical types (with and without timezone)
- Enhance JSON value processing to detect and convert Snowflake timestamp objects
- Add error handling and logging for timestamp parsing
2025-02-06 14:09:39 -07:00
dal ef685d87a4
Fix x axis intervals (#105)
* xAxisTickinerval

* fix(visualization): Add x-axis time unit configuration for bar and combo charts

- Extend chart configuration to support optional x-axis time unit
- Update modify visualization agent to dynamically set x-axis time interval
- Modify bar line and combo chart prompts to include x_axis_time_unit parameter

* only use valid time units

* refactor(visualization): Simplify x-axis time unit configuration

- Modify modify_visualization_agent to extract and remove x-axis time unit more efficiently
- Update global styling result structure for x-axis time interval
- Adjust format_label_prompt comment to clarify date format default behavior

---------

Co-authored-by: Nate Kelley <nate@buster.so>
2025-02-06 11:58:25 -08:00
dal c27c27a7e8
fix: Improve column reference validation in model expression checks
- Update column validation to use model-defined columns instead of dataset columns
- Enhance error message to clarify column reference context
- Refine validation logic for expression column references
2025-02-06 10:46:12 -07:00
dal 4e2e9795b6
refactor: Simplify dataset deployment and validation process
- Restructure deploy_datasets function to separate concerns
- Improve column and relationship validation in dataset deployment
- Enhance error handling and validation result generation
- Add support for more comprehensive column and relationship checks
- Refactor validation logic to handle multiple error types
2025-02-05 23:42:51 -07:00
dal 08ecb44de1
Merge branch 'staging' into dal/simplify-deploy-endpoint 2025-02-05 21:56:45 -07:00
dal e144377ada
Tweaked the fix sql to return a json output so we don't get parse errors. 2025-02-05 21:55:19 -07:00
dal b872cf63a4
feat(prompts): Enhance SQL query generation and error handling instructions
- Update failed SQL fix prompt to emphasize query output format
- Add clarification to dataset selector prompt about selecting multiple datasets
2025-02-05 18:24:34 -07:00
dal fa480f6797
feat: enhance column metadata retrieval across database sources
- Add support for capturing source type (table, view, materialized view)
- Improve column metadata queries for Postgres, MySQL, BigQuery, and Snowflake
- Include more comprehensive column information during dataset import
- Extend DatasetColumnRecord to include source_type field
2025-02-05 18:21:40 -07:00
dal 6e5c299389
feat: improve dataset column validation and deployment process
- Add comprehensive column validation before dataset deployment
- Validate existence of all required columns in source database
- Simplify column type and nullability retrieval
- Enhance error reporting for missing columns
- Update deployment logic to use pre-validated column information
2025-02-05 17:20:11 -07:00
dal f081f3e16e
feat: enhance dataset validation and deployment error handling
- Add detailed validation error logging in CLI
- Improve type compatibility checks in dataset validation
- Modify deployment process to handle and report validation errors more comprehensively
- Add Hash derive for Verification enum
- Update API and CLI to support more informative validation results
2025-02-05 17:04:13 -07:00
dal fb75c1f554
fix(search): Add organization_id filter to semantic and terms search queries 2025-02-05 16:23:24 -07:00
dal 7bcd7d81bc
make sure the output of fix sql is delimited 2025-02-05 16:19:23 -07:00
dal 3c82ac0774
feat: add dataset validation and improved deployment process 2025-02-05 15:00:52 -07:00
dal 2c7ef16956
fix the search value table 2025-02-05 12:30:51 -07:00
dal d9973a13dd
bugfix(datasets): add delete dataset route
Implement a new DELETE route for removing datasets by their ID
2025-02-05 11:58:43 -07:00
dal 2be7383656
you learn something new every day... a schema in pg can't start with a number. 2025-02-05 11:36:31 -07:00
dal 960c89ab84
fix: janky check for values 2025-02-04 17:26:57 -07:00
dal 87a6225f1d
remove the limit query bc mixing things up 2025-02-04 17:18:02 -07:00
dal 0fde90b848
refactor(snowflake_query): optimize Arrow data processing with explicit row collection (#92) 2025-02-04 15:57:35 -08:00
dal d0ff21e10d
Merge pull request #90 from buster-so/dal/stored_values_enum_push_to_description
feat(stored_values): enhance column value processing with enum detect…
2025-02-04 15:10:54 -08:00
dal d4825c0ffe
bugfix(snowflake_query): add data processing helpers for query results (#88)
- Introduce helper functions for processing string and JSON values
- Implement case-insensitive string and JSON value transformations
- Add robust timestamp parsing with error handling
- Enhance Snowflake query result processing with consistent data normalization
2025-02-04 14:32:23 -08:00
dal 59049b5604
refactor(stored_values): improve background processing and error handling for stored column values (#85)
- Refactor stored values processing in dataset deployment to use background task
- Add `StoredValueColumn` struct to encapsulate column processing details
- Implement `process_stored_values_background` for parallel and resilient value storage
- Add logging for successful and failed stored value processing
- Update CLI to handle optional SQL definitions and improve file processing
2025-02-04 11:30:45 -08:00
dal ef18eff61d
Merge branch 'staging' into dal/goat-chat 2025-02-04 09:26:16 -07:00
dal daf4ec794f
Upgrade: Updated to o3-mini models
upgrade(ai): update OpenAI model configurations and add support for O…
2025-02-04 07:50:19 -08:00
dal 158f5ba0a9
Dal/stored values fix (#81)
* fix: add stored values support for dataset columns

This commit introduces stored values functionality for dataset columns, including:
- Adding a `stored_values` flag to column deployment requests
- Implementing a mechanism to store column values during dataset deployment
- Updating data analyst and SQL generation agents to leverage stored values
- Creating a new utility module for stored values search and management

* refactor(stored_values): improve stored values implementation and schema management

This commit enhances the stored values functionality with several key improvements:
- Update schema and table creation to use organization ID as schema name
- Modify stored values storage to include column ID
- Improve value extraction and embedding generation process
- Remove unnecessary distance calculation in search results
- Clean up unused values_engine module
2025-02-04 07:15:34 -08:00
dal f11bfa9941
fix(invite_users): simplify and streamline user invitation process (#78) 2025-02-03 12:41:00 -08:00
dal 2a183ca711
fix(dashboard): improve dashboard access and permission handling (#76) 2025-02-03 12:19:59 -08:00
dal a665648308
feat: Implement file creation for metric and dashboard files
- Added implementation for creating metric files with database insertion
- Introduced `create_metric_file()` and `create_dashboard_file()` functions
- Updated `CreateFilesTool` to handle different file types
- Added `data_metadata` field to `MetricFile` struct
- Implemented basic file type validation and creation logic
2025-02-03 12:51:55 -07:00
dal a3ee00ff84
refactor: Enhance file creation tool with error handling and type validation
- Added robust error handling for JSON parameter parsing in `CreateFilesTool`
- Updated file type description to clarify naming conventions
- Prepared for file creation logic implementation
- Moved file-related types to a new `file_types` module
- Removed redundant `types.rs` file
2025-01-31 10:16:35 -07:00
dal 7d153e06af
refactor: Consolidate file-related tools into a single module
- Removed individual tool files for bulk file modifications, file creation, file opening, data catalog search, file search, and sending to user
- Created a new `file_tools` module in `tools/mod.rs` to centralize file-related tool implementations
- Commented out individual tool module imports in preparation for future implementation
- Simplified the tools module structure for better organization and maintainability
2025-01-30 14:58:41 -07:00
dal ec04a5e98e
refactor: Enhance Agent and Tool management with new methods and tests
- Added environment variable-based LLM client initialization in `Agent::new()`
- Introduced `add_tool()` and `add_tools()` methods for more flexible tool registration
- Implemented new `get_name()` method for `ToolExecutor` trait
- Added comprehensive test cases for Agent with and without tools
- Updated `AgentThread` with a convenient constructor method
- Temporarily commented out unused tool modules
- Added debug print in LiteLLM client for response logging
2025-01-30 14:12:59 -07:00
dal 27016df995
chore: cleanup limit insertion 2025-01-28 12:50:31 -07:00
dal 6a73b59aa1
deprecated old threads and messages table 2025-01-28 12:18:59 -07:00
dal 22d75ae0b6
refactor: Update messages table schema and database references
- Renamed existing messages table to `messages_deprecated`
- Created new `messages` table with updated schema and additional indexes
- Updated Diesel schema to reflect new table structure and relationships
- Added new foreign key constraints for threads and users
- Prepared for migration of existing message data
2025-01-28 12:03:11 -07:00
dal d7087e8cd5
chore: snowflake warehouse specification 2025-01-28 11:45:37 -07:00
dal 48447d5bc5
Snowflake limit to prevent memory issues 2025-01-28 11:24:37 -07:00
dal 9624bc33ad
feat: Add file junction table and update database schema
- Created `messages_to_files` junction table to link messages with dashboard and metric files
- Added foreign key constraints and indexes for efficient file-message relationships
- Updated Diesel schema to include new `messages_to_files`, `dashboard_files`, and `metric_files` tables
- Removed unnecessary timestamp triggers from migration files
2025-01-28 11:21:51 -07:00
dal f1879dc15c
new tables 2025-01-28 09:57:40 -07:00
dal 6271e00b70
chore: Enhance SQL agent generation with data source type context (#69)
- Added support for retrieving and passing data source type during SQL agent generation
- Updated SQL generation prompts to include data source type in system messages
- Modified `generate_sql_agent` function to fetch and utilize data source type information
- Improved SQL generation context by dynamically incorporating data source type details
2025-01-28 07:35:57 -08:00
dal e190ff9a14
I apolog 2025-01-28 08:29:43 -07:00
dal 0020e6ed4a
refactor: Rename and update dataset search tool
- Renamed `search_datasets.rs` to `search_data_catalog.rs`
- Updated `mod.rs` to reflect the new module and tool name
- Removed the placeholder `SearchDatasetsTool` implementation
- Prepared for future implementation of data catalog search functionality
2025-01-27 14:24:10 -07:00
dal 2bc68e8599
refactor: Move ToolExecutor trait to tools module
- Relocated `ToolExecutor` trait from `agent/types.rs` to `tools/mod.rs`
- Updated import paths in `agent/agent.rs` to reflect new trait location
- Simplified `types.rs` by removing unused imports and trait definition
- Prepared for more comprehensive tool management in the tools module
2025-01-26 19:31:51 -07:00
dal 692f8f7a1d
refactor: Update Agent and Thread types for improved message handling
- Renamed `Thread` to `AgentThread` for clarity
- Modified `ToolExecutor` to return `serde_json::Value` instead of `String`
- Updated message processing to handle new message structures
- Improved content handling in streaming and tool call scenarios
- Simplified message content extraction and serialization
2025-01-26 08:45:49 -07:00
dal aeb1a02ba1
refactor: Re-enable LiteLLM client and agent modules
- Uncommented and activated module implementations for `LiteLLMClient` and `Agent`
- Updated test cases to use new message and tool call structures
- Simplified test assertions for chat completion and tool call scenarios
- Restored module exports in `mod.rs` files
2025-01-26 08:38:20 -07:00
dal aaaa8ef544
refactor: Disable LiteLLM client and agent modules temporarily
- Commented out imports and module exports for `LiteLLMClient` and `Agent`
- Removed active module implementations in `mod.rs` files
- Simplified type definitions and test cases
- Prepared for potential future reimplementation or restructuring
2025-01-26 08:30:08 -07:00
dal 9d22a10983
refactor: Simplify LiteLLM types and improve error handling
- Simplified message and tool structures in `types.rs`
- Updated `agent.rs` to use new type structures
- Enhanced error handling in agent tests with detailed error printing
- Updated `.cursorrules` with debugging recommendation
- Removed redundant fields and improved type flexibility
- Streamlined serialization and deserialization of AI client types
2025-01-26 07:58:24 -07:00
dal c560b660bb
refactor: Enhance LiteLLM client types and improve type flexibility
- Updated `types.rs` to support more complex message and response structures
- Added support for multi-content messages with `Content` struct
- Introduced optional fields and improved serialization handling
- Enhanced type flexibility for AI client responses
- Updated test cases to reflect new type structures
- Added `Default` implementation for `LiteLLMClient`
- Improved support for tool calls and streaming responses
2025-01-25 17:24:07 -07:00
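
Note on commit 4b743fe5ec (recursion depth limit): the message describes capping agent thread processing at a maximum recursion depth of 30 and returning a user-friendly message once the cap is hit. A minimal sketch of that idea follows; it is not the repository's code. The names `Turn`, `process_thread`, `LIMIT_MESSAGE`, and the wording of the message are hypothetical, and only the value 30 comes from the commit message.

```rust
/// Depth cap; the value 30 is taken from the commit message.
const MAX_RECURSION_DEPTH: u32 = 30;

/// Hypothetical wording; the real user-facing message is not shown in the log.
const LIMIT_MESSAGE: &str = "Maximum recursion depth reached; stopping here.";

/// Hypothetical result of one model turn: a final answer, or a tool call
/// whose result must be fed back into another round of processing.
enum Turn {
    Final(String),
    NeedsTool,
}

/// Processes a thread recursively, tracking the current depth so that a
/// model which keeps requesting tools cannot loop forever.
fn process_thread(depth: u32, step: &mut dyn FnMut(u32) -> Turn) -> String {
    if depth >= MAX_RECURSION_DEPTH {
        // Bail out with a readable message instead of recursing further.
        return LIMIT_MESSAGE.to_string();
    }
    match step(depth) {
        Turn::Final(text) => text,
        // A tool was called: recurse one level deeper with the same step function.
        Turn::NeedsTool => process_thread(depth + 1, step),
    }
}
```

Passing the depth as an explicit argument, rather than keeping it in shared state, would let the synchronous and streaming paths mentioned in the commit share the same guard.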