- Add comprehensive column validation before dataset deployment
- Validate existence of all required columns in source database
- Simplify column type and nullability retrieval
- Enhance error reporting for missing columns
- Update deployment logic to use pre-validated column information
- Add detailed validation error logging in CLI
- Improve type compatibility checks in dataset validation
- Modify deployment process to handle and report validation errors more comprehensively
- Add `Hash` derive for the `Verification` enum
- Update API and CLI to support more informative validation results
- Introduce helper functions for processing string and JSON values
- Implement case-insensitive string and JSON value transformations
- Add robust timestamp parsing with error handling
- Enhance Snowflake query result processing with consistent data normalization
- Refactor stored values processing in dataset deployment to use background task
- Add `StoredValueColumn` struct to encapsulate column processing details
- Implement `process_stored_values_background` for parallel and resilient value storage
- Add logging for successful and failed stored value processing
- Update CLI to handle optional SQL definitions and improve file processing
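The pre-deployment column check listed at the top of this group can be sketched roughly as follows. This is a minimal illustration, not the code in this change: `find_missing_columns` is a hypothetical helper, and the case-insensitive comparison is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical helper: returns the required columns that are absent from the
/// source database. Names are compared case-insensitively (an assumption).
fn find_missing_columns(required: &[&str], source_columns: &[&str]) -> Vec<String> {
    // Normalize the columns reported by the source database once.
    let available: HashSet<String> = source_columns
        .iter()
        .map(|c| c.to_lowercase())
        .collect();

    // Keep every required column that has no match, preserving request order.
    required
        .iter()
        .filter(|c| !available.contains(&c.to_lowercase()))
        .map(|c| c.to_string())
        .collect()
}
```

A non-empty result would be reported as a validation error before any dataset is deployed.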
* fix: add stored values support for dataset columns
This commit introduces stored values functionality for dataset columns, including:
- Adding a `stored_values` flag to column deployment requests
- Implementing a mechanism to store column values during dataset deployment
- Updating data analyst and SQL generation agents to leverage stored values
- Creating a new utility module for stored values search and management
* refactor(stored_values): improve stored values implementation and schema management
This commit enhances the stored values functionality with several key improvements:
- Update schema and table creation to use organization ID as schema name
- Modify stored values storage to include column ID
- Improve value extraction and embedding generation process
- Remove unnecessary distance calculation in search results
- Clean up unused `values_engine` module
- Added implementation for creating metric files with database insertion
- Introduced `create_metric_file()` and `create_dashboard_file()` functions
- Updated `CreateFilesTool` to handle different file types
- Added `data_metadata` field to `MetricFile` struct
- Implemented basic file type validation and creation logic
- Added robust error handling for JSON parameter parsing in `CreateFilesTool`
- Updated file type description to clarify naming conventions
- Prepared for file creation logic implementation
- Moved file-related types to a new `file_types` module
- Removed redundant `types.rs` file
- Removed individual tool files for bulk file modifications, file creation, file opening, data catalog search, file search, and sending to user
- Created a new `file_tools` module in `tools/mod.rs` to centralize file-related tool implementations
- Commented out individual tool module imports in preparation for future implementation
- Simplified the tools module structure for better organization and maintainability
- Added environment variable-based LLM client initialization in `Agent::new()`
- Introduced `add_tool()` and `add_tools()` methods for more flexible tool registration
- Implemented new `get_name()` method for `ToolExecutor` trait
- Added comprehensive test cases for Agent with and without tools
- Updated `AgentThread` with a convenient constructor method
- Temporarily commented out unused tool modules
- Added debug print in LiteLLM client for response logging
- Renamed existing messages table to `messages_deprecated`
- Created new `messages` table with updated schema and additional indexes
- Updated Diesel schema to reflect new table structure and relationships
- Added new foreign key constraints for threads and users
- Prepared for migration of existing message data
- Created `messages_to_files` junction table to link messages with dashboard and metric files
- Added foreign key constraints and indexes for efficient file-message relationships
- Updated Diesel schema to include new `messages_to_files`, `dashboard_files`, and `metric_files` tables
- Removed unnecessary timestamp triggers from migration files
- Added support for retrieving and passing data source type during SQL agent generation
- Updated SQL generation prompts to include data source type in system messages
- Modified `generate_sql_agent` function to fetch and utilize data source type information
- Improved SQL generation context by dynamically incorporating data source type details
- Renamed `search_datasets.rs` to `search_data_catalog.rs`
- Updated `mod.rs` to reflect the new module and tool name
- Removed the placeholder `SearchDatasetsTool` implementation
- Prepared for future implementation of data catalog search functionality
- Relocated `ToolExecutor` trait from `agent/types.rs` to `tools/mod.rs`
- Updated import paths in `agent/agent.rs` to reflect new trait location
- Simplified `types.rs` by removing unused imports and trait definition
- Prepared for more comprehensive tool management in the tools module
- Renamed `Thread` to `AgentThread` for clarity
- Modified `ToolExecutor` to return `serde_json::Value` instead of `String`
- Updated message processing to handle new message structures
- Improved content handling in streaming and tool call scenarios
- Simplified message content extraction and serialization
- Uncommented and activated module implementations for `LiteLLMClient` and `Agent`
- Updated test cases to use new message and tool call structures
- Simplified test assertions for chat completion and tool call scenarios
- Restored module exports in `mod.rs` files
- Commented out imports and module exports for `LiteLLMClient` and `Agent`
- Removed active module implementations in `mod.rs` files
- Simplified type definitions and test cases
- Prepared for potential future reimplementation or restructuring
- Simplified message and tool structures in `types.rs`
- Updated `agent.rs` to use new type structures
- Enhanced error handling in agent tests with detailed error printing
- Updated `.cursorrules` with debugging recommendation
- Removed redundant fields and improved type flexibility
- Streamlined serialization and deserialization of AI client types
- Updated `types.rs` to support more complex message and response structures
- Added support for multi-content messages with `Content` struct
- Introduced optional fields and improved serialization handling
- Enhanced type flexibility for AI client responses
- Updated test cases to reflect new type structures
- Added `Default` implementation for `LiteLLMClient`
- Improved support for tool calls and streaming responses
- Updated `.env.example` with new LLM configuration variables
- Modified `LiteLLMClient` to support API key and base URL from environment variables
- Enhanced client initialization to use env vars with optional parameter overrides
- Added comprehensive test case for environment variable configuration
- Updated test cases to use `mockito::Server::new_async()`
- Added request body validation in test mocks
- Improved test coverage for chat completion and streaming scenarios
- Updated `.cursorrules` with detailed testing guidelines
- Added dev dependencies for Mockito and Tokio async testing
- Updated `.cursorrules` with async test requirement
- Expanded utils modules with new serde_helpers and tools
- Added new AI client module for LiteLLM
* feat: enhance deploy command with skip_dbt option
- Updated the Deploy command to accept a `skip_dbt` boolean argument, allowing users to bypass the dbt run during deployment.
- Refactored the deploy function to conditionally execute the dbt command based on the `skip_dbt` flag, improving deployment flexibility.
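A minimal sketch of how the `skip_dbt` flag could gate the dbt step. `DeployStep` and `plan_deploy_steps` are hypothetical names used only for illustration, and the `UploadModels` step is an assumption about what follows the dbt run.

```rust
/// Hypothetical deployment steps; the actual CLI may be structured differently.
#[derive(Debug, PartialEq)]
enum DeployStep {
    RunDbt,
    UploadModels,
}

/// Builds the ordered list of steps, omitting the dbt run when `skip_dbt` is set.
fn plan_deploy_steps(skip_dbt: bool) -> Vec<DeployStep> {
    let mut steps = Vec::new();
    if !skip_dbt {
        steps.push(DeployStep::RunDbt);
    }
    // The upload happens regardless of whether dbt ran.
    steps.push(DeployStep::UploadModels);
    steps
}
```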
* Refactor query engine and CLI commands for improved functionality and error handling
- Updated `get_bigquery_columns` and `get_snowflake_columns` functions to enhance column name handling and ensure proper error reporting.
- Modified `get_snowflake_client` to accept a database ID for better connection management.
- Enhanced the `deploy` command in the CLI to include additional parameters (`path`, `data_source_name`, `schema`, `env`) for more flexible deployments.
- Improved error handling and reporting in the `deploy` function, including detailed summaries of deployment errors and successful file processing.
- Updated `get_model_files` to accept a directory path and added checks for file existence, enhancing robustness.
- Adjusted model file structures to include schema information and refined the upload process to handle optional parameters more effectively.
These changes collectively improve the usability and reliability of the query engine and deployment process.
* Update dataset DDL generation to include optional YML file content
- Modified `generate_sql_agent` to append optional YML file content to dataset DDL
- Ensures more comprehensive dataset representation during SQL agent generation
- Handles cases where YML file might be present or absent gracefully
- Added `html-escape` crate to `Cargo.toml` for HTML escaping.
- Updated email template processing to escape HTML in message and button text, preventing potential XSS vulnerabilities.
- Modified test cases to include HTML content in email parameters, ensuring proper handling and escaping.
This change improves security by sanitizing user input in email communications.
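The escaping step can be illustrated with the sketch below. It uses a small hand-written `escape_html` as a stand-in for the `html-escape` crate so the example has no dependencies. The `render_email` function and the `{{message}}` / `{{button_text}}` placeholders are assumptions, not the template format used by this change.

```rust
/// Stand-in for the `html-escape` crate: escapes the characters that are
/// significant in HTML text and attribute values.
fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            _ => out.push(ch),
        }
    }
    out
}

/// Hypothetical renderer: escapes user-supplied values before substituting
/// them into the template. The placeholder syntax is an assumption.
fn render_email(template: &str, message: &str, button_text: &str) -> String {
    template
        .replace("{{message}}", &escape_html(message))
        .replace("{{button_text}}", &escape_html(button_text))
}
```

With this in place, markup supplied in the message or button text is rendered as literal text rather than interpreted by the mail client.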
* fix: permission check on the `post_dataset` REST route
* refactor: enhance dataset overview access lineage and permission checks
- Updated the `get_dataset_overview` function to conditionally add default access lineage based on user roles and existing access paths.
- Simplified the logic for adding user roles to the lineage, ensuring clarity and maintainability.
- Improved handling for the `RestrictedQuerier` role to include checks for existing access before adding default lineage, enhancing permission accuracy.
- Streamlined code by removing redundant checks and consolidating role handling, optimizing overall readability.
* feat: Enhance permission group handling and data retrieval
- Introduced a new `PermissionGroupInfo` struct to encapsulate detailed information about permission groups, including user and dataset counts.
- Updated the `get_permission_group` and `list_permission_groups` functions to improve data retrieval and error handling.
- Refactored SQL queries in `list_permission_groups` to include additional joins for counting users and datasets associated with permission groups, enhancing the overall functionality and clarity of the API.
- Streamlined code for better readability and maintainability, ensuring consistent handling of user and permission group data.
* refactor: Improve dataset access handling and permission checks
- Enhanced the `get_restricted_user_datasets` and `get_restricted_user_datasets_with_metadata` functions to include additional permission checks for dataset groups and permission groups.
- Consolidated SQL queries to ensure proper filtering of deleted records and improved clarity in dataset retrieval logic.
- Introduced new joins and filters to handle dataset group permissions, ensuring accurate access control for users.
- Streamlined the dataset access code for better readability and maintainability.
* fix: Update SQL migration and seed data for user attributes
- Modified the SQL migration to specify the schema for the `users` table, ensuring clarity in the update statement.
- Adjusted the seed data for `users_to_organizations` to change the `organization_id` from 'public' to 'none', reflecting a more accurate state for user roles and organization associations.
- Ensured consistency in the formatting of SQL insert statements for better readability.
* fix: Prevent users from updating their own profiles
- Added a check in the `update_user_handler` to prevent users from updating their own information, returning an error if they attempt to do so.
- This change enhances security by ensuring that users cannot make unauthorized changes to their own records.
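A sketch of the guard described above, assuming the handler can compare the authenticated user's ID with the target ID. `UpdateUserError` and `ensure_not_self_update` are hypothetical names, and plain `u64` IDs stand in for whatever ID type the API uses.

```rust
/// Hypothetical error type for the update handler.
#[derive(Debug, PartialEq)]
enum UpdateUserError {
    CannotUpdateSelf,
}

/// Rejects the request when the authenticated user targets their own record.
fn ensure_not_self_update(requester_id: u64, target_id: u64) -> Result<(), UpdateUserError> {
    if requester_id == target_id {
        return Err(UpdateUserError::CannotUpdateSelf);
    }
    Ok(())
}
```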
* refactor: Simplify dashboard permission queries by removing team-based joins
- Removed left joins with `teams_to_users` table in dashboard permission queries
- Simplified permission checks to only filter by direct user ID
- Updated queries in `get_user_dashboard_permission`, `get_bulk_user_dashboard_permission`, and `list_dashboards_handler`
- Streamlined SQL query logic for more direct and efficient permission checks
- Updated the `get_snowflake_client` function to no longer require `warehouse_id` and `database_id`, simplifying the connection process.
- This change enhances flexibility in client initialization and aligns with recent updates to Snowflake API handling.
- Downgraded the `base64` crate version in `Cargo.toml` from `0.22.1` to `0.21`.
- Refactored the `snowflake_query` function in `snowflake_query.rs` to improve data type handling, including support for additional Arrow data types and enhanced null value checks.
- Updated the `route_to_query` function in `query_router.rs` to use mutable `snowflake_client` for better state management during query execution.
- Improved error handling for closing the Snowflake client session, ensuring proper logging of any issues encountered.
- Streamlined the `get_dataset_overview` function to improve access control for the `RestrictedQuerier` role, ensuring more precise permission checks.
- Updated the `get_user_information` function to optimize dataset processing, categorizing datasets based on direct access and permission group access.
- Removed redundant code and improved readability by consolidating logic for user roles, enhancing maintainability.
- Enhanced lineage tracking for datasets, providing a clearer representation of user permissions across different access types.
- Enhanced the `get_dataset_overview` function to refine access control for the `RestrictedQuerier` role, allowing for more granular permission checks based on various access paths.
- Updated the `get_user_information` function to streamline dataset processing, ensuring that datasets are categorized correctly based on direct access and permission group access.
- Removed redundant code and improved readability by consolidating logic for user roles, enhancing maintainability of both functions.
- Improved lineage tracking for datasets, providing a clearer representation of user permissions across different access types.
- Introduced a default access lineage for users, ensuring consistent representation of user permissions.
- Simplified the addition of user roles to the lineage, consolidating logic for `WorkspaceAdmin`, `DataAdmin`, `Querier`, and `Viewer` roles.
- Enhanced lineage tracking for the `RestrictedQuerier` role to include direct dataset access and permission group lineage, improving granularity of dataset permissions.
- Removed redundant code related to dataset and permission group lineage, optimizing readability and maintainability of the `get_dataset_overview` function.
- Simplified access control logic for datasets based on user roles, consolidating conditions for `WorkspaceAdmin`, `DataAdmin`, `Querier`, `Viewer`, and `RestrictedQuerier`.
- Enhanced dataset lineage tracking to provide clearer representation of user permissions across various dataset access types.
- Removed redundant code related to dataset processing, improving readability and maintainability of the `get_user_information` function.
- Ensured that datasets are correctly categorized based on direct access, permission group access, and organization datasets, optimizing the overall data retrieval process.
- Updated `get_dataset_overview` to include dataset group access and permission group to dataset group access, improving the granularity of dataset permissions.
- Introduced new queries to fetch dataset groups and their associated permissions, enhancing the dataset overview for users.
- Refactored `get_user_information` to concurrently retrieve dataset groups and permission group datasets, optimizing performance with `tokio::spawn`.
- Enhanced lineage tracking for datasets, allowing for better representation of user permissions across dataset groups and permission groups.
- Improved error handling during database queries to ensure robust data retrieval.
- Added new structs `DatasetLineage` and `DatasetInfo` to represent dataset details and lineage.
- Updated `UserResponse` to include a list of datasets associated with the user.
- Refactored `get_user_information` function to concurrently fetch user info, direct datasets, permission group datasets, and organization datasets using `tokio::spawn` for improved performance.
- Implemented logic to compile datasets based on direct access and permission group access, including lineage tracking for better data representation.
- Enhanced error handling during database queries to ensure robust user information retrieval.
- Introduced a new `DatasetToDatasetGroup` struct to represent the relationship between datasets and dataset groups, including fields for timestamps and optional deletion.
- Updated the database schema to include `updated_at` and `deleted_at` fields for the `datasets_to_dataset_groups` table, enhancing data tracking capabilities.
- Refactored the routing in `mod.rs` to include a nested router for assets, improving the organization of dataset group routes.
- Introduced new PUT routes for managing users and dataset groups in the assets module.
- Updated the router to support PUT requests for `/users`, `/dataset_groups`, and `/datasets`, enhancing the API's functionality for resource updates.
- Improved modularity by organizing related routes within the assets module.
- Changed the parameter in the SQL query from `user.id` to `user_id` for consistency with the updated user ID parameter naming convention.
- Enhanced the SQL query to count distinct dataset permissions and utilize `bool_or` for identity checks, improving accuracy and performance.
- Cleaned up the grouping in the SQL query by removing unnecessary fields, streamlining the data retrieval process.
- Added a new `assets` module to organize related routes.
- Updated the routing in `mod.rs` to nest the `assets` router under the `/:permission_group_id` path, enhancing the structure and clarity of the API.
- Maintained existing routes for managing permission groups while improving modularity.
- Introduced a new `organization_id` field in the `DatasetGroupPermission` struct to associate permissions with specific organizations.
- Updated the `put_dataset_groups_handler` to include `organization_id` when creating or updating dataset group permissions, enhancing the API's capability to manage permissions at the organizational level.
- Improved SQL query formatting for better readability in the handler.
- Introduced a new `TeamInfoRole` enum to represent user roles within teams, replacing the previous boolean `assigned` field.
- Updated the `list_teams` handler to return team roles instead of assignment status, improving clarity on user roles.
- Refactored the `put_teams` handler to support role-based assignments, allowing for more granular control over team memberships.
- Added new PUT routes for dataset groups and permission groups in the user assets router, enhancing API capabilities.
- Improved SQL queries for team assignments to utilize role information, streamlining database interactions.
- Updated all user-related route handlers to use `user_id` instead of `id` for better clarity and consistency.
- Modified the routing definitions in `mod.rs` to reflect the new parameter naming convention.
- Enhanced the `list_permission_groups` function to accept `user_id` as a parameter, improving clarity in the handler's signature.
- Ensured all relevant functions now consistently handle the `user_id` parameter, streamlining the codebase and improving maintainability.
- Reformatted imports in `mod.rs` for better readability.
- Commented out the PUT route for `/teams` in the user assets router, indicating a potential future change or deprecation.
- Updated the `put_teams` handler to return a `NoContent` response upon successful execution, enhancing clarity in API responses.
- Improved error handling in the `put_teams` function for better logging and response management.
- Introduced a new module `put_teams` to handle updates for teams.
- Added a PUT route for `/teams` in the user assets router, allowing for team modifications.
- Enhanced the routing capabilities of the user assets API to support both GET and PUT requests for teams.
- Expanded the `allow_columns_to_appear_in_same_group_by_clause!` macro in `models.rs` to include additional columns for datasets and users, improving query flexibility.
- Refactored the `list_permission_groups` function to include dataset count and assigned status, enhancing the information returned for each permission group.
- Updated SQL queries in `list_permission_groups` to utilize left joins for better data retrieval and to ensure accurate permission checks.
- Removed redundant column allowances in various files, streamlining the codebase and improving maintainability.
- Added user authorization checks in `list_attributes`, `list_dataset_groups`, `list_datasets`, `list_permission_groups`, and `list_teams` functions to ensure only users with appropriate roles can access these resources.
- Refactored the `list_teams_handler` to accept `user_id` as a parameter, improving clarity and consistency across user-related functions.
- Updated SQL queries to utilize the new authorization checks, enhancing security and data integrity.
- Removed redundant column allowances in `list_teams` permissions, streamlining the codebase.
- Updated the user role attribute key from "role" to "organization_role" for accurate role retrieval.
- Introduced a read-only flag for specific user attributes, improving data integrity by clearly indicating which attributes should not be modified.
- Enhanced error handling for user role retrieval, ensuring robust responses for missing or incorrect attributes.
- Updated the `list_datasets` function to accept an additional `id` parameter for filtering datasets based on user permissions.
- Enhanced the SQL query to join with the `dataset_permissions` table, allowing retrieval of permission details for each dataset.
- Refactored the `DatasetInfo` struct to include an `assigned` field, improving clarity in the dataset representation.
- Improved error handling for dataset retrieval, ensuring robust logging and response management.
- Modified the `list_dataset_groups` function to accept an additional `id` parameter for filtering dataset groups based on user permissions.
- Updated the SQL query to join with the `dataset_groups_permissions` table, allowing retrieval of permission counts for each dataset group.
- Refactored the `DatasetGroupInfo` struct to replace `permission_id` with `permission_count`, enhancing clarity and accuracy in the data representation.
- Ensured that the query groups by the new permission structure, improving the functionality and security of dataset group listings.
- Introduced a new `DatasetGroupPermission` struct in `models.rs` to represent permissions associated with dataset groups.
- Updated the database schema in `schema.rs` to include the `dataset_groups_permissions` table, defining its structure and relationships.
- Modified the `is_user_workspace_admin_or_data_admin` function in `checks.rs` to correctly reference the user's organization role, enhancing role validation logic.
- Enhanced the `list_dataset_groups` function to join with the `dataset_permissions` table, allowing retrieval of permission details for each dataset group.
- Modified the `DatasetGroupInfo` struct to include `permission_id` and `assigned` fields, reflecting the new data structure.
- Refactored the SQL query to group by necessary fields and ensure accurate permission data is returned, improving the functionality and security of dataset group listings.
- Updated the `list_attributes_handler` to include authorization checks for user roles and organization IDs.
- Implemented error handling for unauthorized access to user attributes.
- Refactored the SQL query to retrieve user attributes based on the authenticated user's organization, improving security and data integrity.
- This change ensures that only authorized users can list attributes, enhancing the overall security of the API.
- Introduced a new `assets` module to handle asset-related routes.
- Updated the user router to nest the `assets` routes under the user ID path, enhancing the organization of API endpoints.
- This change improves the structure and maintainability of the user-related routes in the API.
- Refactored dataset listing logic to incorporate user organization roles, allowing for more granular access control based on user permissions.
- Introduced new role checks for `WorkspaceAdmin`, `DataAdmin`, `Querier`, `RestrictedQuerier`, and `Viewer` to determine dataset visibility.
- Updated database queries to fetch datasets based on user roles and organization associations, improving data retrieval efficiency.
- Removed deprecated functions and streamlined the dataset fetching process, ensuring clarity and maintainability in the codebase.
These changes improve the API's security and usability by enforcing role-based access control for dataset operations.
- Updated the user update route to require a user ID in the URL, ensuring the correct user is updated based on the provided ID.
- Improved clarity and functionality of the `update_user` function by extracting the user ID from the path.
These changes align the user update endpoint with standard REST conventions, enhancing overall API usability.
- Added permission validation to the `deploy_datasets` and `post_dataset` functions to ensure only users with workspace admin or data admin roles can execute these actions.
- Enhanced error handling for permission checks, returning appropriate HTTP status codes and messages for insufficient permissions and internal errors.
- Updated imports to include the new security checks module for consistency across routes.
These changes improve security by enforcing role-based access control in critical dataset operations.
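The permission gate might look roughly like the sketch below. The role names come from this changelog, but `check_deploy_permission`, its signature, and the specific 403/500 status codes are assumptions made for illustration.

```rust
/// Organization roles named elsewhere in this changelog.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum OrganizationRole {
    WorkspaceAdmin,
    DataAdmin,
    Querier,
    RestrictedQuerier,
    Viewer,
}

/// Hypothetical check: only workspace admins and data admins may deploy.
/// Returns an (HTTP status, message) pair on failure; the codes are assumed.
fn check_deploy_permission(
    role_lookup: Result<OrganizationRole, String>,
) -> Result<(), (u16, String)> {
    match role_lookup {
        Ok(OrganizationRole::WorkspaceAdmin) | Ok(OrganizationRole::DataAdmin) => Ok(()),
        // The role was found but does not grant deployment rights.
        Ok(_) => Err((403, "Insufficient permissions".to_string())),
        // The role lookup itself failed, e.g. because of a database error.
        Err(e) => Err((500, format!("Error checking permissions: {}", e))),
    }
}
```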
- Changed the user update route to require a user ID in the URL, enhancing RESTful practices.
- Updated the `update_user` function to extract the user ID from the path, ensuring the correct user is updated based on the provided ID.
These changes improve the clarity and functionality of the user update endpoint, aligning it with standard REST conventions.
- Enhanced the `update_user` endpoint to accept and process user role updates alongside name changes.
- Introduced a new `UserResponse` struct for improved response handling.
- Updated the `update_user_handler` to handle changes in both user name and organization role, improving the flexibility of user management.
- Adjusted response type to return no content upon successful updates, aligning with RESTful practices.
These changes enhance the user management capabilities by allowing for more comprehensive updates to user information.
- Removed the public modifier from `get_user` and `update_user` modules to encapsulate them within the module.
- Added a new route to the user router for fetching a user by their ID, enhancing the API's functionality.
- This change improves the user management capabilities by allowing retrieval of specific user details based on their unique identifier.
- Updated the `list_assets` function to include organization ID filtering in dataset permissions queries.
- Removed redundant organization ID filters from the dataset permissions queries to streamline the logic.
- Ensured that only relevant dataset assets are returned based on the user's organization, improving data security and relevance.
These changes enhance the API's ability to serve organization-specific data, aligning with recent improvements in dataset asset APIs.
- Introduced a new `is_simple` flag in the `deploy_datasets` function to differentiate between full and simple dataset deployments.
- Updated the `deploy_datasets_handler` to accept the `is_simple` parameter, allowing for conditional processing of inserted datasets.
- Modified the `DeployDatasetsRequest` struct to include an optional `id` and `type_` field, enhancing the request's flexibility.
- Adjusted the handling of the `yml_file` field to be optional in the `DeployDatasetsRequest` struct.
- Updated the `process_batch` function to handle "USER-DEFINED" data types in addition to existing types.
These changes improve the dataset deployment process by allowing for more granular control and flexibility in handling different dataset types.
- Added functionality to retrieve the user's organization ID in both `get_dataset_overview` and `list_assets` endpoints.
- Updated database queries to filter users and permissions based on the organization ID, ensuring that only relevant data is returned for the user's organization.
- Improved error handling for organization ID retrieval, logging errors appropriately.
These changes improve data security and relevance by ensuring that users only access assets associated with their organization.
- Added the user's email and name to the `UserOverviewItem` struct for improved clarity in user details.
- Updated the database query to select the user's name alongside their ID and email, ensuring comprehensive user information is retrieved.
- Refactored the mapping logic to accommodate the new name field, enhancing the dataset overview response.
These changes improve the dataset overview API by providing more detailed user information, facilitating better understanding of user access and roles.
- Modified the role adjustment logic in the `teams_to_users` table to treat 'admin' roles as 'manager'.
- Set the default role to 'member' for all other cases, improving clarity in role assignments.
These changes enhance the migration process for dataset groups and permissions management.
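The role adjustment is performed in the SQL migration. The same mapping, expressed in Rust purely for illustration (`migrate_team_role` is a hypothetical helper, not part of this change):

```rust
/// Mirrors the migration rule: 'admin' becomes 'manager', and every other
/// value falls back to the default 'member'.
fn migrate_team_role(old_role: &str) -> &'static str {
    match old_role {
        "admin" => "manager",
        _ => "member",
    }
}
```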
- Simplified certificate handling logic by allowing the `COPY` command to proceed without failure if `cert.pem` is missing.
- Updated the Dockerfile to ensure that the `update-ca-certificates` command is run unconditionally after copying the certificate.
- Maintained the existing build process for the `bi_api` application.
These changes improve the Docker image build process by making certificate handling more robust and less dependent on the environment.
- Updated the SQL migration to enforce a unique constraint on the combination of `database_name` and `data_source_id` in the datasets table, ensuring data integrity.
- Refactored the `deploy_datasets_handler` to separate datasets with and without IDs, allowing for concurrent upsert operations based on their presence.
- Enhanced the upsert logic to handle datasets more efficiently, improving performance during dataset deployment.
These changes improve the robustness and efficiency of the dataset deployment process within the API.
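The split into datasets with and without IDs can be sketched with `Iterator::partition`. The struct below is a simplified stand-in for `DeployDatasetsRequest`. Its field types are assumptions, and the idea that ID-less datasets are matched on the `(database_name, data_source_id)` unique constraint is an inference from the migration, not something the commit states.

```rust
/// Simplified, hypothetical stand-in for a deploy request.
#[derive(Debug, Clone, PartialEq)]
struct DatasetRequest {
    id: Option<u64>, // assumed type; the real ID type is not given here
    database_name: String,
    data_source_id: u64, // assumed type
}

/// Splits requests into (with ID, without ID) so that each group
/// can be upserted by its own strategy, possibly concurrently.
fn split_by_id(requests: Vec<DatasetRequest>) -> (Vec<DatasetRequest>, Vec<DatasetRequest>) {
    requests.into_iter().partition(|r| r.id.is_some())
}
```

The first group can be upserted by primary key. The second would need a natural key such as `(database_name, data_source_id)` to detect conflicts.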
- Introduced a new `/sql/run` endpoint for executing SQL queries against datasets and data sources.
- Created a dedicated `sql` module and a `run_sql` handler to manage SQL execution logic.
- Implemented access checks to ensure users have the necessary permissions to execute SQL queries.
- Enhanced data retrieval and metadata processing for SQL results, improving overall API functionality.
These changes expand the API's capabilities by allowing users to run custom SQL queries, facilitating more flexible data interactions.
- Added a new `post_dataset` module and corresponding route to handle dataset creation.
- Updated the router to include the new POST endpoint for datasets, improving API functionality.
- Maintained existing routes while ensuring modular organization of dataset-related logic.
These changes improve the API's capabilities for dataset management by providing a dedicated endpoint for dataset creation, enhancing overall usability.
- Replaced the existing `post_datasets` endpoint with a new `deploy_datasets` endpoint to better reflect its purpose.
- Deleted the `post_datasets` module and its associated logic, streamlining the codebase.
- Updated the request and response structures to use `DeployDatasetsRequest` and related types, enhancing clarity and maintainability.
- Adjusted the BusterClient to utilize the new endpoint for deploying datasets, ensuring consistency across the API.
These changes improve the API's functionality by providing a clearer and more focused approach to dataset deployment, facilitating better data management.
- Introduced a new route to the datasets API for fetching data samples associated with a specific dataset.
- Added the `get_dataset_data_sample` module to handle the logic for retrieving dataset data samples.
- Updated the router configuration to include the new endpoint, enhancing the API's functionality for dataset management.
These changes improve the API's capabilities by allowing users to access sample data for datasets, facilitating better data exploration and analysis.
- Simplified the GetDatasetResponse struct by removing unnecessary fields and renaming existing ones for clarity.
- Updated the dataset retrieval logic to focus on essential dataset attributes, enhancing performance and readability.
- Improved user role checks for dataset access, ensuring clearer error messages for permission issues.
- Removed unused imports and streamlined the code for better maintainability.
These changes enhance the API's efficiency in retrieving dataset information and improve the clarity of user permissions related to dataset access.
- Introduced a new optional `model` field in the Dataset struct to store model references.
- Updated the dataset routes to include a new endpoint for retrieving datasets by ID.
- Modified dataset creation logic to accommodate the new `model` field.
- Refactored dataset queries to utilize `datasets::all_columns` for improved readability and maintainability.
These changes enhance the dataset management capabilities by allowing the association of models with datasets, improving data organization and retrieval.
- Introduced a new optional `yml_file` field in the Dataset model to store YAML file references.
- Updated the database schema to include the `yml_file` column in the datasets table.
- Modified various API request and response structures to accommodate the new `yml_file` field.
- Enhanced dataset handling functions to support the inclusion of `yml_file` in dataset operations.
These changes improve the dataset management capabilities by allowing the association of YAML files with datasets, facilitating better data organization and retrieval.
- Added queries to retrieve datasets and permission groups associated with users for improved access tracking in the dataset overview response.
- Implemented logic to include direct dataset access and permission group lineage in the user overview, enhancing clarity on user permissions.
- Improved error handling for database interactions related to dataset and permission group queries.
- Added debug print statements for datasets and permission groups queries to facilitate troubleshooting.
These changes improve the API's ability to manage and report user permissions effectively, providing a clearer overview of user access to datasets and permission groups.
- Updated the UserPermissionLineage and UserOverviewItem structs to provide a clearer representation of user permissions and lineage in the dataset overview response.
- Simplified the get_dataset_overview function by removing redundant queries and consolidating user access checks.
- Improved error handling for database interactions related to user permissions.
- Streamlined the overall structure of the dataset overview response to focus on user details and their access capabilities.
These changes enhance the API's clarity and efficiency in managing and reporting user permissions.
- Introduced UserPermissionLineage struct to provide detailed user access information in the dataset overview response.
- Updated get_dataset_overview function to include comprehensive checks for user permissions, dataset group access, and direct access.
- Improved error handling for database queries related to user permissions and access checks.
- Added TODO comments in list_dataset_assets and put_dataset_assets routes to address future dataset group integration.
These changes enhance the API's capability to manage and report on user permissions effectively.
- Introduced a new DatasetPermission model in the database schema to manage dataset access permissions, including fields for organization_id, dataset_id, and permission_type.
- Updated the API routes to nest asset-related routes under dataset_id, enhancing the organization of dataset-related functionalities.
These changes improve the structure for managing dataset permissions and streamline the API for asset handling.
- Created dataset_groups and dataset_permissions tables in the database schema, including organization_id as a foreign key with ON DELETE CASCADE.
- Added corresponding indexes for organization_id in both tables to optimize query performance.
- Updated the Rust models and schema to reflect the new tables and their relationships.
- Integrated dataset_groups into the API routes for improved data organization and management.
These changes enhance the database structure and facilitate better handling of dataset-related permissions and groupings.
- Changed the role of a user in the teams_to_users table from 'admin' to 'manager' for better role clarity.
- Refactored the users_to_organizations table to include a new 'status' field and updated multiple user roles to align with the new role structure (workspace_admin, querier, data_admin).
- Added a new permission_groups module to the API routes for improved permission management.
- Updated the security module to include a new checks module for enhanced security handling.
- Integrated dotenv in the Next.js configuration to manage environment variables more effectively.
These changes improve the clarity and functionality of user roles and permissions within the application.
- Introduced a new UserOrganizationStatus enum to manage user organization statuses (Active, Inactive, Pending, Guest) in the database schema.
- Updated the UserToOrganization model to include a status field.
- Refactored role checks across various routes to replace the previous UserOrganizationRole values (Owner, Admin) with new roles (WorkspaceAdmin, DataAdmin) for better role management.
- Enhanced data source handling in multiple routes to align with the updated role structure.
These changes improve the clarity and functionality of user organization management within the application.
- Updated the UserOrganizationRole enum to include new roles: WorkspaceAdmin, DataAdmin, Querier, RestrictedQuerier, and Viewer, replacing the previous roles of Owner, Member, and Admin.
- Modified the TeamToUserRole enum to change the Owner role to Manager.
- Added new database tables for dataset_groups, dataset_permissions, datasets_to_dataset_groups, and permission_groups_to_users to support enhanced data management.
- Introduced UserOrganizationStatusEnum to the schema for better organization status tracking.
These changes improve role management and expand the database schema for better data organization and permissions handling.
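A minimal sketch of the new role set as a plain Rust enum follows. The variant names come from the commit. The snake_case strings for `restricted_querier` and `viewer` are assumed by analogy with `workspace_admin`, `data_admin`, and `querier`. The real enum is presumably wired to the database schema through derives not shown here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UserOrganizationRole {
    WorkspaceAdmin,
    DataAdmin,
    Querier,
    RestrictedQuerier,
    Viewer,
}

impl UserOrganizationRole {
    /// Assumed database representation of each role.
    fn as_str(self) -> &'static str {
        match self {
            Self::WorkspaceAdmin => "workspace_admin",
            Self::DataAdmin => "data_admin",
            Self::Querier => "querier",
            Self::RestrictedQuerier => "restricted_querier",
            Self::Viewer => "viewer",
        }
    }

    /// Returns `None` for the removed roles (owner, member, admin)
    /// and for any other unknown value.
    fn parse(s: &str) -> Option<Self> {
        match s {
            "workspace_admin" => Some(Self::WorkspaceAdmin),
            "data_admin" => Some(Self::DataAdmin),
            "querier" => Some(Self::Querier),
            "restricted_querier" => Some(Self::RestrictedQuerier),
            "viewer" => Some(Self::Viewer),
            _ => None,
        }
    }
}
```

Rejecting the old role names in `parse` means any row that was not migrated surfaces as an explicit `None` instead of silently mapping to a new role.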
- Deleted the `package-lock.json` file from the root directory.
- Modified the `dev` target in the API Makefile to start Redis using Docker Compose from the parent directory, improving service orchestration.
- Updated the `next.config.mjs` to load environment variables from the parent directory during development.
- Added `dotenv` as a dependency in both `package.json` and `package-lock.json` to manage environment variables effectively.
These changes streamline the development setup and enhance the management of environment variables.
- Simplified the API service build configuration in `docker-compose.yml` by consolidating the build context and Dockerfile path.
- Added `diesel_migrations` dependency to `Cargo.toml` for database migration management.
- Implemented database migration logic in `main.rs`, including error handling and logging for migration success or failure.
- Introduced a new mail service in `supabase/docker-compose.yml` for handling SMTP, POP3, and web interface.
- Removed version specification from `supabase/dev/docker-compose.dev.yml` for cleaner configuration.
These changes improve the overall structure and functionality of the application, facilitating better database management and service orchestration.
- Consolidated Redis service into the main `docker-compose.yml`, removing the separate API Docker Compose file.
- Added health checks for Redis and API services to ensure proper service readiness.
- Updated API router to include a public health check endpoint.
- Cleaned up the web Dockerfile by removing unnecessary environment variable copying.
These changes enhance service orchestration and improve the reliability of the application during development.
- Removed version specification from `docker-compose.yml` for simplicity.
- Eliminated the `env_file` directive in the `web` service to streamline environment variable management.
- Updated the `Dockerfile` for the API to conditionally copy SSL certificates based on the environment, enhancing flexibility for local and production setups.
These changes aim to simplify the configuration and improve the development workflow.
- Expanded `.env.example` with additional environment variables for local development, including AWS credentials, database connection strings, and API keys.
- Removed the `api/.env.example` file as its contents have been consolidated into the main `.env.example`.
These changes enhance the local development setup by providing a comprehensive example of required environment variables.
- Updated the logic to determine final_permission by checking if permission is Some, simplifying the condition.
- Improved error handling for non-public threads when no permissions are provided.
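One way the `final_permission` logic above might look is sketched here. The `Permission` enum, the `publicly_accessible` flag, the fallback to `Viewer` for public threads, and the error text are all assumptions made for illustration. The commit only says that the check is based on `permission` being `Some` and that non-public threads without permissions are handled as an error.

```rust
/// Hypothetical permission levels; the real type is not shown in this commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Permission {
    Viewer,
    Editor,
    Owner,
}

/// Resolves the effective permission for a thread.
fn final_permission(
    permission: Option<Permission>,
    publicly_accessible: bool,
) -> Result<Permission, String> {
    match permission {
        // An explicit permission always wins.
        Some(p) => Ok(p),
        // Assumed fallback: public threads are readable without a grant.
        None if publicly_accessible => Ok(Permission::Viewer),
        // Non-public thread and no permission: report an error instead of defaulting.
        None => Err("user has no permission on this non-public thread".to_string()),
    }
}
```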
- Removed unused imports for `serde_json` and `tokio::task::JoinSet` in the `post_datasets.rs` file.
This cleanup improves code readability and removes imports that were no longer needed.
- Added steps to configure AWS credentials and download the Postgres SSL certificate from S3 in the GitHub Actions workflow.
- Updated the Dockerfile to reflect the new path for the SSL certificate, ensuring it is correctly copied to the container.
These changes improve the security and organization of SSL certificate handling during the deployment process.
- Changed the directory structure for SSL certificates in the GitHub Actions workflow, creating a new path `certs/cert` for better organization.
- Updated the Dockerfile to reference the new certificate path, ensuring the SSL certificate is correctly installed in the container.
These changes improve the clarity and maintainability of the deployment process.
- Added steps to create a directory for SSL certificates and copy the downloaded certificate to the new location in the GitHub Actions workflow.
- Updated the Dockerfile to reference the new certificate path, ensuring proper installation of the SSL certificate in the container.
These changes improve the organization of SSL certificates and enhance the deployment process.