buster/packages/data-source
dal f469b2a152
Update PostgreSQL adapter SSL configuration to allow self-signed certificates
- Modified the SSL configuration in both the PostgreSQL adapter and its tests to use { rejectUnauthorized: false } instead of a boolean true value.
- Ensured consistency in handling SSL settings across the adapter and its tests.
2025-08-25 00:26:29 -06:00
..
scripts Use tsx and .ts files for validation 2025-07-21 16:07:14 -06:00
src Update PostgreSQL adapter SSL configuration to allow self-signed certificates 2025-08-25 00:26:29 -06:00
.env.example Mastra braintrust (#391) 2025-07-02 14:33:40 -07:00
.gitignore Mastra braintrust (#391) 2025-07-02 14:33:40 -07:00
CLAUDE.md Mastra braintrust (#391) 2025-07-02 14:33:40 -07:00
README.md Mastra braintrust (#391) 2025-07-02 14:33:40 -07:00
biome.json Mastra braintrust (#391) 2025-07-02 14:33:40 -07:00
env.d.ts Mastra braintrust (#391) 2025-07-02 14:33:40 -07:00
package.json Update AWS SDK dependencies and enhance metric data handling 2025-08-22 12:45:05 -06:00
tsconfig.json fix all of the bugs 2025-07-12 22:14:08 -06:00
turbo.json Mastra braintrust (#391) 2025-07-02 14:33:40 -07:00
vitest.config.ts Mastra braintrust (#391) 2025-07-02 14:33:40 -07:00

README.md

@buster/data-source

A TypeScript library for connecting to, querying, and introspecting multiple data source types including Snowflake, BigQuery, PostgreSQL, MySQL, SQL Server, Redshift, and Databricks.

Features

  • Multi-Database Support: Connect to 7+ different database types with a unified interface
  • Query Routing: Route queries to specific data sources or use intelligent defaults
  • Database Introspection: Discover database structure, tables, columns, and statistics
  • Type Safety: Full TypeScript support with comprehensive type definitions
  • Connection Management: Automatic connection pooling and lifecycle management
  • Error Handling: Graceful error handling with detailed error information

Supported Data Sources

  • Snowflake - Full introspection support with clustering information
  • PostgreSQL - Full introspection support
  • MySQL - Full introspection support
  • BigQuery - Basic support (introspection placeholder)
  • SQL Server - Basic support (introspection placeholder)
  • Redshift - Basic support (introspection placeholder)
  • Databricks - Basic support (introspection placeholder)

Installation

npm install @buster/data-source

Quick Start

Basic Usage

import { DataSource, DataSourceType } from '@buster/data-source';

// Configure your data sources
const dataSource = new DataSource({
  dataSources: [
    {
      name: 'snowflake-prod',
      type: DataSourceType.Snowflake,
      credentials: {
        type: DataSourceType.Snowflake,
        account_id: 'your-account',
        username: 'your-username',
        password: 'your-password',
        warehouse_id: 'your-warehouse',
        default_database: 'your-database',
      },
    },
    {
      name: 'postgres-dev',
      type: DataSourceType.PostgreSQL,
      credentials: {
        type: DataSourceType.PostgreSQL,
        host: 'localhost',
        port: 5432,
        database: 'dev_db',
        username: 'dev_user',
        password: 'dev_password',
      },
    },
  ],
  defaultDataSource: 'snowflake-prod',
});

// Execute queries
const result = await dataSource.execute({
  sql: 'SELECT * FROM users LIMIT 10',
  warehouse: 'snowflake-prod', // Optional: specify data source
});

console.log(result.rows);

Database Introspection

// Get all databases
const databases = await dataSource.getDatabases('snowflake-prod');
console.log('Databases:', databases.map(db => db.name));

// Get schemas in a database
const schemas = await dataSource.getSchemas('snowflake-prod', 'ANALYTICS_DB');
console.log('Schemas:', schemas.map(s => s.name));

// Get tables in a schema
const tables = await dataSource.getTables('snowflake-prod', 'ANALYTICS_DB', 'PUBLIC');
console.log('Tables:', tables.map(t => ({ name: t.name, type: t.type, rows: t.rowCount })));

// Get columns in a table
const columns = await dataSource.getColumns('snowflake-prod', 'ANALYTICS_DB', 'PUBLIC', 'USERS');
console.log('Columns:', columns.map(c => ({ name: c.name, type: c.dataType, nullable: c.isNullable })));

// Get table statistics (Snowflake)
const stats = await dataSource.getTableStatistics('ANALYTICS_DB', 'PUBLIC', 'USERS', 'snowflake-prod');
console.log('Table stats:', {
  rowCount: stats.rowCount,
  sizeBytes: stats.sizeBytes,
  columnStats: stats.columnStatistics.length,
});

// Get comprehensive introspection
const fullIntrospection = await dataSource.getFullIntrospection('snowflake-prod');
console.log('Full catalog:', {
  databases: fullIntrospection.databases.length,
  schemas: fullIntrospection.schemas.length,
  tables: fullIntrospection.tables.length,
  columns: fullIntrospection.columns.length,
});

Advanced Usage

// Direct introspector access
const introspector = await dataSource.introspect('snowflake-prod');
const databases = await introspector.getDatabases();

// Add data sources dynamically
await dataSource.addDataSource({
  name: 'mysql-analytics',
  type: DataSourceType.MySQL,
  credentials: {
    type: DataSourceType.MySQL,
    host: 'mysql.example.com',
    database: 'analytics',
    username: 'analyst',
    password: 'secret',
  },
});

// Test connections
const connectionStatus = await dataSource.testAllDataSources();
console.log('Connection status:', connectionStatus);

// Clean up
await dataSource.close();

Configuration

Data Source Configuration

interface DataSourceConfig {
  name: string;                    // Unique identifier
  type: DataSourceType;           // Database type
  credentials: Credentials;       // Type-specific credentials
  config?: Record<string, unknown>; // Additional options
}

Snowflake Credentials

interface SnowflakeCredentials {
  type: DataSourceType.Snowflake;
  account_id: string;             // Account identifier
  warehouse_id: string;           // Warehouse for compute
  username: string;
  password: string;
  role?: string;                  // Optional role
  default_database: string;
  default_schema?: string;
}

PostgreSQL Credentials

interface PostgreSQLCredentials {
  type: DataSourceType.PostgreSQL;
  host: string;
  port?: number;                  // Default: 5432
  database: string;
  username: string;
  password: string;
  schema?: string;                // Default schema
  ssl?: boolean | SSLConfig;      // SSL configuration
  connection_timeout?: number;    // Connection timeout in ms
}

Introspection Types

Database Structure

interface Database {
  name: string;
  owner?: string;
  comment?: string;
  created?: Date;
  lastModified?: Date;
  metadata?: Record<string, unknown>;
}

interface Schema {
  name: string;
  database: string;
  owner?: string;
  comment?: string;
  created?: Date;
  lastModified?: Date;
}

interface Table {
  name: string;
  schema: string;
  database: string;
  type: 'TABLE' | 'VIEW' | 'MATERIALIZED_VIEW' | 'EXTERNAL_TABLE' | 'TEMPORARY_TABLE';
  rowCount?: number;
  sizeBytes?: number;
  comment?: string;
  created?: Date;
  lastModified?: Date;
  clusteringKeys?: string[];      // Snowflake clustering keys
}

interface Column {
  name: string;
  table: string;
  schema: string;
  database: string;
  position: number;
  dataType: string;
  isNullable: boolean;
  defaultValue?: string;
  maxLength?: number;
  precision?: number;
  scale?: number;
  comment?: string;
  isPrimaryKey?: boolean;
  isForeignKey?: boolean;
}

Statistics

interface TableStatistics {
  table: string;
  schema: string;
  database: string;
  rowCount?: number;
  sizeBytes?: number;
  columnStatistics: ColumnStatistics[];
  clusteringInfo?: ClusteringInfo;  // Snowflake-specific
  lastUpdated?: Date;
}

interface ColumnStatistics {
  columnName: string;
  distinctCount?: number;
  nullCount?: number;
  minValue?: unknown;
  maxValue?: unknown;
  avgValue?: number;
  topValues?: Array<{ value: unknown; frequency: number }>;
}

Error Handling

// Query results include success status and error details
const result = await dataSource.execute({
  sql: 'SELECT * FROM non_existent_table',
});

if (!result.success) {
  console.error('Query failed:', result.error?.message);
  console.error('Error code:', result.error?.code);
}

Backward Compatibility

The package maintains backward compatibility with the previous QueryRouter class:

import { QueryRouter } from '@buster/data-source';

// This still works
const router = new QueryRouter({ dataSources: [...] });

Examples

See the examples directory for comprehensive usage examples:

Development

Running Tests

# Run all tests
npm test

# Run specific test suite
npm test -- tests/integration/adapters/snowflake.test.ts

# Type checking
npm run typecheck

Environment Variables for Testing

# PostgreSQL
TEST_POSTGRES_HOST=localhost
TEST_POSTGRES_PORT=5432
TEST_POSTGRES_DATABASE=test_db
TEST_POSTGRES_USERNAME=test_user
TEST_POSTGRES_PASSWORD=test_password

# Snowflake
TEST_SNOWFLAKE_ACCOUNT_ID=your_account
TEST_SNOWFLAKE_USERNAME=your_username
TEST_SNOWFLAKE_PASSWORD=your_password
TEST_SNOWFLAKE_WAREHOUSE_ID=your_warehouse
TEST_SNOWFLAKE_DATABASE=your_database

# MySQL
TEST_MYSQL_HOST=localhost
TEST_MYSQL_PORT=3306
TEST_MYSQL_DATABASE=test_db
TEST_MYSQL_USERNAME=test_user
TEST_MYSQL_PASSWORD=test_password

# BigQuery
TEST_BIGQUERY_PROJECT_ID=your_project
TEST_BIGQUERY_SERVICE_ACCOUNT_KEY=path/to/key.json

Architecture

@buster/data-source
├── src/
│   ├── adapters/           # Database-specific adapters
│   │   ├── base.ts        # Base adapter interface
│   │   ├── snowflake.ts   # Snowflake implementation
│   │   ├── postgresql.ts  # PostgreSQL implementation
│   │   └── ...
│   ├── introspection/     # Database introspection
│   │   ├── base.ts        # Base introspector interface
│   │   ├── snowflake.ts   # Snowflake introspection
│   │   └── ...
│   ├── types/             # Type definitions
│   │   ├── credentials.ts # Credential interfaces
│   │   ├── query.ts       # Query types
│   │   └── introspection.ts # Introspection types
│   ├── data-source.ts     # Main DataSource class
│   └── index.ts           # Public API exports
├── tests/                 # Test suites
└── examples/              # Usage examples

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

License

MIT License - see LICENSE for details.

Scoped Full Introspection

You can now scope full introspection to specific databases, schemas, or tables:

// Get introspection for specific databases
const result = await dataSource.getFullIntrospection('myDataSource', {
  databases: ['sales_db', 'analytics_db']
});

// Get introspection for specific schemas
const result = await dataSource.getFullIntrospection('myDataSource', {
  schemas: ['public', 'reporting']
});

// Get introspection for specific tables
const result = await dataSource.getFullIntrospection('myDataSource', {
  tables: ['customers', 'orders', 'products']
});

// Combine filters - get specific tables from specific schemas
const result = await dataSource.getFullIntrospection('myDataSource', {
  schemas: ['public'],
  tables: ['customers', 'orders']
});

The scoping works hierarchically:

  • If databases is specified, only schemas, tables, columns, and views from those databases are included
  • If schemas is specified, only tables, columns, and views from those schemas are included
  • If tables is specified, only those specific tables and their columns are included
  • Filters can be combined for more precise scoping

This is particularly useful for large data sources where you only need to introspect a subset of the available objects.