mirror of https://github.com/buster-so/buster.git
|
||
---|---|---|
.. | ||
scripts | ||
src | ||
.env.example | ||
.gitignore | ||
CLAUDE.md | ||
README.md | ||
biome.json | ||
env.d.ts | ||
package.json | ||
tsconfig.json | ||
turbo.json | ||
vitest.config.ts |
README.md
Data Source Package
Secure, isolated connections to customer data sources. This package handles all external database connections with a security-first approach.
Installation
pnpm add @buster/data-source
Overview
@buster/data-source
provides:
- Secure connections to customer databases (PostgreSQL, MySQL, BigQuery, Snowflake, etc.)
- Data source introspection and schema discovery
- Secure query execution with timeouts and limits
- Connection pooling and management
- Query result transformation
Security Principles
🔒 SECURITY IS PARAMOUNT 🔒
This package handles sensitive customer data and MUST:
- Never log credentials or sensitive data
- Always use encrypted connections
- Implement query timeouts and resource limits
- Validate and sanitize all inputs
- Use read-only connections where possible
- Implement proper connection pooling
- Handle credentials securely (never in code)
Architecture
Apps → @buster/data-source → Customer Databases
↓
Adapters
(DB-specific logic)
Supported Data Sources
- PostgreSQL - Full introspection and query support
- MySQL - Full introspection and query support
- Snowflake - Full support with clustering information
- BigQuery - Google Cloud data warehouse
- Redshift - AWS data warehouse
- SQL Server - Microsoft SQL Server
- Databricks - Unified analytics platform
Usage
Creating a Connection
import { createConnection } from '@buster/data-source';
const connection = await createConnection({
type: 'postgresql',
host: 'localhost',
port: 5432,
database: 'mydb',
username: 'user',
password: encryptedPassword, // Always encrypted
ssl: true,
connectionTimeout: 30000,
queryTimeout: 60000,
maxConnections: 10
});
Executing Queries
import { executeQuery } from '@buster/data-source';
const result = await executeQuery({
dataSourceId: 'source-123',
query: 'SELECT * FROM users',
maxRows: 1000,
timeout: 60000
});
// Result is automatically limited and sanitized
console.info(`Retrieved ${result.rowCount} rows`);
Database Introspection
import { introspectDatabase } from '@buster/data-source';
const schema = await introspectDatabase('source-123');
// Get table and column information
schema.tables.forEach(table => {
console.info(`Table: ${table.name}`);
table.columns.forEach(column => {
console.info(` - ${column.name}: ${column.type}`);
});
});
Adapter Pattern
Each data source type has its own adapter:
export interface DataSourceAdapter {
connect(config: unknown): Promise<void>;
disconnect(): Promise<void>;
executeQuery(query: string, params?: unknown[]): Promise<QueryResult>;
introspect(): Promise<IntrospectionResult>;
testConnection(): Promise<boolean>;
}
PostgreSQL Adapter Example
import { PostgreSQLAdapter } from '@buster/data-source';
const adapter = new PostgreSQLAdapter();
await adapter.connect({
host: 'localhost',
port: 5432,
database: 'mydb',
username: 'user',
password: encryptedPassword,
ssl: true
});
const result = await adapter.executeQuery('SELECT NOW()');
Security Features
Connection Security
// All connections use SSL by default
const connection = await createConnection({
type: 'postgresql',
ssl: true, // Default
// SSL options
ssl: {
rejectUnauthorized: true,
ca: certificateAuthority,
cert: clientCertificate,
key: clientKey
}
});
Query Limits
// Automatic row limiting
const result = await executeQuery({
query: 'SELECT * FROM large_table',
maxRows: 1000 // Enforced limit
});
// Query timeout
const result = await executeQuery({
query: 'SELECT * FROM slow_query',
timeout: 30000 // 30 second timeout
});
Read-Only Connections
// Use read-only connections for safety
const connection = await createConnection({
type: 'postgresql',
readOnly: true, // Sets transaction to read-only
options: '-c default_transaction_read_only=on'
});
Error Handling
import { DataSourceError } from '@buster/data-source';
try {
await executeQuery({
dataSourceId: 'source-123',
query: 'SELECT * FROM users'
});
} catch (error) {
if (error instanceof DataSourceError) {
// Handle known errors
console.error(`Query failed: ${error.message}`);
// error.code contains error code
// No sensitive information exposed
} else {
// Unknown error
console.error('Unexpected error occurred');
}
}
Connection Pooling
// Connections are automatically pooled
const pool = await createConnectionPool({
type: 'postgresql',
min: 2,
max: 10,
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 2000
});
// Use connection from pool
const result = await pool.query('SELECT * FROM users');
Testing
Unit Tests
describe('PostgreSQLAdapter', () => {
it('should validate connection config', () => {
const invalidConfig = {
host: 'localhost',
port: 'not-a-number' // Invalid
};
expect(() => {
PostgreSQLConfigSchema.parse(invalidConfig);
}).toThrow();
});
it('should enforce query timeout', async () => {
const adapter = new PostgreSQLAdapter();
const longQuery = 'SELECT pg_sleep(10)';
await expect(
adapter.executeQuery(longQuery, { timeout: 1000 })
).rejects.toThrow('Query timeout');
});
});
Integration Tests
describe('data-source.int.test.ts', () => {
it('should connect to database', async () => {
const connection = await createConnection(testConfig);
const result = await connection.testConnection();
expect(result).toBe(true);
await connection.disconnect();
});
});
Best Practices
DO:
- Always use encrypted connections
- Implement connection pooling
- Set query and connection timeouts
- Limit result set sizes
- Validate all inputs with Zod
- Use read-only connections when possible
- Clear sensitive data from memory
- Log errors internally, sanitize for users
DON'T:
- Log credentials or query results
- Expose internal error details
- Allow unlimited result sets
- Trust user input without validation
- Keep connections open indefinitely
- Store passwords in plain text
- Expose connection details in errors
Development
# Build
turbo build --filter=@buster/data-source
# Test
turbo test:unit --filter=@buster/data-source
turbo test:integration --filter=@buster/data-source
# Lint
turbo lint --filter=@buster/data-source
This package is critical for customer data security. Always prioritize security over performance or convenience.