mirror of https://github.com/kortix-ai/suna.git
cleanup files
# Backend Architecture for Agent Workflow System

## Overview

This document describes the scalable, modular, and extensible backend architecture for the agent workflow system, designed to work like Zapier but optimized for AI agent workflows.

## Architecture Principles

1. **Scalability**: Horizontal scaling through microservices and queue-based processing
2. **Modularity**: Clear separation of concerns with pluggable components
3. **Extensibility**: Easy to add new triggers, nodes, and integrations
4. **Reliability**: Fault tolerance, retries, and graceful degradation
5. **Performance**: Async processing, caching, and efficient resource usage

## System Components

### 1. API Gateway Layer

- **Load Balancer**: Distributes traffic across API instances
- **Authentication**: JWT-based auth with role-based access control
- **Rate Limiting**: Per-user and per-IP rate limits
- **Request Routing**: Routes requests to the appropriate services

### 2. Trigger System

#### Supported Trigger Types

- **Webhook Triggers**
  - Unique URLs per workflow
  - HMAC signature validation (see the sketch after this list)
  - Custom header validation
  - Request/response transformation

- **Schedule Triggers**
  - Cron-based scheduling
  - Timezone support
  - Execution windows
  - Missed execution handling

- **Event Triggers**
  - Real-time event bus (Redis Pub/Sub)
  - Event filtering and routing
  - Event replay capability

- **Polling Triggers**
  - Configurable intervals
  - Change detection
  - Rate limiting

- **Manual Triggers**
  - UI-based execution
  - API-based execution
  - Bulk execution support

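A minimal sketch of the HMAC validation step, assuming the shared secret stored in `webhook_registrations.secret` and a hypothetical header carrying a hex-encoded HMAC-SHA256 of the raw request body:

```python
import hashlib
import hmac


def verify_webhook_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the raw body."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest guards against timing attacks
    return hmac.compare_digest(expected, signature_header)
```
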
### 3. Workflow Engine

#### Core Components

- **Workflow Orchestrator**
  - Manages workflow lifecycle
  - Handles execution flow
  - Manages dependencies
  - Error handling and retries

- **Workflow Executor**
  - Executes individual nodes
  - Manages parallel execution
  - Resource allocation
  - Performance monitoring

- **State Manager**
  - Distributed state management (Redis)
  - Execution context persistence
  - Checkpoint and recovery (see the sketch after this list)
  - Real-time status updates

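A minimal sketch of checkpoint and recovery on Redis with `redis.asyncio`; the key layout and one-day TTL are assumptions:

```python
import json

import redis.asyncio as redis

r = redis.from_url("redis://localhost:6379/0")


async def checkpoint(execution_id: str, node_id: str, state: dict) -> None:
    """Persist per-node state so a crashed worker can resume the execution."""
    key = f"execution:{execution_id}:state"
    await r.hset(key, node_id, json.dumps(state))
    await r.expire(key, 24 * 3600)  # keep checkpoints for a day (assumed)


async def restore(execution_id: str) -> dict:
    """Load all node checkpoints for an execution."""
    raw = await r.hgetall(f"execution:{execution_id}:state")
    return {k.decode(): json.loads(v) for k, v in raw.items()}
```

On worker restart, `restore` lets the orchestrator resume from the last completed node instead of re-running the whole workflow.
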
### 4. Node Types

- **Agent Nodes**: AI-powered processing with multiple models
- **Tool Nodes**: Integration with external services
- **Transform Nodes**: Data manipulation and formatting
- **Condition Nodes**: If/else and switch logic
- **Loop Nodes**: For/while iterations
- **Parallel Nodes**: Concurrent execution branches
- **Webhook Nodes**: HTTP requests to external services
- **Delay Nodes**: Time-based delays

### 5. Data Flow

```
Trigger → Queue → Orchestrator → Executor → Node → Output
                       ↓             ↓        ↓
                 State Manager  Tool Service  Results
```

### 6. Storage Architecture

- **PostgreSQL**: Workflow definitions, configurations, audit logs
- **Redis**: Execution state, queues, caching, pub/sub
- **S3/Blob Storage**: Large files, logs, execution artifacts
- **TimescaleDB**: Time-series data, metrics, analytics

### 7. Queue System

- **RabbitMQ**: Task queuing, priority queues, dead letter queues (see the publish sketch below)
- **Kafka**: Event streaming, audit trail, real-time analytics

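A minimal sketch of queuing a workflow execution on RabbitMQ with `pika`; the queue name and message shape are assumptions:

```python
import json

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
# durable queue survives broker restarts
channel.queue_declare(queue="workflow.executions", durable=True)


def enqueue_execution(workflow_id: str, trigger_payload: dict) -> None:
    """Publish an execution request for a queue worker to pick up."""
    message = {"workflow_id": workflow_id, "payload": trigger_payload}
    channel.basic_publish(
        exchange="",
        routing_key="workflow.executions",
        body=json.dumps(message).encode(),
        properties=pika.BasicProperties(delivery_mode=2),  # persistent message
    )
```
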
## Execution Flow

### 1. Trigger Phase

```
1. Trigger fires (webhook/schedule/event/etc.)
2. Validate trigger configuration
3. Create ExecutionContext
4. Queue workflow for execution
```

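A minimal sketch of the `ExecutionContext` created in step 3; the field names are assumptions derived from the `context` JSONB column in the schema below:

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class ExecutionContext:
    """Carries everything a worker needs to run one workflow execution."""
    workflow_id: str
    trigger_type: str  # webhook, schedule, event, polling, manual
    trigger_payload: dict[str, Any]
    execution_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    started_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    variables: dict[str, Any] = field(default_factory=dict)
```
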
### 2. Orchestration Phase

```
1. Load workflow definition
2. Build execution graph
3. Determine execution order
4. Initialize state management
```

### 3. Execution Phase

```
1. Execute nodes in topological order
2. Handle parallel branches
3. Manage data flow between nodes
4. Update execution state
```

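A minimal sketch of steps 1 and 3, using `graphlib` from the standard library for the topological ordering; representing nodes as callables is an assumption for illustration:

```python
from graphlib import TopologicalSorter
from typing import Any, Callable


def run_workflow(
    nodes: dict[str, Callable[[dict[str, Any]], Any]],
    edges: list[tuple[str, str]],  # (upstream, downstream) pairs
    inputs: dict[str, Any],
) -> dict[str, Any]:
    """Execute nodes in topological order, passing upstream outputs downstream."""
    # Map each node to the set of nodes it depends on
    graph: dict[str, set[str]] = {name: set() for name in nodes}
    for upstream, downstream in edges:
        graph[downstream].add(upstream)

    results: dict[str, Any] = {}
    for node_id in TopologicalSorter(graph).static_order():
        upstream_outputs = {dep: results[dep] for dep in graph[node_id]}
        results[node_id] = nodes[node_id]({**inputs, **upstream_outputs})
    return results
```
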
### 4. Completion Phase

```
1. Aggregate results
2. Execute post-processing
3. Trigger downstream workflows
4. Clean up resources
```

## Scalability Features

### Horizontal Scaling

- Stateless API servers
- Distributed queue workers
- Shared state via Redis
- Database read replicas

### Performance Optimization

- Connection pooling
- Result caching
- Batch processing
- Async I/O throughout

### Resource Management

- Worker pool management
- Memory limits per execution
- CPU throttling
- Concurrent execution limits

## Security

### Authentication & Authorization

- JWT tokens with refresh
- API key authentication
- OAuth2 integration
- Role-based permissions

### Data Security

- Encryption at rest
- TLS for all communications
- Secret management (Vault)
- Audit logging

### Webhook Security

- HMAC signature validation
- IP whitelisting
- Rate limiting
- Request size limits

## Monitoring & Observability

### Metrics

- Prometheus metrics (see the sketch below)
- Custom business metrics
- Performance tracking
- Resource utilization

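A minimal `prometheus_client` sketch for execution metrics; the metric names and scrape port are assumptions:

```python
from prometheus_client import Counter, Histogram, start_http_server

EXECUTIONS_TOTAL = Counter(
    "workflow_executions_total",
    "Workflow executions by terminal status",
    ["workflow_id", "status"],
)
EXECUTION_SECONDS = Histogram(
    "workflow_execution_duration_seconds",
    "End-to-end workflow execution time",
    ["workflow_id"],
)

# Expose /metrics for Prometheus to scrape
start_http_server(9100)


def record_execution(workflow_id: str, status: str, duration: float) -> None:
    EXECUTIONS_TOTAL.labels(workflow_id=workflow_id, status=status).inc()
    EXECUTION_SECONDS.labels(workflow_id=workflow_id).observe(duration)
```
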
### Logging

- Structured logging
- Centralized log aggregation
- Log levels and filtering
- Correlation IDs (see the sketch below)

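A minimal correlation-ID sketch using the standard library's `logging` and `contextvars`, so every log line from one execution carries the same ID:

```python
import logging
from contextvars import ContextVar

correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    """Inject the current correlation ID into every record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True


handler = logging.StreamHandler()
handler.addFilter(CorrelationFilter())
handler.setFormatter(
    logging.Formatter("%(asctime)s %(correlation_id)s %(levelname)s %(message)s")
)
logging.basicConfig(level=logging.INFO, handlers=[handler])

# At the start of each execution:
correlation_id.set("exec-1234")  # hypothetical ID
logging.info("node completed")   # ... exec-1234 INFO node completed
```
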
### Tracing

- Distributed tracing (OpenTelemetry)
- LLM monitoring (Langfuse)
- Execution visualization
- Performance profiling

### Alerting

- Error rate monitoring
- SLA tracking
- Resource alerts
- Custom alerts

## Error Handling

### Retry Strategies

- Exponential backoff (see the sketch below)
- Circuit breakers
- Dead letter queues
- Manual retry options

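A minimal exponential-backoff sketch; the retry count and base delay are assumptions that would come from configuration such as `WORKFLOW_MAX_RETRIES`:

```python
import asyncio
import random
from typing import Any, Awaitable, Callable


async def with_backoff(
    fn: Callable[[], Awaitable[Any]],
    max_retries: int = 3,
    base_delay: float = 1.0,
) -> Any:
    """Retry fn with exponential backoff and jitter; re-raise after max_retries."""
    for attempt in range(max_retries + 1):
        try:
            return await fn()
        except Exception:
            if attempt == max_retries:
                raise  # hand off to the dead letter queue
            # 2^attempt growth plus jitter to avoid thundering herds
            await asyncio.sleep(base_delay * 2 ** attempt + random.random())
```
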
### Failure Modes

- Node-level failures
- Workflow-level failures
- System-level failures
- Graceful degradation

## API Endpoints

### Workflow Management

```
POST   /api/workflows                  # Create workflow
GET    /api/workflows/:id              # Get workflow
PUT    /api/workflows/:id              # Update workflow
DELETE /api/workflows/:id              # Delete workflow
POST   /api/workflows/:id/activate     # Activate workflow
POST   /api/workflows/:id/pause        # Pause workflow
```

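A minimal FastAPI sketch of the first two routes; the request model, handler bodies, and Pydantic v2 usage are assumptions:

```python
import uuid

from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter(prefix="/api")


class WorkflowCreate(BaseModel):
    name: str
    description: str = ""
    definition: dict  # nodes and connections, stored as JSONB


@router.post("/workflows")
async def create_workflow(body: WorkflowCreate) -> dict:
    workflow_id = str(uuid.uuid4())
    # persist to the workflows table here
    return {"id": workflow_id, "status": "draft", **body.model_dump()}


@router.get("/workflows/{workflow_id}")
async def get_workflow(workflow_id: str) -> dict:
    # load from the workflows table here
    return {"id": workflow_id}
```
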
### Execution Management

```
POST /api/workflows/:id/execute   # Manual execution
GET  /api/executions/:id          # Get execution status
POST /api/executions/:id/cancel   # Cancel execution
GET  /api/executions/:id/logs     # Get execution logs
```

### Trigger Management

```
GET    /api/workflows/:id/triggers   # List triggers
POST   /api/workflows/:id/triggers   # Add trigger
PUT    /api/triggers/:id             # Update trigger
DELETE /api/triggers/:id             # Remove trigger
```

### Webhook Endpoints

```
POST /webhooks/:path   # Webhook receiver
GET  /api/webhooks     # List webhooks
```

## Database Schema

### Core Tables

```sql
-- Workflows table
CREATE TABLE workflows (
    id UUID PRIMARY KEY,
    name VARCHAR(255),
    description TEXT,
    project_id UUID,
    status VARCHAR(50),
    definition JSONB,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);

-- Workflow executions
CREATE TABLE workflow_executions (
    id UUID PRIMARY KEY,
    workflow_id UUID,
    status VARCHAR(50),
    started_at TIMESTAMP,
    completed_at TIMESTAMP,
    context JSONB,
    result JSONB,
    error TEXT
);

-- Triggers
CREATE TABLE triggers (
    id UUID PRIMARY KEY,
    workflow_id UUID,
    type VARCHAR(50),
    config JSONB,
    is_active BOOLEAN
);

-- Webhook registrations
CREATE TABLE webhook_registrations (
    id UUID PRIMARY KEY,
    workflow_id UUID,
    path VARCHAR(255) UNIQUE,
    secret VARCHAR(255),
    config JSONB
);
```

## Deployment

### Docker Compose (Development)

```yaml
services:
  api:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      - postgres
      - redis
      - rabbitmq

  worker:
    build: .
    command: python -m workflow_engine.worker
    depends_on:
      - postgres
      - redis
      - rabbitmq

  scheduler:
    build: .
    command: python -m workflow_engine.scheduler
    depends_on:
      - postgres
      - redis
```

### Kubernetes (Production)

- Deployment manifests for each service
- Horizontal Pod Autoscaling
- Service mesh (Istio)
- Persistent volume claims

## Future Enhancements

1. **Workflow Versioning**: Track and manage workflow versions
2. **A/B Testing**: Test different workflow variations
3. **Workflow Templates**: Pre-built workflow templates
4. **Advanced Analytics**: Detailed execution analytics
5. **Multi-tenancy**: Full isolation between projects
6. **Workflow Marketplace**: Share and monetize workflows
7. **Visual Debugging**: Step-through debugging
8. **Performance Optimization**: ML-based optimization

# Workflow System Migration Guide

## Overview

This guide explains how to apply the database migrations for the new workflow system.

## Prerequisites

- Supabase CLI installed
- Access to your Supabase project
- Database backup (recommended)

## Migration Files

1. **`20250115000000_workflow_system.sql`** - Main migration that creates all workflow tables
2. **`20250115000001_workflow_system_rollback.sql`** - Rollback migration (if needed)

## Tables Created

The migration creates the following tables:

### Core Tables

- `workflows` - Workflow definitions and configurations
- `workflow_executions` - Execution history and status
- `triggers` - Trigger configurations
- `webhook_registrations` - Webhook endpoints
- `scheduled_jobs` - Cron-based schedules
- `workflow_templates` - Pre-built templates
- `workflow_execution_logs` - Detailed execution logs
- `workflow_variables` - Workflow-specific variables

### Enum Types

- `workflow_status` - draft, active, paused, disabled, archived
- `execution_status` - pending, running, completed, failed, cancelled, timeout
- `trigger_type` - webhook, schedule, event, polling, manual, workflow
- `node_type` - trigger, agent, tool, condition, loop, parallel, etc.
- `connection_type` - data, tool, processed_data, action, condition

## Applying the Migration

### Local Development

```bash
# Navigate to your backend directory
cd backend

# Apply the migration locally
supabase migration up

# Or apply a specific migration
supabase db push --file supabase/migrations/20250115000000_workflow_system.sql
```

### Production

```bash
# Link to your production project
supabase link --project-ref your-project-ref

# Apply migrations to production
supabase db push
```

## Verification

After applying the migration, verify the tables were created:

```sql
-- Check if tables exist
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
AND table_name LIKE 'workflow%';

-- Check enum types
SELECT typname
FROM pg_type
WHERE typname IN ('workflow_status', 'execution_status', 'trigger_type', 'node_type', 'connection_type');
```

## Row Level Security (RLS)

The migration includes RLS policies that ensure:

1. Users can only access workflows in their projects
2. Service role has full access for system operations
3. Workflow templates are publicly viewable
4. Execution logs are project-scoped

## Post-Migration Steps

### 1. Update Environment Variables

Add these to your `.env` file:

```env
# Workflow system configuration
WORKFLOW_EXECUTION_TIMEOUT=3600
WORKFLOW_MAX_RETRIES=3
WEBHOOK_BASE_URL=https://api.yourdomain.com
```

### 2. Initialize Workflow Engine

The workflow engine needs to be initialized on API startup:

```python
# In your api.py
from contextlib import asynccontextmanager

from fastapi import FastAPI

from workflow_engine.api import initialize as init_workflow_engine


@asynccontextmanager
async def lifespan(app: FastAPI):
    # ... existing initialization

    # Initialize workflow engine
    await init_workflow_engine()

    yield
    # ... cleanup
```

### 3. Add API Routes

Include the workflow API routes:

```python
# In your api.py
from workflow_engine import api as workflow_api

app.include_router(workflow_api.router, prefix="/api")
```

### 4. Update Docker Compose

Add workflow worker service:

```yaml
# In docker-compose.yml
workflow-worker:
  build: .
  command: python -m workflow_engine.worker
  environment:
    - DATABASE_URL=${DATABASE_URL}
    - REDIS_URL=${REDIS_URL}
  depends_on:
    - postgres
    - redis
    - rabbitmq
```

## Rollback Instructions

If you need to roll back the migration:

```bash
# Apply the rollback migration
supabase db push --file supabase/migrations/20250115000001_workflow_system_rollback.sql
```

**Warning**: This will delete all workflow data. Make sure to back up any important data first.

## Troubleshooting

### Common Issues

1. **Foreign key constraints fail**
   - Ensure the `projects` and `project_members` tables exist
   - Check that `auth.users` is accessible

2. **Enum type already exists**
   - Drop the existing type: `DROP TYPE IF EXISTS workflow_status CASCADE;`

3. **Permission denied**
   - Ensure you're using the service role key for migrations

### Checking Migration Status

```sql
-- View applied migrations
SELECT * FROM supabase_migrations.schema_migrations;

-- Check for any failed migrations
SELECT * FROM supabase_migrations.schema_migrations WHERE success = false;
```

## Performance Considerations

The migration includes several indexes for optimal performance:

- Workflow lookups by project_id
- Execution queries by status and time
- Webhook path lookups
- Scheduled job next run times

Monitor query performance and add additional indexes as needed.

## Security Notes

1. All tables have RLS enabled
2. Webhook secrets are stored encrypted
3. Workflow variables can be marked as secrets
4. API authentication required for all operations

## Next Steps

After successful migration:

1. Test workflow creation via API
2. Verify webhook endpoints are accessible
3. Test scheduled job creation
4. Monitor execution performance
5. Set up cleanup jobs for old execution logs