diff --git a/backend/ARCHITECTURE.md b/backend/ARCHITECTURE.md
deleted file mode 100644
index cb7f92ea..00000000
--- a/backend/ARCHITECTURE.md
+++ /dev/null
@@ -1,349 +0,0 @@

# Backend Architecture for Agent Workflow System

## Overview

This document describes the scalable, modular, and extensible backend architecture for the agent workflow system, designed to work like Zapier but optimized for AI agent workflows.

## Architecture Principles

1. **Scalability**: Horizontal scaling through microservices and queue-based processing
2. **Modularity**: Clear separation of concerns with pluggable components
3. **Extensibility**: Easy to add new triggers, nodes, and integrations
4. **Reliability**: Fault tolerance, retries, and graceful degradation
5. **Performance**: Async processing, caching, and efficient resource usage

## System Components

### 1. API Gateway Layer

- **Load Balancer**: Distributes traffic across API instances
- **Authentication**: JWT-based auth with role-based access control
- **Rate Limiting**: Per-user and per-IP rate limits
- **Request Routing**: Routes requests to the appropriate services

### 2. Trigger System

#### Supported Trigger Types

- **Webhook Triggers**
  - Unique URLs per workflow
  - HMAC signature validation
  - Custom header validation
  - Request/response transformation

- **Schedule Triggers**
  - Cron-based scheduling
  - Timezone support
  - Execution windows
  - Missed execution handling

- **Event Triggers**
  - Real-time event bus (Redis Pub/Sub)
  - Event filtering and routing
  - Event replay capability

- **Polling Triggers**
  - Configurable intervals
  - Change detection
  - Rate limiting

- **Manual Triggers**
  - UI-based execution
  - API-based execution
  - Bulk execution support

### 3. Workflow Engine

#### Core Components

- **Workflow Orchestrator**
  - Manages the workflow lifecycle
  - Handles execution flow
  - Manages dependencies
  - Error handling and retries

- **Workflow Executor**
  - Executes individual nodes
  - Manages parallel execution
  - Resource allocation
  - Performance monitoring

- **State Manager**
  - Distributed state management (Redis)
  - Execution context persistence
  - Checkpointing and recovery
  - Real-time status updates

### 4. Node Types

- **Agent Nodes**: AI-powered processing with multiple models
- **Tool Nodes**: Integration with external services
- **Transform Nodes**: Data manipulation and formatting
- **Condition Nodes**: If/else and switch logic
- **Loop Nodes**: For/while iterations
- **Parallel Nodes**: Concurrent execution branches
- **Webhook Nodes**: HTTP requests to external services
- **Delay Nodes**: Time-based delays

### 5. Data Flow

```
Trigger → Queue → Orchestrator → Executor → Node → Output
                       ↓             ↓        ↓
                 State Manager  Tool Service  Results
```

### 6. Storage Architecture

- **PostgreSQL**: Workflow definitions, configurations, audit logs
- **Redis**: Execution state, queues, caching, pub/sub
- **S3/Blob Storage**: Large files, logs, execution artifacts
- **TimescaleDB**: Time-series data, metrics, analytics

### 7. Queue System

- **RabbitMQ**: Task queuing, priority queues, dead letter queues
- **Kafka**: Event streaming, audit trail, real-time analytics

## Execution Flow

### 1. Trigger Phase

```
1. Trigger fires (webhook/schedule/event/etc.)
2. Validate trigger configuration
3. Create ExecutionContext
4. Queue workflow for execution
```
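As a rough illustration of the trigger phase above, here is a minimal Python sketch. Every name in it (`ExecutionContext`, `handle_trigger`, the `asyncio.Queue` handoff) is an assumption for illustration, not the engine's actual API; the real system would publish to RabbitMQ rather than an in-process queue.

```python
# Sketch of the trigger phase steps above. All names here
# (ExecutionContext, handle_trigger, the queue handoff) are
# illustrative, not the engine's actual API.
import asyncio
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

TRIGGER_TYPES = {"webhook", "schedule", "event", "polling", "manual"}


@dataclass
class ExecutionContext:
    workflow_id: str
    trigger_type: str
    payload: dict[str, Any]
    execution_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    started_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


async def handle_trigger(
    workflow_id: str, trigger_type: str, payload: dict[str, Any], queue: asyncio.Queue
) -> str:
    """Validate the trigger, build an ExecutionContext, enqueue the workflow."""
    if trigger_type not in TRIGGER_TYPES:
        raise ValueError(f"unsupported trigger type: {trigger_type}")
    ctx = ExecutionContext(workflow_id, trigger_type, payload)
    await queue.put(ctx)  # stand-in for the RabbitMQ publish
    return ctx.execution_id
```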
### 2. Orchestration Phase

```
1. Load workflow definition
2. Build execution graph
3. Determine execution order
4. Initialize state management
```

### 3. Execution Phase

```
1. Execute nodes in topological order
2. Handle parallel branches
3. Manage data flow between nodes
4. Update execution state
```

### 4. Completion Phase

```
1. Aggregate results
2. Execute post-processing
3. Trigger downstream workflows
4. Clean up resources
```

## Scalability Features

### Horizontal Scaling

- Stateless API servers
- Distributed queue workers
- Shared state via Redis
- Database read replicas

### Performance Optimization

- Connection pooling
- Result caching
- Batch processing
- Async I/O throughout

### Resource Management

- Worker pool management
- Memory limits per execution
- CPU throttling
- Concurrent execution limits

## Security

### Authentication & Authorization

- JWT tokens with refresh
- API key authentication
- OAuth2 integration
- Role-based permissions

### Data Security

- Encryption at rest
- TLS for all communications
- Secret management (Vault)
- Audit logging

### Webhook Security

- HMAC signature validation
- IP whitelisting
- Rate limiting
- Request size limits

## Monitoring & Observability

### Metrics

- Prometheus metrics
- Custom business metrics
- Performance tracking
- Resource utilization

### Logging

- Structured logging
- Centralized log aggregation
- Log levels and filtering
- Correlation IDs

### Tracing

- Distributed tracing (OpenTelemetry)
- LLM monitoring (Langfuse)
- Execution visualization
- Performance profiling

### Alerting

- Error rate monitoring
- SLA tracking
- Resource alerts
- Custom alerts

## Error Handling

### Retry Strategies

- Exponential backoff
- Circuit breakers
- Dead letter queues
- Manual retry options

### Failure Modes

- Node-level failures
- Workflow-level failures
- System-level failures
- Graceful degradation

## API Endpoints

### Workflow Management

```
POST   /api/workflows              # Create workflow
GET    /api/workflows/:id          # Get workflow
PUT    /api/workflows/:id          # Update workflow
DELETE /api/workflows/:id          # Delete workflow
POST   /api/workflows/:id/activate # Activate workflow
POST   /api/workflows/:id/pause    # Pause workflow
```

### Execution Management

```
POST /api/workflows/:id/execute # Manual execution
GET  /api/executions/:id        # Get execution status
POST /api/executions/:id/cancel # Cancel execution
GET  /api/executions/:id/logs   # Get execution logs
```

### Trigger Management

```
GET    /api/workflows/:id/triggers # List triggers
POST   /api/workflows/:id/triggers # Add trigger
PUT    /api/triggers/:id           # Update trigger
DELETE /api/triggers/:id           # Remove trigger
```

### Webhook Endpoints

```
POST /webhooks/:path # Webhook receiver
GET  /api/webhooks   # List webhooks
```
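As a usage sketch for the execution endpoints above, the snippet below starts a manual run and polls until a terminal status. The base URL, bearer-token auth, and JSON field names are assumptions; the terminal statuses match the `execution_status` values described in the migration guide.

```python
# Sketch: start a manual execution and poll until it finishes.
# Base URL, auth scheme, and response field names are assumptions.
import asyncio
import os

import httpx

BASE_URL = os.environ.get("WEBHOOK_BASE_URL", "http://localhost:8000")
TERMINAL = {"completed", "failed", "cancelled", "timeout"}


async def run_workflow(workflow_id: str, token: str) -> dict:
    headers = {"Authorization": f"Bearer {token}"}
    async with httpx.AsyncClient(base_url=BASE_URL, headers=headers) as client:
        resp = await client.post(f"/api/workflows/{workflow_id}/execute")
        resp.raise_for_status()
        execution_id = resp.json()["id"]  # response shape assumed

        # Poll until the execution reaches a terminal status.
        while True:
            status = (await client.get(f"/api/executions/{execution_id}")).json()
            if status.get("status") in TERMINAL:
                return status
            await asyncio.sleep(2)


if __name__ == "__main__":
    print(asyncio.run(run_workflow("wf-123", os.environ.get("API_TOKEN", ""))))
```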
## Database Schema

### Core Tables

```sql
-- Workflows table
CREATE TABLE workflows (
    id UUID PRIMARY KEY,
    name VARCHAR(255),
    description TEXT,
    project_id UUID,
    status VARCHAR(50),
    definition JSONB,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);

-- Workflow executions
CREATE TABLE workflow_executions (
    id UUID PRIMARY KEY,
    workflow_id UUID,
    status VARCHAR(50),
    started_at TIMESTAMP,
    completed_at TIMESTAMP,
    context JSONB,
    result JSONB,
    error TEXT
);

-- Triggers
CREATE TABLE triggers (
    id UUID PRIMARY KEY,
    workflow_id UUID,
    type VARCHAR(50),
    config JSONB,
    is_active BOOLEAN
);

-- Webhook registrations
CREATE TABLE webhook_registrations (
    id UUID PRIMARY KEY,
    workflow_id UUID,
    path VARCHAR(255) UNIQUE,
    secret VARCHAR(255),
    config JSONB
);
```

## Deployment

### Docker Compose (Development)

```yaml
services:
  api:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      - postgres
      - redis
      - rabbitmq

  worker:
    build: .
    command: python -m workflow_engine.worker
    depends_on:
      - postgres
      - redis
      - rabbitmq

  scheduler:
    build: .
    command: python -m workflow_engine.scheduler
    depends_on:
      - postgres
      - redis
```

### Kubernetes (Production)

- Deployment manifests for each service
- Horizontal Pod Autoscaling
- Service mesh (Istio)
- Persistent volume claims

## Future Enhancements

1. **Workflow Versioning**: Track and manage workflow versions
2. **A/B Testing**: Test different workflow variations
3. **Workflow Templates**: Pre-built workflow templates
4. **Advanced Analytics**: Detailed execution analytics
5. **Multi-tenancy**: Full isolation between projects
6. **Workflow Marketplace**: Share and monetize workflows
7. **Visual Debugging**: Step-through debugging
8. **Performance Optimization**: ML-based optimization

diff --git a/backend/MIGRATION_GUIDE.md b/backend/MIGRATION_GUIDE.md
deleted file mode 100644
index 31e04a75..00000000
--- a/backend/MIGRATION_GUIDE.md
+++ /dev/null
@@ -1,212 +0,0 @@

# Workflow System Migration Guide

## Overview

This guide explains how to apply the database migrations for the new workflow system.

## Prerequisites

- Supabase CLI installed
- Access to your Supabase project
- Database backup (recommended)

## Migration Files

1. **`20250115000000_workflow_system.sql`** - Main migration that creates all workflow tables
2. **`20250115000001_workflow_system_rollback.sql`** - Rollback migration (if needed)

## Tables Created

The migration creates the following tables:

### Core Tables

- `workflows` - Workflow definitions and configurations
- `workflow_executions` - Execution history and status
- `triggers` - Trigger configurations
- `webhook_registrations` - Webhook endpoints
- `scheduled_jobs` - Cron-based schedules
- `workflow_templates` - Pre-built templates
- `workflow_execution_logs` - Detailed execution logs
- `workflow_variables` - Workflow-specific variables

### Enum Types

- `workflow_status` - draft, active, paused, disabled, archived
- `execution_status` - pending, running, completed, failed, cancelled, timeout
- `trigger_type` - webhook, schedule, event, polling, manual, workflow
- `node_type` - trigger, agent, tool, condition, loop, parallel, etc.
- `connection_type` - data, tool, processed_data, action, condition
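Application code typically mirrors these database enums; a minimal sketch of what such mirrors could look like in Python (the class names are illustrative, the values come from the list above):

```python
# Illustrative Python mirrors of the SQL enum types created by the
# migration. Values match the enums listed above; class names are assumed.
from enum import Enum


class WorkflowStatus(str, Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    PAUSED = "paused"
    DISABLED = "disabled"
    ARCHIVED = "archived"


class ExecutionStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"
    TIMEOUT = "timeout"


class TriggerType(str, Enum):
    WEBHOOK = "webhook"
    SCHEDULE = "schedule"
    EVENT = "event"
    POLLING = "polling"
    MANUAL = "manual"
    WORKFLOW = "workflow"
```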
## Applying the Migration

### Local Development

```bash
# Navigate to your backend directory
cd backend

# Apply the migration locally
supabase migration up

# Or apply a specific migration
supabase db push --file supabase/migrations/20250115000000_workflow_system.sql
```

### Production

```bash
# Link to your production project
supabase link --project-ref your-project-ref

# Apply migrations to production
supabase db push
```

## Verification

After applying the migration, verify that the tables were created:

```sql
-- Check if tables exist
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
AND table_name LIKE 'workflow%';

-- Check enum types
SELECT typname
FROM pg_type
WHERE typname IN ('workflow_status', 'execution_status', 'trigger_type', 'node_type', 'connection_type');
```

## Row Level Security (RLS)

The migration includes RLS policies that ensure:

1. Users can only access workflows in their projects
2. Service role has full access for system operations
3. Workflow templates are publicly viewable
4. Execution logs are project-scoped

## Post-Migration Steps

### 1. Update Environment Variables

Add these to your `.env` file:

```env
# Workflow system configuration
WORKFLOW_EXECUTION_TIMEOUT=3600
WORKFLOW_MAX_RETRIES=3
WEBHOOK_BASE_URL=https://api.yourdomain.com
```

### 2. Initialize Workflow Engine

The workflow engine needs to be initialized on API startup:

```python
# In your api.py
from workflow_engine.api import initialize as init_workflow_engine

@asynccontextmanager
async def lifespan(app: FastAPI):
    # ... existing initialization

    # Initialize workflow engine
    await init_workflow_engine()

    yield
    # ... cleanup
```

### 3. Add API Routes

Include the workflow API routes:

```python
# In your api.py
from workflow_engine import api as workflow_api

app.include_router(workflow_api.router, prefix="/api")
```

### 4. Update Docker Compose

Add the workflow worker service (a sketch of the worker module it runs follows below):

```yaml
# In docker-compose.yml
workflow-worker:
  build: .
  command: python -m workflow_engine.worker
  environment:
    - DATABASE_URL=${DATABASE_URL}
    - REDIS_URL=${REDIS_URL}
  depends_on:
    - postgres
    - redis
    - rabbitmq
```
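The guide never shows `workflow_engine.worker` itself; as a hedged sketch of the general shape of such an entry point (the queue and executor wiring here is entirely assumed, and the real worker would consume RabbitMQ rather than an in-process queue):

```python
# Hypothetical shape of workflow_engine/worker.py: consume queued
# execution contexts and run them. Everything below the module path
# is an assumption of this sketch.
import asyncio
import logging

logger = logging.getLogger("workflow_engine.worker")


async def run_worker(queue: asyncio.Queue) -> None:
    """Pull execution contexts off the queue and run them until cancelled."""
    while True:
        ctx = await queue.get()
        try:
            logger.info("executing workflow %s", ctx.workflow_id)
            # ... load the definition, build the execution graph, and
            # hand off to the executor here.
        except Exception:
            logger.exception("workflow execution failed")
        finally:
            queue.task_done()
```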
## Rollback Instructions

If you need to roll back the migration:

```bash
# Apply the rollback migration
supabase db push --file supabase/migrations/20250115000001_workflow_system_rollback.sql
```

**Warning**: This will delete all workflow data. Make sure to back up any important data first.

## Troubleshooting

### Common Issues

1. **Foreign key constraints fail**
   - Ensure the `projects` and `project_members` tables exist
   - Check that `auth.users` is accessible

2. **Enum type already exists**
   - Drop the existing type: `DROP TYPE IF EXISTS workflow_status CASCADE;`

3. **Permission denied**
   - Ensure you're using the service role key for migrations

### Checking Migration Status

```sql
-- View applied migrations
SELECT * FROM supabase_migrations.schema_migrations;

-- Check for any failed migrations
SELECT * FROM supabase_migrations.schema_migrations WHERE success = false;
```

## Performance Considerations

The migration includes several indexes for optimal performance:

- Workflow lookups by `project_id`
- Execution queries by status and time
- Webhook path lookups
- Scheduled job next run times

Monitor query performance and add additional indexes as needed.

## Security Notes

1. All tables have RLS enabled
2. Webhook secrets are stored encrypted
3. Workflow variables can be marked as secrets
4. API authentication is required for all operations

## Next Steps

After a successful migration:

1. Test workflow creation via the API
2. Verify webhook endpoints are accessible
3. Test scheduled job creation
4. Monitor execution performance
5. Set up cleanup jobs for old execution logs (see the sketch below)
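For step 5, a minimal sketch of such a cleanup job, assuming `asyncpg`, a `DATABASE_URL` DSN, a `created_at` column on `workflow_execution_logs`, and a 30-day retention window:

```python
# Sketch: purge workflow_execution_logs rows older than a retention
# window. The table name comes from the migration; asyncpg, the DSN
# env var, and the created_at column are assumptions of this sketch.
import asyncio
import os

import asyncpg  # pip install asyncpg

RETENTION_DAYS = 30


async def cleanup_old_logs() -> None:
    conn = await asyncpg.connect(os.environ["DATABASE_URL"])
    try:
        deleted = await conn.execute(
            "DELETE FROM workflow_execution_logs "
            "WHERE created_at < now() - ($1 || ' days')::interval",
            str(RETENTION_DAYS),
        )
        print(deleted)  # e.g. "DELETE 1234"
    finally:
        await conn.close()


if __name__ == "__main__":
    asyncio.run(cleanup_old_logs())
```

Scheduling this via the system's own `scheduled_jobs` table or a plain cron entry are both reasonable options.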