Cleanup files

Soumyadas15 2025-06-24 14:43:49 +05:30
parent 299f5c6482
commit c6ba38eb7f
2 changed files with 0 additions and 561 deletions

@@ -1,349 +0,0 @@
# Backend Architecture for Agent Workflow System
## Overview
This document describes the scalable, modular, and extensible backend architecture for the agent workflow system, designed to work like Zapier but optimized for AI agent workflows.
## Architecture Principles
1. **Scalability**: Horizontal scaling through microservices and queue-based processing
2. **Modularity**: Clear separation of concerns with pluggable components
3. **Extensibility**: Easy to add new triggers, nodes, and integrations
4. **Reliability**: Fault tolerance, retries, and graceful degradation
5. **Performance**: Async processing, caching, and efficient resource usage
## System Components
### 1. API Gateway Layer
- **Load Balancer**: Distributes traffic across API instances
- **Authentication**: JWT-based auth with role-based access control
- **Rate Limiting**: Per-user and per-IP rate limits
- **Request Routing**: Routes to appropriate services
### 2. Trigger System
#### Supported Trigger Types:
- **Webhook Triggers**
  - Unique URLs per workflow
  - HMAC signature validation (see the sketch after this list)
  - Custom header validation
  - Request/response transformation
- **Schedule Triggers**
  - Cron-based scheduling
  - Timezone support
  - Execution windows
  - Missed execution handling
- **Event Triggers**
  - Real-time event bus (Redis Pub/Sub)
  - Event filtering and routing
  - Event replay capability
- **Polling Triggers**
  - Configurable intervals
  - Change detection
  - Rate limiting
- **Manual Triggers**
  - UI-based execution
  - API-based execution
  - Bulk execution support
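The HMAC check on webhook triggers is worth spelling out. A minimal sketch, assuming a hex-encoded SHA-256 signature arrives in a request header (the header name and encoding vary by sender):
```python
import hashlib
import hmac

def verify_webhook_signature(secret: str, body: bytes, signature: str) -> bool:
    """Compare the HMAC-SHA256 of the raw request body to the sender's signature."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, signature)
```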
### 3. Workflow Engine
#### Core Components:
- **Workflow Orchestrator**
  - Manages workflow lifecycle
  - Handles execution flow
  - Manages dependencies
  - Error handling and retries
- **Workflow Executor**
  - Executes individual nodes
  - Manages parallel execution
  - Resource allocation
  - Performance monitoring
- **State Manager**
  - Distributed state management (Redis)
  - Execution context persistence
  - Checkpoint and recovery (see the sketch after this list)
  - Real-time status updates
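A minimal state-manager sketch backed by Redis hashes (the key layout and class shape are illustrative, not the actual implementation):
```python
import json
import redis.asyncio as redis

class StateManager:
    """Persists per-node execution state in Redis so any worker can resume a run."""

    def __init__(self, url: str = "redis://localhost:6379/0"):
        self._redis = redis.from_url(url)

    async def save_checkpoint(self, execution_id: str, node_id: str, state: dict) -> None:
        # one hash per execution, one field per node
        await self._redis.hset(f"execution:{execution_id}", node_id, json.dumps(state))

    async def load_checkpoint(self, execution_id: str, node_id: str) -> dict | None:
        raw = await self._redis.hget(f"execution:{execution_id}", node_id)
        return json.loads(raw) if raw else None
```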
### 4. Node Types
- **Agent Nodes**: AI-powered processing with multiple models
- **Tool Nodes**: Integration with external services
- **Transform Nodes**: Data manipulation and formatting
- **Condition Nodes**: If/else and switch logic
- **Loop Nodes**: For/while iterations
- **Parallel Nodes**: Concurrent execution branches
- **Webhook Nodes**: HTTP requests to external services
- **Delay Nodes**: Time-based delays
### 5. Data Flow
```
Trigger → Queue → Orchestrator → Executor → Node → Output
                       ↓             ↓        ↓
                 State Manager  Tool Service  Results
```
### 6. Storage Architecture
- **PostgreSQL**: Workflow definitions, configurations, audit logs
- **Redis**: Execution state, queues, caching, pub/sub
- **S3/Blob Storage**: Large files, logs, execution artifacts
- **TimescaleDB**: Time-series data, metrics, analytics
### 7. Queue System
- **RabbitMQ**: Task queuing, priority queues, dead letter queues
- **Kafka**: Event streaming, audit trail, real-time analytics
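To make the task-queuing side concrete, a sketch of how an execution might be enqueued with `aio_pika` (queue name, priority bound, and broker URL are assumptions):
```python
import json
import aio_pika

async def enqueue_execution(execution_id: str, workflow_id: str, priority: int = 0) -> None:
    """Publish an execution task to a durable RabbitMQ queue."""
    connection = await aio_pika.connect_robust("amqp://guest:guest@localhost/")
    async with connection:
        channel = await connection.channel()
        # x-max-priority turns this into a RabbitMQ priority queue
        await channel.declare_queue(
            "workflow.executions", durable=True, arguments={"x-max-priority": 10}
        )
        message = aio_pika.Message(
            body=json.dumps({"execution_id": execution_id, "workflow_id": workflow_id}).encode(),
            priority=priority,
            delivery_mode=aio_pika.DeliveryMode.PERSISTENT,
        )
        await channel.default_exchange.publish(message, routing_key="workflow.executions")
```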
## Execution Flow
### 1. Trigger Phase
```
1. Trigger fires (webhook/schedule/event/etc)
2. Validate trigger configuration
3. Create ExecutionContext
4. Queue workflow for execution
```
### 2. Orchestration Phase
```
1. Load workflow definition
2. Build execution graph
3. Determine execution order
4. Initialize state management
```
### 3. Execution Phase
```
1. Execute nodes in topological order
2. Handle parallel branches
3. Manage data flow between nodes
4. Update execution state
```
### 4. Completion Phase
```
1. Aggregate results
2. Execute post-processing
3. Trigger downstream workflows
4. Clean up resources
```
## Scalability Features
### Horizontal Scaling
- Stateless API servers
- Distributed queue workers
- Shared state via Redis
- Database read replicas
### Performance Optimization
- Connection pooling
- Result caching
- Batch processing
- Async I/O throughout
### Resource Management
- Worker pool management
- Memory limits per execution
- CPU throttling
- Concurrent execution limits
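A single-process concurrency cap is one semaphore; a sketch (the limit is illustrative, and a fleet-wide cap would need shared state in Redis):
```python
import asyncio

MAX_CONCURRENT_EXECUTIONS = 50  # illustrative; tune per worker size

_slots = asyncio.Semaphore(MAX_CONCURRENT_EXECUTIONS)

async def run_with_limit(coro):
    """Await an execution coroutine only once a slot is free."""
    async with _slots:
        return await coro
```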
## Security
### Authentication & Authorization
- JWT tokens with refresh
- API key authentication
- OAuth2 integration
- Role-based permissions
### Data Security
- Encryption at rest
- TLS for all communications
- Secret management (Vault)
- Audit logging
### Webhook Security
- HMAC signature validation
- IP whitelisting
- Rate limiting
- Request size limits
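As one illustration of the rate-limiting piece, a per-key token bucket (single-process; a multi-instance deployment would keep the counters in Redis instead):
```python
import time

class TokenBucket:
    """Allow `rate` requests per second with bursts of up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```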
## Monitoring & Observability
### Metrics
- Prometheus metrics
- Custom business metrics
- Performance tracking
- Resource utilization
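For example, execution metrics could be exported with `prometheus_client` (metric and label names are illustrative; keep label cardinality in mind if workflows number in the thousands):
```python
from prometheus_client import Counter, Histogram

EXECUTIONS_TOTAL = Counter(
    "workflow_executions_total", "Workflow executions by outcome", ["workflow_id", "status"]
)
EXECUTION_SECONDS = Histogram(
    "workflow_execution_duration_seconds", "End-to-end execution latency", ["workflow_id"]
)

def record_execution(workflow_id: str, status: str, duration: float) -> None:
    EXECUTIONS_TOTAL.labels(workflow_id=workflow_id, status=status).inc()
    EXECUTION_SECONDS.labels(workflow_id=workflow_id).observe(duration)
```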
### Logging
- Structured logging
- Centralized log aggregation
- Log levels and filtering
- Correlation IDs
### Tracing
- Distributed tracing (OpenTelemetry)
- LLM monitoring (Langfuse)
- Execution visualization
- Performance profiling
### Alerting
- Error rate monitoring
- SLA tracking
- Resource alerts
- Custom alerts
## Error Handling
### Retry Strategies
- Exponential backoff
- Circuit breakers
- Dead letter queues
- Manual retry options
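A minimal backoff-with-jitter sketch (the delay schedule and retry budget are illustrative):
```python
import asyncio
import random

async def with_retries(fn, max_retries: int = 3, base_delay: float = 1.0):
    """Retry a zero-argument async callable, re-raising after the final attempt."""
    for attempt in range(max_retries + 1):
        try:
            return await fn()
        except Exception:
            if attempt == max_retries:
                raise
            # 1s, 2s, 4s, ... plus jitter to avoid thundering herds
            await asyncio.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
```
Usage is `await with_retries(lambda: node.run())`; circuit breakers and dead letter queues sit on top of this once retries are exhausted.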
### Failure Modes
- Node-level failures
- Workflow-level failures
- System-level failures
- Graceful degradation
## API Endpoints
### Workflow Management
```
POST /api/workflows # Create workflow
GET /api/workflows/:id # Get workflow
PUT /api/workflows/:id # Update workflow
DELETE /api/workflows/:id # Delete workflow
POST /api/workflows/:id/activate # Activate workflow
POST /api/workflows/:id/pause # Pause workflow
```
### Execution Management
```
POST /api/workflows/:id/execute # Manual execution
GET /api/executions/:id # Get execution status
POST /api/executions/:id/cancel # Cancel execution
GET /api/executions/:id/logs # Get execution logs
```
### Trigger Management
```
GET /api/workflows/:id/triggers # List triggers
POST /api/workflows/:id/triggers # Add trigger
PUT /api/triggers/:id # Update trigger
DELETE /api/triggers/:id # Remove trigger
```
### Webhook Endpoints
```
POST /webhooks/:path # Webhook receiver
GET /api/webhooks # List webhooks
```
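For flavor, a minimal FastAPI sketch of the manual-execution endpoint (FastAPI spells path parameters `{workflow_id}` rather than `:id`; `load_workflow` and `queue_execution` are hypothetical helpers):
```python
from fastapi import APIRouter, HTTPException

router = APIRouter(prefix="/api")

@router.post("/workflows/{workflow_id}/execute")
async def execute_workflow(workflow_id: str, payload: dict | None = None):
    """Validate that the workflow exists, then queue an execution and return its id."""
    workflow = await load_workflow(workflow_id)  # hypothetical repository call
    if workflow is None:
        raise HTTPException(status_code=404, detail="Workflow not found")
    execution_id = await queue_execution(workflow_id, payload or {})  # hypothetical enqueue
    return {"execution_id": execution_id, "status": "pending"}
```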
## Database Schema
### Core Tables
```sql
-- Workflows table
-- (assumes an existing projects table; adjust the reference as needed)
CREATE TABLE workflows (
    id          UUID PRIMARY KEY,
    name        VARCHAR(255) NOT NULL,
    description TEXT,
    project_id  UUID NOT NULL,
    status      VARCHAR(50) NOT NULL,
    definition  JSONB NOT NULL,
    created_at  TIMESTAMP NOT NULL DEFAULT now(),
    updated_at  TIMESTAMP NOT NULL DEFAULT now()
);

-- Workflow executions
CREATE TABLE workflow_executions (
    id           UUID PRIMARY KEY,
    workflow_id  UUID NOT NULL REFERENCES workflows(id) ON DELETE CASCADE,
    status       VARCHAR(50) NOT NULL,
    started_at   TIMESTAMP,
    completed_at TIMESTAMP,
    context      JSONB,
    result       JSONB,
    error        TEXT
);

-- Triggers
CREATE TABLE triggers (
    id          UUID PRIMARY KEY,
    workflow_id UUID NOT NULL REFERENCES workflows(id) ON DELETE CASCADE,
    type        VARCHAR(50) NOT NULL,
    config      JSONB NOT NULL,
    is_active   BOOLEAN NOT NULL DEFAULT true
);

-- Webhook registrations
CREATE TABLE webhook_registrations (
    id          UUID PRIMARY KEY,
    workflow_id UUID NOT NULL REFERENCES workflows(id) ON DELETE CASCADE,
    path        VARCHAR(255) UNIQUE NOT NULL,
    secret      VARCHAR(255) NOT NULL,
    config      JSONB
);
```
## Deployment
### Docker Compose (Development)
```yaml
services:
  api:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      - postgres
      - redis
      - rabbitmq
  worker:
    build: .
    command: python -m workflow_engine.worker
    depends_on:
      - postgres
      - redis
      - rabbitmq
  scheduler:
    build: .
    command: python -m workflow_engine.scheduler
    depends_on:
      - postgres
      - redis
  # Backing services referenced above (image tags are illustrative)
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: postgres
  redis:
    image: redis:7
  rabbitmq:
    image: rabbitmq:3-management
```
### Kubernetes (Production)
- Deployment manifests for each service
- Horizontal Pod Autoscaling
- Service mesh (Istio)
- Persistent volume claims
## Future Enhancements
1. **Workflow Versioning**: Track and manage workflow versions
2. **A/B Testing**: Test different workflow variations
3. **Workflow Templates**: Pre-built workflow templates
4. **Advanced Analytics**: Detailed execution analytics
5. **Multi-tenancy**: Full isolation between projects
6. **Workflow Marketplace**: Share and monetize workflows
7. **Visual Debugging**: Step-through debugging
8. **Performance Optimization**: ML-based optimization

@@ -1,212 +0,0 @@
# Workflow System Migration Guide
## Overview
This guide explains how to apply the database migrations for the new workflow system.
## Prerequisites
- Supabase CLI installed
- Access to your Supabase project
- Database backup (recommended)
## Migration Files
1. **`20250115000000_workflow_system.sql`** - Main migration that creates all workflow tables
2. **`20250115000001_workflow_system_rollback.sql`** - Rollback migration (if needed)
## Tables Created
The migration creates the following tables:
### Core Tables
- `workflows` - Workflow definitions and configurations
- `workflow_executions` - Execution history and status
- `triggers` - Trigger configurations
- `webhook_registrations` - Webhook endpoints
- `scheduled_jobs` - Cron-based schedules
- `workflow_templates` - Pre-built templates
- `workflow_execution_logs` - Detailed execution logs
- `workflow_variables` - Workflow-specific variables
### Enum Types
- `workflow_status` - draft, active, paused, disabled, archived
- `execution_status` - pending, running, completed, failed, cancelled, timeout
- `trigger_type` - webhook, schedule, event, polling, manual, workflow
- `node_type` - trigger, agent, tool, condition, loop, parallel, etc.
- `connection_type` - data, tool, processed_data, action, condition
## Applying the Migration
### Local Development
```bash
# Navigate to your backend directory
cd backend
# Apply the migration locally
supabase migration up
# Or apply specific migration
supabase db push --file supabase/migrations/20250115000000_workflow_system.sql
```
### Production
```bash
# Link to your production project
supabase link --project-ref your-project-ref
# Apply migrations to production
supabase db push
```
## Verification
After applying the migration, verify the tables were created:
```sql
-- Check if tables exist
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
  AND table_name LIKE 'workflow%';

-- Check enum types
SELECT typname
FROM pg_type
WHERE typname IN ('workflow_status', 'execution_status', 'trigger_type', 'node_type', 'connection_type');
```
## Row Level Security (RLS)
The migration includes RLS policies that ensure:
1. Users can only access workflows in their projects
2. Service role has full access for system operations
3. Workflow templates are publicly viewable
4. Execution logs are project-scoped
## Post-Migration Steps
### 1. Update Environment Variables
Add these to your `.env` file:
```env
# Workflow system configuration
WORKFLOW_EXECUTION_TIMEOUT=3600
WORKFLOW_MAX_RETRIES=3
WEBHOOK_BASE_URL=https://api.yourdomain.com
```
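A sketch of how the engine might consume these settings at startup (names must match the `.env` keys above; the helper is hypothetical):
```python
import os

EXECUTION_TIMEOUT = int(os.getenv("WORKFLOW_EXECUTION_TIMEOUT", "3600"))
MAX_RETRIES = int(os.getenv("WORKFLOW_MAX_RETRIES", "3"))
WEBHOOK_BASE_URL = os.getenv("WEBHOOK_BASE_URL", "http://localhost:8000")

def webhook_url(path: str) -> str:
    """Build the public URL for a registered webhook path."""
    return f"{WEBHOOK_BASE_URL}/webhooks/{path.lstrip('/')}"
```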
### 2. Initialize Workflow Engine
The workflow engine needs to be initialized on API startup:
```python
# In your api.py
from contextlib import asynccontextmanager

from fastapi import FastAPI

from workflow_engine.api import initialize as init_workflow_engine

@asynccontextmanager
async def lifespan(app: FastAPI):
    # ... existing initialization

    # Initialize workflow engine
    await init_workflow_engine()

    yield

    # ... cleanup
```
### 3. Add API Routes
Include the workflow API routes:
```python
# In your api.py
from workflow_engine import api as workflow_api
app.include_router(workflow_api.router, prefix="/api")
```
### 4. Update Docker Compose
Add workflow worker service:
```yaml
# In docker-compose.yml, under services:
workflow-worker:
  build: .
  command: python -m workflow_engine.worker
  environment:
    - DATABASE_URL=${DATABASE_URL}
    - REDIS_URL=${REDIS_URL}
  depends_on:
    - postgres
    - redis
    - rabbitmq
```
## Rollback Instructions
If you need to rollback the migration:
```bash
# Apply the rollback migration
supabase db push --file supabase/migrations/20250115000001_workflow_system_rollback.sql
```
**Warning**: This will delete all workflow data. Make sure to back up any important data first.
## Troubleshooting
### Common Issues
1. **Foreign key constraints fail**
- Ensure the `projects` and `project_members` tables exist
- Check that `auth.users` is accessible
2. **Enum type already exists**
- Drop the existing type: `DROP TYPE IF EXISTS workflow_status CASCADE;`
3. **Permission denied**
- Ensure you're using the service role key for migrations
### Checking Migration Status
```sql
-- View applied migrations
SELECT * FROM supabase_migrations.schema_migrations;

-- Check for any failed migrations
SELECT * FROM supabase_migrations.schema_migrations WHERE success = false;
```
## Performance Considerations
The migration includes several indexes for optimal performance:
- Workflow lookups by project_id
- Execution queries by status and time
- Webhook path lookups
- Scheduled job next run times
Monitor query performance and add additional indexes as needed.
## Security Notes
1. All tables have RLS enabled
2. Webhook secrets are stored encrypted
3. Workflow variables can be marked as secrets
4. API authentication required for all operations
## Next Steps
After successful migration:
1. Test workflow creation via API
2. Verify webhook endpoints are accessible
3. Test scheduled job creation
4. Monitor execution performance
5. Set up cleanup jobs for old execution logs
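A quick smoke test for the first two steps might look like this (endpoint shapes follow the workflow API; response field names and the auth header are assumptions):
```python
import requests

BASE_URL = "http://localhost:8000"  # adjust to your deployment
headers = {"Authorization": "Bearer <your-jwt>"}  # all operations require auth

# 1. Create a minimal workflow (definition shape is illustrative)
resp = requests.post(
    f"{BASE_URL}/api/workflows",
    json={"name": "smoke-test", "definition": {"nodes": {}, "edges": {}}},
    headers=headers,
)
resp.raise_for_status()
workflow_id = resp.json()["id"]

# 2. Trigger a manual execution and read back its status
run = requests.post(f"{BASE_URL}/api/workflows/{workflow_id}/execute", headers=headers)
execution_id = run.json()["execution_id"]
print(requests.get(f"{BASE_URL}/api/executions/{execution_id}", headers=headers).json())
```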