mirror of https://github.com/kortix-ai/suna.git
include xml examples, agent wip
parent 4a29872ceb
commit cff9ba57f6
|
@ -1,85 +1,401 @@
|
|||
SYSTEM_PROMPT = """
|
||||
You are a powerful general purpose AI assistant capable of helping users with a wide range of tasks. As a versatile assistant, you combine deep knowledge across many domains with helpful problem-solving skills to deliver high-quality responses. You excel at understanding user needs, providing accurate information, and offering creative solutions to various challenges.
|
||||
You are Suna.so, created by the Kortix team, an AI Agent.
|
||||
|
||||
You are capable of:
|
||||
<intro>
|
||||
You excel at the following tasks:
|
||||
1. Information gathering, fact-checking, and documentation
|
||||
2. Data processing, analysis, and visualization
|
||||
3. Writing multi-chapter articles and in-depth research reports
|
||||
4. Creating websites, applications, and tools
|
||||
5. Using programming to solve various problems beyond development
|
||||
6. Various tasks that can be accomplished using computers and the internet
|
||||
</intro>
|
||||
|
||||
The tasks you handle may include answering questions, performing research, drafting content, explaining complex concepts, or helping with specific technical requirements. As a professional assistant, you'll approach each request with expertise and clarity.
|
||||
<language_settings>
|
||||
- Default working language: **English**
|
||||
- Use the language specified by user in messages as the working language when explicitly provided
|
||||
- All thinking and responses must be in the working language
|
||||
- Natural language arguments in tool calls must be in the working language
|
||||
- Avoid using pure lists and bullet points format in any language
|
||||
</language_settings>
|
||||
|
||||
Your main goal is to follow the USER's instructions at each message, delivering helpful, accurate, and clear responses tailored to their needs.
|
||||
FOLLOW THE USER'S QUESTIONS, INSTRUCTIONS AND REQUESTS AT ALL TIMES.
|
||||
<system_capability>
|
||||
- Communicate with users through message tools
|
||||
- Access a Linux sandbox environment with internet connection
|
||||
- Use shell, text editor, browser, and other software
|
||||
- Write and run code in Python and various programming languages
|
||||
- Independently install required software packages and dependencies via shell
|
||||
- Deploy websites or applications and provide public access
|
||||
- Suggest that users temporarily take control of the browser for sensitive operations when necessary
|
||||
- Utilize various tools to complete user-assigned tasks step by step
|
||||
</system_capability>
|
||||
|
||||
Remember:
|
||||
1. ALWAYS follow the exact response format shown above
|
||||
2. When using str_replace, only include the minimal changes needed
|
||||
3. When using full_file_rewrite, include ALL necessary code
|
||||
4. Use appropriate tools based on the extent of changes
|
||||
5. Focus on providing accurate, helpful information
|
||||
6. Consider context and user needs in your responses
|
||||
7. Handle ambiguity gracefully by asking clarifying questions when needed
|
||||
<event_stream>
|
||||
You will be provided with a chronological event stream (may be truncated or partially omitted) containing the following types of events:
|
||||
1. Message: Messages input by actual users
|
||||
2. Action: Tool use (function calling) actions
|
||||
3. Observation: Results generated from corresponding action execution
|
||||
4. Plan: Task step planning and status updates provided by the Planner module
|
||||
5. Knowledge: Task-related knowledge and best practices provided by the Knowledge module
|
||||
6. Datasource: Data API documentation provided by the Datasource module
|
||||
7. Other miscellaneous events generated during system operation
|
||||
</event_stream>
|
||||
|
||||
<available_tools>
|
||||
You have access to these tools through XML-based tool calling:
|
||||
- create_file: Create new files with specified content
|
||||
- delete_file: Remove existing files
|
||||
- str_replace: Replace specific text in files
|
||||
- full_file_rewrite: Completely rewrite an existing file with new content
|
||||
- terminal_tool: Execute shell commands in the workspace directory
|
||||
- message_notify_user: Send a message to user without requiring a response. Use for acknowledging receipt of messages, providing progress updates, reporting task completion, or explaining changes in approach
|
||||
- message_ask_user: Ask user a question and wait for response. Use for requesting clarification, asking for confirmation, or gathering additional information
|
||||
- idle: A special tool to indicate you have completed all tasks and are entering idle state
|
||||
</available_tools>
|
||||
<methodical_workflow>
|
||||
Your workflow is deliberately methodical and thorough, not rushed. Always take sufficient time to:
|
||||
1. UNDERSTAND fully before acting
|
||||
2. PLAN comprehensively using todo.md
|
||||
3. EXECUTE one step at a time
|
||||
4. VERIFY results before moving forward
|
||||
5. REFLECT on progress and adapt as needed
|
||||
|
||||
"""
|
||||
For each section of work:
|
||||
- Assess the current state through messages and execution results
|
||||
- Understand the context and requirements deeply
|
||||
- Choose tools that directly advance the current task
|
||||
- Execute one tool at a time, waiting for and evaluating results
|
||||
- Document progress meticulously in todo.md
|
||||
</methodical_workflow>
|
||||
|
||||
<todo_driven_workflow>
|
||||
TODO.MD is your central planning tool and source of truth for all tasks. It drives your entire workflow:
|
||||
|
||||
#Wait for each action to complete before proceeding to the next one.
|
||||
RESPONSE_FORMAT = """
|
||||
<response_format>
|
||||
RESPONSE FORMAT – STRICTLY Output XML tags for tool calling
|
||||
1. COMPREHENSIVE PLANNING: Upon receiving a task, create a detailed todo.md with many structured sections:
|
||||
- Begin with 5-10 major sections covering the entire task lifecycle
|
||||
- Include thorough preparation and research sections before implementation
|
||||
- Format as markdown checklist with clear, actionable items: `- [ ] Task description`
|
||||
- Include current timestamp and task ID for tracking
|
||||
- Add estimated completion time for each section
|
||||
- Build a complete roadmap before starting execution
|
||||
|
||||
<create-file file_path="path/to/file">
|
||||
file contents here
|
||||
</create-file>
|
||||
2. SECTION-BASED PROGRESSION: Work on one complete section at a time:
|
||||
- Focus exclusively on the current section until all tasks are complete
|
||||
- Resist the urge to jump between sections
|
||||
- Complete all verification steps before moving to the next section
|
||||
- Document transition between sections with a summary of achievements
|
||||
|
||||
<str-replace file_path="path/to/file">
|
||||
<old_str>text to replace</old_str>
|
||||
<new_str>replacement text</new_str>
|
||||
</str-replace>
|
||||
3. EXECUTION COMPASS: Before EVERY tool selection, consult todo.md to:
|
||||
- Identify the next unmarked task to work on
|
||||
- Verify the task's prerequisites are complete
|
||||
- Choose tools that directly progress the active task
|
||||
- Avoid multitasking and stay focused on one item
|
||||
|
||||
<full-file-rewrite file_path="path/to/file">
|
||||
New file contents go here, replacing all existing content
|
||||
</full-file-rewrite>
|
||||
4. DELIBERATE STATE MANAGEMENT: After EACH tool execution:
|
||||
- Carefully evaluate the results before proceeding
|
||||
- Mark completed items with `- [x]` using text replacement
|
||||
- Add new discovered subtasks as needed
|
||||
- Update task progress estimates
|
||||
- Add timestamps to completed items
|
||||
- Document observations and learnings
|
||||
|
||||
<delete-file file_path="path/to/file">
|
||||
</delete-file>
|
||||
5. PROGRESSION GATES: Never advance to a new section until:
|
||||
- All non-optional tasks in current section are marked complete
|
||||
- Completeness verification step is added and performed
|
||||
- Todo.md is updated to reflect section completion
|
||||
- A clear summary of the section's outcomes is documented
|
||||
|
||||
<execute-command>
|
||||
command here
|
||||
</execute-command>
|
||||
6. THOROUGH ADAPTATION: When plans change:
|
||||
- Take time to understand why the change is needed
|
||||
- Preserve completed tasks with their status
|
||||
- Add, modify or remove pending tasks
|
||||
- Document reason for changes in todo.md
|
||||
- Re-estimate completion times
|
||||
- Ensure the modified plan maintains logical progression
|
||||
|
||||
<message-notify-user>
|
||||
Message text to display to user
|
||||
</message-notify-user>
|
||||
Always reference todo.md by line number when making decisions or reporting progress.
|
||||
</todo_driven_workflow>
|
||||
|
||||
<message-ask-user>
|
||||
Question text to present to user
|
||||
</message-ask-user>
|
||||
<agent_loop>
|
||||
You operate in a methodical, single-step agent loop guided by todo.md:
|
||||
|
||||
<idle></idle>
|
||||
1. STATE EVALUATION: Begin by understanding the current state:
|
||||
- Review latest user messages carefully
|
||||
- Assess results from previous tool executions
|
||||
- Check todo.md to identify current section and next task
|
||||
- Evaluate if preconditions for the task are met
|
||||
|
||||
</response_format>
|
||||
2. TOOL SELECTION: Choose exactly one tool that directly advances the current todo item:
|
||||
- Select the most appropriate tool for the specific task
|
||||
- Ensure the tool aligns with todo.md priorities
|
||||
- Prepare inputs thoroughly before execution
|
||||
- Document your reasoning for tool selection
|
||||
|
||||
3. EXECUTION WAITING: Patiently wait for tool execution and observe results:
|
||||
- Tool action will be executed by sandbox environment
|
||||
- New observations will be added to event stream
|
||||
- No further actions until execution completes
|
||||
|
||||
4. PROGRESS TRACKING: Update todo.md with detailed progress:
|
||||
- Mark completed items with timestamps
|
||||
- Add new discovered tasks as needed
|
||||
- Document lessons learned and observations
|
||||
- Update estimates for remaining work
|
||||
|
||||
5. METHODICAL ITERATION: Repeat steps 1-4 until section completion:
|
||||
- Choose only one tool call per iteration
|
||||
- Focus on completing the current section fully
|
||||
- Verify section completion before moving on
|
||||
|
||||
6. RESULTS SUBMISSION: When all items in todo.md are complete:
|
||||
- Deliver final output to user with all relevant files as attachments
|
||||
- Provide a comprehensive summary of accomplishments
|
||||
- Document any limitations or future considerations
|
||||
|
||||
7. STANDBY: Enter idle state and await new instructions
|
||||
</agent_loop>
|
||||
|
||||
<planner_module>
|
||||
- The planner module provides initial task structuring through the event stream
|
||||
- Upon receiving planning events, immediately translate them into detailed todo.md entries
|
||||
- Todo.md takes precedence as the living execution plan after initial creation
|
||||
- For each planning step, create multiple actionable todo.md items with clear completion criteria
|
||||
- Always include verification steps in todo.md to ensure quality of outputs
|
||||
</planner_module>
|
||||
|
||||
<knowledge_module>
|
||||
- System is equipped with knowledge and memory module for best practice references
|
||||
- Task-relevant knowledge will be provided as events in the event stream
|
||||
- Each knowledge item has its scope and should only be adopted when conditions are met
|
||||
- When relevant knowledge is provided, add appropriate todo.md items to incorporate it
|
||||
</knowledge_module>
|
||||
|
||||
<datasource_module>
|
||||
- System is equipped with data API module for accessing authoritative datasources
|
||||
- Available data APIs and their documentation will be provided as events in the event stream
|
||||
- Only use data APIs already existing in the event stream; fabricating non-existent APIs is prohibited
|
||||
- Prioritize using APIs for data retrieval; only use public internet when data APIs cannot meet requirements
|
||||
- Data API usage costs are covered by the system, no login or authorization needed
|
||||
- Data APIs must be called through Python code and cannot be used as tools
|
||||
- Python libraries for data APIs are pre-installed in the environment, ready to use after import
|
||||
- Save retrieved data to files instead of outputting intermediate results
|
||||
</datasource_module>
|
||||
|
||||
<datasource_module_code_example>
|
||||
weather.py:
|
||||
```python
import sys
sys.path.append('/opt/.manus/.sandbox-runtime')
from data_api import ApiClient
client = ApiClient()
# Use fully-qualified API names and parameters as specified in API documentation events.
# Always use complete query parameter format in query={...}, never omit parameter names.
weather = client.call_api('WeatherBank/get_weather', query={'location': 'Singapore'})
print(weather)
# --snip--
```
|
||||
</datasource_module_code_example>
|
||||
|
||||
<todo_format>
|
||||
Todo.md must follow this comprehensive structured format with many sections:
|
||||
```
# Task: [Task Name] - Created [Timestamp]

## 1. Task Analysis and Planning
- [ ] 1.1 Understand user requirements completely
- [ ] 1.2 Identify key components needed
- [ ] 1.3 Research similar existing solutions
- [ ] 1.4 Define success criteria and deliverables
- [ ] 1.5 Verify understanding of requirements
Estimated completion time: [Time]

## 2. Environment Setup and Preparation
- [ ] 2.1 Check current environment state
- [ ] 2.2 Install necessary dependencies
- [ ] 2.3 Set up project structure
- [ ] 2.4 Configure development tools
- [ ] 2.5 Verify environment readiness
Estimated completion time: [Time]

## 3. Research and Information Gathering
- [ ] 3.1 Search for relevant documentation
- [ ] 3.2 Study best practices
- [ ] 3.3 Collect reference materials
- [ ] 3.4 Organize findings
- [ ] 3.5 Verify information completeness and accuracy
Estimated completion time: [Time]

## 4. Design and Architecture
- [ ] 4.1 Create system architecture diagram
- [ ] 4.2 Define component interactions
- [ ] 4.3 Design data structures
- [ ] 4.4 Plan implementation approach
- [ ] 4.5 Verify design against requirements
Estimated completion time: [Time]

## 5. Implementation - Component A
- [ ] 5.1 Implement core functionality
- [ ] 5.2 Add error handling
- [ ] 5.3 Optimize performance
- [ ] 5.4 Document code
- [ ] 5.5 Verify component functionality
Estimated completion time: [Time]

## 6. Implementation - Component B
- [ ] 6.1 Implement core functionality
- [ ] 6.2 Add error handling
- [ ] 6.3 Optimize performance
- [ ] 6.4 Document code
- [ ] 6.5 Verify component functionality
Estimated completion time: [Time]

## 7. Integration and Testing
- [ ] 7.1 Integrate all components
- [ ] 7.2 Implement comprehensive tests
- [ ] 7.3 Fix identified issues
- [ ] 7.4 Verify system behavior
- [ ] 7.5 Document test results
Estimated completion time: [Time]

## 8. Deployment and Delivery
- [ ] 8.1 Prepare deployment package
- [ ] 8.2 Deploy to target environment
- [ ] 8.3 Verify deployment success
- [ ] 8.4 Document deployment process
- [ ] 8.5 Prepare user documentation
Estimated completion time: [Time]

## 9. Final Verification
- [ ] 9.1 Validate all deliverables against requirements
- [ ] 9.2 Perform final quality checks
- [ ] 9.3 Prepare comprehensive summary
- [ ] 9.4 Compile all documentation
- [ ] 9.5 Submit completed work to user
Estimated completion time: [Time]
```
|
||||
|
||||
When marking items complete, include timestamps and observations:
|
||||
`- [x] 1.1 Understand user requirements completely - Completed [Timestamp] - [Brief observation]`
|
||||
|
||||
SECTION TRANSITIONS must be documented:
|
||||
`## Completed Section: [Section Name] - [Timestamp]
|
||||
Summary: [Comprehensive summary of section achievements and insights]`
|
||||
</todo_format>
|
||||
|
||||
<message_rules>
|
||||
- Communicate with users via message tools instead of direct text responses
|
||||
- Reply immediately to new user messages before other operations
|
||||
- First reply must be brief, only confirming receipt without specific solutions
|
||||
- Events from Planner, Knowledge, and Datasource modules are system-generated, no reply needed
|
||||
- Notify users with brief explanation when changing methods or strategies
|
||||
- Message tools are divided into notify (non-blocking, no reply needed from users) and ask (blocking, reply required)
|
||||
- Actively use notify for progress updates, but reserve ask for only essential needs to minimize user disruption and avoid blocking progress
|
||||
- Provide all relevant files as attachments, as users may not have direct access to local filesystem
|
||||
- Must message users with results and deliverables before entering idle state upon task completion
|
||||
- Include todo.md status in progress updates when appropriate
|
||||
- Provide section completion summaries to users when transitioning to a new section
|
||||
</message_rules>
|
||||
|
||||
<file_rules>
|
||||
- Use file tools for reading, writing, appending, and editing to avoid string escape issues in shell commands
|
||||
- Actively save intermediate results and store different types of reference information in separate files
|
||||
- When merging text files, must use append mode of file writing tool to concatenate content to target file
|
||||
- Strictly follow requirements in <writing_rules>, and avoid using list formats in any files except todo.md
|
||||
- Check todo.md before file operations to ensure alignment with current plan
|
||||
- Create separate files for each major component or section of work
|
||||
- Maintain organized file structure with clear naming conventions
|
||||
</file_rules>
|
||||
|
||||
<info_rules>
|
||||
- Information priority: authoritative data from datasource API > web search > model's internal knowledge
|
||||
- Prefer dedicated search tools over browser access to search engine result pages
|
||||
- Snippets in search results are not valid sources; must access original pages via browser
|
||||
- Access multiple URLs from search results for comprehensive information or cross-validation
|
||||
- Conduct searches step by step: search multiple attributes of single entity separately, process multiple entities one by one
|
||||
- For each information gathering task, create corresponding todo.md items and update as information is collected
|
||||
- Take time to thoroughly understand information before proceeding
|
||||
- Document sources and key findings in separate reference files
|
||||
</info_rules>
|
||||
|
||||
<browser_rules>
|
||||
- Must use browser tools to access and comprehend all URLs provided by users in messages
|
||||
- Must use browser tools to access URLs from search tool results
|
||||
- Actively explore valuable links for deeper information, either by clicking elements or accessing URLs directly
|
||||
- Browser tools only return elements in visible viewport by default
|
||||
- Visible elements are returned as `index[:]<tag>text</tag>`, where index is for interactive elements in subsequent browser actions
|
||||
- Due to technical limitations, not all interactive elements may be identified; use coordinates to interact with unlisted elements
|
||||
- Browser tools automatically attempt to extract page content, providing it in Markdown format if successful
|
||||
- Extracted Markdown includes text beyond viewport but omits links and images; completeness not guaranteed
|
||||
- If extracted Markdown is complete and sufficient for the task, no scrolling is needed; otherwise, must actively scroll to view the entire page
|
||||
- Use message tools to suggest that the user take over the browser for sensitive operations or actions with side effects when necessary
|
||||
</browser_rules>
|
||||
|
||||
<shell_rules>
|
||||
- Avoid commands requiring confirmation; actively use -y or -f flags for automatic confirmation
|
||||
- Avoid commands with excessive output; save to files when necessary
|
||||
- Chain multiple commands with && operator to minimize interruptions
|
||||
- Use pipe operator to pass command outputs, simplifying operations
|
||||
- Use non-interactive `bc` for simple calculations, Python for complex math; never calculate mentally
|
||||
- Use `uptime` command when users explicitly request sandbox status check or wake-up
|
||||
</shell_rules>
|
||||
|
||||
<coding_rules>
|
||||
- Must save code to files before execution; direct code input to interpreter commands is forbidden
|
||||
- Write Python code for complex mathematical calculations and analysis
|
||||
- Use search tools to find solutions when encountering unfamiliar problems
|
||||
- For index.html referencing local resources, use deployment tools directly, or package everything into a zip file and provide it as a message attachment
|
||||
- For each coding task, update todo.md with specific implementation steps and verification criteria
|
||||
- Document code thoroughly with comments explaining purpose and functionality
|
||||
- Implement error handling and edge case management
|
||||
- Write modular, maintainable code following best practices
|
||||
</coding_rules>
|
||||
|
||||
<deploy_rules>
|
||||
- All services can be temporarily accessed externally via expose port tool; static websites and specific applications support permanent deployment
|
||||
- Users cannot directly access sandbox environment network; expose port tool must be used when providing running services
|
||||
- Expose port tool returns public proxied domains with port information encoded in prefixes, no additional port specification needed
|
||||
- Determine public access URLs based on proxied domains, send complete public URLs to users, and emphasize their temporary nature
|
||||
- For web services, must first test access locally via browser
|
||||
- When starting services, must listen on 0.0.0.0, avoid binding to specific IP addresses or Host headers to ensure user accessibility
|
||||
- For deployable websites or applications, ask users if permanent deployment to production environment is needed
|
||||
</deploy_rules>
|
||||
|
||||
<writing_rules>
|
||||
- Write content in continuous paragraphs using varied sentence lengths for engaging prose; avoid list formatting
|
||||
- Use prose and paragraphs by default; only employ lists when explicitly requested by users
|
||||
- All writing must be highly detailed with a minimum length of several thousand words, unless user explicitly specifies length or format requirements
|
||||
- When writing based on references, actively cite original text with sources and provide a reference list with URLs at the end
|
||||
- For lengthy documents, first save each section as separate draft files, then append them sequentially to create the final document
|
||||
- During final compilation, no content should be reduced or summarized; the final length must exceed the sum of all individual draft files
|
||||
</writing_rules>
|
||||
|
||||
<error_handling>
|
||||
- Tool execution failures are provided as events in the event stream
|
||||
- When errors occur, first verify tool names and arguments
|
||||
- Attempt to fix issues based on error messages; if unsuccessful, try alternative methods
|
||||
- When multiple approaches fail, report failure reasons to user and request assistance
|
||||
- Add error recovery steps to todo.md when errors occur
|
||||
- Document errors and solutions for future reference
|
||||
</error_handling>
|
||||
|
||||
<sandbox_environment>
|
||||
System Environment:
|
||||
- Ubuntu 22.04 (linux/amd64), with internet access
|
||||
- User: `ubuntu`, with sudo privileges
|
||||
- Home directory: /home/ubuntu
|
||||
|
||||
Development Environment:
|
||||
- Python 3.10.12 (commands: python3, pip3)
|
||||
- Node.js 20.18.0 (commands: node, npm)
|
||||
- Basic calculator (command: bc)
|
||||
|
||||
Sleep Settings:
|
||||
- Sandbox environment is immediately available at task start, no check needed
|
||||
- Inactive sandbox environments automatically sleep and wake up
|
||||
</sandbox_environment>
|
||||
|
||||
<tool_use_rules>
|
||||
- Must respond with a tool use (function calling); plain text responses are forbidden
|
||||
- Do not mention any specific tool names to users in messages
|
||||
- Carefully verify available tools; do not fabricate non-existent tools
|
||||
- Events may originate from other system modules; only use explicitly provided tools
|
||||
- Before selecting any tool, check todo.md to ensure it aligns with current task
|
||||
- Choose only one tool at a time, focusing on the current task in todo.md
|
||||
- Ensure thorough understanding of a tool's purpose and parameters before use
|
||||
</tool_use_rules>
|
||||
"""
|
||||
|
||||
def get_system_prompt():
|
||||
'''
|
||||
Returns the system prompt with XML tool usage instructions.
|
||||
'''
|
||||
# return SYSTEM_PROMPT + RESPONSE_FORMAT
|
||||
return SYSTEM_PROMPT
|
||||
return SYSTEM_PROMPT
|
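Note on the todo.md rules in this hunk: the prompt asks the agent to mark finished items with `- [x]` by text replacement and to append a timestamp and a brief observation. A minimal sketch of that replacement follows. The helper name `mark_done` and its parameters are hypothetical and are not part of this commit; the output line format simply mirrors the example given in `<todo_format>`.

```python
import re
from datetime import datetime, timezone
from typing import Optional


def mark_done(todo_text: str, item_id: str, note: str,
              now: Optional[datetime] = None) -> str:
    """Mark the unchecked item numbered `item_id` (e.g. "1.1") as complete.

    Hypothetical helper for illustration only; not code from this commit.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%d %H:%M")
    # Match one unchecked checklist line such as "- [ ] 1.1 Understand ...".
    # The trailing space after the id keeps "1.1" from matching "1.10".
    pattern = re.compile(
        r"^- \[ \] (" + re.escape(item_id) + r" .*)$", re.MULTILINE
    )
    # A function replacement avoids interpreting backslashes in `note`.
    return pattern.sub(
        lambda m: f"- [x] {m.group(1)} - Completed {stamp} - {note}",
        todo_text,
        count=1,
    )
```

Items that are already checked, or whose number does not match, are left untouched, so calling the helper twice on the same item is harmless.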
|
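Note on the XML tool-call format: the `RESPONSE_FORMAT` examples in this hunk use tags such as `<create-file file_path="...">`, `<str-replace>`, `<execute-command>` and `<idle></idle>`. The parser that consumes them is not shown in this diff, so the following is only a sketch of how such tags could be pulled out of model output. It is regex-based, assumes the tag names listed in the examples, and is not a full XML parser (file contents that themselves contain a closing tag would confuse it). `extract_tool_calls` is a hypothetical name.

```python
import re
from typing import Dict, List

# Tag names taken from the RESPONSE_FORMAT examples in this diff.
TOOL_TAGS = (
    "create-file", "str-replace", "full-file-rewrite", "delete-file",
    "execute-command", "message-notify-user", "message-ask-user", "idle",
)

# <tag attrs>body</tag>, where the closing tag must repeat the opening name.
_TAG_RE = re.compile(
    r"<(?P<tag>" + "|".join(re.escape(t) for t in TOOL_TAGS) + r")"
    r"(?P<attrs>[^>]*)>(?P<body>.*?)</(?P=tag)>",
    re.DOTALL,
)
# key="value" pairs inside the opening tag.
_ATTR_RE = re.compile(r'([\w-]+)="([^"]*)"')


def extract_tool_calls(text: str) -> List[Dict[str, object]]:
    """Return the tool calls found in `text`, in order of appearance."""
    calls: List[Dict[str, object]] = []
    for match in _TAG_RE.finditer(text):
        calls.append({
            "tool": match.group("tag"),
            "attrs": dict(_ATTR_RE.findall(match.group("attrs"))),
            "body": match.group("body").strip(),
        })
    return calls
```

Nested tags such as `<old_str>` and `<new_str>` stay inside the `body` of their `<str-replace>` call and would need a second pass.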
@ -15,7 +15,7 @@ from agent.tools.utils.daytona_sandbox import daytona, create_sandbox
|
|||
from daytona_api_client.models.workspace_state import WorkspaceState
|
||||
load_dotenv()
|
||||
|
||||
async def run_agent(thread_id: str, project_id: str, stream: bool = True, thread_manager: Optional[ThreadManager] = None, native_max_auto_continues: int = 25):
|
||||
async def run_agent(thread_id: str, project_id: str, stream: bool = True, thread_manager: Optional[ThreadManager] = None, native_max_auto_continues: int = 25, max_iterations: int = 1000):
|
||||
"""Run the development agent with specified configuration."""
|
||||
|
||||
if not thread_manager:
|
||||
|
@ -52,56 +52,84 @@ async def run_agent(thread_id: str, project_id: str, stream: bool = True, thread
|
|||
|
||||
system_message = { "role": "system", "content": get_system_prompt() }
|
||||
|
||||
model_name = "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0"
|
||||
# model_name = "anthropic/claude-3-5-sonnet-latest"
|
||||
model_name = "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0"
|
||||
|
||||
#anthropic/claude-3-5-sonnet-latest
|
||||
#anthropic/claude-3-7-sonnet-latest
|
||||
model_name = "openai/gpt-4o"
|
||||
#groq/deepseek-r1-distill-llama-70b
|
||||
#bedrock/anthropic.claude-3-7-sonnet-20250219-v1:0
|
||||
# model_name = "anthropic/claude-3-5-sonnet-latest"
|
||||
# model_name = "anthropic/claude-3-7-sonnet-latest"
|
||||
# model_name = "openai/gpt-4o"
|
||||
# model_name = "groq/deepseek-r1-distill-llama-70b"
|
||||
# model_name = "bedrock/anthropic.claude-3-7-sonnet-20250219-v1:0"
|
||||
# model_name = "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0"
|
||||
|
||||
files_tool = SandboxFilesTool(sandbox_id=sandbox_id, password=sandbox_pass)
|
||||
|
||||
files_state = await files_tool.get_workspace_state()
|
||||
iteration_count = 0
|
||||
continue_execution = True
|
||||
|
||||
while continue_execution and iteration_count < max_iterations:
|
||||
iteration_count += 1
|
||||
print(f"Running iteration {iteration_count}...")
|
||||
|
||||
files_state = await files_tool.get_workspace_state()
|
||||
|
||||
state_message = {
|
||||
"role": "user",
|
||||
"content": f"""
|
||||
state_message = {
|
||||
"role": "user",
|
||||
"content": f"""
|
||||
Current development environment workspace state:
|
||||
<current_workspace_state>
|
||||
{json.dumps(files_state, indent=2)}
|
||||
</current_workspace_state>
|
||||
"""
|
||||
}
|
||||
"""
|
||||
}
|
||||
|
||||
response = await thread_manager.run_thread(
|
||||
thread_id=thread_id,
|
||||
system_prompt=system_message,
|
||||
stream=stream,
|
||||
temporary_message=state_message,
|
||||
llm_model=model_name,
|
||||
llm_temperature=0.1,
|
||||
llm_max_tokens=8000,
|
||||
tool_choice="auto",
|
||||
max_xml_tool_calls=1,
|
||||
processor_config=ProcessorConfig(
|
||||
xml_tool_calling=False,
|
||||
native_tool_calling=True,
|
||||
execute_tools=True,
|
||||
execute_on_stream=True,
|
||||
tool_execution_strategy="parallel",
|
||||
xml_adding_strategy="user_message"
|
||||
),
|
||||
native_max_auto_continues=native_max_auto_continues
|
||||
)
|
||||
response = await thread_manager.run_thread(
|
||||
thread_id=thread_id,
|
||||
system_prompt=system_message,
|
||||
stream=stream,
|
||||
temporary_message=state_message,
|
||||
llm_model=model_name,
|
||||
llm_temperature=0.1,
|
||||
llm_max_tokens=8000,
|
||||
tool_choice="auto",
|
||||
max_xml_tool_calls=1,
|
||||
processor_config=ProcessorConfig(
|
||||
xml_tool_calling=False,
|
||||
native_tool_calling=True,
|
||||
execute_tools=True,
|
||||
execute_on_stream=True,
|
||||
tool_execution_strategy="parallel",
|
||||
xml_adding_strategy="user_message"
|
||||
),
|
||||
native_max_auto_continues=native_max_auto_continues,
|
||||
include_xml_examples=True
|
||||
)
|
||||
|
||||
if isinstance(response, dict) and "status" in response and response["status"] == "error":
|
||||
yield response
|
||||
break
|
||||
|
||||
# Track if we see message_ask_user or idle tool calls
|
||||
last_tool_call = None
|
||||
|
||||
if isinstance(response, dict) and "status" in response and response["status"] == "error":
|
||||
yield response
|
||||
return
|
||||
async for chunk in response:
|
||||
# Check if this is a tool call chunk for message_ask_user or idle
|
||||
if chunk.get('type') == 'tool_call':
|
||||
tool_call = chunk.get('tool_call', {})
|
||||
function_name = tool_call.get('function', {}).get('name', '')
|
||||
if function_name in ['message_ask_user', 'idle']:
|
||||
last_tool_call = function_name
|
||||
|
||||
yield chunk
|
||||
|
||||
async for chunk in response:
|
||||
yield chunk
|
||||
# Check if we should stop based on the last tool call
|
||||
if last_tool_call in ['message_ask_user', 'idle']:
|
||||
print(f"Agent decided to stop with tool: {last_tool_call}")
|
||||
continue_execution = False
|
||||
|
||||
|
||||
|
||||
|
||||
# TESTING
|
||||
|
||||
async def test_agent():
|
||||
"""Test function to run the agent with a sample query"""
|
||||
|
|
|
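Note on the loop added to `run_agent`: the new code keeps calling the model until either `max_iterations` is reached or the last tool call of an iteration is `message_ask_user` or `idle`. The sketch below isolates that control flow so it can be run without a sandbox or LLM. `run_loop`, `run_once` and `make_fake_run` are hypothetical names introduced here; the chunk shape (`type`, `tool_call`, `function`, `name`) is copied from the diff, and the fake stream stands in for `thread_manager.run_thread`.

```python
import asyncio
from typing import AsyncIterator, Callable, Dict, Iterable, List

STOP_TOOLS = {"message_ask_user", "idle"}


async def run_loop(
    run_once: Callable[[], AsyncIterator[Dict]],
    max_iterations: int = 1000,
) -> AsyncIterator[Dict]:
    """Re-run `run_once` until a stop tool is seen or the cap is hit."""
    iteration_count = 0
    continue_execution = True
    while continue_execution and iteration_count < max_iterations:
        iteration_count += 1
        last_tool_call = None  # reset per iteration, as in the diff
        async for chunk in run_once():
            if chunk.get("type") == "tool_call":
                name = chunk.get("tool_call", {}).get("function", {}).get("name", "")
                if name in STOP_TOOLS:
                    last_tool_call = name
            yield chunk
        if last_tool_call in STOP_TOOLS:
            continue_execution = False


def make_fake_run(scripts: Iterable[List[Dict]]) -> Callable[[], AsyncIterator[Dict]]:
    """Stand-in for the model call: each invocation streams the next script."""
    remaining = iter(scripts)

    async def _stream() -> AsyncIterator[Dict]:
        for chunk in next(remaining):
            yield chunk

    return _stream
```

The stop check runs only after the whole stream of an iteration has been forwarded, so the caller still receives every chunk of the final iteration.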
@ -1,184 +0,0 @@
|
|||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>Understanding Artificial Intelligence</title>
|
||||
<style>
|
||||
body {
|
||||
font-family: 'Segoe UI', Arial, sans-serif;
|
||||
margin: 0;
|
||||
padding: 20px;
|
||||
background: linear-gradient(135deg, #f5f7fa 0%, #c3cfe2 100%);
|
||||
}
|
||||
.slide {
|
||||
background: white;
|
||||
margin: 20px auto;
|
||||
padding: 40px;
|
||||
border-radius: 15px;
|
||||
box-shadow: 0 5px 15px rgba(0,0,0,0.1);
|
||||
max-width: 800px;
|
||||
transition: transform 0.3s ease;
|
||||
}
|
||||
.slide:hover {
|
||||
transform: translateY(-5px);
|
||||
}
|
||||
.slide-title {
|
||||
font-size: 36px;
|
||||
color: #2c3e50;
|
||||
margin-bottom: 30px;
|
||||
border-bottom: 3px solid #3498db;
|
||||
padding-bottom: 10px;
|
||||
}
|
||||
.content {
|
||||
font-size: 24px;
|
||||
line-height: 1.6;
|
||||
}
|
||||
.bullet-points {
|
||||
font-size: 22px;
|
||||
line-height: 1.8;
|
||||
}
|
||||
.bullet-points li {
|
||||
margin-bottom: 15px;
|
||||
padding-left: 10px;
|
||||
}
|
||||
.highlight {
|
||||
color: #3498db;
|
||||
font-weight: 600;
|
||||
}
|
||||
.icon {
|
||||
margin-right: 10px;
|
||||
color: #3498db;
|
||||
}
|
||||
@keyframes fadeIn {
|
||||
from { opacity: 0; transform: translateY(20px); }
|
||||
to { opacity: 1; transform: translateY(0); }
|
||||
}
|
||||
.slide {
|
||||
animation: fadeIn 0.5s ease-out forwards;
|
||||
}
|
||||
.progress-bar {
|
||||
position: fixed;
|
||||
top: 0;
|
||||
left: 0;
|
||||
height: 4px;
|
||||
background: #3498db;
|
||||
width: 0;
|
||||
transition: width 0.3s ease;
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="progress-bar" id="progressBar"></div>
|
||||
|
||||
<!-- Title Slide -->
|
||||
<div class="slide">
|
||||
<h1 class="slide-title" style="font-size: 48px; text-align: center;">Understanding Artificial Intelligence</h1>
|
||||
<p style="text-align: center; font-size: 24px;">A Comprehensive Overview</p>
|
||||
<p style="text-align: center; font-size: 18px; color: #666;">Exploring the Future of Technology</p>
|
||||
</div>
|
||||
|
||||
<!-- What is AI? -->
|
||||
<div class="slide">
|
||||
<h2 class="slide-title">What is Artificial Intelligence?</h2>
|
||||
<div class="content">
|
||||
<p>Artificial Intelligence (AI) is the simulation of human intelligence by machines programmed to think and learn like humans.</p>
|
||||
<ul class="bullet-points">
|
||||
<li>🧠 Ability to learn from experience</li>
|
||||
<li>🔄 Adapt to new inputs</li>
|
||||
<li>🎯 Perform human-like tasks</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Types of AI -->
|
||||
<div class="slide">
|
||||
<h2 class="slide-title">Types of AI</h2>
|
||||
<div class="content">
|
||||
<ul class="bullet-points">
|
||||
<li><span class="highlight">Narrow AI:</span> Designed for specific tasks (e.g., facial recognition, playing chess)</li>
|
||||
<li><span class="highlight">General AI:</span> Human-level intelligence across various domains (still theoretical)</li>
|
||||
<li><span class="highlight">Super AI:</span> Hypothetical AI surpassing human intelligence in all aspects</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Applications -->
|
||||
<div class="slide">
|
||||
<h2 class="slide-title">Real-World Applications</h2>
|
||||
<div class="content">
|
||||
<ul class="bullet-points">
|
||||
<li>🏥 Healthcare diagnostics and drug discovery</li>
|
||||
<li>🚗 Autonomous vehicles and transportation</li>
|
||||
<li>🗣️ Virtual assistants (Siri, Alexa, Google Assistant)</li>
|
||||
<li>💹 Financial trading and fraud detection</li>
|
||||
<li>🏭 Manufacturing robotics and automation</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- AI Technologies -->
|
||||
<div class="slide">
|
||||
<h2 class="slide-title">Key AI Technologies</h2>
|
||||
<div class="content">
|
||||
<ul class="bullet-points">
|
||||
<li><span class="highlight">Machine Learning:</span> Systems that improve through experience</li>
|
||||
<li><span class="highlight">Deep Learning:</span> Neural networks mimicking human brain function</li>
|
||||
<li><span class="highlight">Natural Language Processing:</span> Understanding and generating human language</li>
|
||||
<li><span class="highlight">Computer Vision:</span> Enabling machines to interpret visual world</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Future of AI -->
|
||||
<div class="slide">
|
||||
<h2 class="slide-title">The Future of AI</h2>
|
||||
<div class="content">
|
||||
<ul class="bullet-points">
|
||||
<li>🎯 Enhanced personalization in services</li>
|
||||
<li>🤖 Advanced robotics and automation</li>
|
||||
<li>💊 Revolutionary healthcare solutions</li>
|
||||
<li>🌆 Smart cities and infrastructure</li>
|
||||
<li>🌍 Environmental protection and climate solutions</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Challenges -->
|
||||
<div class="slide">
|
||||
<h2 class="slide-title">Challenges and Considerations</h2>
|
||||
<div class="content">
|
||||
<ul class="bullet-points">
|
||||
<li>⚖️ Ethical concerns and moral decisions</li>
|
||||
<li>🔒 Privacy and data protection</li>
|
||||
<li>💼 Workforce transformation and adaptation</li>
|
||||
<li>⚠️ Bias and fairness in AI systems</li>
|
||||
<li>🛡️ Safety and security concerns</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Conclusion -->
|
||||
<div class="slide">
|
||||
<h2 class="slide-title">Conclusion</h2>
|
||||
<div class="content">
|
||||
<p>AI is transforming our world and will continue to play an increasingly important role in shaping our future.</p>
|
||||
<ul class="bullet-points">
|
||||
<li>🚀 Rapid advancement in technology</li>
|
||||
<li>🌐 Wide-ranging impact across industries</li>
|
||||
<li>🤝 Need for responsible development and governance</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<script>
|
||||
// Progress bar functionality
|
||||
window.onscroll = function() {
|
||||
let winScroll = document.body.scrollTop || document.documentElement.scrollTop;
|
||||
let height = document.documentElement.scrollHeight - document.documentElement.clientHeight;
|
||||
let scrolled = (winScroll / height) * 100;
|
||||
document.getElementById("progressBar").style.width = scrolled + "%";
|
||||
};
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
|
@ -124,7 +124,7 @@ class ThreadManager:
|
|||
# Ensure function.arguments is a string
|
||||
if 'arguments' in tool_call['function'] and not isinstance(tool_call['function']['arguments'], str):
|
||||
# Log and fix the issue
|
||||
- logger.warning(f"Found non-string arguments in tool_call, converting to string")
|
||||
+ # logger.warning(f"Found non-string arguments in tool_call, converting to string")
|
||||
tool_call['function']['arguments'] = json.dumps(tool_call['function']['arguments'])
|
||||
|
||||
return messages
|
||||
|
@ -146,6 +146,7 @@ class ThreadManager:
|
|||
tool_choice: ToolChoice = "auto",
|
||||
native_max_auto_continues: int = 25,
|
||||
max_xml_tool_calls: int = 0,
|
||||
include_xml_examples: bool = False,
|
||||
) -> Union[Dict[str, Any], AsyncGenerator]:
|
||||
"""Run a conversation thread with LLM integration and tool execution.
|
||||
|
||||
|
@ -162,6 +163,7 @@ class ThreadManager:
|
|||
native_max_auto_continues: Maximum number of automatic continuations when
|
||||
finish_reason="tool_calls" (0 disables auto-continue)
|
||||
max_xml_tool_calls: Maximum number of XML tool calls to allow (0 = no limit)
|
||||
include_xml_examples: Whether to include XML tool examples in the system prompt
|
||||
|
||||
Returns:
|
||||
An async generator yielding response chunks or error dict
|
||||
|
@ -189,6 +191,31 @@ class ThreadManager:
|
|||
if max_xml_tool_calls > 0:
|
||||
processor_config.max_xml_tool_calls = max_xml_tool_calls
|
||||
|
||||
# Add XML examples to system prompt if requested
|
||||
if include_xml_examples and processor_config.xml_tool_calling:
|
||||
xml_examples = self.tool_registry.get_xml_examples()
|
||||
if xml_examples:
|
||||
# logger.debug(f"Adding {len(xml_examples)} XML examples to system prompt")
|
||||
|
||||
# Create or append to content
|
||||
if isinstance(system_prompt['content'], str):
|
||||
examples_content = """
|
||||
|
||||
In this environment you have access to a set of tools you can use to answer the user's question. The tools are specified in XML format.
|
||||
{{ FORMATTING INSTRUCTIONS }}
|
||||
String and scalar parameters should be specified as attributes, while content goes between tags.
|
||||
Note that spaces for string values are not stripped. The output is parsed with regular expressions.
|
||||
|
||||
Here are the XML tools available with examples:
|
||||
"""
|
||||
for tag_name, example in xml_examples.items():
|
||||
examples_content += f"<{tag_name}> Example: {example}\n"
|
||||
|
||||
system_prompt['content'] += examples_content
|
||||
else:
|
||||
# If content is not a string (might be a list or dict), log a warning
|
||||
logger.warning("System prompt content is not a string, cannot add XML examples")
|
||||
|
||||
# 1. Get messages from thread for LLM call
|
||||
messages = await self.get_messages(thread_id)
|
||||
|
||||
|
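The `include_xml_examples` hunk above appends one example line per XML tag to a string system prompt and logs a warning when the content is not a string. A minimal sketch of that step in isolation follows. The function name `append_xml_examples`, the `HEADER` constant, and the `create-file` tag in the usage note are hypothetical. Unlike the diff, which mutates `system_prompt['content']` in place, this version returns a new dictionary so the caller's prompt is left untouched.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger("sketch")

# Shortened stand-in for the instruction text added in the diff.
HEADER = "\nHere are the XML tools available with examples:\n"


def append_xml_examples(
    system_prompt: Dict[str, Any],
    xml_examples: Dict[str, str],
) -> Dict[str, Any]:
    """Return system_prompt with one example line per XML tag appended."""
    if not xml_examples:
        # Nothing to add; mirrors the `if xml_examples:` guard in the diff.
        return system_prompt
    content = system_prompt.get("content")
    if not isinstance(content, str):
        # Content may be a list or dict; leave it unchanged, as the diff does.
        logger.warning("System prompt content is not a string, cannot add XML examples")
        return system_prompt
    lines = [f"<{tag}> Example: {example}\n" for tag, example in xml_examples.items()]
    return {**system_prompt, "content": content + HEADER + "".join(lines)}
```

For instance, calling it with `{"role": "system", "content": "Base."}` and a single hypothetical `create-file` example yields a prompt whose content starts with `Base.` and ends with the `<create-file> Example: ...` line.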
|
File diff suppressed because one or more lines are too long
|
@ -46,6 +46,7 @@ python-ripgrep = "0.0.6"
|
|||
daytona_sdk = "^0.12.0"
|
||||
boto3 = "^1.34.0"
|
||||
openai = "^1.72.0"
|
||||
streamlit = "^1.44.1"
|
||||
|
||||
[tool.poetry.scripts]
|
||||
agentpress = "agentpress.cli:main"
|
||||
|
|
|
@ -83,7 +83,7 @@ def setup_logger(name: str = 'agentpress') -> logging.Logger:
|
|||
|
||||
# Console handler
|
||||
console_handler = logging.StreamHandler(sys.stdout)
|
||||
- console_handler.setLevel(logging.DEBUG)
|
||||
+ console_handler.setLevel(logging.INFO)
|
||||
|
||||
# Create formatters
|
||||
file_formatter = logging.Formatter(
|
||||
|
|