refactor(prompt): update guidelines for processing small and large text files

sharath 2025-06-08 22:41:42 +00:00
parent 9800c85318
commit 9547774dfc
No known key found for this signature in database
1 changed file with 11 additions and 23 deletions


@@ -232,33 +232,21 @@ You have the ability to execute operations using both Python and CLI tools:
 4. xls2csv: Convert Excel to CSV
 ### 4.1.2 TEXT & DATA PROCESSING
-IMPORTANT: Use the `cat` command to view contents of small files (less than 100 kb) whenever possible. Only use other commands and processing when absolutely necessary.
+IMPORTANT: Use the `cat` command to view contents of small files (100 kb or less). For files larger than 100 kb, do not use `cat` to read the entire file; instead, use commands like `head`, `tail`, or similar to preview or read only part of the file. Only use other commands and processing when absolutely necessary for data extraction or transformation.
-- Distinguish between small and large text files
+- Distinguish between small and large text files:
 1. ls -lh: Get file size
 - Use `ls -lh <file_path>` to get file size
-- Small text files (less than 100 kb)
+- Small text files (100 kb or less):
 1. cat: View contents of small files
-- Use `cat <file_path>` to view contents of small files
+- Use `cat <file_path>` to view the entire file
-- Large text files processing (more than 100 kb):
-Don't use `cat` to view contents of large files.
-Use the following commands instead. You may also use Python once you determine how to process the file.
-1. grep: Pattern matching
-- Use -n to get line numbers
-- Use -i for case-insensitive
-- Use -r for recursive search
-- Use -A, -B, -C for context
-2. awk: Column processing
-- Use for structured data
-- Use for data transformation
-3. sed: Stream editing
-- Use for text replacement
-- Use for pattern matching
-- Use `sed -n 'start,endp'` to get a specific range of lines. You may extract upto 1000 lines at a time.
+- Large text files (over 100 kb):
+1. head/tail: View file parts
+- Use `head <file_path>` or `tail <file_path>` to preview content
+2. less: View large files interactively
+3. grep, awk, sed: For searching, extracting, or transforming data in large files
 - File Analysis:
 1. file: Determine file type
 2. wc: Count words/lines
-3. head/tail: View file parts
-4. less: View large files
 - Data Processing:
 1. jq: JSON processing
 - Use for JSON extraction
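The size rule this hunk introduces can be sketched as a small shell helper (a minimal sketch; `view_file` and the temporary files are hypothetical illustrations, not part of the prompt):

```shell
#!/bin/sh
# Sketch of the 100 kb rule: cat small files whole, preview large ones.
view_file() {
  size=$(wc -c < "$1")          # file size in bytes
  if [ "$size" -le 102400 ]; then
    cat "$1"                    # small file (100 kb or less): print it whole
  else
    head -n 100 "$1"            # large file: preview the first 100 lines only
  fi
}

tmp=$(mktemp)
printf 'hello\n' > "$tmp"
view_file "$tmp"                # small file, so the whole file is printed
rm -f "$tmp"
```

Using `wc -c` rather than parsing `ls -lh` keeps the comparison in plain bytes (100 kb = 102400).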
@@ -279,7 +267,7 @@ IMPORTANT: Use the `cat` command to view contents of small files (less than 100
 - Use -l to list matching files
 - Use -n to show line numbers
 - Use -A, -B, -C for context lines
-2. head/tail: View file beginnings/endings
+2. head/tail: View file beginnings/endings (for large files)
 - Use -n to specify number of lines
 - Use -f to follow file changes
 3. awk: Pattern scanning and processing
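The `grep` and `head` flags listed in this hunk's context can be exercised on a toy log (a sketch; `app.log` and its contents are hypothetical, created here only for illustration):

```shell
#!/bin/sh
# Toy log file to demonstrate the grep/head flags from the prompt.
log=$(mktemp)
printf 'ok\nERROR: disk full\nretrying\nok\n' > "$log"

grep -n -A1 'ERROR' "$log"   # -n: show line numbers, -A1: one line of trailing context
head -n 2 "$log"             # -n: print only the first two lines
```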
@@ -300,7 +288,7 @@ IMPORTANT: Use the `cat` command to view contents of small files (less than 100
 5. Use extended regex (-E) for complex patterns
 - Data Processing Workflow:
 1. Use grep to locate relevant files
-2. Use head/tail to preview content
+2. Use cat for small files (<=100kb) or head/tail for large files (>100kb) to preview content
 3. Use awk for data extraction
 4. Use wc to verify results
 5. Chain commands with pipes for efficiency
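The five workflow steps in this hunk can be walked end to end on a small example (a sketch under assumed data; the `scores.csv` file and its columns are hypothetical):

```shell
#!/bin/sh
# Walk the locate -> preview -> extract -> verify -> pipe workflow.
dir=$(mktemp -d)
printf 'name,score\nada,90\ngrace,95\n' > "$dir/scores.csv"

grep -l 'score' "$dir"/*.csv                        # 1. locate files containing the field
head -n 1 "$dir/scores.csv"                         # 2. preview the header line
awk -F, 'NR>1 {print $2}' "$dir/scores.csv"         # 3. extract the score column
awk -F, 'NR>1 {print $2}' "$dir/scores.csv" | wc -l # 4+5. verify row count via a pipe
```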