CLI Commands Reference
Complete reference for all command-line interface commands and options.
Command Overview
The Workflow Bank Statement Separator provides a multi-command CLI:
Available Commands
- process: Process PDF files containing multiple bank statements
- process-paperless: Process documents from a paperless-ngx repository
- batch-process: Process multiple PDF files from a directory
- quarantine-status: View quarantine directory status and recent failures
- quarantine-clean: Clean old files from the quarantine directory
- env-help: Display comprehensive environment variable documentation
- version: Display version and contact information
Process Command
Process a PDF file containing multiple bank statements.
Syntax
Arguments
| Argument | Description | Required |
|---|---|---|
| INPUT_FILE | Path to PDF file to process | Yes |
Options
| Option | Short | Type | Default | Description |
|---|---|---|---|---|
| --output | -o | PATH | ./separated_statements | Output directory for separated statements |
| --env-file | | PATH | .env | Path to .env configuration file |
| --model | | CHOICE | gpt-4o-mini | LLM model to use |
| --verbose | -v | FLAG | | Enable verbose logging |
| --dry-run | | FLAG | | Analyze document without creating output files |
| --yes | -y | FLAG | | Skip confirmation prompts |
| --help | | FLAG | | Show help message |
Model Choices
| Model | Speed | Accuracy | Cost | Best For |
|---|---|---|---|---|
| gpt-4o-mini | Fast | High | Low | General use (recommended) |
| gpt-4o | Medium | Highest | High | Maximum accuracy |
| gpt-3.5-turbo | Fastest | Medium | Lowest | High-volume processing |
Examples
# Process with defaults
uv run python -m src.bank_statement_separator.main \
process statements.pdf
# Custom output directory
uv run python -m src.bank_statement_separator.main \
process statements.pdf --output ./my-statements
# Skip confirmations (useful for automation)
uv run python -m src.bank_statement_separator.main \
process statements.pdf --yes
# Use specific model with verbose output
uv run python -m src.bank_statement_separator.main \
process statements.pdf --model gpt-4o --verbose
# Dry-run analysis (no files created)
uv run python -m src.bank_statement_separator.main \
process statements.pdf --dry-run --yes
# Custom configuration file
uv run python -m src.bank_statement_separator.main \
process statements.pdf --env-file /path/to/custom.env
Output Examples
Successful Processing
Processing PDF file: statements.pdf
Document Analysis: 12 pages detected
🤖 AI Analysis: Using gpt-4o-mini model
✅ Statements detected: 2
Processing Results
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Metric                ┃ Value   ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ Total Pages Processed │ 12      │
│ Statements Detected   │ 2       │
│ Processing Time       │ 3.45s   │
│ Status                │ success │
└───────────────────────┴─────────┘
Detected Statements:
┏━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
┃ # ┃ Pages ┃ Account  ┃ Period  ┃ Bank    ┃
┡━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩
│ 1 │ 1-6   │ ****2819 │ 2015-05 │ Westpac │
│ 2 │ 7-12  │ ****2819 │ 2015-04 │ Westpac │
└───┴───────┴──────────┴─────────┴─────────┘
✅ Successfully created 2 statement files:
  westpac-2819-2015-05-21.pdf
  westpac-2819-2015-04-20.pdf
Processed input file moved to: input/processed/statements.pdf
Dry-Run Analysis
DRY RUN MODE - No files will be created
Analyzing PDF file: statements.pdf
Analysis Results
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Metric              ┃ Value ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ Total Pages         │ 12    │
│ Statements Detected │ 2     │
│ Analysis Time       │ 1.23s │
│ Would Create Files  │ 2     │
└─────────────────────┴───────┘
ℹ️ Run without --dry-run to create separated statement files
Batch Process Command
Process multiple PDF files from a directory in a single operation.
Syntax
Arguments
| Argument | Description | Required |
|---|---|---|
| INPUT_DIRECTORY | Directory containing PDF files to process | Yes |
Options
| Option | Short | Type | Default | Description |
|---|---|---|---|---|
| --output | -o | PATH | ./separated_statements | Output directory for separated statements |
| --pattern | | STRING | *.pdf | File pattern to match (glob syntax) |
| --exclude | | STRING | | Pattern to exclude from processing |
| --max-files | | INTEGER | | Maximum number of files to process |
| --env-file | | PATH | .env | Path to .env configuration file |
| --model | | CHOICE | gpt-4o-mini | LLM model to use |
| --verbose | -v | FLAG | | Enable verbose logging |
| --dry-run | | FLAG | | Analyze documents without creating output files |
| --yes | -y | FLAG | | Skip confirmation prompts |
| --help | | FLAG | | Show help message |
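To make the interaction of --pattern, --exclude, and --max-files concrete, here is a minimal Python sketch of one plausible selection order: match the glob, drop excluded names, then cap the count. The function name `select_files`, the sorting step, and the order of operations are assumptions for illustration, not the tool's actual implementation.

```python
import tempfile
from fnmatch import fnmatch
from pathlib import Path
from typing import Optional


def select_files(
    directory: Path,
    pattern: str = "*.pdf",
    exclude: Optional[str] = None,
    max_files: Optional[int] = None,
) -> list[Path]:
    """Return files matching `pattern`, minus `exclude`, capped at `max_files`."""
    # Sort so the selection is the same from run to run (an assumption).
    candidates = sorted(p for p in directory.glob(pattern) if p.is_file())
    if exclude is not None:
        candidates = [p for p in candidates if not fnmatch(p.name, exclude)]
    if max_files is not None:
        candidates = candidates[:max_files]
    return candidates


# Demo on a throwaway directory with placeholder file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("jan_2024.pdf", "feb_2024.pdf", "draft_2024.pdf", "notes.txt"):
        (root / name).touch()
    picked = select_files(root, pattern="*2024*.pdf", exclude="*draft*", max_files=10)
    print([p.name for p in picked])
```

Running a --dry-run batch first is still the reliable way to see which files the real command would pick up.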
Key Features
- Sequential Processing: Files are processed one by one to avoid system overload
- Error Isolation: Failed files are quarantined without stopping the batch
- Progress Tracking: Real-time progress display during processing
- Comprehensive Summary: Detailed batch results with success/failure metrics
- Validation Gate: All outputs validated before Paperless upload
Examples
# Process all PDFs in a directory
uv run python -m src.bank_statement_separator.main \
batch-process /path/to/pdfs
# Custom output directory
uv run python -m src.bank_statement_separator.main \
batch-process /path/to/pdfs --output ./batch-output
# Skip confirmations for automation
uv run python -m src.bank_statement_separator.main \
batch-process /path/to/pdfs --yes
# Process only files matching pattern
uv run python -m src.bank_statement_separator.main \
batch-process /path/to/pdfs --pattern "*2024*.pdf"
# Exclude specific patterns
uv run python -m src.bank_statement_separator.main \
batch-process /path/to/pdfs --exclude "*draft*"
# Limit number of files
uv run python -m src.bank_statement_separator.main \
batch-process /path/to/pdfs --max-files 10
# Production batch processing with logging
uv run python -m src.bank_statement_separator.main \
batch-process /secure/input \
--output /secure/output \
--pattern "*.pdf" \
--exclude "*test*" \
--model gpt-4o-mini \
--verbose \
--yes \
2>&1 | tee /var/log/batch-processing.log
# Dry-run to preview batch
uv run python -m src.bank_statement_separator.main \
batch-process /secure/input \
--dry-run \
--yes
Output Examples
Successful Batch Processing
Discovering files in: /path/to/pdfs
Found 5 file(s) to process
  • statement_jan_2024.pdf
  • statement_feb_2024.pdf
  • statement_mar_2024.pdf
  • statement_apr_2024.pdf
  • statement_may_2024.pdf
Starting batch processing...
Processing statement_jan_2024.pdf (1/5)
Processing statement_feb_2024.pdf (2/5)
Processing statement_mar_2024.pdf (3/5)
Processing statement_apr_2024.pdf (4/5)
Processing statement_may_2024.pdf (5/5)
Batch Processing Summary Results
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━┓
┃ Metric                ┃ Count ┃ Percentage ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━┩
│ Total Files           │ 5     │ 100%       │
│ Processed             │ 5     │ 100.0%     │
│ Successful            │ 4     │ 80.0%      │
│ Quarantined           │ 1     │ 20.0%      │
│ Uploaded to Paperless │ 12    │            │
│ Processing Time       │ 15.3s │            │
└───────────────────────┴───────┴────────────┘
✅ Successfully processed 4 files
⚠️ 1 file(s) quarantined - check error reports
Output files saved to configured directories
Batch with Errors
Discovering files in: /path/to/pdfs
Found 3 file(s) to process
Starting batch processing...
Processing corrupted.pdf (1/3)
⚠️ Error processing corrupted.pdf - moved to quarantine
Processing valid.pdf (2/3)
✅ Successfully processed valid.pdf
Processing protected.pdf (3/3)
⚠️ Error processing protected.pdf - password protected
Batch Processing Summary Results
┏━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━┓
┃ Metric          ┃ Count ┃ Percentage ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━┩
│ Total Files     │ 3     │ 100%       │
│ Processed       │ 3     │ 100.0%     │
│ Successful      │ 1     │ 33.3%      │
│ Quarantined     │ 2     │ 66.7%      │
│ Processing Time │ 8.7s  │            │
└─────────────────┴───────┴────────────┘
⚠️ Batch completed with errors
Error reports available in quarantine directory
Batch Processing Workflow
- Discovery Phase: Scan directory for matching PDF files
- Sequential Processing: Process each file individually
- Error Isolation: Failed files quarantined, batch continues
- Validation Gate: Validate outputs before Paperless upload
- Summary Report: Display comprehensive batch results
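The sequential loop with error isolation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's code: `BatchSummary`, `run_batch`, and the `process_one` callback are hypothetical names, and the validation gate and Paperless upload steps are left out.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class BatchSummary:
    total: int = 0
    successful: list[Path] = field(default_factory=list)
    quarantined: dict[Path, str] = field(default_factory=dict)  # file -> error message

    def percent(self, count: int) -> float:
        """Share of the batch as a percentage, rounded to one decimal place."""
        return round(100.0 * count / self.total, 1) if self.total else 0.0


def run_batch(files: list[Path], process_one: Callable[[Path], None]) -> BatchSummary:
    """Process files one at a time; a failure is recorded and the loop continues."""
    summary = BatchSummary(total=len(files))
    for index, path in enumerate(files, start=1):
        print(f"Processing {path.name} ({index}/{len(files)})")
        try:
            process_one(path)  # hypothetical per-file worker
        except Exception as exc:  # isolate the failure to this one file
            summary.quarantined[path] = str(exc)
        else:
            summary.successful.append(path)
    return summary


def fake_worker(path: Path) -> None:
    """Stand-in worker that fails on one placeholder file name."""
    if "corrupted" in path.name:
        raise ValueError("unreadable PDF")


result = run_batch([Path("corrupted.pdf"), Path("valid.pdf"), Path("ok.pdf")], fake_worker)
print(len(result.successful), len(result.quarantined), result.percent(len(result.successful)))
```

Catching the exception per file is what lets one bad PDF end up in the quarantined set while the remaining files are still processed.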
Performance Considerations
- Sequential vs Parallel: Uses sequential processing to avoid overwhelming system resources
- Memory Management: Each file processed independently to manage memory usage
- Error Recovery: Individual file failures don't affect other files in batch
- Progress Feedback: Real-time progress updates for long-running batches
Quarantine Status Command
View the status of the quarantine directory and recent processing failures.
Syntax
Options
| Option | Short | Type | Default | Description |
|---|---|---|---|---|
| --env-file | | PATH | .env | Path to .env configuration file |
| --verbose | -v | FLAG | | Enable verbose logging |
| --help | | FLAG | | Show help message |
Examples
# Check quarantine status
uv run python -m src.bank_statement_separator.main quarantine-status
# Verbose output with details
uv run python -m src.bank_statement_separator.main quarantine-status --verbose
Output Examples
Quarantine Status
Quarantine Directory Status
Path: /path/to/quarantine
Summary
┏━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Metric        ┃ Count ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━┩
│ Total Files   │ 3     │
│ This Week     │ 1     │
│ This Month    │ 2     │
│ Older Files   │ 1     │
│ Error Reports │ 3     │
└───────────────┴───────┘
Recent Files (Last 7 days)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ File                           ┃ Date             ┃ Reason             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ failed_20240831_143022_doc.pdf │ 2024-08-31 14:30 │ Password protected │
└────────────────────────────────┴──────────────────┴────────────────────┘
💡 Use 'quarantine-clean' command to remove old files
Empty Quarantine
Quarantine Directory Status
Path: /path/to/quarantine
✅ Quarantine directory is empty - no failed documents
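The file name in the Recent Files sample above (failed_20240831_143022_doc.pdf, shown with the date 2024-08-31 14:30) suggests a failed_<YYYYMMDD>_<HHMMSS>_<original name> convention. Assuming that pattern holds, which is an inference from the sample rather than a documented guarantee, a script could recover the quarantine time and original name as sketched below. `parse_quarantine_name` is a hypothetical helper.

```python
import re
from datetime import datetime
from typing import Optional

# Pattern inferred from the sample output: failed_<YYYYMMDD>_<HHMMSS>_<original name>
QUARANTINE_NAME = re.compile(r"^failed_(\d{8})_(\d{6})_(.+)$")


def parse_quarantine_name(filename: str) -> Optional[tuple[datetime, str]]:
    """Return (quarantine time, original file name), or None if the name does not match."""
    match = QUARANTINE_NAME.match(filename)
    if match is None:
        return None
    day, clock, original = match.groups()
    try:
        stamp = datetime.strptime(day + clock, "%Y%m%d%H%M%S")
    except ValueError:  # digits present but not a real date or time
        return None
    return stamp, original


print(parse_quarantine_name("failed_20240831_143022_doc.pdf"))
```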
Quarantine Clean Command
Clean old files from the quarantine directory with safety checks.
Syntax
Options
| Option | Short | Type | Default | Description |
|---|---|---|---|---|
| --days | | INTEGER | 30 | Clean files older than N days |
| --env-file | | PATH | .env | Path to .env configuration file |
| --dry-run | | FLAG | | Preview what would be cleaned |
| --yes | -y | FLAG | | Skip confirmation prompts |
| --verbose | -v | FLAG | | Enable verbose logging |
| --help | | FLAG | | Show help message |
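As an illustration of what "older than N days" can mean in practice, the following Python sketch lists files whose modification time is past a cutoff. It assumes age is measured from the file's modification time; the real command may instead use the timestamp embedded in the file name. `files_older_than` is a hypothetical helper, and it only lists candidates, mirroring what --dry-run previews.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Optional

SECONDS_PER_DAY = 86_400


def files_older_than(directory: Path, days: int, now: Optional[float] = None) -> list[Path]:
    """List files in `directory` last modified more than `days` days ago."""
    reference = time.time() if now is None else now
    cutoff = reference - days * SECONDS_PER_DAY
    return sorted(p for p in directory.iterdir() if p.is_file() and p.stat().st_mtime < cutoff)


# Demo: one file backdated by 40 days, one fresh file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old, recent = root / "old.pdf", root / "recent.pdf"
    old.touch()
    recent.touch()
    forty_days_ago = time.time() - 40 * SECONDS_PER_DAY
    os.utime(old, (forty_days_ago, forty_days_ago))
    print([p.name for p in files_older_than(root, days=30)])
```

Keeping selection separate from deletion makes it straightforward to show the same list in a preview and in the confirmation prompt.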
Examples
# Preview cleanup (no files deleted)
uv run python -m src.bank_statement_separator.main \
quarantine-clean --dry-run
# Clean files older than 30 days (default)
uv run python -m src.bank_statement_separator.main \
quarantine-clean
# Clean files older than 7 days with confirmation
uv run python -m src.bank_statement_separator.main \
quarantine-clean --days 7
Output Examples
Dry-Run Cleanup
QUARANTINE CLEANUP (DRY RUN)
Files older than 30 days will be identified
Cleanup Preview
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┓
┃ File                               ┃ Age     ┃ Size   ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━┩
│ failed_20240725_120000_old.pdf     │ 37 days │ 2.1 MB │
│ failed_20240720_140000_corrupt.pdf │ 42 days │ 156 KB │
└────────────────────────────────────┴─────────┴────────┘
Summary
- Files to delete: 2
- Total size to free: 2.3 MB
- Error reports to delete: 2
⚠️ Run without --dry-run to actually delete files
Actual Cleanup
QUARANTINE CLEANUP
Cleaning files older than 30 days...
⚠️ WARNING: This will permanently delete 2 files (2.3 MB)
Continue? [y/N]: y
Deleting files...
  ✓ failed_20240725_120000_old.pdf
  ✓ failed_20240720_140000_corrupt.pdf
Deleted 2 error reports
✅ Cleanup completed
- Files deleted: 2
- Space freed: 2.3 MB
- Error reports cleaned: 2
Global Options
These options are available for all commands:
Environment Help Command
Get comprehensive documentation about environment variables used to configure the application.
Syntax
Options
| Option | Type | Default | Description |
|---|---|---|---|
| --category | CHOICE | all | Show environment variables by category |
| --help | FLAG | | Show help message |
Categories
| Category | Description |
|---|---|
| all | Show all environment variables (default) |
| llm | LLM provider configuration (OpenAI, Ollama) |
| processing | Document processing and output settings |
| security | Security controls and logging configuration |
| paperless | Paperless-ngx integration settings |
| error-handling | Error recovery and document quarantine settings |
| validation | Document validation and quality checks |
Examples
# Show only LLM provider configuration
uv run bank-statement-separator env-help --category llm
# Show Paperless integration variables
uv run bank-statement-separator env-help --category paperless
# Show error handling and quarantine settings
uv run bank-statement-separator env-help --category error-handling
Output Example
Environment Variable Documentation
============================================================
💡 Use --category <name> to filter by specific category
Available categories: llm, processing, security, paperless, error-handling, validation
🤖 LLM Provider Configuration
Configure AI/LLM providers for document analysis
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Variable        ┃ Description                            ┃ Default             ┃ Required         ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ LLM_PROVIDER    │ LLM provider selection (openai, oll... │ openai              │ No               │
│ OPENAI_API_KEY  │ OpenAI API key for GPT models          │ None                │ If using Open... │
│ OPENAI_MODEL    │ OpenAI model to use                    │ gpt-4o-mini         │ No               │
│ OLLAMA_BASE_URL │ Ollama server base URL                 │ http://localhost... │ If using Olla... │
│ OLLAMA_MODEL    │ Ollama model to use                    │ llama3.2            │ No               │
│ LLM_TEMPERATURE │ LLM temperature for response random... │ 0.0                 │ No               │
│ LLM_MAX_TOKENS  │ Maximum tokens for LLM responses       │ 4000                │ No               │
└─────────────────┴────────────────────────────────────────┴─────────────────────┴──────────────────┘
Configuration Notes
• Create a .env file from .env.example to configure your environment
• Use --env-file option with commands to specify custom config file
• Most variables have sensible defaults and are optional
• Required variables depend on enabled features (LLM provider, Paperless, etc.)
More Information
• Documentation: https://madeinoz67.github.io/bank-statement-separator/
• Configuration Guide: https://madeinoz67.github.io/bank-statement-separator/getting-started/configuration/
• Environment Variables Reference: https://madeinoz67.github.io/bank-statement-separator/reference/environment-variables/
Integration with Other Commands
The help text of other commands refers to env-help for the environment variables relevant to each:
# Process command shows relevant environment variables
uv run bank-statement-separator process --help
# Paperless command shows required variables
uv run bank-statement-separator process-paperless --help
# Batch processing shows error handling variables
uv run bank-statement-separator batch-process --help
Version Command
Display version information and helpful links for support and documentation.
Syntax
Features
- Version Information: Shows current application version
- Author Details: Developer and license information
- Repository Link: Direct link to GitHub repository
- Documentation Links: Links to user documentation
- Issue Tracker: Link for reporting bugs and feature requests
Example Output
┌─────────────────────────────────────────────────────────┐
│                Bank Statement Separator                 │
│                   Version Information                   │
└─────────────────────────────────────────────────────────┘
Version: 0.3.1
Author: Stephen Eaton
License: MIT
Repository: https://github.com/madeinoz67/bank-statement-separator
Documentation: https://madeinoz67.github.io/bank-statement-separator/
Issues: https://github.com/madeinoz67/bank-statement-separator/issues
An AI-powered tool for automatically separating
multi-statement PDF files using LangChain and LangGraph.
Use Cases
- Version Checking: Verify installed version for support
- Getting Help: Quick access to documentation and issue tracker
- Development: Check version in automation scripts
Help System
# Main help
uv run bank-statement-separator --help
# Command-specific help with environment variables
uv run bank-statement-separator process --help
uv run bank-statement-separator process-paperless --help
uv run bank-statement-separator batch-process --help
uv run bank-statement-separator quarantine-status --help
uv run bank-statement-separator quarantine-clean --help
uv run bank-statement-separator env-help --help
uv run bank-statement-separator version --help
Environment Variables
Override configuration via environment variables:
# Override API key
OPENAI_API_KEY="sk-override-key" uv run python -m src.bank_statement_separator.main process input.pdf
# Disable API usage (fallback mode)
OPENAI_API_KEY="" uv run python -m src.bank_statement_separator.main process input.pdf
# Override model (OPENAI_MODEL is the variable name listed by env-help)
OPENAI_MODEL=gpt-4o uv run python -m src.bank_statement_separator.main process input.pdf
Error Handling
Exit Codes
| Code | Description |
|---|---|
| 0 | Success |
| 1 | General error |
| 2 | Invalid arguments |
| 3 | File not found |
| 4 | Permission denied |
| 5 | Processing failed |
| 6 | API error |
Common Error Messages
Automation Examples
Batch Processing Script
#!/bin/bash
# process_statements.sh
INPUT_DIR="/secure/input"
OUTPUT_DIR="/secure/output"
LOG_FILE="/var/log/bank-separator.log"
# Use the new batch-process command for efficiency
echo "Starting batch processing: $(date)" | tee -a "$LOG_FILE"
uv run bank-statement-separator \
batch-process "$INPUT_DIR" \
--output "$OUTPUT_DIR" \
--pattern "*.pdf" \
--yes \
--verbose \
2>&1 | tee -a "$LOG_FILE"
# Remove quarantine files older than 30 days (runs every time this script runs)
uv run python -m src.bank_statement_separator.main \
quarantine-clean --days 30 --yes
Cron Job Setup
# Edit crontab
crontab -e
# Add entries for automated processing
# Process statements daily at 2 AM
0 2 * * * /path/to/process_statements.sh
# Clean quarantine weekly on Sundays at 3 AM
0 3 * * 0 cd /path/to/bank-statement-separator && uv run bank-statement-separator quarantine-clean --days 30 --yes
# Check quarantine status daily
0 9 * * * cd /path/to/bank-statement-separator && uv run bank-statement-separator quarantine-status | mail -s "Daily Quarantine Status" admin@company.com
Docker Integration
# Docker run example (when available)
# Quote "$(pwd)" so paths containing spaces mount correctly
docker run --rm -v "$(pwd)":/workspace \
  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
  your-org/bank-statement-separator:latest \
  process /workspace/input.pdf --output /workspace/output --yes
Performance Tips
Optimize Processing Speed
# Use fastest model for high-volume processing
uv run python -m src.bank_statement_separator.main \
process input.pdf --model gpt-3.5-turbo
# Process without API (fastest, lower accuracy)
OPENAI_API_KEY="" uv run python -m src.bank_statement_separator.main \
process input.pdf --yes
# Skip confirmations for automation
uv run python -m src.bank_statement_separator.main \
process input.pdf --yes
Monitor Resource Usage
# Monitor memory usage
/usr/bin/time -v uv run python -m src.bank_statement_separator.main process large-file.pdf
# Monitor API usage
grep "LLM_API_CALL" /var/log/bank-separator/audit.log | tail -10
# Check processing times
grep "Processing Time" /var/log/bank-separator/processing.log
Troubleshooting Commands
Diagnostic Commands
# Test configuration (actually load it, not just import the loader)
uv run python -c "from src.bank_statement_separator.config import load_config; load_config(); print('Config OK')"
# Test API key
uv run python -c "
import openai
from src.bank_statement_separator.config import load_config
config = load_config()
if config.openai_api_key:
    client = openai.Client(api_key=config.openai_api_key)
    client.models.list()  # authenticated request; raises if the key is rejected
    print('API key valid')
else:
    print('No API key configured')
"
# Test imports
uv run python -c "import src.bank_statement_separator; print('Import OK')"
Debug Mode
# Enable debug logging
LOG_LEVEL=DEBUG uv run python -m src.bank_statement_separator.main \
process input.pdf --verbose
# Test with minimal file
uv run python -m src.bank_statement_separator.main \
process small-test.pdf --dry-run --verbose
# Check quarantine details
uv run python -m src.bank_statement_separator.main \
quarantine-status --verbose