OSINT Image Analysis - Tool Requirements¶

Overview¶

This document outlines the required tools for a local-first OSINT image analysis toolkit, organized by priority and implementation tier.

Image Analysis Pipeline

Tier 1: Essential Tools (Must Have)¶

Metadata Extraction¶

ExifTool (v13.45+)
Purpose: Comprehensive EXIF/GPS/IPTC/XMP metadata extraction
Install: brew install exiftool (macOS), apt install libimage-exiftool-perl (Linux)
License: Open source (GPL/Artistic)
Interface: CLI with JSON output support

Image Processing¶

Sharp (Node.js/Bun library)
Purpose: High-performance image processing and basic metadata
Install: bun add sharp
License: Apache-2.0
Interface: Node.js/Bun library

File Operations¶

ImageMagick (optional fallback)
Purpose: Batch processing, format conversion
Install: brew install imagemagick (macOS)
License: Apache-2.0
Interface: CLI

Tier 2: Enhanced Tools (Should Have)¶

OCR Capabilities¶

Tesseract.js
Purpose: Text extraction from images
Install: bun add tesseract.js
License: Apache-2.0
Interface: Node.js/Bun library

Perceptual Hashing¶

Sharp with custom implementations
Purpose: Duplicate detection (aHash, pHash, dHash)
Algorithms: Implement in TypeScript
License: Custom implementation

Image Forensics¶

ELA (Error Level Analysis) implementation
Purpose: Detect image manipulation
Implementation: Custom TypeScript using Sharp
Method: Save at different quality levels, compare differences

Tier 3: Advanced Tools (Nice to Have)¶

Steganography Detection¶

Binwalk
Purpose: Detect embedded files/data
Install: brew install binwalk (macOS), apt install binwalk (Linux)
License: MIT
Interface: CLI

Object Detection¶

TensorFlow.js or ONNX Runtime
Purpose: Object detection and classification
Install: bun add @tensorflow/tensorflowjs or bun add onnxruntime-node
License: Apache-2.0
Interface: Node.js library

Reverse Image Search¶

Self-hosted vector search (future)
Milvus or FAISS for local similarity search
Requires: Python or Docker setup

System Requirements¶

Development Environment¶

Runtime: Bun 1.0+
Language: TypeScript 5.0+
Package Manager: Bun
Testing: Bun's built-in test runner

External Dependencies¶

ExifTool CLI (must be in PATH)
Optional: ImageMagick CLI
Optional: Binwalk CLI

Tool Wrapper Architecture¶

Core Components¶

// Tool interfaces and implementations
src/tools/
├── types.ts              # Shared interfaces and types
├── metadata-extractor.ts # ExifTool wrapper
├── image-processor.ts    # Sharp-based operations
├── ocr-engine.ts         # Tesseract.js wrapper
├── hash-calculator.ts    # Perceptual hashing
├── forensic-analyzer.ts  # ELA and manipulation detection
└── index.ts              # Main exports

// Tests
tests/
├── metadata-extractor.test.ts
├── image-processor.test.ts
├── ocr-engine.test.ts
├── hash-calculator.test.ts
├── forensic-analyzer.test.ts
└── fixtures/             # Test images

API Design¶

All tools follow a consistent interface:

interface ToolWrapper {
  name: string;
  version: string;
  isAvailable(): Promise<boolean>;
  process(input: string | Buffer): Promise<Result>;
  batch?(inputs: string[]): Promise<Result[]>;
}

Installation Commands¶

macOS¶

# Essential tools
brew install exiftool

# Optional enhancements
brew install imagemagick binwalk

# Node/Bun dependencies
bun install

Linux (Debian/Ubuntu)¶

# Essential tools
sudo apt-get install libimage-exiftool-perl

# Optional enhancements
sudo apt-get install imagemagick binwalk

# Node/Bun dependencies
bun install

Testing Strategy¶

Unit Test Coverage¶

Tool availability checks
Input validation
Successful operation scenarios
Error handling
Edge cases (corrupt files, missing metadata)

Test Fixtures¶

Sample images with EXIF data
Images without metadata
Corrupted images
Images with text (for OCR)
Known manipulated images (for forensics)

Mock Strategy¶

Mock CLI tool executions for unit tests
Use integration tests for actual tool validation
Provide fixtures for consistent testing

Success Criteria¶

All essential tools (Tier 1) are wrapped and tested
Unit test coverage > 80%
All tests pass consistently
Clear error messages for missing tools
Type-safe implementations with TypeScript
Comprehensive documentation for each tool

Future Enhancements¶

Integration with Python ML libraries via child processes
Docker container for consistent tool availability
Plugin system for adding new analyzers
Web UI for batch analysis
Report generation (HTML, PDF, JSON)