Modern Document Scanner for the Web
Scanic is a blazing-fast, lightweight, and modern document scanner library written in JavaScript and Rust (WASM). It enables developers to detect, scan, and process documents from images directly in the browser or Node.js, with no dependencies or external services.
I have wanted to use document scanning features in web environments for years. While OpenCV makes this easy, it comes at the cost of a 30+ MB download.
Scanic combines pure JavaScript algorithms with Rust-compiled WebAssembly for performance-critical operations like Gaussian blur, Canny edge detection, and gradient calculations. This hybrid approach delivers near-native performance while maintaining JavaScript’s accessibility and a lightweight footprint.
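To make "gradient calculations" concrete, here is a minimal plain-JavaScript sketch of a Sobel gradient magnitude pass over a grayscale buffer. It is only an illustration of the kind of per-pixel work that benefits from WebAssembly; the function name `sobelMagnitude` and its signature are assumptions for this example and are not part of Scanic's API or its Rust implementation.

```javascript
// Illustrative sketch only: NOT Scanic's actual (Rust/WASM) implementation.
// `gray` is a flat array of grayscale values, row-major, length width * height.
function sobelMagnitude(gray, width, height) {
  const out = new Float32Array(width * height); // border pixels stay 0
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const i = y * width + x;
      // Horizontal gradient (responds to vertical edges)
      const gx =
        -gray[i - width - 1] + gray[i - width + 1]
        - 2 * gray[i - 1] + 2 * gray[i + 1]
        - gray[i + width - 1] + gray[i + width + 1];
      // Vertical gradient (responds to horizontal edges)
      const gy =
        -gray[i - width - 1] - 2 * gray[i - width] - gray[i - width + 1]
        + gray[i + width - 1] + 2 * gray[i + width] + gray[i + width + 1];
      out[i] = Math.hypot(gx, gy);
    }
  }
  return out;
}

// Usage: a 4x4 image with a vertical edge between columns 1 and 2.
const edgeImage = [
  0, 0, 255, 255,
  0, 0, 255, 255,
  0, 0, 255, 255,
  0, 0, 255, 255,
];
const magnitudes = sobelMagnitude(edgeImage, 4, 4);
console.log(magnitudes[5]); // strong response next to the edge
```

Edge detectors such as Canny typically start from a gradient map like this before thinning and thresholding, which is where the `lowThreshold` and `highThreshold` options come in.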
Performance-wise, I’m working to match OpenCV-based solutions while keeping the footprint small; this is an ongoing area of improvement.
This library is heavily inspired by jscanify.
Try the live demo: Open Demo
npm install scanic
Or use via CDN:
<script src="https://unpkg.com/scanic/dist/scanic.js"></script>
import { scanDocument, extractDocument } from 'scanic';

// Simple usage - just detect document
const result = await scanDocument(imageElement);
if (result.success) {
  console.log('Document found at corners:', result.corners);
}

// Extract the document (with perspective correction)
const extracted = await scanDocument(imageElement, { mode: 'extract' });
if (extracted.success) {
  document.body.appendChild(extracted.output); // Display extracted document
}

// Manual extraction with custom corner points (for image editors)
const corners = {
  topLeft: { x: 100, y: 50 },
  topRight: { x: 400, y: 60 },
  bottomRight: { x: 390, y: 300 },
  bottomLeft: { x: 110, y: 290 }
};
const manualExtract = await extractDocument(imageElement, corners);
if (manualExtract.success) {
  document.body.appendChild(manualExtract.output);
}
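If an image editor collects four corner points in arbitrary click order, they need to be arranged into the `{ topLeft, topRight, bottomRight, bottomLeft }` shape used above. The following is a minimal sketch of one common heuristic (sorting by `x + y` and by `y - x`); `orderCorners` is a hypothetical helper for illustration, not a function exported by Scanic, and the heuristic can mislabel corners when the quadrilateral is rotated by close to 45 degrees.

```javascript
// Hypothetical helper (not part of Scanic): arrange four unordered points
// into the corner object shape used by the manual extraction example.
function orderCorners(points) {
  if (points.length !== 4) {
    throw new Error('Expected exactly four points');
  }
  // Smallest x + y is the top-left corner, largest is the bottom-right.
  const bySum = [...points].sort((a, b) => (a.x + a.y) - (b.x + b.y));
  // Smallest y - x is the top-right corner, largest is the bottom-left.
  const byDiff = [...points].sort((a, b) => (a.y - a.x) - (b.y - b.x));
  return {
    topLeft: bySum[0],
    topRight: byDiff[0],
    bottomRight: bySum[3],
    bottomLeft: byDiff[3],
  };
}

// Usage: the same four points as above, supplied in shuffled order.
const ordered = orderCorners([
  { x: 390, y: 300 },
  { x: 100, y: 50 },
  { x: 110, y: 290 },
  { x: 400, y: 60 },
]);
console.log(ordered.topLeft, ordered.bottomRight);
```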
import { scanDocument } from 'scanic';

async function processDocument() {
  // Get image from file input or any source
  const imageFile = document.getElementById('fileInput').files[0];
  if (!imageFile) return; // No file selected
  const img = new Image();
  img.onload = async () => {
    try {
      // Extract and display the scanned document
      const result = await scanDocument(img, {
        mode: 'extract',
        output: 'canvas'
      });
      if (result.success) {
        // Add the extracted document to the page
        document.getElementById('output').appendChild(result.output);
        // Or get as data URL for download/display
        const dataUrl = result.output.toDataURL('image/png');
        console.log('Extracted document as data URL:', dataUrl);
      }
    } catch (error) {
      console.error('Error processing document:', error);
    } finally {
      // Release the temporary object URL once processing is done
      URL.revokeObjectURL(img.src);
    }
  };
  img.src = URL.createObjectURL(imageFile);
}

// HTML setup
// <input type="file" id="fileInput" accept="image/*" onchange="processDocument()">
// <div id="output"></div>
scanDocument(image, options?)
Main entry point for document scanning with flexible modes and output options.
Parameters:
image: HTMLImageElement, HTMLCanvasElement, or ImageData
options: Optional configuration object
  mode: String - 'detect' (default) or 'extract'
    'detect': Only detect document, return corners/contour info (no image processing)
    'extract': Extract/warp the document region
  output: String - 'canvas' (default), 'imagedata', or 'dataurl'
  debug: Boolean (default: false) - Enable debug information
  maxProcessingDimension: Number (default: 800) - Maximum dimension for processing in pixels
  lowThreshold: Number (default: 75) - Lower threshold for Canny edge detection
  highThreshold: Number (default: 200) - Upper threshold for Canny edge detection
  dilationKernelSize: Number (default: 3) - Kernel size for dilation
  dilationIterations: Number (default: 1) - Number of dilation iterations
  minArea: Number (default: 1000) - Minimum contour area for document detection
  epsilon: Number - Epsilon for polygon approximation
Returns: Promise<{ output, corners, contour, debug, success, message }>
output: Processed image (null for 'detect' mode)
corners: Object with { topLeft, topRight, bottomRight, bottomLeft } coordinates
contour: Array of contour points
success: Boolean indicating if document was detected
message: Status message
const options = {
  mode: 'extract',
  maxProcessingDimension: 1000, // Higher quality, slower processing
  lowThreshold: 50, // More sensitive edge detection
  highThreshold: 150,
  dilationKernelSize: 5, // Larger dilation kernel
  minArea: 2000, // Larger minimum document area
  debug: true // Enable debug information
};
const result = await scanDocument(imageElement, options);
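To give a feel for the quality/speed trade-off behind `maxProcessingDimension`, here is a minimal sketch of how a maximum processing dimension could translate into a working resolution, and how a point found at that resolution could be mapped back to the original image. This is an assumption about the general technique, not a description of Scanic's internals; `computeProcessingSize` and `scalePointToOriginal` are hypothetical names.

```javascript
// Hypothetical helpers (not part of Scanic's API): illustrate how a maximum
// processing dimension might bound the working resolution.
function computeProcessingSize(width, height, maxDimension = 800) {
  // Never upscale: images already within the limit keep a scale of 1.
  const scale = Math.min(1, maxDimension / Math.max(width, height));
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
    scale,
  };
}

// Map a point detected at the working resolution back to original pixels.
function scalePointToOriginal(point, scale) {
  return { x: point.x / scale, y: point.y / scale };
}

// Usage: a 4000x3000 photo with the default limit of 800.
const size = computeProcessingSize(4000, 3000);
console.log(size.width, size.height); // working resolution
console.log(scalePointToOriginal({ x: 100, y: 50 }, size.scale));
```

Under this model, raising the limit from 800 to 1000 increases the number of processed pixels by roughly 56%, which is why higher values trade speed for detail.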
// Just detect (no image processing)
const detection = await scanDocument(imageElement, { mode: 'detect' });
// Extract as canvas
const extracted = await scanDocument(imageElement, {
  mode: 'extract',
  output: 'canvas'
});
// Extract as ImageData
const rawData = await scanDocument(imageElement, {
  mode: 'extract',
  output: 'imagedata'
});
// Extract as data URL
const dataUrlResult = await scanDocument(imageElement, {
  mode: 'extract',
  output: 'dataurl'
});
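When using the 'dataurl' output, you may want raw bytes for an upload or for writing a file. The sketch below decodes a base64 data URL into a `Uint8Array`. It assumes that `result.output` is a standard base64-encoded data URL string in this mode, which is an assumption rather than something documented above; `dataUrlToBytes` is a hypothetical helper, not part of Scanic. It relies on the global `atob`, available in browsers and in Node.js 16+.

```javascript
// Hypothetical helper (not part of Scanic): decode a base64 data URL.
function dataUrlToBytes(dataUrl) {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) {
    throw new Error('Expected a base64-encoded data URL');
  }
  const binary = atob(match[2]); // base64 -> binary string
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { mimeType: match[1], bytes };
}

// Usage with a tiny stand-in data URL (a real one would be an image).
const decoded = dataUrlToBytes('data:text/plain;base64,aGk=');
console.log(decoded.mimeType, decoded.bytes.length);
```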
Clone the repository and set up the development environment:
git clone https://github.com/marquaye/scanic.git
cd scanic
npm install
Start the development server:
npm run dev
Build for production:
npm run build
The built files will be available in the dist/ directory.
The Rust WASM module is pre-compiled and included in the repository. If you need to rebuild it:
npm run build:wasm
This uses Docker to build the WASM module without requiring local Rust installation.
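Scanic uses a hybrid JavaScript + WebAssembly approach: pure JavaScript algorithms, with Rust-compiled WebAssembly for the performance-critical operations.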
Contributions are welcome! Here’s how you can help:
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push the branch (git push origin feature/amazing-feature)
Please ensure your code follows the existing style.
Special thanks to our amazing sponsors who make this project possible!
ZeugnisProfi: Professional certificate and document services
ZeugnisProfi.de: German document processing specialists
Verlingo: Language and translation services
MIT License © marquaye