Why Chandra is the Ultimate OCR Tool for Handwriting and Tables

Author: Bright Coding

Introduction

Messy forms, complex tables, and hard-to-read handwriting are where traditional OCR tools most often fall short. Chandra is an OCR model built for exactly these cases: complex tables, forms, handwriting, and more. In this article, we'll explore why Chandra is trending now, its key features, and how you can start using it today.

What is Chandra?

Chandra is an OCR (Optical Character Recognition) model developed by Datalab. It is designed for documents that traditional OCR tools struggle with, such as handwritten notes, tables, math equations, and messy forms. As the demand for document intelligence grows, that focus makes it relevant to developers and businesses alike.

Key Features

Chandra offers a range of features that make it a powerful tool for document processing:

  • Two Inference Modes: Run locally via HuggingFace Transformers or deploy a vLLM server for production throughput.
  • Layout-aware Output: Every text block, table, and image comes with bounding box coordinates.
  • Structured Formats: Output as Markdown, HTML, or JSON with full layout metadata.
  • 40+ Languages Supported: Chandra can handle documents in a wide range of languages.
  • Advanced Handling: Supports handwriting, tables, math equations, forms, and complex layouts.
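To illustrate what layout metadata makes possible, here is a minimal sketch that selects the blocks falling inside a region of a page. The block schema used here (dicts with a "label" key and a "bbox" key holding [x0, y0, x1, y1]) is a hypothetical stand-in, not Chandra's confirmed JSON format; adapt the key names to whatever the real output contains.

```python
def blocks_in_region(blocks, region):
    """Return the blocks whose bounding box lies fully inside `region`.

    `blocks` is a list of dicts with a "bbox" entry of [x0, y0, x1, y1].
    This schema is an assumption made for illustration only.
    """
    rx0, ry0, rx1, ry1 = region
    selected = []
    for block in blocks:
        x0, y0, x1, y1 = block["bbox"]
        # Keep the block only if all four edges are inside the region.
        if x0 >= rx0 and y0 >= ry0 and x1 <= rx1 and y1 <= ry1:
            selected.append(block)
    return selected
```

A typical use would be pulling only the blocks from the top half of a page, for example to isolate a header table from the body text.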

Use Cases

Chandra excels in various real-world scenarios where traditional OCR tools fail. Here are some concrete use cases:

  • Medical Notes: Doctors often write notes in a cursive and messy handwriting style. Chandra can accurately read and convert these notes into structured text.
  • Financial Filings: Financial documents often contain complex tables and merged cells. Chandra preserves the structure of these tables, making it easier to extract and analyze data.
  • Educational Materials: Textbooks, worksheets, and research papers often include math equations and complex layouts. Chandra can handle these elements with ease.
  • Newspapers: Multi-column layouts, figures, and captions are common in newspapers. Chandra can accurately process and convert these documents.

Step-by-Step Installation & Setup Guide

Installation

To get started with Chandra, you need to install the chandra-ocr package. You can do this using pip:

pip install chandra-ocr

For better performance with HuggingFace inference, we recommend installing flash attention.

From Source

If you prefer to install from source, follow these steps:

git clone https://github.com/datalab-to/chandra.git
cd chandra
uv sync
source .venv/bin/activate

Configuration

You can configure Chandra using environment variables or a local.env file. Here are some common settings:

MODEL_CHECKPOINT=datalab-to/chandra
MAX_OUTPUT_TOKENS=8192
VLLM_API_BASE=http://localhost:8000/v1
VLLM_GPUS=0
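
To make the interplay between the file and environment variables concrete, here is a small sketch of how KEY=VALUE settings like these could be parsed and merged. It is not Chandra's own loader: the function names are hypothetical, and the rule that environment variables override file values is an assumption, so check the repository for the actual precedence.

```python
import os
from pathlib import Path

# Example values copied from the listing above; not confirmed library defaults.
EXAMPLE_VALUES = {
    "MODEL_CHECKPOINT": "datalab-to/chandra",
    "MAX_OUTPUT_TOKENS": "8192",
    "VLLM_API_BASE": "http://localhost:8000/v1",
    "VLLM_GPUS": "0",
}


def parse_env_text(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def resolve_settings(file_values, environ=None):
    """Merge example values, file values, and environment variables.

    Assumed precedence (illustrative): environment > file > example values.
    """
    environ = os.environ if environ is None else environ
    merged = {**EXAMPLE_VALUES, **file_values}
    for key in list(merged):
        if key in environ:
            merged[key] = environ[key]
    return merged


def load_settings(path="local.env"):
    """Read the settings file if it exists, then apply the merge above."""
    env_path = Path(path)
    text = env_path.read_text(encoding="utf-8") if env_path.exists() else ""
    return resolve_settings(parse_env_text(text))
```

Calling load_settings() from the project directory returns a plain dict, which is handy for printing the effective configuration before a long batch run.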

vLLM Server

For production or batch processing, you can launch a vLLM server:

chandra_vllm

Configure the server via environment variables:

  • VLLM_API_BASE: Server URL (default: http://localhost:8000/v1)
  • VLLM_MODEL_NAME: Model name (default: chandra)
  • VLLM_GPUS: GPU device IDs (default: 0)
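
Before sending documents to the server, it can help to confirm that it is reachable. The sketch below assumes that chandra_vllm starts a standard vLLM OpenAI-compatible server, which exposes a model listing at the /models path under the API base, and that no API key is required. Both points are assumptions, and the function names are illustrative.

```python
import json
import os
import urllib.request


def models_url(api_base=None):
    """Build the model-listing URL from VLLM_API_BASE or the documented default."""
    base = api_base or os.environ.get("VLLM_API_BASE", "http://localhost:8000/v1")
    return base.rstrip("/") + "/models"


def list_served_models(api_base=None, timeout=5.0):
    """Return the model IDs the server reports.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    with urllib.request.urlopen(models_url(api_base), timeout=timeout) as response:
        payload = json.load(response)
    return [entry["id"] for entry in payload.get("data", [])]
```

If the server is up, list_served_models() should include the name configured in VLLM_MODEL_NAME; a connection error usually means the server is still loading or is listening on a different port.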

Code Examples from the Repository

CLI Usage

To use Chandra via the command line, you can run the following commands:

# Single file with vLLM server
chandra input.pdf ./output --method vllm

# Directory with local model
chandra ./documents ./output --method hf
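
When the CLI needs to be called from a script, the same commands can be assembled in Python. This is a minimal sketch that uses only the arguments shown above; the helper names are hypothetical, and it assumes the chandra executable is on the PATH.

```python
import subprocess


def build_chandra_command(input_path, output_dir, method="vllm"):
    """Assemble the argument list for the chandra CLI."""
    if method not in ("hf", "vllm"):
        raise ValueError(f"Unsupported method: {method!r}")
    return ["chandra", str(input_path), str(output_dir), "--method", method]


def run_chandra(input_path, output_dir, method="vllm"):
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    command = build_chandra_command(input_path, output_dir, method)
    return subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.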

Python Usage

Here's a Python example to get you started:

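from chandra.model import InferenceManager
from chandra.input import load_pdf_images

# Use the local HuggingFace backend; pass method="vllm" to use a vLLM server.
manager = InferenceManager(method="hf")
# Render each PDF page to an image.
images = load_pdf_images("document.pdf")
# Run OCR; the output is indexed per page.
results = manager.generate(images)
# Print the first page as Markdown.
print(results[0].markdown)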

Explanation

  • InferenceManager: This class handles the inference process. You can specify the inference method (hf for HuggingFace or vllm for vLLM).
  • load_pdf_images: This function loads images from a PDF file.
  • generate: This method generates the OCR results.
  • markdown: This attribute returns the OCR results in Markdown format.
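
Since each result carries its own markdown attribute, multi-page documents need a step that stitches the pages together. The sketch below shows one way to do that. It uses stand-in objects instead of real results so it runs without the model, and the separator is an arbitrary choice rather than anything Chandra prescribes.

```python
from types import SimpleNamespace


def combine_pages(results, separator="\n\n---\n\n"):
    """Join the per-page Markdown of each result into one document string."""
    return separator.join(result.markdown.strip() for result in results)


# Stand-ins for real OCR results: any object with a .markdown attribute works.
demo_results = [
    SimpleNamespace(markdown="# Page 1\n"),
    SimpleNamespace(markdown="Page 2 text\n"),
]
combined = combine_pages(demo_results)
print(combined)
```

With real output, the list returned by manager.generate(images) would take the place of demo_results.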

CLI Options

The chandra command accepts the following options:

--method [hf|vllm]: Inference method (default: vllm)
--page-range TEXT: Page range for PDFs (e.g., "1-5,7,9-12")
--max-output-tokens INTEGER: Max tokens per page
--max-workers INTEGER: Parallel workers for vLLM
--include-images/--no-images: Extract and save images (default: include)
--include-headers-footers/--no-headers-footers: Include page headers/footers (default: exclude)
--batch-size INTEGER: Pages per batch (default: 1)
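
The --page-range syntax combines single pages and inclusive ranges. As a rough illustration of how a string such as "1-5,7,9-12" expands, here is a small parser. It is not Chandra's implementation, and the article does not say whether the CLI counts pages from 0 or 1, so the function simply expands the numbers as written.

```python
def parse_page_range(spec):
    """Expand a spec such as "1-5,7,9-12" into a sorted list of page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"Invalid range: {part!r}")
            # Ranges are inclusive on both ends.
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)
```

Using a set means overlapping ranges such as "1-3,2-4" collapse to each page once.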

Advanced Usage & Best Practices

To get the most out of Chandra, consider the following best practices:

  • Use vLLM for Production: For high throughput and production environments, use the vLLM server.
  • Optimize Environment: Ensure you have the necessary dependencies and environment variables configured for optimal performance.
  • Batch Processing: When processing multiple documents, use batch processing to improve efficiency.
  • Regular Updates: Keep your Chandra installation up to date with the latest improvements and bug fixes.
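
The batching advice can be sketched in a few lines. The example below splits a list of pages into fixed-size batches and hands them to a pool of worker threads, loosely mirroring what the --batch-size and --max-workers options control. The handler argument is a hypothetical callable that takes one batch and returns a list of outputs; in practice it would wrap a call to the OCR backend.

```python
from concurrent.futures import ThreadPoolExecutor


def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def process_in_batches(items, handler, batch_size=1, max_workers=4):
    """Apply `handler` to each batch in parallel and flatten the outputs.

    `handler` receives one batch (a list) and must return a list.
    Executor.map preserves input order, so outputs line up with `items`.
    """
    batches = list(batched(items, batch_size))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batch_outputs = pool.map(handler, batches)
        return [output for outputs in batch_outputs for output in outputs]
```

Threads suit this pattern when each batch spends most of its time waiting on a server response; for a local model on a single GPU, larger batches are likely to matter more than extra workers.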

Comparison with Alternatives

When choosing an OCR tool, it's important to compare the options available. Here's a comparison table to help you decide:

Feature               | Chandra              | Traditional OCR Tools
Handwriting Support   | Excellent            | Limited
Table Handling        | Excellent            | Limited
Math Equation Support | Excellent            | Limited
Complex Layouts       | Excellent            | Limited
Inference Modes       | vLLM, HuggingFace    | Limited
Output Formats        | Markdown, HTML, JSON | Limited
Supported Languages   | 40+                  | Limited

FAQ

How accurate is Chandra?

Chandra is highly accurate, especially with complex documents. It has been benchmarked and tested to handle handwriting, tables, and forms with high precision.

Can I use Chandra commercially?

Yes, but with some restrictions. The code is Apache 2.0 licensed, while the model weights use a modified OpenRAIL-M license. For broader commercial licensing, see the pricing information at datalab.to.

What languages does Chandra support?

Chandra supports 40+ languages, making it a versatile tool for international use.

How can I get help with Chandra?

Join the Discord community to discuss development and get help.

Is there a hosted API available?

Yes, a hosted API with additional accuracy improvements is available at datalab.to. You can try the free playground without installing.

Conclusion

Chandra is a powerful OCR tool that handles complex documents with ease. Its ability to accurately read handwriting, tables, math equations, and forms makes it a must-have for developers and businesses. To get started, simply install the chandra-ocr package and follow the setup guide. For more information and to contribute, visit the Chandra GitHub repository. Give it a star if you find it helpful! ⭐
