Perplexideez: The Self-Hosted Search Revolution

Bright Coding
Author

Your data, your models, your search. Perplexideez delivers AI-powered insights without compromising privacy.

Tired of cloud AI search engines mining your queries? You're not alone. Developers and privacy-conscious teams are abandoning proprietary solutions that treat their research data as a commodity. Perplexideez changes everything. This revolutionary self-hosted search engine combines local AI agents with SearXNG to create a Perplexity-like experience that runs entirely on your infrastructure. No data leaks. No subscription fees. Complete control.

In this deep dive, you'll discover how Perplexideez transforms your approach to AI-assisted research. We'll explore its powerful multi-user architecture, walk through production-ready deployment configurations, and extract real code examples straight from the repository. Whether you're securing sensitive research for your enterprise or building a private knowledge hub, this guide delivers the technical blueprint you need.

What Is Perplexideez?

Perplexideez is an open-source, self-hosted AI search platform that aims to replicate Perplexity AI's core functionality while keeping every byte of data under your control. Created by developer brunostjohn, the project emerged from frustration with existing self-hosted alternatives, which integrated poorly with other self-hosted services and lacked robust multi-user support.

At its core, Perplexideez orchestrates three powerful components: a PostgreSQL database for persistent storage, SearXNG for privacy-respecting web search aggregation, and either Ollama or OpenAI-compatible endpoints for local language model inference. This architecture creates a stateless, scalable application that runs in rootless containers—ready for Kubernetes production environments.

The platform handles the entire search lifecycle: it dispatches queries to SearXNG, processes results through your chosen LLM, annotates responses with source citations, generates intelligent follow-up questions, and manages user sessions through OIDC SSO. Every search becomes traceable, shareable, and reproducible. The project has gained rapid traction in the self-hosting community because it solves the fundamental tension between AI convenience and data sovereignty. While cloud providers monetize your curiosity, Perplexideez ensures your research patterns, intellectual property, and sensitive queries remain entirely private.

Key Features That Define Excellence

Perplexideez packs enterprise-grade capabilities into a sleek, containerized package. Let's dissect what makes this tool essential for modern development teams.

AI-Powered Web Search Orchestration
The system doesn't just search; it delegates. When you submit a query, Perplexideez routes it to your SearXNG instance, which aggregates results from many search engines without tracking you. The LLM then synthesizes these results into a coherent answer with inline source annotations. Hover over any citation to see the source material, then click through to verify the claim. Grounding answers in retrieved sources reduces the risk of hallucination, though it does not eliminate it, which is why the citations are there to check.
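To make the synthesis step concrete, here is a minimal sketch of how numbered sources could be packed into an LLM prompt so the answer can cite them. This is not Perplexideez's actual prompt code: the SearchResult type, the build_cited_prompt function, and the prompt wording are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    """Hypothetical shape of one search hit: title, URL, and text snippet."""
    title: str
    url: str
    snippet: str


def build_cited_prompt(question: str, results: list[SearchResult], max_sources: int = 5) -> str:
    """Number each source so the model can cite it as [n] in its answer."""
    lines = [
        "Answer the question using only the numbered sources below.",
        "Cite every claim with its source number in square brackets.",
        "",
    ]
    # Only the first max_sources results are shown to the model.
    for index, result in enumerate(results[:max_sources], start=1):
        lines.append(f"[{index}] {result.title} ({result.url})")
        lines.append(f"    {result.snippet}")
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

Because every source carries a stable number, a UI can map each [n] in the model's answer back to a URL, which is the kind of lookup that hover-and-click citations depend on.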

Multi-User Architecture with OIDC SSO
Unlike simplistic single-user clones, Perplexideez implements proper tenant isolation. Each user's searches, favorites, and shared links remain private by default. The platform integrates with any OpenID Connect provider (Keycloak, Authentik, or corporate Azure AD), allowing you to disable password authentication entirely. Environment variables like OIDC_ISSUER, OIDC_CLIENT_ID, and OIDC_CLIENT_SECRET make configuration straightforward.

Granular Model Selection & Resource Management
Perplexideez recognizes that not all tasks require massive models. The UI lets you assign different models to different tasks: a lightweight model for query parsing, a powerful one for synthesis. Environment variables such as FAST_MODEL and SMART_MODEL optimize resource consumption. This prevents your GPU from being monopolized by simple operations.

Intelligent Follow-Up Generation
The system analyzes its own responses to suggest relevant follow-up questions automatically. This feature, powered by the LLM's self-reflection capabilities, transforms research from a static Q&A into a dynamic discovery process. You explore topics depth-first without manually crafting every query.

Enterprise-Grade Sharing Controls
Share search results via cryptographically random URLs. Each shared link offers three security levels: public access, authentication-required, or disabled. You can reroll link IDs instantly if a URL is compromised. Public shares generate beautiful OpenGraph embeds with summaries, making collaboration seamless while maintaining access control.
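The sharing model above can be sketched in a few lines. The following is an illustration under stated assumptions, not the project's implementation: ShareAccess, SharedLink, and new_link_id are hypothetical names, and the ID length is an arbitrary choice. The only library call relied on is Python's standard secrets.token_urlsafe.

```python
import secrets
from dataclasses import dataclass, field
from enum import Enum


class ShareAccess(Enum):
    """The three access levels described in the article."""
    PUBLIC = "public"
    AUTHENTICATED = "authenticated"
    DISABLED = "disabled"


def new_link_id() -> str:
    # 16 random bytes from the OS CSPRNG, encoded as 22 URL-safe characters.
    return secrets.token_urlsafe(16)


@dataclass
class SharedLink:
    access: ShareAccess = ShareAccess.DISABLED  # private until explicitly shared
    link_id: str = field(default_factory=new_link_id)

    def reroll(self) -> None:
        """Replace the ID so any previously leaked URL stops resolving."""
        self.link_id = new_link_id()

    def can_view(self, is_logged_in: bool) -> bool:
        if self.access is ShareAccess.PUBLIC:
            return True
        if self.access is ShareAccess.AUTHENTICATED:
            return is_logged_in
        return False
```

Defaulting to DISABLED means a link only becomes reachable after a deliberate choice, and rerolling is cheap because the ID carries no meaning beyond being unguessable.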

Stateless Container Design
Every container runs as non-root by default. The application keeps only in-progress generations in memory and stores everything else in PostgreSQL, which makes the app containers effectively stateless. This design supports Kubernetes rolling updates and horizontal scaling, so your deployment can stay available during cluster maintenance.

Real-World Use Cases That Deliver Value

1. Enterprise Competitive Intelligence
A market research team needs to track competitor movements without leaking search patterns to public AI services. They deploy Perplexideez behind their VPN, configure SearXNG to prioritize industry-specific sources, and use Ollama with a fine-tuned model that understands their sector's jargon. Each analyst has an isolated workspace, and shareable reports let leadership review findings without accessing the full system. The SSO integration ensures offboarded employees lose access immediately.

2. Academic Research Institution
University researchers handling sensitive human subject data must comply with IRB regulations that prohibit cloud AI processing. Perplexideez runs on-premises, with PostgreSQL backed by encrypted volumes. The source attribution feature lets students verify claims for literature reviews, while favorites help faculty track evolving hypotheses across semesters. The follow-up question generator helps graduate students explore methodological alternatives they hadn't considered.

3. DevSecOps Knowledge Base
A platform engineering team integrates Perplexideez with their internal documentation, logging systems, and Git repositories. SearXNG indexes these private sources alongside the public web. When incidents occur, engineers query the system to find relevant logs, similar past issues, and remediation steps, all with source citations. The multi-model configuration uses fast models for log parsing and smart models for incident summary reports, optimizing response times during outages.

4. Privacy-Focused Startup
A fintech startup cannot risk exposing customer data or product roadmaps to external AI providers. They deploy Perplexideez on a bare-metal server, using quantized Llama models via Ollama. The sharing feature lets product managers distribute competitive analysis to investors via authentication-required links. The stateless container design means they can later move the app between hosting providers and only need to migrate the PostgreSQL database, maintaining their privacy guarantees.

Step-by-Step Production Deployment

Deploying Perplexideez requires four coordinated services: PostgreSQL, SearXNG, your LLM provider (Ollama/OpenAI), and the Perplexideez application itself. Here's the proven path.

Step 1: Prepare Your Environment
Clone the repository and examine the example configurations:

git clone https://github.com/brunostjohn/perplexideez.git
cd perplexideez/deploy/docker
cp .env.example .env

Edit .env with your secrets. The critical variables include:

  • DATABASE_URL: PostgreSQL connection string
  • SEARXNG_URL: Your SearXNG instance endpoint
  • OPENAI_API_KEY or OLLAMA_URL: LLM backend configuration
  • OIDC_* variables for SSO integration
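Before starting the stack, it can help to fail fast on missing settings. The sketch below checks the variables listed above; the variable names come from this article, so confirm them against the repository's .env.example. The find_config_problems helper is hypothetical and not part of Perplexideez.

```python
import os

# Names as listed in this article; confirm against the repository's .env.example.
REQUIRED = ("DATABASE_URL", "SEARXNG_URL")
LLM_BACKENDS = ("OPENAI_API_KEY", "OLLAMA_URL")


def find_config_problems(env: dict[str, str]) -> list[str]:
    """Return a human-readable list of missing settings (empty means OK)."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    # At least one LLM backend must be configured.
    if not any(env.get(name) for name in LLM_BACKENDS):
        problems.append("set one of " + " or ".join(LLM_BACKENDS))
    return problems


if __name__ == "__main__":
    for problem in find_config_problems(dict(os.environ)):
        print(f"config error: {problem}")
```

Running it with the same environment you pass to Compose gives a quick pre-flight check instead of a container that crashes on its first request.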

Step 2: Configure SearXNG
In your SearXNG settings.yml, enable JSON output. This is required:

search:
  formats:
    - html
    - json  # Required for Perplexideez

Disable the limiter to prevent request blocking:

server:
  limiter: false  # Perplexideez triggers limiter frequently
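Once both settings are in place, a quick probe confirms that SearXNG actually serves JSON. The sketch below uses only the Python standard library and SearXNG's /search endpoint with format=json. The status code interpretations are my reading of SearXNG's behavior (403 for a disallowed format, 429 from the limiter), so treat the messages as hints rather than guarantees. The function names are hypothetical.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def probe_url(base_url: str, query: str = "test") -> str:
    """Build the JSON search URL for a SearXNG base URL."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def check_searxng_json(base_url: str, timeout: float = 10.0) -> str:
    """Send one JSON search and describe the outcome in plain words."""
    try:
        with urllib.request.urlopen(probe_url(base_url), timeout=timeout) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as error:
        # HTTPError must be handled before the broader OSError below.
        if error.code == 403:
            return "forbidden: json is probably missing from search.formats"
        if error.code == 429:
            return "rate limited: the limiter is probably still enabled"
        return f"unexpected HTTP status {error.code}"
    except (OSError, ValueError) as error:
        # Network failures, timeouts, or a body that is not valid JSON.
        return f"unreachable or not JSON: {error}"
    return f"ok: {len(payload.get('results', []))} results"
```

For example, check_searxng_json("http://localhost:8080") run from the Docker host (assuming that port is published) should report ok with a result count when the configuration has loaded.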

Step 3: Run Database Migrations
Execute the migration container before starting the app:

docker run --rm \
  -e DATABASE_URL="postgresql://user:pass@postgres:5432/perplexideez" \
  ghcr.io/brunostjohn/perplexideez/migrate:latest

Step 4: Launch the Stack
Use the provided Compose file to start all services:

docker-compose up -d

Access the UI at http://localhost:3000. The first user to authenticate becomes the initial administrator.

Real Code Examples from the Repository

Let's examine actual configuration patterns from the Perplexideez codebase.

Example 1: Docker Compose Service Definition
The application service requires precise environment variable injection:

# docker-compose.yml snippet
services:
  perplexideez:
    image: ghcr.io/brunostjohn/perplexideez/app:latest
    environment:
      # Database connection - must be accessible from container
      DATABASE_URL: postgresql://perplexideez:${DB_PASSWORD}@postgres:5432/perplexideez
      
      # SearXNG endpoint - must return JSON
      SEARXNG_URL: http://searxng:8080
      
      # AI Backend selection
      OPENAI_API_KEY: ${OPENAI_KEY}  # For OpenAI-compatible endpoints
      # OLLAMA_URL: http://ollama:11434  # Alternative to OpenAI
      
      # Model configuration for cost optimization
      FAST_MODEL: gpt-3.5-turbo      # For quick tasks
      SMART_MODEL: gpt-4             # For complex synthesis
      
      # OIDC SSO configuration
      OIDC_ISSUER: https://auth.yourdomain.com
      OIDC_CLIENT_ID: perplexideez
      OIDC_CLIENT_SECRET: ${OIDC_SECRET}
    ports:
      - "3000:3000"
    depends_on:
      - postgres
      - searxng

Example 2: Required Environment Variables
The application reads its settings from the .env file, while the migration container needs only DATABASE_URL:

# .env file excerpt
# Database storage configuration
DATABASE_URL=postgresql://perplexideez:secure_password_123@postgres:5432/perplexideez

# App setup
NEXTAUTH_URL=http://localhost:3000
NEXTAUTH_SECRET=complex_random_string_at_least_32_chars

# SearXNG integration
SEARXNG_URL=https://searxng.yourdomain.com

# AI backend (choose one approach and leave the other commented out)
# Option A: OpenAI compatible
OPENAI_API_KEY=sk-your-openai-key
# Option B: Ollama local
# OLLAMA_URL=http://your-ollama-server:11434

Example 3: SearXNG Configuration for JSON Output
The SearXNG configuration must expose JSON results:

# searxng/settings.yml
search:
  # CRITICAL: JSON format must be enabled
  formats:
    - html
    - json
  
  # Search defaults (the limiter itself is disabled under server: below)
  safe_search: 0
  autocomplete: ""
  default_lang: ""

server:
  limiter: false  # Perplexideez triggers rate limits frequently
  secret_key: your-secret-key-here
  port: 8080
  bind_address: "0.0.0.0"

Example 4: Kubernetes Deployment Pattern
For cluster deployments, use the provided manifests as a template:

# k8s-deployment.yaml (based on homelab example)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: perplexideez-app
spec:
  replicas: 2  # Stateless design supports scaling
  selector:
    matchLabels:
      app: perplexideez-app  # Required by apps/v1; must match the pod labels below
  template:
    metadata:
      labels:
        app: perplexideez-app  # Illustrative label name; any consistent label works
    spec:
      securityContext:
        runAsNonRoot: true  # Enforced non-root execution
        runAsUser: 1000
      containers:
      - name: app
        image: ghcr.io/brunostjohn/perplexideez/app:latest
        envFrom:
        - secretRef:
            name: perplexideez-env
        readinessProbe:
          httpGet:
            path: /api/health
            port: 3000

Each example demonstrates production-hardened patterns: non-root containers, secret management, and health checks for orchestrator integration.

Advanced Usage & Best Practices

Model Routing Strategy
Configure FAST_MODEL for query classification and SMART_MODEL for answer synthesis. Routing the frequent, simple calls to the cheaper model can cut API costs substantially while maintaining answer quality; measure the savings on your own workload. For Ollama users, deploy multiple quantized models: a 3B parameter model for speed, a 13B model for depth.
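As an illustration of the routing idea, the sketch below maps task types to the two environment variables. The Task enum, the choice of which tasks count as heavy, and the pick_model helper are assumptions made for this example; Perplexideez's own routing may differ.

```python
import os
from enum import Enum
from typing import Mapping, Optional


class Task(Enum):
    """Hypothetical task types; the real app may split work differently."""
    CLASSIFY_QUERY = "classify_query"
    SUGGEST_FOLLOW_UPS = "suggest_follow_ups"
    SYNTHESIZE_ANSWER = "synthesize_answer"


# Assumption: only answer synthesis needs the larger model.
HEAVY_TASKS = {Task.SYNTHESIZE_ANSWER}


def pick_model(task: Task, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the model name configured for this kind of task."""
    env = os.environ if env is None else env
    variable = "SMART_MODEL" if task in HEAVY_TASKS else "FAST_MODEL"
    model = env.get(variable)
    if not model:
        raise KeyError(f"{variable} is not configured")
    return model
```

Keeping the mapping in one place makes it easy to promote a task to the larger model later without touching the call sites.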

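Database Connection Pooling
PostgreSQL connections are a limited resource, so cap how many each replica opens. Pooling parameters are specific to the database client: pool_size and max_overflow are SQLAlchemy engine arguments, not PostgreSQL URL parameters. If the app's client is Prisma (an assumption to confirm against the repository), the equivalent URL parameters are connection_limit and pool_timeout:

postgresql://user:pass@host:5432/db?connection_limit=20&pool_timeout=10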
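A common pitfall when writing DATABASE_URL by hand is a password containing characters such as @ or /, which must be percent-encoded to keep the URL parseable. The helper below is a small sketch using Python's urllib.parse. The function name is hypothetical, and which query parameters are accepted depends on the database client in use.

```python
from typing import Mapping, Optional
from urllib.parse import quote, urlencode


def build_database_url(
    user: str,
    password: str,
    host: str,
    database: str,
    port: int = 5432,
    params: Optional[Mapping[str, str]] = None,
) -> str:
    """Assemble a PostgreSQL URL with percent-encoded credentials."""
    # safe="" forces characters like "/" and "@" to be escaped as well.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    url = f"postgresql://{credentials}@{host}:{port}/{database}"
    if params:
        url += "?" + urlencode(params)
    return url


print(build_database_url("perplexideez", "p@ss/word", "postgres", "perplexideez"))
# prints postgresql://perplexideez:p%40ss%2Fword@postgres:5432/perplexideez
```

Generating the URL once and storing it in your secret manager avoids subtle connection failures that only appear after a password rotation.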

SearXNG Instance Hardening
Run a dedicated SearXNG instance for Perplexideez. Disable engines that return inconsistent JSON, and prioritize DuckDuckGo, Brave Search, and Wikipedia for reliable results. This reduces parsing errors and improves response consistency.

Backup Strategy
While the app is stateless, PostgreSQL holds user data and favorites. Schedule hourly backups of your database volume. The favorites and shared links tables contain irreplaceable research context.
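One way to implement scheduled backups is a logical dump with pg_dump, triggered hourly by cron or a Kubernetes CronJob. The sketch below builds and runs that command. The file naming scheme and helper names are assumptions, and pg_dump must be installed wherever the script runs.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def backup_command(database_url: str, backup_dir: Path, now: datetime) -> list[str]:
    """Build a pg_dump invocation that writes a timestamped custom-format dump."""
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = backup_dir / f"perplexideez-{stamp}.dump"
    # Note: pg_dump only understands libpq URL parameters, so strip any
    # client-specific pooling options from the URL before using it here.
    return ["pg_dump", "--format=custom", f"--file={target}", f"--dbname={database_url}"]


def run_backup(database_url: str, backup_dir: Path) -> None:
    """Create the backup directory if needed and run pg_dump, raising on failure."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    command = backup_command(database_url, backup_dir, datetime.now(timezone.utc))
    subprocess.run(command, check=True)
```

Custom-format dumps can be restored selectively with pg_restore, which is handy when only the favorites or shared links need to be recovered.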

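SSO Security
Always set OIDC_ALLOW_DANGEROUS_EMAIL_ACCOUNT_LINKING=false in production. Automatic linking by email lets anyone who controls a matching address at the identity provider sign in to an existing account, so disabling it closes a common account takeover path. Use PKCE-enabled OIDC clients for additional security.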

Comparison: Perplexideez vs. Alternatives

| Feature | Perplexideez | Perplexity AI | SearXNG Alone | LibreChat |
| --- | --- | --- | --- | --- |
| Data Privacy | ✅ Complete control | ❌ Cloud processed | ✅ Self-hosted | ⚠️ Mixed |
| AI Integration | ✅ Native LLM | ✅ Proprietary | ❌ None | ✅ Plugin-based |
| Multi-User SSO | ✅ Built-in OIDC | ❌ Single user | ❌ None | ⚠️ Limited |
| Source Attribution | ✅ Hover & click | ✅ Basic | ❌ Manual | ⚠️ Partial |
| Self-Hosted | ✅ Fully containerized | ❌ SaaS only | ✅ Yes | ✅ Yes |
| Follow-Up Questions | ✅ AI-generated | ✅ Yes | ❌ None | ❌ None |
| Sharing Controls | ✅ Access levels | ✅ Public links | ❌ None | ⚠️ Basic |
| Cost | ✅ Free (self-hosted) | 💰 Subscription | ✅ Free | ✅ Free |

Perplexideez uniquely combines privacy, multi-tenancy, and intelligent search orchestration. While Perplexity AI offers convenience, it can't match the data sovereignty. SearXNG alone lacks AI synthesis. LibreChat provides conversational AI but misses Perplexideez's research-focused workflow and citation system.

Frequently Asked Questions

What hardware is required for local models? For Ollama integration, allocate at least 16GB RAM for a 7B parameter model and 32GB for 13B models. CPU inference works but GPU acceleration (6GB+ VRAM) delivers acceptable response times. The app itself needs only 512MB RAM.

Can I disable web search and search only internal documents? Currently, Perplexideez requires SearXNG for core functionality. However, you can configure SearXNG to index only your internal sources by disabling all public search engines and adding your document endpoints as custom engines.

How does it handle SearXNG rate limiting? The README explicitly warns about SearXNG's limiter. The recommended solution is disabling it (limiter: false). For production, deploy a dedicated SearXNG instance with aggressive caching and consider implementing request queuing at the application layer.
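If you would rather keep some protection than fire requests at SearXNG as fast as possible, client-side pacing is a simple form of the request queuing mentioned above. This is a generic sketch, not something Perplexideez ships: MinIntervalThrottle is a hypothetical class that enforces a minimum gap between calls, with the clock and sleep function injectable for testing.

```python
import time
from typing import Callable


class MinIntervalThrottle:
    """Space out calls so they are at least min_interval seconds apart."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # the first call never waits

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following slot relative to when this call proceeds.
        self._next_allowed = max(now, self._next_allowed) + self._min_interval
        return delay
```

Calling throttle.wait() before each outbound search keeps bursts from a single user from looking like abuse to upstream engines. This version is single-threaded; a multi-worker app would need a lock or a shared queue around it.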

Is migration from other tools supported? No official migration tools exist yet. The database schema is straightforward, so you could script migrations from JSON exports of other tools. The project welcomes contributions for migration utilities.

Can I use Azure OpenAI or local vLLM endpoints? Yes. Any OpenAI-compatible API works by setting OPENAI_API_KEY and OPENAI_API_BASE_URL. Tested with Azure OpenAI, vLLM, and LocalAI. Adjust FAST_MODEL and SMART_MODEL names to match your endpoint's model catalog.

What about mobile support? The responsive web UI works flawlessly on mobile browsers. No native app exists, but the PWA manifest allows "Add to Home Screen" installation with offline capability for cached searches.

How do I troubleshoot "JSON output disabled" errors? This means SearXNG isn't configured correctly. Verify search.formats includes json in your SearXNG settings. Restart SearXNG after changes. Check logs with docker logs searxng to confirm the config loaded properly.

Conclusion: Take Control of Your Search

Perplexideez represents a paradigm shift in AI-assisted research. It proves that privacy and powerful features aren't mutually exclusive. The thoughtful architecture—stateless containers, OIDC SSO, and source attribution—addresses real enterprise concerns while delivering a superior user experience.

The project's rapid evolution shows the maintainer's commitment to solving genuine pain points. Multi-user support and sharing controls transform it from a personal tool into a collaborative platform. The integration with SearXNG and local LLMs creates a future-proof foundation as models continue improving.

For developers, researchers, and privacy advocates, Perplexideez isn't just an alternative to cloud AI search—it's an upgrade. You gain auditability, customization, and data sovereignty without sacrificing the intelligent features that make AI search compelling.

Deploy Perplexideez today. Clone the repository, configure your environment, and experience AI search that respects your autonomy. The future of research is self-hosted, and it's more accessible than you think.

Visit the GitHub repository to get started: https://github.com/brunostjohn/perplexideez
