Engineering

Production infrastructure, operated by us.

A real platform with real services in production, built on Canadian-residency infrastructure and a sovereign AI model stack.

  • 44+ production services
  • 12 live databases
  • 33 CRM routers
  • 46 frontend pages

Production infrastructure

Container-isolated services on hardened Linux running in Canadian-residency facilities. Native reverse proxy with managed TLS, scheduled backups, and monitoring dashboards. Workflow automation runs on a self-hosted n8n instance; scheduling on a self-hosted Cal.com.

Compute

Container-isolated services on hardened Linux. Sized for sustained AI workloads.

Networking

Native reverse proxy with managed TLS certificates and HSTS enforced.
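HSTS enforcement comes down to the proxy attaching one response header on every HTTPS reply. A minimal sketch of building that header value (the helper name and defaults are illustrative, not our actual proxy configuration):

```python
def hsts_header(max_age_seconds: int = 31536000,
                include_subdomains: bool = True,
                preload: bool = False) -> str:
    """Build a Strict-Transport-Security header value.

    Illustrative only: a one-year max-age covering subdomains is a
    common baseline, not a statement of our production policy.
    """
    parts = [f"max-age={max_age_seconds}"]
    if include_subdomains:
        parts.append("includeSubDomains")
    if preload:
        parts.append("preload")
    return "; ".join(parts)

print(hsts_header())  # max-age=31536000; includeSubDomains
```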

Backups

Daily automated backups with off-host retention and audited restore drills.

Monitoring

Privacy-respecting analytics and operational telemetry on a 10-minute cadence.
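A fixed 10-minute cadence stays drift-free if each poll is aligned to the next wall-clock boundary instead of sleeping a fixed interval. A minimal sketch (the helper name is ours):

```python
def next_poll(ts_epoch: int, cadence_s: int = 600) -> int:
    """Return the next epoch second aligned to the telemetry cadence.

    With the default 600 s cadence, polls land on exact 10-minute
    boundaries regardless of how long the previous collection took.
    """
    return ((ts_epoch // cadence_s) + 1) * cadence_s
```

Calling this at an exact boundary schedules the following interval, so a collection run never fires twice in the same bucket.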

Automation

Self-hosted n8n workflows orchestrate provisioning and lifecycle automation.

Scheduling

Self-hosted Cal.com handles team scheduling without leaking data to third parties.

AI model stack — running in production

A combination of locally hosted open models and selected cloud reasoning layers, orchestrated to keep sensitive data on Canadian infrastructure.

  • Gemma 3 27B / 3n E4B / E2B — local inference for general reasoning
  • Sarvam Bulbul v3 — Tamil text-to-speech streaming pipeline
  • Sarvam 30B — Tamil-language NLU and dialogue
  • Whisper.cpp Large-v3-Turbo — Tamil speech-to-text, sub-500 ms on-server latency
  • OpenVoice v2 (CPU Hybrid) — voice-cloning pipeline with Edge TTS and tone-colour conversion
  • ElevenLabs IVC — premium-tier voice cloning, additive to the core pipeline
  • Gemini 2.0 Flash Lite / 2.5 Flash — cloud reasoning layer where appropriate
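The orchestration rule implied above is simple: anything touching sensitive data stays on local engines; only non-sensitive reasoning may reach the cloud layer. A simplified sketch, where the function and engine labels are our own illustrations, not the production routing table:

```python
def pick_backend(task: str, contains_sensitive_data: bool) -> str:
    """Route a request to a local (Canadian-resident) engine or a
    cloud reasoning layer. Labels are illustrative assumptions."""
    if contains_sensitive_data:
        # Sensitive data never leaves sovereign compute.
        return "local:gemma"
    if task == "reasoning":
        # Non-sensitive reasoning may use the cloud layer.
        return "cloud:gemini-flash"
    return "local:gemma"
```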

Engine registry

A locked, versioned set of inference engines used across VoxTN products and managed services.

TTS

  • Edge TTS (default)
  • Coqui XTTS v2
  • Meta MMS Tamil
  • Google Chirp 3 HD
  • Gemini 2.5 TTS
  • Sarvam Bulbul v3
  • Sarvam Edge
  • ElevenLabs IVC

STT

  • Whisper.cpp Large-v3-Turbo
  • Sarvam Saaras v3

Cloning

  • OpenVoice v2 CPU Hybrid (active)
  • GPT-SoVITS
  • ElevenLabs IVC (additive tier)

Platform technology stack

Frontend

  • Astro 5
  • Next.js
  • Flutter

Backend

  • FastAPI (Python)
  • Go microservices
  • Celery worker + beat

Data

  • PostgreSQL 16
  • pgvector
  • Redis 7
  • Cloudflare R2 object storage
  • Sanity CMS

Auth

  • Clerk (JWT)

Automation

  • n8n workflows

Payments

  • Stripe

Deployments

  • Vercel (web)
  • Sovereign Canadian compute (services and AI)

Scaling roadmap

A multi-phase plan for sovereign capacity expansion, including GPU-accelerated inference clusters and a dual-node architecture for high availability.

  1. Phase 1 — Single-node sovereign stack

     Today: production AI workloads served from Canadian-residency compute with full backup and observability.

  2. Phase 2 — Dual-node high availability

     Active-active architecture for managed services and AI inference across two Canadian sites.

  3. Phase 3 — GPU inference cluster

     Dedicated GPU capacity for higher-throughput voice and video generation workloads.

Need this kind of infrastructure for your business?

We design, deploy, and operate sovereign technology stacks for Canadian companies in regulated industries.