🧑‍💻 Tech & AI Stack
Belto’s infrastructure is engineered for scalability, security, and performance, built entirely around delivering safe and structured AI in classrooms.
🧠 Core AI Engine
LLaMA.cpp + GGUF Models: Belto runs open-source large language models (LLMs) via llama.cpp, optimized for low-latency inference using quantized GGUF models.
DeepSeek & LLaMA Support: We support both Meta’s LLaMA family and DeepSeek models, chosen according to the teacher’s selection (e.g., reasoning, generation, or coding) to balance performance and token efficiency; see the sketch after this list.
Modular Server Architecture: Our system supports the parallel deployment of multiple LLMs across different backends. Workloads are distributed dynamically based on load and model suitability.
Local Infrastructure: Hosted on-premises across three physical servers equipped with 3× RTX 3060 GPUs and 1× RTX 4090, optimized for inference throughput.
Hybrid-Cloud Ready: In high-load scenarios or for redundancy, we spin up cloud instances on AWS and Azure, enabling seamless hybrid operation.
Token Efficiency: All generations are capped and throttled to comply with teacher-set token rules, ensuring performance and budget control.
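As a rough illustration of how per-task model selection and teacher-set token caps can fit together on top of llama.cpp, here is a minimal sketch using the llama-cpp-python bindings. The model paths, task names, context size, and the hard ceiling are hypothetical placeholders, not Belto’s actual configuration, and the dynamic load distribution described above is out of scope here.

```python
# Minimal sketch of per-task model selection with capped generation, using
# the llama-cpp-python bindings. Paths, task names, context size, and the
# hard ceiling are illustrative assumptions, not Belto's real setup.
from llama_cpp import Llama

# Hypothetical mapping from a teacher-selected task to a quantized GGUF model.
MODEL_PATHS = {
    "reasoning": "models/deepseek-r1-distill-7b.Q4_K_M.gguf",
    "generation": "models/llama-3.1-8b-instruct.Q4_K_M.gguf",
    "coding": "models/deepseek-coder-6.7b.Q4_K_M.gguf",
}

_loaded: dict[str, Llama] = {}

def get_model(task: str) -> Llama:
    """Lazily load and cache the model mapped to a task."""
    if task not in _loaded:
        _loaded[task] = Llama(
            model_path=MODEL_PATHS[task],
            n_ctx=4096,       # context window
            n_gpu_layers=-1,  # offload every layer to the GPU
        )
    return _loaded[task]

def generate(task: str, prompt: str, teacher_token_cap: int) -> str:
    """Generate a completion without exceeding the teacher-set token cap."""
    out = get_model(task).create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=min(teacher_token_cap, 512),  # hard ceiling as a safety net
    )
    return out["choices"][0]["message"]["content"]
```

For example, `generate("coding", "Explain this error message", teacher_token_cap=200)` loads the coding model once, reuses it on later calls, and never emits more than 200 tokens.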
🧩 System Architecture
Frontend: Built with Next.js, hosted on Vercel
Backend: FastAPI w/ async endpoints, exposed via secure API Gateway
Database: MongoDB Atlas for scalability, with schema enforcement per classroom
Webhook System: Stripe + internal services for payments, authentication, and AI usage logging; a minimal sketch of these backend pieces follows this list
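To make the backend pieces concrete, the sketch below wires a FastAPI async endpoint to a MongoDB collection with server-side $jsonSchema validation and adds a signature-verified Stripe webhook. The routes, collection names, schema fields, and environment variable names are assumptions for illustration, not Belto’s actual API surface.

```python
# Illustrative sketch only: routes, collection names, schema fields, and
# environment variables are assumptions, not Belto's actual API surface.
import os

import stripe
from fastapi import FastAPI, HTTPException, Request
from motor.motor_asyncio import AsyncIOMotorClient
from pymongo.errors import CollectionInvalid

app = FastAPI()
db = AsyncIOMotorClient(os.environ["MONGODB_URI"])["belto"]
stripe.api_key = os.environ["STRIPE_SECRET_KEY"]

# Server-side schema that MongoDB enforces on every insert.
CHAT_SCHEMA = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["classroom_id", "student_id", "prompt", "tokens_used"],
        "properties": {
            "classroom_id": {"bsonType": "string"},
            "student_id": {"bsonType": "string"},
            "prompt": {"bsonType": "string"},
            "tokens_used": {"bsonType": "int", "minimum": 0},
        },
    }
}

@app.on_event("startup")
async def ensure_schema() -> None:
    try:
        await db.create_collection("classroom_chats", validator=CHAT_SCHEMA)
    except CollectionInvalid:
        pass  # collection already exists with its validator attached

@app.post("/classrooms/{classroom_id}/chat")
async def chat(classroom_id: str, request: Request) -> dict:
    doc = await request.json()
    # ...route the prompt to an LLM backend here, then log the usage...
    await db["classroom_chats"].insert_one({"classroom_id": classroom_id, **doc})
    return {"status": "ok"}

@app.post("/webhooks/stripe")
async def stripe_webhook(request: Request) -> dict:
    payload = await request.body()
    sig = request.headers.get("stripe-signature", "")
    try:
        event = stripe.Webhook.construct_event(
            payload, sig, os.environ["STRIPE_WEBHOOK_SECRET"]
        )
    except (ValueError, stripe.error.SignatureVerificationError):
        raise HTTPException(status_code=400, detail="invalid signature")
    # Dispatch on event["type"] for payment and usage events.
    return {"received": True}
```

Verifying the stripe-signature header against the endpoint’s webhook secret is the standard Stripe pattern: forged payloads are rejected before any payment or usage state is touched.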