Platform Architecture

This document describes the Adaptensor platform architecture: how data flows through the system, where the security boundaries sit, and how the platform scales.

System Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                              ADAPTENSOR PLATFORM                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │
│  │   Web UI    │    │ Python SDK  │    │  REST API   │    │   Webhooks  │  │
│  │  (React)    │    │   (PyPI)    │    │  (Direct)   │    │ (Callbacks) │  │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘    └──────┬──────┘  │
│         │                  │                  │                  │          │
│         └──────────────────┼──────────────────┼──────────────────┘          │
│                            │                  │                             │
│                            ▼                  ▼                             │
│                    ┌─────────────────────────────────────┐                  │
│                    │         API Gateway (Cloud Run)     │                  │
│                    │   • Authentication (Firebase)       │                  │
│                    │   • Rate Limiting                   │                  │
│                    │   • Request Routing                 │                  │
│                    └──────────────────┬──────────────────┘                  │
│                                       │                                     │
│         ┌─────────────────────────────┼─────────────────────────────┐       │
│         │                             │                             │       │
│         ▼                             ▼                             ▼       │
│  ┌─────────────┐              ┌─────────────┐              ┌─────────────┐  │
│  │  Document   │              │   Search    │              │   Billing   │  │
│  │  Service    │              │   Service   │              │   Service   │  │
│  │             │              │             │              │             │  │
│  │ • Upload    │              │ • Query     │              │ • Credits   │  │
│  │ • Index     │              │ • Retrieve  │              │ • Metering  │  │
│  │ • Chunk     │              │ • Rank      │              │ • Stripe    │  │
│  └──────┬──────┘              └──────┬──────┘              └──────┬──────┘  │
│         │                            │                            │         │
│         └────────────────────────────┼────────────────────────────┘         │
│                                      │                                      │
│                                      ▼                                      │
│                    ┌─────────────────────────────────────┐                  │
│                    │       AdaptCore™ + AdaptLLM™        │                  │
│                    │   TPU Middleware & Inference Engine │                  │
│                    └──────────────────┬──────────────────┘                  │
│                                       │                                     │
│                    ┌──────────────────┼──────────────────┐                  │
│                    │                  │                  │                  │
│                    ▼                  ▼                  ▼                  │
│             ┌───────────┐      ┌───────────┐      ┌───────────┐            │
│             │ TPU v2-8  │      │ TPU v2-8  │      │ TPU v4-8  │            │
│             │ (Shared)  │      │ (Shared)  │      │(On-Demand)│            │
│             └───────────┘      └───────────┘      └───────────┘            │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Data Storage Architecture

Per-User Isolation

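Every user's data is isolated by user ID. Storage objects live under a per-user prefix, and each Firestore document and transaction record carries a userId field: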

Google Cloud Storage
├── gs://adaptensor-uploads/
│   ├── {user_id_1}/
│   │   ├── document_abc.pdf
│   │   ├── document_def.docx
│   │   └── ...
│   ├── {user_id_2}/
│   │   └── ...
│   └── {user_id_N}/
├── gs://adaptensor-indexes/
│   ├── {user_id_1}/
│   │   ├── index_aviation/
│   │   │   ├── chunks.json
│   │   │   └── vectors.adapthex
│   │   └── index_legal/
│   ├── {user_id_2}/
│   │   └── ...
│   └── {user_id_N}/
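
The bucket layout above implies that every object path starts with the owner's user ID. A minimal sketch of how such paths might be built and checked follows. The helper names (upload_path, index_prefix, owned_by) are hypothetical and not part of the Adaptensor SDK, and the rule rejecting "/" inside identifiers is an assumption made for illustration.

```python
# Hypothetical helpers; only the bucket names come from the layout above.
UPLOAD_BUCKET = "adaptensor-uploads"
INDEX_BUCKET = "adaptensor-indexes"


def _check_id(value: str) -> str:
    # Assumption: identifiers are non-empty and never contain path
    # separators, so one user cannot address another user's prefix.
    if not value or "/" in value or value in (".", ".."):
        raise ValueError(f"invalid identifier: {value!r}")
    return value


def upload_path(user_id: str, object_name: str) -> str:
    """Path of an uploaded document under the user's prefix."""
    return f"gs://{UPLOAD_BUCKET}/{_check_id(user_id)}/{_check_id(object_name)}"


def index_prefix(user_id: str, index_name: str) -> str:
    """Prefix holding chunks.json and vectors.adapthex for one index."""
    return f"gs://{INDEX_BUCKET}/{_check_id(user_id)}/{_check_id(index_name)}/"


def owned_by(path: str, user_id: str) -> bool:
    """True if a gs:// path sits under the given user's prefix in either bucket."""
    prefixes = (
        f"gs://{UPLOAD_BUCKET}/{user_id}/",
        f"gs://{INDEX_BUCKET}/{user_id}/",
    )
    return path.startswith(prefixes)
```

Including the trailing slash in each prefix matters here: without it, a check for user "u1" would also match paths belonging to "u12".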

Firestore Database
├── users/
│   ├── {user_id_1}/
│   │   ├── credits: 45.50
│   │   ├── createdAt: ...
│   │   └── settings: {...}
│   └── ...
├── documents/
│   ├── {doc_id}/
│   │   ├── userId: {user_id}  ← Isolation key
│   │   ├── filename: "..."
│   │   ├── chunks: 150
│   │   └── ...
│   └── ...
├── transactions/
│   ├── {tx_id}/
│   │   ├── userId: {user_id}
│   │   ├── type: "query"
│   │   ├── amount: -0.0001
│   │   └── timestamp: ...
│   └── ...
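
The users and transactions collections above suggest a simple ledger: each charge lowers the user's credits and appends a transaction with a negative amount. The sketch below shows that bookkeeping in memory only. The Transaction class, the charge function, and the InsufficientCredits exception are hypothetical names, and the use of Decimal is an assumption; how credits are actually stored in Firestore is not specified here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal


class InsufficientCredits(Exception):
    """Raised when the balance cannot cover the requested charge."""


@dataclass(frozen=True)
class Transaction:
    # Field names mirror the transactions/ records shown above.
    user_id: str
    type: str
    amount: Decimal      # negative for charges
    timestamp: datetime


def charge(balance: Decimal, cost: Decimal, user_id: str, tx_type: str):
    """Return (new_balance, transaction) for a single charge."""
    if cost < 0:
        raise ValueError("cost must be non-negative")
    if balance < cost:
        raise InsufficientCredits(f"balance {balance} is below cost {cost}")
    tx = Transaction(user_id, tx_type, -cost, datetime.now(timezone.utc))
    return balance - cost, tx
```

Decimal avoids the rounding drift that binary floats would accumulate over many $0.0001 charges. A production version would presumably perform the balance check, the deduction, and the log write atomically, for example inside a database transaction.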

Security Boundaries

┌─────────────────────────────────────────────────────────────┐
│                    Security Perimeter                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Authentication Layer (Firebase Auth)                       │
│  ├── Google OAuth                                          │
│  ├── Email/Password                                        │
│  └── API Keys (SHA256 hashed)                              │
│                                                             │
│  Authorization Layer                                        │
│  ├── Every request validates: token → user_id              │
│  ├── Every data access filters by: user_id                 │
│  └── No cross-tenant queries possible                      │
│                                                             │
│  Data Layer                                                 │
│  ├── GCS: IAM policies restrict access to service account  │
│  ├── Firestore: Security rules enforce user_id matching    │
│  └── TPU: No persistent storage, process only              │
│                                                             │
│  Network Layer                                              │
│  ├── HTTPS everywhere (TLS 1.3)                            │
│  ├── Cloud Run: Private ingress option                     │
│  └── VPC Service Controls (Enterprise)                     │
│                                                             │
└─────────────────────────────────────────────────────────────┘
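
Two of the controls above lend themselves to a short illustration: storing API keys only as SHA-256 hashes, and filtering every data access by user_id. The sketch below uses only the Python standard library. The function names are hypothetical, the key format is an assumption, and the in-memory list stands in for a Firestore query.

```python
import hashlib
import hmac
import secrets


def hash_api_key(key: str) -> str:
    """SHA-256 hex digest of an API key; only this value would be stored."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def generate_api_key() -> tuple[str, str]:
    """Return (plaintext_key, stored_hash). The plaintext is shown to the user once."""
    key = secrets.token_urlsafe(32)  # assumption: random URL-safe key format
    return key, hash_api_key(key)


def verify_api_key(presented: str, stored_hash: str) -> bool:
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the hash matched.
    return hmac.compare_digest(hash_api_key(presented), stored_hash)


def documents_for(user_id: str, documents: list[dict]) -> list[dict]:
    """Tenant filter: keep only records whose userId matches the caller."""
    return [d for d in documents if d.get("userId") == user_id]
```

A plain SHA-256 hash is reasonable for high-entropy random keys like these; it would not be an appropriate choice for user-chosen passwords.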

Request Flow: Document Upload

1. User uploads file via dashboard or SDK
2. API Gateway receives request
   ├── Validates Firebase token
   ├── Extracts user_id
   └── Checks credit balance
3. Document Service
   ├── Generates unique doc_id
   ├── Uploads to gs://adaptensor-uploads/{user_id}/{doc_id}
   └── Creates Firestore document record
4. Chunking Engine
   ├── Downloads file from GCS
   ├── Parses (PDF, DOCX, etc.)
   ├── Splits into semantic chunks
   └── Returns chunk array
5. Embedding Service (TPU)
   ├── Batches chunks (AdaptCore bucketing)
   ├── Generates 384-dim embeddings
   └── Returns vectors
6. Compression (AdaptHex™)
   ├── Quantizes float32 → hex8/hex4
   ├── 4-8x size reduction
   └── Returns compressed vectors
7. Storage
   ├── Saves chunks + vectors to gs://adaptensor-indexes/{user_id}/
   └── Updates Firestore with chunk count, status
8. Billing
   ├── Calculates cost: chunks × $0.0001
   ├── Deducts from user credits
   └── Logs transaction
9. Response to user with document_id, chunk_count, cost
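
The 4-8x figure in step 6 follows from the bit widths: a 384-dimension float32 vector takes 384 × 4 = 1536 bytes, while 8-bit codes take 384 bytes (4x smaller) and 4-bit codes take 192 bytes (8x smaller), ignoring a few bytes of per-vector metadata. The AdaptHex format itself is not described in this document, so the sketch below uses generic uniform scalar quantization as a stand-in to show the size arithmetic. The quantize and dequantize names are hypothetical.

```python
def quantize(vector, bits):
    """Uniformly quantize floats to `bits`-bit codes (bits must be 8 or 4).

    Returns (packed_bytes, lo, scale); lo and scale are needed to decode.
    """
    if bits not in (4, 8):
        raise ValueError("bits must be 4 or 8")
    levels = (1 << bits) - 1
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / levels
    if scale == 0:
        scale = 1.0  # constant vector: every code is 0
    codes = [min(levels, max(0, round((x - lo) / scale))) for x in vector]
    if bits == 8:
        return bytes(codes), lo, scale
    if len(codes) % 2:
        codes.append(0)  # pad so two 4-bit codes fill each byte
    packed = bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))
    return packed, lo, scale


def dequantize(packed, lo, scale, bits, dim):
    """Approximate inverse of quantize(); dim drops any padding code."""
    if bits == 8:
        codes = list(packed)
    else:
        codes = []
        for byte in packed:
            codes.extend((byte >> 4, byte & 0x0F))
    return [lo + c * scale for c in codes[:dim]]
```

With this scheme the reconstruction error per component is at most half a quantization step, so the 4-bit variant trades noticeably more precision for its smaller size.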

Request Flow: Semantic Query

1. User sends query via API/SDK
2. API Gateway
   ├── Validates authentication
   ├── Extracts user_id
   └── Checks credits (≥ $0.0001)
3. Query Service
   ├── Embeds query text (same model as indexing)
   └── Returns 384-dim vector
4. Search Engine
   ├── Loads user's index from gs://adaptensor-indexes/{user_id}/
   ├── Decompresses AdaptHex vectors (on-demand)
   ├── Computes cosine similarity
   └── Returns top-k matches
5. (Optional) Reranking
   ├── Cross-encoder scoring
   └── Reorders by relevance
6. Billing
   ├── Deducts $0.0001 (query cost)
   └── Logs transaction
7. Response with results, scores, latency
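
Step 4 of this flow, cosine similarity followed by top-k selection, can be sketched as a brute-force scan over a user's vectors. This is an illustration under assumptions, not the platform's search engine: the function names are hypothetical, the index is a plain list of (chunk_id, vector) pairs, and a production system might use an approximate nearest-neighbour structure instead.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k(query, index, k=5):
    """Return the k best (score, chunk_id) pairs, highest score first.

    `index` is an iterable of (chunk_id, vector) pairs.
    """
    scored = ((cosine_similarity(query, vec), chunk_id) for chunk_id, vec in index)
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])
```

For example, top_k(query_vector, user_index, k=3) would return the three chunks closest in direction to the query. heapq.nlargest keeps only k candidates in memory at a time, which suits a scan over a large index.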

Scaling Strategy

Current: Small Scale (1-100 users)

┌─────────────────┐     ┌─────────────────┐
│  Cloud Run API  │────▶│   TPU v2-8      │
│  (Auto-scaling) │     │   (On-demand)   │
└─────────────────┘     └─────────────────┘
  • Single API instance handles all requests
  • TPU spun up per job, destroyed after
  • Cost: ~$0/month when idle

Medium Scale (100-1000 users)

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Cloud Run API  │     │   Job Queue     │     │  TPU Pool       │
│ (3-10 instances)│────▶│  (Cloud Tasks)  │────▶│  (2-4 v2-8)     │
└─────────────────┘     └─────────────────┘     └─────────────────┘
  • API auto-scales with traffic
  • Job queue buffers bursty workloads
  • Small TPU pool stays warm
  • Cost: ~$500-2000/month base

Large Scale (1000+ users)

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Load Balancer  │     │  Regional API   │     │  TPU Pods       │
│  (Global)       │────▶│  Clusters       │────▶│  (v5p)          │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                                ┌───────┴───────┐
                                                ▼               ▼
                                         ┌───────────┐   ┌───────────┐
                                         │ Shared    │   │ Dedicated │
                                         │ Pool      │   │ Customers │
                                         └───────────┘   └───────────┘
  • Multi-region deployment
  • TPU pods for massive throughput
  • Dedicated TPU options for enterprise
  • Cost: Usage-based, margins improve at scale

Technology Stack

Layer     Technology                Purpose
--------  ------------------------  -------------------
Frontend  React + Vite + Tailwind   Dashboard UI
Hosting   Firebase Hosting          Static assets, CDN
Auth      Firebase Authentication   User identity
API       Cloud Run (Python/Flask)  Request handling
Database  Firestore                 User data, metadata
Storage   Google Cloud Storage      Documents, indexes
Queue     Cloud Tasks               Job scheduling
Compute   TPU v2-8 / v4-8 / v5p     AI inference
Billing   Stripe                    Payment processing
Docs      MkDocs + Netlify          Documentation
SDK       Python (PyPI)             Developer access

Compliance & Certifications

Standard       Status     Notes
-------------  ---------  ----------------------------
SOC 2 Type II  Ready      GCP infrastructure compliant
HIPAA          Ready      BAA available through GCP
GDPR           Compliant  Data residency options
ISO 27001      Inherited  Via GCP certification
FedRAMP        Roadmap    Enterprise feature

Disaster Recovery

Component  RPO  RTO       Strategy
---------  ---  --------  -----------------------
User Data  0    < 1 hr    GCS multi-region
Firestore  0    < 5 min   Auto-failover
API        N/A  < 5 min   Cloud Run auto-recovery
TPU        N/A  < 30 min  Zone failover
