Platform Architecture

This document describes the Adaptensor platform architecture: system components, data storage, security boundaries, request flows, and scaling strategies.

System Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                              ADAPTENSOR PLATFORM                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │
│  │   Web UI    │    │ Python SDK  │    │  REST API   │    │   Webhooks  │  │
│  │  (React)    │    │   (PyPI)    │    │  (Direct)   │    │ (Callbacks) │  │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘    └──────┬──────┘  │
│         │                  │                  │                  │          │
│         └──────────────────┼──────────────────┼──────────────────┘          │
│                            │                  │                             │
│                            ▼                  ▼                             │
│                    ┌─────────────────────────────────────┐                  │
│                    │         API Gateway (Cloud Run)     │                  │
│                    │   • Authentication (Firebase)       │                  │
│                    │   • Rate Limiting                   │                  │
│                    │   • Request Routing                 │                  │
│                    └──────────────────┬──────────────────┘                  │
│                                       │                                     │
│         ┌─────────────────────────────┼─────────────────────────────┐       │
│         │                             │                             │       │
│         ▼                             ▼                             ▼       │
│  ┌─────────────┐              ┌─────────────┐              ┌─────────────┐  │
│  │  Document   │              │   Search    │              │   Billing   │  │
│  │  Service    │              │   Service   │              │   Service   │  │
│  │             │              │             │              │             │  │
│  │ • Upload    │              │ • Query     │              │ • Credits   │  │
│  │ • Index     │              │ • Retrieve  │              │ • Metering  │  │
│  │ • Chunk     │              │ • Rank      │              │ • Stripe    │  │
│  └──────┬──────┘              └──────┬──────┘              └──────┬──────┘  │
│         │                            │                            │         │
│         └────────────────────────────┼────────────────────────────┘         │
│                                      │                                      │
│                                      ▼                                      │
│                    ┌─────────────────────────────────────┐                  │
│                    │       AdaptCore™ + AdaptLLM™        │                  │
│                    │   TPU Middleware & Inference Engine │                  │
│                    └──────────────────┬──────────────────┘                  │
│                                       │                                     │
│                    ┌──────────────────┼──────────────────┐                  │
│                    │                  │                  │                  │
│                    ▼                  ▼                  ▼                  │
│             ┌───────────┐      ┌───────────┐      ┌───────────┐            │
│             │ TPU v2-8  │      │ TPU v2-8  │      │ TPU v4-8  │            │
│             │ (Shared)  │      │ (Shared)  │      │(On-Demand)│            │
│             └───────────┘      └───────────┘      └───────────┘            │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Data Storage Architecture

Per-User Isolation

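Each user's data is isolated by user ID. Storage objects live under a per-user prefix, and every Firestore record is either keyed by or tagged with the owner's user ID: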

Google Cloud Storage
├── gs://adaptensor-uploads/
│   ├── {user_id_1}/
│   │   ├── document_abc.pdf
│   │   ├── document_def.docx
│   │   └── ...
│   ├── {user_id_2}/
│   │   └── ...
│   └── {user_id_N}/
│
├── gs://adaptensor-indexes/
│   ├── {user_id_1}/
│   │   ├── index_aviation/
│   │   │   ├── chunks.json
│   │   │   └── vectors.adapthex
│   │   └── index_legal/
│   ├── {user_id_2}/
│   │   └── ...
│   └── {user_id_N}/
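
The per-user prefixes above could be built with a helper along these lines. This is a minimal sketch: the bucket names come from the layout, while the function names and the allowed-character rule are assumptions made for illustration, not part of the platform's API.

```python
import re

# Bucket names as shown in the storage layout.
UPLOAD_BUCKET = "adaptensor-uploads"
INDEX_BUCKET = "adaptensor-indexes"

# Assumption: IDs are limited to a conservative character set so that a
# crafted value such as "../other_user" cannot escape the caller's prefix.
_SAFE_SEGMENT = re.compile(r"[A-Za-z0-9_-]+")


def _check(segment: str) -> str:
    """Reject any path segment that is empty or contains unexpected characters."""
    if not _SAFE_SEGMENT.fullmatch(segment):
        raise ValueError(f"unsafe path segment: {segment!r}")
    return segment


def upload_uri(user_id: str, doc_id: str) -> str:
    """URI of an uploaded document under the owner's prefix."""
    return f"gs://{UPLOAD_BUCKET}/{_check(user_id)}/{_check(doc_id)}"


def index_prefix(user_id: str, index_name: str) -> str:
    """Prefix holding one index (chunks and vectors) for a single user."""
    return f"gs://{INDEX_BUCKET}/{_check(user_id)}/{_check(index_name)}/"
```

Deriving every path from a validated `user_id` keeps the isolation rule in one place rather than spreading string concatenation across services.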

Firestore Database
├── users/
│   ├── {user_id_1}/
│   │   ├── credits: 45.50
│   │   ├── createdAt: ...
│   │   └── settings: {...}
│   └── ...
│
├── documents/
│   ├── {doc_id}/
│   │   ├── userId: {user_id}  ← Isolation key
│   │   ├── filename: "..."
│   │   ├── chunks: 150
│   │   └── ...
│   └── ...
│
├── transactions/
│   ├── {tx_id}/
│   │   ├── userId: {user_id}
│   │   ├── type: "query"
│   │   ├── amount: -0.0001
│   │   └── timestamp: ...
│   └── ...
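
The `credits` and `transactions` records above suggest a simple ledger. The sketch below shows one way a charge could be computed and recorded, using the $0.0001 unit price quoted in the request flows of this document. The `Transaction` class and `charge` function are hypothetical names, and the use of `Decimal` is an assumption; the document does not say how amounts are stored.

```python
from dataclasses import dataclass
from decimal import Decimal

# Unit price per indexed chunk or per query, as quoted in the request flows.
UNIT_PRICE = Decimal("0.0001")


@dataclass(frozen=True)
class Transaction:
    """Hypothetical in-memory mirror of a `transactions/{tx_id}` record."""
    user_id: str
    type: str        # e.g. "query", matching the `type` field above
    amount: Decimal  # negative for a charge


def charge(credits: Decimal, user_id: str, tx_type: str,
           units: int = 1) -> tuple[Decimal, Transaction]:
    """Return the new balance and the ledger entry for `units` billable units."""
    cost = UNIT_PRICE * units
    if credits < cost:
        raise ValueError("insufficient credits")
    return credits - cost, Transaction(user_id, tx_type, -cost)


# One query against a 45.50 balance, then indexing a 150-chunk document.
balance, tx = charge(Decimal("45.50"), "user_1", "query")
balance, tx_index = charge(balance, "user_1", "index", units=150)
```

`Decimal` is used here because 0.0001 has no exact binary floating-point representation, so repeated float subtraction would drift over many small charges.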

Security Boundaries

┌─────────────────────────────────────────────────────────────┐
│                    Security Perimeter                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Authentication Layer (Firebase Auth)                       │
│  ├── Google OAuth                                          │
│  ├── Email/Password                                        │
│  └── API Keys (SHA-256 hashed)                             │
│                                                             │
│  Authorization Layer                                        │
│  ├── Every request validates: token → user_id              │
│  ├── Every data access filters by: user_id                 │
│  └── No cross-tenant queries possible                      │
│                                                             │
│  Data Layer                                                 │
│  ├── GCS: IAM policies restrict access to service account  │
│  ├── Firestore: Security rules enforce user_id matching    │
│  └── TPU: No persistent storage, process only              │
│                                                             │
│  Network Layer                                              │
│  ├── HTTPS everywhere (TLS 1.3)                            │
│  ├── Cloud Run: Private ingress option                     │
│  └── VPC Service Controls (Enterprise)                     │
│                                                             │
└─────────────────────────────────────────────────────────────┘
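
Two of the rules above, hashed API keys and per-user filtering, can be illustrated with the standard library alone. This is a sketch under assumptions: the function names are hypothetical, the key format is invented for the example, and the in-memory list stands in for a Firestore query.

```python
import hashlib
import hmac
import secrets


def new_api_key() -> tuple[str, str]:
    """Return (plaintext_key, sha256_hex). Only the hash would be stored."""
    key = secrets.token_urlsafe(32)
    return key, hashlib.sha256(key.encode("utf-8")).hexdigest()


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored hash in constant time."""
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, stored_hash)


def documents_for(user_id: str, documents: list[dict]) -> list[dict]:
    """Return only the records whose `userId` isolation key matches the caller."""
    return [doc for doc in documents if doc.get("userId") == user_id]
```

An unsalted SHA-256 hash is reasonable only because the key is long and random; it would not be appropriate for user-chosen passwords.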

Request Flow: Document Upload

1. User uploads file via dashboard or SDK
   │
   ▼
2. API Gateway receives request
   ├── Validates Firebase token
   ├── Extracts user_id
   └── Checks credit balance
   │
   ▼
3. Document Service
   ├── Generates unique doc_id
   ├── Uploads to gs://adaptensor-uploads/{user_id}/{doc_id}
   └── Creates Firestore document record
   │
   ▼
4. Chunking Engine
   ├── Downloads file from GCS
   ├── Parses (PDF, DOCX, etc.)
   ├── Splits into semantic chunks
   └── Returns chunk array
   │
   ▼
5. Embedding Service (TPU)
   ├── Batches chunks (AdaptCore bucketing)
   ├── Generates 384-dim embeddings
   └── Returns vectors
   │
   ▼
6. Compression (AdaptHex™)
   ├── Quantizes float32 → hex8/hex4
   ├── 4-8x size reduction
   └── Returns compressed vectors
   │
   ▼
7. Storage
   ├── Saves chunks + vectors to gs://adaptensor-indexes/{user_id}/
   └── Updates Firestore with chunk count, status
   │
   ▼
8. Billing
   ├── Calculates cost: chunks × $0.0001
   ├── Deducts from user credits
   └── Logs transaction
   │
   ▼
9. Response to user with document_id, chunk_count, cost
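
The 4-8x figure in step 6 follows from the bit widths: a float32 value takes 32 bits, so 8-bit codes are 4x smaller and 4-bit codes are 8x smaller. The AdaptHex format itself is not specified in this document, so the sketch below uses plain per-vector uniform scalar quantization as a stand-in to show the idea. All function names are hypothetical.

```python
def quantize(vector: list[float], bits: int = 8) -> tuple[list[int], float, float]:
    """Map each float to an integer code in [0, 2**bits - 1].

    Returns (codes, offset, scale) so the vector can be approximately restored.
    """
    levels = (1 << bits) - 1
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale


def dequantize(codes: list[int], lo: float, scale: float) -> list[float]:
    """Approximate inverse of quantize; error per value is at most scale / 2."""
    return [lo + c * scale for c in codes]


def packed_size_bytes(dim: int, bits: int) -> int:
    """Bytes needed for `dim` codes of `bits` bits each, excluding offset and scale."""
    return (dim * bits + 7) // 8


float32_bytes = 384 * 4                                   # 1536 bytes per embedding
ratio_8bit = float32_bytes // packed_size_bytes(384, 8)   # 4
ratio_4bit = float32_bytes // packed_size_bytes(384, 4)   # 8
```

The ratios ignore the per-vector offset and scale, which add a few bytes to each stored vector.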

Request Flow: Semantic Query

1. User sends query via API/SDK
   │
   ▼
2. API Gateway
   ├── Validates authentication
   ├── Extracts user_id
   └── Checks credits (≥ $0.0001)
   │
   ▼
3. Query Service
   ├── Embeds query text (same model as indexing)
   └── Returns 384-dim vector
   │
   ▼
4. Search Engine
   ├── Loads user's index from gs://adaptensor-indexes/{user_id}/
   ├── Decompresses AdaptHex vectors (on-demand)
   ├── Computes cosine similarity
   └── Returns top-k matches
   │
   ▼
5. (Optional) Reranking
   ├── Cross-encoder scoring
   └── Reorders by relevance
   │
   ▼
6. Billing
   ├── Deducts $0.0001 (query cost)
   └── Logs transaction
   │
   ▼
7. Response with results, scores, latency
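
Step 4 of the query flow, cosine similarity followed by top-k selection, can be sketched in pure Python. This is an illustrative brute-force version with hypothetical function names; the document does not describe how the production search engine is implemented, and a real system would likely use vectorized operations.

```python
import heapq
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], vectors: list[list[float]],
          k: int = 5) -> list[tuple[float, int]]:
    """Return up to k (score, chunk_index) pairs, best match first."""
    scored = ((cosine(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored)


# Toy 2-dimensional example; real embeddings in this document are 384-dim.
matches = top_k([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], k=2)
```

`heapq.nlargest` avoids sorting the full score list, which matters when k is small relative to the number of chunks in the index.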

Scaling Strategy

Current: Small Scale (1-100 users)

┌─────────────────┐     ┌─────────────────┐
│  Cloud Run API  │────▶│   TPU v2-8      │
│  (Auto-scaling) │     │   (On-demand)   │
└─────────────────┘     └─────────────────┘

Single API instance handles all requests
TPU spun up per job, destroyed after
Cost: ~$0/month when idle

Medium Scale (100-1000 users)

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Cloud Run API  │     │   Job Queue     │     │  TPU Pool       │
│ (3-10 instances)│────▶│  (Cloud Tasks)  │────▶│  (2-4 v2-8)     │
└─────────────────┘     └─────────────────┘     └─────────────────┘

API auto-scales with traffic
Job queue buffers bursty workloads
Small TPU pool stays warm
Cost: ~$500-2000/month base
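
The buffering role of the job queue can be illustrated with a bounded in-process queue feeding a fixed pool of workers. This is only an analogy built on the standard library: `queue.Queue` stands in for Cloud Tasks, the worker threads stand in for the warm TPU pool, and `run_pool` is a hypothetical name.

```python
import queue
import threading
from typing import Any, Callable, Iterable


def run_pool(jobs: Iterable[Callable[[], Any]], worker_count: int = 2,
             max_pending: int = 100) -> list[Any]:
    """Run jobs on a fixed pool of workers fed from a bounded buffer."""
    pending: queue.Queue = queue.Queue(maxsize=max_pending)
    results: list[Any] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            job = pending.get()
            if job is None:          # sentinel: no more work for this worker
                return
            out = job()
            with lock:
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for job in jobs:
        pending.put(job)             # blocks while the buffer is full
    for _ in threads:
        pending.put(None)            # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

A burst of submissions fills the buffer instead of overwhelming the workers, and the blocking `put` applies backpressure to the producer once the buffer is full. Results arrive in completion order, not submission order.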

Large Scale (1000+ users)

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Load Balancer  │     │  Regional API   │     │  TPU Pods       │
│  (Global)       │────▶│  Clusters       │────▶│  (v5p)          │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                                        │
                                                ┌───────┴───────┐
                                                ▼               ▼
                                         ┌───────────┐   ┌───────────┐
                                         │ Shared    │   │ Dedicated │
                                         │ Pool      │   │ Customers │
                                         └───────────┘   └───────────┘

Multi-region deployment
TPU pods for massive throughput
Dedicated TPU options for enterprise
Cost: Usage-based, margins improve at scale

Technology Stack

Layer	Technology	Purpose
Frontend	React + Vite + Tailwind	Dashboard UI
Hosting	Firebase Hosting	Static assets, CDN
Auth	Firebase Authentication	User identity
API	Cloud Run (Python/Flask)	Request handling
Database	Firestore	User data, metadata
Storage	Google Cloud Storage	Documents, indexes
Queue	Cloud Tasks	Job scheduling
Compute	TPU v2-8 / v4-8 / v5p	AI inference
Billing	Stripe	Payment processing
Docs	MkDocs + Netlify	Documentation
SDK	Python (PyPI)	Developer access

Compliance & Certifications

Standard	Status	Notes
SOC 2 Type II	Ready	GCP infrastructure compliant
HIPAA	Ready	BAA available through GCP
GDPR	Compliant	Data residency options
ISO 27001	Inherited	Via GCP certification
FedRAMP	Roadmap	Enterprise feature

Disaster Recovery

Component	RPO	RTO	Strategy
User Data	0	< 1hr	GCS multi-region
Firestore	0	< 5min	Auto-failover
API	N/A	< 5min	Cloud Run auto-recovery
TPU	N/A	< 30min	Zone failover

Next Steps

Getting Started - Upload your first document
Python SDK Guide - Integrate with your applications
API Reference - Full endpoint documentation