# Global Architecture - Smart Transcription BFF
Version: 1.0.0
Date: March 11, 2026
Status: Production
## Table of Contents
- Overview
- Architecture v3: Service Separation
- Technology Stack
- Inter-Service Communication
- Project Structure
- Deployment
- Security
## 1. Overview

### 1.1 Role of the BFF in the Ecosystem

The Smart Transcription BFF (Backend For Frontend) is the application service that orchestrates the entire smart-transcription workflow:
*Diagram: the frontend (palabre.io) talks to the Smart Transcription BFF (API routes, auth service, RAG service, post-processing) over REST API + JWT + SSE. The BFF reads/writes PostgreSQL, Qdrant (vector DB), Redis (Streams) and AWS S3 (audio, docs); it calls the MeetNoo GPU services (ML pipeline, LLM engine) over synchronous HTTP, and receives their events back through Redis Streams.*
BFF responsibilities:
- User authentication (JWT)
- Credit & billing management
- RAG indexing (documents → Qdrant)
- Speaker identification post-processing
- Metadata enrichment
- Deliverables generation (PDF, PPTX, audio)
- SSE progress tracking

Outside the BFF's scope (delegated to MeetNoo):
- Audio diarization (PyAnnote AI)
- Transcription (Whisper large-v3)
- Audio voiceprint extraction (PyAnnote 512d)
- LLM inference (Qwen 2.5-3B)
- GPU orchestration (Ray Serve)
## 2. Architecture v3: Service Separation

### 2.1 Container Diagram
*Container diagram: the frontend (Vite + TypeScript) calls the Smart Transcription BFF (port 8001, Internet-facing: FastAPI router, JWT auth service, RAG service with Qdrant indexing, post-processing for speaker ID + enrichment, SSE progress, deliverables generator). The BFF uses PostgreSQL (schema st.*), Qdrant (vector DB), Redis (Streams + cache) and AWS S3, and calls the MeetNoo GPU services (port 8000, VPN-only: Pipeline API and LLM API feeding Dramatiq workers, which dispatch to Ray Serve on GPU) over synchronous HTTP; pipeline events come back to the BFF via Redis Streams.*
### 2.2 Separation Principle
| Aspect | Smart Transcription BFF | MeetNoo GPU Services |
|---|---|---|
| Responsibility | Application layer, business orchestration | ML engine, GPU compute |
| Exposure | Internet (port 8001) | Internal VPN (port 8000) |
| Database | PostgreSQL schema st.* | PostgreSQL schema meetnoo.* |
| Language | Python 3.11, FastAPI | Python 3.11, Dramatiq + Ray |
| GPU | No | Yes (NVIDIA A6000 48GB) |
| Authentication | JWT tokens, users, credits | Pipeline API Key (internal) |
| Coupling | Calls MeetNoo (HTTP) | Unaware of smart-trans |
| Logs | Application logs | ML/GPU logs |
Unidirectional communication:

Smart Transcription → HTTP POST → MeetNoo (request)
Smart Transcription ← Redis Stream ← MeetNoo (response)

Zero reverse coupling: MeetNoo knows nothing about smart-transcription; it simply publishes to Redis.
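On the BFF side, these one-way events only have to be folded into a job status. A minimal, dependency-free sketch using the field names from the pipeline:events schema in §4.3 (the equal per-stage progress weighting is illustrative, not the service's actual formula):

```python
# Sketch: fold pipeline:events messages (see §4.3) into one job status.
STAGES = ["diarize", "transcribe", "voiceprint", "finalize"]

def fold_events(events: list[dict]) -> dict:
    """Return {'status', 'progress'} for one txn from its event stream."""
    status, progress = "pending", 0
    for ev in events:
        if ev["status"] == "failed":
            return {"status": "failed", "progress": progress,
                    "error": ev.get("error")}
        stage_idx = STAGES.index(ev["stage"])
        if ev["status"] == "completed":
            # Each completed stage contributes an equal share of progress.
            progress = max(progress, int((stage_idx + 1) / len(STAGES) * 100))
            status = "done" if ev["stage"] == "finalize" else "running"
        else:  # "started"
            status = "running"
    return {"status": status, "progress": progress}
```

Because MeetNoo only ever appends to the stream, the BFF can replay or re-consume events idempotently; the fold always converges to the same status.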
## 3. Technology Stack

### 3.1 Backend Framework
Framework: FastAPI 0.104+
Language: Python 3.11
ASGI Server: Uvicorn
Validation: Pydantic v2
ORM: SQLAlchemy 2.0
Migrations: Alembic
FastAPI structure:
```
src/
├── main.py                          # Application entry point
├── routers/                         # API endpoints
│   ├── auth.py                      # Authentication
│   ├── transcripts.py               # Transcription workflow
│   ├── contextual_files.py          # RAG document upload
│   └── deliverables.py              # Summaries, presentations
├── services/                        # Business logic
│   ├── transcription_rag_service.py
│   ├── speaker_identification_service.py
│   ├── qdrant_service.py
│   ├── embedding_service.py
│   └── llm_post_processor.py
├── models/                          # SQLAlchemy models
├── schemas/                         # Pydantic schemas
└── db.py                            # Database session
```
### 3.2 AI/ML Stack
| Component | Technology | Usage |
|---|---|---|
| Embeddings | BAAI/bge-m3 (1024d) | Text embeddings for RAG |
| Vector DB | Qdrant 1.8+ | Semantic search |
| Chunking | LlamaIndex SemanticSplitter | Document splitting |
| Metadata Extraction | OpenAI GPT-4o-mini | LLM extraction with regex fallback |
| Text Extraction | pdfplumber, python-docx | PDF/DOCX parsing |
| LLM Post-Processing | Qwen 2.5-3B (via MeetNoo) | Cleaning + identification |
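The "LLM extraction with regex fallback" row is the pattern implemented by `hybrid_metadata_extractor.py`. A dependency-free sketch of the fallback mechanic (the regex and the field names here are illustrative, not the service's actual ones):

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def extract_metadata(text: str, llm_extract) -> dict:
    """Try the LLM extractor first; fall back to regex on any failure."""
    try:
        meta = llm_extract(text)          # e.g. GPT-4o-mini structured output
        if meta:                          # empty result also triggers fallback
            return {"source": "llm", **meta}
    except Exception:
        pass                              # timeout, rate limit, bad JSON ...
    return {"source": "regex", "emails": EMAIL_RE.findall(text)}
```

In the real service the fallback path is paired with a stricter RAG similarity threshold (RAG_SIMILARITY_THRESHOLD_REGEX=0.5 vs 0.4 for LLM output; see §6.3).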
### 3.3 Databases & Storage
Relational DB:
Engine: PostgreSQL 15
Schema: st.*
Tables: users, transcripts, enriched_segments, voiceprint_library
Connection Pool: 10-20 connections
Vector DB:
Engine: Qdrant 1.8+
Collections: user_{userId}_transcript_{transcriptId}
Distance: Cosine
Dimensions: 1024 (BGE-M3)
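Qdrant computes the cosine score server-side; for reference, the score it returns for two BGE-M3 vectors is equivalent to the pure-Python definition below (toy dimensions here, the real vectors are 1024-d). The per-transcript collection-name builder follows the convention stated above:

```python
import math

def collection_name(user_id: str, transcript_id: str) -> str:
    # Naming convention from this section: one collection per transcript.
    return f"user_{user_id}_transcript_{transcript_id}"

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity, the distance metric configured for the collections."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

One collection per transcript keeps deletes cheap (drop the collection) and makes user isolation a naming invariant rather than a query filter.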
Cache & Queue:
Engine: Redis 7
Usage:
- Redis Streams (pipeline:events, llm:reply:{id})
- Metadata cache (7 days TTL)
- Session cache
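The metadata cache relies on Redis key TTLs (SETEX/GET). A toy model of the semantics being assumed, runnable without a Redis server (the `now` argument stands in for the server clock):

```python
class TTLCache:
    """Toy model of the Redis SETEX/GET semantics used for the metadata cache."""
    def __init__(self):
        self._store = {}  # key -> (value, expiry_timestamp)

    def setex(self, key, ttl_seconds, value, now=0.0):
        self._store[key] = (value, now + ttl_seconds)

    def get(self, key, now=0.0):
        item = self._store.get(key)
        if item is None or now >= item[1]:
            self._store.pop(key, None)  # expired entries are dropped lazily
            return None
        return item[0]

METADATA_TTL = 7 * 24 * 3600  # 604800s, matching METADATA_CACHE_TTL in §6.3
```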
Object Storage:
Provider: AWS S3
Buckets:
- audio-files/
- contextual-files/
- deliverables/
- voiceprints/
### 3.4 Communication
HTTP Client:
Library: httpx
Async: True
Timeout: 120s (LLM calls)
Retry: 3 attempts exponential backoff
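httpx itself does not retry, so the "3 attempts, exponential backoff" policy has to be applied around the request. A small helper showing the delay schedule such a policy produces (the 1 s base delay and 30 s cap are assumptions, not values stated in this document):

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delay before each retry: base * 2^n, capped. attempts=3 -> [1, 2, 4]."""
    return [min(base * (2 ** n), cap) for n in range(attempts)]
```

Each retry would re-issue the httpx request with its full timeout (120 s for LLM calls).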
Redis Client:
Library: redis-py
Streams: XREAD / XREADGROUP
Pub/Sub: Pipeline events
WebSocket/SSE:
Library: sse-starlette
Usage: Real-time progress updates
## 4. Inter-Service Communication

### 4.1 Pattern: HTTP Sync + Redis Streams
*Sequence (ST = Smart Transcription BFF, MN = MeetNoo):*

1. **Start:** ST → MN: `POST /api/v1/pipeline/start` with `{tenant_id, file_url}`; MN replies `202 {transcription_id}`.
2. **Async processing:** MN → Redis: `XADD pipeline:events` with `{txn_id, stage, status, progress}`.
3. **Event consumption:** ST → Redis: `XREADGROUP smart-trans-group`; Redis returns e.g. `[{txn_id, stage: diarize, status: completed}]`.
4. **Result retrieval:** ST → MN: `GET /api/v1/pipeline/{id}/result`; MN replies `200 {segments, speakers, voiceprints}`.
5. **BFF post-processing:** ST runs voiceprint matching + RAG + LLM.
### 4.2 MeetNoo Endpoints Called
| Endpoint | Method | Usage | Timeout |
|---|---|---|---|
| `/api/v1/pipeline/start` | POST | Start a transcription | 10s |
| `/api/v1/pipeline/{id}/status` | GET | Pipeline status | 5s |
| `/api/v1/pipeline/{id}/result` | GET | Full result | 10s |
| `/api/v1/llm/submit` | POST | Enqueue an LLM prompt | 10s |
Required headers:
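The header list itself did not survive in this copy of the document. Based on the environment variables in §6.3 (MEETNOO_SERVICE_TOKEN, MEETNOO_TENANT_ID, PIPELINE_API_KEY), a hypothetical shape, to be verified against the MeetNoo API documentation:

```python
import os

def meetnoo_headers() -> dict:
    # HYPOTHETICAL header names, inferred from the §6.3 env vars; verify
    # against the MeetNoo API documentation before relying on them.
    return {
        "Authorization": f"Bearer {os.environ.get('MEETNOO_SERVICE_TOKEN', '')}",
        "X-Tenant-ID": os.environ.get("MEETNOO_TENANT_ID", "smart-transcription"),
        "X-API-Key": os.environ.get("PIPELINE_API_KEY", ""),
    }
```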
### 4.3 Redis Streams Consumed
Stream pipeline:events:
```json
{
  "txn_id": "uuid",
  "stage": "diarize|transcribe|voiceprint|finalize",
  "status": "started|completed|failed",
  "progress": "0-100",
  "error": "optional error message"
}
```
Stream llm:reply:{request_id}:
```json
{
  "request_id": "uuid",
  "status": "completed|failed",
  "result": "{\"text\":\"...\"}",
  "error": "optional"
}
```
Consumer Group Setup:
```python
import redis

# Initialization (main.py). Idempotent: a BUSYGROUP error just means the
# consumer group already exists.
try:
    redis_client.xgroup_create(
        name="pipeline:events",
        groupname="smart-trans-group",
        id="0",
        mkstream=True,
    )
except redis.exceptions.ResponseError as e:
    if "BUSYGROUP" not in str(e):
        raise

# Consumption (background task)
while True:
    entries = redis_client.xreadgroup(
        groupname="smart-trans-group",
        consumername="consumer-1",
        streams={"pipeline:events": ">"},
        count=10,
        block=5000,  # block up to 5s waiting for new entries
    )
    # xreadgroup returns [(stream_name, [(entry_id, fields), ...]), ...]
    for _stream, messages in entries:
        for message_id, fields in messages:
            process_pipeline_event(fields)
            # Acknowledge so the entry leaves the group's pending list
            redis_client.xack("pipeline:events", "smart-trans-group", message_id)
```
## 5. Project Structure

### 5.1 Detailed Tree
```
smart-transcription/
├── src/
│   ├── main.py                              # FastAPI app
│   ├── config.py                            # Settings (Pydantic)
│   ├── db.py                                # SQLAlchemy session
│   │
│   ├── routers/                             # API Endpoints
│   │   ├── auth.py                          # POST /api/auth/login
│   │   ├── transcripts.py                   # POST /api/transcripts/create-with-rag
│   │   ├── contextual_files.py              # POST /api/contextual-files/upload
│   │   ├── deliverables.py                  # POST /api/deliverables/generate
│   │   └── users.py                         # User management
│   │
│   ├── services/                            # Business Logic
│   │   ├── transcription_rag_service.py     # Orchestration RAG workflow
│   │   ├── speaker_identification_service.py # 3-priority identification
│   │   ├── voiceprint_matcher.py            # Cosine similarity matching
│   │   ├── qdrant_service.py                # Vector DB operations
│   │   ├── embedding_service.py             # BGE-M3 embeddings
│   │   ├── semantic_chunking_service.py     # LlamaIndex chunking
│   │   ├── text_extraction_service.py       # PDF/DOCX extraction
│   │   ├── llm_metadata_extractor.py        # OpenAI GPT-4o-mini
│   │   ├── hybrid_metadata_extractor.py     # LLM + regex fallback
│   │   ├── llm_post_processor.py            # Qwen cleaning + identification
│   │   ├── post_processing_orchestrator.py  # Pipeline post-processing
│   │   ├── gamma_service.py                 # Presentation generation
│   │   ├── redis_consumer.py                # Redis Streams consumer
│   │   └── cache_service.py                 # Redis caching
│   │
│   ├── models/                              # SQLAlchemy Models
│   │   ├── user.py                          # User, UserCredit
│   │   ├── transcript.py                    # Transcript
│   │   ├── enriched_segment.py              # EnrichedSegment
│   │   ├── voiceprint_library.py            # VoiceprintLibrary
│   │   ├── contextual_file.py               # ContextualFile
│   │   └── meeting_summary.py               # MeetingSummary
│   │
│   ├── schemas/                             # Pydantic Schemas
│   │   ├── auth.py                          # LoginRequest, TokenResponse
│   │   ├── transcript.py                    # TranscriptCreate, TranscriptResponse
│   │   ├── metadata_schemas.py              # ParticipantMetadata, DocumentMetadata
│   │   └── deliverable.py                   # SummaryRequest, PresentationRequest
│   │
│   └── utils/                               # Utilities
│       ├── jwt_handler.py                   # JWT encoding/decoding
│       ├── s3_client.py                     # AWS S3 operations
│       └── logger.py                        # Logging setup
│
├── alembic/                                 # Database Migrations
│   ├── versions/
│   │   ├── 001_initial_schema.py
│   │   ├── 002_add_voiceprint_dual_embeddings.py
│   │   └── 003_rename_metadata_column.py
│   └── env.py
│
├── tests/                                   # Tests
│   ├── unit/
│   ├── integration/
│   └── e2e/
│
├── docs/                                    # Documentation
│   ├── SMART_TRANSCRIPTION_BFF_README.md
│   ├── ARCHITECTURE_BFF.md                  # This file
│   ├── PIPELINE_WORKFLOW.md
│   ├── RAG_ENRICHMENT.md
│   └── LLM_PROMPTING.md
│
├── docker-compose.yml                       # Local development
├── Dockerfile                               # Production image
├── requirements.txt                         # Python dependencies
├── .env.example                             # Environment template
└── README.md                                # Project README
```
### 5.2 Layering Pattern
Rules:
- Routers call Services (never Models directly)
- Services call Models
- Models know nothing about Services
- Dependency Injection via FastAPI Depends()
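The rules above can be illustrated without FastAPI at all: the router sees only the service interface, and the service owns all model access. A toy, framework-free sketch of the dependency direction (Depends() wires the same graph per request in the real app):

```python
class TranscriptModel:                      # models/ layer (stand-in)
    def fetch(self, user_id: str) -> list[str]:
        return [f"transcript-of-{user_id}"]

class TranscriptService:                    # services/ layer
    def __init__(self, model: TranscriptModel):
        self._model = model                 # services may call models

    def list_for_user(self, user_id: str) -> list[str]:
        return self._model.fetch(user_id)

def list_transcripts(user_id: str, service: TranscriptService) -> list[str]:
    # routers/ layer: talks to the service only, never to the model directly
    return service.list_for_user(user_id)
```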
## 6. Deployment

### 6.1 Deployment Architecture
*Diagram: a VPS hosts Nginx (SSL/TLS termination) in front of the Smart Transcription BFF (Docker container on :8001, FastAPI + Uvicorn), which talks to PostgreSQL :5432 (schemas st.*, meetnoo.*), Redis :6379 (Streams + cache) and Qdrant :6333 (vector database). The VPS reaches the GPU server (OVH datacenter) over Tailscale VPN (100.x.x.x), where the MeetNoo services run on :8000 with Dramatiq + Ray Serve on an NVIDIA A6000 48GB.*
### 6.2 Docker Compose (Production)
```yaml
version: '3.8'

services:
  smart-transcription:
    image: smart-transcription-bff:latest
    ports:
      - "8001:8001"
    environment:
      - DATABASE_URL=postgresql://user:pass@postgres:5432/db
      - REDIS_URL=redis://redis:6379/0
      - QDRANT_HOST=qdrant
      - QDRANT_PORT=6333
      - MEETNOO_SERVICES_URL=http://100.x.x.x:8000
      - AWS_S3_BUCKET=smart-transcription-files
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      - postgres
      - redis
      - qdrant
    restart: unless-stopped

  postgres:
    image: postgres:15-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=smart_transcription
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    restart: unless-stopped

  qdrant:
    image: qdrant/qdrant:latest
    volumes:
      - qdrant_data:/qdrant/storage
    restart: unless-stopped

volumes:
  postgres_data:
  redis_data:
  qdrant_data:
```
### 6.3 Environment Variables

Deployed on Dokploy (Smart Transcription BFF + MeetNoo GPU Services)
```bash
# ============================================
# DATABASE
# ============================================
DATABASE_URL=postgresql://postgres:root@smarttranscription-frontend-database-zuqgnh:5432/smart-transcription
# Alternative (Tailscale VPN): postgresql://postgres:root@100.119.216.100:5444/smart-transcription

# ============================================
# REDIS STREAMS
# ============================================
REDIS_URL=redis://default:***@smarttranscription-transcription-engine-redis-gikheg:6379/0
REDIS_STREAM_KEY=pipeline:events
REDIS_CONSUMER_GROUP=smart-trans-group
USE_REDIS_STREAMS=true

# ============================================
# QDRANT VECTOR DATABASE
# ============================================
QDRANT_URL=http://qdrant-dev:6333
QDRANT_API_KEY=***
VECTOR_DIMENSION=1024

# ============================================
# MEETNOO GPU SERVICES
# ============================================
MEETNOO_API_URL=http://meetnoo-api-dev:8000
# Alternative (Tailscale VPN): http://100.119.216.100:8000
MEETNOO_SERVICES_URL=http://meetnoo-api-dev:8000
MEETNOO_SERVICE_TOKEN=***
MEETNOO_TENANT_ID=smart-transcription
PIPELINE_API_KEY=internal-pipeline-key-change-me
INTERNAL_API_BASE_URL=http://localhost:8000

# ============================================
# WEBHOOK (MeetNoo → Smart Transcription)
# ============================================
SMART_TRANSCRIPTION_WEBHOOK_URL=http://smarttranscription-backend-issqne:8000
# Alternative (Tailscale VPN): http://100.119.216.100:8000

# ============================================
# AWS S3 STORAGE
# ============================================
AWS_ACCESS_KEY_ID=AKIAYFN4EM53QJZYOEVZ
AWS_SECRET_ACCESS_KEY=***
BUCKET_NAME=smarttranscription
REGION_NAME=eu-west-3

# ============================================
# OPENAI (Metadata Extraction + Summarization)
# ============================================
OPENAI_MODEL=gpt-4o-mini
OPENAI_METADATA_MODEL=gpt-4o-mini
OPENAI_METADATA_FALLBACK_MODEL=gpt-4.1-mini
OPENAI_MAX_RETRIES=3
OPENAI_TIMEOUT=60

# ============================================
# EMBEDDINGS (BGE-M3)
# ============================================
EMBEDDING_MODEL=BAAI/bge-m3
EMBEDDING_DEVICE=cpu
EMBEDDING_BATCH_SIZE=32
EMBEDDING_DIMENSION=1024

# ============================================
# RAG CONFIGURATION
# ============================================
USE_SEMANTIC_CHUNKING=true
SEMANTIC_CHUNK_BUFFER_SIZE=1
SEMANTIC_BREAKPOINT_THRESHOLD=95
RAG_SIMILARITY_THRESHOLD_LLM=0.4
RAG_SIMILARITY_THRESHOLD_REGEX=0.5
METADATA_CACHE_TTL=604800  # 7 days

# ============================================
# AUTHENTICATION & SECURITY
# ============================================
JWT_SECRET_KEY=***
ACCESS_TOKEN_EXPIRE_MINUTES=1440  # 24 hours
KEYCLOAK_URL=https://auth-staging.meetnoo.com
KEYCLOAK_REALM=smart-transcript

# ============================================
# EXTERNAL APIS (OPTIONAL)
# ============================================
# PyAnnote (Voiceprint extraction)
PYANNOTE_API_KEY=sk_***

# Whisper (Transcription - if TRANSCRIPTION_BACKEND=openai)
WHISPER_API_KEY=sk-proj-***
TRANSCRIPTION_BACKEND=local  # local = MeetNoo GPU

# Gamma API (Documents)
GAMMA_API_KEY=sk-gamma-***

# ElevenLabs (Text-to-Speech)
ELEVENLABS_API_KEY=sk_***
ELEVENLABS_VOICE_NEUTRAL_ID=21m00Tcm4TlvDq8ikWAM
ELEVENLABS_VOICE_CREOLE_ID=pNInz6obpgDQGcFmaJgB
ELEVENLABS_VOICE_LOCAL_ID=EXAVITQu4vr4xnSDxMaL

# ============================================
# EMAIL (SMTP via Mailjet)
# ============================================
SMTP_HOST=in-v3.mailjet.com
SMTP_PORT=587
SMTP_USER=9bf0c56a4d5a1c8b0ac53e3ef458139c
SMTP_PASSWORD=***
FROM_EMAIL=no-reply@meetnoo.com
FROM_NAME=MeetNoo Palabre
BASE_URL=https://test.meetnoo.com
```
Important notes:
- Sensitive keys are masked (***) in this documentation
- Dokploy automatically manages internal DNS names (meetnoo-api-dev, qdrant-dev)
- Tailscale VPN is used for inter-service communication (100.119.216.100)
- Redis Streams handle asynchronous BFF ↔ MeetNoo communication
## 7. Security

### 7.1 Authentication
```python
# JWT Token Flow
@router.post("/api/auth/login/json")
async def login(credentials: LoginRequest, db: Session = Depends(get_db)):
    user = authenticate_user(db, credentials.email, credentials.password)
    if not user:
        raise HTTPException(401, "Invalid credentials")
    access_token = create_access_token(
        data={"sub": user.id, "email": user.email}
    )
    return {"access_token": access_token, "token_type": "bearer"}

# Protected Endpoint
@router.post("/create-with-rag")
async def create_transcription_with_rag(
    audio_file: UploadFile = File(...),
    title: Optional[str] = Form(None),
    language: str = Form("fr"),
    contextual_files: List[UploadFile] = File(default=[]),
    current_user: User = Depends(get_current_user),  # JWT validation
    db: Session = Depends(get_db),
):
    # Only authenticated users can create transcripts
    # RAG workflow: Upload → Index → Transcribe → Enrich
    ...
```
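The `create_access_token` helper lives in `src/utils/jwt_handler.py`; its implementation is not shown here, but an HS256 JWT can be sketched with the standard library alone (a real service would use PyJWT or python-jose, and would add an `exp` claim per ACCESS_TOKEN_EXPIRE_MINUTES):

```python
import base64
import hashlib
import hmac
import json

def _b64(raw: bytes) -> str:
    # JWTs use unpadded URL-safe base64
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

def create_access_token(data: dict, secret: str) -> str:
    """Minimal HS256 JWT: header.payload.signature (no exp handling here)."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps(data).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"
```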
### 7.2 Authorization
```python
# Role-based access
class UserRole(str, Enum):
    ADMIN = "admin"
    USER = "user"
    FREE_TIER = "free"

def require_role(required_role: UserRole):
    async def role_checker(current_user: User = Depends(get_current_user)):
        if current_user.role != required_role:
            raise HTTPException(403, "Insufficient permissions")
        return current_user
    return role_checker

@router.delete("/api/users/{user_id}")
async def delete_user(
    user_id: str,
    admin: User = Depends(require_role(UserRole.ADMIN)),
):
    ...
```
### 7.3 Data Isolation
```python
# Qdrant collection naming ensures user isolation
collection_name = f"user_{user_id}_transcript_{transcript_id}"

# PostgreSQL row-level filtering
transcripts = db.query(Transcript).filter(
    Transcript.user_id == current_user.id
).all()

# S3 prefix isolation
s3_key = f"users/{user_id}/audio/{filename}"
```
### 7.4 Secrets Management
- Environment variables (never committed): `.env`
- Encrypted vault (production): AWS Secrets Manager, HashiCorp Vault
- API key rotation: PIPELINE_API_KEY rotated every 90 days
Navigation: ← README | Pipeline Workflow →