
Global Architecture - Smart Transcription BFF

Version: 1.0.0
Date: March 11, 2026
Status: Production


Table of Contents

  1. Overview
  2. v3 Architecture: Service Separation
  3. Technology Stack
  4. Inter-Service Communication
  5. Project Structure
  6. Deployment
  7. Security

1. Overview

1.1 Role of the BFF in the Ecosystem

The Smart Transcription BFF (Backend For Frontend) is the application service that orchestrates the entire intelligent-transcription workflow:

graph LR
    subgraph Frontend
        UI["Frontend React<br/>(palabre.io)"]
    end
    subgraph BFF["Smart Transcription BFF"]
        API[API Routes]
        Auth[Auth Service]
        RAG[RAG Service]
        PostProc[Post-Processing]
    end
    subgraph Storage["Storage"]
        PG[(PostgreSQL)]
        QD[(Qdrant<br/>Vector DB)]
        RD[(Redis<br/>Streams)]
        S3[(AWS S3<br/>Audio, Docs)]
    end
    subgraph GPU["MeetNoo GPU Services"]
        Pipeline[Pipeline ML]
        LLM[LLM Engine]
    end
    UI <-->|REST API + JWT + SSE| API
    API --> Auth
    API --> RAG
    API --> PostProc
    API <--> PG
    RAG <--> QD
    API <--> RD
    API <--> S3
    RD -.Redis Streams.-> Pipeline
    API -->|HTTP Sync| Pipeline
    API -->|HTTP Sync| LLM
    style BFF fill:#e0f2fe,stroke:#0284c7,stroke-width:3px
    style Frontend fill:#f0f9ff,stroke:#0284c7,stroke-width:2px
    style Storage fill:#f3f4f6,stroke:#6b7280,stroke-width:2px
    style GPU fill:#ffedd5,stroke:#f97316,stroke-width:3px

BFF responsibilities:
- User authentication (JWT)
- Credit management & billing
- RAG indexing (documents → Qdrant)
- Speaker-identification post-processing
- Metadata enrichment
- Deliverables generation (PDF, PPTX, audio)
- SSE progress tracking

Out of BFF scope (delegated to MeetNoo):
- Audio diarization (PyAnnote AI)
- Transcription (Whisper large-v3)
- Audio voiceprint extraction (PyAnnote 512d)
- LLM inference (Qwen 2.5-3B)
- GPU orchestration (Ray Serve)


2. v3 Architecture: Service Separation

2.1 Container Diagram

graph TB
    subgraph Frontend["Frontend (React)"]
        UI[Palabre.io<br/>Vite + TypeScript]
    end
    subgraph BFF["Smart Transcription BFF<br/>(Port 8001, Internet)"]
        API[FastAPI Router]
        Auth[JWT Auth Service]
        RAG[RAG Service<br/>Qdrant Indexing]
        PostProc[Post-Processing<br/>Speaker ID + Enrichment]
        SSE[SSE Progress]
        Deliv[Deliverables Generator]
    end
    subgraph Storage["Storage"]
        PG[(PostgreSQL<br/>Schema: st.*)]
        QD[(Qdrant<br/>Vector DB)]
        RD[(Redis<br/>Streams + Cache)]
        S3[(AWS S3<br/>Files)]
    end
    subgraph GPU["MeetNoo GPU Services<br/>(Port 8000, VPN)"]
        Pipeline[Pipeline API]
        LLM[LLM API]
        Dramatiq[Dramatiq Workers]
        Ray[Ray Serve GPU]
    end
    UI -->|REST + JWT| API
    API --> Auth
    API --> RAG
    API --> PostProc
    API --> SSE
    API --> Deliv
    API --> PG
    RAG --> QD
    API --> RD
    API --> S3
    API -->|HTTP Sync| Pipeline
    API -->|HTTP Sync| LLM
    RD -.Redis Streams.-> API
    Pipeline --> Dramatiq
    LLM --> Dramatiq
    Dramatiq --> Ray
    style BFF fill:#1a1f2b,stroke:#06b6d4,stroke-width:3px
    style GPU fill:#1a1f2b,stroke:#f97316,stroke-width:3px
    style Frontend fill:#1a1f2b,stroke:#34d399
    style Storage fill:#1a1f2b,stroke:#a78bfa

2.2 Separation Principle

| Aspect         | Smart Transcription BFF                   | MeetNoo GPU Services        |
|----------------|-------------------------------------------|-----------------------------|
| Responsibility | Application layer, business orchestration | ML engine, GPU compute      |
| Exposure       | Internet (port 8001)                      | Internal VPN (port 8000)    |
| Database       | PostgreSQL schema st.*                    | PostgreSQL schema meetnoo.* |
| Language       | Python 3.11, FastAPI                      | Python 3.11, Dramatiq + Ray |
| GPU            | No                                        | Yes (NVIDIA A6000 48GB)     |
| Authentication | JWT tokens, users, credits                | Pipeline API Key (internal) |
| Coupling       | Calls MeetNoo (HTTP)                      | Unaware of smart-trans      |
| Logs           | Application logs                          | ML/GPU logs                 |

Unidirectional communication:

Smart Transcription → HTTP POST → MeetNoo (request)
Smart Transcription ← Redis Stream ← MeetNoo (response)

Zero return coupling: MeetNoo has no knowledge of smart-transcription; it simply publishes to Redis.
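This request/reply correlation can be sketched as a small helper that blocks on the reply stream (the schema of the `llm:reply:{request_id}` stream is detailed in section 4.3). This is a minimal illustration, not project code: `await_stream_reply` is a hypothetical name, and the Redis client is injected so any object exposing redis-py's `xread` works.

```python
import json

def await_stream_reply(redis_client, stream_key: str, block_ms: int = 5000):
    """Block on a reply stream (e.g. llm:reply:{request_id}) until MeetNoo
    publishes its response, then decode the JSON-encoded `result` field.

    `redis_client` is any object exposing redis-py's `xread`.
    """
    entries = redis_client.xread({stream_key: "0"}, count=1, block=block_ms)
    if not entries:
        return None  # timed out: no reply published yet
    _stream, messages = entries[0]
    _msg_id, fields = messages[0]
    if fields.get("status") == "failed":
        raise RuntimeError(fields.get("error", "unknown pipeline error"))
    return json.loads(fields["result"])
```

Because MeetNoo only ever writes to the stream, this helper is the BFF's single point of coupling to the response path.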


3. Technology Stack

3.1 Backend Framework

Framework: FastAPI 0.104+
Language: Python 3.11
ASGI Server: Uvicorn
Validation: Pydantic v2
ORM: SQLAlchemy 2.0
Migrations: Alembic

FastAPI structure:

src/
├── main.py                  # Application entry point
├── routers/                 # API endpoints
│   ├── auth.py             # Authentication
│   ├── transcripts.py      # Transcription workflow
│   ├── contextual_files.py # RAG document upload
│   └── deliverables.py     # Summaries, presentations
├── services/               # Business logic
│   ├── transcription_rag_service.py
│   ├── speaker_identification_service.py
│   ├── qdrant_service.py
│   ├── embedding_service.py
│   └── llm_post_processor.py
├── models/                 # SQLAlchemy models
├── schemas/                # Pydantic schemas
└── db.py                   # Database session

3.2 AI/ML Stack

| Component           | Technology                  | Usage                              |
|---------------------|-----------------------------|------------------------------------|
| Embeddings          | BAAI/bge-m3 (1024d)         | Text embeddings for RAG            |
| Vector DB           | Qdrant 1.8+                 | Semantic search                    |
| Chunking            | LlamaIndex SemanticSplitter | Document splitting                 |
| Metadata Extraction | OpenAI GPT-4o-mini          | LLM extraction with regex fallback |
| Text Extraction     | pdfplumber, python-docx     | PDF/DOCX parsing                   |
| LLM Post-Processing | Qwen 2.5-3B (via MeetNoo)   | Cleaning + identification          |
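To make the path between these components concrete, here is a hedged sketch of the indexing flow (chunk → embed → upsert). `index_document` is a hypothetical helper; the embedder and Qdrant client are injected so the sketch stays backend-agnostic, and with the real qdrant-client the points would be `models.PointStruct` instances rather than dicts.

```python
def index_document(chunks, embed_fn, qdrant, collection: str) -> int:
    """Index pre-chunked document text for RAG retrieval.

    chunks:   list of text chunks (SemanticSplitter output in this stack)
    embed_fn: maps list[str] -> list of 1024-d vectors (BGE-M3 here)
    qdrant:   client exposing an upsert(collection_name=..., points=...) method
    """
    vectors = embed_fn(chunks)
    points = [
        {"id": i, "vector": vec, "payload": {"text": text}}
        for i, (text, vec) in enumerate(zip(chunks, vectors))
    ]
    qdrant.upsert(collection_name=collection, points=points)
    return len(points)
```

The payload keeps the raw chunk text so semantic search results can be surfaced without a second lookup.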

3.3 Databases & Storage

Relational DB:
  Engine: PostgreSQL 15
  Schema: st.*
  Tables: users, transcripts, enriched_segments, voiceprint_library
  Connection Pool: 10-20 connections

Vector DB:
  Engine: Qdrant 1.8+
  Collections: user_{userId}_transcript_{transcriptId}
  Distance: Cosine
  Dimensions: 1024 (BGE-M3)
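
The per-transcript collection convention above can be sketched as follows. `ensure_collection` is a hypothetical name; the client mirrors qdrant-client's interface, and with the real library `vectors_config` would be `models.VectorParams(size=dim, distance=models.Distance.COSINE)`.

```python
def collection_name_for(user_id: str, transcript_id: str) -> str:
    # Naming convention above: one collection per (user, transcript) pair
    return f"user_{user_id}_transcript_{transcript_id}"

def ensure_collection(client, user_id: str, transcript_id: str, dim: int = 1024) -> str:
    """Create the per-transcript collection if it does not exist yet
    (cosine distance, 1024 dimensions to match BGE-M3)."""
    name = collection_name_for(user_id, transcript_id)
    if not client.collection_exists(name):
        client.create_collection(
            collection_name=name,
            vectors_config={"size": dim, "distance": "Cosine"},
        )
    return name
```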

Cache & Queue:
  Engine: Redis 7
  Usage: 
    - Redis Streams (pipeline:events, llm:reply:{id})
    - Metadata cache (7 days TTL)
    - Session cache
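
The 7-day metadata cache can be sketched with a SETEX-style write; the `metadata:{doc_id}` key naming is an assumption for illustration, not the project's documented key scheme.

```python
import json

METADATA_CACHE_TTL = 604_800  # 7 days, matching the TTL listed above

def cache_metadata(redis_client, doc_id: str, metadata: dict) -> None:
    # The key expires server-side after 7 days; no manual eviction needed
    redis_client.set(f"metadata:{doc_id}", json.dumps(metadata), ex=METADATA_CACHE_TTL)

def get_cached_metadata(redis_client, doc_id: str):
    raw = redis_client.get(f"metadata:{doc_id}")
    return json.loads(raw) if raw is not None else None
```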

Object Storage:
  Provider: AWS S3
  Buckets: 
    - audio-files/
    - contextual-files/
    - deliverables/
    - voiceprints/

3.4 Communication

HTTP Client:
  Library: httpx
  Async: True
  Timeout: 120s (LLM calls)
  Retry: 3 attempts exponential backoff
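
The 3-attempt exponential-backoff policy can be sketched as a small wrapper; `request_with_retry` is a hypothetical helper, and `send` stands in for an `httpx.AsyncClient` call so the sketch carries no HTTP dependency.

```python
import asyncio

async def request_with_retry(send, attempts: int = 3, base_delay: float = 1.0):
    """Retry an async HTTP call with exponential backoff (1s, then 2s,
    between the three attempts). `send` is an async callable wrapping the
    actual request; the last exception is re-raised after the final failure."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return await send()
        except Exception as exc:  # in practice: httpx.TransportError
            last_exc = exc
            if attempt < attempts - 1:
                await asyncio.sleep(base_delay * (2 ** attempt))
    raise last_exc
```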

Redis Client:
  Library: redis-py
  Streams: XREAD / XREADGROUP
  Pub/Sub: Pipeline events

WebSocket/SSE:
  Library: sse-starlette
  Usage: Real-time progress updates
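
With sse-starlette, an endpoint returns an `EventSourceResponse` wrapping an async generator. The generator shape can be sketched dependency-free as below; `progress_events` and its polling cadence are illustrative assumptions, not the project's actual implementation.

```python
import asyncio
import json

async def progress_events(get_progress):
    """Async generator for SSE progress updates: yields one event per change
    until the pipeline reports 100%. `get_progress` is an async callable
    returning the current percentage (0-100).

    With sse-starlette, a router would return:
        EventSourceResponse(progress_events(fetch_progress))
    """
    last = -1
    while last < 100:
        current = await get_progress()
        if current != last:
            last = current
            yield {"event": "progress", "data": json.dumps({"progress": current})}
        else:
            await asyncio.sleep(0.5)  # poll again shortly if nothing changed
```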

4. Inter-Service Communication

4.1 Pattern: HTTP Sync + Redis Streams

sequenceDiagram
    participant ST as Smart Transcription
    participant Redis
    participant MN as MeetNoo Services
    Note over ST,MN: STEP 1: Start Pipeline
    ST->>MN: POST /api/v1/pipeline/start<br/>{tenant_id, file_url}
    MN-->>ST: 202 {transcription_id}
    Note over ST,MN: STEP 2: Async Processing
    MN->>Redis: XADD pipeline:events<br/>{txn_id, stage, status, progress}
    Note over ST,MN: STEP 3: Event Consumption
    ST->>Redis: XREADGROUP smart-trans-group
    Redis-->>ST: [{txn_id, stage:diarize, status:completed}]
    Note over ST,MN: STEP 4: Result Retrieval
    ST->>MN: GET /api/v1/pipeline/{id}/result
    MN-->>ST: 200 {segments, speakers, voiceprints}
    Note over ST,MN: STEP 5: BFF Post-Processing
    ST->>ST: Voiceprint Matching + RAG + LLM

4.2 MeetNoo Endpoints Called

| Endpoint                     | Method | Usage               | Timeout |
|------------------------------|--------|---------------------|---------|
| /api/v1/pipeline/start       | POST   | Start transcription | 10s     |
| /api/v1/pipeline/{id}/status | GET    | Pipeline status     | 5s      |
| /api/v1/pipeline/{id}/result | GET    | Full result         | 10s     |
| /api/v1/llm/submit           | POST   | Enqueue LLM prompt  | 10s     |

Required headers:

X-Pipeline-Key: {PIPELINE_API_KEY}
Content-Type: application/json
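
A call to the start endpoint with the required headers can be sketched as follows. `start_pipeline` is a hypothetical helper; `client` behaves like an `httpx.Client` pointed at the MeetNoo base URL and is injected so the sketch stays testable.

```python
def start_pipeline(client, api_key: str, tenant_id: str, file_url: str) -> str:
    """POST /api/v1/pipeline/start and return the transcription_id
    from the 202 response."""
    resp = client.post(
        "/api/v1/pipeline/start",
        json={"tenant_id": tenant_id, "file_url": file_url},
        headers={"X-Pipeline-Key": api_key, "Content-Type": "application/json"},
        timeout=10.0,  # matches the endpoint table above
    )
    resp.raise_for_status()
    return resp.json()["transcription_id"]
```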

4.3 Redis Streams Consumed

Stream pipeline:events:

{
  "txn_id": "uuid",
  "stage": "diarize|transcribe|voiceprint|finalize",
  "status": "started|completed|failed",
  "progress": "0-100",
  "error": "optional error message"
}

Stream llm:reply:{request_id}:

{
  "request_id": "uuid",
  "status": "completed|failed",
  "result": "{\"text\":\"...\"}",
  "error": "optional"
}

Consumer Group Setup:

import redis

# Initialization (main.py)
try:
    redis_client.xgroup_create(
        name="pipeline:events",
        groupname="smart-trans-group",
        id="0",
        mkstream=True
    )
except redis.ResponseError:
    pass  # BUSYGROUP: the group already exists (e.g. after a restart)

# Consumption (background task)
while True:
    entries = redis_client.xreadgroup(
        groupname="smart-trans-group",
        consumername="consumer-1",
        streams={"pipeline:events": ">"},
        count=10,
        block=5000  # block up to 5s waiting for new events
    )
    for stream_name, messages in entries:
        for message_id, fields in messages:
            process_pipeline_event(fields)
            # ACK so the entry is not redelivered to this consumer group
            redis_client.xack("pipeline:events", "smart-trans-group", message_id)
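
The `process_pipeline_event` callback is left abstract above; one plausible shape is sketched below. The equal 25% weighting per stage is an illustrative assumption, not the documented progress formula, and persisting to st.* or pushing SSE updates is omitted.

```python
PIPELINE_STAGES = ("diarize", "transcribe", "voiceprint", "finalize")

def process_pipeline_event(fields: dict) -> dict:
    """Turn one pipeline:events entry (schema above) into an overall
    progress update, assuming the four stages weigh 25% each."""
    if fields["status"] == "failed":
        return {"txn_id": fields["txn_id"], "state": "failed",
                "error": fields.get("error", "")}
    completed_stages = PIPELINE_STAGES.index(fields["stage"])
    if fields["status"] == "completed":
        completed_stages += 1
    overall = int(100 * completed_stages / len(PIPELINE_STAGES))
    return {"txn_id": fields["txn_id"], "state": fields["status"], "progress": overall}
```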


5. Project Structure

5.1 Detailed Tree

smart-transcription/
├── src/
│   ├── main.py                          # FastAPI app
│   ├── config.py                        # Settings (Pydantic)
│   ├── db.py                            # SQLAlchemy session
│   │
│   ├── routers/                         # API Endpoints
│   │   ├── auth.py                      # POST /api/auth/login
│   │   ├── transcripts.py               # POST /api/transcripts/create-with-rag
│   │   ├── contextual_files.py          # POST /api/contextual-files/upload
│   │   ├── deliverables.py              # POST /api/deliverables/generate
│   │   └── users.py                     # User management
│   │
│   ├── services/                        # Business Logic
│   │   ├── transcription_rag_service.py      # Orchestration RAG workflow
│   │   ├── speaker_identification_service.py # 3-priority identification
│   │   ├── voiceprint_matcher.py             # Cosine similarity matching
│   │   ├── qdrant_service.py                 # Vector DB operations
│   │   ├── embedding_service.py              # BGE-M3 embeddings
│   │   ├── semantic_chunking_service.py      # LlamaIndex chunking
│   │   ├── text_extraction_service.py        # PDF/DOCX extraction
│   │   ├── llm_metadata_extractor.py         # OpenAI GPT-4o-mini
│   │   ├── hybrid_metadata_extractor.py      # LLM + regex fallback
│   │   ├── llm_post_processor.py             # Qwen cleaning + identification
│   │   ├── post_processing_orchestrator.py   # Pipeline post-processing
│   │   ├── gamma_service.py                  # Presentation generation
│   │   ├── redis_consumer.py                 # Redis Streams consumer
│   │   └── cache_service.py                  # Redis caching
│   │
│   ├── models/                          # SQLAlchemy Models
│   │   ├── user.py                      # User, UserCredit
│   │   ├── transcript.py                # Transcript
│   │   ├── enriched_segment.py          # EnrichedSegment
│   │   ├── voiceprint_library.py        # VoiceprintLibrary
│   │   ├── contextual_file.py           # ContextualFile
│   │   └── meeting_summary.py           # MeetingSummary
│   │
│   ├── schemas/                         # Pydantic Schemas
│   │   ├── auth.py                      # LoginRequest, TokenResponse
│   │   ├── transcript.py                # TranscriptCreate, TranscriptResponse
│   │   ├── metadata_schemas.py          # ParticipantMetadata, DocumentMetadata
│   │   └── deliverable.py               # SummaryRequest, PresentationRequest
│   │
│   └── utils/                           # Utilities
│       ├── jwt_handler.py               # JWT encoding/decoding
│       ├── s3_client.py                 # AWS S3 operations
│       └── logger.py                    # Logging setup
├── alembic/                             # Database Migrations
│   ├── versions/
│   │   ├── 001_initial_schema.py
│   │   ├── 002_add_voiceprint_dual_embeddings.py
│   │   └── 003_rename_metadata_column.py
│   └── env.py
├── tests/                               # Tests
│   ├── unit/
│   ├── integration/
│   └── e2e/
├── docs/                                # Documentation
│   ├── SMART_TRANSCRIPTION_BFF_README.md
│   ├── ARCHITECTURE_BFF.md             # This file
│   ├── PIPELINE_WORKFLOW.md
│   ├── RAG_ENRICHMENT.md
│   └── LLM_PROMPTING.md
├── docker-compose.yml                   # Local development
├── Dockerfile                           # Production image
├── requirements.txt                     # Python dependencies
├── .env.example                         # Environment template
└── README.md                            # Project README

5.2 Layering Pattern

graph TB
    subgraph Routers["Routers (API Layer) — HTTP endpoints"]
        R1["Input validation (Pydantic)"]
        R2["Response formatting"]
    end
    subgraph Services["Services (Business Logic Layer) — Core logic"]
        S1["Orchestration"]
        S2["Complex workflows"]
        S3["External API calls"]
    end
    subgraph Models["Models (Data Layer) — Database"]
        M1["SQLAlchemy ORM"]
        M2["Queries"]
    end
    Routers --> Services
    Services --> Models
    style Routers fill:#dbeafe,stroke:#3b82f6,stroke-width:2px
    style Services fill:#e0e7ff,stroke:#6366f1,stroke-width:2px
    style Models fill:#f3e8ff,stroke:#a855f7,stroke-width:2px

Rules:
- Routers call Services (never Models directly)
- Services call Models
- Models have no knowledge of Services
- Dependency injection via FastAPI Depends()
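
These layering rules can be illustrated with a minimal sketch; all names below (`TranscriptRepository`, `TranscriptService`, `get_transcript_service`) are hypothetical, and the FastAPI wiring is shown in comments to keep the sketch dependency-free.

```python
class TranscriptRepository:
    """Data layer: in the real project, SQLAlchemy models + session queries."""
    def __init__(self, rows):
        self._rows = rows
    def all_for_user(self, user_id):
        return [r for r in self._rows if r["user_id"] == user_id]

class TranscriptService:
    """Business layer: only talks to the data layer, never to HTTP concerns."""
    def __init__(self, repo: TranscriptRepository):
        self.repo = repo
    def list_titles(self, user_id):
        return [r["title"] for r in self.repo.all_for_user(user_id)]

# Router layer (FastAPI) — depends on the service, never on the model:
# @router.get("/api/transcripts")
# async def list_transcripts(
#     current_user: User = Depends(get_current_user),
#     service: TranscriptService = Depends(get_transcript_service),
# ):
#     return service.list_titles(current_user.id)
```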


6. Deployment

6.1 Deployment Architecture

graph TB
    subgraph VPS["VPS OVH (Internet)"]
        Nginx["Nginx Reverse Proxy<br/>SSL/TLS Termination"]
        BFF["Smart Transcription BFF<br/>Docker Container :8001<br/>FastAPI + Uvicorn"]
        PG[("PostgreSQL :5432<br/>Schemas: st.*, meetnoo.*")]
        Redis[("Redis :6379<br/>Streams + Cache")]
        Qdrant[("Qdrant :6333<br/>Vector Database")]
        Nginx --> BFF
        BFF --> PG
        BFF --> Redis
        BFF --> Qdrant
    end
    subgraph GPU["GPU Server (OVH Datacenter)"]
        MeetNoo["MeetNoo Services :8000<br/>Dramatiq + Ray Serve<br/>NVIDIA A6000 48GB"]
    end
    VPS -->|Tailscale VPN<br/>100.x.x.x| GPU
    style VPS fill:#e0f2fe,stroke:#0284c7,stroke-width:3px
    style GPU fill:#ffedd5,stroke:#f97316,stroke-width:3px
    style Nginx fill:#dbeafe,stroke:#3b82f6,stroke-width:2px
    style BFF fill:#bfdbfe,stroke:#2563eb,stroke-width:2px
    style PG fill:#f3f4f6,stroke:#6b7280,stroke-width:2px
    style Redis fill:#f3f4f6,stroke:#6b7280,stroke-width:2px
    style Qdrant fill:#f3f4f6,stroke:#6b7280,stroke-width:2px
    style MeetNoo fill:#fed7aa,stroke:#ea580c,stroke-width:2px

6.2 Docker Compose (Production)

version: '3.8'

services:
  smart-transcription:
    image: smart-transcription-bff:latest
    ports:
      - "8001:8001"
    environment:
      - DATABASE_URL=postgresql://user:pass@postgres:5432/db
      - REDIS_URL=redis://redis:6379/0
      - QDRANT_HOST=qdrant
      - QDRANT_PORT=6333
      - MEETNOO_SERVICES_URL=http://100.x.x.x:8000
      - AWS_S3_BUCKET=smart-transcription-files
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      - postgres
      - redis
      - qdrant
    restart: unless-stopped

  postgres:
    image: postgres:15-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=smart_transcription
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    restart: unless-stopped

  qdrant:
    image: qdrant/qdrant:latest
    volumes:
      - qdrant_data:/qdrant/storage
    restart: unless-stopped

volumes:
  postgres_data:
  redis_data:
  qdrant_data:

6.3 Environment Variables

Deployed on Dokploy (Smart Transcription BFF + MeetNoo GPU Services)

# ============================================
# DATABASE
# ============================================
DATABASE_URL=postgresql://postgres:root@smarttranscription-frontend-database-zuqgnh:5432/smart-transcription
# Alternative (Tailscale VPN): postgresql://postgres:root@100.119.216.100:5444/smart-transcription

# ============================================
# REDIS STREAMS
# ============================================
REDIS_URL=redis://default:***@smarttranscription-transcription-engine-redis-gikheg:6379/0
REDIS_STREAM_KEY=pipeline:events
REDIS_CONSUMER_GROUP=smart-trans-group
USE_REDIS_STREAMS=true

# ============================================
# QDRANT VECTOR DATABASE
# ============================================
QDRANT_URL=http://qdrant-dev:6333
QDRANT_API_KEY=***
VECTOR_DIMENSION=1024

# ============================================
# MEETNOO GPU SERVICES
# ============================================
MEETNOO_API_URL=http://meetnoo-api-dev:8000
# Alternative (Tailscale VPN): http://100.119.216.100:8000
MEETNOO_SERVICES_URL=http://meetnoo-api-dev:8000
MEETNOO_SERVICE_TOKEN=***
MEETNOO_TENANT_ID=smart-transcription
PIPELINE_API_KEY=internal-pipeline-key-change-me
INTERNAL_API_BASE_URL=http://localhost:8000

# ============================================
# WEBHOOK (MeetNoo → Smart Transcription)
# ============================================
SMART_TRANSCRIPTION_WEBHOOK_URL=http://smarttranscription-backend-issqne:8000
# Alternative (Tailscale VPN): http://100.119.216.100:8000

# ============================================
# AWS S3 STORAGE
# ============================================
AWS_ACCESS_KEY_ID=AKIA***
AWS_SECRET_ACCESS_KEY=***
BUCKET_NAME=smarttranscription
REGION_NAME=eu-west-3

# ============================================
# OPENAI (Metadata Extraction + Summarization)
# ============================================
OPENAI_MODEL=gpt-4o-mini
OPENAI_METADATA_MODEL=gpt-4o-mini
OPENAI_METADATA_FALLBACK_MODEL=gpt-4.1-mini
OPENAI_MAX_RETRIES=3
OPENAI_TIMEOUT=60

# ============================================
# EMBEDDINGS (BGE-M3)
# ============================================
EMBEDDING_MODEL=BAAI/bge-m3
EMBEDDING_DEVICE=cpu
EMBEDDING_BATCH_SIZE=32
EMBEDDING_DIMENSION=1024

# ============================================
# RAG CONFIGURATION
# ============================================
USE_SEMANTIC_CHUNKING=true
SEMANTIC_CHUNK_BUFFER_SIZE=1
SEMANTIC_BREAKPOINT_THRESHOLD=95
RAG_SIMILARITY_THRESHOLD_LLM=0.4
RAG_SIMILARITY_THRESHOLD_REGEX=0.5
METADATA_CACHE_TTL=604800  # 7 days

# ============================================
# AUTHENTICATION & SECURITY
# ============================================
JWT_SECRET_KEY=***
ACCESS_TOKEN_EXPIRE_MINUTES=1440  # 24 hours
KEYCLOAK_URL=https://auth-staging.meetnoo.com
KEYCLOAK_REALM=smart-transcript

# ============================================
# EXTERNAL APIS (OPTIONAL)
# ============================================
# PyAnnote (Voiceprint extraction)
PYANNOTE_API_KEY=sk_***

# Whisper (Transcription - if TRANSCRIPTION_BACKEND=openai)
WHISPER_API_KEY=sk-proj-***
TRANSCRIPTION_BACKEND=local  # local = MeetNoo GPU

# Gamma API (Documents)
GAMMA_API_KEY=sk-gamma-***

# ElevenLabs (Text-to-Speech)
ELEVENLABS_API_KEY=sk_***
ELEVENLABS_VOICE_NEUTRAL_ID=21m00Tcm4TlvDq8ikWAM
ELEVENLABS_VOICE_CREOLE_ID=pNInz6obpgDQGcFmaJgB
ELEVENLABS_VOICE_LOCAL_ID=EXAVITQu4vr4xnSDxMaL

# ============================================
# EMAIL (SMTP via Mailjet)
# ============================================
SMTP_HOST=in-v3.mailjet.com
SMTP_PORT=587
SMTP_USER=***
SMTP_PASSWORD=***
FROM_EMAIL=no-reply@meetnoo.com
FROM_NAME=MeetNoo Palabre
BASE_URL=https://test.meetnoo.com

Important notes:
- Sensitive keys are masked (***) in this documentation
- Dokploy automatically manages internal DNS names (meetnoo-api-dev, qdrant-dev)
- Tailscale VPN is used for inter-service communication (100.119.216.100)
- Redis Streams handle asynchronous BFF ↔ MeetNoo communication
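
Per section 5.1, these variables are loaded through a Pydantic `config.py`. A stdlib sketch of the same contract is shown below for illustration; the field subset is chosen arbitrarily and the class is not the project's actual Settings model.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    """Stdlib sketch of the environment contract above; the real config.py
    uses Pydantic Settings and covers many more variables."""
    database_url: str
    redis_url: str
    vector_dimension: int = 1024
    use_redis_streams: bool = True

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        return cls(
            database_url=env["DATABASE_URL"],
            redis_url=env["REDIS_URL"],
            vector_dimension=int(env.get("VECTOR_DIMENSION", "1024")),
            use_redis_streams=env.get("USE_REDIS_STREAMS", "true").lower() == "true",
        )
```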


7. Security

7.1 Authentication

# JWT Token Flow
from typing import List, Optional

from fastapi import Depends, File, Form, HTTPException, UploadFile
from sqlalchemy.orm import Session

@router.post("/api/auth/login/json")
async def login(credentials: LoginRequest, db: Session = Depends(get_db)):
    user = authenticate_user(db, credentials.email, credentials.password)
    if not user:
        raise HTTPException(401, "Invalid credentials")

    access_token = create_access_token(
        data={"sub": user.id, "email": user.email}
    )

    return {"access_token": access_token, "token_type": "bearer"}

# Protected Endpoint
@router.post("/create-with-rag")
async def create_transcription_with_rag(
    audio_file: UploadFile = File(...),
    title: Optional[str] = Form(None),
    language: str = Form("fr"),
    contextual_files: List[UploadFile] = File(default=[]),
    current_user: User = Depends(get_current_user),  # JWT validation
    db: Session = Depends(get_db)
):
    # Only authenticated users can create transcripts
    # RAG workflow: Upload → Index → Transcribe → Enrich
    ...
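
For context on what `create_access_token` produces, an HS256 JWT can be built with the stdlib alone. This is an illustrative sketch only (the real service presumably uses PyJWT or python-jose rather than hand-rolling signing); it applies the 24h expiry configured via ACCESS_TOKEN_EXPIRE_MINUTES.

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> str:
    # JWT uses unpadded base64url segments
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def create_access_token(data: dict, secret: str, expire_minutes: int = 1440) -> str:
    """Minimal HS256 JWT: header.payload.signature, with an `exp` claim."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {**data, "exp": int(time.time()) + expire_minutes * 60}
    signing_input = (
        f"{_b64url(json.dumps(header).encode())}."
        f"{_b64url(json.dumps(payload).encode())}"
    )
    signature = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"
```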

7.2 Authorization

# Role-based access
class UserRole(str, Enum):
    ADMIN = "admin"
    USER = "user"
    FREE_TIER = "free"

def require_role(required_role: UserRole):
    async def role_checker(current_user: User = Depends(get_current_user)):
        if current_user.role != required_role:
            raise HTTPException(403, "Insufficient permissions")
        return current_user
    return role_checker

@router.delete("/api/users/{user_id}")
async def delete_user(
    user_id: str,
    admin: User = Depends(require_role(UserRole.ADMIN))
):
    ...

7.3 Data Isolation

# Qdrant collection naming ensures user isolation
collection_name = f"user_{user_id}_transcript_{transcript_id}"

# PostgreSQL row-level filtering
transcripts = db.query(Transcript).filter(
    Transcript.user_id == current_user.id
).all()

# S3 prefix isolation
s3_key = f"users/{user_id}/audio/{filename}"

7.4 Secrets Management

# Environment variables (never commit)
.env

# Encrypted vault (production)
AWS Secrets Manager
HashiCorp Vault

# API Key rotation
PIPELINE_API_KEY rotated every 90 days

Navigation: ← README | Pipeline Workflow →