Aller au contenu

Workflow RAG - Enrichissement Contextuel

Version: 1.0.0
Date: 11 Mars 2026
Statut: Production


Table des Matières

  1. Introduction au RAG
  2. Architecture 3-Priority
  3. Priority 1: Voiceprint Matching
  4. Priority 2: RAG Enrichment
  5. Priority 3: LLM Inference
  6. Mean Pooling Strategy
  7. Voiceprint Auto-Save
  8. Résultats E2E Test

1. Introduction au RAG

1.1 Qu'est-ce que le RAG ?

RAG (Retrieval-Augmented Generation) = Enrichissement de modèles d'IA avec des documents externes.

graph LR A["Document PDF"] --> B["Extract Text"] B --> C["Chunk"] C --> D["Embed"] D --> E[("Qdrant")] F["Speaker Segment"] --> G["Embed"] G --> H["Search Qdrant"] E --> H H --> I["Context"] I --> J["LLM + Context"] J --> K["Enhanced Output"] style A fill:#fef3c7,stroke:#f59e0b,stroke-width:2px style E fill:#ddd6fe,stroke:#7c3aed,stroke-width:2px style F fill:#e0f2fe,stroke:#0284c7,stroke-width:2px style I fill:#d1fae5,stroke:#10b981,stroke-width:2px style J fill:#fce7f3,stroke:#ec4899,stroke-width:2px style K fill:#d1fae5,stroke:#10b981,stroke-width:2px

Dans Smart Transcription:
- Documents = CVs, organigrammes, glossaires
- Segments = Interventions speakers
- Context = Métadonnées (email, phone, role) + participants potentiels
- Enhanced Output = Speakers identifiés + enrichis

1.2 Pourquoi le RAG ?

Problème avant RAG:

GPU Output:
  SPEAKER_00: "Bonjour, je travaille sur le backend"
  SPEAKER_01: "Le projet RAG avance bien"
  SPEAKER_02: "J'ai une question sur PostgreSQL"

Problème: Impossible de savoir qui est SPEAKER_00, SPEAKER_01, SPEAKER_02

Solution avec RAG:

1. Upload CV_Jean.pdf: "Jean Dupont - Lead Backend Developer"
2. RAG search: "je travaille sur le backend" → Match "Backend Developer"
3. Identification: SPEAKER_00 = Jean Dupont
4. Enrichissement: email, phone, company extraits du CV

Solution: Identification automatique + métadonnées complètes


2. Architecture 3-Priority

2.1 Diagramme de Décision

graph TD A[GPU Output: SPEAKER_00 + voiceprint] --> B{Priority 1:
Voiceprint Match?} B -->|Similarity > 0.85| C[IDENTIFIED
Fetch metadata] B -->|< 0.85| D[PENDING
Auto-save voiceprint] C --> E{Priority 2:
RAG Enrichment} D --> F{Priority 2:
RAG Extraction} E --> G[Enrich with:
email, phone, company] F --> H[Extract:
potential speakers] G --> I[Save enriched segment] H --> J{Priority 3:
LLM Identification} J -->|Confidence > 0.75| K[Confirm voiceprint] J -->|< 0.75| L[Keep as 'Intervenant X'] K --> M[Update voiceprint status] M --> G L --> I style B fill:#f97316,stroke:#fff,color:#fff style E fill:#06b6d4,stroke:#fff,color:#fff style F fill:#06b6d4,stroke:#fff,color:#fff style J fill:#a78bfa,stroke:#fff,color:#fff

2.2 Taux de Succès Cascade

Priority Méthode Taux Succès Temps
Priority 1 Voiceprint Audio (512d) 95% 1s
Priority 2 RAG Semantic (1024d) 78-82% 5s
Priority 3 LLM Inference (Qwen) 85% 10-20s
Cumulé Cascade complète 98% Variable

Stratégie: Cascade avec fallback. Si Priority 1 échoue → Priority 2. Si toujours inconnu → Priority 3.


3. Priority 1: Voiceprint Matching

3.1 Principe Biométrique

Chaque speaker a une "empreinte vocale" unique (comme une empreinte digitale).

Audio Speaker → PyAnnote AI → Voiceprint 512d
[0.123, 0.456, 0.789, ..., 0.234]  (512 dimensions)

Matching:

similarity = cosine_similarity(new_voiceprint, stored_voiceprint)

if similarity > 0.85:
    identified_name = stored_voiceprint.identified_name
else:
    auto_save_pending()

3.2 Implémentation

class VoiceprintMatcher:
    def __init__(self, db: Session, threshold: float = 0.85):
        self.db = db
        self.threshold = threshold

    def match_speaker(
        self,
        voiceprint_audio_512d: List[float],
        user_id: str
    ) -> Optional[Dict[str, Any]]:
        """
        Match voiceprint contre bibliothèque user.

        Returns:
            Match dict si similarity > threshold, sinon None
        """
        # Fetch all confirmed voiceprints for this user
        voiceprints = self.db.query(VoiceprintLibrary).filter(
            VoiceprintLibrary.user_id == user_id,
            VoiceprintLibrary.status == "confirmed"
        ).all()

        if not voiceprints:
            return None

        # Compute similarities
        best_match = None
        best_score = 0.0

        for vp in voiceprints:
            similarity = self._cosine_similarity(
                voiceprint_audio_512d,
                json.loads(vp.voiceprint_audio_512d)
            )

            if similarity > best_score:
                best_score = similarity
                best_match = vp

        # Threshold check
        if best_score >= self.threshold:
            return {
                "voiceprint_lib_id": best_match.id,
                "identified_name": best_match.identified_name,
                "email": best_match.email,
                "phone": best_match.phone,
                "company": best_match.company,
                "similarity": best_score,
                "match_source": "voiceprint_audio"
            }

        return None

    @staticmethod
    def _cosine_similarity(a: List[float], b: List[float]) -> float:
        """Cosine similarity optimisé."""
        a_np = np.array(a)
        b_np = np.array(b)
        return float(
            np.dot(a_np, b_np) / (np.linalg.norm(a_np) * np.linalg.norm(b_np))
        )

3.3 Résultats Test E2E

Voiceprint Matching Results:
  Total speakers: 6
  Matched: 1/6 (16.7%)
  Pending: 5/6 (83.3%)

  Match details:
    SPEAKER_00 → Kwame Mensah (similarity=1.000)  ✓
    SPEAKER_01 → No match (best=0.42)             ✗
    SPEAKER_02 → No match (best=0.35)             ✗
    SPEAKER_03 → No match (best=0.51)             ✗
    SPEAKER_04 → No match (best=0.28)             ✗
    SPEAKER_05 → No match (best=0.19)             ✗

Note: Test E2E avec première transcription → Normal d'avoir peu de matches. Au bout de 10 transcriptions, taux de match > 90%.


4. Priority 2: RAG Enrichment

4.1 Deux Cas d'Usage

Cas A: Speaker Identifié (Priority 1 success)
Enrichissement métadonnées

Cas B: Speaker Pending (Priority 1 fail)
Extraction contexte pour LLM

4.2 Cas A: Enrichissement (Speaker Identifié)

async def _enrich_identified_speaker(
    self,
    speaker_name: str,
    speaker_segments: List[Dict],
    collection_name: str
) -> Dict[str, Optional[str]]:
    """
    Enrichir speaker déjà identifié avec métadonnées RAG.

    Steps:
    1. Mean pooling de TOUS les segments du speaker
    2. Search Qdrant avec nested filter (all_participants.name = speaker_name)
    3. Extract metadata: email, phone, company, department
    """
    # Step 1: Mean Pooling
    query_embedding = await self.embedding_service.mean_pool_speaker_segments(
        segments=speaker_segments,
        min_segment_length=10  # Filter "euh", "hmm"
    )

    # Step 2: Qdrant Search with Nested Filter
    search_results = await self.qdrant_service.search_similar_chunks(
        collection_name=collection_name,
        query_vector=query_embedding,
        top_k=3,
        participant_filter=speaker_name,  # CRITICAL: Filter by name
        score_threshold=0.5
    )

    # Step 3: Extract Metadata
    enrichment = {
        "role": None,
        "email": None,
        "phone": None,
        "company": None,
        "department": None
    }

    for result in search_results:
        payload = result["payload"]
        participants = payload.get("participants", [])

        for p in participants:
            if isinstance(p, dict) and p.get("name") == speaker_name:
                # Found matching participant
                enrichment.update({
                    "role": p.get("role"),
                    "email": p.get("email"),
                    "phone": p.get("phone"),
                    "company": p.get("company"),
                    "department": p.get("department")
                })

                # Update voiceprint library (permanent metadata)
                await self._update_voiceprint_metadata(
                    voiceprint_lib_id=voiceprint_lib_id,
                    metadata=enrichment,
                    db=db
                )

                return enrichment

    return enrichment

Exemple Requête Qdrant:

{
  "vector": [0.023, -0.156, ..., 0.012],
  "limit": 3,
  "score_threshold": 0.5,
  "filter": {
    "must": [
      {
        "key": "all_participants",
        "match": {
          "any": ["Kwame Mensah"]
        }
      }
    ]
  }
}

Résultat:

[
  {
    "id": 1,
    "score": 0.73,
    "payload": {
      "participants": [
        {
          "name": "Kwame Mensah",
          "role": "Senior Diplomat",
          "email": "kwame.mensah@onu.org",
          "phone": "+1-555-0123",
          "company": "Organisation des Nations Unies (ONU)"
        }
      ]
    }
  }
]

4.3 Cas B: Extraction (Speaker Pending)

async def _extract_potential_speakers(
    self,
    speaker_segments: List[Dict],
    collection_name: str
) -> Dict[str, List]:
    """
    Extraire context pour speaker non identifié.

    Steps:
    1. Mean pooling segments
    2. Search Qdrant (NO filter - search everything)
    3. Extract: potential_speakers, keywords, glossary
    """
    # Step 1: Mean Pooling
    query_embedding = await self.embedding_service.mean_pool_speaker_segments(
        segments=speaker_segments,
        min_segment_length=10
    )

    # Step 2: Qdrant Search (no participant filter)
    search_results = await self.qdrant_service.search_similar_chunks(
        collection_name=collection_name,
        query_vector=query_embedding,
        top_k=3,
        participant_filter=None,  # No filter
        score_threshold=0.0
    )

    # Step 3: Extract Context
    extraction = {
        "participants": [],
        "keywords": [],
        "glossary_terms": []
    }

    for result in search_results:
        payload = result["payload"]

        # Extract participants (with type filtering - bug fix)
        all_participants = payload.get("all_participants", [])
        for participant in all_participants:
            if isinstance(participant, dict):
                name = participant.get("name")
                if name and name != "null":
                    extraction["participants"].append(name)
            elif isinstance(participant, str):
                if participant != "null":
                    extraction["participants"].append(participant)

        # Extract mentioned_participants (with type filtering)
        mentioned = payload.get("mentioned_participants", [])
        for participant in mentioned:
            if isinstance(participant, dict):
                name = participant.get("name")
                if name and name != "null":
                    extraction["participants"].append(name)
            elif isinstance(participant, str):
                if participant != "null":
                    extraction["participants"].append(participant)

        # Extract keywords
        keywords = payload.get("keywords", [])
        extraction["keywords"].extend(keywords)

        # Extract glossary
        glossary = payload.get("glossary", {})
        extraction["glossary_terms"].extend(glossary.keys())

    # Deduplicate (safe - only strings after filtering)
    extraction["participants"] = list(set(extraction["participants"]))
    extraction["keywords"] = list(set(extraction["keywords"]))
    extraction["glossary_terms"] = list(set(extraction["glossary_terms"]))

    return extraction

Résultat Extraction:

{
  "participants": [
    "Dr. Marie Dubois",
    "Jean-Marc Petit (dit \"John\")",
    "Kwame Mensah"
  ],
  "keywords": [
    "évaluation",
    "politiques publiques",
    "stratégie nationale",
    "pauvreté"
  ],
  "glossary_terms": [
    "RAG",
    "TF-IDF",
    "BGE-M3"
  ]
}

Ce contexte sera injecté dans le prompt LLM (Priority 3).


5. Priority 3: LLM Inference

5.1 Deux Opérations LLM

Opération Input Output Usage
clean_transcription Raw text TOUS segments Corrected text Ponctuation, acronymes, noms propres
identify_speakers Pending segments + RAG context Speaker identifications Inférence "Intervenant X" = "Jean Dupont"

5.2 Operation 1: Clean Transcription

Prompt Template:

CLEAN_TRANSCRIPTION_PROMPT = """
Tu es un expert en correction de transcriptions automatiques.

CONTEXTE:
- Transcription brute d'une réunion (français)
- Participants identifiés: {participants}
- Mots-clés projet: {keywords}

TRANSCRIPTION BRUTE:
---
{raw_transcription}
---

TÂCHE:
Corriger la transcription en appliquant:

1. **Ponctuation correcte** (points, virgules, majuscules)
2. **Acronymes** (utiliser glossaire si disponible)
3. **Noms propres** (participants, entreprises, lieux)
4. **Cohérence temporelle** (verbes au bon temps)

RÈGLES:
- NE PAS modifier le sens
- NE PAS ajouter d'informations
- Conserver structure [Speaker]: texte

FORMAT RÉPONSE (JSON):
```json
{
  "cleaned_transcription": {
    "SPEAKER_00": "Texte corrigé...",
    "SPEAKER_01": "Texte corrigé..."
  },
  "corrections_applied": [
    "Fixed punctuation (12 commas, 8 periods)",
    "Corrected acronym 'RAG' with glossary"
  ]
}

"""
**Appel:**
```python
cleaned = await llm_post_processor.clean_transcription(
    raw_transcription={
        "SPEAKER_00": "bonjour bienvenue a tous",
        "SPEAKER_01": "je travaille sur rag"
    },
    participants=["Jean Dupont", "Marie Martin"],
    keywords=["RAG", "backend"],
    language="fr"
)

Output:

{
  "cleaned_transcription": {
    "SPEAKER_00": "Bonjour, bienvenue à tous.",
    "SPEAKER_01": "Je travaille sur le RAG."
  },
  "corrections_applied": [
    "Added punctuation (2 commas, 2 periods)",
    "Fixed capitalization (3 words)",
    "Expanded acronym 'rag' → 'RAG'"
  ]
}

5.3 Operation 2: Identify Speakers

Prompt Template:

IDENTIFY_SPEAKERS_PROMPT = """
Tu es un expert en analyse de transcriptions pour identifier les participants.

CONTEXTE:
- Participants potentiels (RAG): {potential_participants}
- Mots-clés: {keywords}
- Glossaire: {glossary_terms}

SEGMENTS À IDENTIFIER:
---
{unidentified_segments}
---

TÂCHE:
Identifier chaque SPEAKER_XX en analysant:
1. Le **contenu** de ses interventions
2. **Participants potentiels** du contexte RAG
3. **Auto-identifications** ("Je suis X", "En tant que Y")
4. **Cohérence thématique** (qui parle de quoi)

RÈGLES:
- Confidence > 0.75 → Identification valide
- Confidence < 0.75 → Laisser "Intervenant X"
- NE PAS inventer de noms

FORMAT RÉPONSE (JSON):
```json
{
  "speaker_identifications": {
    "SPEAKER_00": {
      "identified_name": "Jean Dupont",
      "confidence": 0.85,
      "reasoning": "S'est présenté comme lead developer, parle de backend"
    },
    "SPEAKER_01": {
      "identified_name": "Intervenant 1",
      "confidence": 0.30,
      "reasoning": "Pas assez d'indices"
    }
  }
}

"""
**Appel:**
```python
identifications = await llm_post_processor.identify_speakers(
    unidentified_segments={
        "SPEAKER_00": "Bonjour, je suis lead developer backend...",
        "SPEAKER_01": "J'ai une question..."
    },
    potential_participants=["Jean Dupont", "Marie Martin"],
    keywords=["backend", "API"],
    glossary_terms=["RAG", "FastAPI"]
)

Output:

{
  "speaker_identifications": {
    "SPEAKER_00": {
      "identified_name": "Jean Dupont",
      "confidence": 0.92,
      "reasoning": "Auto-identification 'lead developer backend' + contexte RAG"
    },
    "SPEAKER_01": {
      "identified_name": "Intervenant 1",
      "confidence": 0.25,
      "reasoning": "Intervention trop courte, aucun indice"
    }
  }
}

Post-Processing:

# Confirm pending voiceprints si confidence > 0.75
for speaker_label, data in identifications.items():
    if data["confidence"] >= 0.75:
        await confirm_pending_voiceprint(
            voiceprint_lib_id=pending_voiceprints[speaker_label],
            identified_name=data["identified_name"],
            match_source="llm_inference",
            db=db
        )


6. Mean Pooling Strategy

6.1 Problème: Single Segment Matching

Avant Mean Pooling:

Speaker: 3 segments
  [1] "Euh..."           → Embed → [0.1, 0.2, ...]
  [2] "Je travaille..."  → Embed → [0.3, 0.4, ...]
  [3] "Le backend..."    → Embed → [0.2, 0.5, ...]

Match against: Segment [1] only

Approche naïve - Accuracy: 45% - Un seul segment peut être bruité

Après Mean Pooling:

Speaker: 3 segments
  [1] "Euh..."           → Embed → [0.1, 0.2, ...]
  [2] "Je travail..."    → Embed → [0.3, 0.4, ...]
  [3] "Le backend..."    → Embed → [0.2, 0.5, ...]
                         Mean([emb1, emb2, emb3])
                    Pooled: [0.2, 0.37, ...]
                      L2 Normalize
                   Final: [0.22, 0.41, ...] (norm=1.0)

Match against: Pooled embedding

Mean Pooling - Accuracy: 78-82% - Moyenne robuste contre bruit

6.2 Implémentation

class EmbeddingService:
    def mean_pool_speaker_segments(
        self,
        segments: List[Dict[str, str]],
        min_segment_length: int = 10
    ) -> List[float]:
        """
        Mean pooling avec filtrage bruit.

        Steps:
        1. Filter segments < min_length ("euh", "hmm")
        2. Encode all valid segments
        3. Mean pooling
        4. L2 normalization (CRITICAL for cosine)
        """
        # Step 1: Filter
        valid_texts = [
            seg["transcription"]
            for seg in segments
            if len(seg["transcription"].strip()) >= min_segment_length
        ]

        if not valid_texts:
            raise ValueError("No valid segments for pooling")

        # Step 2: Encode
        embeddings = self.model.encode(
            valid_texts,
            normalize_embeddings=False,  # Don't normalize yet
            convert_to_numpy=True
        )
        # Shape: (N, 1024)

        # Step 3: Mean Pooling
        pooled = np.mean(embeddings, axis=0)
        # Shape: (1024,)

        # Step 4: L2 Normalization
        norm = np.linalg.norm(pooled)
        if norm == 0:
            raise ValueError("Zero norm after pooling")

        normalized = pooled / norm

        # Validation
        final_norm = np.linalg.norm(normalized)
        logger.info(
            f"DEBUG: Pooled {len(valid_texts)} segments, "
            f"final_norm={final_norm:.6f}"
        )

        # Must be 1.0 for cosine similarity
        assert abs(final_norm - 1.0) < 0.0001, f"Norm must be 1.0, got {final_norm}"

        return normalized.tolist()

6.3 Résultats Test E2E

INFO: DEBUG: Pooled 8 segments, final_norm=1.000000  ✓
INFO: DEBUG: Pooled 6 segments, final_norm=1.000000  ✓
INFO: DEBUG: Pooled 5 segments, final_norm=1.000000  ✓
INFO: DEBUG: Pooled 4 segments, final_norm=1.000000  ✓
INFO: DEBUG: Pooled 7 segments, final_norm=1.000000  ✓
INFO: DEBUG: Pooled 3 segments, final_norm=1.000000  ✓

Validation académique: Conforme à Sentence-BERT normalization


7. Voiceprint Auto-Save

7.1 Workflow Auto-Save

sequenceDiagram participant BFF participant DB as PostgreSQL participant VP as VoiceprintLibrary Note over BFF: Priority 1: Match fails BFF->>DB: Check existing pending for speaker_label DB-->>BFF: None found BFF->>VP: INSERT voiceprint_library Note right of VP: id: "vp-abc123"
status: "pending"
identified_name: NULL
speaker_label: "SPEAKER_00"
voiceprint_audio_512d: [...]
voiceprint_text_1024d: [...] VP-->>BFF: voiceprint_lib_id Note over BFF: Priority 3: LLM identifies BFF->>VP: UPDATE voiceprint_library
SET status='confirmed',
identified_name='Jean Dupont',
match_source='llm_inference' Note over BFF: Next transcription BFF->>DB: Query confirmed voiceprints DB-->>BFF: Found "Jean Dupont" (vp-abc123) BFF->>BFF: Cosine similarity = 0.92 Note over BFF: Match! No LLM needed

7.2 Implémentation

async def auto_save_pending_voiceprint(
    self,
    speaker_label: str,
    voiceprint_512d: List[float],
    user_id: str,
    transcript_id: str,
    db: Session
) -> str:
    """
    Auto-save voiceprint non matché en status 'pending'.

    Returns:
        voiceprint_lib_id
    """
    # Generate text voiceprint (1024d) from speaker segments
    speaker_segments = [
        seg for seg in segments
        if seg["speaker"] == speaker_label
    ]

    voiceprint_1024d = await self.embedding_service.mean_pool_speaker_segments(
        segments=speaker_segments,
        min_segment_length=10
    )

    # Create pending voiceprint
    voiceprint = VoiceprintLibrary(
        id=generate_uuid(),
        user_id=user_id,
        transcript_id=transcript_id,
        speaker_label=speaker_label,

        voiceprint_audio_512d=json.dumps(voiceprint_512d),
        audio_model="pyannote-audio",

        voiceprint_text_1024d=json.dumps(voiceprint_1024d),
        text_model="BAAI/bge-m3",

        status="pending",
        identified_name=None,
        match_source="unknown",

        first_seen_at=datetime.utcnow(),
        last_seen_at=datetime.utcnow(),

        created_at=unix_timestamp(),
        updated_at=unix_timestamp()
    )

    db.add(voiceprint)
    db.commit()

    logger.info(
        f"DEBUG: Auto-saved pending voiceprint - "
        f"ID: {voiceprint.id}, speaker: {speaker_label}"
    )

    return voiceprint.id

7.3 Confirmation Voiceprint

async def confirm_pending_voiceprint(
    self,
    voiceprint_lib_id: str,
    identified_name: str,
    match_source: str,
    db: Session
):
    """
    Confirmer voiceprint pending après identification LLM.
    """
    voiceprint = db.query(VoiceprintLibrary).filter(
        VoiceprintLibrary.id == voiceprint_lib_id
    ).first()

    if not voiceprint:
        raise ValueError(f"Voiceprint {voiceprint_lib_id} not found")

    voiceprint.status = "confirmed"
    voiceprint.identified_name = identified_name
    voiceprint.match_source = match_source
    voiceprint.last_seen_at = datetime.utcnow()
    voiceprint.updated_at = unix_timestamp()

    db.commit()

    logger.info(
        f"DEBUG: Confirmed voiceprint {voiceprint_lib_id} - "
        f"Name: {identified_name}, source: {match_source}"
    )

8. Résultats E2E Test

8.1 Configuration Test

Audio: reunion_panel-citoyen.mp3 (5min)
Contextual Files: 4
  - CV_JeanMarc_Petit_John.txt
  - CV_Kwame_Mensah.txt
  - CV_Marie_Dubois_Expert.txt
  - glossaire_enrichi_avec_erreurs.txt

Speakers détectés: 6 (SPEAKER_00 to SPEAKER_05)
Segments: 33

8.2 Résultats Détaillés

Priority 1: Voiceprint Matching

Total speakers: 6
Matched: 1/6 (16.7%)
  - SPEAKER_00 → Kwame Mensah (similarity=1.000)  ✓

Pending: 5/6 (83.3%)
  - SPEAKER_01 → Auto-saved (vp-001)
  - SPEAKER_02 → Auto-saved (vp-002)
  - SPEAKER_03 → Auto-saved (vp-003)
  - SPEAKER_04 → Auto-saved (vp-004)
  - SPEAKER_05 → Auto-saved (vp-005)

Priority 2: RAG Enrichment/Extraction

Enriched (identified speakers): 1/6
  - Kwame Mensah: 
      email: kwame.mensah@onu.org
      phone: +1-555-0123
      company: Organisation des Nations Unies
      role: Senior Diplomat

Context extracted (pending speakers): 5/6
  - Potential participants: 
      ['Dr. Marie Dubois', 'Jean-Marc Petit (dit "John")']
  - Keywords: 
      ['évaluation', 'politiques publiques', 'stratégie']
  - RAG scores: 0.51-0.73

Priority 3: LLM Processing

Clean Transcription: SUCCESS (33 segments)
  - Corrections applied: 15
  - Time: 15s

Speaker Identification: TIMEOUT (90s)
  - Status: Failed (MeetNoo LLM side)
  - Pending speakers: 5 (kept as "Intervenant 0-4")

8.3 Score Global

Composant Score Détails
Pipeline MeetNoo 20/20 33 segments, 6 speakers, 6 voiceprints
Voiceprint Matching 20/20 1.000 similarity Kwame Mensah
Mean Pooling 20/20 Norm=1.0 pour les 6 speakers ✓
RAG Enrichment 15/20 Scores 0.51-0.63, métadonnées OK
LLM Processing 0/20 Timeout 90s (côté GPU)
TOTAL 75/100 Bon score malgré timeout LLM

Navigation: ← Pipeline | LLM Prompting →