Explainable by-design Audio Segmentation through Non-Negative Matrix Factorization and Probing

Audio segmentation is a key task for many speech technologies, most of which rely on neural networks that achieve high performance but are usually regarded as black boxes. However, in many domains, such as health and forensics, good performance is not enough: the output decision must also be explained. To be interpretable, explanations derived directly from latent representations need to satisfy "good" properties such as informativeness, compactness, or modularity. In this article, we propose an explainable-by-design audio segmentation model based on non-negative matrix factorization (NMF), which is a good candidate for the design of interpretable representations. This paper shows that our model reaches good segmentation performance and presents in-depth analyses of the latent representation extracted from the non-negative matrix. The proposed approach opens new perspectives toward the evaluation of interpretable representations according to "good" properties.
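For readers unfamiliar with NMF, the following is a minimal, generic sketch of the factorization itself, not the authors' segmentation model. It assumes a non-negative input matrix V (for example a magnitude spectrogram, here replaced by synthetic data) and uses the standard multiplicative-update rule for the Frobenius objective. The function name `nmf`, the rank, and the toy dimensions are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (n_freq x n_frames) as W @ H.

    Uses multiplicative updates for the Frobenius-norm objective.
    W holds `rank` non-negative spectral templates (columns);
    H holds their non-negative activations over time (rows).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in for a spectrogram: an exactly rank-3 non-negative matrix.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 50))

W, H = nmf(V, rank=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Because every entry of W and H is non-negative, each column of W can be read as an additive "part" of the input and each row of H as how strongly that part is active in each frame, which is the property that makes NMF attractive for interpretable representations.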

Keywords

Audio segmentation; NMF; explainability; probing

Domains

Signal and Image Processing [eess.SP]; Artificial Intelligence [cs.AI]; Sound [cs.SD]

Main file

nmf_probing2024-4.pdf (592.3 KB)

Origin: Files produced by the author(s)

Contributor: Théo Mariotte

https://univ-lemans.hal.science/hal-04617131

Submitted on: Wednesday, June 19, 2024, 11:34:38

Last modified on: Thursday, October 31, 2024, 09:06:03

Dates and versions

hal-04617131, version 1 (19-06-2024)

Identifiers

HAL Id: hal-04617131, version 1

Cite

Martin Lebourdais, Théo Mariotte, Antonio Almudévar, Marie Tahon, Alfonso Ortega. Explainable by-design Audio Segmentation through Non-Negative Matrix Factorization and Probing. Interspeech 2024, International Speech Communication Association (ISCA), Sep 2024, Kos, Greece. ⟨hal-04617131⟩

Collections

UNIV-TLSE2 CNRS UNIV-LEMANS UT1-CAPITOLE GENCI LIUM LIUM-LST LTCI IDS S2A IP_PARIS IRIT TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP INSTITUT-MINES-TELECOM
