Publications

Our faculty researchers' publications are available on the HAL platform:

HAL publications

The doctoral theses of LTCI PhD graduates are available on the HAL platform:

HAL theses

Browse the publications in the HAL open archive by year:

2025

Modèle physique variationnel pour l’estimation de réponses impulsionnelles de salles
- Lalay Louis
- Fontaine Mathieu
- Badeau Roland
, 2025. Estimating a room impulse response is essential for tasks such as dereverberation, which improves automatic speech recognition. Most existing methods rely either on statistical signal processing or on deep neural networks inspired by signal processing. However, combining statistical and physical modeling remains largely unexplored in room impulse response estimation. This paper proposes a novel approach that integrates both aspects through a physical model. The room response is decomposed into interpretable parameters: white Gaussian noise modulated by a frequency-dependent exponential decay (modeling wall absorption) and an autoregressive filter (modeling, for example, the microphone response). Optimizing a variational free-energy function enables practical parameter estimation. We show that, given the dry and reverberant signals, the proposed method outperforms classical deconvolution in noisy environments, as confirmed by objective metrics.
SigN: SIMBox Activity Detection Through Latency Anomalies at the Cellular Edge
- Kouam Anne Josiane
- Carneiro Viana Aline
- Martins Philippe
- Adjih Cédric
- Tchana Alain
, 2025. Despite their widespread adoption, cellular networks face growing vulnerabilities due to their inherent complexity and the integration of advanced technologies. One of the major threats in this landscape is Voice over IP (VoIP) to GSM gateways, known as SIMBox devices. These devices use multiple SIM cards to route VoIP traffic through cellular networks, enabling international bypass fraud with losses of up to $3.11 billion annually. Beyond financial impact, SIMBox activity degrades network performance, threatens national security, and facilitates eavesdropping on communications. Existing detection methods for SIMBox activity are hindered by evolving fraud techniques and implementation complexities, limiting their practical adoption in operator networks. This paper addresses the limitations of current detection methods by introducing SigN, a novel approach to identifying SIMBox activity at the cellular edge. The proposed method focuses on detecting remote SIM card association, a technique used by SIMBox appliances to mimic human mobility patterns. The method detects latency anomalies between SIMBox and standard devices by analyzing cellular signaling during network attachment. Extensive indoor and outdoor experiments demonstrate that SIMBox devices generate significantly higher attachment latencies, particularly during the authentication phase, where latency is up to 23 times greater than that of standard devices. We attribute part of this overhead to immutable factors such as LTE authentication standards and Internet-based communication protocols. Therefore, our approach offers a robust, scalable, and practical solution to mitigate SIMBox activity risks at the network edge. (10.1145/3708821.3733902)
DOI: 10.1145/3708821.3733902
Quelques éléments historiques portant sur la cyclostationnarité et son application en télécommunications
- Ciblat Philippe
, 2025. In this paper, we give a few key dates in the theory of cyclostationary signals, the first being 1958 in the United States and the second 1959 in the Soviet Union, followed by a few others on both sides of the Atlantic Ocean. We then dwell on two flagship applications of cyclostationarity, synchronization and blind equalization, in which the French community played a significant role in the 1990s.
Déréverbération non-supervisée de la parole par modèle hybride
- Bahrman Louis
- Fontaine Mathieu
- Richard Gaël
, 2025, pp.1-4. This paper introduces a new training strategy to improve speech dereverberation systems in an unsupervised manner, using only reverberant signals. Most existing algorithms require pairs of (dry, reverberant) signals, which are difficult to obtain. Our approach instead uses limited acoustic information, such as the reverberation time (RT60), to train a dereverberation system. Experimental results show that our method achieves more consistent performance than the state of the art across various objective metrics.
La borne bayesienne de Schützenberger-van Trees : Un principe d'incertitude sur l'a posteriori
- Rioul Olivier
- Renaux Alexandre
, 2025. This paper offers a historical perspective on the Bayesian Cramér-Rao bound (BCRB), generally attributed to van Trees, who discovered it in 1968. According to Stigler's law of eponymy, no scientific discovery is named after its original discoverer. This is true not only of the Cramér-Rao bound itself (due notably to the French mathematicians Fréchet and Darmois) but also of the van Trees inequality. Indeed, the French physician, geneticist, epidemiologist and mathematician Marcel-Paul (Marco) Schützenberger, in a short article of only about fifteen lines written in 1956 (more than a decade before van Trees), had not only proved the BCRB but, as a close reading of his proof shows, had done so with a highly original approach that relates it to the Weyl-Heisenberg uncertainty principle on the posterior. We review and compare the contributions of Schützenberger and van Trees, as well as those of Gart in 1959. The general equivalence between the BCRB and the uncertainty principle on the posterior opens up new perspectives.
Unified Variational and Physics-aware Model for Room Impulse Response Estimation
- Lalay Louis
- Fontaine Mathieu
- Badeau Roland
, 2025. Room impulse response estimation is essential for tasks like speech dereverberation, which improves automatic speech recognition. Most existing methods rely on either statistical signal processing or deep neural networks designed to replicate signal processing principles. However, combining statistical and physical modeling for RIR estimation remains largely unexplored. This paper (submitted to Interspeech 2025) proposes a novel approach integrating both aspects through a theoretically grounded model. The RIR is decomposed into interpretable parameters: white Gaussian noise filtered by a frequency-dependent exponential decay (e.g. modeling wall absorption) and an autoregressive filter (e.g. modeling microphone response). A variational free-energy cost function enables practical parameter estimation. As a proof of concept, we show that given dry and reverberant speech signals, the proposed method outperforms classical deconvolution in noisy environments, as validated by objective metrics.
Emotional speech markers of psychiatric disturbance in Huntington’s disease
- Chenain Lucie
- Fabre Audrey
- Titeux Hadrien
- Morgado Graça
- Youssov Katia
- Clavel Chloé
- Bachoud-Lévi Anne-Catherine
Frontiers in Psychiatry, Frontiers, 2025, 16, pp.1633492-1:1633492-20. Introduction: Psychiatric disorders and difficulties in emotional expression represent a major problem in the management of Huntington's Disease (HD). To improve patient follow-up, we propose to investigate the link between emotional expression and psychiatric symptoms, measured by the Problem Behaviors Assessment (PBA) scale. To this aim, we developed the first emotional/psychiatric speech corpus, emoHD. Methods: We included 102 HD gene carriers and 35 healthy controls (HC). Psychiatric symptoms were assessed using PBA sub-scales for Depression, Irritability/aggressivity, Apathy, and Obsessive/compulsive symptoms. Speech was annotated using three emotional descriptors: primary emotions, affective phenomena, and activation levels. Affective phenomena labels were selected based on PBA statements by external participants unaware of the study's aims. We analyzed (1) emotional descriptors' relationships, (2) emotional expression differences between HD and HC, and (3) the associations between emotions and psychiatric symptoms. Results: HD patients showed lower emotional expressiveness than HC, with more neutral activation levels (=0). Only the primary emotion "angry" was less expressed in HD compared to HC. In contrast, they expressed more affective phenomena states like "apathetic", "confused", "depressed", "disoriented", "frustrated", and "pessimistic" than HC, whereas they expressed less "other" and "irritable" than HC. Expressed emotions were congruent with psychiatric symptoms (e.g., "anxious" and "nervous" are positively associated with Depression PBA sub-scale; "frustrated" with Irritability/aggressivity PBA sub-scale). (10.3389/fpsyt.2025.1633492)
DOI: 10.3389/fpsyt.2025.1633492
Enhancing Plasticity for First Session Adaptation Continual Learning
- Marouf Imad Eddine
- Roy Subhankar
- Lathuilière Stéphane
- Tartaglione Enzo
, 2025. The integration of large pre-trained models (PTMs) into Class-Incremental Learning (CIL) has facilitated the development of computationally efficient strategies such as First-Session Adaptation (FSA), which fine-tunes the model solely on the first task while keeping it frozen for subsequent tasks. Although effective in homogeneous task sequences, these approaches struggle when faced with the heterogeneity of real-world task distributions. We introduce Plasticity-Enhanced Test-Time Adaptation in Class-Incremental Learning (PLASTIC), a method that reinstates plasticity in CIL while preserving model stability. PLASTIC leverages Test-Time Adaptation (TTA) by dynamically fine-tuning LayerNorm parameters on unlabeled test data, enabling adaptability to evolving tasks and improving robustness against data corruption. To prevent TTA-induced model divergence and maintain stable learning across tasks, we introduce a teacher-student distillation framework, ensuring that adaptation remains controlled and generalizable. Extensive experiments across multiple benchmarks demonstrate that PLASTIC consistently outperforms both conventional and state-of-the-art PTM-based CIL approaches, while also exhibiting inherent robustness to data corruptions. Code is available at: https://github.com/IemProg/PLASTIC.
Efficient Negative Weight Realization for Analog Resistive Neural Networks
- Kiraz Zulal
- Pham Dang-Kièn Germain
- Desgreys Patricia
, 2025. Most analog nonlinear resistive neural networks for machine learning training double the input and output neuron nodes to implement negative weights. However, this approach increases network size, modifies the gradient computation, and complicates circuit design. We propose an alternative circuit topology that retains a one-to-one correspondence between neurons in the original model and their analog counterparts. Our design employs a single input source for all first-layer weights, a single resistor per weight, and a bidirectional amplifier for the weights of the remaining layers to handle negative connections without duplicating neurons. We validate our design on a binary XOR classification task over 100 training epochs and 100 randomized initializations. Our single-resistor approach achieved an average final error of -6.6 dB and required approximately 568 minutes of total CPU time. In comparison, the doubled-node design reached -4.6 dB error and consumed around 1104 minutes of CPU time. This equates to nearly 49% less computation for the single-resistor circuit while preserving the standard gradient update procedure, demonstrating that negative weights can be realized more efficiently without doubling input/output neurons.
Immersed boundary–lattice Boltzmann mesoscale method for wetting problems
- Bellantoni Elisa
- Guglietta Fabio
- Pelusi Francesca
- Desbrun Mathieu
- Um Kiwon
- Nicolaou Mihalis
- Savva Nikos
- Sbragaglia Mauro
Physical Review E, American Physical Society (APS), 2025, 112 (2), pp.025305 (1-15). We develop a mesoscale computational model to describe the interaction of a droplet with a solid. The model is based on the hybrid combination of the immersed boundary and the lattice Boltzmann computational schemes: the former is used to model the non-ideal sharp interface of the droplet coupled with the inner and outer fluids, simulated with the lattice Boltzmann scheme. We further introduce an interaction force to model the wetting interactions of the droplet with the solid at mesoscale: this interaction force is designed with the key computational advantage of providing a regularization of the interface profile close to the contact line, avoiding abrupt curvature changes that could otherwise cause numerical instabilities. The proposed model substantially improves earlier immersed boundary - lattice Boltzmann models for wetting in that it allows a description of an ample variety of wetting interactions, ranging from hydrophobic to hydrophilic cases, without the need for any pre-calibration study on model parameters to be used. Model validations against theoretical results for droplet shape at equilibrium and scaling laws for droplet spreading dynamics are addressed. (10.1103/mp3p-8j22)
DOI: 10.1103/mp3p-8j22
Time-resolved second-order autocorrelation function of parametric down-conversion
- Horoshko Dmitri
- Srivastava Shivang
- Sośnicki Filip
- Mikołajczyk Michał
- Karpiński Michał
- Brecht Benjamin
- Kolobov Mikhail
Physical Review A, American Physical Society, 2025, 112 (2), pp.023703-1:023703-13. We study a possibility of measuring the time-resolved second-order autocorrelation function of one of two beams generated in type-II parametric down-conversion by means of temporal magnification of this beam, bringing its correlation time from the picosecond to the nanosecond scale, which can be resolved by modern photodetectors. We show that such a measurement enables one to infer directly the degree of global coherence of that beam, which is linked by a simple relation to the number of modes characterizing the entanglement between the two generated beams. We illustrate the proposed method by an example of photon pairs generated in a periodically poled potassium titanyl phosphate (KTP) crystal with a symmetric group velocity matching for various durations of the pump pulse, resulting in different numbers of modes. Our theoretical model also shows that the magnified double-heralded autocorrelation function of one beam exhibits a local maximum around zero delay time, corresponding to photon bunching at a short time scale. (10.1103/7ckm-tm3r)
DOI: 10.1103/7ckm-tm3r
SpectreShield: Design and Analysis of Spectre Countermeasures on RISC-V Using gem5
- Khan Mahreen
- Mushtaq Maria
- Pacalet Renaud
- Apvrille Ludovic
, 2025. Speculative execution attacks like Spectre exploit microarchitectural side effects to leak sensitive data during transient execution. While various software and hardware countermeasures have been proposed for x86 and ARM architectures, their effectiveness and microarchitectural impact remain underexplored on RISC-V platforms. To study such attacks and evaluate these countermeasures, simulation tools like the gem5 simulator provide detailed insights into microarchitectural state changes during speculation. In this paper, we present the first comprehensive evaluation of Spectre-v1 countermeasures on the RISC-V architecture using the gem5 full-system simulator. We implement and assess four Spectre-v1 mitigations: index masking (CM1), randomized offset (CM2), fence-based serialization (CM3), and bitwise selection (CM4). Our experiments reveal that, in the absence of mitigations, Spectre-v1 enables 100% secret key recovery. In contrast, all proposed countermeasures reduce the recovery rate to below 1%, with branch mispredictions decreasing by 41.7%-46.3%. The paper analyzes the security-performance trade-offs of each approach. Beyond demonstrating their effectiveness, we quantify their microarchitectural impact, measuring reductions in squashed instructions, DRAM latency variability, and return address stack mispredictions. This paper provides a practical framework for evaluating transient execution defenses and advances secure-by-design RISC-V processors.
Bayesian Stream Tuner: Dynamic Hyperparameter Optimization for Real-Time Data Streams
- Verma Nilesh
- Bifet Albert
- Pfahringer Bernhard
- Bahri Maroua
, 2025, 2, pp.2871-2882. Hyperparameter optimization is crucial for maximizing machine learning model performance, yet most existing algorithms are designed for batch or offline scenarios and assume static data distributions. Such assumptions fall short in data stream settings, where models must adapt to evolving inputs in real time. To address these limitations, we propose the Bayesian Stream Tuner (BST), a novel framework for online hyperparameter optimization in nonstationary data streams. BST maintains a dynamic set of candidate hyperparameter configurations and periodically refines them using an incremental Bayesian model, which estimates configuration performance based on recent data statistics and hyperparameter values. This systematic exploration and refinement strategy allows BST to detect and respond to concept drift by resetting its adaptation mechanisms whenever necessary, ensuring strong performance under changing distributions. Our theoretical analysis establishes sublinear regret bounds for BST in dynamic environments, and extensive experiments on classification and regression tasks demonstrate that BST consistently outperforms state-of-the-art online hyperparameter optimization methods in both predictive accuracy and adaptability, making it a powerful solution for real-time hyperparameter tuning in evolving data streams. (10.1145/3711896.3736852)
DOI: 10.1145/3711896.3736852
Satellite Image Time-Series Data Augmentation Using an Attention Mechanism Variational Recurrent Autoencoder
- Chaabane Ferdaous
- Tupin Florence
, 2025. Data scarcity presents a significant challenge in satellite image analysis, particularly for developing robust models in remote sensing applications. High-quality and abundant data are essential for accurate predictions; however, acquiring Satellite Image Time-Series (SITS) data is often constrained by factors such as limited temporal coverage and the high cost of Very High Resolution (VHR) acquisitions. To address this issue, we propose a novel Attention-based Variational Recurrent Autoencoder (AVRAE) designed for generating synthetic satellite image time-series data. This method extends the evidence lower bound (ELBO) of variational inference to incorporate the temporal dependencies essential for satellite data. A recurrent neural network-based autoencoder framework is employed, integrated with an attention mechanism to effectively capture both short- and long-term temporal relationships. The AVRAE framework synthesizes realistic and statistically representative satellite time-series data, enabling enhanced analysis for remote sensing applications. Evaluations using real-world satellite datasets demonstrate that AVRAE produces coherent and statistically valid synthetic data, thereby improving VHR SITS data quality for deep learning-based remote sensing applications.
An Information Theoretic Proof of the Chernoff-Hoeffding Inequality
- Rioul Olivier
- Solé Patrick
Information Processing Letters, Elsevier, 2025, 190, pp.106582. The Chernoff bound is a well-known upper bound on the tail of binomial distributions of parameter 1/2 involving the binary entropy function. Hoeffding's inequality (or the Chernoff-Hoeffding inequality) is a generalization for binomial distributions of parameter 1 - 1/q, involving the q-ary entropy function (with q ≥ 2), which can be written in terms of the Kullback-Leibler divergence and is related to the bound in Fano's inequality. We give an information theoretic proof of that bound, and sketch some applications to channel and source coding. We also derive a refined bound which is always sharper. (10.1016/j.ipl.2025.106582)
DOI: 10.1016/j.ipl.2025.106582
Long run convergence of discrete-time interacting particle systems of the McKean-Vlasov type
- Bianchi Pascal
- Hachem Walid
- Priser Victor
Stochastic Processes and their Applications, Elsevier, 2025. We consider a discrete-time system of n coupled random vectors, a.k.a. interacting particles. The dynamics involve a vanishing step size, some random centered perturbations, and a mean vector field which induces the coupling between the particles. We study the doubly asymptotic regime where both the number of iterations and the number n of particles tend to infinity, without any constraint on the relative rates of convergence of these two parameters. We establish that the empirical measure of the interpolated trajectories of the particles converges in probability, in an ergodic sense, to the set of recurrent McKean-Vlasov distributions. A first application example is the granular media equation, where the particles are shown to converge to a critical point of the Helmholtz energy. A second example is the convergence of stochastic gradient descent to the global minimizer of the risk, in a wide two-layer neural network using random features.
Melody-Lyrics Matching with Contrastive Alignment Loss
- Wang Changhong
- Olvera Michel
- Richard Gaël
, 2025. The connection between music and lyrics is far beyond semantic bonds. Conceptual pairs in the two modalities such as rhythm and rhyme, note duration and syllabic stress, and structure correspondence, raise a compelling yet seldom-explored direction in the field of music information retrieval. In this paper, we present melody-lyrics matching (MLM), a new task which retrieves potential lyrics for a given symbolic melody from text sources. Rather than generating lyrics from scratch, MLM essentially exploits the relationships between melody and lyrics. We propose a self-supervised representation learning framework with contrastive alignment loss for melody and lyrics. This has the potential to leverage the abundance of existing songs with paired melody and lyrics. No alignment annotations are required. Additionally, we introduce sylphone, a novel representation for lyrics at syllable-level activated by phoneme identity and vowel stress. We demonstrate that our method can match melody with coherent and singable lyrics with empirical results and intuitive examples. We open-source our code and provide matching examples on the companion webpage: https://github.com/changhongw/mlm.
On the spectral decomposition of the complex Robin Laplacian
- Badeau Roland
Journal of the Acoustical Society of America, Acoustical Society of America, 2025, 158 (1), pp.838-848. The mathematical properties of the Laplacian on a bounded domain are well-known when the boundary condition is of the first type (Dirichlet) or second type (Neumann). In both cases, this operator is self-adjoint and, therefore, diagonalizable, its spectrum is discrete, and the set of eigenfunctions can be chosen to form an orthonormal basis of the Hilbert space of square-integrable functions on the domain. However, in the case of the third type (Robin) boundary condition, the same is true only when the parameter is real-valued. In contrast, when this parameter is complex-valued, the Laplacian may not even be diagonalizable. In this paper, the spectral decomposition of the complex Robin Laplacian is investigated in the most general case possible, and a formula that decomposes any square-integrable function on the set of its (generalized) eigenfunctions is provided. This result is applied to the Green's function of the Helmholtz equation, whose existence, uniqueness, and closed-form expression are established in this general setting, and the statistical wave field theory, which provides the statistical laws of waves propagating in a bounded domain. (10.1121/10.0037233)
DOI: 10.1121/10.0037233
Benchmarking the Benchmarks: Reproducing Climate-Related NLP Tasks
- Calamai Tom
- Balalau Oana
- Suchanek Fabian M
, 2025. Significant efforts have been made in the NLP community to facilitate the automatic analysis of climate-related corpora by tasks such as climate-related topic detection, climate risk classification, question answering over climate topics, and many more. In this work, we perform a reproducibility study on 8 tasks and 29 datasets, testing 6 models. We find that many tasks rely heavily on surface-level keyword patterns rather than deeper semantic or contextual understanding. Moreover, we find that 96% of the datasets contain annotation issues, with 16.6% of the sampled wrong predictions of a zero-shot classifier being actually clear annotation mistakes, and 38.8% being ambiguous examples. These results call into question the reliability of current benchmarks to meaningfully compare models and highlight the need for improved annotation practices. We conclude by outlining actionable recommendations to enhance dataset quality and evaluation robustness.
Graphically Speaking: Unmasking Abuse in Social Media with Conversation Insights
- Nouri Célia
- Cointet Jean-Philippe
- Clavel Chloé
, 2025. Detecting abusive language in social media conversations poses significant challenges, as identifying abusiveness often depends on the conversational context, characterized by the content and topology of preceding comments. Traditional Abusive Language Detection (ALD) models often overlook this context, which can lead to unreliable performance metrics. Recent Natural Language Processing (NLP) approaches that incorporate conversational context often rely on limited or overly simplified representations of this context, leading to inconsistent and sometimes inconclusive results. In this paper, we propose a novel approach that utilizes graph neural networks (GNNs) to model social media conversations as graphs, where nodes represent comments, and edges capture reply structures. We systematically investigate various graph representations and context windows to identify the optimal configurations for ALD. Our GNN model outperforms both context-agnostic baselines and linear context-aware methods, achieving significant improvements in F1 scores. These findings demonstrate the critical role of structured conversational context and establish GNNs as a robust framework for advancing context-aware ALD. Our code is available at https://github.com/celia-nouri/ConversationALD/.
StreamMLOps: Online Learning in Practice from Big Data Streams & Real-Time Applications
- Barry Mariam
- Montiel Jacob
- Bifet Albert
- Manchev Nikolay
- Wadkar Sameer
- Halford Max
- Chiky Raja
- El Jaouhari Saad
- Shakman Katherine B
- Al Fehaily Joudi
- Le Deit Fabrice
- Tran Vinh-Thuy
- Guerizec Eric
, 2025. Learning and serving from evolving streaming data to real-time inference in production is a challenging problem. Traditionally, data is partitioned and processed in batches to train machine learning models. In dynamic environments, models' performance drops over time (model degradation), requiring new models to be trained and deployed in their place. This paper deals with the MLOps aspects of deploying online and continual learning models addressing the requirements in the production of real-time applications. We have demonstrated that Online Learning methods can be scaled horizontally in production to meet the high-velocity streaming feature pipeline. The design is based on open platforms and the paper demonstrates an MLOps strategy to execute Online Learning and Predictions, perform Online Learning on a stream and deploy an online learning model version without stream interruption. The approach is suitable for highly regulated industries like banking which also have high throughput requirements. Experiments on high-dimensional and feature-evolving data streams (Malicious URL detection) demonstrate the effectiveness and efficiency of online learning models in terms of time, space and F1-score. Finally, we provide some best practices for using architectural design to deploy these dynamic models on a stream and perform Online Learning and deploy them without stopping the streaming pipeline using open-source technology such as Kafka, Flink, MLflow and river. (10.1109/ICDE55515.2023.00272)
DOI: 10.1109/ICDE55515.2023.00272
Probabilistic Modeling and Deep Learning for Hyperspectral Unmixing
- Hadjeres Rassim
, 2025. Hyperspectral unmixing is a crucial tool for analyzing and extracting features from hyperspectral remote sensing data. It aims to estimate, for each pixel, the spectral signatures of pure materials, known as endmembers, along with their associated fractional abundances. Over the past decade, it has been a highly active research area, with more than a hundred publications annually since 2011. This reflects the growing interest in developing advanced methods for accurately decomposing mixed spectral signatures. In line with the broader trends in image processing and computer vision, where deep learning and neural networks have become mainstream, many recent unmixing approaches have embraced these techniques. This document presents two contributions to the hyperspectral unmixing problem from a deep learning perspective. In our first contribution, we place ourselves under the linear mixing model, and we address the challenge of limited labeled training data for machine and deep learning-based unmixing algorithms. Although supervised learning is among the most effective training frameworks for computer vision tasks, its application in hyperspectral unmixing has been limited due to the scarcity of labeled hyperspectral datasets. To overcome this, we propose a pipeline for synthetically generating labeled unmixing data directly from the hyperspectral image to be unmixed, enabling the self-supervised training of deep neural networks. We demonstrate the effectiveness of our data generation approach by training an unrolled unmixing method, LPALM, which achieves superior unmixing results compared to other state-of-the-art conventional and deep learning-based methods that rely on unsupervised training. In our second contribution, we explore the use of variational autoencoders (VAEs) to address the unmixing problem under the assumption of spectral variability.
VAEs, through their latent representations, leverage a key property of spectral variability: the variability of endmember signatures is confined to a low-dimensional manifold within the high-dimensional spectral space. This characteristic suggests that the spectral signatures of most materials are influenced by a limited set of physico-chemical parameters. By capturing this low-dimensional structure, VAEs offer a powerful framework for modeling and addressing spectral variability in hyperspectral unmixing. Building on previous work that employed Beta distributions to model endmember variability, we introduce a novel VAE model with Beta-distributed latent representations, called Beta-latent VAE. We motivate the use of the Beta-latent VAE framework for the endmember variability estimation task through a series of experiments, and then build on this framework to propose three variations of VAE-based unmixing networks: two models operating at the pixel level and one spatial-spectral model. For the last two models, we also leverage the Dirichlet-distributed VAE framework to estimate our abundances via Dirichlet distributions, leading us to fully probabilistic VAE-based unmixing networks. We showcase the performance of our unmixing networks in comparison to other state-of-the-art variability-accounting unmixing methods on synthetic and real data and show very interesting results.
Survey on forecasting for electric vehicle charging-power demand
- Yang Wen
- Laurenty Ignacio
- Fontaine Mathieu
- d'Alché-Buc Florence
, 2025.
Investigating Raman backscattering decay and the perspective of time-multiplexed quantum communications
- Verdier Pierre-Enguerrand
- Alléaume Romain
- Rivera Thomas
Optics Express, Optical Society of America - OSA Publishing, 2025, 33 (15), pp.31029-31041. We have studied the temporal dynamics of Raman scattering caused by classical power in optical fiber and its impact on counter-propagating quantum signals. We investigated, on the entire telecom bands, the duration during which the quantum channel cannot be used in a time-division multiplexing context. Thereby, we estimated performance in terms of secure key rates within the framework of time-division multiplexing. By applying our model to the discrete variable quantum key distribution (DV-QKD) protocol BB84 in different optical communication contexts, we demonstrate the feasibility of counter-propagating time-multiplexing classical and quantum communications. Our results highlight a better preservation of the maximum communication distance for quantum channels compared to other multiplexing schemes. (10.1364/OE.561961)
DOI: 10.1364/OE.561961
Don’t Forget Your Inverse DDIM for Image Editing
- Gomez-Trenado Guillermo
- Mesejo Pablo
- Cordón Oscar
- Lathuilière Stéphane
IEEE Computational Intelligence Magazine, Institute of Electrical and Electronics Engineers, 2025, 20 (3), pp.10-18. The field of text-to-image generation has undergone significant advancements with the introduction of diffusion models. Nevertheless, the challenge of editing real images persists, as most methods are either computationally intensive or produce poor reconstructions. This paper introduces SAGE (Self-Attention Guidance for image Editing) - a novel technique leveraging pre-trained diffusion models for image editing. SAGE builds upon the DDIM algorithm and incorporates a novel guidance mechanism utilizing the self-attention layers of the diffusion U-Net. This mechanism computes a reconstruction objective based on attention maps generated during the inverse DDIM process, enabling efficient reconstruction of unedited regions without the need to precisely reconstruct the entire input image. Thus, SAGE directly addresses the key challenges in image editing. The superiority of SAGE over other methods is demonstrated through quantitative and qualitative evaluations and confirmed by a statistically validated comprehensive user study, in which all 47 surveyed users preferred SAGE over competing methods. Additionally, SAGE ranks as the top-performing method in seven out of 10 quantitative analyses and secures second and third places in the remaining three. (10.1109/MCI.2025.3563859)
DOI: 10.1109/MCI.2025.3563859

Back to years