Publications

The publications of our faculty researchers are available on the HAL platform:

HAL publications

The doctoral theses of LTCI PhD graduates are available on the HAL platform:

HAL theses

Browse the publications in the HAL open archive by year:

2025

Quasi-optimal Sampling from Gibbs States via Non-commutative Optimal Transport Metrics
- Capel Ángela
- Gondolf Paul
- Kochanowski Jan
- Rouzé Cambyse
Annales Henri Poincaré, Springer Verlag, 2025, pp.1-59. We study the problem of sampling from and preparing quantum Gibbs states of local commuting Hamiltonians on hypercubic lattices of arbitrary dimension. We prove that any such Gibbs state which satisfies a clustering condition that we coin decay of matrix-valued quantum conditional mutual information (MCMI) can be quasi-optimally prepared on a quantum computer. We do this by controlling the mixing time of the corresponding Davies evolution in a normalized quantum Wasserstein distance of order one. To the best of our knowledge, this is the first time that such a non-commutative transport metric has been used in the study of quantum dynamics, and the first time quasi-rapid mixing is implied by solely an explicit clustering condition. Our result is based on a weak approximate tensorization and a weak modified logarithmic Sobolev inequality for such systems, as well as a new general weak transport cost inequality. If we furthermore assume a constraint on the local gap of the thermalizing dynamics, we obtain rapid mixing in trace distance for interactions beyond the range of two, thereby extending the state-of-the-art results that only cover the nearest neighbour case. We conclude by showing that systems that admit effective local Hamiltonians, like quantum CSS codes at high temperature, satisfy this MCMI decay and can thus be efficiently prepared and sampled from. (10.1007/s00023-025-01637-0)
DOI: 10.1007/s00023-025-01637-0
Huygens’ Metasurface for Sub-7 GHz MIMO Antenna Beamsteering
- Medrar Ghiles
- Lepage Anne Claire
- Begaud Xavier
, 2025.
Quantum Gibbs states are locally Markovian
- Chen Chi-Fang
- Rouzé Cambyse
, 2025. The Markov property entails the conditional independence structure inherent in Gibbs distributions for general classical Hamiltonians, a feature that plays a crucial role in inference, mixing time analysis, and algorithm design. However, much less is known about quantum Gibbs states. In this work, we show that for any Hamiltonian with a bounded interaction degree, the quantum Gibbs state is locally Markov at arbitrary temperature, meaning there exists a quasi-local recovery map for every local region. Notably, this recovery map is obtained by applying a detailed-balanced Lindbladian with jumps acting on the region. Consequently, we prove that (i) the conditional mutual information (CMI) for a shielded small region decays exponentially with the shielding distance, and (ii) under the assumption of uniform clustering of correlations, Gibbs states of general non-commuting Hamiltonians on $D$-dimensional lattices can be prepared by a quantum circuit of depth $e^{O(\log^D(n/\epsilon))}$, which can be further reduced assuming a certain local gap condition. Our proofs introduce a regularization scheme for imaginary-time-evolved operators at arbitrarily low temperatures and reveal a connection between the Dirichlet form, a dynamic quantity, and the commutator in the KMS inner product, a static quantity. We believe these tools pave the way for tackling further challenges in quantum thermodynamics and mixing times, particularly in low-temperature regimes.
Sample Complexity of Locally Differentially Private Quantum Hypothesis Testing
- Cheng Hao-Chung
- Hirche Christoph
- Rouzé Cambyse
, 2024, pp.2921-2926. Quantum state discrimination is an important problem in many information processing tasks. In this work we are concerned with finding the best possible sample complexity when the states are preprocessed by a quantum channel that is required to be locally differentially private. We give achievability and converse bounds that nearly match the best known classical bounds. On the way, we prove several novel inequalities between quantum divergences that should be of independent interest. (10.1109/ISIT57864.2024.10619433)
DOI: 10.1109/ISIT57864.2024.10619433
Heisenberg-limited Hamiltonian learning continuous variable systems via engineered dissipation
- Möbus Tim
- Bluhm Andreas
- Gefen Tuvia
- Tong Yu
- Werner Albert
- Rouzé Cambyse
, 2025. Discrete and continuous variables oftentimes require different treatments in many learning tasks. Identifying the Hamiltonian governing the evolution of a quantum system is a fundamental task in quantum learning theory. While previous works mostly focused on quantum spin systems, where quantum states can be seen as superpositions of discrete bit-strings, relatively little is known about Hamiltonian learning for continuous-variable quantum systems. In this work we focus on learning the Hamiltonian of a bosonic quantum system, a common type of continuous-variable quantum system. This learning task involves an infinite-dimensional Hilbert space and unbounded operators, making mathematically rigorous treatments challenging. We introduce an analytic framework to study the effects of strong dissipation in such systems, enabling a rigorous analysis of cat qubit stabilization via engineered dissipation. This framework also supports the development of Heisenberg-limited algorithms for learning general bosonic Hamiltonians with higher-order terms of the creation and annihilation operators. Notably, our scheme requires a total Hamiltonian evolution time that scales only logarithmically with the number of modes and inversely with the precision of the reconstructed coefficients. On a theoretical level, we derive a new quantitative adiabatic approximation estimate for general Lindbladian evolutions with unbounded generators. Finally, we discuss possible experimental implementations.
Modified logarithmic Sobolev inequalities for CSS codes
- Stengele Sebastian
- Capel Ángela
- Gao Li
- Lucia Angelo
- Pérez-García David
- Pérez-Hernández Antonio
- Rouzé Cambyse
- Warzel Simone
, 2025. We consider the class of Davies quantum semigroups modelling thermalization for translation-invariant Calderbank-Shor-Steane (CSS) codes in D dimensions. We prove that conditions of Dobrushin-Shlosman-type on the quantum Gibbs state imply a modified logarithmic Sobolev inequality with a constant that is uniform in the system's size. This is accomplished by generalizing parts of the classical results on thermalization by Stroock, Zegarlinski, Martinelli, and Olivieri to the CSS quantum setting. The results in particular imply the rapid thermalization at any positive temperature of the toric code in 2D and the star part of the toric code in 3D, implying a rapid loss of stored quantum information for these models.
Adaptive Learned Message Passing Algorithms for Decoding Error Correcting Codes
- Yousefi Mansoor
- Tasdighi Alireza
, 2025. Weighted belief propagation (WBP) for the decoding of linear block codes is considered. In WBP, the Tanner graph of the code is unrolled with respect to the iterations of the belief propagation decoder. Then, weights are assigned to the edges of the resulting recurrent network, and optimized offline using a training data set. In this paper, an adaptive neural decoder is proposed, where the weights of the decoder are determined for each received word. Two variants of this decoder are investigated. In the parallel weighted min-sum (WMS) decoder, the weights take values in a discrete set. A number of WMS decoders are run in parallel to search for the best sequence of weights in real time. In the two-stage decoder, a small neural network is used to determine the weights of the WMS decoder for each received word. The findings show that the adaptive neural decoders offer substantial improvements in the bit error rate compared to their static counterparts for several codes, at about the same computational complexity.
Enriching Taxonomies using Large Language Models
- Ghamlouch Zeinab
- Alam Mehwish
, 2025. Taxonomies play a vital role in structuring and categorising information across domains. However, many existing taxonomies suffer from limited coverage and outdated or ambiguous nodes, reducing their effectiveness in knowledge retrieval. To address this, we present Taxoria, a novel taxonomy enrichment pipeline that leverages Large Language Models (LLMs) to enhance a given taxonomy. Unlike approaches that extract internal LLM taxonomies, Taxoria uses an existing taxonomy as a seed and prompts an LLM to propose candidate nodes for enrichment. These candidates are then validated to mitigate hallucinations and ensure semantic relevance before integration. The final output includes an enriched taxonomy with provenance tracking and visualisation of the final merged taxonomy for analysis.
Automatic Analysis of Collaboration Through Human Conversational Data Resources: A Review
- Yu Yi
- Boritchev Maria
- Clavel Chloé
, 2025. Collaboration is a task-oriented, high-level human behavior. In most cases, conversation serves as the primary medium for information exchange and coordination, making conversational data a valuable resource for the automatic analysis of collaborative processes. In this paper, we focus on verbal aspects of collaboration and conduct a review of collaboration analysis using task-oriented conversation resources, encompassing related theories, coding schemes, tasks, and modeling approaches. We aim to address the question of how to utilize task-oriented human-human conversational data for collaboration analysis. We hope our review will serve as a practical resource and illuminate unexplored areas for future collaboration analysis.
How NixOS could have detected the XZ supply-chain attack for the benefit of all thanks to reproducible-builds
- Malka Julien
, 2025. In March 2024, a sophisticated backdoor was discovered in xz, a core compression library in Linux distributions, covertly inserted over three years by a malicious maintainer, Jia Tan. The attack, which enabled remote code execution via ssh, was only uncovered by chance when Andres Freund investigated a minor performance issue. This incident highlights the vulnerability of the open-source supply chain and the effort attackers are willing to invest in gaining trust and access. In this article, I analyze the backdoor's mechanics and explore how bitwise build reproducibility could have helped detect it.
Iffy-Or-Not: Critically Evaluating Potential Misinformation With Fallacy Detection and Socratic Questioning Using LLMs
- Lim Gionnieve
- Kim Juho
- Perrault Simon
ACM Transactions on Computer-Human Interaction, Association for Computing Machinery, 2025, 33 (1), pp.8:1-8:40. Social platforms have expanded opportunities for deliberation, with comments being used to inform one's opinion. However, using such information to form opinions is challenged by unsubstantiated or false content. To enhance the quality of opinion formation and potentially confer resistance to misinformation, we developed Iffy-Or-Not (ION), a browser extension that seeks to invoke critical thinking when reading texts. With three features guided by argumentation theory, ION highlights fallacious content, suggests diverse queries to probe it with, and offers deeper questions to consider and chat with others about. From a user study ($N=18$), we found that ION encourages users to be more attentive to the content, suggests queries that align with or are preferable to their own, and poses thought-provoking questions that expand their perspectives. However, some participants expressed aversion to ION due to misalignments with their information goals and thinking predispositions. Potential backfiring effects with ION are discussed. (10.1145/3771935)
DOI: 10.1145/3771935
Make me an Expert: Distilling from Generalist Black-Box Models into Specialized Models for Semantic Segmentation
- Benigmim Yasser
- Roy Subhankar
- Oublal Khalid
- Marouf Imad Eddine
- Essid Slim
- Kalogeiton Vicky
- Lathuilière Stéphane
, 2025. The rise of Artificial Intelligence as a Service (AIaaS) democratizes access to pre-trained models via Application Programming Interfaces (APIs), but also raises a fundamental question: how can local models be effectively trained using black-box models that do not expose their weights, training data, or logits, a constraint under which current domain adaptation paradigms are impractical? To address this challenge, we introduce the Black-Box Distillation (B2D) setting, which enables local model adaptation under realistic constraints: (1) the API model is open-vocabulary and trained on large-scale general-purpose data, and (2) access is limited to one-hot predictions only. We identify that open-vocabulary models exhibit significant sensitivity to input resolution, with different object classes being segmented optimally at different scales, a limitation termed the "curse of resolution". Our method, ATtention-Guided sCaler (ATGC), addresses this challenge by leveraging DINOv2 attention maps to dynamically select optimal scales for black-box model inference. ATGC scores the attention maps with entropy to identify informative scales for pseudo-labelling, enabling effective distillation. Experiments demonstrate substantial improvements under black-box supervision across multiple datasets while requiring only one-hot API predictions. Our code is available at https://github.com/yasserben/ATGC. (10.48550/arXiv.2509.00509)
DOI: 10.48550/arXiv.2509.00509
Secure Group Key Dissemination Protocol in Cooperative Vehicular Platooning
- Braiteh Farah-Emma
- Bassi Francesca
- Khatoun Rida
, 2025. Cooperative vehicular platoons improve road safety and reduce congestion through synchronized maneuvers enabled by Vehicle-to-Vehicle (V2V) communication. Vehicles are authenticated using certificates from the Cooperative Intelligent Transport Systems (C-ITS) Public Key Infrastructure (PKI). Short-term certificates, serving as vehicle identifiers, change over time and distance, which may lead to legitimate members being misidentified as attackers in the platoon, resulting in false positives. Additionally, sensitive data, such as platoon IDs, must be protected from impersonation attacks by external vehicles. To address these challenges, we propose a secure group key-based authentication framework that uses post-quantum cryptography and Shamir’s Secret Sharing for key exchange. This ensures accurate member authentication and protection of platoon data. Security analysis using the Scyther tool, along with simulations using the PLEXE simulator, demonstrates the protocol’s effectiveness in securing platoon operations against cyber threats.
MMA-RAG: A Survey on Multimodal Agentic Retrieval-Augmented Generation
- Perlić Vladana
- Lebailly Stéphane
- Malvone Vadim
- Nguyen Van-Tam
- Urard Pascal
, 2025. Multimodal Agentic Retrieval-Augmented Generation (MMA-RAG) marks a significant advancement in AI, empowering large language models to integrate and reason over diverse data types, including text, images, audio, and structured data. This survey provides the first comprehensive overview of the MMA-RAG paradigm, tracing its evolution from traditional text-based RAG to sophisticated multimodal and agentic frameworks. We systematically review foundational literature, analyze key architectures and dominant design patterns, and survey applications across domains such as scientific question answering, document understanding, and healthcare. We conduct a comparative analysis of system components, evaluation benchmarks, and agentic capabilities like planning and tool use, offering a holistic view of the current landscape. Our key insights highlight how multimodal integration and autonomous agents mitigate hallucinations and enhance contextual reasoning, while also surfacing persistent challenges in cross-modal alignment, evaluation, and scalability. We conclude by outlining open research directions and practical implications for next-generation AI systems.
Supervised learning methods for offline reinforcement learning
- Ghanem Abdelghani
, 2025. Offline Reinforcement Learning (RL) enables policy learning from static trajectory data without environment interaction, presenting unique challenges for effective representation learning and optimization. This thesis investigates supervised learning methods for offline RL, with a focus on sequence modeling approaches using transformer architectures. We present several key contributions that advance both theoretical understanding and empirical performance in this domain. First, we propose Multi-Objective Decision Transformers (MO-DT), which jointly optimize action, state, and return prediction to encourage richer attention patterns compared to single-task approaches. To address the non-smoothness of action distributions, we introduce Trust Region Decision Transformers (TRDT), which augment trajectories with action-space regions to smooth representations and improve cross-modal attention. Second, we develop the Reward-Guided Decision Translator (RGDT), an encoder-decoder architecture that recasts offline RL as sequence-to-sequence modeling, predicting next states rather than actions while directly conditioning on sequences of future returns. Our theoretical contributions include a comprehensive framework based on modified gradient flow analysis that reveals how multi-task training fundamentally shapes optimization dynamics. We prove that gradient descent implicitly encourages task disagreement by minimizing inner products between task gradients, with multi-objective training introducing first-order regularization and sequential training adding potentially harmful second-order corrections. Furthermore, we establish sample complexity bounds for offline RL sequence modeling, identifying critical transitions between small-data and large-data regimes and revealing trade-offs between context coverage breadth and sampling depth. Empirically, our methods significantly outperform vanilla Decision Transformers and match or exceed state-of-the-art baselines on D4RL locomotion benchmarks. Our theoretical predictions accurately forecast optimization trajectories and provide actionable principles for designing effective multi-task training strategies in offline RL. Together, these contributions demonstrate how principled supervised learning approaches can effectively address the challenges of learning from static trajectory data.
Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering
- Marouf Imad Eddine
- Tartaglione Enzo
- Lathuilière Stéphane
- van de Weijer Joost
, 2025, pp.1-13. Continual Learning in Visual Question Answering (VQACL) requires models to learn new visual-linguistic tasks (plasticity) while retaining knowledge from previous tasks (stability). The multimodal nature of VQACL presents unique challenges, requiring models to balance stability across visual and textual domains while maintaining plasticity to adapt to novel objects and reasoning tasks. Existing methods, predominantly designed for unimodal tasks, often struggle to balance these demands effectively. In this work, we introduce QUestion-only replay with Attention Distillation (QUAD), a novel approach for VQACL that leverages only past task questions for regularisation, eliminating the need to store visual data and addressing both memory and privacy concerns. QUAD achieves stability by introducing a question-only replay mechanism that selectively uses questions from previous tasks to prevent overfitting to the current task's answer space, thereby mitigating the out-of-answer-set problem. Complementing this, we propose attention consistency distillation, which uniquely enforces both intra-modal and inter-modal attention consistency across tasks, preserving essential visual-linguistic associations. Extensive experiments on VQAv2 and NExT-QA demonstrate that QUAD significantly outperforms state-of-the-art methods, achieving robust performance in continual VQA.
Bayesian Methods for Blind Beamforming in MIMO systems using Reflective Intelligent Surfaces
- Chêne Thomas
, 2025. To address the ever-increasing data rates in fifth-generation (5G) networks, multiple solutions are envisioned. Densifying the network by adding Base Stations (BS) that each have a smaller coverage area allows the spectrum to be reused over a geographic area. To reduce the high energy consumption and hardware cost of conventional BS, the Centralized Radio Access Network (CRAN) architecture is envisioned to share and offload computational power and resources to a Central Processor (CP). The link connecting the Remote Radio Head (RRH) to the CP (referred to as the fronthaul link) is likely to be overwhelmed by data traffic since it has limited capacity. This limits the performance of envisioned CRAN systems, and since the computational power is offloaded to the CP, RRHs have to implement a low-complexity compression protocol before sending their received data to the CP. Since millimeter waves (mmWaves) are more susceptible to blockage and absorption, the wireless environment needs to be modified in a cost-effective way. Reflective Intelligent Surfaces (RIS) are a recent technology composed of many passive reflecting elements. They are envisaged to be deployed on the walls and ceilings of buildings to create a link between the User Equipment (UE) and the BS. The main challenge of RIS technology is to determine the optimal configuration of the reflecting elements. Channel acquisition for those systems poses formidable challenges, as the size of the channel matrix increases with the number of passive elements, and as the RIS is passive and cannot estimate the channel. The first part of this thesis aims at reducing the effects of a limited fronthaul capacity in a CRAN system. We model the limited capacities of the links between the RRHs and the CP as a bit budget allocated to each RRH. We jointly optimize a compression protocol at the RRH and a decoder at the CP by training a neural network with a bit budget constraint. The second part of this thesis aims at configuring a RIS to maximize the achievable rate between a BS and a UE, without knowledge of the channel. We model the channel as a random variable and use an adaptive protocol that receives feedback and updates its knowledge of the channel. This Bayesian method can be "accelerated" in order to reduce the number of pilots that must be sent. We first propose to efficiently "query" the channel by maximizing the amount of information each pilot provides on relevant parameters of the channel. Then we formulate the problem as the minimization of a path length between an initial state, with no knowledge of the best configuration of the RIS, and a final state in which the best configuration is found with near certainty. Finally, we propose some modifications to Bayesian Optimization to maximize the Received Signal Strength at the BS.
Mixed Criticality Mission Planning for Autonomous Robot Fleets
- Cordeiro Franco Petrone
, 2025. This thesis explores the problem of managing uncertainty in multi-robot critical systems planning. The first contribution consists of adapting Mixed-Criticality concepts from safety-critical systems to the robot planning domain. Drawing from previous work in real-time scheduling problems, this thesis reconceptualizes how robots prioritize critical tasks when resources become constrained. The approach classifies robot actions according to their objective's importance and implements multiple cost modes to handle varying environmental conditions. The second contribution is the development of a single-robot framework based on Monte-Carlo Tree-Search that demonstrates increased objective achievement in normal environments while guaranteeing critical objective execution during exceptional conditions. The third contribution is extending this solution to multi-robot systems through an approach that includes robot partitioning strategies and a robust synchronization process for online replanning, enabling robots to adapt to changing conditions in real-time. This multi-robot implementation called RESCUE tackles the additional challenge of preventing objective duplication across robots while maintaining system flexibility. This approach is also generalized to multiple levels of criticality. Finally, the contributions are evaluated through simulation by comparing them to existing Monte-Carlo Tree-Search solutions. The experimental results validate that the framework successfully balances competing priorities: maximizing objective completion during normal operation while ensuring critical task execution during environmental challenges. These contributions advance the field of adaptive planning for uncertain robotic environments with objective criticality by providing a more robust and resilient approach to resource allocation in the face of unpredictable conditions.
Asynchronous Gossip Algorithms for Rank-Based Statistical Methods
- van Elst Anna
- Colin Igor
- Clémençon Stephan
, 2025, pp.448-455. As decentralized AI and edge intelligence become increasingly prevalent, ensuring robustness and trustworthiness in such distributed settings has become a critical issue, especially in the presence of corrupted or adversarial data. Traditional decentralized algorithms are vulnerable to data contamination as they typically rely on simple statistics (e.g., means or sums), motivating the need for more robust statistics. In line with recent work on decentralized estimation of trimmed means and ranks, we develop gossip algorithms for computing a broad class of rank-based statistics, including L-statistics and rank statistics, both known for their robustness to outliers. We apply our method to perform robust distributed two-sample hypothesis testing, introducing the first gossip algorithm for Wilcoxon rank-sum tests. We provide rigorous convergence guarantees, including the first convergence rate bound for asynchronous gossip-based rank estimation. We empirically validate our theoretical results through experiments on diverse network topologies. (10.1109/FLTA67013.2025.11336445)
DOI: 10.1109/FLTA67013.2025.11336445
SMACC: Sketching Motion for Articulated Characters with Comics-based annotations
- Legrand Amandine
- Parakkat Amal Dev
- Rohmer Damien
, 2025, pp.1-13. We introduce SMACC, a sketch-based system for animating short sequences of 3D articulated characters inspired by 2D comic motion line annotations. SMACC relies on classical rules of motion depiction used in comic books, allowing the depiction of dynamism in static images while being universally understood. Building on this, SMACC introduces an algorithmic interpretation of these principles in the context of a 3D character animation, guided by three fundamental types of motion lines: trajectory, circumfixing and impact. The adaptation to rigged 3D characters relies on the automatic computation of how these motion cues spatially influence the character's skeleton, achieved through a global analysis of sketch annotations relative to the character's pose. The resulting animation is generated by encoding the kinematic clues and constraints into joint angular velocities. Finally, the proof-of-concept demonstrated by SMACC is validated through a user study, which evaluates the effectiveness and accuracy of this sketch-based approach applied to 3D character animation. (10.2312/pg.20251255)
DOI: 10.2312/pg.20251255
Secure implementation for post-quantum cryptography
- Spyropoulos Maxime
, 2025. The goal of the thesis is to improve the software security of components implementing post-quantum cryptography. More specifically, the aim is to identify and correct vulnerabilities to side-channel attacks.
Physically Informed Spatial Regularization for Sound Event Localization and Detection
- Liu Haocheng
- Di Carlo Diego
- Nugraha Aditya Arie
- Yoshii Kazuyoshi
- Richard Gaël
- Fontaine Mathieu
, 2025. Building Sound Event Localization and Detection (SELD) models that are robust to diverse acoustic environments remains one of the major challenges in multichannel signal processing, as reflections and reverberation can significantly confuse both the source direction and event detection. Introducing priors such as microphone geometry or room impulse response (RIR) into the model has proven effective in addressing this issue. Existing methods typically incorporate such priors in a deterministic way, often through data augmentation to enlarge data diversity. However, the uncertainty arising from the complex nature of audio acoustics remains largely underexplored in the SELD literature and naturally calls for stochastic modeling of acoustic priors. In this paper, we propose regularizing deep-learning-based SELD models with a physically constructed spatial covariance matrix (SCM) based on the estimated direction of arrival (DOA) and sound event detection (SED).
IS³: Generic Impulsive–Stationary Sound Separation in Acoustic Scenes using Deep Filtering
- Berger Clémentine
- Stamatiadis Paraskevas
- Badeau Roland
- Essid Slim
, 2025. We are interested in audio systems capable of performing a differentiated processing of stationary backgrounds and isolated acoustic events within an acoustic scene, whether for applying specific processing methods to each part or for focusing solely on one while ignoring the other. Such systems have applications in real-world scenarios, including robust adaptive audio rendering systems (e.g., EQ or compression), plosive attenuation in voice mixing, noise suppression or reduction, robust acoustic event classification or even bioacoustics. To this end, we introduce IS³, a neural network designed for Impulsive–Stationary Sound Separation that isolates impulsive acoustic events from the stationary background using a deep filtering approach, and that can act as a pre-processing stage for the above-mentioned tasks. To ensure optimal training, we propose a sophisticated data generation pipeline that curates and adapts existing datasets for this task. We demonstrate that a learning-based approach, built on a relatively lightweight neural architecture and trained with well-designed and varied data, is successful in this previously unaddressed task, outperforming the Harmonic–Percussive Sound Separation masking method, adapted from music signal processing research, and wavelet filtering on objective separation metrics.
Meaning Representation Frameworks and Reasoning in the Era of Large Language Models
- Sadeddine Zacchary
, 2025. Large Language Models (LLMs) are now used for a wide range of tasks, many of which require reasoning abilities. However, these abilities remain limited and lack transparency. This thesis explores how to improve the reasoning, transparency and robustness of LLMs by integrating symbolic structures. First, we conduct an analysis of the societal issues arising from the new role of LLMs in our access to knowledge. Fifteen major issues are identified, as well as current and potential mitigation strategies, drawing on both technical solutions and regulatory approaches. The thesis then focuses on Meaning Representation Frameworks (MRFs), which encode the semantics of natural language into graph structures. A comprehensive survey of MRFs is presented, introducing a new classification based on their structural properties, as well as the available resources, empirical use and research directions. This repositions MRFs as computational artifacts capable of complementing neural models in complex tasks. Building upon this, the thesis introduces VANESSA, a neuro-symbolic reasoning system that integrates MRFs with LLMs, as well as a new representation and symbolic parsing process. VANESSA uses this representation to decompose reasoning problems into three simpler subtasks: parsing, natural language inference (NLI) and formal solving. Experimental results show that VANESSA achieves performance comparable to LLMs on logical reasoning tasks, while producing outputs that are traceable and explainable, illustrating the added value of hybrid architectures. Finally, the thesis addresses the problem of step-by-step verification of reasoning chains produced by LLMs. A novel benchmark of nearly 5,000 annotated reasoning steps is presented, assessing both logical validity and factual correctness. Though LLMs are able to detect some errors, neuro-symbolic approaches such as VANESSA achieve comparable performance while providing valuable transparency. Overall, the thesis advocates a hybrid vision of language-based artificial intelligence, where LLMs and symbolic structures are not competing paradigms but complementary tools. It opens new perspectives towards AI systems that are not only powerful, but also responsible, trustworthy and interpretable, combining the flexibility of neural models with the rigor of symbolic reasoning.
Phase Diagram of Dropout for Two-Layer Neural Networks in the Mean-Field Regime
- Chizat Lénaïc
- Marion Pierre
- Yesbay Yerkin
, 2025. Dropout is a standard training technique for neural networks that consists of randomly deactivating units at each step of their gradient-based training. It is known to improve performance in many settings, including in the large-scale training of language or vision models. As a first step towards understanding the role of dropout in large neural networks, we study the large-width asymptotics of gradient descent with dropout on two-layer neural networks with the mean-field initialization scale. We obtain a rich asymptotic phase diagram that exhibits five distinct nondegenerate phases depending on the relative magnitudes of the dropout rate, the learning rate, and the width. Notably, we find that the well-studied "penalty" effect of dropout only persists in the limit with impractically small learning rates of order O(1/width). For larger learning rates, this effect disappears and in the limit, dropout is equivalent to a "random geometry" technique, where the gradients are thinned randomly after the forward and backward pass have been computed. In this asymptotic regime, the limit is described by a mean-field jump process where the neurons' update times follow independent Poisson or Bernoulli clocks (depending on whether the learning rate vanishes or not). For some of the phases, we obtain a description of the limit dynamics both in path-space and in distribution-space. The convergence proofs involve a mix of tools from mean-field particle systems and stochastic processes. Together, our results lay the groundwork for a renewed theoretical understanding of dropout in large-scale neural networks.

Back to the list of years