Publications

 

Our faculty members' publications are available on the HAL platform:

 

The doctoral theses of LTCI PhD graduates are available on the HAL platform:

 

Browse the publications in the HAL open archive by year:

2025

  • Extrapolation of quantum measurement data
    • Manos Konstantinos
    • Weilenmann Mirjam
    • Navascués Miguel
    , 2025. We consider the problem of predicting future expectation values of a collection of quantum observables, given their noisy expectation values at past times. The measured observables, the initial state of the physical system and even the Hilbert space are unknown; we nonetheless assume a promise on the energy distribution of the state. Investigating to what extent extrapolation is possible in this framework, we discover highly problematic datasets that allow full predictability at any future time $τ$, but only when past averages are known up to precision superexponential in $τ$. We also find families of "self-testing datasets", which allow practical predictability under reasonable noise levels and whose approximate realization singles out specific Hamiltonians, states and measurement operators. We identify "aha! datasets", which drastically increase the predictability of the future statistics of an unrelated measurement, as well as "fog banks": fairly simple datasets that exhibit complete unpredictability at some future time $τ$, but full predictability for a later time $τ'>τ$. Finally, we prove that the extrapolation problem is efficiently solvable up to arbitrary precision through hierarchies of semidefinite programming relaxations.
  • Correlation Self-Testing of Quantum Theory against Generalised Probabilistic Theories with Restricted Relabelling Symmetry
    • Sengupta Kuntal
    • Weilenmann Mirjam
    • Colbeck Roger
    , 2025. Correlation self-testing of quantum theory involves identifying a task or set of tasks whose optimal performance can be achieved only by theories that can realise the same set of correlations as quantum theory in every causal structure. Following this approach, previous work has ruled out various classes of generalised probabilistic theories whose joint state spaces have a certain regularity in the sense of a (discrete) rotation symmetry of the bipartite state spaces. Here we consider theories whose bipartite state spaces lack this regularity. We form them by taking the convex hull of all the local states and a finite number of non-local states. We show that a criterion of compositional consistency is needed in such theories: for a measurement effect to be valid, there must exist at least one measurement that it is part of. This goes beyond previous consistency criteria and corresponds to a strengthening of the no-restriction hypothesis. We show that quantum theory outperforms these theories in a task called the adaptive CHSH game, which shows that they can be ruled out experimentally. We further show a connection between compositional consistency and Tsirelson's bound.
  • Proximal gradient descent on the smoothed duality gap to solve saddle point problems
    • Fercoq Olivier
    , 2025. In this paper, we minimize the self-centered smoothed gap, a recently introduced optimality measure, in order to solve convex-concave saddle point problems. The self-centered smoothed gap can be computed as the sum of a convex, possibly nonsmooth function and a smooth weakly convex function. Although it is not convex, we propose an algorithm that minimizes this quantity, effectively reducing convex-concave saddle point problems to a minimization problem. Its worst-case complexity is comparable to that of the restarted and averaged primal dual hybrid gradient method, and the algorithm enjoys linear convergence in favorable cases.
  • Toward the Automatic Detection of Word Meaning Negotiation Indicators in Conversation
    • Garí Soler Aina
    • Labeau Matthieu
    • Clavel Chloé
    , 2025, pp.24580–24596. Word Meaning Negotiations (WMN) are sequences in conversation where speakers collectively discuss and shape word meaning. These exchanges can provide insight into conversational dynamics and word-related misunderstandings, but they are hard to find in corpora. In order to facilitate data collection and speed up the WMN annotation process, we introduce the task of detecting WMN indicators - utterances where a speaker signals the need to clarify or challenge word meaning. We train a wide range of models and reveal the difficulty of the task. Our models have better precision than previous regular-expression based approaches and show some generalization abilities, but have moderate recall. However, this constitutes a promising first step toward an iterative process for obtaining more data. (10.18653/v1/2025.findings-emnlp.1337)
    DOI : 10.18653/v1/2025.findings-emnlp.1337
  • "Mm, Wat?" Detecting Other-initiated Repair Requests in Dialogue
    • Ngo Anh
    • Rollet Nicolas
    • Pelachaud Catherine
    • Clavel Chloé
    , 2025. Maintaining mutual understanding is a key component in human-human conversation to avoid conversation breakdowns, in which repair, particularly Other-Initiated Repair (OIR, when one speaker signals trouble and prompts the other to resolve), plays a vital role. However, Conversational Agents (CAs) still fail to recognize user repair initiation, leading to breakdowns or disengagement. This work proposes a multimodal model to automatically detect repair initiation in Dutch dialogues by integrating linguistic and prosodic features grounded in Conversation Analysis. The results show that prosodic cues complement linguistic features and significantly improve the results of pretrained text and audio embeddings, offering insights into how different features interact. Future directions include incorporating visual cues, exploring multilingual and cross-context corpora to assess the robustness and generalizability.
  • iKnow-audio: Integrating Knowledge Graphs with Audio-Language Models
    • Olvera Michel
    • Wang Changhong
    • Stamatiadis Paraskevas
    • Richard Gaël
    • Essid Slim
    , 2025. Contrastive Language–Audio Pretraining (CLAP) models learn by aligning audio and text in a shared embedding space, enabling powerful zero-shot recognition. However, their performance is highly sensitive to prompt formulation and language nuances, and they often inherit semantic ambiguities and spurious correlations from noisy pretraining data. While prior work has explored prompt engineering, adapters, and prefix tuning to address these limitations, the use of structured prior knowledge remains largely unexplored. We present iKnow-audio, a framework that integrates knowledge graphs with audio-language models to provide robust semantic grounding. iKnow-audio builds on the Audio-centric Knowledge Graph (AKG), which encodes ontological relations comprising semantic, causal, and taxonomic connections reflective of everyday sound scenes and events. By training knowledge graph embedding models on the AKG and refining CLAP predictions through this structured knowledge, iKnow-audio improves disambiguation of acoustically similar sounds and reduces reliance on prompt engineering. Comprehensive zero-shot evaluations across six benchmark datasets demonstrate consistent gains over baseline CLAP, supported by embedding-space analyses that highlight improved relational grounding. Resources are publicly available at https://github.com/michelolzam/iknow-audio.
  • Formalizing an Iterated Morphological Erosion for the Discovery of Musical Patterns and Their Variations
    • Lascabettes Paul
    • Quaetaert Nils
    • Andreatta Moreno
    • Bloch Isabelle
    , 2026, 16296, pp.501-513. The discovery of patterns in a point-set representation of music consists in identifying repeating subsets of points. In this task, musical symbolic data is modeled as a discrete set of points, usually in $\mathbb{R}^n$, where each point represents a musical note and its coordinates represent the characteristics of the note, such as its onset or its pitch value. While numerous algorithms have been developed to discover all exact repetitions and extract musically relevant patterns, recent research has turned toward the discovery of patterns that repeat with some variations. Because the morphological erosion of a pattern provides its occurrences, we propose an adaptation of this operation to also obtain its variations with respect to a given approximation. This approach not only reveals certain variations of the pattern, but also makes it possible to associate specific points with the pattern even though they were not initially present due to the constraints of strict repetition. We demonstrate that the proposed formalism satisfies certain fundamental properties for the musical pattern discovery task, such as the fact that iterating erosion produces cycles of patterns and its translation values. Finally, we apply these operations to the corpus of fugues from Bach's Well-Tempered Clavier, highlighting the usefulness of the proposed approach. (10.1007/978-3-032-09544-2_36)
    DOI : 10.1007/978-3-032-09544-2_36
  • Bridging Educational Theories of Cognitive Load to Visualization Design and Evaluation
    • Cabouat Anne-Flore
    • Ciccione Lorenzo
    • Huron Samuel
    • Isenberg Tobias
    • Isenberg Petra
    , 2025, pp.33-64. We explore the validity and applicability of educational and cognitive science theoretical frameworks for designing and evaluating data visualizations. Specifically, we are interested in using well-known frameworks from other domains to learn about how the subjective readability of a visualization relates to the perceived cognitive load required to acquire knowledge from it. To that end, we conducted an online randomized study in which each participant performed learning tasks on two different data visualizations. One was presented in three successive parts, following the segmenting principle from the Cognitive Theory of Multimedia Learning, and the other was presented as a single image. Although most learners preferred the segmented style, this treatment did not significantly affect the overall mental effort they reported. Subjective measures of extraneous cognitive load, however, significantly and negatively correlated with visualizations' perceived readability measures. In other words, if a learner found a visualization more readable, they felt it required less mental effort to parse relevant information from it for learning. In addition to a qualitative analysis of learners' preferences, we also contribute an interdisciplinary perspective on cognitive processing of visualizations and a discussion of implications for designing and evaluating data visualizations beyond educational contexts. (10.1109/EduVIS69391.2025.00009)
    DOI : 10.1109/EduVIS69391.2025.00009
  • Exploring Touch Interactions for Input Visualization in Personal Informatics
    • Bhartia June
    • Bressa Nathalie
    • Huron Samuel
    , 2025. Input visualizations enable people not only to view but also to manipulate data directly through a visualization. Limited research has been done on how users expect to input data into common visualization idioms on touch-based devices in a personal informatics context. To fill this gap, we conducted a gesture elicitation study on inputting data into visualizations through five different visualization idioms. Participants proposed gestures for manipulating data points and categories, suggesting consistent patterns in gesture choice influenced by operation type, visual encoding, and perceived interaction complexity. Our analysis, using five lenses (interaction target, gesture type, GUI usage, data continuity, and number of steps), shows preferences for direct manipulation, symmetry, and GUI use for abstract tasks.
  • FLORA: Unsupervised Knowledge Graph Alignment by Fuzzy Logic
    • Peng Yiwen
    • Bonald Thomas
    • Suchanek Fabian M.
    , 2025, pp.196–215. Knowledge graph alignment is the task of matching equivalent entities (that is, instances and classes) and relations across two knowledge graphs. Most existing methods focus on pure entity-level alignment, computing the similarity of entities in some embedding space. They lack interpretable reasoning and need training data to work. In this paper, we propose FLORA, a simple yet effective method that (1) is unsupervised, i.e., does not require training data, (2) provides a holistic alignment for entities and relations iteratively, (3) is based on fuzzy logic and thus delivers interpretable results, (4) provably converges, (5) allows dangling entities, i.e., entities without a counterpart in the other KG, and (6) achieves state-of-the-art results on major benchmarks. (10.1007/978-3-032-09527-5_11)
    DOI : 10.1007/978-3-032-09527-5_11
  • TEP-ones: A simple yet effective approach for transferability estimation of pruned backbones
    • Spadaro Gabriele
    • Bragagnolo Andrea
    • Renzulli Riccardo
    • Grangetto Marco
    • Giraldo Jhony
    • Fiandrotti Attilio
    • Tartaglione Enzo
    Neurocomputing, Elsevier, 2025, 668, pp.132209:01-132209-13. In deep learning, the conventional transfer learning paradigm involves fine-tuning a model pre-trained on a complex source task to adapt it to a simpler target task, capitalizing on abundant training data. Concurrently, the paradigm of neural network pruning has emerged as a powerful strategy for enhancing model efficiency, reducing complexity, and optimizing resource utilization. This paper focuses on pruned model transferability estimation for resource-constrained scenarios, where the goal is to rank the performance of pruned pre-trained models on a downstream task without fine-tuning. To this end, from a formal analysis of the intra-class mutual information between samples belonging to the same target class, we observe that, as pruning increases, a sweet phase naturally arises, where the model benefits from better features at the encoder’s output. From this, we derive a Transferability Estimation for Pruned Backbones (TEP-ones) that eases the choice of which pruned model (without the need to train the classifier) is the best candidate for transfer learning. We publicly released the code and pre-trained pruned models at https://github.com/EIDOSLAB/TEP-ones. (10.1016/j.neucom.2025.132209)
    DOI : 10.1016/j.neucom.2025.132209
  • MR-CoCo: an Open Mixed Reality Testbed for Co-located Couple Product Configuration and Decision-Making – A Sailboat Case Study
    • Vangi Fabio
    • Medeiros Daniel
    • Dastan Mine
    • Fiorentino Michele
    IEEE Transactions on Visualization and Computer Graphics, Institute of Electrical and Electronics Engineers, 2025, 31 (11), pp.9636-9644. The literature has demonstrated the advantages of Mixed Reality (MR) for product configuration by providing a more engaging and effective end-user experience. While collaborative and remote design tools in MR have been widely explored in previous studies, a noticeable gap remains in the exploration of co-located product configuration for couples. This gap is noteworthy since in many industries, couples (e.g., friends, partners) often make purchasing decisions together in physical retail environments. In this paper, we introduce MR-CoCo, an open MR testbed designed to explore collaborative configurations by co-located couples, both in the role of customers. The testbed is developed in Unity and features: (i) a shared MR space with virtual product 3D model anchoring, (ii) shared visualization of the current configuration, (iii) a versatile UI for selecting configuration areas, (iv) hand gestures for 3D drag and drop of colors and materials from 3D catalog to the product. A case study of the personalization of a sailboat is provided as proof of concept. The user study involved 24 couples (48 participants in total), simulating a purchasing experience and the related configuration using MR-CoCo. We assessed usability through post-experience evaluations, with the System Usability Scale (SUS) and the Co-Presence Configuration Questionnaire (CCQ) to measure collaboration and decision-making. The results demonstrated a high level of usability and perceived quality of collaboration. We also explore guidelines that can be used for remote collaboration applications, enabling configuration across a wide range of industries (e.g., automotive and clothing). (10.1109/TVCG.2025.3616734)
    DOI : 10.1109/TVCG.2025.3616734
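  • UNE ANALYSE CRITIQUE DES EXIGENCES EN MATIÈRE DE DOCUMENTATION TECHNIQUE DE L'ARTICLE 11 ET DE L'ANNEXE IV DU RÈGLEMENT EUROPÉEN SUR L'INTELLIGENCE ARTIFICIELLE (IA ACT) [A critical analysis of the technical documentation requirements of Article 11 and Annex IV of the European Regulation on Artificial Intelligence (AI Act)]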
    • Maxwell Winston
    • Breidenstein Alicia
    , 2025. This contribution examines the technical documentation requirements for high-risk AI systems under Article 11 and Annex IV of the AI Act, highlighting inconsistencies and interpretive difficulties in the text.
  • Determining the Intrinsic Structure of Public Software Development History: an Exploratory Study
    • Pietri Antoine
    • Rousseau Guillaume
    • Zacchiroli Stefano
    Empirical Software Engineering, Springer Verlag, 2025, 31 (5), pp.1-51. Collaborative software development has produced a wealth of software source code artifacts (source files and directories, commits, releases, etc.) that have been studied for decades by researchers in empirical software engineering. Due to code reuse and the fork-based development model, those artifacts form a globally interconnected graph of a size comparable to the graph of the Web. Little is known yet about the network structure of this graph; such knowledge is useful to determine the best practical approaches to efficiently analyze very large subsets of it (if not all of it) in a methodologically sound manner. In this paper we determine the most salient network topology properties of the global public software development history as captured by state-of-the-art version control systems (VCS). As our corpus we use Software Heritage, one of the largest and most diverse publicly available archives of VCS data, encompassing 9 billion unique source code files and 2 billion unique commits coming from about 150 million projects or, as a graph, 19 billion nodes and 221 billion edges. We explore topology characteristics such as: degree distributions; distribution of connected component sizes; and distribution of shortest path lengths. We characterize these topology aspects for both the entire graph and relevant subgraphs. (10.1007/s10664-025-10741-y)
    DOI : 10.1007/s10664-025-10741-y
  • Partial Independence Suffices to Rule Out Real Quantum Theory Experimentally
    • Weilenmann Mirjam
    • Gisin Nicolas
    • Sekatski Pavel
    Physical Review Letters, American Physical Society, 2025, 135 (18), pp.180201. The role of complex quantities in quantum theory has been puzzling physicists since the beginnings. It is thus natural to ask whether, in order to describe our experiments, the mathematical structure of complex Hilbert spaces it is built on is really necessary. Recently, it was shown that this structure is inevitable in network scenarios with independent sources. More precisely, Real Quantum Theory cannot explain the predictions of (Complex) Quantum Theory [Renou et al., Nature 600, 2021]. Here, we revisit the independence assumption underlying this work. We show that assuming partial independence is sufficient for showing the inadequacy of Real Quantum Theory. We derive a tradeoff between source independence and the Bell value achievable in Real Quantum Theory, which also lower bounds the source correlations required to explain previous experiments by means of real quantum systems. We further show that 1 bit of entanglement is necessary and sufficient for recovering the complex quantum correlations by means of Real Quantum Theory in the scenario from [Renou et al., Nature 600, 2021]. Finally, building on [McKague et al., PRL 102, 2009], we provide a construction to simulate any complex quantum setup with m independent sources by means of Real Quantum Theory, by allowing the sources to share a m real-qubit entangled state in the first round of the experiment. (10.1103/3fv7-p8cs)
    DOI : 10.1103/3fv7-p8cs
  • A historical perspective on the Schützenberger-van Trees inequality: A posterior uncertainty principle
    • Rioul Olivier
    , 2025, 16034 (2), pp.1-10. The Bayesian Cramér-Rao Bound (BCRB) is generally attributed to Van Trees who published it in 1968. According to Stigler’s law of eponymy, no scientific discovery is named after its first discoverer. This is the case not only for the Cramér-Rao bound itself—due in particular to the French mathematicians Fréchet and Darmois—but also for the van Trees inequality: The French physician, geneticist, epidemiologist and mathematician Marcel-Paul (Marco) Schützenberger, in a paper of just fifteen lines written in 1956—more than a decade before van Trees—had not only derived the BCRB but, as a close examination of his proof shows, used a very original approach based on the Weyl-Heisenberg uncertainty principle on the square root of the posterior distribution. This work reviews and extends Schützenberger’s approach to Fisher information matrices, which opens up new perspectives.
  • Huygens’ Metasurface for Sub-7 GHz MIMO Antenna Beamsteering
    • Medrar Ghiles
    • Lepage Anne Claire
    • Begaud Xavier
    , 2025.
  • Quantum Gibbs states are locally Markovian
    • Chen Chi-Fang
    • Rouzé Cambyse
    , 2025. The Markov property entails the conditional independence structure inherent in Gibbs distributions for general classical Hamiltonians, a feature that plays a crucial role in inference, mixing time analysis, and algorithm design. However, much less is known about quantum Gibbs states. In this work, we show that for any Hamiltonian with a bounded interaction degree, the quantum Gibbs state is locally Markov at arbitrary temperature, meaning there exists a quasi-local recovery map for every local region. Notably, this recovery map is obtained by applying a detailed-balanced Lindbladian with jumps acting on the region. Consequently, we prove that (i) the conditional mutual information (CMI) for a shielded small region decays exponentially with the shielding distance, and (ii) under the assumption of uniform clustering of correlations, Gibbs states of general non-commuting Hamiltonians on $D$-dimensional lattices can be prepared by a quantum circuit of depth $e^{O(\log^D(n/ε))}$, which can be further reduced assuming certain local gap condition. Our proofs introduce a regularization scheme for imaginary-time-evolved operators at arbitrarily low temperatures and reveal a connection between the Dirichlet form, a dynamic quantity, and the commutator in the KMS inner product, a static quantity. We believe these tools pave the way for tackling further challenges in quantum thermodynamics and mixing times, particularly in low-temperature regimes.
  • Sample Complexity of Locally Differentially Private Quantum Hypothesis Testing
    • Cheng Hao-Chung
    • Hirche Christoph
    • Rouzé Cambyse
    , 2024, pp.2921-2926. Quantum state discrimination is an important problem in many information processing tasks. In this work we are concerned with finding the best possible sample complexity when the states are preprocessed by a quantum channel that is required to be locally differentially private. We give achievability and converse bounds that nearly match the best known classical bounds. On the way, we prove several novel inequalities between quantum divergences that should be of independent interest. (10.1109/ISIT57864.2024.10619433)
    DOI : 10.1109/ISIT57864.2024.10619433
  • Modified logarithmic Sobolev inequalities for CSS codes
    • Stengele Sebastian
    • Capel Ángela
    • Gao Li
    • Lucia Angelo
    • Pérez-García David
    • Pérez-Hernández Antonio
    • Rouzé Cambyse
    • Warzel Simone
    , 2025. We consider the class of Davies quantum semigroups modelling thermalization for translation-invariant Calderbank-Shor-Steane (CSS) codes in D dimensions. We prove that conditions of Dobrushin-Shlosman-type on the quantum Gibbs state imply a modified logarithmic Sobolev inequality with a constant that is uniform in the system's size. This is accomplished by generalizing parts of the classical results on thermalization by Stroock, Zegarlinski, Martinelli, and Olivieri to the CSS quantum setting. The results in particular imply the rapid thermalization at any positive temperature of the toric code in 2D and the star part of the toric code in 3D, implying a rapid loss of stored quantum information for these models.
  • Heisenberg-limited Hamiltonian learning continuous variable systems via engineered dissipation
    • Möbus Tim
    • Bluhm Andreas
    • Gefen Tuvia
    • Tong Yu
    • Werner Albert
    • Rouzé Cambyse
    , 2025. Discrete and continuous variables oftentimes require different treatments in many learning tasks. Identifying the Hamiltonian governing the evolution of a quantum system is a fundamental task in quantum learning theory. While previous works mostly focused on quantum spin systems, where quantum states can be seen as superpositions of discrete bit-strings, relatively little is known about Hamiltonian learning for continuous-variable quantum systems. In this work we focus on learning the Hamiltonian of a bosonic quantum system, a common type of continuous-variable quantum system. This learning task involves an infinite-dimensional Hilbert space and unbounded operators, making mathematically rigorous treatments challenging. We introduce an analytic framework to study the effects of strong dissipation in such systems, enabling a rigorous analysis of cat qubit stabilization via engineered dissipation. This framework also supports the development of Heisenberg-limited algorithms for learning general bosonic Hamiltonians with higher-order terms of the creation and annihilation operators. Notably, our scheme requires a total Hamiltonian evolution time that scales only logarithmically with the number of modes and inversely with the precision of the reconstructed coefficients. On a theoretical level, we derive a new quantitative adiabatic approximation estimate for general Lindbladian evolutions with unbounded generators. Finally, we discuss possible experimental implementations.
  • Adaptive Learned Message Passing Algorithms for Decoding Error Correcting Codes
    • Yousefi Mansoor
    • Tasdighi Alireza
    , 2025. The weighted belief propagation (WBP) for the decoding of the linear block codes is considered. In WBP, the Tanner graph of the code is unrolled with respect to the iterations of the belief propagation decoder. Then, weights are assigned to the edges of the resulting recurrent network, and optimized offline using a training data set. In this paper, an adaptive neural decoder is proposed, where the weights of the decoder are determined for each received word. Two variants of this decoder are investigated. In the parallel weighted min-sum (WMS) decoder, the weights take values in a discrete set. A number of WMS decoders are run in parallel to search for the best sequence of weights in real time. In the two-stage decoder, a small neural network is used to determine the weights of the WMS decoder for each received word. The findings show that the adaptive neural decoders offer substantial improvements in the bit error rate compared to their static counterparts for several codes, at about the same computational complexity.
  • Enriching Taxonomies using Large Language Models
    • Ghamlouch Zeinab
    • Alam Mehwish
    , 2025. Taxonomies play a vital role in structuring and categorising information across domains. However, many existing taxonomies suffer from limited coverage and outdated or ambiguous nodes, reducing their effectiveness in knowledge retrieval. To address this, we present Taxoria, a novel taxonomy enrichment pipeline that leverages Large Language Models (LLMs) to enhance a given taxonomy. Unlike approaches that extract internal LLM taxonomies, Taxoria uses an existing taxonomy as a seed and prompts an LLM to propose candidate nodes for enrichment. These candidates are then validated to mitigate hallucinations and ensure semantic relevance before integration. The final output includes an enriched taxonomy with provenance tracking and visualisation of the final merged taxonomy for analysis.
  • Automatic Analysis of Collaboration Through Human Conversational Data Resources: A Review
    • Yu Yi
    • Boritchev Maria
    • Clavel Chloé
    , 2025. Collaboration is a task-oriented, high-level human behavior. In most cases, conversation serves as the primary medium for information exchange and coordination, making conversational data a valuable resource for the automatic analysis of collaborative processes. In this paper, we focus on verbal aspects of collaboration and conduct a review of collaboration analysis using task-oriented conversation resources, encompassing related theories, coding schemes, tasks, and modeling approaches. We aim to address the question of how to utilize task-oriented human-human conversational data for collaboration analysis. We hope our review will serve as a practical resource and illuminate unexplored areas for future collaboration analysis.
  • How NixOS could have detected the XZ supply-chain attack for the benefit of all thanks to reproducible-builds
    • Malka Julien
    , 2025. In March 2024, a sophisticated backdoor was discovered in xz, a core compression library in Linux distributions, covertly inserted over three years by a malicious maintainer, Jia Tan. The attack, which enabled remote code execution via ssh, was only uncovered by chance when Andres Freund investigated a minor performance issue. This incident highlights the vulnerability of the open-source supply chain and the effort attackers are willing to invest in gaining trust and access. In this article, I analyze the backdoor's mechanics and explore how bitwise build reproducibility could have helped detect it.