
Publications

 

The publications of our faculty researchers are available on the HAL platform:

 

The thesis publications of LTCI doctoral graduates are available on the HAL platform:

 

Browse the publications in the HAL open archive by year:

2026

  • A posteriori closure of turbulence models: Are symmetries preserved?
    • Freitas André
    • Um Kiwon
    • Desbrun Mathieu
    • Buzzicotti Michele
    • Biferale Luca
    European Journal of Mechanics - B/Fluids, Elsevier, 2026, 119, pp.204496. Turbulence modeling remains a longstanding challenge in fluid dynamics. Recent advances in data-driven methods have led to a surge of novel approaches aimed at addressing this problem. This work builds upon our recent work [Phys. Rev. Fluids 10, 044602 (2025)], where we introduced a new closure for a shell model of turbulence using an a posteriori (or solver-in-the-loop) approach. Unlike most deep learning-based models, our method explicitly incorporates physical equations into the neural network framework, ensuring that the closure remains constrained by the underlying physics and benefits from enhanced stability and generalizability. In this paper, we further analyze the learned closure, probing its capabilities and limitations. In particular, we examine joint probability density functions between resolved and unresolved variables, as well as the scale invariance of multipliers (ratios between adjacent shells) within the inertial range. Although our model excels at reproducing high-order statistical moments, it breaks this known symmetry near the cutoff, indicating a fundamental limitation. We discuss the implications of these findings for subgrid-scale modeling in 3D turbulence and outline directions for future research. (10.1016/j.euromechflu.2026.204496)
    DOI : 10.1016/j.euromechflu.2026.204496
  • SPOT: An Annotated French Corpus and Benchmark for Detecting Critical Interventions in Online Conversations
    • Berriche Manon
    • Nouri Célia
    • Clavel Chloé
    • Cointet Jean-Philippe
    , 2026. We introduce SPOT (Stopping Points in Online Threads), the first annotated corpus translating the sociological concept of stopping point into a reproducible NLP task. Stopping points are ordinary critical interventions that pause or redirect online discussions through a range of forms (irony, subtle doubt or fragmentary arguments) that frameworks like counterspeech or social correction often overlook. We operationalize this concept as a binary classification task and provide reliable annotation guidelines. The corpus contains 43,305 manually annotated French Facebook comments linked to URLs flagged as false information by social media users, enriched with contextual metadata (article, post, parent comment, page or group, and source). We benchmark fine-tuned encoder models (CamemBERT) and instruction-tuned LLMs under various prompting strategies. Results show that fine-tuned encoders outperform prompted LLMs in F1 score by more than 10 percentage points, confirming the importance of supervised learning for emerging non-English social media tasks. Incorporating contextual metadata further improves encoder models' F1 scores from 0.75 to 0.78. We release the anonymized dataset, along with the annotation guidelines and code in our code repository, to foster transparency and reproducible research.
  • PHYSICS-INFORMED LEARNING OF NEURAL SCATTERING FIELDS TOWARDS MEASUREMENT-FREE MESH-TO-HRTF ESTIMATION
    • Martinez Tancrède
    • Carlo Diego Di
    • Nugraha Aditya Arie
    • Fontaine Mathieu
    • Yoshii Kazuyoshi
    , 2026. This paper describes neural simulation of the scattered pressure field from a plane wave around a scattering object in both continuous 2D and 3D domains. This task has typically been treated as a regression problem that aims to train a physics-informed neural network (PINN) using pressure measurements at discrete positions. This approach, however, needs to train the whole network for each incident wave direction. To address this, we propose a measurement-free simulator based on a PINN purely driven by the Helmholtz equation with the Robin boundary condition and the Sommerfeld radiation condition, with the aid of the perfectly matched layer (PML) framework. More specifically, we design a physics-informed scattering hypernetwork (PHISK) that can generalize to incident waves from any direction via low-rank adaptation (LoRA) of a PINN trained for a specific configuration. The experiment shows that the proposed method accurately simulated sound scattering around various objects, adapting to unseen incident wave directions with minimal performance loss, and realized reasonable simulation of head-related transfer functions (HRTFs) from complex mesh data of a human head.
  • SIRUP: A DIFFUSION-BASED VIRTUAL UPMIXER OF STEERING VECTORS FOR HIGHLY-DIRECTIVE SPATIALIZATION WITH FIRST-ORDER AMBISONICS
    • Picard Emilio
    • Carlo Diego Di
    • Nugraha Aditya Arie
    • Fontaine Mathieu
    • Yoshii Kazuyoshi
    , 2026. This paper presents virtual upmixing of steering vectors captured by a spherical microphone array with fewer channels. This challenge has conventionally been addressed by recovering the directions and signals of sound sources from first-order ambisonics (FOA) data, and then rendering the higher-order ambisonics (HOA) data using a physics-based acoustic simulator. This approach, however, struggles to handle the mutual dependency between the spatial directivity of source estimation and the spatial resolution of FOA data. Our method, named SIRUP, employs a latent diffusion model architecture. Specifically, a variational autoencoder (VAE) is used to learn a compact encoding of the HOA data in a latent space, and a diffusion model is then trained to generate the HOA embeddings, conditioned on the FOA data. Experimental results showed that SIRUP achieved a significant improvement over FOA systems for steering vector upmixing, source localization, and speech denoising.
  • INSTANT: COMPRESSING GRADIENTS AND ACTIVATIONS FOR RESOURCE-EFFICIENT TRAINING
    • Doan Tuan-Kiet
    • Tran Trung-Hieu
    • Tartaglione Enzo
    • Simidjievski Nikola
    • Nguyen Van-Tam
    , 2026. Deep learning has advanced at an unprecedented pace, and this progress has led to a significant increase in model complexity. However, despite extensive research on accelerating inference, training deep models directly within a resource-constrained budget remains a considerable challenge due to high computational and memory requirements. In this paper, we introduce INSTANT (compressIng gradieNtS and acTivAtions for resource-efficieNt Training), a method designed to address both the computational and the memory bottlenecks of training. INSTANT reduces resource demands during backpropagation by projecting gradients and activations into a low-rank subspace and performing computation within that compressed representation. Experimental results demonstrate that INSTANT achieves a 15× reduction in computational cost and a 32× reduction in activation memory with negligible impact on model performance. The code is available at INSTANT. * Equal contribution. Contributions: (i) we introduce a low-cost calibration technique to generate calibrated orthonormal bases for tensor projection, enabling significant reductions in memory and computation (Sec. 3.2); (ii) we project activation tensors and gradients onto these orthonormal bases; to our knowledge, this is the first work to exploit the low-rank structure of activation gradients for all types of data distribution, and we provide an error analysis of our gradient compression, illustrating that a high compression ratio is achievable with limited performance degradation (Sec. 3.3); (iii) we evaluate INSTANT across multiple datasets and model architectures, consistently demonstrating good performance, achieving up to 32× memory savings and 15× computational cost reduction with only a 1% trade-off in accuracy compared to vanilla fine-tuning (Sec. 4).
    Related work. Activation compression is a recently emerging research direction that addresses the memory challenges of training. This approach offers several key advantages based on the following observations: (i) model weights remain uncompressed during training, thereby preserving their expressive capacity; (ii) activations are often large and exhibit significant redundancy, making them suitable for compression (Sakr & Khailany, 2024; Miles et al., 2024). Nguyen et al. (2024) apply SVD to compress activations and reduce their large memory footprint; however, this approach incurs substantial computational overhead due to the high cost of performing SVD at each training iteration. ESPACE (Sakr & Khailany, 2024) tackles the computational expense of SVD by using calibrated subspaces, periodically updated, to compress activations, enabling activation compression in the forward pass and reducing computational overhead in both the forward and backward phases. However, ESPACE is prone to error accumulation, as it relies on a single fixed subspace across varying activations. Optimizer state compression. Weight gradients are inherently low-rank (Yang et al., 2023a). Previous studies (Bernstein et al., 2018; Vogels et al., 2019) have leveraged this characteristic to address communication bottlenecks in distributed learning by reducing inter-device data transmission. GaLore (Zhao et al., 2024) and its variants (Muhamed et al., 2024; Shamshoum et al., 2025) leverage the low-rank property of weight gradients to compress them, significantly reducing memory usage in the optimizer state. CompAct (Shamshoum et al., 2025) further reduces the memory overhead.
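As a purely illustrative aside (not the authors' implementation; the shapes, the rank, and the calibration step are assumptions), the core idea of storing activations as coefficients in a calibrated orthonormal basis can be sketched in NumPy:

```python
import numpy as np

# Sketch of low-rank activation compression: build an orthonormal basis
# from calibration activations via SVD, then store each activation as
# r coefficients instead of d values, reconstructing only when the
# backward pass needs them. All shapes and the rank r are illustrative.

rng = np.random.default_rng(0)
d, r, batch = 256, 16, 8

calib = rng.standard_normal((1024, d))                    # calibration activations
U = np.linalg.svd(calib, full_matrices=False)[2][:r].T    # (d, r), orthonormal columns

acts = rng.standard_normal((batch, d))    # activations to compress
coeffs = acts @ U                         # stored representation: (batch, r)
approx = coeffs @ U.T                     # reconstruction for backprop

compression = d / r                       # memory saving factor (16x here)
rel_err = np.linalg.norm(acts - approx) / np.linalg.norm(acts)
```

For Gaussian test data the reconstruction error is large, since random activations have no low-rank structure; the premise of such methods is that real activations do, which is what makes the projection viable.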
  • NeuroSnitch: Exploiting Inter-Spike Interval Statistics for Timing Side-Channel Attacks on Noisy Neuromorphic Systems
    • Khan Mahreen
    • Mushtaq Maria
    • Apvrille Ludovic
    , 2026. Neuromorphic computing promises energy-efficient solutions for embedded and edge systems, but introduces unique security challenges and a new attack surface. This paper presents NeuroSnitch, the first timing side-channel attack to leverage subtle statistical variations in Inter-Spike Intervals (ISIs) on Spiking Neural Networks (SNNs) to extract secret information. We show that secret data, when modulating a neuron's input current, can be profiled through higher-order ISI statistics (mean, variance, skewness, and kurtosis), even under realistic noise sources, including observation noise, current fluctuation, and voltage jitter. Using the Leaky Integrate-and-Fire (LIF) neuron model, we demonstrate that a Random Forest classifier can achieve 98.41% character-level classification accuracy on noisy ISI traces, enabling complete recovery of a 33-character secret string. This work exposes a previously underexplored and robust timing leakage vector in SNNs, underscoring the urgent need for tailored security measures in this emerging computing paradigm, particularly for sensitive embedded and IoT applications. (10.1145/YYYYYYY.YYYYYYY)
    DOI : 10.1145/YYYYYYY.YYYYYYY
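To make the feature set concrete, here is a toy computation of the four ISI statistics the attack profiles; the spike trace is synthetic and the helper function is hypothetical, not taken from the paper.

```python
import random
import statistics

# Toy sketch: the four ISI moments used as side-channel features,
# computed from a (synthetic) list of spike times.

def isi_features(spike_times):
    """Mean, variance, skewness and kurtosis of inter-spike intervals."""
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    m = statistics.fmean(isis)
    var = statistics.pvariance(isis, mu=m)
    sd = var ** 0.5
    skew = sum(((x - m) / sd) ** 3 for x in isis) / len(isis)
    kurt = sum(((x - m) / sd) ** 4 for x in isis) / len(isis)
    return m, var, skew, kurt

random.seed(1)
t, times = 0.0, []
for _ in range(200):                 # ~5 ms mean firing period, Gaussian jitter
    t += random.gauss(5.0, 0.5)
    times.append(t)

mean, var, skew, kurt = isi_features(times)
```

In the attack setting, a classifier would be trained on such feature vectors extracted from traces recorded while the secret modulates the neuron's input current.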
  • Drop the mask! GAMM—A Taxonomy for Graph Attributes Missing Mechanisms
    • Serrano Richard
    • Jeudy Baptiste
    • Laclau Charlotte
    • Largeron Christine
    , 2026. Exploring missing data in attributed graphs introduces unique challenges beyond those found in tabular datasets. In this work, we extend the taxonomy for missing data mechanisms to attributed graphs by proposing GAMM (Graph Attributes Missing Mechanisms), a framework that systematically links missingness probability to both node attributes and the underlying graph structure. Our taxonomy enriches the conventional definitions of masking mechanisms by introducing graph-specific dependencies. We empirically demonstrate that state-of-the-art imputation methods, while effective on traditional masks, significantly struggle when confronted with these more realistic graph-aware missingness scenarios.
  • Scaffolded Vulnerability: Chatbot-Mediated Reciprocal Self-Disclosure and Need-Supportive Interaction in Couples
    • Jiang Zhuoqun
    • Yeo Shunyi
    • Herremans Dorien
    • Perrault Simon
    , 2026, pp.1-39. Reciprocal self-disclosure and need-supportive behavior are essential for close relationships, yet prior systems rarely engage the motivational underpinnings, autonomy, competence, and relatedness, that help partners internalize supportive behaviors. We introduce a Self-Determination Theory-guided chatbot that mediates selfdisclosure between romantic partners by scaffolding these needs through structured questions and reflection follow-ups. In a randomized study (N=72; 36 couples), we compared three conditions: Partner Support (PS: chatbot support + partner-reflection scaffolds), Direct Support (DS: chatbot support only), and Basic Prompt (BP: questions only). PS conversations were longest and most engaged; PS and DS elicited deeper disclosures and stronger relatedness support than BP. Within PS, reflection phases concentrated partnerprovided need support. Controlled motivation decreased across conditions, closeness increased only in PS, and vitality declined in BP. We contribute empirical evidence that SDT-guided mediation amplifies support and closeness, a design blueprint for relatedness technologies, and an SDT framework for advancing AI-mediated conversation design. (10.1145/3772318.3791370)
    DOI : 10.1145/3772318.3791370
  • Craft-Based Data Physicalization: Opportunities and Challenges
    • Bakhtiari Bahare
    • Daneshzand Foroozan
    • Sauvé Kim
    • Bressa Nathalie
    • Huron Samuel
    • Oehlberg Lora
    • Carpendale Sheelagh
    • Somanath Sowmya
    • Perin Charles
    , 2026. This three-hour workshop will gather data visualization and HCI researchers and practitioners to explore the possibilities of data representation using craft techniques. Participants will submit a 2-4 page document including (i) a statement of their craft experience, (ii) representative images of physicalizations they have created using this craft technique, and (iii) a discussion of opportunities and challenges for physicalizing data in their craft domain. During the workshop, participants and organizers will work in groups to brainstorm ways of representing data through their shared craft of interest. Then, every group proposes a synthesis of opportunities and challenges of the craft technique they worked with. Together, the community will chart a research agenda on how craft can expand the design space of data physicalization, inform the creation of more expressive and accessible authoring tools, and raise new questions around aesthetics, accuracy, and the role of slow making in data representation.
  • Promises, Perils, and (Timely) Heuristics for Mining Coding Agent Activity
    • Robbes Romain
    • Matricon Théo
    • Degueule Thomas
    • Hora Andre
    • Zacchiroli Stefano
    , 2026. In 2025, coding agents have seen very rapid adoption. Coding agents leverage Large Language Models (LLMs) in ways that are markedly different from LLM-based code completion, making their study critical. Moreover, unlike LLM-based completion, coding agents leave visible traces in software repositories, enabling the use of MSR techniques to study their impact on SE practices. This paper documents the promises, perils, and heuristics that we have gathered from studying coding agent activity on GitHub. (10.1145/3793302.3793375)
    DOI : 10.1145/3793302.3793375
  • AI Agents and the Future of Deliberation: Designing Human-AI Collaboration for Democratic Dialogue
    • Zhang Weiyu
    • Yeo Shunyi
    • Perrault Simon
    • Pei Jiaxin
    • Liddo Anna De
    • Veri Francesco
    • Flechtner Maurice
    • Saggion Horacio
    , 2026. As societies grapple with increasing polarization and information complexity, the need for constructive, inclusive, and well-informed deliberation has reached an unparalleled level. At the same time, AI agents, ranging from LLMs and multi-agent simulations and systems to conversational assistants and reflective companions, are rapidly reshaping how people communicate, reason together, and form collective judgments. These technologies hold the potential to scale democratic participation, foster inclusivity by bridging linguistic and cultural barriers, and introduce new forms of collaborative reasoning. Yet they also pose epistemic challenges to established notions of authenticity, legitimacy, and human autonomy in civic dialogue. This panel brings together leading researchers from Asia, Europe and North America to examine how AI technologies are transforming deliberation as both a social process and a design problem. It interrogates AI's role in shaping deliberative norms, influencing group dynamics, and redefining what it means to "reason together" in hybrid human-AI spaces. Through interactive polling, structured debates, and audience co-deliberation, the session invites CHI participants to collectively explore how we can design responsible, inclusive, and trustworthy deliberation interfaces that preserve the democratic values of deliberation while embracing the creative potential of AI.
  • Cross-Core Covert Channel for RISC-V: Implementation, Countermeasures and Cross-Platform Analysis
    • Khan Mahreen
    • Mushtaq Maria
    • Pacalet Renaud
    • Apvrille Ludovic
    , 2026. Cache-based covert channels exploit microarchitectural timing differences to enable unauthorized communication between processes. While extensively studied on x86 architectures, such channels remain underexplored in the emerging RISC-V ecosystem. This paper presents the design and implementation of a novel prefetcher and cache timing covert channel for RISC-V platforms that exploits the timing difference between cached and uncached memory accesses. Our implementation supports both standardized RISC-V cache management extensions (Zicbom and Zicbop) and vendor-specific instructions (T-Head C910 custom instructions), demonstrating cross-platform portability across heterogeneous RISC-V implementations. The sender encodes bits by selectively prefetching or flushing a shared cache line, while the receiver decodes information by measuring memory access latency. Through careful synchronization using POSIX shared memory and atomic operations, we achieve reliable bit transmission on both the RISC-V gem5 full-system simulator (SiFive U54 core) and a physical RISC-V BeagleV-Ahead (T-Head C910 core). Our paper contributes to understanding the security implications of cache and prefetcher management instructions in RISC-V systems and provides a foundation for developing detection and mitigation strategies for this emerging architecture.
  • It’s All About the Confidence: An Unsupervised Approach for Multilingual Historical Entity Linking using Large Language Models
    • Santini Cristian
    • van Erp Marieke
    • Alam Mehwish
    , 2026. Despite the recent advancements in NLP with the advent of Large Language Models (LLMs), Entity Linking (EL) for historical texts remains challenging due to linguistic variation, noisy inputs, and evolving semantic conventions. Existing solutions either require substantial training data or rely on domain-specific rules that limit scalability. In this paper, we present MHEL-LLaMo (Multilingual Historical Entity Linking with Large Language MOdels), an unsupervised ensemble approach combining a Small Language Model (SLM) and an LLM. MHEL-LLaMo leverages a multilingual bi-encoder (BELA) for candidate retrieval and an instruction-tuned LLM for NIL prediction and candidate selection via prompt chaining. Our system uses the SLM's confidence scores to discriminate between easy and hard samples, applying an LLM only for hard cases. This strategy reduces computational costs while preventing hallucinations on straightforward cases. We evaluate MHEL-LLaMo on four established benchmarks in six European languages (English, Finnish, French, German, Italian and Swedish) from the 19th and 20th centuries. Results demonstrate that MHEL-LLaMo outperforms state-of-the-art models without requiring fine-tuning, offering a scalable solution for low-resource historical EL. Our error analysis reveals that 41% of false predictions exhibit semantic proximity to ground truth entities, highlighting the LLM's accurate disambiguation of historical references.
  • 3D Imaging Contribution in Pediatric Surgical Oncology: A Multistakeholder Assessment Study
    • Pio Luca
    • Kassir Rani
    • La Barbera Giammarco
    • Lozach Cecile
    • Bonnot Enzo
    • Isla Thomas
    • Pablo de la Plata Alcalde Juan
    • Gori Pietro
    • Bloch Isabelle
    • Sarnacki Sabine
    Scientific Reports, Nature Publishing Group, 2026. Introduction: Medical imaging is crucial for surgical planning, yet surgeons struggle with the mental transformation of 2D images into 3D representations, particularly in complex pediatric pelvic anatomy. This study evaluated the perceived benefits of 3D imaging with tractography compared to conventional 2D MRI in pediatric pelvic tumor surgery.
    Methods: A nationwide study assessed three groups: non-medical personnel (n=30), medical trainees (residents and fellows; primary analysis n=61, excluding 3 medical students), and senior pediatric surgeons (n=12). Using 3-Tesla MRI with specialized protocols including high-resolution CoroT2cube and diffusion tensor imaging, participants evaluated five clinical cases in both 2D and 3D formats using 7-point Likert scales. Statistical analysis employed Wilcoxon paired tests with Bonferroni correction.
    Results: All groups showed significant improvements in perceived understanding with 3D imaging. Non-medical personnel scores increased from 4.24 (±0.69) to 6.27 (±0.28) (p<0.001), particularly in understanding of the disease and surgical objectives. Medical trainees improved from 5.08 (±0.61) to 6.42 (±0.49) (p<0.001), with enhanced understanding of surgical objectives and anatomical relationships. Senior surgeons' scores increased from 5.02 (±0.69) to 6.33 (±0.52) (p<0.001), showing significant improvements in preoperative planning and family communication. Effect sizes were substantial across groups (Cohen's d: 2.80, 1.90, and 1.52 respectively), though the within-subject design likely contributes to effect size inflation.
    Discussion: This study provides preliminary evidence for the perceived value of 3D imaging in pediatric pelvic tumor surgery. Improved anatomical comprehension among non-medical personnel may benefit informed consent, while enhanced visualization aids surgical education and planning. High surgeon acceptance (92%) suggests strong acceptability, though these exploratory findings require validation before implementation recommendations can be made. Prospective studies evaluating objective clinical outcomes, workflow integration and cost-effectiveness are still needed. (10.1038/s41598-026-44543-z)
    DOI : 10.1038/s41598-026-44543-z
  • Microarchitectural Espionage: FPGA-Based Security Analysis of Branch Prediction in RISC-V Out-of-Order Cores
    • Khan Mahreen
    • Bin Mohd Shahfie Muhammad Emir
    • Mushtaq Maria
    • Pacalet Renaud
    • Apvrille Ludovic
    , 2026. Modern processor microarchitectural optimizations, while enhancing performance, inadvertently introduce side channels that can leak sensitive information through timing variations. This paper presents an FPGA-based security testbed for studying branch predictor side-channel vulnerabilities in open-source RISC-V out-of-order cores. We demonstrate a configurable platform built on the Berkeley Out-of-Order Machine (BOOM) core, adapted for resource-constrained FPGA deployment with customizable branch predictor configurations. Through baremetal execution and cycle-accurate timing measurements, we implement and evaluate three classes of timing attacks: Conditional Branch Prediction Attacks (CBPA), Indirect Branch Prediction Attacks (IBPA), and a practical smart-lock application attack. Our results show that simplified one-level predictors exhibit deterministic timing separations of 9 to 17 cycles, enabling perfect secret recovery with 100% accuracy for 16-bit secrets within 500 measurement rounds. We further demonstrate practical attack scenarios, including the extraction of a randomly-generated 4-digit smart-lock code, and evaluate the impact of branch predictor complexity on attack feasibility. This work provides an open-source framework for reproducible microarchitectural security research on RISC-V platforms, enabling evaluation of both attacks and countermeasures.
  • Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation
    • Agarwal Manvi
    • Wang Changhong
    • Richard Gael
    , 2026. While music remains a challenging domain for generative models like Transformers, a two-pronged approach has recently proved successful: inserting musically-relevant structural information into the positional encoding (PE) module and using kernel approximation techniques based on Random Fourier Features (RFF) to lower the computational cost from quadratic to linear. Yet, it is not clear how such RFF-based efficient PEs compare with those based on rotation matrices, such as Rotary Positional Encoding (RoPE). In this paper, we present a unified framework based on kernel methods to analyze both families of efficient PEs. We use this framework to develop a novel PE method called RoPEPool, capable of extracting causal relationships from temporal sequences. Using RFF-based PEs and rotation-based PEs, we demonstrate how seemingly disparate PEs can be jointly studied by considering the interactions they induce between two descriptive levels of the data: the input, capturing quickly-varying components, and the prior, capturing slowly-varying components. For empirical validation, we use a symbolic music generation task, namely, melody harmonization. We show that RoPEPool, combined with highly-informative structural priors, outperforms all methods.
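As background for the rotation-based family of positional encodings discussed above, a minimal NumPy sketch of standard RoPE (textbook background, not the RoPEPool method proposed in the paper) illustrates the key property that dot products between rotated queries and keys depend only on the relative offset:

```python
import numpy as np

# Minimal RoPE sketch: rotate each (x1_i, x2_i) feature pair by an
# angle pos * freq_i, so that <rope(q, m), rope(k, n)> is a function
# of the offset m - n alone.

def rope(x, pos, base=10000.0):
    half = x.shape[-1] // 2
    freqs = base ** (-np.arange(half) / half)   # per-pair rotation frequencies
    ang = pos * freqs
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

rng = np.random.default_rng(0)
q, k = rng.standard_normal(8), rng.standard_normal(8)

d1 = rope(q, 5) @ rope(k, 3)      # offset of -2
d2 = rope(q, 12) @ rope(k, 10)    # same offset of -2
```

The two dot products agree to numerical precision, which is the relative-position property that makes rotation-based encodings comparable to kernel-based ones.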
  • Readability as a multi-measure construct in data visualization
    • Cabouat Anne-Flore
    • Huron Samuel
    • Isenberg Tobias
    • Isenberg Petra
    , 2026. In this paper, we argue that readability cannot be meaningfully discussed without considering multiple complementary measures, and that relying on a single measure constitutes an epistemological choice that constrains the conclusions that can be drawn.
  • 5G-EcoSim: A Simulation Framework for Estimating 5G Energy Consumption Using Real-World Data and Analytical Models
    • Ghali Meriem
    • Busson Anthony
    • Coupechoux Marceau
    , 2026. The definition and deployment of next-generation mobile networks must incorporate considerations of sustainability and environmental impact. In this context, estimating the energy consumption of a mobile network deployment across a city or region is of utmost importance. Nationwide historical aggregate values are informative but do not allow to understand the underlying dynamics or to perform prospective studies. There is hence a need for bottom-up approaches to assess the energy consumption of mobile networks. However, this is a complex task, as it depends on numerous factors, including the number and spatial distribution of base stations, the underlying technologies and their configuration, as well as user demand patterns and their geographic distribution. This paper thus introduces a simulation framework aimed at estimating the energy consumption of 5G networks at urban, regional or national scale. The framework integrates radio propagation models operating in the 3.5 GHz band with publicly available datasets describing user and base station locations, as well as network traffic volumes. As a case study, we consider the 5G deployment in France and examine the spatial distribution of network load and the resulting energy consumption. The source code and datasets are publicly available, ensuring that the simulator is fully reproducible and easily adaptable to other use cases, countries, or regions.
  • Capsule networks do not need to model everything
    • Renzulli Riccardo
    • Tartaglione Enzo
    • Grangetto Marco
    Pattern Recognition, Elsevier, 2026, 171, Part A, pp.112119 (1-11). Capsule networks are biologically inspired neural networks that group neurons into vectors called capsules, each explicitly representing an object or one of its parts. The routing mechanism connects capsules in consecutive layers, forming a hierarchical structure between parts and objects, also known as a parse tree. Capsule networks often attempt to model all elements in an image, requiring large network sizes to handle complexities such as intricate backgrounds or irrelevant objects. However, this comprehensive modeling leads to increased parameter counts and computational inefficiencies. Our goal is to enable capsule networks to focus only on the object of interest, reducing the number of parse trees. We accomplish this with REM (Routing Entropy Minimization), a technique that minimizes the entropy of the parse tree-like structure. REM drives the model parameters distribution towards low entropy configurations through a pruning mechanism, significantly reducing the generation of intra-class parse trees. This empowers capsules to learn more stable and succinct representations with fewer parameters and negligible performance loss. (10.1016/j.patcog.2025.112119)
    DOI : 10.1016/j.patcog.2025.112119
  • Exploiting Subgradient Sparsity in Max-Plus Neural Networks
    • Enaieh Ikhlas
    • Fercoq Olivier
    , 2026. Deep Neural Networks are powerful tools for solving machine learning problems, but their training often involves dense and costly parameter updates. In this work, we use a novel Max-Plus neural architecture in which classical addition and multiplication are replaced with maximum and summation operations, respectively. This is a promising architecture in terms of interpretability, but its training is challenging. A particular feature is that this algebraic structure naturally induces sparsity in the subgradients, as only neurons that contribute to the maximum affect the loss. However, standard backpropagation fails to exploit this sparsity, leading to unnecessary computations. We focus on the minimization of the worst sample loss, which transfers this sparsity to the optimization loss, and propose a sparse subgradient algorithm that explicitly exploits the algebraic sparsity. By tailoring the optimization procedure to the non-smooth nature of Max-Plus models, our method achieves more efficient updates while retaining theoretical guarantees. This highlights a principled path toward bridging algebraic structure and scalable learning.
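To illustrate the algebraic sparsity involved (a toy sketch with assumed shapes, not the paper's architecture): in a max-plus layer each output is a maximum of sums, so only the argmax input of each output neuron carries a nonzero subgradient.

```python
import numpy as np

# Toy max-plus layer: y_j = max_i (x_i + W_ij) replaces the usual
# weighted sum. Each output depends on a single winning input, so the
# subgradient w.r.t. x is an indicator on that winner.

rng = np.random.default_rng(0)
n_in, n_out = 5, 3
W = rng.standard_normal((n_in, n_out))
x = rng.standard_normal(n_in)

scores = x[:, None] + W              # (n_in, n_out) grid of x_i + W_ij
y = scores.max(axis=0)               # max-plus "matrix-vector product"
winners = scores.argmax(axis=0)      # winning input index per output

subgrad = np.zeros((n_out, n_in))    # dy_j/dx_i: 1 at the winner, else 0
subgrad[np.arange(n_out), winners] = 1.0
```

A sparse backward pass then only touches n_out entries rather than n_in * n_out, which is the kind of saving a sparsity-aware subgradient method can exploit.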
  • Decentralized Ranking Aggregation: Gossip Algorithms for Borda and Copeland Consensus
    • van Elst Anna
    • Le Caillec Kerrian
    • Colin Igor
    • Clémençon Stéphan
    , 2026. The concept of ranking aggregation plays a central role in preference analysis, and numerous algorithms for computing median rankings, often originating in social choice theory, have been documented in the literature, offering theoretical guarantees in a centralized setting, i.e., when all the ranking data to be aggregated can be brought together in a single computing unit. For many technologies (e.g., peer-to-peer networks, IoT, multi-agent systems), extending the ability to compute consensus rankings with guarantees to a decentralized setting, i.e., when preference data is initially distributed across a communicating network, remains a major methodological challenge. Indeed, in recent years, the literature on decentralized computation has mainly focused on computing or optimizing statistics such as arithmetic means using gossip algorithms. The purpose of this article is precisely to study how to achieve reliable consensus on collective rankings using classical rules (e.g., Borda, Copeland) in a decentralized setting, which raises new questions, in particular robustness to corrupted nodes and scalability through reduced communication costs. The approach proposed and analyzed here relies on random gossip communication, allowing autonomous agents to compute a global ranking consensus using only local interactions, without coordination or a central authority. We provide rigorous convergence guarantees, including explicit rate bounds, for the Borda and Copeland consensus methods. Beyond these rules, we also provide a decentralized implementation of consensus according to the median rank rule and local Kemenization. Extensive empirical evaluations on various network topologies and on real and synthetic ranking datasets demonstrate that our algorithms converge quickly and reliably to the correct ranking aggregation. This work paves the way for principled collective decision-making in fully decentralized systems.
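As a toy illustration of the Borda variant (a sketch under strong simplifications: complete communication graph, no corrupted nodes, not the paper's algorithm), each node holds the Borda score vector of its own ranking, and repeated pairwise averaging drives every node toward the network-wide mean scores, whose sort order is the Borda consensus:

```python
import random

# Toy gossip-Borda: random pairwise averaging of per-node Borda score
# vectors converges to the global mean scores, from which any node can
# read off the consensus ranking locally.

def borda_scores(ranking, n_items):
    """ranking[pos] is the item at position pos; the top item gets
    n_items - 1 points, the bottom item 0."""
    scores = [0.0] * n_items
    for pos, item in enumerate(ranking):
        scores[item] = float(n_items - 1 - pos)
    return scores

random.seed(0)
n_items, n_nodes = 4, 10
rankings = [random.sample(range(n_items), n_items) for _ in range(n_nodes)]
states = [borda_scores(r, n_items) for r in rankings]

for _ in range(2000):                          # asynchronous gossip steps
    a, b = random.sample(range(n_nodes), 2)
    avg = [(u + v) / 2 for u, v in zip(states[a], states[b])]
    states[a], states[b] = avg, avg[:]

# Every node now holds (nearly) the mean scores; node 0's local view:
consensus = sorted(range(n_items), key=lambda i: -states[0][i])
```

Pairwise averaging preserves the network-wide sum of each score coordinate, so all nodes converge to the same mean vector; robustness to corrupted nodes and communication cost are exactly where the real algorithmic work lies.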
  • The Hi-Audio Online Platform for Recording and Distributing Multi-Track Music Datasets
    • Gil Panal José M
    • David Aurélien
    • Richard Gaël
    , 2025. This paper introduces the Hi-Audio online platform, an open-source tool designed to support musicians and researchers in the field of Music Information Retrieval (MIR). The platform enables the recording, uploading, and sharing of multitrack musical compositions, aiming to build an open-access audio database to advance research in music technology. Uploaded audio files are automatically analyzed upon synchronization with the server, leveraging signal processing techniques and machine learning models to generate rich metadata. The platform facilitates remote and asynchronous collaboration via a web-based interface accessible at hiaudio.fr. Furthermore, a novel built-in method for accurate and robust round-trip latency estimation in the browser is proposed and integrated into the platform, demonstrating its applicability in real-world distributed recording scenarios. Finally, an initial user evaluation with musicians was conducted to assess usability and practical relevance under realistic usage conditions. The evaluation combined task-based performance analysis with standardized usability and workload measures. The results indicate high task completion rates for core recording functions and show that the platform can be used effectively by musicians with minimal prior training.
  • RV-Sec5: Enhancing RISC-V Security Evaluation via Targeted ISA-Level Instrumentation using gem5
    • Awais Muhammad
    • Mushtaq Maria
    • Naviner Lirida
    • Bruguier Florent
    • Haj Jawad
    , 2026, in press. The modularity of the RISC-V Instruction Set Architecture (ISA) has accelerated its adoption in security-critical domains, yet it introduces significant challenges for pre-silicon security validation. Current evaluation methods often rely on high-level emulation that overlooks microarchitectural side effects, or on post-silicon testing that identifies vulnerabilities too late in the design cycle. This paper presents RV-Sec5, a systematic framework for ISA-level security evaluation that leverages the gem5 simulator. Unlike standard simulators, RV-Sec5 introduces a methodology to map high-level security invariants, such as privilege isolation and memory protection, directly to automated, cycle-accurate instrumentation points within the ISA decoder. This approach bridges the semantic gap between abstract security policies and low-level hardware execution. We demonstrate the framework's efficacy through a case study involving unauthorized Control and Status Register (CSR) modifications, showing how RV-Sec5 detects privilege escalation attempts and monitors microarchitectural anomalies, such as TLB flushes and cache state changes, in real time. (10.1145/3793638.3793640)
    DOI : 10.1145/3793638.3793640
  • DRAGON: Robust Classification for Very Large Collections of Software Repositories
    • Balla Stefano
    • Zacchiroli Stefano
    • Degueule Thomas
    • Falleri Jean-Rémy
    • Robbes Romain
    , 2026. The ability to automatically classify source code repositories with "topics" that reflect their content and purpose is very useful, especially when navigating or searching through large software collections. However, existing approaches often rely heavily on README files and other metadata, which are frequently missing, limiting their applicability in real-world large-scale settings. We present DRAGON, a repository classifier designed for very large and diverse software collections. It operates entirely on lightweight signals commonly stored in version control systems: file and directory names, and optionally the README when available. In repository classification at scale, DRAGON improves F1@5 from 54.8% to 60.8%, surpassing the state of the art. DRAGON remains effective even when README files are absent, with performance degrading by only 6% relative to when they are present. This robustness makes it practical for real-world settings where documentation is sparse or inconsistent. Furthermore, many of the remaining classification errors are near misses, where predicted labels are semantically close to the correct topics. This property increases the practical value of the predictions in real-world software collections, where suggesting a few related topics can still guide search and discovery. As a byproduct of developing DRAGON, we also release the largest open dataset to date for repository classification, consisting of 825 thousand repositories with associated ground-truth topics, sourced from the Software Heritage archive, providing a foundation for future large-scale and language-agnostic research on software repository understanding.
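    The core idea above, classifying a repository from file and directory names alone, can be sketched generically. The toy below is our own illustration, not DRAGON: the topics, indicator-token sets, and overlap scoring are invented for the example, whereas the actual system uses a learned model.

    ```python
    # Toy topic scoring from file/directory names only (illustrative, not DRAGON).
    # Topics and their indicator tokens are made up for this example.
    TOPIC_TOKENS = {
        "web": {"index.html", "static", "templates", "css"},
        "machine-learning": {"model.py", "train.py", "dataset", "weights"},
        "systems": {"makefile", "src", "kernel", "driver"},
    }

    def classify(paths, k=2):
        """Return the top-k topics ranked by token overlap with the repo's paths."""
        tokens = set()
        for p in paths:
            tokens.update(part.lower() for part in p.split("/"))
        scores = {t: len(toks & tokens) for t, toks in TOPIC_TOKENS.items()}
        # Sort by descending score, breaking ties alphabetically for determinism.
        return sorted(scores, key=lambda t: (-scores[t], t))[:k]

    print(classify(["src/train.py", "dataset/raw.csv", "model.py"]))
    # → ['machine-learning', 'systems']
    ```

    Even this crude overlap heuristic shows why path signals alone carry topical information: no README or other metadata is consulted.
    
    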
  • Unrolled Multiplicative Updates for Nonnegative Matrix Factorization applied to Hyperspectral Unmixing
    • Kervazo Christophe
    • Cohen Jérémy E.
    , 2026. HyperSpectral Unmixing (HSU), the problem of separating mixed spectra of overlapping materials in a hyperspectral image, has motivated dedicated algorithmic developments over the last two decades. On the one hand, traditional model-based algorithms frequently guarantee interpretable results. On the other hand, deep-learning-based approaches are often faster at inference time and may obtain better empirical results. This work combines the strengths of both approaches by building on the deep unrolling paradigm. Our contribution is twofold. First, we propose two new algorithms based on deep unrolling of the well-known Multiplicative Updates. The first, coined Non-Adaptive Learned Multiplicative Updates (NALMU), adopts a simple element-wise multiplicative scheme. The second, called Recursive Adaptive Learned Multiplicative Updates (RALMU), has more flexible updates and better takes into account the spatial correlations in the abundances. Second, we relate NALMU to the minimization of an explicit cost function under some assumptions. Such guarantees are unique in the HSU field. NALMU and RALMU are tested on astrophysics and remote sensing datasets. They outperform the other deep-learning-based HSU algorithms and classical iterative schemes for the endmember estimates and obtain competitive results for the abundance estimates, even when trained in a self-supervised way. The code used in this paper will be made available upon publication.
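    For reference, the classical Multiplicative Updates that such unrolled schemes build on are the standard Lee–Seung rules for Frobenius-norm NMF. The sketch below is a plain, non-learned NumPy baseline under the assumption of the Frobenius objective; the paper's NALMU/RALMU replace these fixed iterations with learned layers, which this sketch does not attempt to reproduce.

    ```python
    import numpy as np

    def nmf_multiplicative(V, rank, n_iter=500, eps=1e-10, seed=0):
        """Lee-Seung multiplicative updates minimizing ||V - W H||_F^2.
        Nonnegativity of W and H is preserved automatically, because each
        update multiplies the current factor by a ratio of nonnegative terms."""
        rng = np.random.default_rng(seed)
        m, n = V.shape
        W = rng.random((m, rank)) + eps
        H = rng.random((rank, n)) + eps
        for _ in range(n_iter):
            H *= (W.T @ V) / (W.T @ W @ H + eps)   # update abundances/activations
            W *= (V @ H.T) / (W @ H @ H.T + eps)   # update endmembers/dictionary
        return W, H

    # On an exactly rank-2 nonnegative matrix, the residual should become small.
    rng = np.random.default_rng(1)
    V = rng.random((6, 2)) @ rng.random((2, 8))
    W, H = nmf_multiplicative(V, rank=2)
    print(float(np.linalg.norm(V - W @ H)))
    ```

    In the unmixing interpretation, the columns of W play the role of endmember spectra and H the per-pixel abundances; the multiplicative form is what makes the scheme a natural candidate for unrolling layer by layer.
    
    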