
Publications

 

Our faculty members' publications are available on the HAL platform:

 

Doctoral theses by LTCI PhD graduates are available on the HAL platform:

 

Browse the publications in the HAL open archive by year:

2025

  • Identifying quantum resources in encoded computations
    • Davis Jack
    • Fabre Nicolas
    • Chabaud Ulysse
    2024. What is the origin of quantum computational advantage? Providing answers to this far-reaching question amounts to identifying the key properties, or quantum resources, that distinguish quantum computers from their classical counterparts, with direct applications to the development of quantum devices. The advent of universal quantum computers, however, relies on error-correcting codes to protect fragile logical quantum information by robustly encoding it into symmetric states of a quantum physical system. Such encodings make the task of resource identification more difficult, as what constitutes a resource from the logical and physical points of view can differ significantly. Here we introduce a general framework which allows us to correctly identify quantum resources in encoded computations, based on phase-space techniques. For a given quantum code, our construction provides a Wigner function that accounts for how the symmetries of the code space are contained within the transformations of the physical space, resulting in an object capable of describing the logical content of any physical state, both within and outside the code space. We illustrate our general construction with the Gottesman–Kitaev–Preskill encoding of qudits with odd dimension. The resulting Wigner function, which we call the Zak-Gross Wigner function, is shown to correctly identify quantum resources through its phase-space negativity. For instance, it is positive for encoded stabilizer states and negative for the bosonic vacuum. We further prove several properties, including that its negativity provides a measure of magic for the logical content of a state, and that its marginals are modular measurement distributions associated to conjugate Zak patches. (10.48550/arXiv.2407.18394)
    DOI : 10.48550/arXiv.2407.18394
  • Fundamental limits and practical algorithms for wireless distributed computation and estimation systems
    • Bi Yue
    2025. Distributed systems are at the core of modern computing applications, enabling collaborative task execution across interconnected components. However, their distributed nature presents major challenges in communication efficiency. This thesis addresses these challenges by analyzing the theoretical limits of information and proposing solutions to enhance the performance of distributed computing (DC) and distributed estimation systems. In DC systems, task parallelization significantly reduces execution time, but the shuffle phase remains a bottleneck, particularly in wireless networks. This thesis introduces new coding schemes to optimize the computation-communication tradeoff in such environments, leveraging interference alignment (IA) and establishing theoretical bounds. Regarding distributed estimation, where multiple nodes collaborate to estimate a common parameter, two scenarios are explored: with and without a fusion center. In the first case, a framework is proposed to optimize multi-bit quantization and minimize the Cramér-Rao bound. In the second, a synchronous algorithm with stochastic activation is developed to improve convergence while reducing data collisions. In summary, this thesis deepens the understanding of theoretical limits and proposes practical coding strategies for distributed systems, enhancing their efficiency and robustness across various environments.
  • Statistical wave field theory: Curvature term
    • Badeau Roland
    Journal of the Acoustical Society of America, Acoustical Society of America, 2025, 157 (3), pp.1650-1664. In a recent research paper, we introduced the statistical wave field theory, which establishes the statistical laws of waves propagating in a bounded volume. These laws hold after many reflections on the boundary surface and at high frequency. The statistical wave field theory is the first statistical theory of reverberation that provides the closed-form expression of the power distribution and the correlations of the wave field jointly over time, frequency and space, in terms of the geometry and the specific admittance of the boundary surface. In this paper, we refine the theory predictions, by investigating the impact of a curved boundary surface on the wave field statistics. In particular, we provide an improved closed-form expression of the reverberation time in room acoustics that holds at lower frequency. (10.1121/10.0036053)
    DOI : 10.1121/10.0036053
  • Approches hybrides en cryptographie quantique
    • Alléaume Romain
    • Nemoz Tristan
    Photoniques, EDP Sciences, 2025 (130), pp.46-48. Quantum cryptography has largely been defined as aiming for unconditional security, as an alternative to so-called classical cryptography, which relies on the conjectured computational hardness of certain mathematical problems. Rather than pitting quantum and classical cryptography against each other, hybridizing post-quantum computational approaches (PQC) with quantum cryptography (QC) opens up new prospects for practical cryptography that is more secure and offers more functionality. (10.1051/photon/202513046)
    DOI : 10.1051/photon/202513046
  • Is This You, LLM? Recognizing AI-written Programs with Multilingual Code Stylometry
    • Gurioli Andrea
    • Gabbrielli Maurizio
    • Zacchiroli Stefano
    2025. With the increasing popularity of LLM-based code completers, like GitHub Copilot, the interest in automatically detecting AI-generated code is also increasing, in particular in contexts where the use of LLMs to program is forbidden by policy due to security, intellectual property, or ethical concerns. We introduce a novel technique for AI code stylometry, i.e., the ability to distinguish code generated by LLMs from code written by humans, based on a transformer-based encoder classifier. Unlike previous work, our classifier is capable of detecting AI-written code across 10 different programming languages with a single machine learning model, maintaining high average accuracy across all languages (84.1% ± 3.8%). Together with the classifier we also release H-AIRosettaMP, a novel open dataset for AI code stylometry tasks, consisting of 121,247 code snippets in 10 popular programming languages, labeled as either human-written or AI-generated. The experimental pipeline (dataset, training code, resulting models) is the first fully reproducible one for the AI code stylometry task. Most notably our experiments rely only on open LLMs, rather than on proprietary/closed ones like ChatGPT. (10.48550/arXiv.2412.14611)
    DOI : 10.48550/arXiv.2412.14611
  • Bidding Efficiently in Simultaneous Ascending Auctions With Budget and Eligibility Constraints Using Simultaneous Move Monte Carlo Tree Search
    • Pacaud Alexandre
    • Bechler Aurelien
    • Coupechoux Marceau
    IEEE Transactions on Games, Institute of Electrical and Electronics Engineers, 2025, 17 (1), pp.210-223. For decades, simultaneous ascending auction (SAA) has been the most popular mechanism used for spectrum auctions. It has recently been employed by many countries for the allocation of 5G licences. Although SAA presents relatively simple rules, it induces a complex strategic game for which the optimal bidding strategy is unknown. Considering the fact that sometimes billions of euros are at stake in an SAA, establishing an efficient bidding strategy is crucial. In this work, we model the auction as an n-player simultaneous move game with complete information and propose the first efficient bidding algorithm that tackles simultaneously its four major strategic issues: the exposure problem, the own price effect, budget constraints, and the eligibility management problem. Our solution, called SMSα, is based on simultaneous move Monte Carlo Tree Search and relies on a new method for the prediction of closing prices. By introducing a new reward function in SMSα, we give the possibility to bidders to define their own level of risk-aversion. Through extensive numerical experiments on instances of realistic size, we show that SMSα largely outperforms state-of-the-art algorithms, notably by achieving higher expected utility while taking fewer risks. (10.1109/TG.2024.3424246)
    DOI : 10.1109/TG.2024.3424246
  • WiGNet: Windowed Vision Graph Neural Network
    • Spadaro Gabriele
    • Grangetto Marco
    • Fiandrotti Attilio
    • Tartaglione Enzo
    • Giraldo Jhony
    2024, pp.859-868. In recent years, Graph Neural Networks (GNNs) have demonstrated strong adaptability to various real-world challenges, with architectures such as Vision GNN (ViG) achieving state-of-the-art performance in several computer vision tasks. However, their practical applicability is hindered by the computational complexity of constructing the graph, which scales quadratically with the image size. In this paper, we introduce a novel Windowed vision Graph neural Network (WiGNet) model for efficient image processing. WiGNet explores a different strategy from previous works by partitioning the image into windows and constructing a graph within each window. Therefore, our model uses graph convolutions instead of the typical 2D convolution or self-attention mechanism. WiGNet effectively manages computational and memory complexity for large image sizes. We evaluate our method in the ImageNet-1k benchmark dataset and test the adaptability of WiGNet using the CelebA-HQ dataset as a downstream task with higher-resolution images. In both of these scenarios, our method achieves competitive results compared to previous vision GNNs while keeping memory and computational complexity at bay. WiGNet offers a promising solution toward the deployment of vision GNNs in real-world applications. We publicly released the code and pre-trained models at https://github.com/EIDOSLAB/WiGNet. (10.1109/WACV61041.2025.00093)
    DOI : 10.1109/WACV61041.2025.00093
  • Efficient Progressive Image Compression with Variance-Aware Masking
    • Presta Alberto
    • Tartaglione Enzo
    • Fiandrotti Attilio
    • Grangetto Marco
    • Cosman Pamela
    2025, pp.7692-7700. Learned progressive image compression is gaining momentum as it allows improved image reconstruction as more bits are decoded at the receiver. We propose a progressive image compression method in which an image is first represented as a pair of base-quality and top-quality latent representations. Next, a residual latent representation is encoded as the element-wise difference between the top and base representations. Our scheme enables progressive image compression with element-wise granularity by introducing a masking system that ranks each element of the residual latent representation from most to least important, dividing it into complementary components, which can be transmitted separately to the decoder in order to obtain different reconstruction quality. The masking system does not add further parameters or complexity. At the receiver, any elements of the top latent representation excluded from the transmitted components can be independently replaced with the mean predicted by the hyperprior architecture, ensuring reliable reconstructions at any intermediate quality level. We also introduced Rate Enhancement Modules (REMs), which refine the estimation of entropy parameters using already decoded components. We obtain results competitive with state-of-the-art competitors, while significantly reducing computational complexity, decoding time, and number of parameters. (10.1109/WACV61041.2025.00747)
    DOI : 10.1109/WACV61041.2025.00747
  • ELMGS: Enhancing Memory and Computation Scalability Through coMpression for 3D Gaussian Splatting
    • Ali Muhammad Salman
    • Bae Sung-Ho
    • Tartaglione Enzo
    2025, pp.2591-2600. 3D models have recently been popularized by the potential of end-to-end training offered first by Neural Radiance Fields and most recently by 3D Gaussian Splatting models. The latter has the big advantage of naturally providing fast training convergence and high editability. However, as the research around these is still in its infancy, there is still a gap in the literature regarding the model's scalability. In this work, we propose an approach enabling both memory and computation scalability of such models. More specifically, we propose an iterative pruning strategy that removes redundant information encoded in the model. We also enhance compressibility for the model by including a differentiable quantization and entropy coding estimator in the optimization strategy. Our results on popular benchmarks showcase the effectiveness of the proposed approach and open the road to the broad deployability of such a solution even on resource-constrained devices. (10.1109/WACV61041.2025.00257)
    DOI : 10.1109/WACV61041.2025.00257
  • Enabling Incremental SysML Model Verification: Managing Variability and Complexity Through Tagging and Model Reduction
    • Sultan Bastien
    • Apvrille Ludovic
    • Hotescu Oana
    • de Saqui-Sannes Pierre
    2025, pp.224-233. Designing complex software systems with model-based approaches encounters the recognized state space explosion problem. Typically, only a subset of models can be formally verified, forcing reliance on simulation or testing to verify the entire system. Furthermore, most formal verification tools require a complete reevaluation of properties after even minor modifications to a model. Although incremental formal verification, particularly the incremental model-checking approach of TTool, has been proposed, it still requires modelers to manually select sub-models not facing state space explosion. Unfortunately, this manual model selection is susceptible to errors. This paper presents a twofold contribution to SysML models of software product lines. First, we introduce a SysML model tagging feature that enables designers to explicitly differentiate between various subsystems, such as core and optional features. Second, we develop and implement a model reduction algorithm using dependency graphs (DGs). This algorithm automatically deactivates model elements linked to specific tags, removing both the specified elements and all their logical dependencies provided the DG is acyclic. These two contributions are evaluated for their effectiveness in generating model variants. Together, they facilitate the creation of a core model and an associated set of models, each extended by additional model elements, and make it possible to rely on incremental model-checking. We have implemented the contributions in TTool and applied them to an integrated modular avionics system. This application makes it possible to compare manual and automated model reduction strategies and assess their benefits for TTool users. (10.5220/0013182300003896)
    DOI : 10.5220/0013182300003896
  • CATALOG: A Camera Trap Language-guided Contrastive Learning Model
    • Santamaria Julian
    • Isaza Claudia
    • Giraldo Jhony
    2025, pp.1197-1206. Foundation Models (FMs) have been successful in various computer vision tasks like image classification, object detection and image segmentation. However, these tasks remain challenging when these models are tested on datasets with different distributions from the training dataset, a problem known as domain shift. This is especially problematic for recognizing animal species in camera-trap images where we have variability in factors like lighting, camouflage and occlusions. In this paper, we propose the Camera Trap Language-guided Contrastive Learning (CATALOG) model to address these issues. Our approach combines multiple FMs to extract visual and textual features from camera-trap data and uses a contrastive loss function to train the model. We evaluate CATALOG on two benchmark datasets and show that it outperforms previous state-of-the-art methods in camera-trap image recognition, especially when the training and testing data have different animal species or come from different geographical areas. Our approach demonstrates the potential of using FMs in combination with multi-modal fusion and contrastive learning for addressing domain shifts in camera-trap image recognition. The code of CATALOG is publicly available at https://github.com/Julian075/CATALOG. (10.1109/WACV61041.2025.00124)
    DOI : 10.1109/WACV61041.2025.00124
  • Till the Layers Collapse: Compressing a Deep Neural Network Through the Lenses of Batch Normalization Layers
    • Liao Zhu
    • Hezbri Nour
    • Quétu Victor
    • Nguyen Van-Tam
    • Tartaglione Enzo
    2025, 39 (18), pp.18702-18710. Today, deep neural networks are widely used since they can handle a variety of complex tasks. Their generality makes them very powerful tools in modern technology. However, deep neural networks are often overparameterized. The usage of these large models consumes a lot of computation resources. In this paper, we introduce a method called Till the Layers Collapse (TLC), which compresses deep neural networks through the lenses of batch normalization layers. By reducing the depth of these networks, our method decreases deep neural networks' computational requirements and overall latency. We validate our method on popular models such as Swin-T, MobileNet-V2, and RoBERTa, across both image classification and natural language processing (NLP) tasks. (10.1609/aaai.v39i18.34058)
    DOI : 10.1609/aaai.v39i18.34058
  • Measuring Cross-Modal Interactions in Multimodal Models
    • Wenderoth Laura
    • Hemker Konstantin
    • Simidjievski Nikola
    • Jamnik Mateja
    2025, 39 (20), pp.21501-21509. Integrating AI in healthcare can greatly improve patient care and system efficiency. However, the lack of explainability in AI systems (XAI) hinders their clinical adoption, especially in multimodal decision-making that combines various data sources. The majority of existing XAI methods focus on unimodal models, which fail to capture cross-modal interactions that are crucial for understanding the combined impact of multiple data sources. Existing methods for quantifying cross-modal interactions are limited to two modalities, rely on labelled data, and depend on model performance, which is problematic in healthcare, where XAI must handle multiple data sources and provide individualised explanations. This paper introduces InterSHAP, a cross-modal interaction score that addresses the limitations of existing approaches. InterSHAP uses the Shapley interaction index to precisely separate and quantify the contributions of the individual modalities and their interactions without approximations. By integrating an open-source implementation with the SHAP package, we enhance reproducibility and ease of use. We show that InterSHAP accurately measures the presence of cross-modal interactions, can handle multiple modalities, and provides detailed explanations at a local level for individual data points. Furthermore, we apply InterSHAP to real medical multimodal datasets, and demonstrate its practical applicability for individualised explanations. (10.1609/aaai.v39i20.35452)
    DOI : 10.1609/aaai.v39i20.35452
  • HYGENE: A Diffusion-Based Hypergraph Generation Method
    • Gailhard Dorian
    • Tartaglione Enzo
    • Naviner Lirida
    • Giraldo Jhony
    2025, 39 (16), pp.16682-16690. Hypergraphs are powerful mathematical structures that can model complex, high-order relationships in various domains, including social networks, bioinformatics, and recommender systems. However, generating realistic and diverse hypergraphs remains challenging due to their inherent complexity and lack of effective generative models. In this paper, we introduce a diffusion-based Hypergraph Generation (HYGENE) method that addresses these challenges through a progressive local expansion approach. HYGENE works on the bipartite representation of hypergraphs, starting with a single pair of connected nodes and iteratively expanding it to form the target hypergraph. At each step, nodes and hyperedges are added in a localized manner using a denoising diffusion process, which allows for the construction of the global structure before refining local details. Our experiments demonstrated the effectiveness of HYGENE, proving its ability to closely mimic a variety of properties in hypergraphs. To the best of our knowledge, this is the first attempt to employ diffusion models for hypergraph generation. (10.1609/aaai.v39i16.33833)
    DOI : 10.1609/aaai.v39i16.33833
  • Decoding Persuasiveness in Eloquence Competitions: An Investigation into the LLM’s Ability to Assess Public Speaking
    • Barkar Alisa
    • Chollet Mathieu
    • Labeau Matthieu
    • Biancardi Beatrice
    • Clavel Chloé
    2025, pp.538-546. The increasing importance of public speaking (PS) skills has fueled the development of automated assessment systems, yet the integration of large language models (LLMs) in this domain remains underexplored. This study investigates the application of LLMs for assessing PS by predicting persuasiveness. We propose a novel framework where LLMs evaluate criteria derived from educational literature and feedback from PS coaches, offering new interpretable textual features. We demonstrate that persuasiveness predictions of a regression model with the new features achieve a Root Mean Squared Error (RMSE) of 0.6, underperforming an approach with hand-crafted lexical features (RMSE of 0.51) and outperforming direct zero-shot LLM persuasiveness predictions (RMSE of 0.8). Furthermore, we find that only the LLM-evaluated criterion of language level is predictable from lexical features (F1-score of 0.56), disproving relations between these features. Based on our findings, we criticise the abilities of LLMs to analyze PS accurately. To ensure reproducibility and adaptability to emerging models, all source code and materials are publicly available on GitHub. (10.5220/0013158400003890)
    DOI : 10.5220/0013158400003890
  • Lessons for Interactive Theorem Proving Researchers from a Survey of Coq Users
    • de Almeida Borges Ana
    • Casanueva Artís Annalí
    • Falleri Jean-Rémy
    • Gallego Arias Emilio Jesús
    • Martin-Dorel Érik
    • Palmskog Karl
    • Serebrenik Alexander
    • Zimmermann Théo
    Journal of Automated Reasoning, Springer Verlag, 2025, 69 (8), pp.1-29. The Coq Community Survey 2022 was an online public survey of users of the Coq proof assistant conducted during February 2022. Broadly, the survey asked about use of Coq features, user interfaces, libraries, plugins, and tools, views on renaming Coq and Coq improvements, and also demographic data such as education and experience with Coq and other proof assistants and programming languages. The survey received 466 submitted responses, making it the largest survey of users of an interactive theorem prover (ITP) so far. We present the design of the survey, a summary of key results, and analysis of answers relevant to ITP technology development and usage. In particular, we analyze user characteristics associated with adoption of tools and libraries and make comparisons to adjacent software communities. Notably, we find that experience has significant impact on Coq user behavior, including on usage of tools, libraries, and integrated development environments (IDEs). (10.1007/s10817-025-09720-1)
    DOI : 10.1007/s10817-025-09720-1
  • A Single-Layer Efficient Metasurface Absorber for RF Energy Harvesting Applications
    • Sharifi Raziyeh
    • Lepage Anne Claire
    • Niotaki Kyriaki
    • Begaud Xavier
    2025. In this contribution, a single-layer metasurface absorber is proposed for collecting the ambient radio frequency energy at 2.45 GHz. A design methodology is proposed to optimize the finite array performance. An absorber of 4×5 cells is designed and simulated to demonstrate that this methodology enables an improvement of the capturing efficiency from 65% to 90%. The proposed finite array is fabricated and its capturing efficiency is measured in order to be compared to a simulation.
  • Impact of Sub-Segment Representations in DASH on the Live Streaming Experience
    • Ugur Deniz
    • Bouqueau Romain
    • Stattmann Michael
    • Le Feuvre Jean
    2025, pp.72-73. The demand for low-latency live streaming continues to grow, necessitating advancements in transport and packaging technologies. Media-over-QUIC (MoQ) has emerged as a promising approach for scalable, low-latency streaming, though its adoption remains in early stages. Meanwhile, the 6th Edition of MPEG-DASH introduces Sub-Segment Representations (SSR) to reduce tune-in times by leveraging dual-encoding strategies with varying Group of Pictures (GOP) lengths. This paper evaluates the feasibility of SSR in ultra-low latency streaming environments, particularly its interaction with Content Delivery Networks (CDNs) and HTTP/3 stream prioritization. Through controlled experiments, we compare the performance of Low-Latency Low-Delay (L3D) playback against standard playback under different CDN configurations. Results indicate that L3D significantly reduces preroll buffering and stabilizes playback, even at sub-second latencies. Additionally, our findings highlight the role of QUIC and HTTP/3 in optimizing live streaming performance. These insights contribute to understanding SSR's potential in improving scalability, latency, and viewer experience in live streaming workflows. (10.1145/3715675.3715807)
    DOI : 10.1145/3715675.3715807
  • Distributed Coherent Sensing Over Deployed Fibers for Network as a Sensor Applications
    • Guerrier Sterenn
    • Dorize Christian
    • Abdelli Khouloud
    • Mardoyan Haïk
    • Pavani Henrique
    • Antonelli Cristian
    • Mecozzi Antonio
    • Koubaa Amin
    • Darwish Khalid
    • Biyahi Mohammed
    • Awwad Élie
    • Renaudier Jérémie
    Journal of Lightwave Technology, Institute of Electrical and Electronics Engineers (IEEE)/Optical Society of America (OSA), 2025, 43 (4), pp.1736-1745. We discuss the performance of Coherent-MIMO-DFS over deployed optical networks in various configurations and address technological challenges such as adaptation to various fiber types and disturbance identification. Two different field trial results are exploited, demonstrating the adaptability of our distributed fiber sensing interrogator to the diverse environments that can be encountered in the context of sensing over terrestrial telecommunication networks. We discuss the advantages of extracting the full backscattered Jones matrices using a DFS interrogator. Mechanical events including threats to the infrastructure are localized and identified and different identification methods are discussed. We draw a correspondence between lab measurements and field events based on the type of environment, and finally we present a classification method based on transfer learning with 90% accuracy. (10.1109/JLT.2024.3498070)
    DOI : 10.1109/JLT.2024.3498070
  • A Fused Gromov-Wasserstein Approach to Subgraph Contrastive Learning
    • Sangare Amadou S.
    • Dunou Nicolas
    • Giraldo Jhony H.
    • Malliaros Fragkiskos D.
    Transactions on Machine Learning Research Journal, [Amherst Massachusetts]: OpenReview.net, 2022, 2025. Self-supervised learning has become a key method for training deep learning models when labeled data is scarce or unavailable. While graph machine learning holds great promise across various domains, the design of effective pretext tasks for self-supervised graph representation learning remains challenging. Contrastive learning, a popular approach in graph self-supervised learning, leverages positive and negative pairs to compute a contrastive loss function. However, current graph contrastive learning methods often struggle to fully use structural patterns and node similarities. To address these issues, we present a new method called Fused Gromov-Wasserstein Subgraph Contrastive Learning (FOSSIL). Our method integrates node-level and subgraph-level contrastive learning, seamlessly combining a standard node-level contrastive loss with the Fused Gromov-Wasserstein distance. This combination helps our method capture both node features and graph structure together. Importantly, our approach works well with both homophilic and heterophilic graphs and can dynamically create views for generating positive and negative pairs. Through extensive experiments on benchmark graph datasets, we show that FOSSIL outperforms or achieves competitive performance compared to current state-of-the-art methods.
  • Security and Robustness of Autonomous Driving Systems Against Physical Adversarial Attack
    • Chi Lijun
    2025. With iterative hardware upgrades and advancements in deep neural networks (DNNs), autonomous driving systems (ADS) are increasingly integrated into daily life. However, before this technology becomes widespread, a security issue that needs to be addressed is physical adversarial attacks. Such attacks can manipulate real-world objects to disrupt the perception of ADSs and cause traffic accidents. In addition, the diversity of physical attacks poses difficulties for passive defenders. This study addresses these challenges by analyzing, evaluating, and developing practical strategies to improve the robustness of ADS. It begins with a review of recent physical adversarial attacks that identifies specific threats to ADSs. It then introduces a novel public attention-based black-box attack that demonstrates how an attacker can exploit ADS awareness without full knowledge of the system, highlighting the need for enhanced defenses. Next, a lightweight detection framework is proposed for real-time laser-based attack detection. Additionally, a defense mechanism called Laser Shield is developed, using polarizers to block harmful laser signals and enhance ADS security.
  • Private Data Analysis over Encrypted Databases: Mixing Functional Encryption with Computational Differential Privacy
    • Alborch Escobar Ferran
    2025. In our current digitalized society, data is ruling the world. But as it is most of the time related to individuals, its exploitation should respect the privacy of the latter. This issue gave rise to the differential privacy paradigm, which makes it possible to protect individuals when querying databases containing data about them. But with the emergence of cloud computing, it is becoming increasingly necessary to also consider the confidentiality of "on-cloud" storage of such vast databases, using encryption techniques. This thesis studies how to provide both privacy and confidentiality of such outsourced databases by mixing two primitives: computational differential privacy and functional encryption. First, we study the relationship between computational differential privacy and functional encryption for randomized functions in a generic way. We analyze the privacy of the setting where a malicious analyst may access the encrypted data stored in a server, either by corrupting or breaching it, and prove that a secure randomized functional encryption scheme supporting the appropriate family of functions guarantees the computational differential privacy of the system. Second, we construct efficient randomized functional encryption schemes for certain useful families of functions, and we prove them secure in the standard model under well-known assumptions. The families of functions considered are linear functions, used for example in counting queries, histograms and linear regressions, and quadratic functions, used for example in quadratic regressions and hypothesis testing. The schemes built are then used together with the first result to construct encrypted databases for their corresponding family of queries. Finally, we implement both randomized functional encryption schemes to analyze their efficiency. This shows that our constructions are practical for databases with up to 1,000,000 entries in the case of linear queries and up to 10,000 entries in the case of quadratic queries.
  • Laser Guard: Efficiently Detecting Laser-Based Physical Adversarial Attacks in Autonomous Driving
    • Chi Lijun
    • Msahli Mounira
    IEEE Access, IEEE, 2025, 13, pp.35219-35229. The fast development of deep learning (DL) enables even resource-constrained devices to tackle complex artificial intelligence (AI) tasks, especially those related to environment perception in autonomous driving systems (ADS). However, AI models deployed in the real world are exposed to the threat of adversarial examples (AE). One specific type of physical attack uses laser beams or spots planted on images, rather than crafted pixel-level perturbations, to manipulate the prediction of the victim deep neural network (DNN). These attacks easily mislead traffic sign recognition and object detection in ADS. Laser-based adversarial attacks are cognitively stealthy but visually conspicuous, invalidating the previous defenses designed for digital attacks. This study considers two state-of-the-art (SOTA) laser-based attacks and establishes a benchmark comprising thousands of AEs. Such AEs have distinct pattern features, significant occupation, high contrast, and low variance. Based on this observation, a lightweight detection framework, Laser Guard, is proposed. Specifically, preprocessing methods are used to approximate the laser-perturbed areas, followed by a statistics-based strategy to determine abnormalities in the given samples. This framework can be applied in a plug-and-play manner with DNNs in intelligent vehicles. Extensive experimental results show that the framework can effectively filter out about 70-75% of laser-based street sign AEs, and extends well to other objects, successfully filtering out 80%. The detection latency for object AEs is marginal, with the average detection time being approximately 24 ms for laser spots and around 57 ms for laser beams. Index terms: deep learning, adversarial attacks, detection-based defense, laser-based attacks, preprocessing. (10.1109/ACCESS.2025.3540653)
    DOI : 10.1109/ACCESS.2025.3540653
  • Ambiguity and Invariance in Machine Listening
    • Perera David
    , 2025. Machine listening is a growing field with applications in security (audio surveillance), health (sound-based diagnosis), transportation (autonomous driving), manufacturing (predictive maintenance), and bioacoustics (ecosystem tracking). It addresses tasks like sound event detection, sound source localization, and speech separation. This thesis tackles two key challenges: first, the lack of training data in this field, which hinders deep neural networks, which are typically most effective with large datasets; second, the ambiguity in many tasks, where input-target relations are non-deterministic, which challenges the use of single-prediction models. To address the data shortage, we apply semi-supervised invariance-based learning, which penalizes model variations near training data and enforces invariance, enhancing data efficiency and generalization capabilities. Using sound event detection as a case study, we investigate the impact of different data augmentations, their intensity, and the layer of the neural network used for penalization. To tackle ambiguity, we use Multiple Choice Learning (MCL), a framework that trains a multi-head neural network to produce a small set of plausible and diverse predictions, using a competitive training scheme that promotes the specialization of the predictions in different regions of the prediction space. We investigate the efficiency of MCL for machine listening, addressing two key challenges. First, MCL suffers from hypothesis collapse, where some network heads stop being updated by gradient descent. We mitigate this by introducing annealing, which ensures that collapsed heads receive gradients and guides the optimization toward better solutions. Second, we extend MCL’s discrete predictions to create a non-sparse estimator of the target probability distribution. We find that usual approaches fail to converge to the true target distribution when the number of predictions grows large. We identify the cause of this issue, kernel density leakage between heads, and propose kernel truncation as a solution, proving that it guarantees the convergence of the estimators. These methods are shown to improve performance in speech separation tasks.
  • Graph-based Moving Object Segmentation for underwater videos using semi-supervised learning
    • Kapoor Meghna
    • Prummel Wieke
    • Giraldo Jhony
    • Subudhi Badri Narayan
    • Zakharova Anastasia
    • Bouwmans Thierry
    • Bansal Ankur
    Computer Vision and Image Understanding, Elsevier, 2025, 252, pp.104290. Moving object segmentation (MOS) using passive underwater image processing is an important technology for monitoring marine habitats. It aids marine biologists studying biological oceanography and the associated fields of chemical, physical, and geological oceanography to understand marine organisms. Dynamic backgrounds due to marine organisms like algae and seaweed, and improper illumination of the environment pose challenges in detecting moving objects in the scene. Previous graph-learning methods have shown promising results in MOS, but are mostly limited to terrestrial surface videos such as traffic video surveillance. Traditional object modeling fails in underwater scenes, due to fish shape and color degradation in motion and the lack of extensive underwater datasets for deep-learning models. Therefore, we propose a semi-supervised graph-learning approach (GraphMOS-U) to segment moving objects in underwater environments. Additionally, existing datasets were consolidated to form the proposed Teleost Fish Classification Dataset, specifically designed for fish classification tasks in complex environments to avoid unseen scenes, ensuring the replication of the transfer learning process on a ResNet-50 backbone. GraphMOS-U uses a six-step approach with transfer learning using Mask R-CNN and a ResNet-50 backbone for instance segmentation, followed by feature extraction using optical flow, visual saliency, and texture. After concatenating these features, a k-NN graph is constructed, and graph node classification is applied to label objects as foreground or background. The foreground nodes are used to reconstruct the segmentation map of the moving object from the scene. Quantitative and qualitative experiments demonstrate that GraphMOS-U outperforms state-of-the-art algorithms, accurately detecting moving objects while preserving fine details. The proposed method enables the use of graph-based MOS algorithms in underwater scenes. (10.1016/j.cviu.2025.104290)
    DOI : 10.1016/j.cviu.2025.104290