
Publications

 

Our faculty members' publications are available on the HAL platform:

 

The doctoral theses of LTCI PhD graduates are available on the HAL platform:

 

Browse the publications in the HAL open archive by year:

2024

  • Neural Film Grain Rendering
    • Lesné Gwilherm
    • Gousseau Yann
    • Ladjal Saïd
    • Newson Alasdair
    , 2024. Film grain refers to the specific texture of film-acquired images, due to the physical nature of photographic film. Because it is a visual signature of such images, there is strong interest in the film industry in rendering these textures for digital images. Some previous works are able to closely mimic the physics of films and produce high quality results, but are computationally expensive. We propose a method based on a lightweight neural network and a texture-aware loss function, achieving realistic results with very low complexity, even for large grains and high resolutions. We evaluate our algorithm both quantitatively and qualitatively with respect to previous work.
  • Regular variation in Hilbert spaces and principal component analysis for functional extremes
    • Clémençon Stéphan
    • Huet Nathan
    • Sabourin Anne
    Stochastic Processes and their Applications, Elsevier, 2024, 174, pp.104375. Motivated by the increasing availability of data of functional nature, we develop a general probabilistic and statistical framework for extremes of regularly varying random elements X in L²[0, 1]. We place ourselves in a Peaks-Over-Threshold framework where a functional extreme is defined as an observation X whose L²-norm ‖X‖ is comparatively large. Our goal is to propose a dimension reduction framework resulting in finite-dimensional projections for such extreme observations. Our contribution is twofold. First, we investigate the notion of Regular Variation for random quantities valued in a general separable Hilbert space, for which we propose a novel concrete characterization involving solely stochastic convergence of real-valued random variables. Second, we propose a notion of functional Principal Component Analysis (PCA) accounting for the principal ‘directions’ of functional extremes. We investigate the statistical properties of the empirical covariance operator of the angular component of extreme functions, by upper-bounding the Hilbert–Schmidt norm of the estimation error for finite sample sizes. Numerical experiments with simulated and real data illustrate this work. (10.1016/j.spa.2024.104375)
    DOI : 10.1016/j.spa.2024.104375
  • Functional analysis on hypergraphs: Density and zeta functions – applications to molecular graphs and image analysis
    • Bloch Isabelle
    • Bretto Alain
    Information Sciences, Elsevier, 2024, 676, pp.120850. This paper introduces new descriptors and invariants for hypergraphs. We develop a new type of zeta functions and density functions, which are proved to have useful invariance and monotonicity properties. Links with hypergraph entropies are established as well. New matrices linked with hypergraphs are also proposed, from which a new type of Laplacian associated with hypergraphs is derived. Two applications are then suggested, molecular graphs in chemistry and image analysis, where the proposed functions and invariants are useful characterizations. (10.1016/j.ins.2024.120850)
    DOI : 10.1016/j.ins.2024.120850
  • Tractable Circuits in Database Theory
    • Amarilli Antoine
    • Capelli Florent
    SIGMOD record, ACM, 2024, 53 (2), pp.6-20. This work reviews how database theory uses tractable circuit classes from knowledge compilation. We present relevant query evaluation tasks, and notions of tractable circuits. We then show how these tractable circuits can be used to address database tasks. We first focus on Boolean provenance and its applications for aggregation tasks, in particular probabilistic query evaluation. We study these for Monadic Second Order (MSO) queries on trees, and for safe Conjunctive Queries (CQs) and Union of Conjunctive Queries (UCQs). We also study circuit representations of query answers, and their applications to enumeration tasks: both in the Boolean setting (for MSO) and the multivalued setting (for CQs and UCQs). (10.1145/3685980.3685982)
    DOI : 10.1145/3685980.3685982
  • A model-based approach for assessing the security of cyber-physical systems
    • Teixeira de Castro Hugo
    • Hussain Ahmed
    • Blanc Gregory
    • El Hachem Jamal
    • Blouin Dominique
    • Leneutre Jean
    • Papadimitratos Panos
    , 2024 (121), pp.1-10. Cyber-Physical Systems (CPSs) complexity has been continuously increasing to support new life-impacting applications, such as Internet of Things (IoT) devices or Industrial Control Systems (ICSs). These characteristics introduce new critical security challenges to both industrial practitioners and academics. This work investigates how Model-Based System Engineering (MBSE) and attack graph approaches could be leveraged to model secure Cyber-Physical System solutions and identify high-impact attacks early in the system development life cycle. To achieve this, we propose a new framework that comprises (1) an easily adoptable modeling paradigm for Cyber-Physical System representation, (2) an attack-graph-based solution for Cyber-Physical System automatic quantitative security analysis, based on the MulVAL security tool, (3) a set of Model-To-Text (MTT) transformation rules to bridge the gap between SysML and MulVAL. We illustrated the validity of our proposed framework through an autonomous ventilation system example. A Denial of Service (DoS) attack targeting an industrial communication protocol was identified and displayed as attack graphs. In future work, we intend to connect the approach to dynamic security databases for automatic countermeasure selection. (10.1145/3664476.3670470)
    DOI : 10.1145/3664476.3670470
  • Impact of Scaling Up the Sensor Sampling Frequency on the Reliability of Edge Processing Systems in Tolerating Soft Errors Caused by Neutrons
    • Minelli de Carvalho Matheus
    • Laurini Luiz Henrique
    • Atukpor Emmanuel
    • Naviner Lirida
    • Possamai Bastos Rodrigo
    IEEE Sensors Letters, IEEE, 2024, 8 (9), pp.Article Sequence Number: 7004304. In this letter, we reveal the impact of increasing the sensor sampling frequency on the reliability of a typical edge processing system operating under the effects of 14-MeV neutrons and thermal neutrons. The results of two types of accelerated radiation tests indicate that the rates of failures induced by soft errors caused by 14-MeV and thermal neutrons grow as a function of the sensor sampling frequency. The rate of failures caused by 14-MeV neutrons rose by a factor of 2.2 when the sensor sampling frequency was shifted from around 140 to 430 Hz. The results also suggest that the design and calibration of edge processing systems should consider the sensor sampling frequency as a parameter to finely trade off the computing speed of the system for improved reliability in tolerating soft errors caused by neutrons. (10.1109/LSENS.2024.3435677)
    DOI : 10.1109/LSENS.2024.3435677
  • Less is more: Summarizing Patch Tokens for efficient Multi-Label Class-Incremental Learning
    • de Min Thomas
    • Mancini Massimiliano
    • Lathuilière Stéphane
    • Roy Subhankar
    • Ricci Elisa
    , 2024. Prompt tuning has emerged as an effective rehearsal-free technique for class-incremental learning (CIL) that learns a tiny set of task-specific parameters (or prompts) to instruct a pre-trained transformer to learn on a sequence of tasks. Albeit effective, prompt tuning methods do not lend themselves well to the multi-label class-incremental learning (MLCIL) scenario (where an image contains multiple foreground classes) due to the ambiguity in selecting the correct prompt(s) corresponding to different foreground objects belonging to multiple tasks. To circumvent this issue, we propose to eliminate the prompt selection mechanism by maintaining task-specific pathways, which allow us to learn representations that do not interact with the ones from the other tasks. Since independent pathways in truly incremental scenarios will result in an explosion of computation due to the quadratically complex multi-head self-attention (MSA) operation in prompt tuning, we propose to reduce the original patch token embeddings into summarized tokens. Prompt tuning is then applied to these fewer summarized tokens to compute the final representation. Our proposed method, Multi-Label class incremental learning via summarising pAtch tokeN Embeddings (MULTI-LANE), enables learning disentangled task-specific representations in MLCIL while ensuring fast inference. We conduct experiments on common benchmarks and demonstrate that MULTI-LANE achieves a new state of the art in MLCIL. Additionally, we show that MULTI-LANE is also competitive in the CIL setting. Source code available at https://github.com/tdemin16/multi-lane
  • RF Exposure Assessment in ITS-5.9 GHz V2X Connectivity and Vehicle Wireless Technologies: A Numerical and Experimental Approach
    • Yang Yizhen
    • Masini Barbara
    • Vermeeren Günter
    • van den Akker Daniel
    • Aerts Sam
    • Verloock Leen
    • Chiaramello Emma
    • Bonato Marta
    • Wiart Joe
    • Tognola Gabriella
    • Joseph Wout
    IEEE Access, IEEE, 2024, 11, pp.1-20. As Vehicle-to-Everything (V2X) communication technologies gain prominence, ensuring human safety from radiofrequency (RF) electromagnetic fields (EMF) becomes paramount. This study critically examines human RF exposure in the context of ITS-5.9 GHz V2X connectivity, employing a combination of numerical dosimetry simulations and targeted experimental measurements. The focus extends across Road-Side Units (RSUs), On-Board Units (OBUs), and, notably, the advanced vehicular technologies within a Tesla Model S, which include Bluetooth, Long Term Evolution (LTE) modules, and millimeter-wave (mmWave) radar systems. Key findings indicate that RF exposure levels for RSUs and OBUs, as well as from Tesla’s integrated technologies, consistently remain below the International Commission on Non-Ionizing Radiation Protection (ICNIRP) exposure guidelines by a significant margin. Specifically, the maximum exposure level around RSUs was observed to be 10 times lower than the ICNIRP reference level, and Tesla’s mmWave radar exposure did not exceed 0.29 W/m², well below the threshold of 10 W/m² set for the general public. This comprehensive analysis not only corroborates the effectiveness of numerical dosimetry in accurately predicting RF exposure but also underscores the compliance of current V2X communication technologies with exposure guidelines, thereby facilitating the protective advancement of intelligent transportation systems against potential health risks. (10.1109/ACCESS.2024.3435566)
    DOI : 10.1109/ACCESS.2024.3435566
  • Simplicity Bias in Human-generated data
    • Dessalles Jean-Louis
    • Sileno Giovanni
    , 2024, 46, pp.3637-3643. Texts available on the Web have been generated by human minds. We observe that simple patterns are over-represented: abcdef is more frequent than arfbxg and 1000 appears more often than 1282. We suggest that word frequency patterns can be predicted by cognitive models based on complexity minimization. Conversely, the observation of word frequencies offers an opportunity to infer particular cognitive mechanisms involved in their generation.
  • Shielding the Connected Cars: A Dataset-Powered Defense Against DDoS
    • Wehby Ayoub
    • Khatoun Rida
    • Fadlallah Ahmad
    , 2024.
  • Sufficient Statistic and Recoverability via Quantum Fisher Information
    • Gao Li
    • Li Haojian
    • Marvian Iman
    • Rouzé Cambyse
    Communications in Mathematical Physics, Springer Verlag, 2024, 405 (8), pp.180. We prove that for a large class of quantum Fisher information, a quantum channel is sufficient for a family of quantum states, i.e., the input states can be recovered from the output by some quantum operation, if and only if the quantum Fisher information is preserved under the quantum channel. This class, for instance, includes Wigner–Yanase–Dyson skew information. On the other hand, interestingly, the SLD quantum Fisher information, as the most popular example of quantum analogs of Fisher information, does not satisfy this property. Our recoverability result is obtained by studying monotone metrics on the quantum state space, i.e. Riemannian metrics non-increasing under the action of quantum channels, a property often called data processing inequality. For two quantum states, the monotone metric gives the corresponding quantum χ² divergence. We obtain an approximate recovery result in the sense that, if the quantum χ² divergence is approximately preserved by a quantum channel, then two states can be approximately recovered by the Petz recovery map. We also obtain a universal recovery bound for the χ_{1/2} divergence. Finally, we discuss applications in the context of quantum thermodynamics and the resource theory of asymmetry. (10.1007/s00220-024-05053-z)
    DOI : 10.1007/s00220-024-05053-z
  • Robustness to spatially-correlated speckle in Plug-and-Play PolSAR despeckling
    • Ulondu-Mendes Cristiano
    • Denis Loïc
    • Deledalle Charles-Alban
    • Tupin Florence
    IEEE Transactions on Geoscience and Remote Sensing, Institute of Electrical and Electronics Engineers, 2024, 62. Synthetic Aperture Radar (SAR) provides valuable information about the Earth's surface in all-weather and day-and-night conditions. Due to the inherent presence of the speckle phenomenon, a filtering step is often required to improve the performance of downstream tasks. In this paper, we focus on dealing with the spatial correlations of speckle, which negatively affect many of the existing speckle filters. Taking advantage of the flexibility of variational methods based on the Plug-and-Play strategy, we propose to use a Gaussian denoiser trained to restore SAR scenes corrupted by colored Gaussian noise with correlation structures typical of a range of radar sensors. Our approach improves the robustness of Plug-and-Play despeckling techniques. Experiments conducted on simulated and real polarimetric SAR images show that the proposed method removes speckle efficiently in the presence of spatial correlations without introducing artifacts, with a good level of detail preservation. Our method can be readily applied, without network retraining or fine-tuning, to filter SAR images from various sensors, acquisition modes (SAR, PolSAR, InSAR, PolInSAR), and spatial resolutions, and can even benefit from co-registered multi-temporal stacks, when available. The code of the trained models is made freely available at https://gitlab.telecom-paris.fr/ring/mulog-drunet. (10.1109/TGRS.2024.3432180)
    DOI : 10.1109/TGRS.2024.3432180
  • Plug-and-Play image restoration with Stochastic deNOising REgularization
    • Renaud Marien
    • Prost Jean
    • Leclaire Arthur
    • Papadakis Nicolas
    , 2024. Plug-and-Play (PnP) algorithms are a class of iterative algorithms that address image inverse problems by combining a physical model and a deep neural network for regularization. Even if they produce impressive image restoration results, these algorithms rely on a non-standard use of a denoiser on images that are less and less noisy along the iterations, which contrasts with recent algorithms based on Diffusion Models (DM), where the denoiser is applied only on re-noised images. We propose a new PnP framework, called Stochastic deNOising REgularization (SNORE), which applies the denoiser only on images with noise of the adequate level. It is based on an explicit stochastic regularization, which leads to a stochastic gradient descent algorithm to solve ill-posed inverse problems. A convergence analysis of this algorithm and its annealing extension is provided. Experimentally, we show that SNORE is competitive with state-of-the-art methods on deblurring and inpainting tasks, both quantitatively and qualitatively.
  • Winner-takes-all learners are geometry-aware conditional density estimators
    • Letzelter Victor
    • Perera David
    • Rommel Cédric
    • Fontaine Mathieu
    • Essid Slim
    • Richard Gael
    • Pérez Patrick
    , 2024. Winner-takes-all training is a simple learning paradigm, which handles ambiguous tasks by predicting a set of plausible hypotheses. Recently, a connection was established between Winner-takes-all training and centroidal Voronoi tessellations, showing that, once trained, hypotheses should quantize optimally the shape of the conditional distribution to predict. However, the best use of these hypotheses for uncertainty quantification is still an open question. In this work, we show how to leverage the appealing geometric properties of the Winner-takes-all learners for conditional density estimation, without modifying its original training scheme. We theoretically establish the advantages of our novel estimator both in terms of quantization and density estimation, and we demonstrate its competitiveness on synthetic and real-world datasets, including audio data.
  • Statistical wave field theory
    • Badeau Roland
    Journal of the Acoustical Society of America, Acoustical Society of America, 2024, 156 (1), pp.573 - 599. In this paper, we introduce the foundations of the Statistical Wave Field Theory. This theory establishes the statistical laws of waves propagating in a closed bounded volume, that are mathematically implied by the boundary-value problem of the wave equation. These laws are derived from the Sturm-Liouville theory and the mathematical theory of dynamical billiards. They hold after many reflections on the boundary surface, and at high frequency. This is the first statistical theory of reverberation which provides the closed-form expression of the power distribution and the correlations of the wave field jointly over time, frequency, and space inside the bounded volume, in terms of the geometry and the specific admittance of its boundary surface. The Statistical Wave Field Theory may find applications in various science fields, including room acoustics, electromagnetic theory, and nuclear physics. (10.1121/10.0027914)
    DOI : 10.1121/10.0027914
  • Impairement Aware Network Planning
    • Garbhapu Venkata Virajit
    , 2024. Optical networks are the backbone of global data communication, essential for meeting the ever-growing demand for high-speed, reliable networks. This thesis addresses two key challenges: the need for higher capacity and the integration of new optical functionalities. The relentless demand for capacity has pushed the limits of current infrastructure, and while solutions like additional fibers, extended spectrum, or new flex-rate transponders are possible, they come with significant capital expenditure. We propose network-wide heuristics that model linear and nonlinear impairments and suggest per-channel power allocations to maximize network-wide SNR, thus enhancing capacity. The second challenge involves integrating new optical functionalities, which requires consideration of both network-level and physical-layer interactions. Traditional SDN tools often overlook the physical layer, so we developed an optical network simulator that incorporates physical-layer impairments into network planning. We demonstrate the integration of an example optical functionality, Quantum Key Distribution (QKD), which enhances security through quantum mechanics principles. By optimizing wavelength placement to minimize Raman noise, we propose network-wide heuristics that improve the coexistence of QKD and classical signals in the same band. Addressing these challenges underscores the importance of impairment-aware network planning, forming the core of future optical network design to meet growing demands with enhanced efficiency, capacity, and security.
  • Incremental and Formal Verification of SysML Models
    • Coudert Sophie
    • Apvrille Ludovic
    • Sultan Bastien
    • Hotescu Oana
    • de Saqui-Sannes Pierre
    SN Computer Science, Springer, 2024, 5 (6), pp.714. Agile methods are now commonly used to design critical systems. They consist in progressively making increments to a model, and subsequently checking that all previously checked properties are still satisfied. Yet, model-checking is not inherently incremental, which means that all proofs must be redone at each stage, where one would expect to redo proofs only for parts of the systems that have been impacted by the modification. This makes model evolution costly and hampers the use of agile development methods. The paper proposes to facilitate model updates (also called mutations): whenever a mutation is performed on a model, the algorithms introduced in this paper can determine which proofs remain valid and which ones must be performed again. The main idea to reduce the proof obligation is to identify new possible execution paths that need to be re-verified. Our algorithm reuses the results of proofs applied to a previous model version. The paper applies this approach to dependency graphs generated from SysML models: our generic propagation algorithm can rework mutated dependency graphs so as to deduce simpler properties to be proved on reduced dependency graphs. Our approach can handle reachability properties, and we discuss extensions to liveness properties. The embedded system of an autonomous vehicle, characterized by real-time communication constraints, exemplifies the challenges and relevance of our approach. (10.1007/s42979-024-03027-5)
    DOI : 10.1007/s42979-024-03027-5
  • Sequence Selection With Dispersion-Aware Metric for Long-Haul Transmission Systems
    • Liu Jingtian
    • Awwad Élie
    • Hafermann Hartmut
    • Jaouën Yves
    Journal of Lightwave Technology, Institute of Electrical and Electronics Engineers (IEEE)/Optical Society of America(OSA), 2024, 42 (14), pp.4818-4828. Sequence selection (SS) potentially offers a pragmatic way to unlock nonlinear shaping gains in coherent optical fiber communications beyond those offered by probabilistic constellation shaping (PCS). We introduce a novel sign-dependent metric: the energy dispersion index (EDI) of sequences that endured chromatic dispersion, denoted as D-EDI, which exhibits more accurate opposite variations with the transmission performance compared to the standard EDI metric. Then, by applying D-EDI and EDI to the SS process, we present two signaling approaches denoted as D-SS and E-SS respectively. These approaches are designed to minimize rate loss and enhance transmission performance in nonlinear optical fiber transmission systems, catering to both short-distance and long-haul scenarios. With enumerative sphere shaping (ESS) as distribution matcher (DM), our simulation results reveal significant performance gains over ESS without SS, with improvements up to 0.4 bits/4D-symbol. These improvements were observed over a 205-km single-span standard single mode fiber link in wavelength-division multiplexing (WDM) transmission, with five dual-polarization channels, each operating at a net rate of 400 Gbit/s. Furthermore, we demonstrate that D-SS surpasses ESS without SS by 0.03 bits/4D-symbol in achievable information rate over a 30×80 km link in a single-wavelength, with 8 discrete multi-band (DMB) transmission, and an 880 Gbit/s net rate. Notably, our proposed D-SS scheme achieves similar performance to a sequence selection based on a full split-step Fourier method (SSFM) simulation and it consistently delivers throughput enhancements across various block lengths and selected sequence lengths. (10.1109/JLT.2024.3385109)
    DOI : 10.1109/JLT.2024.3385109
  • YAGO 4.5: A Large and Clean Knowledge Base with a Rich Taxonomy
    • Suchanek Fabian M.
    • Alam Mehwish
    • Bonald Thomas
    • Chen Lihu
    • Paris Pierre-Henri
    • Soria Jules
    , 2024, pp.131-140. Knowledge Bases (KBs) find applications in many knowledge-intensive tasks and, most notably, in information retrieval. Wikidata is one of the largest public general-purpose KBs. Yet, its collaborative nature has led to a convoluted schema and taxonomy. The YAGO 4 KB cleaned up the taxonomy by incorporating the ontology of Schema.org, resulting in a cleaner structure amenable to automated reasoning. However, it also cut away large parts of the Wikidata taxonomy, which is essential for information retrieval. In this paper, we extend YAGO 4 with a large part of the Wikidata taxonomy, while respecting logical constraints and the distinction between classes and instances. This yields YAGO 4.5, a new, logically consistent version of YAGO that adds a rich layer of informative classes. An intrinsic and an extrinsic evaluation show the value of the new resource. (10.1145/3626772.3657876)
    DOI : 10.1145/3626772.3657876
  • Decoding Attack Behaviors by Analyzing Patterns in Instruction-Based Attacks using Gem5
    • Awais Muhammad
    • Mushtaq Maria
    • Naviner Lirida
    • Bruguier Florent
    • Haj Jawad
    • Benoit Pascal
    , 2024. The diversity of Instruction Set Architectures (ISAs), each with unique limitations and optimization strategies, presents both opportunities and challenges in processor design. Modern processor vendors leverage these ISAs to enhance security, reliability, and performance. Recent security vulnerabilities, notably Spectre and Meltdown, have underscored the importance of robust hardware security measures and had a high impact on vendors. Processor micro-architectures are susceptible to side-channel attacks, which exploit information leakage to identify vulnerabilities. Techniques such as speculative execution and branch prediction, commonly employed by processors from AMD, Intel, and ARM, while beneficial for performance optimization, inadvertently create avenues for such attacks. Additionally, the practice of out-of-order execution, designed to maximize efficiency, can be manipulated to form side channels, further compromising security. Shared memory resources, particularly cache memory, are another attack vector. By analyzing access patterns to shared caches, attackers can construct cache-based side channels, facilitating sophisticated attacks like Flush+Reload and Prime+Probe. In response to these threats, this work proposes a comprehensive mechanism for securing processor micro-architectures against side-channel attacks. Our methodology comprises five stages: (1) identifying and developing attack vectors, (2) compiling these attacks across various architectures, (3) scripting simulations using the Gem5 tool, (4) running these simulations, and (5) analyzing the resultant attack traces to understand and mitigate vulnerabilities. These stages are detailed in the next paragraphs.
  • Cryptographic Accumulators: New Definitions, Enhanced Security, and Delegatable Proofs
    • Barthoulot Anaïs
    • Blazy Olivier
    • Canard Sébastien
    , 2024, 14861, pp.In press. Cryptographic accumulators, introduced in 1993 by Benaloh and De Mare, represent a set with a concise value and offer proofs of (non-)membership. Accumulators have evolved, becoming essential in anonymous credentials, e-cash, and blockchain applications. Various properties, such as dynamic and universal, emerged for specific needs, leading to multiple accumulator definitions. In 2015, Derler, Hanser, and Slamanig proposed a unified model, but new properties, including zero-knowledge security, have arisen since. We offer a new definition of accumulators, based on Derler et al.’s, that is suitable for all properties. We also introduce a new security property, unforgeability of private evaluation, to protect accumulators from forgery, and we verify this property in Barthoulot, Blazy, and Canard’s recent accumulator. Finally, we discuss the security properties of accumulators and the delegatable (non-)membership proofs property.
  • Learning deep kernel networks : application to efficient and robust structured prediction
    • El Ahmad Tamim
    , 2024. The task of predicting structured objects, e.g. graphs or sequences, is more demanding than the standard supervised regression or classification problems, where the outputs are usually low-dimensional vectors. It has recently attracted a lot of attention in various fields, such as computational biology and chemistry. Such structured spaces are usually high-dimensional, discrete, and large, and lack linear structure, which makes it difficult to design a versatile model, i.e. a model able to deal with various output types within a unified framework, together with strong theoretical foundations. In this thesis, we focus on surrogate kernel methods, and in particular Input Output Kernel Regression (IOKR), a versatile and theoretically founded structured prediction approach leveraging the kernel trick in both the input and output spaces. However, in practice, this method exhibits some flaws. As with other kernel-based methods, IOKR suffers from computational burdens at both the training and inference phases. Moreover, it benefits from a closed-form solution when combined with the squared loss, but it is challenging to employ a wider variety of losses. Finally, it is not efficient in handling complex inputs such as images or texts. Our goal is then to design an OKR model that is: scalable to large datasets, theoretically sound (i.e. for which excess risk bounds can be derived), compatible with a wider variety of losses, and able to learn representations from complex inputs. In the first part of this thesis, we focus on the input kernel, and introduce a new sub-Gaussian sketching distribution, called the p-sparsified sketches, in order to scale up matrix-valued decomposable kernel machines with generic Lipschitz-continuous losses. Sketching consists in manipulating random linear projections to reduce computational complexity while maintaining good statistical performance. We additionally provide an excess risk bound for the estimator induced by this approach. In the second part, we introduce Sketched Input Sketched Output Kernel Regression (SISOKR), an IOKR-based method that leverages sketching on both the input and output kernels to induce a reduced-rank structured estimator. We derive its excess risk bound with sub-Gaussian or sub-sampling input/output sketches and show that it attains close-to-optimal learning rates. Besides, we demonstrate the strong empirical performance of SISOKR on datasets on which IOKR is intractable. In the last part, we apply sketching on the output kernel and introduce a deep neural architecture able to predict within the possibly infinite-dimensional output kernel's feature space. Indeed, we compute the basis induced by the eigenfunctions of the sketched output empirical covariance operator, and Deep Sketched Output Kernel Regression's neural network then computes an expansion within this basis and learns its coordinates during training. This unlocks the use of gradient-based methods for any loss which is the composition of the square loss with a sub-differentiable function, such as standard robust losses, and any neural architectures, such as transformers. Empirical validations of the approach are provided, in particular on a text-to-molecule dataset.
  • Statistical Modeling of Scenario-based indoor WBAN Channels
    • Youssef Badre
    • Roblin Christophe
    • Sibille Alain
    IEEE Transactions on Antennas and Propagation, Institute of Electrical and Electronics Engineers, 2024, 72 (8), pp.6549-6560. This article presents a parametric statistical path loss model for Wireless Body Area Network (WBAN) communications in the context of a scenario based approach for indoor environments. One of the specificities of WBANs is their numerous sources of variability (subject motion and morphology, antennas, local environment, etc.). We focus here on the influence of the environment, in the case of empty rooms. The model, developed for the first ultra wide band (UWB) sub-band (B = [3.1, 4.8] GHz), takes into account the sizes of the rooms (assumed to be parallelepipedic and empty) and the wall characteristics (via an average reflectivity coefficient). They also involve an elaborate categorization of environments. The following methodology was implemented, in order to avoid time-consuming and complex experimental campaigns while still having a relatively representative and sufficient number of statistical samples: firstly, a simplified ray tracing code enabled a large number of different rooms to be sampled at moderate computational cost; secondly, part of these simulations was supported by anechoic chamber measurements; and thirdly, the simulations were carried out using elaborate experimental designs, based on a categorization of environments and a fairly comprehensive study of building industry data. The parametric path loss models obtained significantly reduce their variance. (10.1109/TAP.2024.3421369)
    DOI : 10.1109/TAP.2024.3421369
  • Human pose estimation based biomechanical feature extraction for long jumps
    • Gan Qi
    • Nguyen Sao Mai
    • El-Yacoubi Mounîm
    • Fenaux Eric
    • Clémençon Stéphan
    , 2024, pp.1-6. Biomechanical features describing movements and poses of athletes have been proposed by experts to help study athletic performances, but the traditional ways of measuring those features are high-cost, time-consuming and intrusive. In this paper, we propose a deep learning-based method that can estimate athletic biomechanical features from typical broadcast competition videos, i.e. single-camera-shot moving videos. This method involves state-of-the-art human pose estimation models and a biomechanical analysis to reconstruct the trajectory. We then leverage the reconstructed trajectory to estimate the target features. To evaluate the method, we gathered a dataset from the long jump World Championships of 2017 and 2018, comprising 22 expert-proposed long-jump biomechanical features about the trajectories and the take-off and landing characteristics. Our experiments show the effectiveness of the pipeline in automatically estimating the biomechanical features. By analysing the results, we identify the challenges towards high-accuracy estimation of athletes' features from monocular broadcast competition videos. Code is available at https://github.com/QGAN2019/Long_Jump_Feature_Estimation. (10.1109/HSI61632.2024.10613530)
    DOI : 10.1109/HSI61632.2024.10613530
  • Cyberbullying Detection Using Bidirectional Encoder Representations from Transformers (BERT)
    • Chbib Fadlallah
    • Sujud Razan
    • Khatoun Rida
    • Fahs Walid
    , 2024.