
Publications

2024

  • Towards On-Device Learning on the Edge: Ways to Select Neurons to Update Under a Budget Constraint
    • Quélennec Aël
    • Tartaglione Enzo
    • Mozharovskyi Pavlo
    • Nguyen Van-Tam
    , 2024, pp.685-694. In the realm of efficient on-device learning under extreme memory and computation constraints, a significant gap in successful approaches persists. Although considerable effort has been devoted to efficient inference, the main obstacle to efficient learning is the prohibitive cost of backpropagation. The resources required to compute gradients and update network parameters often exceed the limits of tightly constrained memory budgets. This paper challenges conventional wisdom and proposes a series of experiments that reveal the existence of superior sub-networks. Furthermore, we hint at the potential for substantial gains through a dynamic neuron selection strategy when fine-tuning a target task. Our efforts extend to the adaptation of a recent dynamic neuron selection strategy pioneered by Bragagnolo et al. (NEq), revealing its effectiveness in the most stringent scenarios. Our experiments demonstrate, in the average case, the superiority of a NEq-inspired approach over a random selection. This observation prompts a compelling avenue for further exploration in the area, highlighting the opportunity to design a new class of algorithms that facilitate parameter update selection. Our findings usher in a new era of possibilities in the field of on-device learning under extreme constraints and encourage the pursuit of innovative strategies for efficient, resource-friendly model fine-tuning. (10.1109/WACVW60836.2024.00080)
    DOI : 10.1109/WACVW60836.2024.00080
  • RF-EMF Exposure Assessment of Fetus During the First Trimester of Pregnancy
    • Sandeep Srikumar
    • Vard Alireza
    • Guxens Mònica
    • Bloch Isabelle
    • Wiart Joe
    IEEE Access, IEEE, 2024, 12, pp.75311-75322. This article describes the computational analysis of Radio Frequency - Electromagnetic Field (RF-EMF) exposure of Uterus-Fetus Units (UFUs) embedded inside the body of a 26-year-old human female. Realistic UFU models are obtained from ultrasound images acquired for different fetuses and at specific development stages (7 weeks, 9 weeks and 11 weeks old), for which a deep-learning based segmentation method is developed. Each UFU model is then inserted into a computational electromagnetic model of a 26-year-old female. The Specific Absorption Rate (SAR) of the fetus at commonly used wireless communication frequencies is estimated using a commercially available numerical electromagnetic solver. The Inverted F antenna (IFA), which is a commonly used mobile phone antenna, was used as the excitation source. Fetus SAR values are reported for different combinations of excitation frequencies, phone positions and UFU ages. It was found that the fetus SAR for all the cases is well below the maximum allowable exposure limit of 80 mW/kg, as prescribed by ICNIRP. Furthermore, we replaced the embryo with uterus tissues and calculated the SAR in the uterus tissues (i.e. uterus tissues with same volume and shape, and at the same location as that of UFU). The uterus SAR values were found to be only marginally different from the fetus SAR values. (10.1109/ACCESS.2024.3404369)
    DOI : 10.1109/ACCESS.2024.3404369
  • The Impact of the COVID-19 Pandemic on Women's Contribution to Public Code
    • Casanueva Annalí
    • Rossi Davide
    • Zacchiroli Stefano
    • Zimmermann Théo
    Empirical Software Engineering, Springer Verlag, 2024. Despite its promise of openness and inclusiveness, the development of free and open source software (FOSS) remains significantly unbalanced in terms of gender representation among contributors. To assist open source project maintainers and communities in addressing this imbalance, it is crucial to understand the causes of this inequality. In this study, we aim to establish how the COVID-19 pandemic has influenced the ability of women to contribute to public code. To do so, we use the Software Heritage archive, which holds the largest dataset of commits to public code, and the difference in differences (DID) methodology from econometrics that enables the derivation of causality from historical data. Our findings show that the COVID-19 pandemic has disproportionately impacted women's ability to contribute to the development of public code, relative to men. Further, our observations of specific contributor subgroups indicate that COVID-19 particularly affected women hobbyists, identified using contribution patterns and email address domains. (10.1007/s10664-024-10552-7)
    DOI : 10.1007/s10664-024-10552-7
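As a rough illustration of the difference in differences (DID) idea named in the abstract above, the estimate can be sketched as the change in the treated group minus the change in the control group. This is a minimal textbook sketch, not the authors' analysis: the function name `did_estimate` and the toy numbers are hypothetical, and a real study would typically use a regression with controls.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Basic difference-in-differences estimate from four outcome samples.

    Each argument is a list of outcome values (for example, commits per
    contributor) for one group in one period.
    """
    # Change over time within each group.
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The DID estimate is the treated change net of the control change.
    return treated_change - control_change


# Hypothetical toy data, for illustration only.
effect = did_estimate(
    treated_pre=[10, 12],
    treated_post=[9, 11],
    control_pre=[20, 22],
    control_post=[21, 23],
)
print(effect)
```

The subtraction of the control group's change is what removes trends shared by both groups, under the usual parallel-trends assumption.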
  • A Pseudo-Metric between Probability Distributions based on Depth-Trimmed Regions
    • Staerman Guillaume
    • Mozharovskyi Pavlo
    • Colombo Pierre
    • Clémençon Stéphan
    • d'Alché-Buc Florence
    Transactions on Machine Learning Research Journal, [Amherst Massachusetts]: OpenReview.net, 2022, 2024. The design of a metric between probability distributions is a longstanding problem motivated by numerous applications in Machine Learning. Focusing on continuous probability distributions on the Euclidean space $\mathbb{R}^d$, we introduce a novel pseudo-metric between probability distributions by leveraging the extension of univariate quantiles to multivariate spaces. Data depth is a nonparametric statistical tool that measures the centrality of any element $x\in\mathbb{R}^d$ with respect to (w.r.t.) a probability distribution or a data set. It is a natural median-oriented extension of the cumulative distribution function (cdf) to the multivariate case. Thus, its upper-level sets -- the depth-trimmed regions -- give rise to a definition of multivariate quantiles. The new pseudo-metric relies on the average of the Hausdorff distance between the depth-based quantile regions w.r.t. each distribution. Its good behavior w.r.t. major transformation groups, as well as its ability to factor out translations, are depicted. Robustness, an appealing feature of this pseudo-metric, is studied through the finite sample breakdown point. Moreover, we propose an efficient approximation method with linear time complexity w.r.t. the size of the data set and its dimension. The quality of this approximation as well as the performance of the proposed approach are illustrated in numerical experiments.
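The abstract above builds its pseudo-metric from Hausdorff distances between depth-trimmed regions. As a hedged illustration of that single ingredient only, here is a brute-force Hausdorff distance between two finite point sets. The function name `hausdorff` is hypothetical, and the sketch computes neither data depth nor the averaging over quantile levels described in the paper.

```python
from math import dist


def hausdorff(a, b):
    """Hausdorff distance between two non-empty finite point sets.

    Points are tuples of coordinates; the cost is O(len(a) * len(b)).
    """
    def directed(p, q):
        # Largest distance from a point of p to its nearest point in q.
        return max(min(dist(x, y) for y in q) for x in p)

    # Symmetrise the two directed distances.
    return max(directed(a, b), directed(b, a))


# Toy example on the real line embedded in the plane.
print(hausdorff([(0, 0), (1, 0)], [(0, 0), (3, 0)]))
```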
  • Completeness, Recall, and Negation in Open-World Knowledge Bases: A Survey
    • Suchanek Fabian M.
    • Razniewski Simon
    • Arnaout Hiba
    • Ghosh Shrestha
    ACM Computing Surveys, Association for Computing Machinery, 2024. General-purpose knowledge bases (KBs) are a cornerstone of knowledge-centric AI. Many of them are constructed pragmatically from web sources, and are thus far from complete. This poses challenges for the consumption as well as the curation of their content. While several surveys target the problem of completing incomplete KBs, the first problem is arguably to know whether and where the KB is incomplete in the first place, and to which degree. In this survey, we discuss how knowledge about completeness, recall, and negation in KBs can be expressed, extracted, and inferred. We cover (i) the logical foundations of knowledge representation and querying under partial closed-world semantics; (ii) the estimation of this information via statistical patterns; (iii) the extraction of information about recall from KBs and text; (iv) the identification of interesting negative statements; and (v) relaxed notions of relative recall. This survey is targeted at two types of audiences: (1) practitioners who are interested in tracking KB quality, focusing extraction efforts, and building quality-aware downstream applications; and (2) data management, knowledge base and semantic web researchers who wish to understand the state of the art of knowledge bases beyond the open-world assumption. Consequently, our survey presents both fundamental methodologies and the results that they have produced, and gives practice-oriented recommendations on how to choose between different approaches for a problem at hand. CCS Concepts: • General and reference → Surveys and overviews; • Computing methodologies → Knowledge representation and reasoning; Artificial intelligence.
  • Exploring the potential of representation and transfer learning for anatomical neuroimaging: application to psychiatry
    • Dufumier Benoit
    • Gori Pietro
    • Petiton Sara
    • Louiset Robin
    • Mangin Jean-François
    • Grigis Antoine
    • Duchesnay Edouard
    NeuroImage, Elsevier, 2024. The perspective of personalized medicine for brain disorders requires efficient learning models for anatomical neuroimaging-based prediction of clinical conditions. There is now a consensus on the benefit of deep learning (DL) in addressing many medical imaging tasks, such as image segmentation. However, for single-subject prediction problems, recent studies yielded contradictory results when comparing DL with Standard Machine Learning (SML) on top of classical feature extraction. Most existing comparative studies were limited in predicting phenotypes of little clinical interest, such as sex and age, and using a single dataset. Moreover, they conducted a limited analysis of the employed image pre-processing and feature selection strategies. This paper extensively compares DL and SML prediction capacity on five multi-site problems, including three increasingly complex clinical applications in psychiatry, namely schizophrenia, bipolar disorder and Autism Spectrum Disorder (ASD) diagnosis. To compensate for the relative scarcity of neuroimaging data on these clinical datasets, we also evaluate three pre-training strategies for transfer learning from brain imaging of the general healthy population: self-supervised learning, generative modelling and supervised learning with age. Overall, we find similar performance between randomly initialized DL and SML for the three clinical tasks and a similar scaling trend for sex prediction. This was replicated on an external dataset. We also show highly correlated discriminative brain regions between DL and linear ML models in all problems. Nonetheless, we demonstrate that self-supervised pre-training on large-scale healthy population imaging dataset (N ≈ 10k), along with Deep Ensemble, allows DL to learn robust and transferable representations to smaller-scale clinical datasets (N ≤ 1k). It largely outperforms SML on 2 out of 3 clinical tasks both in internal and external test sets. These findings suggest that the improvement of DL over SML in anatomical neuroimaging mainly comes from its capacity of learning meaningful and useful abstract representations of the brain anatomy, and it sheds light on the potential of transfer learning for personalized medicine in psychiatry.
  • Reducing the Silicon Area Overhead of Counter-Based Rowhammer Mitigations
    • France Loïc
    • Bruguier Florent
    • Novo David
    • Mushtaq Maria
    • Benoit Pascal
    IEEE Computer Architecture Letters, Institute of Electrical and Electronics Engineers, 2024, 23 (1), pp.61-64. Modern computer memories have been shown to have reliability issues. The main memory is the target of a security threat called Rowhammer, which causes bit flips in adjacent victim cells of aggressor rows. Numerous countermeasures have been proposed, some of the most efficient ones relying on row access counters, with different techniques to reduce the impact on performance, energy consumption and silicon area. In these proposals, the number of counters is calculated using the maximum number of row activations that can be issued to the protected bank. As reducing the number of counters results in lower silicon area and energy overheads, this can have a direct impact on the production and usage costs. In this work, we demonstrate that two of the most efficient countermeasures can have their silicon area overhead reduced by approximately 50% without impacting the protection level by changing their counting granularity. (10.1109/LCA.2023.3328824)
    DOI : 10.1109/LCA.2023.3328824
  • The Non-Cancelling Intersections Conjecture
    • Amarilli Antoine
    • Monet Mikaël
    • Suciu Dan
    , 2024. In this note, we present a conjecture on intersections of set families, and a rephrasing of the conjecture in terms of principal downsets of Boolean lattices. The conjecture informally states that, whenever we can express the measure of a union of sets in terms of the measure of some of their intersections using the inclusion-exclusion formula, then we can express the union as a set from these same intersections via the set operations of disjoint union and subset complement. We also present a partial result towards establishing the conjecture.
  • Device-independent quantum key distribution based on routed Bell tests
    • Le Roy-Deloison Tristan
    • Lobo Edwin Peter
    • Pauwels Jef
    • Pironio Stefano
    , 2024. Photon losses are the main obstacle to fully photonic implementations of device-independent quantum key distribution (DIQKD). Motivated by recent work showing that routed Bell scenarios offer increased robustness to detection inefficiencies for the certification of long-range quantum correlations, we investigate DIQKD protocols based on a routed setup. In these protocols, in some of the test rounds, photons from the source are routed by an actively controlled switch to a nearby test device instead of the distant one. We show how to analyze the security of these protocols and compute lower bounds on the key rates using non-commutative polynomial optimization and the Brown-Fawzi-Fawzi method. We determine lower bounds on the asymptotic key rates of several simple two-qubit routed DIQKD protocols based on CHSH or BB84 correlations and compare their performance to standard protocols. We find that in an ideal case routed DIQKD protocols can significantly improve detection efficiency requirements, by up to $\sim 30\%$, compared to their non-routed counterparts. Notably, the routed BB84 protocol achieves a positive key rate with a detection efficiency as low as $50\%$ for the distant device, the minimal threshold for any QKD protocol featuring two untrusted measurements. However, the advantages we find are highly sensitive to noise and losses affecting the short-range correlations involving the additional test device.
  • Self-Supervised Learning of Multi-level Audio Representations for Music Segmentation
    • Buisson Morgan
    • McFee Brian
    • Essid Slim
    • Crayencour Hélène
    IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2024, pp.1-13. The task of music structure analysis refers to automatically identifying the location and the nature of musical sections within a song. In the supervised scenario, structural annotations generally result from exhaustive data collection processes, which represents one of the main challenges of this task. Moreover, both the subjectivity of music structure and the hierarchical characteristics it exhibits make the obtained structural annotations not fully reliable, in the sense that they do not convey a "universal ground-truth" unlike other tasks in music information retrieval. On the other hand, the quickly growing quantity of available music data has enabled weakly supervised and self-supervised approaches to achieve impressive results on a wide range of music-related problems. In this work, a self-supervised learning method is proposed to learn robust multi-level music representations prior to structural segmentation using contrastive learning. To this end, sets of frames sampled at different levels of detail are used to train a deep neural network in a disentangled manner. The proposed method is evaluated on both flat and multi-level segmentation. We show that each distinct sub-region of the output embeddings can efficiently account for structural similarity at their own targeted level of detail, which ultimately improves performance of downstream flat and multi-level segmentation. Finally, complementary experiments are carried out to study how the obtained representations can be further adapted to specific datasets using a supervised fine-tuning objective in order to facilitate structure retrieval in domains where human annotations remain scarce. (10.1109/TASLP.2024.3379894)
    DOI : 10.1109/TASLP.2024.3379894
  • Malliavin Structure for Conditionally Independent Random Variables
    • Decreusefond Laurent
    • Vuong Christophe
    , 2024. On any denumerable product of probability spaces, we extend the discrete Malliavin structure for conditionally independent random variables. As a consequence, we obtain the chaos decomposition for functionals of conditionally independent random variables. We also show how to derive some concentration results in that framework. The Malliavin-Stein method yields Berry-Esseen bounds for U-statistics of such random variables. It leads to quantitative statements of conditional limit theorems: Lyapunov's central limit theorem, De Jong's limit theorem for multilinear forms. The latter is related to the fourth moment phenomenon. The final application consists of obtaining the rates of normal approximation for subhypergraph counts in random exchangeable hypergraphs including the Erdős-Rényi hypergraph model. The estimator of subhypergraph counts is an example of homogeneous sums for which we derive a new decomposition that extends the Hoeffding decomposition.
  • Check-Bit Region Exploration in Two-Dimensional Error Correction Codes
    • Freitas David
    • Mota David
    • Coelho David
    • Fontinele Humberto
    • Coelho Alexandre
    • Silveira Jarbas
    • Naviner Lirida
    • Mota João
    • Marcon César
    IEEE Access, IEEE, 2024, 12, pp.131830-131841. The diversity of nanosatellite applications is increasingly attracting the scientific community’s attention. The main component of these satellites is the On-Board Computer (OBC), which is responsible for all control and processing. The OBC also encompasses memory elements highly susceptible to failure; due to spatial radiation, errors in these memories can cause severe damage. As integrated circuit technology advances, cluster errors are more and more frequent. Error Correction Code (ECC) is one of the most used techniques for mitigating errors, and two-dimensional ECCs are used to reach higher error correction power. The paper aims to assess the number of check-bit regions to include for code enhancement. Our analysis investigates the impact of incorporating up to three check-bit regions. The results are analyzed through adjacent and exhaustive error injection tests and compared to other ECCs. In addition, reliability, redundancy, and hardware implementation costs are investigated, and an evaluation metric is proposed to choose the best ECC. Experiments with random error patterns show that the proposal with three crossed check-bit regions achieves a correction of 100% for up to four bitflips and greater than 90% for up to seven bitflips. Additionally, considering adjacent error patterns, the proposal achieves a correction greater than 97.4% with up to five bitflips. (10.1109/ACCESS.2024.3456582)
    DOI : 10.1109/ACCESS.2024.3456582
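For readers unfamiliar with two-dimensional ECCs, the simplest instance is a row and column parity code, which can locate and flip a single erroneous bit. The sketch below shows only that textbook scheme, under the assumption of a rectangular bit matrix. It is not the check-bit region layout evaluated in the paper above, and the function names `encode` and `correct_single_error` are hypothetical.

```python
def encode(bits):
    """Return (row_parity, col_parity) for a rectangular matrix of 0/1 bits."""
    row_parity = [sum(row) % 2 for row in bits]
    col_parity = [sum(col) % 2 for col in zip(*bits)]
    return row_parity, col_parity


def correct_single_error(bits, row_parity, col_parity):
    """Return a copy of bits with at most one flipped data bit repaired.

    A single bitflip makes exactly one row parity and one column parity
    disagree; their intersection is the faulty position.
    """
    bad_rows = [i for i, row in enumerate(bits)
                if sum(row) % 2 != row_parity[i]]
    bad_cols = [j for j, col in enumerate(zip(*bits))
                if sum(col) % 2 != col_parity[j]]
    fixed = [list(row) for row in bits]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        fixed[bad_rows[0]][bad_cols[0]] ^= 1
    return fixed


data = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
rows, cols = encode(data)
corrupted = [list(row) for row in data]
corrupted[1][2] ^= 1  # inject one bitflip
print(correct_single_error(corrupted, rows, cols) == data)
```

This basic scheme cannot correct multi-bit patterns in general, which is the kind of limitation that motivates adding further check-bit regions.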
  • A Gaussian Process Based Approach for Validation of Multi-Variable Measurement Systems: Application to SAR Measurement Systems
    • Bujard Cédric
    • Neufeld Esra
    • Douglas Mark
    • Wiart Joe
    • Kuster Niels
    IEEE Access, IEEE, 2024, 12, pp.60404-60424. Resource-efficient and robust validation of systems designed to measure a multi-dimensional parameter space is an unsolved problem as it would require millions of test permutations for comprehensive validation coverage. In the paper, an efficient and comprehensive validation approach based on a Gaussian Process (GP) model of the test system has been developed that can operate system-agnostically, avoids calibration to a fixed set of known validation benchmarks, and supports large configuration spaces. The approach consists of three steps that can be performed independently by different parties: 1) GP model creation, 2) model confirmation, and 3) targeted search for critical cases. It has been applied to two systems that measure specific absorption rate (SAR) for compliance testing of wireless devices and apply different SAR measurement methods: a probe-scanning system (per IEC/IEEE 62209–1528), and a static sensor-array system (per IEC 62209–3). The results demonstrate that the approach is practical, feasible, suitable for proving effective equivalence, and can be applied to any measurement method and implementation. The presented method is sufficiently general to be of value not only for SAR system validation, but also in a wide variety of applications that require critical, independent, and efficient validation. (10.1109/ACCESS.2024.3393778)
    DOI : 10.1109/ACCESS.2024.3393778
  • On Ranking-based Tests of Independence
    • Limnios Myrto
    • Clémençon Stéphan
    , 2024. In this paper we develop a novel nonparametric framework to test the independence of two random variables $\mathbf{X}$ and $\mathbf{Y}$ with unknown respective marginals $H(dx)$ and $G(dy)$ and joint distribution $F(dx dy)$, based on Receiver Operating Characteristic (ROC) analysis and bipartite ranking. The rationale behind our approach relies on the fact that the independence hypothesis $\mathcal{H}_0$ is necessarily false as soon as the optimal scoring function related to the pair of distributions $(H\otimes G,\; F)$, obtained from a bipartite ranking algorithm, has a ROC curve that deviates from the main diagonal of the unit square. We consider a wide class of rank statistics encompassing many ways of deviating from the diagonal in the ROC space to build tests of independence. Beyond its great flexibility, this new method has theoretical properties that far surpass those of its competitors. Nonasymptotic bounds for the two types of testing errors are established. From an empirical perspective, the novel procedure we promote in this paper exhibits a remarkable ability to detect small departures, of various types, from the null assumption $\mathcal{H}_0$, even in high dimension, as supported by the numerical experiments presented here.