Publications

2022

  • On permutation quadrinomials with boomerang uniformity 4 and the best-known nonlinearity
    • Kim Kwang Ho
    • Mesnager Sihem
    • Choe Jong Hyok
    • Lee Dok Nam
    • Lee Sengsan
    • Jo Myong Chol
    Designs, Codes and Cryptography, Springer Verlag, 2022, 90 (6), pp.1437-1461. (10.1007/s10623-022-01047-x)
    DOI : 10.1007/s10623-022-01047-x
  • Formal modeling and verification for amplification timing anomalies in the superscalar TriCore architecture
    • Binder Benjamin
    • Asavoae Mihail
    • Brandner Florian
    • Ben Hedia Belgacem
    • Jan Mathieu
    International Journal on Software Tools for Technology Transfer, Springer Verlag, 2022, 24 (3), pp.415-440. (10.1007/s10009-022-00655-1)
    DOI : 10.1007/s10009-022-00655-1
  • Quantum key distribution and classical communication coherent deployment with shared hardware and joint digital signal processing
    • Aymeric Raphael
    • Jaouën Yves
    • Ware Cédric
    • Alléaume Romain
    , 2022 (1213307), pp.19. (10.1117/12.2621260)
    DOI : 10.1117/12.2621260
  • Performance optimization of reconfigurable intelligent surfaces in multipath channels
    • Sibille Alain
    , 2022, pp.1-4. (10.23919/AT-AP-RASC54737.2022.9814382)
    DOI : 10.23919/AT-AP-RASC54737.2022.9814382
  • RF-EMF exposure induced by distributed antenna system in the subway station
    • Mazloum Taghrid
    • Wang Shanshan
    • Wiart Joe
    , 2022, pp.1-2. We aim in the present paper to address the impact of installing indoor distributed antenna system (distAS) on the human exposure to radio-frequency electromagnetic field (RF-EMF). We note that distAS aims to extend coverage and improve wireless communication quality. We performed measurement campaigns in subway stations, where distAS are deployed. The impact of distAS on the exposure is studied by considering two scenarios where distAS are turned either on or off. The electric field strength is measured at different distances to the distAS, for all the frequency bands and operators. The results show that the DL exposure induced by distAS is very low and far away from the standard limits of ICNIRP. (10.23919/AT-AP-RASC54737.2022.9814210)
    DOI : 10.23919/AT-AP-RASC54737.2022.9814210
  • Anamorphic Encryption: Private Communication Against a Dictator
    • Persiano Giuseppe
    • Phan Duong Hieu
    • Yung Moti
    , 2022, 13276, pp.34-63. (10.1007/978-3-031-07085-3_2)
    DOI : 10.1007/978-3-031-07085-3_2
  • Unprofiled expectation-maximization attack
    • Béguinot Julien
    • Cheng Wei
    • Guilley Sylvain
    • Rioul Olivier
    , 2022. Block ciphers are often protected against side-channel attacks by masking. When traces are available for each key hypothesis, the attacker usually resorts to template attacks with a profiling phase. Lemke-Rust & Paar suggested at CHES2007 a way to profile templates for Gaussian mixture models, with the use of the well-known Expectation-Maximization (EM) algorithm. In this work, we present a new attack, “unprofiled-EM” (U-EM) that does not use the knowledge of the masks nor requires a profiling phase. This is done by “on-the-fly” regression of the coefficients of a stochastic model using the EM algorithm. Compared to previous methods, it is easy to implement, computationally tractable and efficient in terms of success rate or guessing entropy. We discuss several variations of U-EM and compare their performances on simulations and on real DPA contest traces. The best attack scenario depends on the trade-off between measurement noise and epistemic noise.
  • Reducing the Silicon Area Overhead of Counter-Based Rowhammer Mitigations
    • France Loïc
    • Bruguier Florent
    • Novo David
    • Mushtaq Maria
    • Benoit Pascal
    , 2022. Modern computer memories have been shown to have reliability issues. The main memory is the target of a security threat called Rowhammer, which causes bit flips in adjacent victim cells of repeatedly activated aggressor rows [1]. This issue is becoming more important as DRAM technology scales down, with the required aggressor activations to corrupt a victim going from 130k for DDR3 [1] to around 10k for the most recent LPDDR4 memories [2]. Numerous countermeasures have been proposed, implemented either in software [3], [4] or in hardware [1], [5]-[10]. Among the hardware-based proposals, some rely on probability, randomly refreshing neighbors of activated rows [1], [5], [6], while others rely on row activation counters to detect aggressor rows before acting to prevent the corruption [7]-[9]. Counter-based hardware mitigation proposals offer the lowest performance overhead, as the mechanism only acts when an aggressor is detected and does not disturb the system for harmless applications. However, they require a lot of counters to track row activations. Considering the unrealistic number of counters needed to track every row, those mitigations exploit different most-frequent-elements detection algorithms to reduce the number of counters needed while keeping a complete protection and a minimal false positive rate. Most of them offer a bank-level attack detection, with a separate set of counters for each bank. In this talk, we will show that by changing the counting granularity from bank-level to rank-level, we can further reduce the total required number of counters, by 20% for DDR3 and up to 70% for DDR5, thus reducing the silicon area and energy overheads of such mitigations.
  • Evaluation of side-channel attacks using alpha-information
    • Liu Yi
    • Cheng Wei
    • Guilley Sylvain
    • Rioul Olivier
    , 2022. Mutual information as an information-theoretic tool has been frequently used in many security analyses. Chérisey et al. used Shannon information-theoretic tools to establish some universal inequalities between the probability of success of a side-channel attack and the minimum number of queries to reach a given success rate. α-information theory is a generalization of classic information-theoretic tools which seems more persuasive in a side-channel context. Such metrics include Rényi’s α-entropy, α-divergence, Arimoto’s conditional α-entropy, Sibson’s α-information, etc. In this work, we aim at extending the work of Chérisey et al. to α-information quantities depending on a parameter α. A conditional version of Sibson’s α-information is defined using a simple closed-form expression. Our definition of conditional α-information satisfies important properties such as consistency, uniform expansion, and data processing inequalities, while other previous proposals do not satisfy all of these properties. Based on our proposal and a generalized Fano inequality, we extend the case α = 1 of previous works to any α > 0, and obtain sharp universal upper bounds for the probability of success of any type of side-channel attack. It turns out the bound is improved as α increases, and it is already very tight when α = 2.
  • 3D Simulation for Disaster Management: toward a new approach
    • Tanzi Tullio
    • Apvrille Ludovic
    , 2022. Recent progress in modern technology can enhance the definition of disaster recovery management strategy. Rescue teams can rely on Autonomous Systems (A.S.) during recovery operations, dispatching to them various tasks. A.S. can reach locations that may be unattainable or dangerous for humans. However, before sending the autonomous system to the catastrophe area, it is important to verify its adequation to the environment and to the mission objectives. The simulation provides an assessment of this adaptation between the autonomous system and its expectations.
  • Fully Homomorphic Encryption and Bootstrapping
    • Leluc Rémi
    • Chedemail Elie
    • Kouande Adéchola
    • Nguyen Quyen
    • Andriamandratomanana Njaka
    , 2022, pp.1-12. This report is the result of work done during the SEME (Semaine d'Étude Mathématiques Entreprise) in Rennes (May 2nd-May 6th 2022). The project presented here concerns the company Ravel Technologies and deals with the field of homomorphic encryption for secure data processing. Homomorphic encryption is a form of encryption that allows users to do computations on encrypted data without first decrypting them. An encryption algorithm must commute with elementary operations, which are at least addition and multiplication. A straightforward application of homomorphic encryption for the delegation of calculations concerns cloud computing services where there is a need to perform calculations while preserving the confidentiality of the data, e.g. for the medical and banking sectors. Among possible algorithms, those that have become popular over the last ten years are based on the so-called Learning With Errors (LWE) problem, for performance and security reasons. Since this technique introduces noise into the encryption, which can grow during homomorphic computations to the point that later decryption fails, it is necessary to consider a noise reduction method such as bootstrapping.
  • Analyzing and repairing concept drift adaptation in data stream classification
    • Halstead Ben
    • Koh Yun Sing
    • Riddle Patricia
    • Pears Russel
    • Pechenizkiy Mykola
    • Bifet Albert
    • Olivares Gustavo
    • Coulson Guy
    Machine Learning, Springer Verlag, 2022, 111 (10), pp.3489-3523. Data collected over time often exhibit changes in distribution, or concept drift, caused by changes in factors relevant to the classification task, e.g. weather conditions. Incorporating all relevant factors into the model may be able to capture these changes; however, this is usually not practical. Data stream based methods, which instead explicitly detect concept drift, have been shown to retain performance under unknown changing conditions. These methods adapt to concept drift by training a model to classify each distinct data distribution. However, we hypothesize that existing methods do not robustly handle real-world tasks, leading to adaptation errors where context is misidentified. Adaptation errors may cause a system to use a model which does not fit the current data, reducing performance. We propose a novel repair algorithm to identify and correct errors in concept drift adaptation. Evaluation on synthetic data shows that our proposed AiRStream system has higher performance than baseline methods, while also being better at capturing the dynamics of the stream. Evaluation on an air quality inference task shows AiRStream provides increased real-world performance compared to eight baseline methods. A case study shows that AiRStream is able to build a robust model of environmental conditions over this task, allowing the adaptions made to concept drift to be analysed and related to changes in weather. We discovered a strong predictive link between the adaptions made by AiRStream and changes in meteorological conditions. (10.1007/S10994-021-05993-W)
    DOI : 10.1007/S10994-021-05993-W
  • Microstrip Antenna Array Design for UAV Detection
    • Mendes Ruiz Pedro
    • Begaud Xavier
    • Magne François
    • Leder Etienne
    , 2022. This work presents the design and realization of four linear arrays of microstrip rectangular patch antennas. This linear array is one of the elements of a passive radar using signals from 4G base stations for UAV detection. The arrays have been validated and operate from 2.62 GHz to 2.69 GHz, with a HPBW of 82° in H-plane and a maximal gain going from 11.1 dB to 12.2 dB in the required bandwidth, with a cosecant squared pattern in the E-plane.
  • Fiblets for Real‐Time Rendering of Massive Brain Tractograms
    • Schertzer Jérémie
    • Mercier Corentin
    • Rousseau Sylvain
    • Boubekeur Tamy
    Computer Graphics Forum, Wiley, 2022, 41 (2), pp.447-460. We present a method to render massive brain tractograms in real time. Tractograms model the white matter architecture of the human brain using millions of 3D polylines (fibers), summing up to billions of segments. They are used by neurosurgeons before surgery as well as by researchers to better understand the brain. A typical raw dataset for a single brain represents dozens of gigabytes of data, preventing their interactive rendering. We address this challenge with a new GPU mesh shader pipeline based on a decomposition of the fiber set into compressed local representations that we call fiblets. Their spatial coherence is used at runtime to efficiently cull hidden geometry at the task shader stage while synthesizing the visible ones as polyline meshlets in a warp‐scale parallel fashion at the mesh shader stage. As a result, our pipeline can feed a standard deferred shading engine to visualize the mesostructures of the brain with various classical rendering techniques, as well as simple interaction primitives. We demonstrate that our algorithm provides real‐time framerates on very large tractograms that were out of reach for previous methods while offering a fiber‐level granularity in both rendering and interaction. (10.1111/cgf.14486)
    DOI : 10.1111/cgf.14486
  • Wideband metamaterial absorber: from concept to naval applications
    • Begaud Xavier
    • Lepage Anne Claire
    • Soiron Michel
    • Barka André
    • Laybros Sarah
    , 2022. This presentation focuses on the results obtained during the two projects SAFAS and SAFASNAV, which highlight the evolution from the concept of broadband absorber with metamaterials to a realization of this concept in structural material for naval application. The French Ministry of Defense (DGA), through the National Research Agency (ANR) and the Astrid and Astrid Maturation programs, funded the research that led to these results. The simulation results were obtained using GENCI's HPC resources (Grant c2016107558).
  • TA4L: Efficient temporal abstraction of multivariate time series
    • Mordvanyuk Natalia
    • López Beatriz
    • Bifet Albert
    Knowledge-Based Systems, Elsevier, 2022, 244, pp.108554. In this work, we introduce TA4L, a new efficient algorithm to transform multivariate time series into Lexicographical Symbolic Time Interval Sequences (LSTISs), that is, sequences ready to feed time-interval related pattern (TIRP) mining algorithms. The ultimate goal is to make explicit the embedded, ad-hoc pre-processes related to TIRP mining algorithms while offering an efficient solution for the required pre-processing. On the one hand, TA4L divides the signals into segments based on time duration (instead of the often-used practice based on the number of samples), which allows the construction of consistent time intervals. Concatenation of intervals is controlled by a maximum time gap constraint that reinforces the generated time intervals’ consistency. Moreover, different ways to parallelise the algorithm are explored that are accompanied by efficient data structures to speed up the pre-processing cost. TA4L has been experimentally evaluated with synthetic and real datasets, and the results show that TA4L requires significantly less computation time than other state-of-the-art approaches, revealing that it is an effective algorithm. (10.1016/J.KNOSYS.2022.108554)
    DOI : 10.1016/J.KNOSYS.2022.108554
  • Generalized Sliced Probability Metrics
    • Kolouri Soheil
    • Nadjahi Kimia
    • Shahrampour Shahin
    • Simsekli Umut
    , 2022, pp.4513-4517. Sliced probability metrics have become increasingly popular in machine learning, and they play a quintessential role in various applications, including statistical hypothesis testing and generative modeling. However, in a practical setting, the convergence behavior of the algorithms built upon these distances has not been well established, except for a few specific cases. In this paper, we introduce a new family of sliced probability metrics, namely Generalized Sliced Probability Metrics (GSPMs), based on the idea of slicing high-dimensional distributions into a set of their one-dimensional marginals. We show that GSPMs are true metrics, and they are related to the Maximum Mean Discrepancy (MMD). Exploiting this relationship, we consider GSPM-based gradient flows and show that, under mild assumptions, the gradient flow converges to the global optimum. Finally, we demonstrate that various choices of GSPMs lead to new positive definite kernels that could be used in the MMD formulation while providing a unique integral geometric interpretation. We illustrate the application of GSPMs in gradient flows. (10.1109/ICASSP43922.2022.9746016)
    DOI : 10.1109/ICASSP43922.2022.9746016
  • PHASE SHIFTED BEDROSIAN FILTERBANK: AN INTERPRETABLE AUDIO FRONT-END FOR TIME-DOMAIN AUDIO SOURCE SEPARATION
    • Mathieu Félix
    • Courtat Thomas
    • Richard Gael
    • Peeters Geoffroy
    , 2022. The use of parameterized encoders or audio front-ends has shown promise in improving the interpretability of time-domain single-channel source separation models such as Conv-TasNet. This type of filter also allows a potential reduction of the computational cost since larger encoder filters can be used. In this work, we propose to build a new parameterization of such an encoder filterbank which allows gaining interpretability while keeping flexibility. Based on the Hilbert transform and the Bedrosian theorem, we propose to build a phase-shifted set of filters by modulating sinusoids through freely learned low-pass filters. We show that the use of these filters makes it possible to keep the same performance when using small filters and even improve it when using large filters. (10.1109/ICASSP43922.2022.9746122)
    DOI : 10.1109/ICASSP43922.2022.9746122
  • Geographic Diversity in Public Code Contributions
    • Rossi Davide
    • Zacchiroli Stefano
    , 2022. We conduct an exploratory, large-scale, longitudinal study of 50 years of commits to publicly available version control system repositories, in order to characterize the geographic diversity of contributors to public code and its evolution over time. We analyze in total 2.2 billion commits collected by Software Heritage from 160 million projects and authored by 43 million authors during the 1971-2021 time period. We geolocate developers to 12 world regions derived from the United Nations geoscheme, using as signals email top-level domains, author names compared with name distributions around the world, and UTC offsets mined from commit metadata. We find evidence of the early dominance of North America in open source software, later joined by Europe. After that period, the geographic diversity in public code has been constantly increasing. We also identify relevant historical shifts related to the UNIX wars, the increase of coding literacy in Central and South Asia, and broader phenomena like colonialism and people movement across countries (immigration/emigration). (10.1145/3524842.3528471)
    DOI : 10.1145/3524842.3528471
  • A Two-Step Radiologist-Like Approach for Covid-19 Computer-Aided Diagnosis from Chest X-Ray Images
    • Barbano Carlo Alberto
    • Tartaglione Enzo
    • Berzovini Claudio
    • Calandri Marco
    • Grangetto Marco
    , 2022, 13231, pp.173-184. Thanks to the rapid increase in computational capability in recent years, traditional and more explainable methods have been gradually replaced by more complex deep-learning-based approaches, which have in fact reached new state-of-the-art results for a variety of tasks. However, for certain kinds of applications performance alone is not enough. A prime example is the medical field, in which building trust between the physicians and the AI models is fundamental. Providing an explainable or trustworthy model, however, is not a trivial task, considering the black-box nature of deep-learning-based methods. While some existing methods, such as gradient or saliency maps, try to provide insights about the functioning of deep neural networks, they often provide limited information with regard to clinical needs. We propose a two-step diagnostic approach for the detection of Covid-19 infection from Chest X-Ray images. Our approach is designed to mimic the diagnosis process of human radiologists: it detects objective radiological findings in the lungs, which are then employed for making a final Covid-19 diagnosis. We believe that this kind of structural explainability can be preferable in this context. The proposed approach achieves promising performance in Covid-19 detection, compatible with expert human radiologists. Moreover, despite this work being focused on Covid-19, we believe that this approach could be employed for many different CXR-based diagnoses. (10.1007/978-3-031-06427-2_15)
    DOI : 10.1007/978-3-031-06427-2_15
  • Privacy in Machine Learning
    • Cummings Rachel
    • Valla Mathias
    • Nesterenko Luca
    • El Ahmad Tamim
    • Ahmadipour Mehrasa
    • Lachi Veronica
    • Lalanne Clément
    • Siviero Emilia
    • Ogier Clément
    • Jose Ashna
    • Oufkir Aadil
    , 2022, pp.18. Privacy considerations arise as soon as data is collected on individuals, on groups of individuals, on moral personas, etc. More specifically, we look at the setup where one processes data D through a mechanism M, which can be anything from data publication, basic statistics computation, decision rule learning, or complex machine learning tasks, and wants the result M(D) to be made public. The natural question from a privacy standpoint is whether the mechanism M can be "reverted" in order to learn sensitive information from D. For instance, if M is the identity function, the publication of M(D) leaks full information about D and even though the notion of privacy is not rigorously defined yet, we can intuitively qualify such a mechanism as "non-private". This manuscript is a transcription of Prof. Rachel Cummings’ lecture titled Privacy in Machine Learning that was given at the 2022 Spring School of Theoretical Computer Science at the CIRM, Marseille, France. Any error in this document may be due to its transcription and cannot be imputed to Prof. Cummings.
  • A Large-scale Dataset of (Open Source) License Text Variants
    • Zacchiroli Stefano
    , 2022. We introduce a large-scale dataset of the complete texts of free/open source software (FOSS) license variants. To assemble it we have collected from the Software Heritage archive (the largest publicly available archive of FOSS source code with accompanying development history) all versions of files whose names are commonly used to convey licensing terms to software users and developers. The dataset consists of 6.5 million unique license files that can be used to conduct empirical studies on open source licensing, training of automated license classifiers, natural language processing (NLP) analyses of legal texts, as well as historical and phylogenetic studies on FOSS licensing. Additional metadata about shipped license files are also provided, making the dataset ready to use in various contexts; they include: file length measures, detected MIME type, detected SPDX license (using ScanCode), example origin (e.g., GitHub repository), oldest public commit in which the license appeared. The dataset is released as open data as an archive file containing all deduplicated license files, plus several portable CSV files for metadata, referencing files via cryptographic checksums. (10.1145/3524842.3528491)
    DOI : 10.1145/3524842.3528491
  • Flow-Based Fast Multichannel Nonnegative Matrix Factorization for Blind Source Separation
    • Nugraha Aditya Arie
    • Sekiguchi Kouhei
    • Fontaine Mathieu
    • Bando Yoshiaki
    • Yoshii Kazuyoshi
    , 2022. This paper describes a blind source separation method for multichannel audio signals, called NF-FastMNMF, based on the integration of the normalizing flow (NF) into the multichannel nonnegative matrix factorization with jointly-diagonalizable spatial covariance matrices, a.k.a. FastMNMF. Whereas the NF of flow-based independent vector analysis, called NF-IVA, acts as the demixing matrices to transform an M-channel mixture into M independent sources, the NF of NF-FastMNMF acts as the diagonalization matrices to transform an M-channel mixture into a spatially-independent M-channel mixture represented as a weighted sum of N source images. This diagonalization enables the NF, which has been used only for determined separation because of its bijective nature, to be applicable to non-determined separation. NF-FastMNMF has time-varying diagonalization matrices that are potentially better at handling dynamical data variation than the time-invariant ones in FastMNMF. To have an NF with richer expression capability, the dimension-wise scalings using diagonal matrices originally used in NF-IVA are replaced with linear transformations using upper triangular matrices; in both cases, the diagonal and upper triangular matrices are estimated by neural networks. The evaluation shows that NF-FastMNMF performs well for both determined and non-determined separations of multiple speech utterances by stationary or non-stationary speakers from a noisy reverberant mixture.
  • The General Index of Software Engineering Papers
    • Abou Khalil Zeinab
    • Zacchiroli Stefano
    , 2022. We introduce the General Index of Software Engineering Papers, a dataset of fulltext-indexed papers from the most prominent scientific venues in the field of Software Engineering. The dataset includes both complete bibliographic information and indexed n-grams (sequences of contiguous words after removal of stopwords and non-words, for a total of 577 276 382 unique n-grams in this release) with length 1 to 5 for 44 581 papers retrieved from 34 venues over the 1971-2020 period. The dataset serves use cases in the field of meta-research, making it possible to introspect the output of software engineering research even when access to papers or scholarly search engines is not possible (e.g., due to contractual reasons). The dataset also contributes to making such analyses reproducible and independently verifiable, as opposed to what happens when they are conducted using 3rd-party and non-open scholarly indexing services. The dataset is available as a portable Postgres database dump and released as open data. (10.1145/3524842.3528494)
    DOI : 10.1145/3524842.3528494
  • Worldwide Gender Differences in Public Code Contributions
    • Rossi Davide
    • Zacchiroli Stefano
    , 2022. Gender imbalance is a well-known phenomenon observed throughout the sciences which is particularly severe in software development and Free/Open Source Software communities. Little is known yet about the geography of this phenomenon, in particular when considering large scales for both its time and space dimensions. We contribute to filling this gap with a longitudinal study of the population of contributors to publicly available software source code. We analyze the development history of 160 million software projects for a total of 2.2 billion commits contributed by 43 million distinct authors over a period of 50 years. We classify author names by gender using name frequencies and author geographical locations using heuristics based on email addresses and time zones. We study the evolution over time of contributions to public code by gender and by world region. For the world overall, we confirm previous findings about the low but steadily increasing ratio of contributions by female authors. When breaking down by world regions we find that the long-term growth of female participation is a worldwide phenomenon. We also observe a decrease in the ratio of female participation during the COVID-19 pandemic, suggesting that women's ability to contribute to public code has been more hindered than that of men. (10.1145/3510458.3513011)
    DOI : 10.1145/3510458.3513011