Sorry, you need to enable JavaScript to visit this website.
Share

Publications

2020

  • Scikit-network: Graph Analysis in Python
    • Bonald Thomas
    • de Lara Nathan
    • Lutz Quentin
    • Charpentier Bertrand
    Journal of Machine Learning Research, Microtome Publishing, 2020. Scikit-network is a Python package inspired by scikit-learn for the analysis of large graphs. Graphs are represented by their adjacency matrix in the sparse CSR format of SciPy. The package provides state-of-the-art algorithms for ranking, clustering, classifying, embedding and visualizing the nodes of a graph. High performance is achieved through a mix of fast matrix-vector products (using SciPy), compiled code (using Cython) and parallel processing. The package is distributed under the BSD license, with dependencies limited to NumPy and SciPy. It is compatible with Python 3.6 and newer. Source code, documentation and installation instructions are available online.
  • A Model-Based Combination Language for Scheduling Verification
    • Zhao Hui
    • Apvrille Ludovic
    • Mallet Frédéric
    , 2020. Cyber-Physical Systems (CPSs) are built upon discrete software and hardware components, as well as continuous physical components. Such heterogeneous systems involve numerous domains with competencies and expertise that go far beyond traditional software engineering: systems engineering. In this paper , we explore a model-based approach for systems engineering that advocates the composition of several heterogeneous artifacts (called views) into a sound and consistent system model. A model combination Language is proposed for this purpose. Thus, rather than trying to build the universal language able to capture all possible aspects of systems, the proposed language proposes to relate small subsets of languages in order to offer specific analysis capabilities while keeping a global consistency between all joined models. We demonstrate the interest of our approach through an industrial process based on Capella, which provides (among others) a large support for functional analysis from requirements to components deployment. Even though Capella is already quite expressive, it lacks support for schedulability analysis. AADL is also a language dedicated to system analysis. If it is backed with advanced schedulability tools, it lacks support for functional analysis. Thus, instead of proposing ways to add missing aspects in either Capella or AADL, we rather extract a relevant subset of both languages to build a view adequate for conducting schedulability analysis of Capella functional models. Finally, our combination language is generic enough to extract pertinent subsets of languages and combine them to build views for different experts. It also helps maintaining a global consistency between different modeling views.
  • Multimodal Analysis of Cohesion in Multi-party Interactions
    • B Kantharaju Reshmashree
    • Langlet Caroline
    • Barange Mukesh
    • Clavel Chloé
    • Pelachaud Catherine I
    , 2020. Group cohesion is an emergent phenomenon that describes the tendency of the group members' shared commitment to group tasks and the interpersonal attraction among them. This paper presents a multimodal analysis of group cohesion using a corpus of multi-party interactions. 16 two-minute segments annotated with cohesion data is used. We define three layers of modalities: non-verbal social cues, dialogue acts and interruptions. The initial analysis is performed at the individual level and later, we combine the different modalities to observe their impact on perceived level of cohesion. Results indicate that occurrence of laughter and interruption are higher in high cohesive segments. We also observed that, dialogue acts and head nods did not have an impact on the level of cohesion by itself. However, when combined there was an impact on the perceived level of cohesion. Overall, the analysis shows that multimodal cues are crucial for accurate analysis of group cohesion.
  • Uniform convergence rates for the approximated halfspace and projection depth
    • Nagy Stanislav
    • Dyckerhoff Rainer
    • Mozharovskyi Pavlo
    Electronic Journal of Statistics, Shaker Heights, OH : Institute of Mathematical Statistics, 2020, 14 (2). (10.1214/20-EJS1759)
    DOI : 10.1214/20-EJS1759
  • Quantitative Propagation of Chaos for SGD in Wide Neural Networks
    • de Bortoli Valentin
    • Durmus Alain
    • Fontaine Xavier
    • Şimşekli Umut
    , 2020. In this paper, we investigate the limiting behavior of a continuous-time counterpart of the Stochastic Gradient Descent (SGD) algorithm applied to two-layer overparameterized neural networks, as the number or neurons (ie, the size of the hidden layer) $N \to +\infty$. Following a probabilistic approach, we show 'propagation of chaos' for the particle system defined by this continuous-time dynamics under different scenarios, indicating that the statistical interaction between the particles asymptotically vanishes. In particular, we establish quantitative convergence with respect to $N$ of any particle to a solution of a mean-field McKean-Vlasov equation in the metric space endowed with the Wasserstein distance. In comparison to previous works on the subject, we consider settings in which the sequence of stepsizes in SGD can potentially depend on the number of neurons and the iterations. We then identify two regimes under which different mean-field limits are obtained, one of them corresponding to an implicitly regularized version of the minimization problem at hand. We perform various experiments on real datasets to validate our theoretical results, assessing the existence of these two regimes on classification problems and illustrating our convergence results.
  • Statistical and Topological Properties of Sliced Probability Divergences
    • Nadjahi Kimia
    • Durmus Alain
    • Chizat Lénaïc
    • Kolouri Soheil
    • Shahrampour Shahin
    • Şimşekli Umut
    , 2020. The idea of slicing divergences has been proven to be successful when comparing two probability measures in various machine learning applications including generative modeling, and consists in computing the expected value of a `base divergence' between one-dimensional random projections of the two measures. However, the computational and statistical consequences of such a technique have not yet been well-established. In this paper, we aim at bridging this gap and derive some properties of sliced divergence functions. First, we show that slicing preserves the metric axioms and the weak continuity of the divergence, implying that the sliced divergence will share similar topological properties. We then precise the results in the case where the base divergence belongs to the class of integral probability metrics. On the other hand, we establish that, under mild conditions, the sample complexity of the sliced divergence does not depend on the dimension, even when the base divergence suffers from the curse of dimensionality. We finally apply our general results to the Wasserstein distance and Sinkhorn divergences, and illustrate our theory on both synthetic and real data experiments.
  • Computing and Illustrating Query Rewritings on Path Views with Binding Patterns
    • Romero Julien
    • Preda Nicoleta
    • Amarilli Antoine
    • Suchanek Fabian
    , 2020. In this system demonstration, we study views with binding patterns, which are a formalization of REST Web services. Such views are database queries that can be evaluated using the service, but only if values for the input variables are provided. We investigate how to use such views to answer a complex user query, by rewriting it as an execution plan, i.e., an orchestration of calls to the views. In general, it is undecidable to determine whether a given user query can be answered with the available views. In this demo, we illustrate a particular scenario studied in our earlier work [11], where the problem is not only decidable, but has a particularly intuitive graphical solution. Our demo allows users to play with views defined by real Web services, and to animate the construction of execution plans visually. (10.1145/3340531.3417431)
    DOI : 10.1145/3340531.3417431
  • On the number of the rational zeros of linearized polynomials and the second-order nonlinearity of cubic Boolean functions.
    • Mesnager Sihem
    • Kim K.H.
    • Jo. M.S.
    Cryptography and Communications- Discrete Structures, Boolean Functions, and Sequences (CCDS),, 2020.
  • Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks
    • Şimşekli Umut
    • Sener Ozan
    • Deligiannidis George
    • Erdogdu Murat A.
    , 2020. Despite its success in a wide range of applications, characterizing the generalization properties of stochastic gradient descent (SGD) in non-convex deep learning problems is still an important challenge. While modeling the trajectories of SGD via stochastic differential equations (SDE) under heavy-tailed gradient noise has recently shed light over several peculiar characteristics of SGD, a rigorous treatment of the generalization properties of such SDEs in a learning theoretical framework is still missing. Aiming to bridge this gap, in this paper, we prove generalization bounds for SGD under the assumption that its trajectories can be well-approximated by a \emph{Feller process}, which defines a rich class of Markov processes that include several recent SDE representations (both Brownian or heavy-tailed) as its special case. We show that the generalization error can be controlled by the \emph{Hausdorff dimension} of the trajectories, which is intimately linked to the tail behavior of the driving process. Our results imply that heavier-tailed processes should achieve better generalization; hence, the tail-index of the process can be used as a notion of "capacity metric". We support our theory with experiments on deep neural networks illustrating that the proposed capacity metric accurately estimates the generalization error, and it does not necessarily grow with the number of parameters unlike the existing capacity metrics in the literature.
  • A Proof of the Beierle-Kranz-Leander Conjecture related to Lightweight Multiplication in $F_2^n$
    • Mesnager Sihem
    • Kim K. H.
    • Jo D.
    • Choe J.
    • Han M.
    • Lee D. N.
    Journal of Designs, Codes, and Cryptography, 2020.
  • Template-Based Graph Clustering
    • Riva M.
    • Yger F.
    • Gori P.
    • Cesar R.
    • Bloch Isabelle
    , 2020.
  • Stein's method for rough paths
    • Coutin Laure
    • Decreusefond Laurent
    Potential Analysis, Springer Verlag, 2020, 53, pp.387--406. The original Donsker theorem says that a standard random walk converges in distribution to a Brownian motion in the space of continuous functions. It has recently been extended to enriched random walks and enriched Brownian motion. We use the Stein-Dirichlet method to precise the rate of this convergence in the topology of fractional Sobolev spaces. (10.1007/s11118-019-09773-z)
    DOI : 10.1007/s11118-019-09773-z
  • Discrepancies of Measured SAR between Traditional and Fast Measuring Systems
    • Liu Zicheng
    • Allal Djamel
    • Cox Maurice
    • Wiart Joe
    International Journal of Environmental Research and Public Health, MDPI, 2020. Human exposure to mobile devices is traditionally measured by a system in which the human body (or head) is modelled by a phantom and the energy absorbed from the device is estimated based on the electric fields measured with a single probe. Such a system suffers from low efficiency due to repeated volumetric scanning within the phantom needed to capture the absorbed energy throughout the volume. To speed up the measurement, fast SAR (specific absorption rate) measuring systems have been developed. However, discrepancies of measured results are observed between traditional and fast measuring systems. In this paper, the discrepancies in terms of post-processing procedures after the measurement of electric field (or its amplitude) are investigated. Here, the concerned fast measuring system estimates SAR based on the reconstructed field of the region of interest while the amplitude and phase of the electric field are measured on a single plane with a probe array. The numerical results presented indicate that the fast SAR measuring system has the potential to yield more accurate estimations than the traditional system, but no conclusion can be made on which kind of system is superior without knowledge of the field-reconstruction algorithms and the emitting source. (10.3390/ijerph17062111)
    DOI : 10.3390/ijerph17062111
  • Stein's method for diffusive limit of queueing processes
    • Besançon Eustache
    • Decreusefond Laurent
    • Moyal Pascal
    Queueing Systems, Springer Verlag, 2020, 95, pp.173--201. Donsker Theorem is perhaps the most famous invariance principle result for Markov processes. It states that when properly normalized, a random walk behaves asymptotically like a Brownian motion. This approach can be extended to general Markov processes whose driving parameters are taken to a limit, which can lead to insightful results in contexts like large distributed systems or queueing networks. The purpose of this paper is to assess the rate of convergence in these so-called diffusion approximations, in a queueing context. To this end, we extend the functional Stein method introduced for the Brownian approximation of Poisson processes, to two simple examples: the single-server queue and the infinite-server queue. By doing so, we complete the recent applications of Stein's method to queueing systems, with results concerning the whole trajectory of the considered process, rather than its stationary distribution. (10.1007/s11134-020-09658-8)
    DOI : 10.1007/s11134-020-09658-8
  • A Surrogate Model Based on Artificial Neural Network for RF Radiation Modelling with High-Dimensional Data
    • Cheng Xi
    • Henry Clément
    • Andriulli Francesco
    • Person Christian
    • Wiart Joe
    International Journal of Environmental Research and Public Health, MDPI, 2020. This paper focuses on quantifying the uncertainty in the specific absorption rate values of the brain induced by the uncertain positions of the electroencephalography electrodes placed on the patient's scalp. To avoid running a large number of simulations, an artificial neural network architecture for uncertainty quantification involving high-dimensional data is proposed in this paper. The proposed method is demonstrated to be an attractive alternative to conventional uncertainty quantification methods because of its considerable advantage in the computational expense and speed. (10.3390/ijerph17072586)
    DOI : 10.3390/ijerph17072586
  • An analysis of the transfer learning of convolutional neural networks for artistic images
    • Gonthier Nicolas
    • Gousseau Yann
    • Ladjal Saïd
    , 2020. Transfer learning from huge natural image datasets, fine-tuning of deep neural networks and the use of the corresponding pre-trained networks have become de facto the core of art analysis applications. Nevertheless, the effects of transfer learning are still poorly understood. In this paper, we first use techniques for visualizing the network internal representations in order to provide clues to the understanding of what the network has learned on artistic images. Then, we provide a quantitative analysis of the changes introduced by the learning process thanks to metrics in both the feature and parameter spaces, as well as metrics computed on the set of maximal activation images. These analyses are performed on several variations of the transfer learning procedure. In particular, we observed that the network could specialize some pre-trained filters to the new image modality and also that higher layers tend to concentrate classes. Finally, we have shown that a double fine-tuning involving a medium-size artistic dataset can improve the classification on smaller datasets, even when the task changes. (10.1007/978-3-030-68796-0_39)
    DOI : 10.1007/978-3-030-68796-0_39
  • GAME-ON: A Multimodal Dataset for Cohesion and Group Analysis
    • Maman Lucien
    • Ceccaldi Eleonora
    • Lehmann-Willenbrock Nale
    • Likforman-Sulem Laurence
    • Chetouani Mohamed
    • Volpe Gualtiero
    • Varni Giovanna
    IEEE Access, IEEE, 2020, 8, pp.124185-124203. (10.1109/ACCESS.2020.3005719)
    DOI : 10.1109/ACCESS.2020.3005719
  • The Need to Move beyond Triples
    • Suchanek Fabian
    , 2020. Almost all major knowledge bases are concerned mainly with binary relationships between entities. In this vision paper, we argue that it is time to broaden this view: first to relations of higher arity, complex objects, and events, and then also to knowledge about knowledge: We should be able to represent why something is true, that something is not true, that something happened before something else, or that something is mainly believed. While this idea is as old as Artificial Intelligence itself, we argue that only now we have the tools to achieve it: a better understanding of our use-cases and large amounts of data. We survey relevant approaches, and point out avenues of research.
  • On the boomerang uniformity of quadratic permutations
    • Mesnager Sihem
    • Tang C.
    • Xiong M.
    Journal of Designs, Codes, and Cryptography, 2020.
  • Solving $x+x^{2^l}+\cdots+x^{2^{ml}}=a$ over $\mathbb{F}_{2^n}$.
    • Mesnager Sihem
    • Kim K.H.
    • Choe J.H.
    • Lee D.N.
    • Go D.S.
    Cryptography and Communications-Discrete Structures, Boolean Functions, and Sequences (CCDS), 2020.
  • Advances in Neural Information Processing Systems 32 (NeurIPS 2019)
    • Wallach Hanna M.
    • Larochelle Hugo
    • Beygelzimer Alina
    • d'Alché-Buc Florence
    • Fox Emily B.
    • Garnett Roman
    , 2020.
  • QFib: Fast and Efficient Brain Tractogram Compression
    • Mercier Corentin
    • Rousseau S.
    • Gori P.
    • Bloch Isabelle
    • Boubekeur T.
    Neuroinformatics, Springer, 2020, 18, pp.627-640. Diffusion MRI fiber tracking datasets can contain millions of 3D streamlines, and their representation can weight tens of gigabytes of memory. These sets of streamlines are called tractograms and are often used for clinical operations or research. Their size makes them difficult to store, visualize, process or exchange over the network. We propose a new compression algorithm well-suited for trac-tograms, by taking advantage of the way streamlines are obtained with usual tracking algorithms. Our approach is based on unit vector quantization methods combined with a spatial transformation which results in low compression and decompression times, as well as a high compression ratio. For instance, a 11.5GB tractogram can be compressed to a 1.02GB file and decompressed in 11.3 seconds. Moreover, our method allows for the compression and decompression of individual streamlines, reducing the need for a costly out-of-core algorithm with heavy datasets. Last, we open a way toward on-the-fly compression and decompression for handling larger datasets without needing a load of RAM (i.e. in-core handling), faster network exchanges and faster loading times for visualization or processing.
  • Fidelity metrics between curves and surfaces: currents, varifolds, and normal cycles
    • Charon Nicolas
    • Charlier Benjamin
    • Glaunès Joan
    • Gori Pietro
    • Roussillon Pierre
    , 2020, pp.441-477. This chapter provides an overview of some mathematical and computational models that have been proposed over the past few years for defining data attachment terms on shape spaces of curves or surfaces. In all these models shapes are seen as elements of a space of generalized distributions, such as currents or varifolds. Then norms are defined through reproducing kernel Hilbert spaces (RKHS), which lead to shape distances that can be conveniently computed in practice. These were originally introduced in conjunction with diffeomorphic methods in computational anatomy and have indeed proved to be very efficient in this field. We provide a basic description of these different models and their practical implementation, then discuss the respective properties and potential advantages or downsides of each of them in diffeomorphic registration problems. (10.1016/B978-0-12-814725-2.00021-2)
    DOI : 10.1016/B978-0-12-814725-2.00021-2
  • Matrix Factorization for High Frequency Non Intrusive Load Monitoring
    • Henriet Simon
    • Fuentes Benoît
    • Şimşekli Umut
    • Richard Gael
    , 2020, pp.20-24. Non Intrusive Load Monitoring has been introduced 30 years ago in order to monitor the electric consumption of specific equipments inside a building without the need of installing multiples sensors. During three decades, researchers and industrials have described the NILM problems according to the electric data available, the desired quantity to be monitored and the application it was used for. As a consequence of the multitude of choices, a lot of different formulations can be found in the literature. This diversity makes it difficult for researchers from general domains such as machine learning to tackle the NILM problem. In this paper we aim at defining the NILM problem as a Matrix Factorization task using high frequency measurements and also to review methods to solve this problem. We start by defining the general concepts driving the NILM problem and then show how to cast high frequency NILM into a Matrix Factorization problem. Once casted as a machine learning problem, we will review general purposes algorithms applicable to this problem such as Independent Component Analysis, Sparse Coding or Semi Non-negative Matrix Factorization and specific NILM methods such as BOLT and IVMF. (10.1145/3427771.3427847)
    DOI : 10.1145/3427771.3427847
  • Introducing coherent MIMO sensing, a fading-resilient, polarization-independent approach to ϕ-OTDR
    • Guerrier Sterenn
    • Dorize Christian
    • Awwad Elie
    • Renaudier Jeremie
    Optics Express, Optical Society of America - OSA Publishing, 2020, 28 (14), pp.21081. (10.1364/OE.396460)
    DOI : 10.1364/OE.396460