Publications

2023

Zero-shot spatial layout conditioning for text-to-image diffusion models
- Couairon Guillaume
- Careil Marlène
- Cord Matthieu
- Lathuilière Stéphane
- Verbeek Jakob
, 2023. Large-scale text-to-image diffusion models have significantly improved the state of the art in generative image modelling and allow for an intuitive and powerful user interface to drive the image generation process. Expressing spatial constraints, e.g. to position specific objects in particular locations, is cumbersome using text, and current text-based image generation models are not able to accurately follow such instructions. In this paper we consider image generation from text associated with segments on the image canvas, which combines an intuitive natural language interface with precise spatial control over the generated content. We propose ZestGuide, a zero-shot segmentation guidance approach that can be plugged into pre-trained text-to-image diffusion models, and does not require any additional training. It leverages implicit segmentation maps that can be extracted from cross-attention layers, and uses them to align the generation with input masks. Our experimental results combine high image quality with accurate alignment of generated content with input segmentations, and improve over prior work both quantitatively and qualitatively, including methods that require training on images with corresponding segmentations. Compared to Paint with Words, the previous state of the art in image generation with zero-shot segmentation conditioning, we improve by 5 to 10 mIoU points on the COCO dataset with similar FID scores.
Visualization Empowerment: How to Teach and Learn Data Visualization
- Bach Benjamin
- Carpendale Sheelagh
- Hinrichs Uta
- Huron Samuel
, 2023. Data visualization is becoming an important asset for a data-literate, informed, and critical society. Despite the variety of existing resources to teach theories and practical skills in this domain, little is known about 1) how learning processes in the context of visualization unfold and 2) best practices for engaging and teaching data visualization to diverse audiences and in different contexts. This Dagstuhl Seminar invited practitioners, researchers, and teachers from the areas of visualization, design, education and cognitive psychology to explore these questions from multiple perspectives. Through a range of practical activities, talks, and discussions, we have begun characterizing and classifying teaching methodologies. We have drafted a pedagogical manifesto, and started formalizing the concept of improvisation with visualization in the context of teaching and learning. We have also interrogated creativity as an important aspect of visualization teaching and learning and explored links between data physicalization and visualization teaching activities. Across these different themes, we have begun to map out the challenges of visualization teaching and learning and the opportunities for research and practice in this area. (10.4230/DagRep.12.6.83)
DOI : 10.4230/DagRep.12.6.83
Parallelizable Synthesis of Arbitrary Single-Qubit Gates with Linear Optics and Time-Frequency Encoding
- Henry Antoine
- Raghunathan Ravi
- Ricard Guillaume
- Lefaucher Baptiste
- Miatto Filippo
- Belabas Nadia
- Zaquine Isabelle
- Alléaume Romain
Physical Review A, American Physical Society, 2023, 107, pp.062610. We propose novel methods for the exact synthesis of single-qubit unitaries with high success probability and gate fidelity, considering both time-bin and frequency-bin encodings. The proposed schemes are experimentally implementable with a spectral linear-optical quantum computation (S-LOQC) platform, composed of electro-optic phase modulators (EOMs) and phase-only programmable filters (pulse shapers). We assess the performance, in terms of fidelity and probability, of the two simplest 3-component configurations for arbitrary gate generation in both encodings and give an exact analytical solution for the synthesis of an arbitrary single-qubit unitary in the time-bin encoding, using a single-tone Radio Frequency (RF) driving of the EOMs. We further investigate the parallelization of arbitrary single-qubit gates over multiple qubits with a compact experimental setup, both for spectral and temporal encodings. We systematically evaluate and discuss the impact of the RF bandwidth (which conditions the number of tones driving the modulators) and of the choice of encoding for different targeted gates. We moreover quantify the number of high-fidelity Hadamard gates that can be synthesized in parallel, with minimal and increasing resources in terms of driving RF tones in a realistic system. Our analysis positions spectral S-LOQC as a promising platform to conduct massively parallel single-qubit operations, with potential applications to quantum metrology and quantum tomography. (10.1103/PhysRevA.107.062610)
DOI : 10.1103/PhysRevA.107.062610
Pseudo-Bayesian Approach for Robust Mode Detection and Extraction Based on the STFT
- Legros Quentin
- Fourer Dominique
Sensors, MDPI, 2023, 23 (1), pp.85. This paper addresses the problem of disentangling nonoverlapping multicomponent signals from an observation that may be contaminated by external additive noise. We aim to extract and retrieve the elementary components (also called modes) present in an observed nonstationary mixture signal. To this end, we propose a new pseudo-Bayesian algorithm to estimate the instantaneous frequency of the signal modes from their time-frequency representation. Second, a detection algorithm is developed to restrict the time region in which each signal component is present, to enhance the quality of the reconstructed signal. We finally deal with the presence of noise in the vicinity of the estimated instantaneous frequency by introducing a new reconstruction approach relying on nonbinary band-pass synthesis filters. We validate our methods by comparing their reconstruction performance to state-of-the-art approaches through several experiments involving both synthetic and real-world data under different experimental conditions. (10.3390/s23010085)
DOI : 10.3390/s23010085
Interactive Depixelization of Pixel Art through Spring Simulation
- Matusovic Marko
- Parakkat Amal Dev
- Eisemann Elmar
Computer Graphics Forum, Wiley, 2023, 42 (2). We introduce an approach for converting pixel art into high-quality vector images. While much progress has been made on automatic conversion, there is an inherent ambiguity in pixel art, which can lead to a mismatch with the artist’s original intent. Further, there is room for incorporating aesthetic preferences during the conversion. In consequence, this work introduces an interactive framework to enable users to guide the conversion process towards high-quality vector illustrations. A key idea of the method is to cast the conversion process into a spring-system optimization that can be influenced by the user. Hereby, it is possible to resolve various ambiguities that cannot be handled by an automatic algorithm.
The Software Heritage License Dataset (2022 Edition)
- González-Barahona Jesús M.
- Montes-Leon Sergio
- Robles Gregorio
- Zacchiroli Stefano
Empirical Software Engineering, Springer Verlag, 2023. Context: When software is released publicly, it is common to include with it either the full text of the license or licenses under which it is published, or a detailed reference to them. Therefore public licenses, including FOSS (free, open source software) licenses, are usually publicly available in source code repositories. Objective: To compile a dataset containing as many documents as possible that contain the text of software licenses, or references to the license terms. Once compiled, characterize the dataset so that it can be used for further research, or practical purposes related to license analysis. Method: Retrieve from Software Heritage (the largest publicly available archive of FOSS source code) all versions of all files whose names are commonly used to convey licensing terms. All retrieved documents will be characterized in various ways, using automated and manual analyses. Results: The dataset consists of 6.9 million unique license files. Additional metadata about shipped license files is also provided, making the dataset ready to use in various contexts, including: file length measures, MIME type, SPDX license (detected using ScanCode), and oldest appearance. The results of a manual analysis of 8102 documents are also included, providing a ground truth for further analysis. The dataset is released as open data as an archive file containing all deduplicated license files, plus several portable CSV files with metadata, referencing files via cryptographic checksums. Conclusions: Thanks to the extensive coverage of Software Heritage, the dataset presented in this paper covers a very large fraction of all software licenses for public code. We have assembled a large body of software licenses, characterized it quantitatively and qualitatively, and validated that it is mostly composed of licensing information and includes almost all known license texts.
The dataset can be used to conduct empirical studies on open source licensing, training of automated license classifiers, natural language processing (NLP) analyses of legal texts, as well as historical and phylogenetic studies on FOSS licensing. It can also be used in practice to improve tools detecting licenses in source code. (10.1007/s10664-023-10377-w)
DOI : 10.1007/s10664-023-10377-w
Provably Efficient Learning of Phases of Matter via Dissipative Evolutions
- Onorati Emilio
- Rouzé Cambyse
- Watson James
- Stilck França Daniel
, 2023. The combination of quantum many-body and machine learning techniques has recently proved to be a fertile ground for new developments in quantum computing. Several works have shown that it is possible to classically efficiently predict the expectation values of local observables on all states within a phase of matter using a machine learning algorithm after learning from data obtained from other states in the same phase. However, existing results are restricted to phases of matter such as ground states of gapped Hamiltonians and Gibbs states that exhibit exponential decay of correlations. In this work, we drop this requirement and show how it is possible to learn local expectation values for all states in a phase, where we adopt the Lindbladian phase definition by Coser & Pérez-García [Coser & Pérez-García, Quantum 3, 174 (2019)], which defines states to be in the same phase if we can drive one to the other rapidly with a local Lindbladian. This definition encompasses the better-known Hamiltonian definition of phase of matter for gapped ground state phases, and further applies to any family of states connected by short unitary circuits, as well as non-equilibrium phases of matter, and those stable under external dissipative interactions. Under this definition, we show that $N = O(\log(n/\delta)\, 2^{\mathrm{polylog}(1/\epsilon)})$ samples suffice to learn local expectation values within a phase for a system with $n$ qubits, to error $\epsilon$ with failure probability $\delta$. This sample complexity is comparable to previous results on learning gapped and thermal phases, and it encompasses previous results of this nature in a unified way. Furthermore, we also show that we can learn families of states which go beyond the Lindbladian definition of phase, and we derive bounds on the sample complexity which are dependent on the mixing time between states under a Lindbladian evolution. (10.48550/arXiv.2311.07506)
DOI : 10.48550/arXiv.2311.07506
Cardiac Adipose Tissue Segmentation via Image-Level Annotations
- Huang Ziyi
- Gan Yu
- Lye Theresa
- Liu Yanchen
- Zhang Haofeng
- Laine Andrew
- Angelini Elsa
- Hendon Christine
IEEE Journal of Biomedical and Health Informatics, Institute of Electrical and Electronics Engineers, 2023, pp.1-12. (10.1109/JBHI.2023.3263838)
DOI : 10.1109/JBHI.2023.3263838
Sparse Graph Neural Networks with Scikit-network
- Bonald Thomas
- Delarue Simon
, 2023. In recent years, Graph Neural Networks (GNNs) have undergone rapid development and have become an essential tool for building representations of complex relational data. Large real-world graphs, characterised by sparsity in relations and features, necessitate dedicated tools that existing dense tensor-centred approaches cannot easily provide. To address this need, we introduce a GNN module in Scikit-network, a Python package for graph analysis, leveraging sparse matrices for both graph structures and features. Our contribution enhances GNN efficiency without requiring access to significant computational resources, unifies graph analysis algorithms and GNNs in the same framework, and prioritises user-friendliness.
Few-shot Semantic Image Synthesis with Class Affinity Transfer
- Careil Marlène
- Verbeek Jakob
- Lathuilière Stéphane
, 2023. Semantic image synthesis aims to generate photorealistic images given a semantic segmentation map. Despite much recent progress, training such models still requires large datasets of images annotated with per-pixel label maps that are extremely tedious to obtain. To alleviate the high annotation cost, we propose a transfer method that leverages a model trained on a large source dataset to improve the learning ability on small target datasets via estimated pairwise relations between source and target classes. The class affinity matrix is introduced as a first layer to the source model to make it compatible with the target label maps, and the source model is then further finetuned for the target domain. To estimate the class affinities we consider different approaches to leverage prior knowledge: semantic segmentation on the source domain, textual label embeddings, and self-supervised vision features. We apply our approach to GAN-based and diffusion-based architectures for semantic synthesis. Our experiments show that the different ways to estimate class affinity can be effectively combined, and that our approach significantly improves over existing state-of-the-art transfer approaches for generative image models.
Learning Raw Image Denoising Using a Parametric Color Image Model
- Achddou Raphaël
- Gousseau Yann
- Ladjal Saïd
, 2023. Deep learning methods for image restoration have produced impressive results over recent years. Nevertheless, they generalize poorly and need large learning image datasets to be collected for each new acquisition modality. To avoid building such datasets, it has recently been proposed to develop synthetic image datasets for training image restoration methods, using scale-invariant dead leaves models. While the geometry of such models can be successfully encoded with only a few parameters, the color content cannot be straightforwardly encoded. In this paper, we leverage the color lines prior to build a lightweight parametric color model relying on a chromaticity/luminance factorization. Further, we show that the corresponding synthetic dataset can be used to train neural networks for the denoising of RAW images from different camera phones, without using any image from these devices. This shows the potential of our approach to increase the generalization capacity of learning-based denoising approaches in real-world scenarios.
Face Aging via Diffusion-based Editing
- Chen Xiangyi
- Lathuilière Stéphane
, 2023. In this paper, we address the problem of face aging: generating past or future facial images by incorporating age-related changes to the given face. Previous aging methods rely solely on human facial image datasets and are thus constrained by their inherent scale and bias. This restricts them to a limited range of generatable ages and leaves them unable to handle large age gaps. We propose FADING, a novel approach to address Face Aging via DIffusion-based editiNG. We go beyond existing methods by leveraging the rich prior of large-scale language-image diffusion models. First, we specialize a pre-trained diffusion model for the task of face age editing by using an age-aware fine-tuning scheme. Next, we invert the input image to latent noise and obtain optimized null text embeddings. Finally, we perform text-guided local age editing via attention control. The quantitative and qualitative analyses demonstrate that our method outperforms existing approaches with respect to aging accuracy, attribute preservation, and aging quality.
Majorana stellar representation of twisted photons
- Fabre Nicolas
- Klimov Andrei B
- Murenzi Romain
- Gazeau Jean-Pierre
- Sánchez-Soto Luis L
Physical Review Research, American Physical Society, 2023, 5 (3), pp.L032006. Majorana stellar representation, which visualizes a quantum spin as points on the Bloch sphere, allows quantum mechanics to accommodate the concept of trajectory, the hallmark of classical physics. We extend this notion to the discrete cylinder, which is the phase space of the canonical pair angle and orbital angular momentum. We demonstrate that the geometrical properties of the ensuing constellations aptly encapsulate the quantumness of the state. (10.1103/PhysRevResearch.5.L032006)
DOI : 10.1103/PhysRevResearch.5.L032006
Runaway signals: Exaggerated displays of commitment may result from second-order signaling
- Lie-Panis Julien
- Dessalles Jean-Louis
Journal of Theoretical Biology, Elsevier, 2023, 572, pp.111586. To demonstrate their commitment, for instance during wartime, members of a group will sometimes all engage in the same ruinous display. Such uniform, high-cost signals are hard to reconcile with standard models of signaling. For signals to be stable, they should honestly inform their audience; yet, uniform signals are trivially uninformative. To explain this phenomenon, we design a simple model, which we call the signal runaway game. In this game, senders can express outrage at non-senders. Outrage functions as a second-order signal. By expressing outrage at non-senders, senders draw attention to their own signal, and benefit from its increased visibility. Using our model and a simulation, we show that outrage can stabilize uniform signals, and can lead signal costs to run away. Second-order signaling may explain why groups sometimes demand displays of commitment from all their members, and why these displays can entail extreme costs. (10.1016/j.jtbi.2023.111586)
DOI : 10.1016/j.jtbi.2023.111586
Monotonic Alpha-divergence Minimisation for Variational Inference
- Daudel Kamélia
- Douc Randal
- Roueff François
Journal of Machine Learning Research, Microtome Publishing, 2023, 24 (62), pp.1-76. In this paper, we introduce a novel family of iterative algorithms which carry out $\alpha$-divergence minimisation in a Variational Inference context. They do so by ensuring a systematic decrease at each step in the $\alpha$-divergence between the variational and the posterior distributions. In its most general form, the variational distribution is a mixture model and our framework allows us to simultaneously optimise the weights and component parameters of this mixture model. Our approach permits us to build on various methods previously proposed for $\alpha$-divergence minimisation such as Gradient or Power Descent schemes and we also shed new light on an integrated Expectation Maximization algorithm. Lastly, we provide empirical evidence that our methodology yields improved results on several multimodal target distributions and on a real data example.
Stein's method for discrete alpha stable point processes
- Decreusefond Laurent
- Vasseur Aurélien
, 2023.
Describing movement learning using metric learning
- Loriette Antoine
- Liu Wanyu
- Bevilacqua Frédéric
- Caramiaux Baptiste
PLoS ONE, Public Library of Science, 2023, 18 (2), pp.e0272509. Analysing movement learning can rely on human evaluation, e.g. annotating video recordings, or on computational means, i.e. applying metrics to behavioural data. However, it remains challenging to relate human perception of movement similarity to computational measures that aim at modelling such similarity. In this paper, we propose a metric learning method bridging the gap between human ratings of movement similarity in a motor learning task and computational metric evaluation on the same task. It applies metric learning on a Dynamic Time Warping algorithm to derive an optimal set of movement features that best explain human ratings. We evaluated this method on an existing movement dataset, which comprises videos of participants practising a complex gesture sequence toward a target template, as well as the collected data that describes the movements. We show that it is possible to establish a linear relationship between human ratings and our learned computational metric. This learned metric can be used to describe the most salient temporal moments implicitly used by annotators, as well as movement parameters that correlate with motor improvements in the dataset. We conclude with possibilities to generalise this method for designing computational tools dedicated to movement annotation and evaluation of skill learning. (10.1371/journal.pone.0272509)
DOI : 10.1371/journal.pone.0272509
On the Hardness of Module Learning with Errors with Short Distributions
- Boudgoust Katharina
- Jeudy Corentin
- Roux-Langlois Adeline
- Wen Weiqiang
Journal of Cryptology, Springer Verlag, 2023, 36 (1), pp.1-70. The Module Learning With Errors (M-LWE) problem is a core computational assumption of lattice-based cryptography which offers an interesting trade-off between guaranteed security and concrete efficiency. The problem is parameterized by a secret distribution as well as an error distribution. There is a gap between the choices of those distributions for theoretical hardness results (standard formulation of M-LWE, i.e., uniform secret modulo $q$ and Gaussian error) and practical schemes (small bounded secret and error). In this work, we make progress towards narrowing this gap. More precisely, we prove that M-LWE with uniform $\eta$-bounded secret for any $1 \leq \eta \ll q$ and Gaussian error, in both its search and decision variants, is at least as hard as the standard formulation of M-LWE, provided that the module rank $d$ is at least logarithmic in the ring degree $n$. We also prove that the search version of M-LWE with large uniform secret and uniform $\eta$-bounded error is at least as hard as the standard M-LWE problem, if the number of samples $m$ is close to the module rank $d$ and with further restrictions on $\eta$. The latter result can be extended to provide the hardness of search M-LWE with uniform $\eta$-bounded secret and error under specific parameter conditions. Overall, the results apply to all cyclotomic fields, but most of the intermediate results are proven in more general number fields. (10.1007/s00145-022-09441-3)
DOI : 10.1007/s00145-022-09441-3
Bendima: a database for marine macro-invertebrate bycatch data designed to improve reproducibility in benthic ecology
- Martin Alexis
- Blettery Jonathan
- Dettaï Agnès
- Rosset Nicolas
- Gousseau Yann
Cybium : Revue Internationale d’Ichtyologie, Paris : Muséum national d'histoire naturelle, 2023. The difficulty of identifying marine macro-invertebrates and the lack of experts, added to the growing use of complex modeling approaches based on massive datasets, have led to a reproducibility crisis in benthic ecology. Improving the reliability of identification remains a key factor to increase the quality of raw data. We developed the database Bendima to manage benthic macro-invertebrate bycatch data from the scientific survey of the French Southern Ocean and Indian Ocean fisheries. This database is structured to store observations of macro-invertebrates in the form of images of the caught organisms associated with sampling effort data and molecular data, which allows for ongoing amendments to identifications and cross-referencing with barcode data. Once uploaded and stored as digital images, the Bendima observations data underpinning models can be fully assessed, criticized and compared. Here, we describe the Bendima system and provide an overview of the contents for teams involved in biodiversity database development, benthic ecology or fisheries monitoring. (10.26028/cybium/2023-020)
DOI : 10.26028/cybium/2023-020
