Skip to Main content Skip to Navigation

Deep kernel representation learning for complex data and reliability issues

Abstract : The first part of this thesis aims at exploring deep kernel architectures for complex data. One of the known keys to the success of deep learning algorithms is the ability of neural networks to extract meaningful internal representations. However, the theoretical understanding of why these compositional architectures are so successful remains limited, and deep approaches are almost restricted to vectorial data. On the other hand, kernel methods provide with functional spaces whose geometry are well studied and understood. Their complexity can be easily controlled, by the choice of kernel or penalization. In addition, vector-valued kernel methods can be used to predict kernelized data. It then allows to make predictions in complex structured spaces, as soon as a kernel can be defined on it.The deep kernel architecture we propose consists in replacing the basic neural mappings functions from vector-valued Reproducing Kernel Hilbert Spaces (vv-RKHSs). Although very different at first glance, the two functional spaces are actually very similar, and differ only by the order in which linear/nonlinear functions are applied. Apart from gaining understanding and theoretical control on layers, considering kernel mappings allows for dealing with structured data, both in input and output, broadening the applicability scope of networks. We finally expose works that ensure a finite dimensional parametrization of the model, opening the door to efficient optimization procedures for a wide range of losses.The second part of this thesis investigates alternatives to the sample mean as substitutes to the expectation in the Empirical Risk Minimization (ERM) paradigm. Indeed, ERM implicitly assumes that the empirical mean is a good estimate of the expectation. However, in many practical use cases (e.g. heavy-tailed distribution, presence of outliers, biased training data), this is not the case.The Median-of-Means (MoM) is a robust mean estimator constructed as follows: the original dataset is split into disjoint blocks, empirical means on each block are computed, and the median of these means is finally returned. We propose two extensions of MoM, both to randomized blocks and/or U-statistics, with provable guarantees. By construction, MoM-like estimators exhibit interesting robustness properties. This is further exploited by the design of robust learning strategies. The (randomized) MoM minimizers are shown to be robust to outliers, while MoM tournament procedure are extended to the pairwise setting.We close this thesis by proposing an ERM procedure tailored to the sample bias issue. If training data comes from several biased samples, computing blindly the empirical mean yields a biased estimate of the risk. Alternatively, from the knowledge of the biasing functions, it is possible to reweight observations so as to build an unbiased estimate of the test distribution. We have then derived non-asymptotic guarantees for the minimizers of the debiased risk estimate thus created. The soundness of the approach is also empirically endorsed.
Document type :
Complete list of metadata

Cited literature [224 references]  Display  Hide  Download
Contributor : Abes Star :  Contact
Submitted on : Friday, July 17, 2020 - 11:29:09 AM
Last modification on : Thursday, December 10, 2020 - 4:49:38 PM
Long-term archiving on: : Tuesday, December 1, 2020 - 12:00:57 AM


Version validated by the jury (STAR)


  • HAL Id : tel-02901469, version 1



Pierre Laforgue. Deep kernel representation learning for complex data and reliability issues. Machine Learning [stat.ML]. Institut Polytechnique de Paris, 2020. English. ⟨NNT : 2020IPPAT006⟩. ⟨tel-02901469⟩



Record views


Files downloads