A PhD student at the University of Tübingen and MPI Tübingen, working on Machine Learning and Massively Parallel Computing with applications in Computer Graphics and Computer Vision.

Projects

Will People Like Your Image?
pre-print (2016)

Katharina Schwarz*, Patrick Wieschollek*, Hendrik P.A. Lensch
The wide availability of digital devices and cheap storage allows us to take series of photos to make sure we do not miss any specific beautiful moment. However, the enormous and constantly growing image collection makes it quite time-consuming to manually pick the best shots afterward. Even more challenging, finding the most aesthetically pleasing images that might also be worth sharing is a largely subjective task in which general rules rarely apply. Nowadays, online platforms allow users to "like" or favor particular content with a single click. As we aim to predict the aesthetic quality of images, we make use of such multi-user agreements. More precisely, we assemble a large data set of 380K images with associated meta information and derive a score that rates how visually pleasing a given photo is. Further, to predict the aesthetic quality of any arbitrary image or video, we cast the obtained model as a deep learning problem. Our proposed model of aesthetics is validated in a user study. We demonstrate our results on applications for re-sorting photo collections, capturing the best shot on mobile devices, and aesthetic key-frame extraction from videos.
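One way such a multi-user agreement score can be turned into a rating is a smoothed favorite rate; the following is a minimal Python sketch, where the prior constants and the log scaling are illustrative assumptions rather than the paper's exact formula:

```python
import math

def aesthetic_score(faves, views, prior_rate=0.01, prior_strength=100.0):
    """Smoothed favorite rate, shrunk toward a global prior.

    `faves` and `views` are per-image counts from an online platform; the
    prior keeps images with very few views from dominating the ranking.
    Constants and log scaling are illustrative, not the paper's formula.
    """
    rate = (faves + prior_rate * prior_strength) / (views + prior_strength)
    return math.log10(rate / prior_rate)  # 0 = average, >0 = above average
```

An image faved 50 times in 1000 views then ranks above one faved once in 1000 views, while an image with hardly any views stays close to the neutral score.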

Robust Large-scale Video Synchronization without Annotations
pre-print (2016)

Patrick Wieschollek, Ido Freeman, Hendrik P.A. Lensch
Aligning video sequences is a fundamental yet still unsolved component for a broad range of applications in computer graphics and vision. However, most image processing methods cannot be directly applied to related video problems due to the sheer amount of underlying data -- in our case 1.75 TB of raw video data. Using recent advances in deep learning, we present a scalable and robust method for detecting and computing optimal non-linear temporal video alignments. The presented algorithm learns to retrieve and match similar video frames from input sequences without any human interaction or additional label annotations. An iterative scheme is presented which leverages the nature of the videos themselves to remove the need for labels. While previous methods are limited to short video sequences and assume similar settings in vegetation, season, and illumination, our approach robustly aligns videos recorded months apart.
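The non-linear temporal alignment itself can be computed with dynamic time warping over a matrix of pairwise frame distances. A minimal sketch (in the paper the distances come from learned frame embeddings; any pairwise distance matrix works here):

```python
import numpy as np

def dtw_align(dist):
    """Optimal non-linear temporal alignment via dynamic time warping.

    `dist[i, j]` is a distance between frame i of video A and frame j of
    video B.  Returns the optimal monotonic frame correspondence as a
    list of (i, j) pairs.
    """
    n, m = dist.shape
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # backtrack from the end to recover the warping path
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1],
                              cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]
```

For sequences of length n and m this warping step is O(nm); the hard part at this scale is making the frame distances themselves reliable, which is what the learned retrieval addresses.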

End-to-End Learning for Image Burst Deblurring
Asian Conference on Computer Vision (ACCV) 2016 [oral presentation]

Patrick Wieschollek, Bernhard Schölkopf, Hendrik P.A. Lensch, Michael Hirsch
We present a neural network approach for multi-frame blind deconvolution. The discriminative approach adapts and combines two recent techniques for image deblurring into a single neural network architecture. Our proposed hybrid architecture combines the explicit prediction of a deconvolution filter with non-trivial averaging of Fourier coefficients in the frequency domain. To make full use of the information contained in all images of one burst, the proposed network embeds smaller networks which explicitly allow the model to transfer information between images in early layers. Our system is trained end-to-end using standard backpropagation on a set of artificially generated training examples, achieving competitive performance in multi-frame blind deconvolution in both quality and runtime.
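The frequency-domain averaging that the network learns to improve upon can be illustrated by its hand-crafted predecessor, Fourier burst accumulation, where each frequency is averaged across the burst with magnitude-based weights (a sketch; the exponent `p` is an illustrative choice):

```python
import numpy as np

def fourier_burst_average(burst, p=11):
    """Fourier burst accumulation-style averaging (illustrative).

    Each frequency coefficient is averaged across the burst, weighted by
    its magnitude: frequencies that survived the blur in some frame get
    a higher weight in that frame.  The exponent p controls how strongly
    the sharpest frame per frequency dominates.  This is a hand-crafted
    baseline; the paper replaces such fixed rules with learned weights.
    """
    F = np.fft.fft2(np.asarray(burst, dtype=np.float64), axes=(-2, -1))
    w = np.abs(F) ** p
    w /= w.sum(axis=0, keepdims=True) + 1e-12
    return np.real(np.fft.ifft2((w * F).sum(axis=0)))
```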
Figure: a random shot of 5 blurry input images and the result of our deblurring approach (interactive before/after comparison).
@inproceedings{accv2016/Wieschollek,
  author    = {Patrick Wieschollek and Bernhard Sch{\"{o}}lkopf and Hendrik P. A. Lensch and Michael Hirsch},
  title     = {End-to-End Learning for Image Burst Deblurring},
  booktitle = {Asian Conference on Computer Vision (ACCV)},
  month     = {November},
  year      = {2016}
}

Efficient Large-scale Approximate Nearest Neighbor Search on the GPU
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016

Patrick Wieschollek, Oliver Wang, Alexander Sorkine-Hornung, Hendrik P.A. Lensch
We present a new approach for efficient approximate nearest neighbor (ANN) search in high-dimensional spaces, extending the idea of Product Quantization. We propose a two-level product and vector quantization tree that reduces the number of vector comparisons required during tree traversal. Our approach also includes a novel, highly parallelizable re-ranking method for candidate vectors that efficiently reuses already computed intermediate values. Due to its small memory footprint during traversal, the method lends itself to an efficient, parallel GPU implementation. This Product Quantization Tree (PQT) approach significantly outperforms recent state-of-the-art methods for high-dimensional nearest neighbor queries on standard reference datasets. Ours is the first work to demonstrate GPU performance superior to CPU performance on high-dimensional, large-scale ANN problems in time-critical real-world applications, such as loop-closing in videos.
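The underlying product-quantization machinery (encoding vectors per subspace, then scoring queries against the database via small precomputed lookup tables) can be sketched as follows; the toy k-means and all parameters are illustrative, not the paper's PQT configuration:

```python
import numpy as np

def pq_train(X, n_sub=4, ks=16, iters=10, seed=0):
    """Train a product quantizer: split dimensions into n_sub chunks and
    run a few Lloyd iterations with ks centroids per chunk (toy k-means)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1] // n_sub
    books = []
    for s in range(n_sub):
        Xs = X[:, s * d:(s + 1) * d]
        C = Xs[rng.choice(len(Xs), ks, replace=False)]
        for _ in range(iters):
            a = np.argmin(((Xs[:, None] - C[None]) ** 2).sum(-1), axis=1)
            for k in range(ks):
                if (a == k).any():
                    C[k] = Xs[a == k].mean(0)
        books.append(C)
    return books

def pq_encode(X, books):
    """Compress each vector to one centroid index per subspace."""
    d = X.shape[1] // len(books)
    return np.stack(
        [np.argmin(((X[:, None, s*d:(s+1)*d] - C[None]) ** 2).sum(-1), 1)
         for s, C in enumerate(books)], axis=1)

def pq_adc(q, codes, books):
    """Asymmetric distances: precompute one small table of query-to-centroid
    distances per subspace, then score every database code by table lookups
    only -- the reuse of intermediate values that makes batching cheap."""
    d = len(q) // len(books)
    tables = [((C - q[s*d:(s+1)*d]) ** 2).sum(1) for s, C in enumerate(books)]
    return sum(t[codes[:, s]] for s, t in enumerate(tables))
```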
@inproceedings{cvpr2016/Wieschollek,
  author    = {Patrick Wieschollek and Oliver Wang and Alexander Sorkine-Hornung and Hendrik P.A. Lensch},
  title     = {Efficient Large-scale Approximate Nearest Neighbor Search on the GPU},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2016}
}

Backpropagation Training for Fisher Vectors within Neural Networks
pre-print (2016)

Patrick Wieschollek, Fabian Groh, Hendrik P.A. Lensch
Fisher vectors (FV) encode higher-order statistics of a set of local descriptors, such as SIFT features. They already show good performance in combination with shallow learning architectures on visual recognition tasks. Current methods using FV as a feature descriptor in deep architectures assume that all original input features are static. We propose a framework to jointly learn the representation of original features, FV parameters, and parameters of the classifier in the style of traditional neural networks. Our proof-of-concept implementation improves the performance of FV on the PASCAL VOC 2007 challenge in a multi-GPU setting compared to a default SVM baseline. We demonstrate that FV can be embedded into neural networks at arbitrary positions, allowing end-to-end training with backpropagation.
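For context, the Fisher vector components with respect to the GMM means can be sketched as follows (an illustrative subset of the full encoding, which also includes gradients with respect to the mixture weights and variances):

```python
import numpy as np

def fisher_vector_means(X, w, mu, sigma):
    """Fisher vector w.r.t. the GMM means only (illustrative subset).

    X: (N, D) local descriptors; w: (K,) mixture weights;
    mu, sigma: (K, D) per-component means / standard deviations.
    """
    # posterior (soft assignment) gamma[i, k] of descriptor i to component k
    logp = (np.log(w)[None]
            - 0.5 * (((X[:, None] - mu[None]) / sigma[None]) ** 2).sum(-1)
            - np.log(sigma).sum(-1)[None])
    gamma = np.exp(logp - logp.max(1, keepdims=True))
    gamma /= gamma.sum(1, keepdims=True)
    # gradient of the average log-likelihood w.r.t. each mean
    G = (gamma[..., None] * (X[:, None] - mu[None]) / sigma[None]).sum(0)
    return (G / (len(X) * np.sqrt(w)[:, None])).ravel()
```

Because every step is differentiable in X, mu, sigma, and w, such an encoding layer can be placed inside a network and trained by backpropagation, which is the point of the paper.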

Transfer Learning for Material Classification using Convolutional Networks
pre-print (2015)

Patrick Wieschollek, Hendrik P.A. Lensch
Material classification in natural settings is challenging due to the complex interplay of geometry, reflectance properties, and illumination. Previous work on material classification relies strongly on hand-engineered features of visual samples. In this work we use a convolutional neural network (convnet) that learns descriptive features for the specific task of material recognition. Specifically, transfer learning from the task of object recognition is exploited to train good features for material classification more effectively. This transfer learning approach yields significantly higher recognition rates than previous state-of-the-art approaches. We then analyze the relative contribution of reflectance and shading information by decomposing the image into its intrinsic components. The use of convnets for material classification has been hindered by the strong demand for sufficient and diverse training data, even with transfer learning. Therefore, we present a new data set containing approximately 10k images divided into 10 material categories.

Robust and Efficient Kernel Hyperparameter Paths with Guarantees
International Conference on Machine Learning (ICML) 2014

Joachim Giesen, Soeren Laue, Patrick Wieschollek
We present a general framework for computing approximate solution paths for parameterized optimization problems. The framework can be used not only to compute regularization paths but also the entire kernel hyperparameter solution path for support vector machines and robust kernel regression. We prove a combinatorial complexity of O(1/ε) for the ε-approximate solution path, independent of the number of data points.
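For readers unfamiliar with the notion, the approximation guarantee can be stated as follows (standard definition, given here only for context):

```latex
% A point $x$ is an $\varepsilon$-approximate solution of the
% parameterized problem $\min_x f_t(x)$ at parameter value $t$ if
\[
  f_t(x) \;\le\; \min_{x'} f_t(x') + \varepsilon .
\]
% The path complexity counts how many distinct such solutions are needed
% to cover the whole parameter interval; the result above bounds this
% count by $O(1/\varepsilon)$, independent of the number of data points.
```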
@inproceedings{icml2014/Wieschollek,
  author    = {Joachim Giesen and Soeren Laue and Patrick Wieschollek},
  title     = {Robust and Efficient Kernel Hyperparameter Paths with Guarantees},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning ({ICML})},
  series    = {{JMLR} Workshop and Conference Proceedings},
  volume    = {32},
  publisher = {JMLR.org},
  year      = {2014},
  url       = {http://jmlr.org/proceedings/papers/v32/}
}

Implementations

Optimization in C++11

A header-only C++11 implementation of common optimization algorithms with MATLAB bindings.

The library currently contains the following solvers:

  • gradient descent solver
  • conjugate gradient descent solver
  • Newton descent solver
  • BFGS solver
  • L-BFGS solver
  • L-BFGS-B solver
  • Nelder-Mead solver
  • CMA-ES solver

The code uses Eigen3 for optimal vectorization.
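The simplest member of this solver family, gradient descent with a backtracking line search, can be sketched language-independently (a Python sketch of the algorithm, not the library's C++ interface):

```python
import numpy as np

def gradient_descent(f, grad, x0, tol=1e-8, max_iter=10000):
    """Plain gradient descent with Armijo backtracking line search
    (sketch of the algorithm, not the library's C++ API)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        t = 1.0
        # shrink the step until a sufficient decrease is achieved
        while f(x - t * g) > f(x) - 0.5 * t * (g @ g):
            t *= 0.5
        x = x - t * g
    return x

# convex quadratic test problem: minimum at (1, -2)
f = lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2
grad = lambda x: np.array([2 * (x[0] - 1), 2 * (x[1] + 2)])
```

The other solvers in the list differ mainly in how the descent direction is chosen (conjugate directions, Newton or quasi-Newton steps, or derivative-free updates as in Nelder-Mead and CMA-ES).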

code

tfBlueprint

Re-implementations of recent deep-learning papers (incl. NIPS 2016) with support for multi-GPU training.

  • easy data streams for producing input data
  • multi-threaded prefetching
  • multi-GPU training with tfSlim support
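The prefetching pattern can be sketched with a bounded queue and a background thread (a Python sketch of the idea, not the tfBlueprint API):

```python
import queue
import threading

def prefetch(generator, buffer_size=8):
    """Run `generator` in a background thread and hand items over via a
    bounded queue, so data loading overlaps with (GPU) compute."""
    q = queue.Queue(maxsize=buffer_size)
    _end = object()  # sentinel marking the end of the stream

    def worker():
        for item in generator:
            q.put(item)
        q.put(_end)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is _end:
            return
        yield item
```

The bounded queue provides backpressure: a fast loader blocks once the buffer is full instead of exhausting memory.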
code

InfoMark

A web service for managing courses with exercise sheets. Students can upload their solutions to exercise sheets, which are then graded by tutors. Features:

  • runs on Ruby on Rails, Redis, Resque, Docker, PostgreSQL
  • automatically tests student submissions inside a Docker sandbox (asynchronously)
  • online grading form for submissions
  • inline comments on students' code
  • manage exams, exercise groups, ...
  • upload course slides
  • zero-downtime deploy
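A sandboxed test run could look roughly like this (a Python sketch for brevity, although InfoMark itself is built on Ruby on Rails; the docker flags are standard, but the image name, mount point, and test script are illustrative assumptions):

```python
import subprocess

def sandbox_cmd(image, submission_dir, timeout=30):
    """Build a docker invocation that runs a student submission with no
    network access and hard resource limits.  The image name, mount
    point, and /bin/run_tests.sh are illustrative, not InfoMark's."""
    return [
        "docker", "run", "--rm",
        "--network", "none",          # no internet inside the sandbox
        "--memory", "256m",           # cap RAM
        "--cpus", "1",                # cap CPU
        "--read-only",                # immutable container filesystem
        "-v", f"{submission_dir}:/submission:ro",
        image, "timeout", str(timeout), "/bin/run_tests.sh",
    ]

def run_sandboxed(image, submission_dir):
    # executed asynchronously in InfoMark (Resque job); synchronous here
    return subprocess.run(sandbox_cmd(image, submission_dir),
                          capture_output=True, text=True)
```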

Handled over 400 students of the course "Informatik 2" last term.

code