Postdoc position in deep probabilistic programming

by Fritz Henglein, May 25, 2023

We have an open postdoc position in deep probabilistic programming (DPP) at DIKU, University of Copenhagen, with a focus on programming language technology. The position is for one year and can be extended to two years. It is open immediately.

To inquire about the position or to apply for it, please send the following to Fritz Henglein, [email protected]: your CV, a description of your research interests, a statement of how you believe you can contribute to and benefit from the research program below, and one relevant research paper you have authored.

About DPP

Deep probabilistic programming combines programming with probability distributions as data with parameters that can be optimized by automatic differentiation (AD) to match observed outcomes; it is essentially Bayesian inference plus deep learning in an expressive programming language.

About AD

AD is of particular interest to us in this project. We are exploring DSL-based linear operator representations for derivatives that share the benefits of matrices (expressive algebra, easy parallelism-preserving adjoints for running ‘in reverse’, hardware-supported high-performance execution on GPUs, etc.) but avoid their prohibitive storage and execution costs at high dimension. We want to do so without resorting to so-called ‘jvp’ and ‘vjp’ representations of derivatives expressed in a general-purpose programming language, which are inherently tricky to analyze and optimize. This is where we’ll need your help. Our hypothesis is that, in a suitably constructed DSL, AD can provide execution performance advantages when it is done by aggressive almost-symbolic differentiation with clever symbolic representation of derivatives (e.g., tensor product representation for low-rank matrices), followed by parallelism-preserving transformation and optimization, and eventual compilation to a functional high-performance array programming language such as Futhark.
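To make the linear operator idea a little more concrete, here is a minimal sketch in plain Python. It is only an illustration under our own assumptions, not the project's DSL: the constructor names (Diag, Outer, Add, Compose) and the choice of softmax as the example function are hypothetical. It shows a derivative stored symbolically as "diagonal plus rank-one tensor product", so that both applying it and taking its adjoint need O(n) rather than O(n^2) storage.

```python
from dataclasses import dataclass
from math import exp


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


@dataclass(frozen=True)
class Diag:
    """Diagonal operator x -> d * x (elementwise); it is its own adjoint."""
    d: tuple

    def apply(self, x):
        return [di * xi for di, xi in zip(self.d, x)]

    def adjoint(self):
        return self


@dataclass(frozen=True)
class Outer:
    """Rank-one operator x -> u * <v, x>, i.e. the tensor product of u and v."""
    u: tuple
    v: tuple

    def apply(self, x):
        s = dot(self.v, x)
        return [ui * s for ui in self.u]

    def adjoint(self):
        # Transposing a rank-one operator just swaps its two factors.
        return Outer(self.v, self.u)


@dataclass(frozen=True)
class Add:
    """Sum of two operators."""
    a: object
    b: object

    def apply(self, x):
        return [p + q for p, q in zip(self.a.apply(x), self.b.apply(x))]

    def adjoint(self):
        return Add(self.a.adjoint(), self.b.adjoint())


@dataclass(frozen=True)
class Compose:
    """Composition 'a after b'."""
    a: object
    b: object

    def apply(self, x):
        return self.a.apply(self.b.apply(x))

    def adjoint(self):
        # The adjoint of a composition reverses the order of its factors.
        return Compose(self.b.adjoint(), self.a.adjoint())


def softmax(x):
    m = max(x)  # subtract the maximum for numerical stability
    e = [exp(xi - m) for xi in x]
    z = sum(e)
    return [ei / z for ei in e]


def softmax_derivative(x):
    """Derivative of softmax at x as diag(s) - s (x) s, never as an n-by-n matrix."""
    s = softmax(x)
    return Add(Diag(tuple(s)), Outer(tuple(-si for si in s), tuple(s)))
```

Calling `softmax_derivative(x).apply(v)` plays the role of a forward-mode (‘jvp’-style) product and `softmax_derivative(x).adjoint().apply(w)` that of a reverse-mode (‘vjp’-style) product, but both are derived from one symbolic term that remains open to algebraic rewriting before any code is generated.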

About PP 

Subsequently, we’d like to add probability distributions/random variables as datatypes with exploitable algebraic properties to the DSL. The aim is to implement efficient (Hamiltonian and Newtonian) probabilistic inference techniques that minimize the need for sampling in order to maximize computational performance.
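As a small illustration of what "exploitable algebraic properties" can mean, the following Python sketch treats a univariate normal distribution as a datatype whose affine maps, sums of independent variables, and conjugate updates are computed in closed form, with no sampling at all. The class name Normal and its methods are hypothetical and chosen only for this example; they are not the project's design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Normal:
    """A univariate normal distribution represented by its mean and variance."""
    mean: float
    var: float

    def affine(self, a, b):
        """Distribution of a * X + b for X distributed as self."""
        return Normal(a * self.mean + b, a * a * self.var)

    def __add__(self, other):
        """Distribution of X + Y for independent X (self) and Y (other)."""
        return Normal(self.mean + other.mean, self.var + other.var)

    def observe(self, y, noise_var):
        """Posterior of X after observing y = X + noise, noise ~ Normal(0, noise_var).

        Uses normal-normal conjugacy: precisions add, and the posterior mean
        is the precision-weighted average of the prior mean and the observation.
        """
        precision = 1.0 / self.var + 1.0 / noise_var
        mean = (self.mean / self.var + y / noise_var) / precision
        return Normal(mean, 1.0 / precision)
```

For instance, `Normal(0.0, 1.0).affine(2.0, 1.0) + Normal(3.0, 4.0)` evaluates symbolically to `Normal(4.0, 8.0)`, where a sampling-based approach would need many draws to approximate the same result.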

About applications 

The present project is driven by multiple applications in protein structure prediction, where computational efficiency is (still) a major bottleneck. This is the targeted problem domain for this postdoc position. The overall multilinear algebraic programming approach has broader applications, though, ranging from database query processing (‘vector spaces’, or more precisely modules, over Z as generalized relations) via machine learning (vector spaces over R) to quantum computation (vector spaces over C).

Required background 

Ideally you should have both 

  • a good mathematical grounding, specifically in (multi)linear algebra and in common vector spaces such as Hilbert spaces, and preferably also in probability theory for the probabilistic programming aspects, and
  • a solid background in functional programming language technology for generating high-performance code.

The reason is that the former is not a goal in itself (we aim for very high performance in practice: the generated code really has to run very fast), while the latter by itself will make the optimization ideas seem mysterious, since they exploit the former in nonobvious ways.

No particular background in bioinformatics, proteins or such is required.  If you have some, wonderful; but we have plenty of expertise in the team that you can draw on.

About employment terms

The position can be started as soon as the paperwork is done after an employment offer has been made and accepted, which may take weeks to months, depending on your citizenship. The salary follows the standard Danish union contract for academics. Including the mandatory contribution to your own pension account, it can be expected to be between DKK 40,000 and DKK 50,000 per month before income taxes. The actual amount depends on formal qualifications and seniority.

About the project team, DIKU, UCPH and Copenhagen

The project, deep probabilistic programming for protein structure prediction, is funded by the Danish Research Council and hosted by the Programming Language and Theory of Computation (PLTC) section at the Department of Computer Science (DIKU) at the University of Copenhagen (UCPH). It is led by Thomas Hamelryck (Department of Biology and DIKU) and Fritz Henglein (DIKU) and is carried out in collaboration with, and in close proximity to, the Futhark high-performance functional programming language research group. DIKU is located on the North Campus of UCPH, next to the largest park in Copenhagen. Copenhagen is routinely ranked as one of the most livable cities in the world. You’ll have the opportunity to gather teaching qualifications during the second postdoc year, and UCPH is usually ranked among the better or even best universities in Europe, so both we and you can contribute effectively to your academic career.