What we do - the gist
At Recursion, we have the benefit of being able to "ask the cells for more data" in frequent, high-throughput, rich biological experiments to an extent that sets us apart from most others doing biology and disease research.
We grow human cells, make them into models of thousands of rare diseases by breaking the genes corresponding to each disease, take pictures of them using automated microscopes, computationally extract 1000 structural features like shapes and textures from every cell, and quantify the structural differences that separate diseased from healthy cells. We then apply thousands of drugs to the cells corresponding to each disease, take pictures, and identify drugs that make the cells look healthy again. These drugs get investigated by our biologists and tested in animals, with the goal of eventually becoming new treatments for any of the thousands of untreated genetic diseases.
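To make the idea concrete, here is a minimal sketch of the "diseased vs. healthy, then look for rescue" logic on synthetic data. It is an illustration under stated assumptions, not our actual pipeline: the feature matrices are random numbers, the drug names and effect sizes are made up, and the scoring function (`rescue_score`) is a hypothetical, deliberately simple choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_features = 500, 20  # toy sizes; real data has ~1000 features per cell

# Synthetic per-cell feature matrices (rows = cells, columns = features).
healthy = rng.normal(0.0, 1.0, size=(n_cells, n_features))
disease_shift = np.zeros(n_features)
disease_shift[:5] = 2.0  # assumption: the disease perturbs the first five features
diseased = rng.normal(0.0, 1.0, size=(n_cells, n_features)) + disease_shift

# Standardize every population against the healthy cells.
mu, sigma = healthy.mean(axis=0), healthy.std(axis=0)

def profile(cells):
    """Mean standardized feature vector of a cell population."""
    return ((cells - mu) / sigma).mean(axis=0)

# Disease signature: how the diseased profile differs from the healthy one.
signature = profile(diseased) - profile(healthy)
axis = signature / np.linalg.norm(signature)

def rescue_score(treated):
    """Roughly 1.0 if treated cells look healthy along the disease axis,
    roughly 0.0 if they still look like untreated diseased cells."""
    return 1.0 - (profile(treated) @ axis) / (signature @ axis)

# Hypothetical drugs, each reversing a made-up fraction of the disease shift.
drugs = {"drug_A": 0.9, "drug_B": 0.1, "drug_C": 0.5}
scores = {}
for name, reversed_fraction in drugs.items():
    treated = (rng.normal(0.0, 1.0, size=(n_cells, n_features))
               + (1.0 - reversed_fraction) * disease_shift)
    scores[name] = rescue_score(treated)

# Rank drugs from most to least "healthy-looking" treated cells.
ranked = sorted(scores, key=scores.get, reverse=True)
for name in ranked:
    print(f"{name}: rescue score {scores[name]:.2f}")
```

Projecting onto a single disease axis ignores any side effects a drug has on features the disease does not touch; a fuller analysis would also check that treated cells stay close to healthy cells in the remaining dimensions.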
What you'll do
You'll work with our data, biology, and engineering teams to identify and answer questions in high-dimensional data space using your abilities and intuitions and our evolving data science platform. This platform is the core of our mission -- transforming drug discovery into a data science problem. We're tackling challenging problems, often with no obvious solutions, and in some cases with no right answers. But we're a group of sharp and highly motivated scientists and engineers with diverse backgrounds, and we're making rapid progress.
The high-level job description has only one item: do whatever is necessary to help us progress in identifying cures for diseases. We hire the best, and trust that they are usually in the best position to decide what to try next.
Typical work includes:
- Add a systems and computational biology perspective to experimental planning and data modeling discussions.
- Seek out, clean up, and integrate useful publicly available external biological datasets.
- Perform exploratory analysis and build creative visualizations from data you find and our experimental data.
- Help analyze weekly experimental datasets on the order of 10 million rows (one per imaged human cell) by 1000 features.
- Share methods with and borrow from the rest of the computational team, building a common codebase.
- Suggest and design biological experiments in collaboration with our biologists and data scientists to answer data-driven questions.
- Present your work and pick up techniques at data and biology conferences, as desired.
What you need
- Required: A PhD or equivalent experience in computational biology or bioinformatics.
- Required: Fluency, equivalent to 2+ years of experience, in statistics, machine learning, coding, and answering questions with high-dimensional numerical datasets, preferably using the Python data stack (pandas, sklearn, etc.). A thorough grasp of machine learning fundamentals such as cross-validation and learning curves. The ability to, for instance, code up custom regularization or a custom visualization for a given machine learning model.
- Very helpful: Code you can share, and a track record of outstanding past projects, publications, or presentations.
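As a rough illustration of the kind of fundamentals mentioned in the requirements above, here is a small sketch that hand-codes L2 (ridge) regularization and picks its strength by k-fold cross-validation. The data is synthetic, the helper names (`ridge_fit`, `kfold_mse`) are hypothetical, and the candidate strengths are arbitrary; it is meant as an example of the skill level, not as a description of our codebase.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression: solve (X^T X + alpha * I) w = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def kfold_mse(X, y, alpha, k=5, seed=0):
    """Mean validation MSE of ridge regression over k shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = ridge_fit(X[train], y[train], alpha)
        errors.append(np.mean((X[val] @ w - y[val]) ** 2))
    return float(np.mean(errors))

# Synthetic problem: few samples, many features, only three of them informative.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 40))
true_w = np.zeros(40)
true_w[:3] = 1.0
y = X @ true_w + rng.normal(scale=0.5, size=60)

# Compare a handful of regularization strengths by cross-validated error.
alphas = [1e-3, 1e-1, 1.0, 10.0, 100.0]
cv_errors = {alpha: kfold_mse(X, y, alpha) for alpha in alphas}
best_alpha = min(cv_errors, key=cv_errors.get)

for alpha in alphas:
    print(f"alpha={alpha:g}: CV MSE {cv_errors[alpha]:.3f}")
print("selected alpha:", best_alpha)
```

The same fold loop extends naturally to a learning curve: repeat it on growing subsets of the training indices and plot training and validation error against subset size.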
There are more than 5,000 untreated rare genetic diseases, which together affect nearly ten million people in the US alone. Each of these diseases affects too few people for traditional pharmaceutical companies to pursue individually, so we're building a way to seek treatments for hundreds of these diseases in parallel. We aim to find treatments for 100 of them in the next 10 years.
We offer competitive compensation, health insurance, an outstanding team, challenging and worthwhile problems, and close proximity to some of the best skiing, hiking and climbing in the world.