Bayesian inference & machine learning

(This tutorial is SOLD OUT)

Machine learning is transforming many fields in industry, allowing practitioners to fit models to ever-larger datasets to build automated systems for classifying satellite images, autonomous driving, translating between languages, and many other applications.

Machine learning is sometimes approached from a pure engineering perspective, by trying a variety of “black box” devices for learning patterns solely from data and tweaking them until they appear to work. In Python, this is quite easy with amazing packages like Scikit-Learn and TensorFlow. A purely data-driven approach may work well, or may not. Often it turns out that the data isn’t as clean or as plentiful as we would wish, or perhaps it doesn’t quite represent the problem we want to solve, so it makes sense to draw on domain knowledge.

Over the last 10-20 years, Bayesian methods have provided new and better answers to many difficult questions in such diverse fields as physics, engineering, economics, and architecture. Bayesian inference follows naturally and powerfully from the laws of probability and gives us a principled approach for reasoning when data is incomplete or when we have useful prior knowledge. It is a natural fit for data-driven machine learning methods. Fitting realistic Bayesian models can be done effectively without heavy maths by using simulation techniques such as Gibbs sampling with packages like NumPy and SciPy.
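To give a flavour of what simulation-based fitting can look like, here is a minimal sketch of a Gibbs sampler written with NumPy alone. It is not taken from the tutorial materials. It assumes a simple model chosen purely for illustration: normally distributed data with unknown mean and precision, a normal prior on the mean, and a gamma prior on the precision. The function name gibbs_normal, the prior settings (mu0, tau0, a0, b0), and the synthetic data are all hypothetical.

```python
import numpy as np


def gibbs_normal(y, n_iter=5000, burn_in=1000,
                 mu0=0.0, tau0=0.01, a0=1.0, b0=1.0, seed=0):
    """Gibbs sampler for y_i ~ Normal(mu, 1/lam).

    Assumed (illustrative) priors:
        mu  ~ Normal(mu0, 1/tau0)
        lam ~ Gamma(shape=a0, rate=b0)
    Returns an array of shape (n_iter - burn_in, 2) with columns (mu, lam).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size
    ybar = y.mean()

    # Start the chain at simple data-based estimates.
    mu = ybar
    lam = 1.0 / y.var()

    samples = np.empty((n_iter, 2))
    for i in range(n_iter):
        # Draw mu from its full conditional, which is normal.
        post_prec = tau0 + n * lam
        post_mean = (tau0 * mu0 + lam * n * ybar) / post_prec
        mu = rng.normal(post_mean, 1.0 / np.sqrt(post_prec))

        # Draw lam from its full conditional, which is gamma.
        # NumPy's gamma takes (shape, scale), so scale = 1 / rate.
        rate = b0 + 0.5 * np.sum((y - mu) ** 2)
        lam = rng.gamma(a0 + 0.5 * n, 1.0 / rate)

        samples[i] = mu, lam

    # Discard the early draws, taken before the chain has settled.
    return samples[burn_in:]


# Synthetic data for illustration only: true mean 5, true std dev 2.
data_rng = np.random.default_rng(42)
y = data_rng.normal(5.0, 2.0, size=200)

draws = gibbs_normal(y)
mu_estimate = draws[:, 0].mean()
sigma_estimate = 1.0 / np.sqrt(draws[:, 1].mean())
```

Each pass through the loop draws one parameter conditional on the current value of the other, which is the defining step of Gibbs sampling. With this much data and such a weak prior, the posterior mean of mu should land very close to the sample mean.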

This tutorial will show you examples of the opportunities and traps in fitting machine learning models to data alone, explain Bayesian inference with examples, and explain a principled step-by-step framework for how to incorporate prior knowledge into models fitted with Scikit-Learn or TensorFlow.
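As a small taste of how prior knowledge can temper a fit to very little data, the sketch below compares a data-only estimate of a success rate with a Bayesian posterior estimate. It is an illustrative assumption, not the framework presented in the tutorial: the Beta(2, 8) prior (encoding a belief that the rate is around 20%) and the counts of 3 successes in 5 trials are invented for the example.

```python
import numpy as np

# Hypothetical prior knowledge: the success rate is probably near 20%.
prior_a, prior_b = 2.0, 8.0

# Hypothetical small dataset: 3 successes in 5 trials.
successes, trials = 3, 5

# Data-only (maximum likelihood) estimate.
mle = successes / trials

# The beta prior is conjugate to the binomial likelihood, so the
# posterior is Beta(prior_a + successes, prior_b + failures).
post_a = prior_a + successes
post_b = prior_b + (trials - successes)
posterior_mean = post_a / (post_a + post_b)

# Approximate a 95% credible interval by simulating from the posterior.
rng = np.random.default_rng(0)
posterior_draws = rng.beta(post_a, post_b, size=100_000)
lower, upper = np.percentile(posterior_draws, [2.5, 97.5])
```

With only five observations, the data-only estimate is 0.6, while the posterior mean is pulled towards the prior and sits at one third. As more data arrives, the influence of the prior shrinks and the two estimates converge.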

Intended audience:

People who are familiar with Python, NumPy, and the fundamentals of probability theory. You may have used tools like Scikit-Learn before, but this is not required.

Attendees will need:

A laptop with the latest Anaconda installed, including Jupyter and Scikit-Learn. We will provide you with a more complete list of packages and data to download prior to the workshop.

Presented by

Ed Schofield

Dr Edward Schofield has consulted to or trained dozens of organisations in data analysis with Python, including the ABC, Barclays, Bureau of Meteorology, CISCO, CSIRO, Dolby, GDF Energie, Geoscience Australia, IMC, Macquarie, Optus, Suncorp, Toyota Technical Centre, and Woolworths. Ed holds a PhD in machine learning from Imperial College London, where his thesis was on models for speech recognition and machine translation. He also holds BA and MA (Hons) degrees in maths and computer science from Cambridge University.