Let's Run Python on a Supercomputer!
Once upon a time, coding for supercomputers meant lots of Fortran. Users were invariably physicists, engineers and mathematicians who already knew how to code. Today, almost all fields of research have big data and ever more complex analysis, necessitating a move from the desktop to large-scale compute. Now there's far more diversity among users and, happily, plenty of Python being run on supercomputers across the world.
I'm going to talk about the general architecture of supercomputers, the parallel programming patterns that suit them, and how to implement those patterns in Python. This includes the traditional message-passing approaches, as well as modern tools like NumPy, Dask, Cython and Numba, which let us write code whose performance is competitive with low-level languages, but which is a whole lot more fun to write.
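To give a taste of the message-passing pattern mentioned above, here is a minimal sketch of the scatter / compute / reduce idea. It is not taken from the talk: the function names (scatter, partial_sum_of_squares, parallel_sum_of_squares) are made up for illustration, and it uses a thread pool from the standard library as a stand-in for real ranks on separate nodes, so it only shows the shape of the pattern, not real speed-up. On an actual cluster the same three steps would typically map onto something like mpi4py's comm.scatter and comm.reduce.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(data, n_workers):
    """Split data into n_workers contiguous chunks of nearly equal size."""
    base, extra = divmod(len(data), n_workers)
    chunks, start = [], 0
    for rank in range(n_workers):
        # The first `extra` workers each take one additional item.
        stop = start + base + (1 if rank < extra else 0)
        chunks.append(data[start:stop])
        start = stop
    return chunks


def partial_sum_of_squares(chunk):
    """Work done independently by each worker on its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_workers=4):
    """Scatter the data, compute partial results, then reduce them."""
    chunks = scatter(data, n_workers)
    # Assumption: threads stand in for ranks purely to illustrate the pattern.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    # The "reduce" step: combine every worker's partial result.
    return sum(partials)
```

Calling parallel_sum_of_squares(list(range(10)), n_workers=3) splits the ten numbers into chunks of 4, 3 and 3 items, sums the squares in each chunk, and adds the three partial sums together.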
Supercomputers are usually managed multi-user environments, and so setting up the environment and packages needed for your code takes some thought. You can use what's pre-installed for you, install your own packages in a virtual environment, or go all-out and put everything in a container. We'll go over what's involved, and the trade-offs of each approach.
Finally, while academics might have ready access to a supercomputer, most of us do not. Never fear! I'll go over how to build your own in the cloud that costs a couple bucks an hour to run.
Dr. David Perry is Compute Integration Specialist at The University of Melbourne, working to increase research productivity using cloud and high performance computing. David chairs Australia's first community-owned wind farm, Hepburn Wind, and is co-founder/CTO of BoomPower, which facilitates simple solar and battery decisions for consumers and NGOs.