Julia Joins Petaflop Club: Performance Milestone | JuliaHub

Written by Andrew Claster | Sep 12, 2017

Berkeley, CA – Julia has joined the rarefied ranks of computing languages that have achieved peak performance exceeding one petaflops – one quadrillion floating-point operations per second – the so-called ‘Petaflop Club.’

The Julia application that achieved this milestone is called Celeste. It was developed by a team of astronomers, physicists, computer engineers and statisticians from UC Berkeley, Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC), Intel, Julia Computing and the Julia Lab at MIT.

Celeste uses the Sloan Digital Sky Survey (SDSS), a dataset of astronomical images from the Apache Point Observatory in New Mexico that includes every visible object from over 35% of the sky – hundreds of millions of stars and galaxies. Light from the most distant of these galaxies has been traveling for billions of years and lets us see how the universe appeared in the distant past.

Since SDSS data collection began in 1998, cataloging these stars and galaxies has been a painstaking and laborious process.

So the Celeste team developed a new parallel computing method to process the entire SDSS dataset. Celeste is written entirely in Julia, and the team processed 178 terabytes of image data to produce the most accurate catalog to date of 188 million astronomical objects – complete with state-of-the-art point and uncertainty estimates – in just 14.6 minutes.

The Celeste team achieved peak performance of 1.54 petaflops using 1.3 million threads on 9,300 Knights Landing (KNL) nodes of the Cori supercomputer at NERSC. This result combined a sophisticated parallel scheduling algorithm with single-core optimizations that made the code 1,000x faster on a single core than the previously published version.

The Celeste research team is already looking to new challenges. For example, the Large Synoptic Survey Telescope (LSST), scheduled to begin operation in 2019, will be 14 times larger than the Apache Point telescope and will produce 15 terabytes of images every night. This means that every few days, the LSST will produce more visual data than the Apache Point telescope has produced in 20 years. With Julia and the Cori supercomputer, the Celeste team will be able to analyze and catalog every object in those nightly images in as little as 5 minutes.

The Celeste team is also working to:

  • Further increase the precision of point and uncertainty estimates

  • Identify ever-fainter points of light near the detection limit

  • Improve the quality of native code for high performance computing

The Celeste project is a shining example of:

  • High performance computing applied to real-world problems

  • Cross-institutional collaboration including researchers from UC Berkeley, Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC), Intel, Julia Computing and the Julia Lab at MIT

  • Cross-departmental collaboration including astronomy, physics, computer science, engineering and mathematics

  • Julia, the fastest modern open source high performance programming language for scientific computing

  • Parallel and multithreading supercomputing capabilities

  • Public support for basic and applied scientific research

About Julia and Julia Computing

Julia is the fastest modern high performance open source computing language for data, analytics, algorithmic trading, machine learning and artificial intelligence. Julia combines the functionality and ease of use of Python, R, Matlab, SAS and Stata with the speed of C++ and Java. Julia delivers dramatic improvements in simplicity, speed, capacity and productivity. Julia provides parallel computing capabilities out of the box and unlimited scalability with minimal effort. With more than 1.2 million downloads and 161% annual growth, Julia is one of the top programming languages developed on GitHub, and adoption is growing rapidly in finance, insurance, energy, robotics, genomics, aerospace and many other fields.

Julia users, partners and employers hiring Julia programmers in 2017 include Amazon, Apple, BlackRock, Capital One, Citibank, Comcast, Disney, Facebook, Ford, Google, Grindr, IBM, Intel, KPMG, Microsoft, NASA, Oracle, PwC and Uber.

Why Julia?

  1. Julia is lightning fast. Julia is being used in production today and has generated speed improvements up to 1,000x for insurance model estimation and parallel supercomputing astronomical image analysis.

  2. Julia provides unlimited scalability. Julia applications can be deployed on large clusters at the click of a button and can run parallel and distributed computations quickly and easily across tens of thousands of nodes.

  3. Julia is easy to learn. Julia’s flexible syntax is familiar and comfortable for users of Python, R and Matlab.

  4. Julia integrates well with existing code and platforms. Users of C, C++, Python, R and other languages can easily integrate their existing code into Julia.

  5. Julia code is elegant. Julia was built from the ground up for mathematical, scientific and statistical computing. It has advanced libraries that make programming simple and fast and dramatically reduce the number of lines of code required – in some cases, by 90% or more.

  6. Julia solves the two language problem. Because Julia combines the ease of use and familiar syntax of Python, R and Matlab with the speed of C, C++ or Java, programmers no longer need to estimate models in one language and reproduce them in a faster production language. This saves time and reduces error and cost.

Julia Computing was founded in 2015 by the creators of the open source Julia language to develop products and provide support for businesses and researchers who use Julia.