Why is Julia’s Flux Catching Fire for ML?

If you’re doing machine learning today, you’ve probably noticed that Julia is climbing the language popularity charts like a rocket. The reasons behind language adoption are sometimes murky, but in this case, I believe the answer is clear. In this series of articles, I am going to show how Julia ties together a hungry community and a few key language features that feed into one critical library: Zygote.

In this article, we’ll set the tone. We’ll look at the forces at play that led to the seed community that gave Julia its initial foothold. Next time, we’ll look at the rocket fuel that is accelerating Julia’s growth.

This story starts with a hungry community watching a pendulum swing back and forth between performance and productivity.

Julia Taps a Hungry Community

The programming language graveyard is full of cool languages that don’t solve compelling problems. Technology is necessary for success, but without a community, technology alone is not enough. For Julia, the initial vision was the same as the current one: support science with a language that’s both fast and productive. The founding Julia team has mentioned a third, hidden goal time and time again: transparency. In other words, most of Julia is written in Julia, so it’s easy to understand. You can think of these boxes as language features. Traditionally, they have been compromises, so that optimizing one impacts the others:

Julia Language Strengths
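To make that transparency claim concrete, here’s a small sketch of my own (not one of the article’s figures): from the REPL, you can ask Julia where any method lives, and the answer usually points at readable Julia source.

```julia
# Ask Julia which method a call dispatches to. Because most of
# Julia's standard library is written in Julia itself, the answer
# points at ordinary, readable Julia source.
m = @which sum([1, 2, 3])
println(m.file, ":", m.line)   # a .jl source file shipped with Julia

# @edit sum([1, 2, 3]) opens that same source directly in your editor.
```

Try that with a vectorized NumPy call and you’ll bottom out in C. In Julia, the floor under your feet is still Julia.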

Let’s load that pendulum to start it swinging between performance and productivity. You probably already know where the pendulum will start.

Scientific Languages Often Embrace Speed

Speed is seductive, and a necessary precondition for success. Data science requires performance. If your language isn’t fast enough, you can’t play. It’s no surprise that early languages and frameworks for data science optimized performance at the expense of productivity and transparency.

Early Languages for Scientific Domains Favor Performance

At this point, you can see the bet: computers were fast enough to solve problems that our brains couldn’t. Many special-purpose solutions emerged that optimized speed at the expense of everything else. Here’s the thing, though: at some point, placing all of your bets on performance will burn you.

Speed is Not Enough

As hardware gets faster, the programming languages that run on it get faster too. At the same time, progress marches on, so emerging problems require more complex solutions with more complex programs. Said another way: once problems are sufficiently complicated, if your language is too simple to model the problems you’re trying to solve, you can’t play.

Scientists noticed these trends too and began to move to languages that optimized the thinking parts of programming, betting that improvements in hardware could make up the difference in performance. They began to adopt the best programming languages for thinking and learning, especially Python.

Language Choices Shift to Optimize Thinking

This compromise meant giving up the purpose-built language features for processing data quickly and at scale. The compromise made sense. In fact, it made so much sense in so many different domains that Python became the top programming language in the world. Ah, that pendulum had swung vigorously.

That’s the funny thing about pendulums. You can absolutely trust them to abandon you. Problem domains were changing, too. You already know data science is the art of building models from data to make inferences that shape decision making.

And you know that machine learning seeks to automate that process by continuously improving models automatically, and feeding improvements back into the system. This is all, well, expensive. So, we did the only thing we could. We cheated.

The Great Compromise

Instead of adopting pure Python platforms, we built mostly-Python systems, replacing the most expensive operations with solutions written in languages like C++:

Running Fast Languages Underneath Slower Ones Impacts Transparency

The new systems were both productive and fast, but there was a cost: transparency. That cost didn’t seem very important at the time, but you know my stories always have a healthy dose of foreshadowing and drama. You already know that interfaces that cross networks, languages, or time are fraught with danger. The more often you need to drop into C++, the more opportunities you have to get burned. After all, you might need to revisit some of the decisions those C++ library developers made, and even the interfaces between them.

Can you feel that pendulum swinging? And does it look more like that exciting rope swing in the backyard, or the wrecking ball down the road? The key to understanding is in that mysterious box at the top of those images, transparency, but we’re not yet ready for that part of the story.

The pendulum swung, and the resulting whistling winds of change blew.

Julia Hits Critical Mass

Late last year, I had the pleasure of interviewing Jeff Bezanson, the CTO of Julia Computing. One of his claims was that Julia has reached escape velocity, and it’s hard to disagree. The team had a keen understanding of the pendulum, and a more modern set of tools to solve the problems of productivity and performance at the same time. Some in the technical computing community took notice.

Business Needs Meet Compelling Technology

The scientific community was rapidly recognizing that a general-purpose language that sacrifices speed isn’t going to be enough for the next generation of data science computing. It helps that the founders of Julia have strong ties to MIT, a community famous for pushing the state of the art in areas that demand both intellectual and computational scalability. The Julia language took root there and began to spread.

A Peek Ahead

You can’t yet say the pendulum has swung all the way back, but it is starting to move toward performance again. It’s the swing the Julia founders saw almost a decade ago. In the next part of the story, that unassuming foreshadowing around the transparency box takes on tremendous importance, as we look at how automatic differentiation became so important so fast.

In the next article, we’ll dig into that third box, the one for transparency. We’ll explore why it’s so important for the next generation of machine learning solutions, and the Julia language features that make it so good for building this kind of software.
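We’ll save the details for next time, but as a teaser, here is a minimal sketch of Zygote in action, assuming you’ve added the Zygote package via Pkg: it differentiates ordinary Julia functions directly, with no C++ layer underneath.

```julia
using Zygote  # assumes the Zygote package has been installed

# Differentiate a plain Julia function. By hand, the derivative of
# 3x^2 + 2x + 1 is 6x + 2, which evaluates to 32 at x = 5.
g = gradient(x -> 3x^2 + 2x + 1, 5)
println(first(g))  # the derivative 6x + 2 evaluated at x = 5
```

Because Zygote works on Julia source rather than a wrapped C++ kernel, the code being differentiated is code you can actually read.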

Want to know more? Join us at Groxio. We have a self-guided Julia course, and we’ll release the first bits of the Flux module on March 1. These courses approach machine learning using a programmer’s language, so if you are having a tough time breaking through all of the math, we’ll give you a lift.

Until next time, keep an eye out for that pendulum. These next few years are going to be more fun if you’re the one doing the riding.

Bruce Tate is the founder of Groxio, a training and education company for programmers. He’s the author of more than a dozen books and an avid outdoor enthusiast.
