-rw-r--r-- | posts/juliacon.md | 339 | ||||
-rw-r--r-- | static/fig/julia_logo.png | bin | 0 -> 6566 bytes |
2 files changed, 339 insertions, 0 deletions
diff --git a/posts/juliacon.md b/posts/juliacon.md
new file mode 100644
index 0000000..52751a9
--- /dev/null
+++ b/posts/juliacon.md
@@ -0,0 +1,339 @@
+Title: What I Learned At JuliaCon
+Author: bnewbold
+Date: 2016-07-12
+Tags: tech, recurse, julia
+
+*Note: It looks like videos of the JuliaCon talks were uploaded [to
+Youtube][youtube] the day this post was finally published!*
+
+[youtube]: https://www.youtube.com/playlist?list=PLP8iPy9hna6SQPwZUDtAM59-wPzCPyD_S
+
+I was in Cambridge, MA for a few days the other week at [JuliaCon][], a small
+conference for the Julia programming language. Julia is a young language
+(first released publicly in 2012 and currently pre-1.0) oriented towards fast
+numerical computation: matrix manipulation, simulation, optimization, signal
+analysis, etc. I've done a fair amount of such programming over the years, and
+it has never felt as elegant or coherent as it could be. The available tools
+and languages are generally either:
+
+[JuliaCon]: http://juliacon.org
+
+<div class="sidebar">
+<img src="/static/fig/julia_logo.png" width="180px" alt="julia logo" />
+</div>
+
+1. stuck in the 1980s in terms of programming language features for safety,
+   productivity, and collaboration (eg, Fortran and Matlab)
+1. expensive proprietary closed-source packages (eg, Matlab and Mathematica)
+1. general-purpose languages with numerical features either hacked on or in the
+   form of libraries (eg, Python)
+
+There is a lot to be excited about in Julia. It's already pretty fast
+(leveraging pre-existing JIT tools, hand-tuned matrix and solver libraries, and
+the LLVM compiler suite) and has contemporary high-level language features
+(like optional type annotation, polymorphic function dispatch, package
+management tools, and general systems tools (eg, JSON and HTTP support)) that
+can make the language faster to develop in, and easier to read and
+maintain.
+I'm personally excited about the pedigree of the language: the
+birthplace of the language is the CSAIL building at MIT, and the spirit of
+[Scheme][sicm] and the work of [Project MAC][] are sprinkled through the
+project. One of the [big pitches][graydon2] of Julia is that scientists won't
+need to learn both a productive high-level language (eg, Python) and a
+low-level performant language (eg, C or Fortran) and interface between the two:
+Julia has everything all in one place.
+
+[graydon2]: http://graydon2.dreamwidth.org/3186.html
+[sicm]: https://mitpress.mit.edu/sites/default/files/titles/content/sicm/book.html
+[Project MAC]: http://groups.csail.mit.edu/mac/projects/mac/
+
+All that being said, while I thought I would be working in Julia a lot during
+my time at the Recurse Center, I've ended up being much more drawn to the
+[Rust][] language instead. Rust is a general systems language (it's compiled,
+has stronger typing, and no garbage collection), and not great for interactive
+numerical exploration, but I've found it a joy to program in: for the most part
+everything *just works* the way it says it will. My recent experience with
+Julia, on the other hand, has been a lot of breakage between library and
+interpreter versions, poor developer usability (eg, hard to figure out where
+files should live in a package), and very frustrating import/load times. Though
+I have to admit that while I pushed through some frustrations with Rust, I
+haven't spent *that* much time with Julia, and may have just been impatient, so
+take everything I say here with a grain of salt.
+
+With these feels going in, what did I learn at JuliaCon and what do I think of
+the future of the language now? In the sections below I'll go over the
+interesting things I saw, then come back to a summary at [the end](#summary).
+
+### Programming Language Design
+
+An older research language for numerical computing that I have always been
+curious about is Fortress, and the leader of that project (Guy Steele, who also
+worked on the design of the Scheme and Java languages) gave one of the opening
+keynote speeches at JuliaCon this year. Awesome! I get really excited about
+inter-generational learning and dialog.
+
+Fortress was a very "mathy" language. The number tower was intended to be
+"correct" (aka, have the same structure that mathematicians use), physical
+units were built-in, and some operator precedence was non-transitive. Operators
+on built-in types (like Integers) could be overloaded, unlike in Java, because
+Fortress users could apparently be trusted to "preserve algebraic properties".
+Steele is a proponent of using whitespace (or lack of whitespace) to clarify
+expressions, sort of like extra parentheses, and enforcing this in the
+compiler. For example, the following two statements would be equivalent in most
+languages, but not in Fortress:
+
+```
+a + b*c + d // Clear: Ok
+a+b * c+d // Misleading: Compiler Error
+```
+
+This was part of a general effort to allow "whiteboard" style syntax in the
+language. Fortress code actually has two representations: a plain text
+Scala-style source code, and a LaTeX-y symbolic math format. Steele also used
+some font-coloring in his slides to differentiate different types of symbols,
+which reminded me of the helpful style my undergraduate physics professors
+would use on the blackboard. I think this effort to adapt the "look and feel"
+of the language to how the intended audience already writes and communicates is
+really cool. I wonder if a third syntax format could have been added in a
+one-to-one manner: that of a general purpose language like Scala or Haskell
+(both noted as influences to Fortress) to make collaboration with general
+purpose programming experts easier.
+Steele mentioned that some efforts to make
+the syntax more math-like resulted in "contortions", so there is probably more
+work to be done here.
+
+In my limited experience, Julia has a pretty clean syntax, and allows some
+math-y [unicode characters as operators][unicode_ops] (like ∈, ≠, etc), but
+didn't prioritize math-y syntax as much as Fortress. Given the open challenges
+with formalizing informal whiteboard syntax this may or may not have been a
+missed opportunity.
+
+[unicode_ops]: http://docs.julialang.org/en/release-0.4/manual/unicode-input/
+
+The positive lessons learned from Fortress were summarized as being the type
+system, automatic parallelism (via generators and reducers), the math-y syntax,
+pretty printing (I assume meaning the LaTeX-y representation), physical units,
+and forced syntax clarity (aka, forced use of parentheses and whitespace). One
+issue that came up during implementation was that it was hard to bound the
+latency and computational complexity of type constraint solving at run-time.
+
+A few other talks touched on language design decisions and features. There was
+a short "Functional HPC" talk by Erik Schnetter, in which it was pointed out
+that for some workloads regular old garbage collection can be faster than
+reference counting: I've become used to thinking of latency and GC pauses as a
+huge performance problem in systems programming, but for number crunching that
+isn't as much of an issue, while little reference overheads are (especially if
+locks or atomic operations are necessary).
+
+Keno Fischer gave an overview of the [Gallium][] debugger, which had some cool
+features, but is still under development. There are both AST-based and
+LLVM-based backends for the debugger, which allows stepping at function calls,
+line-by-line, or expression-by-expression, which is something I hadn't seen
+before.
+He demoed stepping through each step of the creation of a matplotlib
+graph, with the output shown graphically after each step. Neat stuff!
+
+One of my personal interests in Julia would be formalizing the syntax into a
+machine-readable grammar (eg, [EBNF][] or [ABNF][]). I was lucky enough to run
+into Stefan Karpinski during one of the coffee breaks, and he pointed me to
+the Julia plugin for Eclipse, which already has a partial implementation of a
+grammar.
+
+A few talks touched on the issue of Nullable datatypes (also called "Maybe" or
+"Option" types in other languages), particularly for data science and
+DataFrame-type applications. I only recently encountered [Option][] (and the
+related [Result][] type) datatypes, in Rust, and can see why people want these
+so badly, but there doesn't seem to be a simple path forward yet. Rust really
+leverages these types in function return signatures, a feature which Julia does
+not have for now; I think I read rumors about them being added in the future,
+but didn't hear any mention of them here or on the 1.0 feature roadmap.
+
+[Option]: https://doc.rust-lang.org/std/option/index.html
+[Result]: https://doc.rust-lang.org/core/result/index.html
+[Gallium]: http://juliacon.org/abstracts.html#Gallium
+[EBNF]: https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form
+[ABNF]: https://en.wikipedia.org/wiki/Augmented_Backus%E2%80%93Naur_Form
+
+### Numeric Abstraction
+
+One of the big trends I saw was taking advantage of Julia's abstractions around
+generic operators and arrays to experiment with novel computation strategies.
+Sometimes this means improving precision (with novel data types and
+representations), sometimes it means increasing performance (by changing memory
+layout or distribution, or targeting special hardware), and sometimes it just
+makes code more elegant or semantic.
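
To make the idea in the paragraph above concrete, here is a minimal sketch, in Python rather than Julia, of how a new numeric type can reuse generic code. The `Interval` class and the `horner` function are hypothetical names invented for this illustration; they are not taken from any of the talks or from a Julia package. Widening each result by one float with `math.nextafter` (Python 3.9+) is only a crude stand-in for the directed rounding a real interval library would use.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] meant to enclose the true value (illustrative only)."""
    lo: float
    hi: float

    @staticmethod
    def _coerce(x):
        # Treat plain numbers as zero-width intervals.
        return x if isinstance(x, Interval) else Interval(float(x), float(x))

    def __add__(self, other):
        other = Interval._coerce(other)
        # Assumption: stepping one float outward approximates directed rounding.
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    __radd__ = __add__

    def __mul__(self, other):
        other = Interval._coerce(other)
        # The extreme products of the endpoints bound every possible product.
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(math.nextafter(min(products), -math.inf),
                        math.nextafter(max(products), math.inf))

    __rmul__ = __mul__

    def __contains__(self, x):
        return self.lo <= x <= self.hi


def horner(coeffs, x):
    """Evaluate a polynomial (highest-order coefficient first) at x.

    Uses only + and *, so it works for any type that defines those operators.
    """
    result = coeffs[0]
    for c in coeffs[1:]:
        result = result * x + c
    return result
```

Calling `horner([1.0, -3.0, 2.0], Interval(0.1, 0.2))` returns an `Interval` enclosing every value of `x*x - 3*x + 2` on that range, while the same function still accepts plain floats. The bounds are usually wider than strictly necessary, because naive interval arithmetic treats repeated occurrences of `x` as independent.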
+
+For example, Tim Holy gave a talk (titled "To the Curious Incident of the CPU
+in the run-time") which covered a bunch of nitty-gritty details for
+implementing wrapper classes that re-shape or re-size Arrays, including sparse
+arrays.
+
+Lindsey Kuper gave a nice overview of the [ParallelAccelerator.jl][pajl]
+project, which entirely re-compiles Julia into C++ to get some extra performance
+from the static full-program compiler. It seems to me that this only makes
+sense because the Julia language has clean abstractions that the transpiler can
+take advantage of.
+
+[pajl]: http://juliacon.org/abstracts.html#ParallelAccelerator
+
+One of my favorite talks from the whole conference was David Sanders' and Luis
+Benet's talk on ValidatedNumerics ("Precise and rigorous calculations for
+dynamical systems"). Instead of computing on approximate (rounded) scalars,
+they compute on intervals of floating point numbers (or in higher dimensions,
+boxes): at the end of computation the "correct" solution is known to be within
+the final box, which also gives context as to how much numerical error has
+accumulated. By defining new *types* to accomplish this (specifically,
+DualNumbers), they can re-use any generic code in a relatively performant
+manner. They also noted that when there is an analytic form to bound the error
+for all following terms, Taylor expansion approximations can be truncated as
+soon as the interval error exceeds the error in all following terms. Cool!
+
+### Other Fun Stuff
+
+**[Using Julia as a Quick and Dirty Code Generator][10]:**
+The speaker (Arch Robison) is clearly having way too much fun! He used Julia to
+output assembly code to get fast (real-time) discrete Fourier transform (DFT)
+performance for a little video game called "FrequonInvaders". Infectious
+enthusiasm!
+
+**[Autonomous driving for RC cars with ROS and Julia][11]:**
+A fun little project doing "Model Predictive Control" on a small model car to
+do stunts like drifting and slide parking into a tiny space. They achieved
+about a 10Hz closed-loop control latency, which seems to me like barely enough
+for this sort of thing, but clearly worked alright. Everything ran on the car
+itself (no computation on a remote desktop with wireless control or anything
+like that), with an Odroid ARM Linux system and an Arduino-compatible
+microcontroller; Julia code using JuMP and other optimization stuff ran on the
+ARM system. The code and raw data (for analysis) are available on the [BARC
+project website](http://www.barc-project.com). Super cool: having this stuff
+being experimented with already means there will be pressure to improve
+soft-real-time performance in the language itself.
+
+**[Astrodynamics.jl: Modern Spaceflight Dynamics in Julia][12]:**
+Mostly a bunch of code for doing timebase conversions and interpreting (or
+calculating) ephemeris data (which is information about where astro bodies like
+the Moon and planets will be at a given time), but some simple demos of orbital
+simulation and event detection (eg, perihelion time and position) as well. Would
+be cool if the ValidatedNumerics stuff was integrated.
+
+**[GLVisualize][13]:**
+The demos in this talk were really impressive: live editing of mesh vertices,
+relatively high performance, real-time feedback, etc. There were a bunch of
+good graphics talks: the [GR Framework][14] stuff is really impressive in scope
+(though maybe not as big a performance boost over Python as hoped), and
+[Vulkan][15] is exciting.
+
+[10]: http://juliacon.org/abstracts.html#FrequonInvaders
+[11]: http://juliacon.org/abstracts.html#RaceCars
+[12]: http://juliacon.org/abstracts.html#Astrodynamics
+[13]: http://juliacon.org/abstracts.html#GLVisualize
+[14]: http://juliacon.org/abstracts.html#GR
+[15]: http://juliacon.org/abstracts.html#Vulkan
+
+### Diversity
+
+It's sad to say, but the gender diversity at the conference was really poor,
+particularly in contrast to the Recurse Center (where I have spent the past
+couple months). The women I did meet gave some of the best talks, are crucial
+contributors to infrastructure, and are generally amazing: more please! Aside
+from the principle of the thing, there is just something about a giant sea of
+guys at a tech event that results in a tense group vibe. Everybody I spoke to
+one-on-one was friendly and we had great conversations, but as a group there
+was a lot of ice to be broken. In my experience even hitting 10-20% women in
+attendance can thaw this out, but that's just my anecdotal experience.
+
+I haven't attended, but I hear that PyCon has done a great job improving
+diversity with careful planning and [systemic initiatives][pycon-diversity].
+
+Overall, I thought the conference was a great group of people and admirably
+well run. I appreciated the efforts to keep costs low, and everything generally
+ran on time. Thanks to all the volunteer and MIT staff organizers for their
+efforts!
+
+[pycon-diversity]: https://us.pycon.org/2016/about/diversity/
+
+### Julia 1.0
+
+Stefan Karpinski gave an overview of features and roadmap for getting to Julia
+1.0, which I think was a topic close to most attendees' hearts (including
+mine). I ended up with a huge list of written notes, which I'll summarize
+below; the punchline was aiming to have a 1.0 release around one year from now.
+Apparently the one-year goal has been floated in previous years; I'm not sure
+how wise it is in general to float initial release timelines for a project like
+this, it seems like it will just "be done when it's done".
+
+Some of the goals that were interesting to me:
+
+- Arrays: might refactor Arrays to have a separate backing abstraction of "Buffers"
+  with arrays on top (apparently Lua and Torch do this).
+- Strings: move full Unicode support out of core language (Base) and into a
+  package. The `@printf` macro will be refactored into a function. To my
+  surprise, currently Strings are implemented as an Array! This has a
+  relatively large overhead for each string (72 bytes).
+- Modularity and Package infrastructure: currently a mess (I agree), `import`,
+  `using` and `export` will be refactored.
+- Compiler: add non-pthreads multithreading; better static compilation; ability
+  to define a `main()` function and get a standalone script or binary; ability
+  to redefine functions and have the changes propagate (cache invalidation
+  problem); stabilize intermediate representations. Seems like a lot!
+- Optimizations: faster garbage collection, more auto-vectorization (eg, for
+  vector floating point units), improve globals performance. Might pull in part
+  of ParallelAccelerator?
+
+I'm a little nervous how many of these goals are big open questions instead of
+just implementation tasks. I wish there was a more healthy way to experiment
+with new features and refactoring without breaking everything or committing to a
+long-term stable API; I think other languages have settled into good patterns
+for this kind of development, though maybe they needed to go through a
+difficult 1.0 process first. It was mentioned that 0.6 would be the last of the
+0.x series of releases and considered 1.0-alpha, and that from 1.x and on
+things should generally be backwards compatible.
+
+Separate from Stefan's talk, there was a short overview of progress on the next
+iteration of the Julia package and dependency manager, called Pkg3. The goals
+were described as "a mash-up of virtualenv and cargo": virtualenv is a tool for
+isolating per-application dependencies and toolchains in Python, and Cargo is
+the Rust dependency manager and build tool (which is also used in a
+per-application fashion). Pkg3 sounds like it will have a concept of distinct
+"global" (meaning system-wide?) installations and "local" (eg, per-project or
+per-directory) installations and name-spacing. The naming could use some work,
+as "global" and "local" are pretty overloaded, but I think they are chasing the
+right goals. Reproducibility (both for binary generation and data/experiment
+reproduction), lock files (which lock in known-good versions of dependencies a
+la Cargo), and other concepts that I care about were also thrown around. I
+didn't catch all the details (and I'm not sure how much has been worked out and
+implemented yet), but after my experiences with [Elm and Rust][elm-broken], and
+the current state of packaging for Julia, I'm excited for Pkg3!
+
+[elm-broken]: /2016/elm-everything-broken/
+
+<a name="summary"></a>
+
+### Overall Julia Feels
+
+There is sort of an explosion of ideas and experiments going on. It feels sort
+of like what the Ruby community maybe went through with web frameworks, or the
+web community did with languages that compile to Javascript: ambitious ideas,
+which may have been on the back-burner for some time, can finally be prototyped
+quickly and tested in a mostly-real-world environment, and everybody is excited
+to try it out and demo their creations.
+
+One of the sponsors said:
+
+> "there is something quite good about not feeling bad about programming"
+
+and that seemed representative of the current state of Julia.
+It seems
+undeniable that the language is less painful for developing performant
+numerical code than the previous generation of languages and library wrappers.
+
+Perhaps because of this enthusiasm and froth of ideas, I'm a little worried
+that the foundations of Julia (the language and the ecosystem) have not yet had
+time to fully bake. The more demos and experiments that get implemented, and
+the more popular they become, the more delicate it becomes to make hard
+decisions about language syntax and features. I think people want stability and
+promised features *yesterday*, but these things take time and reflection. My
+feeling right now is that it doesn't really matter. The enthusiasm for
+*a language like Julia* is proven and growing. Julia itself might end up being
+the first try that gets thrown away in a decade or two, but in the end we'll
+end up with something which is both exciting and robust.
+
+[PyX.jl]: https://github.com/bnewbold/PyX.jl
+[rust]: https://www.rust-lang.org/
+
diff --git a/static/fig/julia_logo.png b/static/fig/julia_logo.png
Binary files differ
new file mode 100644
index 0000000..0622136
--- /dev/null
+++ b/static/fig/julia_logo.png