Hack the textbook figures
Every figure in the textbook Statistics, Data Mining, and Machine Learning in Astronomy is downloadable and fully reproducible online. Jake VanderPlas accomplished this heroic feat as a graduate student at the University of Washington. Jake recounted the origin story to some of us at the Hack Week. He explained that he would usually finish a figure the same week it was conceived, and that he was happy with the whole experience of helping to make the textbook and ultimately becoming a coauthor. His figures are now indispensable. Because of Jake's investment, generations of astronomers to come can benefit from reproducing the explanatory material in the textbook.
The figures complement the textbook prose. The prose explains the theoretical framework underlying the concepts and derives the equations. By digging into the Python code behind a figure, the reader can see how the method is implemented and try it out by tweaking the input: "What happens if I double the noise? Or decimate the number of data points? Or change this-or-that parameter? How long does it take to run?"
These and other questions motivated my hack idea, which was to dig into the source code of textbook figures and do some hacking.
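To give a flavor of the kind of experiment those questions invite, here is a minimal sketch. It is a toy stand-in, not code from the textbook or from astroML: the straight-line model, the function names `fit_slope` and `slope_scatter`, and every parameter value are assumptions chosen purely for illustration.

```python
import random
import statistics
import time


def fit_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def slope_scatter(n_points, noise, n_trials=200, seed=0):
    """Standard deviation of the fitted slope over many simulated data sets."""
    rng = random.Random(seed)  # fixed seed so reruns are comparable
    x = [i / (n_points - 1) for i in range(n_points)]
    slopes = []
    for _ in range(n_trials):
        # Hypothetical model: true slope of 2 plus Gaussian noise.
        y = [2.0 * xi + rng.gauss(0.0, noise) for xi in x]
        slopes.append(fit_slope(x, y))
    return statistics.stdev(slopes)


experiments = [
    ("baseline", 100, 0.5),
    ("double the noise", 100, 1.0),
    ("decimate the data", 10, 0.5),
]
for label, n_points, noise in experiments:
    start = time.perf_counter()
    scatter = slope_scatter(n_points, noise)
    elapsed = time.perf_counter() - start
    print(f"{label:>18}: slope scatter = {scatter:.3f} ({elapsed:.3f} s)")
```

Because the seed is fixed, doubling the noise should double the scatter in the fitted slope, and keeping only one point in ten should inflate it as well. The timings will depend on your machine.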
So on Wednesday of the Hack Week, about eight of us sat at one table and hacked the book figures. The figure above is one of those figures, hacked by Beth Reid (BIDS) and Phil Marshall (SLAC). Beth and Phil pair-coded on Figure 8.10 (cf. the original figure). The choice of figure and its redesign were both inspired by Hack Week breakout sessions: specifically, Dan Foreman-Mackey's Gaussian Process breakout on Monday, and Jake's breakout on D3.js and his matplotlib wrapper for it, MPLD3. As you can see, Beth and Phil's figure adds a hover-over effect that highlights individual realizations of the Gaussian Process curves consistent with the data points. See the textbook, the figure caption, and/or Dan's Gaussian Process Tutorial for further discussion.
Hack Week participants Wilma Trick and Michael Walther (MPIA Heidelberg, Germany) hacked on Figure 9.14, available here. Ruth Angus (Oxford, currently a pre-doctoral visitor at CfA) made an interactive version of Figure 9.2 using IPython Notebooks.
Specifically, she changed the mean positions of the two clusters of points in the figure to address the questions: "What happens if the classification boundary is not so obvious? How does the classifier grapple with uncertainty?" It was fascinating to see the classification boundary update in real time as Ruth dragged the interact() slider left and right.
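The idea behind that experiment can be sketched without a notebook. This is not Ruth's code or the textbook's: it is a one-dimensional toy in which the names `simulate` and `boundary_and_error`, the unit-variance Gaussian clusters, and the midpoint decision rule are all assumptions made for illustration.

```python
import random
import statistics


def simulate(separation, n_per_class=500, seed=1):
    """Draw two 1-D Gaussian clusters whose centers sit `separation` apart."""
    rng = random.Random(seed)
    left = [rng.gauss(-separation / 2, 1.0) for _ in range(n_per_class)]
    right = [rng.gauss(+separation / 2, 1.0) for _ in range(n_per_class)]
    return left, right


def boundary_and_error(separation):
    """Fit the class means, put the boundary midway, score the training data."""
    left, right = simulate(separation)
    # Assumption: equal priors and equal variances, so the Gaussian
    # decision boundary is the midpoint between the fitted class means.
    boundary = 0.5 * (statistics.fmean(left) + statistics.fmean(right))
    wrong = sum(x > boundary for x in left) + sum(x <= boundary for x in right)
    return boundary, wrong / (len(left) + len(right))


for separation in (4.0, 2.0, 1.0, 0.5):
    boundary, error = boundary_and_error(separation)
    print(f"separation {separation:3.1f}: boundary at {boundary:+.2f}, "
          f"misclassified {error:.1%}")
```

As the clusters slide together the boundary barely moves, but the misclassified fraction climbs, which is one way to see the classifier grappling with uncertainty. In a notebook, a function like `boundary_and_error` could be handed to `interact` from `ipywidgets` to get a slider over `separation`.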
You can find all of the submitted hacked book figures on the project page: http://gully.github.io/astroMLfigs/. I welcome hack submissions from the community! If you have hacked on one of Jake's book figures, please submit it via GitHub pull request to the GitHub repository.