<h1>Astro Hack Week (hacking)</h1>
<p><em><a href="http://astrohackweek.github.io/">astrohackweek.github.io</a>. Feed last updated Tue, 06 Oct 2015 04:03:45 GMT.</em></p>
<h2>Hack the textbook figures</h2>
<p><em>By Michael Gully-Santiago</em></p>
<div><p>Every single figure in the <a href="http://press.princeton.edu/titles/10159.html">textbook</a> <em>Statistics, Data Mining, and Machine Learning in Astronomy</em> is <a href="http://www.astroml.org/book_figures/">downloadable and fully reproducible online</a>. Jake VanderPlas accomplished this heroic feat as a graduate student at the University of Washington. Jake recalled the origin story to some of us at the hack week: he would usually have a figure done the same week it was conceived, and was happy with the whole experience of helping to make the textbook, ultimately becoming a coauthor. His figures are now indispensable. Because of Jake's investment, generations of astronomers to come can benefit from reproducing the explanatory material in the textbook. The figures are complementary to the textbook prose: the prose explains the theoretical framework underlying the concepts, and the equations are derived there. But by digging into the figure's Python code, the reader can see how the method is <em>implemented</em>, and try it out by tweaking the input. "What happens if I double the noise? Or decimate the number of data points? Or change this-or-that parameter? How long does it take to run?"</p>
<p>These and other questions motivated my hack idea, which was to dig into the source code of textbook figures and do some hacking. </p>
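In that spirit, here is a minimal, self-contained sketch of the "double the noise" experiment. It is not one of the actual book figures; it simply fits a straight line to simulated data and measures how the scatter in the recovered slope scales with the noise level.

```python
import numpy as np

rng = np.random.default_rng(42)

def slope_scatter(noise_level, n_points=50, n_trials=200):
    """Fit a straight line to noisy data many times and return
    the standard deviation of the recovered slope."""
    x = np.linspace(0, 10, n_points)
    slopes = []
    for _ in range(n_trials):
        y = 2.0 * x + 1.0 + rng.normal(0, noise_level, n_points)
        slope, _intercept = np.polyfit(x, y, 1)
        slopes.append(slope)
    return np.std(slopes)

base = slope_scatter(noise_level=1.0)
doubled = slope_scatter(noise_level=2.0)
# Doubling the noise roughly doubles the uncertainty on the fitted slope.
print(base, doubled)
```

This is exactly the kind of tweak-and-rerun loop the book figures invite: change one number, rerun, and see how the result responds.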
<div id="test_figure"></div>
<script type="text/javascript" src="http://astrohackweek.github.io/blog/js/hack-book-figs.js"></script>
<script>
draw_figure("test_figure");
</script>
<p>So on Wednesday of the Hack Week a table of about 8 of us all hacked the book figures. The figure above is one of those figures,
</p><p><a href="http://astrohackweek.github.io/blog/Hack-the-textbook-figures.html">Read more…</a> (2 min remaining to read)</p></div>hackingIPython Notebookmachine learningstatisticsvisualizationhttp://astrohackweek.github.io/blog/Hack-the-textbook-figures.htmlTue, 07 Oct 2014 15:30:00 GMTBayesian Evidence Calculationhttp://astrohackweek.github.io/blog/bayesian-evidence.htmlKyle Barbary<div><p>In a Bayesian framework, object classification or model comparison can
be done naturally by comparing the Bayesian <em>evidence</em> between two or
more models, given the data. The evidence is the integral of the
likelihood of the data over the entire prior volume for all the model
parameters, weighted by the prior. (The ratio of evidence for two
different models is known as the <a href="http://en.wikipedia.org/wiki/Bayes_factor">Bayes
Factor</a>.) This
multi-dimensional integral gets increasingly computationally intensive
as the number of parameters increases. As a result, several clever
algorithms have been developed to efficiently approximate the answer.</p>
<p>In this hack, I looked at a couple of specific Python implementations of such
algorithms.</p>
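Before reaching for those packages, the definition itself can be made concrete. The sketch below is a toy illustration only (made-up data, a flat prior chosen for the example): it computes the evidence for a one-parameter Gaussian model by brute-force quadrature over the prior, then forms a Bayes factor against a zero-parameter model. Real problems need the clever algorithms precisely because this grid approach fails in many dimensions.

```python
import numpy as np

# Toy data: draws from a Gaussian with a (secretly) nonzero mean.
rng = np.random.default_rng(0)
data = rng.normal(loc=0.5, scale=1.0, size=20)

def log_likelihood(mu, data, sigma=1.0):
    return (-0.5 * np.sum((data - mu) ** 2) / sigma**2
            - len(data) * 0.5 * np.log(2 * np.pi * sigma**2))

# Model 1: mu is free, with a flat prior on [-5, 5].  The evidence is
# the likelihood integrated over the prior, weighted by the prior density.
mu_grid = np.linspace(-5, 5, 2001)
dmu = mu_grid[1] - mu_grid[0]
prior_density = 1.0 / 10.0
likelihood = np.exp([log_likelihood(mu, data) for mu in mu_grid])
Z1 = np.sum(likelihood * prior_density) * dmu

# Model 0: mu fixed at 0, no free parameters, so the "integral" is
# just the likelihood at that point.
Z0 = np.exp(log_likelihood(0.0, data))

bayes_factor = Z1 / Z0
print(bayes_factor)
```

Note the built-in Occam penalty: Model 1 pays for its flexibility because most of its prior volume fits the data poorly, so a higher peak likelihood does not automatically win.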
<p><a href="http://astrohackweek.github.io/blog/bayesian-evidence.html">Read more…</a> (3 min remaining to read)</p></div>bayesian evidencehackinghttp://astrohackweek.github.io/blog/bayesian-evidence.htmlFri, 03 Oct 2014 15:00:00 GMTK2 Photometryhttp://astrohackweek.github.io/blog/k2-photometry.htmlDan Foreman-Mackey<div><div style="float: left; padding-bottom: 6px;">
<img src="http://astrohackweek.github.io/blog/images/dfm-adhw-img.png" width="500">
</div>
<p>For my AstroHackWeek project, I decided to hack on the new images coming from
<a href="http://keplerscience.arc.nasa.gov/K2/">NASA's K2 mission</a>, the second
generation of the <em>Kepler</em> satellite.
The original <em>Kepler</em> mission obtained exquisite precision in the photometry
because the satellite's pointing was stable to better than a hundredth of a
pixel.
For <em>K2</em>, this is no longer the case.
Therefore, we'll need to work a little harder to extract useful photometric
measurements from these data.
That being said, these pointing variations also break some of the degeneracies
between the detector's flat field and the PSF, so we might be able to
learn things about <em>Kepler</em> that we couldn't from the previous data
releases.</p>
<p>At the hack week, I got a proof-of-concept implemented but there's definitely
a lot to do if we want to develop a general method.
The basic idea is to build a flexible probabilistic model inspired by what we
know about the physical properties of <em>Kepler</em> and then optimize the
parameters of this model to produce a light curve.</p>
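The actual model in dfm/kpsf is much richer, but the core idea of fitting a stellar flux together with a pointing-dependent systematic, then optimizing, can be sketched in one dimension. Everything below (the random pointing drift, the linear sensitivity term) is invented for illustration and is not the kpsf model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy version of the idea: the measured flux is a constant stellar flux
# modulated by a sensitivity that depends on the (drifting) pointing.
n = 500
offsets = np.cumsum(rng.normal(0, 0.05, n))   # slow pointing drift (pixels)
true_flux, true_slope = 1000.0, 0.2
raw = true_flux * (1 + true_slope * offsets) + rng.normal(0, 1.0, n)

# Linear model: raw ~ f + (f*s) * offsets.  Optimize by least squares.
A = np.column_stack([np.ones(n), offsets])
(f_hat, fs_hat), *_ = np.linalg.lstsq(A, raw, rcond=None)
s_hat = fs_hat / f_hat

# Divide out the fitted pointing systematic to recover a flat light curve.
corrected = raw / (1 + s_hat * offsets)
print(f_hat, s_hat, np.std(corrected))
```

In the real problem the "offsets" are themselves unknown and two-dimensional, and the sensitivity model couples the flat field to the PSF, which is why a full probabilistic model and a proper optimizer are needed.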
<p>The figure at the top of this page shows a single frame observed in the
engineering phase of K2 on the left and, on the right, the optimized model for
the same frame.
The code lives (and is being actively developed) on GitHub
<a href="https://github.com/dfm/kpsf">dfm/kpsf</a> and the K2 data can be downloaded from
<a href="http://archive.stsci.edu/search_fields.php?mission=k2">MAST</a> using Python and
the git version of <a href="https://github.com/dfm/kplr">kplr</a>.</p>
<p><a href="http://astrohackweek.github.io/blog/k2-photometry.html">Read more…</a> (2 min remaining to read)</p></div>hackingkeplerprobabilistic modelshttp://astrohackweek.github.io/blog/k2-photometry.htmlThu, 25 Sep 2014 18:00:00 GMTMulti-Output Random Forestshttp://astrohackweek.github.io/blog/multi-output-random-forests.htmlJake VanderPlas<div tabindex="-1" id="notebook" class="border-box-sizing">
<div class="container" id="notebook-container">
<div class="cell border-box-sizing text_cell rendered">
<div class="prompt input_prompt">
</div>
<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>Classic machine learning algorithms map multiple inputs to a single output.
For example, you might have five photometric observations of a galaxy, and predict a single attribute or label (like the redshift, metallicity, etc.).
When multiple outputs are desired, standard practice is to run several independent predictions: first predict one variable, then the next.
The problem with this approach is that it completely ignores <em>correlations</em> in the outputs.</p>
<p>This is my Thursday hack, which was to explore ideas to improve on this within Random Forests.</p>
<p><a href="http://astrohackweek.github.io/blog/multi-output-random-forests.html">Read more…</a> (3 min remaining to read)</p></div></div></div></div></div>hackingmachine learningpythonrandom forestshttp://astrohackweek.github.io/blog/multi-output-random-forests.htmlSat, 20 Sep 2014 18:30:00 GMT