MadMode

Dan Connolly's tinkering lab notebook

Writing Madmode Articles with IPython and Docker

When I was doing exploratory signal processing a year ago, the IPython notebook was obviously a good tool. I tried it again recently for bringing old math notes back to life, and that went well too. So I'm putting a little effort into tooling support.

Hypertext editing with markdown works pretty well, especially a cell at a time. I was a little concerned that I'd miss the ability to select/cut/copy/paste multiple cells as in Mathematica, or to do file-wide search and replace as in emacs, but so far I haven't needed either.

Installing IPython notebook via a docker container

The Ubuntu 12.04 ipython notebook package isn't up to the task, and between these episodes, my manual installation bit-rotted. I got it running again and jotted down some rough notes for future reference:

#!/bin/sh
> virtualenv ~/pyenv/pynb
> . ~/pyenv/pynb/bin/activate
(pynb)> pip install ipython
(pynb)> sudo apt-get install libzmq-dev
(pynb)> pip install pyzmq
        ZMQ version detected: 2.1.11
   Warning: Detected ZMQ version: 2.1.11, but pyzmq targets ZMQ 4.0.3.
   Warning: libzmq features and fixes introduced after 2.1.11 will be unavailable.
(pynb)> pip install jinja2
  Downloading Jinja2-2.7.1.tar.gz (377Kb): 377Kb downloaded
(pynb)> pip install tornado
  Downloading tornado-3.1.1.tar.gz (374Kb): 374Kb downloaded
(pynb)> ipython notebook

Notes like those quickly get out of date. For example, they leave out pandoc, which nbconvert requires.

Then I realized a docker container would be just the thing. And lo, dckc/ipython-docker is born:

$ sudo docker run -p 8123:8888 -v `/bin/pwd`:/notebooks  -t dckc/ipython-docker
2013-12-31 04:28:05.305 [NotebookApp] Created profile dir: u'/.ipython/profile_default'
2013-12-31 04:28:05.308 [NotebookApp] Using MathJax from CDN: http://cdn.mathjax.org/mathjax/latest/MathJax.js
2013-12-31 04:28:05.320 [NotebookApp] Serving notebooks from local directory: /notebooks
2013-12-31 04:28:05.320 [NotebookApp] The IPython Notebook is running at: http://0.0.0.0:8888/
2013-12-31 04:28:05.321 [NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

Note that http://0.0.0.0:8888/ is the address as seen from inside the container. The -p 8123:8888 option publishes container port 8888 on host port 8123, so from outside the container we use port 8123.
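The port and volume mapping can be spelled out in a few lines of Python. This is only a sketch: docker_run_argv and notebook_url are hypothetical helper names, and the only docker flags used are the -p, -v and -t options from the command above. It builds the argument list and the URL to visit from the host; it does not run docker itself.

```python
def docker_run_argv(image, host_port, notebook_dir, container_port=8888):
    """Build the argv for the docker run command shown above.

    Hypothetical helper: it only assembles the arguments, it does not
    start a container.
    """
    return [
        'docker', 'run',
        # publish container_port (inside) on host_port (outside)
        '-p', '%d:%d' % (host_port, container_port),
        # mount the host notebook directory at /notebooks in the container
        '-v', '%s:/notebooks' % notebook_dir,
        '-t', image,
    ]


def notebook_url(host_port, host='localhost'):
    """URL to use from outside the container (host name is an assumption)."""
    return 'http://%s:%d/' % (host, host_port)


print(' '.join(docker_run_argv('dckc/ipython-docker', 8123, '/home/me/notes')))
print(notebook_url(8123))
```

Passing the list to subprocess.call (with sudo in front, if your setup needs it) would launch the container without any shell quoting worries.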

The Control-C message needs some context too: you'd have to attach to the container to send it signals from the keyboard. I typically just use sudo docker kill to stop the service. I haven't bothered with the details of starting at boot and such.

Of course, after I got it all working, I found several other ipython images in the index. But I'm not sorry I worked it out for myself.

Getting started with docker

Docker is moving rapidly, but it's considerably more polished now than when I first looked at it. The docker apt-repositories for Ubuntu (key fingerprint: 36A1 D786 9245 C895 0F96 6E92 D857 6A8B A88D 21E9) work just fine. My only problem getting started this time was an old installation lying around in /usr/local/bin that got in the way, and the diagnostics were a little mysterious:

$ sudo docker run -p :8888 -t ipython-notebook
WARNING: The mapping to public ports on your host has been deprecated. Use -p to publish the ports.

Generating a static HTML version of a notebook

IPython supports conversion to HTML, but out of the box, you get either:

  1. a stand-alone HTML document
    • with all sorts of CSS that may or may not conflict with a blog style
    • with no links to blog context
  2. a stripped-down HTML document body with
    • no style
    • no syntax highlighting

Fortunately, the API for custom renditions is straightforward and well documented. My mm_ipy.py is serviceable, though I'm still working through some issues with pygments vs. javascript code highlighting and such.

Let's import it to take a look:

In [1]:
import imp

mm_ipy = imp.load_source('mm_ipy', 'code/ipynb_pub/mm_ipy.py')

Then let's connect the dots between rst markup in documentation and HTML renditions of values in ipython notebook cell outputs:

In [2]:
from docutils.core import publish_parts

class Doc(object):
    def __init__(self, it):
        self.it = it

    def _repr_html_(self):
        return publish_parts(self.it.__doc__, writer_name='html')['html_body']
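The trick is that the notebook calls an object's _repr_html_ method, if it has one, and shows the returned HTML as the cell output. To see that protocol without the docutils dependency, here is a minimal sketch using only the standard library. PlainDoc and sample are hypothetical names for illustration, and the code is written for Python 3, whereas the cells in this article ran on Python 2.

```python
import html
import inspect


class PlainDoc(object):
    """Show an object's docstring as preformatted HTML in a notebook."""

    def __init__(self, it):
        self.it = it

    def _repr_html_(self):
        # inspect.getdoc returns None when there is no docstring
        text = inspect.getdoc(self.it) or ''
        # escape markup characters so the docstring shows up literally
        return '<pre>%s</pre>' % html.escape(text)


def sample():
    """Compare a < b."""


print(PlainDoc(sample)._repr_html_())
```

Unlike Doc, this does not interpret rst markup at all; it just shows that anything with a _repr_html_ method gets a rich rendition.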

In [3]:
Doc(mm_ipy)

Out[3]:

mm_ipy -- convert ipython notebook to markdown for madmode blog

Usage:

$ python mm_ipy.py article_in.ipynb article_out.md

See article_meta for conventions for title, date, tags, etc.

Note

IPython.nbconvert.HTMLExporter has late-binding dependencies on pandoc, pygments, etc.

Acknowledgements

The notebook should have some article metadata in a markdown cell surrounded by a certain kind of pre tags:

In [4]:
Doc(mm_ipy.article_meta)

Out[4]:

Collect article metadata from a notebook.

The title is taken from the (first) heading level 1 cell.

Other metadata is taken from the (first) cell that starts with:

>>> print article_meta.func_defaults[0]
<pre class="about yaml">

Metadata is written in YAML-ish name: value style (see grok_yaml for details).

The closing tag is ignored:

>>> print article_meta.func_defaults[1]
</pre>
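To make the convention concrete, here is a rough sketch of locating that metadata cell in a notebook that has been loaded with json.load. It is not the code in mm_ipy.py: find_meta_source is a hypothetical name, and the sketch assumes a notebook layout with a top-level 'cells' list of dicts carrying 'cell_type' and 'source' keys, which may differ from the notebook format version in use here.

```python
ABOUT_OPEN = '<pre class="about yaml">'


def find_meta_source(notebook):
    """Return the text of the first markdown cell that starts with ABOUT_OPEN.

    Assumes a dict with a top-level 'cells' list (an assumption about the
    notebook format). Returns None when no such cell exists.
    """
    for cell in notebook.get('cells', []):
        if cell.get('cell_type') != 'markdown':
            continue
        source = cell.get('source', '')
        # 'source' may be stored as a list of lines or as one string
        if isinstance(source, list):
            source = ''.join(source)
        if source.startswith(ABOUT_OPEN):
            return source
    return None


example = {'cells': [
    {'cell_type': 'markdown', 'source': ['# A Title\n']},
    {'cell_type': 'markdown',
     'source': ['<pre class="about yaml">\n', 'date: 2001-01-01\n', '</pre>']},
]}
print(find_meta_source(example))
```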

In [5]:
Doc(mm_ipy.grok_yaml)

Out[5]:

Quick-n-dirty YAML parser.

>>> grok_yaml("""<pre>
... date: 2001-01-01
... tags: ['travel', 'humor']
... </pre>""", excludes=['<'])
[('date', '2001-01-01'), ('tags', "['travel', 'humor']")]

Note

TODO: handle continuation lines properly.

>>> grok_yaml("""<pre>
... summary: What I did
...   this summer.
... </pre>""", excludes=['<'])
[('summary', 'What I did'), ('  this summer.',)]
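One way the continuation-line TODO might be handled is to fold an indented line into the value of the previous pair. The following is a sketch under that assumption, not the actual grok_yaml: parse_pairs is a hypothetical stand-in that mimics the excludes argument from the doctests above, and it is written for Python 3.

```python
def parse_pairs(text, excludes=('<',)):
    """Parse 'name: value' lines into a list of (name, value) pairs.

    Lines starting with any prefix in excludes are skipped. A line that
    starts with whitespace is treated as a continuation of the previous
    value (an assumed convention, not the one in mm_ipy.py).
    """
    pairs = []
    for line in text.split('\n'):
        if not line.strip() or line.startswith(tuple(excludes)):
            continue
        if line[0].isspace() and pairs:
            # fold the continuation into the previous pair's value
            name, value = pairs[-1]
            pairs[-1] = (name, value + ' ' + line.strip())
            continue
        # split on the first colon only, so values may contain colons
        name, _, value = line.partition(':')
        pairs.append((name.strip(), value.strip()))
    return pairs


print(parse_pairs("<pre>\nsummary: What I did\n  this summer.\n</pre>"))
```

As with the original, values stay as plain strings; a real YAML library would be the better choice for anything beyond simple name: value metadata.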

Packages from apt and PyPI

The container doesn't have access to packages installed in the host system via pip or apt-get. But I can install from pypi within the container.

Installing from within the Dockerfile makes the package part of the image, but it involves rebuilding the image and then killing and restarting the container. And it feels less minimal/modular somehow.

Installing from within a notebook (e.g. !pip install docutils) is handy, but once the container is stopped, the installation goes away (unless the container is committed and the image kept handy somehow).
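Since an in-notebook installation vanishes with the container, a notebook can guard its imports with a small helper in its first cell. A minimal sketch, assuming Python 3 and that pip is available to the interpreter running the kernel; ensure_package is a hypothetical name.

```python
import importlib.util
import subprocess
import sys


def ensure_package(module_name, pip_name=None):
    """Install a package with pip only if its module is not importable.

    module_name should be a top-level module name. Returns True if an
    install was attempted, False if the module was already present.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False
    # run pip with the same interpreter the kernel is using
    subprocess.check_call(
        [sys.executable, '-m', 'pip', 'install', pip_name or module_name])
    return True


# json ships with Python, so nothing gets installed here
print(ensure_package('json'))
```

In a fresh container, ensure_package('docutils') would then do the same job as the !pip install docutils line, and it would be a no-op once the package is there.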

File layout limitations

The IPython notebook service can only see notebooks in one directory. I wish it were more web-like, i.e. it expected to be part of a larger whole. I'd like to use it to edit .ipynb files under the various date-oriented subdirectories of my blog. Linking from a notebook to files elsewhere in the blog is also pretty awkward.
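Until the notebook service handles subdirectories, it at least helps to know where the notebooks are. Here is a small sketch that lists the .ipynb files under a blog's directory tree so the server can be started in the right place. find_notebooks is a hypothetical helper, and skipping dot-directories is an assumption about what you would want to ignore.

```python
import os


def find_notebooks(root):
    """Return sorted paths, relative to root, of all .ipynb files below it."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # prune hidden directories in place so os.walk does not descend
        dirnames[:] = [d for d in dirnames if not d.startswith('.')]
        for name in filenames:
            if name.endswith('.ipynb'):
                full = os.path.join(dirpath, name)
                found.append(os.path.relpath(full, root))
    return sorted(found)


for path in find_notebooks('.'):
    print(path)
```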

References

TODO: Zotero integration