MadMode http://www.madmode.com/ Dan Connolly's tinkering notebook en-us Thu, 12 Jul 2018 18:40:04 +0000 Thu, 12 Jul 2018 18:40:04 +0000 Smart Contracts for Health Research Data Sharing http://www.madmode.com//2018/boulder-bos/ <p>I'm under contract with RChain, two days a week from June to December, to work on <a href="https://github.com/rchain/bounties/issues/788">smart contracts for health research data sharing</a>. I'm still at KUMC for the other three days a week.</p> <h2>RChain Devcon Boulder</h2> <p>In April, I was invited to <a href="https://developer.rchain.coop/conference">RChain Devcon</a> in Boulder. I enjoyed Greg's <a href="https://www.youtube.com/watch?v=3R3IL1bGm0s&amp;t=617s">RChain VM talk</a>. (<em>More on that later, I hope.</em>) Vlad's CBC Casper talk was mostly familiar; I think I got most of it from <a href="https://www.youtube.com/watch?v=z3sY8zZRPtw">his devcon three talk</a>, which was just my speed.</p> <p>I gave a short talk about <a href="https://docs.google.com/presentation/d/1B2Vu8o3ACwruY6HY1ayXRQ4qkNKsMy4hdbOdxrCHI2o/edit">Getting Involved in the RChain Bounty Program</a> (<a href="https://www.youtube.com/watch?v=HsQTDNEIbjs&amp;t=1s">video</a>).</p> <p>TimBL had written <a href="https://webfoundation.org/2018/03/web-birthday-29/">The web is under threat</a>, which I used as a springboard for <a href="https://docs.google.com/presentation/d/1GOboFZj311rfGExrtCFp3aaQ2vYhxHqX3qnix21nqo4/edit">Can RChain Answer?</a> (<a href="https://www.youtube.com/watch?v=ZnBbi6ifzdo&amp;t=849s">video</a>). 
My summary slide is:</p> <blockquote> <p>It has a lot of the right pieces:</p> <ul> <li>Object Capability Discipline</li> <li>Support for Formal Verification</li> <li>Integrated Economics with Network Architecture</li> <li>Scalability</li> <li>Down to transistors</li> <li>Up to the globe</li> <li>Visionary Leadership</li> <li>Cooperative Governance</li> </ul> </blockquote> <p>I did a quick run up of capabilities from simple closure-based objects to sealing and unsealing to capability-based money as in <a href="http://erights.org/elib/capability/ode/index.html">Miller, Morningstar, and Franz</a> from FC 2000, including:</p> <blockquote> <p>Before presenting the following simple example of capability-based money, we must attempt to head off a confusion this example repeatedly causes. <strong>We are not proposing to actually do money this way!</strong> A desirable money system must also provide for: ... blinding, non-repudiation, accounting controls, and backing (redeemability).</p> </blockquote> <p>Then—tada!—I showed that <strong>RChain <em>does</em> propose to do money this way</strong> by translating the E code to <a href="https://github.com/rchain/rchain/blob/master/casper/src/main/rholang/MakeMint.rho">MakeMint.rho</a>:</p> <pre><code class="rholang"> contract MakeMint(return) = { new pairCh, thisMint, internalMakePurse in { MakeBrandPair!(*pairCh) | for(@p &lt;- pairCh) { ... 
</code></pre> <h2>HERON Clinical Data Repository Access Policy</h2> <p>The main smart contract I want to work on is an evolution of the <a href="https://informatics.kumc.edu/work/wiki/HeronAdminDev">HERON policy enforcement layer</a>, a bunch of Python / PHP / SQL code that enforces the policy around access to KUMC's clinical data repository on ~500K patients: you have to be qualified faculty or sponsored by one, your human subjects training has to be current, and you have to sign a couple of forms:</p> <p><img alt="flow of authority" src="https://informatics.kumc.edu/work/graphviz/a02d7fe066856aadf1894e50f41f4b2aa27ca3b4.dot.png" /></p> <p>Once you get through all that, you can use the i2b2 ad-hoc query interface to do "cohort discovery"; for example, to find out if enough of the right kind of patients come through KU Med for the study you want to do.</p> <h2>i2b2 Harvard Symposium</h2> <p>The <a href="http://transmartfoundation.org/harvard-symposium-2018/">i2b2 tranSMART Foundation Harvard Symposium</a> was this year's annual meeting of the i2b2 user community. Automation of regulatory processes is a regular topic and this year <a href="http://geekdoctor.blogspot.com/">John Halamka</a> spoke on <em>Emerging Models of Data Flow - APIs, Blockchain and Cloud</em>. He cut through a lot of the mystery around blockchain technology with a great illustration of one-way hashing. :)</p> <p><img src="https://lh3.googleusercontent.com/6w32pE2Tpc6ZHzHSWKR-TmNvIvL1Nl_21z3UAYFdCukNr-MJcDlPpVTy1HRnTOcrVD2jTt59TQeaD4WEmcftD8PtxAQWeA-OIlRJ8mageizVjdaTUCCL2ENSjmHulogCqPtwcYeQ-Q=w640-h480" width="640" height="480" /></p> <h2>Decentralized identity, verifiable claims</h2> <p>While killing time in the airport, I managed to reach Manu Sporny. His work around <a href="https://veres.io/use-cases/verifiable-prescriptions/">verifiable prescriptions</a> has certain parallels with research data sharing. 
It wasn't all good news, but he did put me in touch with a couple of his colleagues in the Boston area.</p> <p>I had hoped to meet with <a href="https://dustycloud.org/">Chris Webber</a> and sync up on js libraries for <a href="https://w3c-ccg.github.io/ocap-ld/">linked data capabilities</a>. Capability security seems necessary and sufficient, to me, for decentralized access control. As Stiegler put it in the <a href="http://erights.org/talks/efun/SecurityPictureBook.pdf">picture book</a>:</p> <blockquote> <p>The patterns described in this picturebook are simple because they discard the modern fascination with the identities of the participants. Individual Authentication is so pervasive, it is now a part of the problem.</p> <p>Suppose that your car, instead of accepting a delegatable key, demanded that your driver’s license match the car’s title registry, which happens to be in your spouse’s name. Entrepreneurs would leap forward to develop ever more powerful "identity management" for automobiles. We would subcontract to security experts so our teenage daughters could borrow the car to buy milk. Heaven forfend that the daughter, breaking her leg, had to delegate to her best friend to get to the hospital.</p> </blockquote> <p>Unfortunately, while Chris is closer to Boston these days, he's still a couple hours away.</p> <p>But I did get to meet <a href="http://computingjoy.com/">Dmitri Zagidulin</a> over breakfast. He has done javascript work both with TimBL and company on <a href="https://github.com/solid/solid">solid</a> and with Manu on decentralized identifiers (e.g. <a href="https://github.com/digitalbazaar/did-io">did-io</a>). 
He isn't working on linked data capabilities yet, so I twisted his arm in that direction, and <a href="http://www.madmode.com/2011/07/secure-mashups-csrf-resistent.html">away from WebID</a>.</p> <p>Things started to click for him when I talked about verifiable claims in terms of insurance and markets:</p> <blockquote> <p>Proof of identity, in so much as it involves revelation of the profile, or enables its revelation through the use of unique identifiers, is trade in an asset when the information revealed is more than the minimum required with current technology. -- <a href="https://www.w3.org/2000/12/drm-ws/pp/hp-poorvi2.html">Vora et al. from HP at W3C 2001 DRM Workshop</a></p> </blockquote> <p>Take proof of age, for example. A smart contract for a client might ask "who will bet me $10 at 50-to-1 that this request comes from someone over 21 years of age?" A business where people are routinely physically present is in a position to take such bets with high confidence.</p> <h2>Personal Health Records</h2> <p>Manu also put me in touch with <a href="http://healthurl.com/www/Blogs_+.html">Adrian Gropper</a>, CTO of Patient Privacy Rights. Meeting in the airport as he was arriving and I was departing was a bit hectic, so I couldn't be sure whether I had read the paper he and Mark Miller worked on:</p> <ul> <li><a href="https://github.com/WebOfTrustInfo/rebooting-the-web-of-trust-fall2017/blob/master/final-documents/identity-hubs-capabilities-perspective.md">Identity Hubs Capabilities Perspective</a> Rebooting the Web of Trust V, October 13, 2017 by Adrian Gropper, Drummond Reed, Mark S. 
Miller</li> </ul> <p>His description of <a href="http://hieofone.org/">HIEofone</a> increased the priority of UMA in my things-to-study-in-more-depth list:</p> <blockquote> <p>HIE of One (Health Information Exchange of One) is a volunteer-driven open source project to combine emerging standards for access authorization (OpenID HEART) and emerging standards for blockchain-based self-sovereign identity (DID) into a patient-centered health record infrastructure.</p> </blockquote> <p>He was one of several people on this trip who talked about FHIR deployment in ways that make me want to look into it again. In particular, he mentioned a FHIR sandbox by CMS, the Medicare folks.</p> <h2>OCap-safe JavaScript</h2> <p>On the way home, I got stuck in the Washington D.C. airport overnight due to weather. I used the time to start <a href="https://github.com/dckc/tinyses2rho/">tinyses2rho</a>, some exploratory Scala code to integrate TinySES with RChain.</p> <p>TinySES is a small subset of JavaScript designed so that non-experts can use it to write non-trivial non-exploitable smart contracts. 
It comes from Mark Miller and company, who have been working on object capabilities and smart contracts at least as far back as the early '90s, and recently <a href="https://www.coindesk.com/new-startup-zooko-naval-betting-better-crypto-contracts/">launched Agoric</a>.</p> <p><a href='https://photos.google.com/share/AF1QipPEmCV0T84sLHj1L1zNUQ-eldo2SVxeNYYf49RhnQF-5kp6kHZua4BAbfgCDOrICw?key=cmVENTBqc2YwY2g2QzZxQkJjSUVvVXFYYXg3QTVB&source=ctrlq.org'><img alt='DCA in the wee hours' src='https://lh3.googleusercontent.com/rv9s-h-_E58jZftv4H1XcBlgGtx1hszNMMpXrQyMVDGuath90K8OtXn7_ItZxR0G6n-_1dEVujUf0ED_nKtPq8qZElbDRsAY7PkWvKyOGejZgAZVU6HLKmQ3cKOdF0Rf-gCrh0zM8w=w2400' /></a></p> Thu, 12 Jul 2018 00:00:00 +0000 http://www.madmode.com//2018/boulder-bos/ dreaming big at the RChain Governance Forum http://www.madmode.com//2018/rchain-gov-forum-SEA/ <p>I'm back from the <a href="https://www.rchain.coop/">RChain</a> Governance Forum in Seattle, which was full of big dreams, just like the first Web conference in Geneva.</p> <blockquote> <p>Why RChain? Governance and blockchain can't be separated; I felt compelled to connect them. -- <a href="https://platform.coop/2017/schedule/breakout-panels-next-money-new-tech-for-new-finance">Greg Meredith Sep 2017</a></p> </blockquote> <p>I went in with almost no idea what that means. But at the closing brunch, I found myself telling the next person at the table that the economics of email—the original decentralized Internet communication platform—was what led me to trade my privacy to Google for spam protection.</p> <p>Many of us who built the Web had in mind a decentralized system where access to information and freedom of expression would enrich our lives. At some level, we knew sharp tools cut both ways, but ...</p> <blockquote> <p>When REST was being framed, it seemed inconceivable that two billion people would all agree to use one website (Facebook); -- <a href="https://research.google.com/pubs/pub46310.html">Fielding et. al. 
2017</a></p> </blockquote> <p>The result is a vulnerability that Kate Charlet, who worked on Cyber Policy in the U.S. Department of Defense for the last decade, <a href="http://www.irckc.org/m/event_details.asp?id=1041848">says</a> nation states have successfully used to destabilize our democracy.</p> <p>Perhaps there's something to all this blockchain stuff; perhaps we can repair the imbalance by integrating economics into the next generation network architecture.</p> <p>The imbalance in the system has a glib description:</p> <blockquote> <p>If you are not paying for it, you're not the customer; you're the product being sold. -- <a href="https://www.metafilter.com/95152/Userdriven-discontent#32560467">blue_beetle Aug 2010</a></p> </blockquote> <p>But another way to put it is:</p> <blockquote> <p>There's a strangle-hold on social media today; people don't get a chance to participate in the economic proposition in an equitable way. -- <a href="https://youtu.be/p0a0zu5APd4">Greg Meredith, Jan 2017</a></p> </blockquote> <p>He emphasizes that we need new ways of coordinating, inspired by resilient living systems and reflective intelligence. Tai Chi on Saturday morning was a whole body experience of it.</p> <p>And music!</p> <p>I had great conversations with Peter and Jonathan and Derek and Rudy, including Ted Nelson's ideas on the <a href="http://www.madmode.com/2008/01/technology-that-inspires.html">evolution of movie making</a>.</p> <p>Greg performed in a group of mind-bogglingly improvisational guitar players. Mostly I felt stressed and out of place when listening to it. But I was right at home when the twelve bar blues chord progression emerged in the middle of it.</p> <p>I was also pretty much at home when I saw lightning talks on the agenda, so I offered to lead the session. That was <a href="https://youtu.be/Mmkae9E93tk?t=1h46m13s">great fun</a>. 
I gave two talks myself and enjoyed a lot of the follow-up discussions.</p> <ul> <li><a href="https://docs.google.com/presentation/d/1XxdZxk9mWbB4cAely1_Ly5DZo8sTIeE4jq-s5gqiaqw/edit?usp=sharing">Toward a Trust Metric for Authority in the RChain Bounty System</a></li> <li><a href="https://docs.google.com/presentation/d/1dW1nsNDJjXBd6YTBOxQ21GBGqHMt5Q8f3B1RIG3bNao/edit#slide=id.p">RChain Life, a learning game</a></li> </ul> <p>The technical architecture of RChain also sank in a bit more. In addition to <a href="http://www.madmode.com/2017/rchain-dev-retreat.html">capability security and namespaces</a>, I'm beginning to see reflection (the R in RChain) as a possible solution to the <a href="https://lists.w3.org/Archives/Public/public-cwm-talk/2016JanMar/0000.html">quoting and diagonalization issues</a> TimBL and I struggled with in the Semantic Web.</p> <h2>References</h2> <ul> <li>Fielding, Taylor, Erenkrantz, Gorlick, Whitehead, Khare, Oreizy, <a href="https://research.google.com/pubs/pub46310.html">Reflections on the REST Architectural Style and “Principled Design of the Modern Web Architecture”</a> ESEC/FSE 2017<ul> <li><a href="https://www.youtube.com/watch?v=6oFAmQUM8ws">video</a></li> </ul> </li> <li>Arndt, Verborgh, De Roo, Sun, Mannens, Van De Walle, <a href="http://link.springer.com/chapter/10.1007/978-3-319-21542-6_9">Semantics of Notation3 Logic: A Solution for Implicit Quantification</a> RuleML 2015</li> <li>Berners-Lee, Connolly, Kagal, Hendler, and Scharf, <a href="https://arxiv.org/abs/0711.1533">N3Logic: A Logical Framework for the World Wide Web</a> Journal of Theory and Practice of Logic Programming (TPLP), Special Issue on Logic Programming and the Web, 2008</li> </ul> Wed, 21 Feb 2018 00:00:00 +0000 http://www.madmode.com//2018/rchain-gov-forum-SEA/ Smart Contracts and Capabilities Up Close http://www.madmode.com//2017/rchain-dev-retreat/ <p>This year I was invited to a <a 
href="https://foresight.org/event/next-frontier-blockchain-meets-object-capabilities/">Blockchain meets Object Capabilities panel</a> in San Francisco July 3 and the <a href="https://github.com/rchain/Members/issues/191">RChain developer retreat</a> in Park City Nov 12-17. Leaving Park City was... <em>interesting</em>.</p> <p>My flight to San Francisco was funded from the <a href="http://www.madmode.com/2016/eth-dao.html">$30 that I put into Ethereum</a> in mid 2016--as an educational expense, I thought; I hardly expected to square my investment.</p> <p>Jorge Lopez of <a href="https://economicspace.agency/gravity">Gravity</a> hosted long-time capabilities leaders Mark Miller, Zooko Wilcox, and Brian Warner as well as Arthur Breitman of <a href="https://www.tezos.com/">Tezos</a>. Zooko also represented <a href="https://z.cash/">Zcash</a>.</p> <figure> <a href="https://photos.app.goo.gl/XnUs6dddBlRTl94F2"> <img src="https://lh3.googleusercontent.com/-47cAa-jvInI/WknX6v9rDuI/AAAAAAAAEsk/0WCmJqTB3YAJLdm1gux1PRZ86BF32kU3QCJoC/w530-h224-n-rw/IMG_20170703_182953499.jpg" alt="" /> <figcaption>Mark Miller with Lopez and Warner to his right and Breitman and Wilcox to his left</figcaption> </a> </figure> <p>Brian gave a great summary of the bitcoin market forces but I was too engrossed in the event to take notes; the one quote I managed to write down was "Satoshi is probably in the room."</p> <p><em>Aside: <a href="https://oasis.sandstorm.io/">Sandstorm Oasis</a> let me export the grain I took my notes in even though my subscription had lapsed. 
That's pretty cool.</em></p> <p>The evening before the panel, @zooko gave a <a href="https://twitter.com/zooko/status/881656410777411584">shout</a> out to my <a href="https://github.com/dckc/awesome-ocap">awesome-ocap</a> list, and in a <a href="https://twitter.com/robertobrien/status/905362724292509696">reply from @robertobrien</a>, I discovered <a href="http://rchain-architecture.readthedocs.io/en/latest/contracts/namespaces.html">RChain's Namespace Logic</a>.</p> <p>I saw Mike Stay's name, which was familiar from <a href="https://groups.google.com/forum/#!forum/cap-talk">cap-talk</a> discussion, on some of the citations in the RChain Architecture document, and asked him some questions in email. Before long I was contributing <a href="https://github.com/rchain/rchain/pull/40">little tweaks</a> to the code and had wrangled an invitation to the developer retreat.</p> <blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">RChain developer&#39;s retreat at Flanagan&#39;s <a href="https://t.co/5by4odQ85J">pic.twitter.com/5by4odQ85J</a></p>&mdash; Dan Connolly (@dckc) <a href="https://twitter.com/dckc/status/930621928116592640?ref_src=twsrc%5Etfw">November 15, 2017</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> <p>Again, I was too engrossed to take good notes, but they made <a href="https://www.youtube.com/playlist?list=PLf2bbiic5ZjCPzin3gCSMGiBtbT8UO5o2">recordings of the main sessions</a>.</p> <p>The RChain goals are ambitious, to say the least: "content delivery at the scale of Facebook and transactions at the speed of Visa." Not to mention an "e pluribus unum 2.0, a new center on a scale-invariant axis reconciling the wisdom of collective and the wisdom of the individual" for collaborating on problems such as global warming. 
An <a href="https://www.youtube.com/watch?v=p0a0zu5APd4">interview with Greg Meredith</a> from the <a href="https://bitcointalk.org/index.php?topic=1747033.0">January RChain announcement</a> filled in a lot of the background for me.</p> <p>Nash and the gang did an amazing job hosting. We had dinner out most nights and breakfast in the houses in the morning, and it's a tough call which was better.</p> <p>On the last day, after a mind-blowing week, we were all lounging around, zoned out on our phones, when we noticed the snow. For some, it was their first snowstorm... nice and pretty. Then Nash snapped us out of it: "<strong>Get off the mountain now</strong> before they close the pass and you miss your flights!" As we were getting in the big rental SUV, I asked, "Have you driven in snow before?" and since she hadn't, the keys passed to me.</p> <p>The drive out was a little hairy, but <a href="https://twitter.com/VladZamfir">Vlad</a> shared some tunes and we all got to know each other a little better.</p> <figure> <a href="https://photos.app.goo.gl/eTkY2Pb8L8Errglf2"> <img src="https://lh3.googleusercontent.com/-aUKMLj97inQ/Wk3Dn5U6f5I/AAAAAAAAEuY/-CMagGS7HuQ6RVwCGm42zhNAYfbZkvP2gCJoC/w530-h323-n-rw/IMG_0572.jpg" alt="" /> <figcaption>Escaping Park City</figcaption> </a> </figure> Sun, 31 Dec 2017 00:00:00 +0000 http://www.madmode.com//2017/rchain-dev-retreat/ College Expense Tracking in BASIC09 http://www.madmode.com//2017/ut-austin-expenses/ <p>It's no wonder my kids struggle so much more to pay for college:</p> <blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Hours of minimum wage work needed to pay for four years of public college<br><br>Boomer: 306<br>Millennial: 4,449<br><br>Source: <a href="https://t.co/3ZZDpC9Fgw">https://t.co/3ZZDpC9Fgw</a></p>&mdash; Ryan Carson (@ryancarson) <a href="https://twitter.com/ryancarson/status/943468921834717185?ref_src=twsrc%5Etfw">December 20, 2017</a></blockquote> <script async 
src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>In my freshman year at U.T. Austin, I wrote a <a href="https://en.wikipedia.org/wiki/BASIC09">BASIC09</a> program to track my expenses:</p>
<pre><code class="basic">PRINT CHR$(12); &quot;Expenses -- by Dan Connolly&quot;
PRINT &quot;&lt;A&gt; - Edit Accounts&quot;
PRINT &quot;&lt;E&gt; - Journal Entry&quot;
PRINT &quot;&lt;R&gt; - Generate Report&quot;
PRINT &quot;&lt;C&gt; - Clean up file&quot;
PRINT &quot;&lt;Q&gt; - Quit&quot;
RUN Choose(&quot;Choice: &quot;,&quot;AERCQ&quot;,Choice)
</code></pre>
<p>I found a <code>Rpt02.22</code> report that shows tuition of about $500, mostly covered by a scholarship:</p>
<pre><code class="python">def _cocodisks():
    from pathlib import Path
    return Path('1986-cocodisks')

EXP = _cocodisks() / 'archive' / 'PRG-x' / 'EXP'
tx_lines = list((EXP / 'Rpt02.22').open())
tx_lines = tx_lines[1:]  # skip blank line
tx_lines[:2] + tx_lines[8:11]
</code></pre>
<pre><code>[u'Date Description Amount Source Name Src Bal Dest Name Dest Bal\n',
 u'----------- ------------- ------- ------------- ------- ------------- --------\n',
 u' 9-01-86:11 Books 117.85 Cash 182.15 Books/Supplie 117.85\n',
 u' 9-15-86:11 Scholarship 296.46 National Meri -296.46 U T 496.28\n',
 u' 9-15-86:15 Scholarship 78.54 National Meri -375.00 Cash 260.69\n']
</code></pre>
<p>The last page of the report shows account balances:</p>
<pre><code class="python">acct_hd_ix = next(ix for ix, line in enumerate(tx_lines)
                  if line.strip().startswith('Num'))
acct_lines = tx_lines[acct_hd_ix:]
acct_lines[:5]
</code></pre>
<pre><code>[u' Num Account Name Balance\n',
 u'---- ------------- -------\n',
 u' 1: Cash 63.51\n',
 u' 2: Checks 28.00\n',
 u' 3: Bank Account 888.52\n']
</code></pre>
<p>But the <code>Jrnl</code> data file goes thru March 12...</p>
<pre><code>1986-cocodisks/archive/PRG-x/EXP$ ls -l Jrnl
-r--r--r-- 1 connolly connolly 5443 Mar 12 1987 Jrnl
1986-cocodisks/archive/PRG-x/EXP$ sha1sum Jrnl
3f75dbc8bcdac51874259c44ef1fcad55fe068e0 Jrnl
</code></pre>
<p>... where that report only goes thru Feb 22:</p>
<pre><code class="python">tx_lines[acct_hd_ix - 7: acct_hd_ix - 5]
</code></pre>
<pre><code>[u' 2-22-87:11 Bus .50 Cash 63.51 Living Expens 138.94\n',
 u' 2-22-87:15 NOW ----- .00 .00 .00\n']
</code></pre>
<h2>Porting BASIC09 File Reading Code to Python</h2>
<p>I spent some time poring over the <a href="https://bitbucket.org/DanC/madmode-blog/src/08c15cccd1ef?at=default">EXP</a> source code (<code>08c15cc</code>) to get the data out:</p>
<pre><code>1986-cocodisks/archive/PRG-x/EXP$ wc *.b
 0 38 300 Acct.b
 3 506 4490 Entry.b
 7 564 5141 Exp.b
 1 125 1021 Rec.b
 2 142 1269 Report.b
 13 1375 12221 total
</code></pre>
<p>The <code>file</code> command even recognizes the compiled format:</p>
<pre><code>1986-cocodisks/archive/PRG-x/EXP$ file Expenses
Expenses: OS9/6809 module: BASIC I-code subroutine
</code></pre>
<pre><code class="python">import pandas as pd
import numpy as np
dict(pandas=pd.__version__, numpy=np.__version__)
</code></pre>
<pre><code>{'numpy': '1.10.1', 'pandas': u'0.17.1'}
</code></pre>
<p>The file is just 5K. 
These days it's trivial to read that into memory, but my coco only had 16K of RAM, upgraded from 4K.</p>
<pre><code class="python">Jrnl = (EXP / 'Jrnl').open('rb').read()
len(Jrnl)
</code></pre>
<pre><code>5443
</code></pre>
<p>The transaction format is mostly straightforward, though I'm glad I had the source code to decode the <code>key</code> field:</p>
<pre><code class="python">import datetime
from collections import namedtuple, OrderedDict
import struct


class Trans(namedtuple('Trans', 'key, desc, amt, db, cr')):
    &quot;&quot;&quot;
    TYPE Trans=Key:INTEGER; Desc:STRING[13]; Amt:REAL; DB,CR:BYTE
    &quot;&quot;&quot;
    struct = struct.Struct('&gt;h13s5sBB')

    @classmethod
    def unpack(cls, data):
        it = cls(*cls.struct.unpack(data[:cls.struct.size]))
        it = it._replace(desc=Basic09.string(it.desc),
                         amt=Basic09.real(it.amt))
        return it

    @property
    def indx(self):
        return self.key % 32 + 1

    @property
    def date(self):
        r&quot;&quot;&quot;
        port of PROCEDURE DateStr
        Indx=MOD(Key,32)+1 \Copy=Key/32
        Day=MOD(Copy,31)+1 \Copy=Copy/31
        Month=MOD(Copy,12)+1 \Copy=Copy/12
        Year=86+Copy
        &quot;&quot;&quot;
        copy = self.key / 32
        day = copy % 31 + 1
        copy = copy / 31
        month = copy % 12 + 1
        copy = copy / 12
        year = 1986 + copy
        try:
            return datetime.date(year, month, day)
        except ValueError:  # Nov 31???
            return datetime.date(year, month, day - 1)

    def as_dict(self):
        return dict(date=self.date, indx=self.indx, desc=self.desc,
                    amt=round(self.amt, 2), db=self.db, cr=self.cr)
</code></pre>
<p>I couldn't figure out how to decode the floating point account balances until I realized I was comparing them against the Feb 22 report rather than their March 12 values.</p>
<blockquote> <p>Type REAL</p> <p>REAL numbers are stored in 5 consecutive memory bytes. The first byte is the (8-bit) exponent in binary two's-complement representation. 
The next four bytes are the binary sign-and-magnitude representation of the mantissa; the mantissa in the first 31 bits, and the sign of the mantissa in the last (least significant) bit of the last byte of the real quantity. -- <a href="http://www.roug.org/soren/6809/basic09.pdf">BASIC09: Programming Language Reference Manual</a> Copyright (c) 1983 Microware Systems Corporation</p> </blockquote>
<pre><code class="python">class Basic09(object):
    @classmethod
    def string(cls, data):
        return data[:data.find('\xff')] if '\xff' in data else data

    @classmethod
    def real(cls, b5):
        exp, mag = struct.unpack('&gt;bI', b5)
        sgn = -1 if (mag % 2) else 1
        mag = mag &gt;&gt; 1
        mag = mag * 1.0 / (2 ** 31)
        return mag * (2 ** exp) * sgn
</code></pre>
<p>The overall file format is a linked list:</p>
<pre><code class="python">class Global(namedtuple('Global',
                        'trx, head, tail, rec, avail, name, bal, file')):
    &quot;&quot;&quot;
    TYPE Global=Trx:Trans; Head,Tail,Rec,Avail:INTEGER; Name(32):STRING
    &quot;&quot;&quot;
    struct = struct.Struct('&gt;hhhh%ds%dsB' % (32 * 13, 32 * 5))

    @classmethod
    def unpack(cls, data):
        trx = Trans.unpack(data)
        data = data[Trans.struct.size:]
        it = cls(*((trx,) + cls.struct.unpack(data[:cls.struct.size])))
        ea = 13
        name = [Basic09.string(it.name[ea * ix:ea * (ix + 1)])
                for ix in range(32)]
        ea = 5
        bal = [Basic09.real(it.bal[ea * ix:ea * (ix + 1)])
               for ix in range(32)]
        return it._replace(name=name, bal=bal)

    def accounts(self):
        a = pd.DataFrame(dict(name=self.name, bal=self.bal),
                         columns=['name', 'bal'])
        a.index = a.index + 1
        return a

    def iter_trans(self, jrnl):
        here = self.rec
        while True:
            after, before = struct.unpack('&gt;HH', jrnl[here + Trans.struct.size:][:4])
            here = after
            if here == 0:
                break
            yield Trans.unpack(jrnl[here:])


G = Global.unpack(Jrnl)
print G.trx
print dict(head=G.head, tail=G.tail, rec=G.rec, avail=G.avail)
ut_accounts = G.accounts()
ut_accounts.head(3)
</code></pre>
<pre><code>Trans(key=32767, desc='Delphi Bill', amt=46.80000001192093, db=10, cr=19)
{'avail': 5443, 'rec': 0, 'tail': 5313, 'head': 607}
</code></pre>
<div> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>name</th> <th>bal</th> </tr> </thead> <tbody> <tr> <th>1</th> <td>Cash</td> <td>76.220001</td> </tr> <tr> <th>2</th> <td>Checks</td> <td>552.950000</td> </tr> <tr> <th>3</th> <td>Bank Account</td> <td>890.420005</td> </tr> </tbody> </table> </div>
<pre><code class="python">journal = pd.DataFrame.from_records(
    (tx.as_dict() for tx in G.iter_trans(Jrnl)),
    columns=['date', 'indx', 'desc', 'amt', 'db', 'cr']).set_index(['date', 'indx'])
journal = journal.merge(ut_accounts[['name']], left_on='db', right_index=True)
journal = journal.rename(columns=dict(name='Source Name'))
journal = journal.merge(ut_accounts[['name']], left_on='cr', right_index=True)
journal = journal.rename(columns=dict(name='Dest Name'))
journal = journal.sort_index()
journal.iloc[6:9]
</code></pre>
<div> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th></th> <th>desc</th> <th>amt</th> <th>db</th> <th>cr</th> <th>Source Name</th> <th>Dest Name</th> </tr> <tr> <th>date</th> <th>indx</th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>1986-09-01</th> <th>11</th> <td>Books</td> <td>117.85</td> <td>1</td> <td>23</td> <td>Cash</td> <td>Books/Supplie</td> </tr> <tr> <th rowspan="2" valign="top">1986-09-15</th> <th>11</th> <td>Scholarship</td> <td>296.46</td> <td>7</td> <td>20</td> <td>National Meri</td> <td>U T</td> </tr> <tr> <th>15</th> <td>Scholarship</td> <td>78.54</td> <td>7</td> <td>1</td> <td>National Meri</td> <td>Cash</td> </tr> </tbody> </table> </div>
<p>Computing running balances with pandas' <code>cumsum</code> was fun.</p>
<pre><code class="python">def running_balance(journal):
    cr = journal[['cr', 'amt']].rename(columns=dict(cr='acct'))
    cr['col'] = 'cr'
    db = journal[['db', 'amt']].rename(columns=dict(db='acct'))
    db['col'] = 'db'
    db.amt = -db.amt
    ea = cr.append(db).sort_index()
    ea['bal'] = ea.groupby('acct').amt.cumsum()
    cum = ea.reset_index().pivot_table(index=['date', 'indx'],
                                       columns='col', values=['bal'])
    journal = journal.copy()
    journal.insert(len(journal.columns) - 1, 'Src Bal', cum.bal.db)
    journal['Dest Bal'] = cum.bal.cr
    return journal


running_balance(journal).to_csv('ut-austin-journal.csv')
running_balance(journal).iloc[6:9]
</code></pre>
<div> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th></th> <th>desc</th> <th>amt</th> <th>db</th> <th>cr</th> <th>Source Name</th> <th>Src Bal</th> <th>Dest Name</th> <th>Dest Bal</th> </tr> <tr> <th>date</th> <th>indx</th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>1986-09-01</th> <th>11</th> <td>Books</td> <td>117.85</td> <td>1</td> <td>23</td> <td>Cash</td> <td>182.15</td> <td>Books/Supplie</td> <td>117.85</td> </tr> <tr> <th rowspan="2" valign="top">1986-09-15</th> <th>11</th> <td>Scholarship</td> <td>296.46</td> <td>7</td> <td>20</td> <td>National Meri</td> <td>-296.46</td> <td>U T</td> <td>496.28</td> </tr> <tr> <th>15</th> <td>Scholarship</td> <td>78.54</td> <td>7</td> <td>1</td> <td>National Meri</td> <td>-375.00</td> <td>Cash</td> <td>260.69</td> </tr> </tbody> </table> </div>
<p>And with that, we have recovered the journal data from the report:</p>
<pre><code class="python">tx_lines[:2] + tx_lines[8:11]
</code></pre>
<pre><code>[u'Date Description Amount Source Name Src Bal Dest Name Dest Bal\n',
 u'----------- ------------- ------- ------------- ------- ------------- --------\n',
 u' 9-01-86:11 Books 117.85 Cash 182.15 Books/Supplie 117.85\n',
 u' 9-15-86:11 Scholarship 296.46 National Meri -296.46 U T 496.28\n',
 u' 9-15-86:15 Scholarship 78.54 National Meri -375.00 Cash 260.69\n']
</code></pre>
<p>And we can compute account balances:</p>
<pre><code class="python">src_bal = journal.groupby('db')[['amt']].sum()
dst_bal = journal.groupby('cr')[['amt']].sum()
bal = src_bal.merge(dst_bal, left_index=True, right_index=True,
                    how='outer', suffixes=['_src', '_dst']).fillna(0)
bal['balance'] = bal.amt_dst - bal.amt_src
bal = bal.drop(['amt_src', 'amt_dst'], axis=1)
bal = bal.merge(ut_accounts[['name']],
                left_index=True, right_index=True)[['name', 'balance']]
bal.to_csv('ut-austin-accounts.csv', index_label='acct_num')
bal[:3]
</code></pre>
<div> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>name</th> <th>balance</th> </tr> </thead> <tbody> <tr> <th>1</th> <td>Cash</td> <td>76.22</td> </tr> <tr> <th>2</th> <td>Checks</td> <td>552.95</td> </tr> <tr> <th>3</th> <td>Bank Account</td> <td>890.42</td> </tr> </tbody> </table> </div> Sat, 30 Dec 2017 00:00:00 +0000 http://www.madmode.com//2017/ut-austin-expenses/ Migrating Breadcrumbs http://www.madmode.com//2017/breadcrumbs-migrate/ <p><em>Breadcrumbs</em> was the blog of <a href="http://dig.csail.mit.edu/">DIG</a>, the Decentralized Information Group at MIT CSAIL.</p> <p>In a <a href="http://logs.glob.uno/?c=freenode%23microformats&amp;s=20+Jun+2015&amp;e=20+Jun+2015#c81549">2015 #microformats chat</a>, I discovered that it was down:</p> <blockquote> <p>DanC&gt; grr... the blog is down. http://dig.csail.mit.edu/breadcrumbs/node/228<br /> "Unable to connect to database server"<br /> <em>DanC verifies that he has an export of his work there...</em><br /> DanC&gt; interesting... 
my backup is evidently python pickles of XMLRPC responses from the API of that CMS (drupal?)<br /> &gt;&gt;&gt; x['dateCreated']<br /> &lt;DateTime '20080306T17:00:05' at 7f20e8aef5f0&gt;<br /> &gt;&gt;&gt; x['dateCreated'].__class__<br /> &lt;class xmlrpclib.DateTime at 0x7f20e444eef0&gt; </p> </blockquote> <p>The files are numbered:</p> <pre><code class="python">def _numbered_files(pattern='[0-9]*',
                    breadcrumbs='/home/connolly/sites/breadcrumbs'):
    from pathlib import Path
    return Path(breadcrumbs).glob(pattern)

breadcrumbs_bak = list(_numbered_files())
sorted(int(f.parts[-1]) for f in breadcrumbs_bak)[:10]
</code></pre> <pre><code>[4, 5, 6, 7, 8, 9, 10, 11, 12, 13] </code></pre> <p>Each is a pickled XMLRPC response:</p> <pre><code class="python">import pickle

breadcrumbs_xmlrpc = dict((int(f.parts[-1]), pickle.load(f.open('rb')))
                          for f in breadcrumbs_bak)
x = breadcrumbs_xmlrpc[228]
x['title'], x['dateCreated'], x['dateCreated'].__class__
</code></pre> <pre><code>('hAudio for microformats mixtapes, in progress', &lt;DateTime '20080306T17:00:05' at 7fa8242a5320&gt;, &lt;class xmlrpclib.DateTime at 0x7fa82427cf58&gt;) </code></pre> <h2>MadMode blog pages</h2> <pre><code class="python"># __future__ imports have to come before any other import
from __future__ import print_function
from collections import OrderedDict
from sys import stderr


class BlogWriter(object):
    def __init__(self, pages):
        self._pages = pages

    def addPage(self, body, title, date, tags, published, slug):
        datestr = date.isoformat()
        # page header: one "key: value" line per heading
        headings = OrderedDict(title=repr(title),
                               date=datestr[:10],
                               tags=&quot;[%s]&quot; % (', '.join(&quot;'%s'&quot; % tag for tag in tags)),
                               published=published)
        header = '\n'.join([&quot;%s: %s&quot; % (k, v) for k, v in headings.iteritems()])
        yyyy = datestr[:4]
        page = (self._pages / yyyy / slug).with_suffix('.md')
        print(&quot;addPage: &quot;, page, tags, file=stderr)
        with page.open('wb') as out:
            out.write(header)
            out.write('\n\n')
            out.write(body.encode('utf-8'))


def _madmode():
    from pathlib import Path
    return BlogWriter(Path('/home/connolly/sites') / 'madmode-blog' / 'pages')

mmwr = _madmode()
</code></pre> <pre><code class="python">from time import mktime
from datetime import datetime
import re


def drupal2md(body):
    body = body.split('&lt;/title&gt;', 1)[1]  # remove redundant title
    body = body.replace('\r', '')  # unix newlines
    return body


def findTags(body):
    tags = []
    for txt in body.split('&lt;'):
        if txt.startswith('a '):
            txt = txt[len('a '):]
            # collect name=&quot;value&quot; attributes of the anchor
            attrs = {}
            while '=' in txt and not txt.startswith('&gt;'):
                name, txt = txt.split('=', 1)
                name = name.strip()
                txt = txt.strip()
                _, value, txt = txt.split('&quot;', 2)
                attrs[name] = value
                txt = txt.strip()
            href = attrs.get('href', '')
            if 'tag' in attrs.get('rel', '') or 'del.icio.us' in href:
                if href.endswith('/'):
                    href = href[:-1]
                tags.append(href.split('/')[-1])
    return tags


for postid, item in sorted(breadcrumbs_xmlrpc.items()):
    print(postid, item['title'], file=stderr)
    dt = datetime.fromtimestamp(mktime(item['dateCreated'].timetuple()))
    tags = ['breadcrumbs'] + findTags(item['content'])
    mmwr.addPage(drupal2md(item['content']),
                 title=item['title'], date=dt, tags=tags,
                 published=True, slug='breadcrumbs_%04d' % postid)
</code></pre> <pre><code>4 On OpenID and comment policies addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0004.md ['breadcrumbs'] 5 little burst of PAW demo hacking addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0005.md ['breadcrumbs'] 6 DIG blog wish list addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0006.md ['breadcrumbs', 'connolly'] 7 Fire at Southampton... hope everything's alright soon addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0007.md ['breadcrumbs'] 8 Sourceforge is the place... to sell soap? addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0008.md ['breadcrumbs'] 9 Reflecting blog structure into the Semantic Web with SIOC? 
addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0009.md ['breadcrumbs'] 10 I'd rather be... addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0010.md ['breadcrumbs'] 11 PHP angst addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0011.md ['breadcrumbs'] 12 Shopping for a client-side blogging editor addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0012.md ['breadcrumbs', 'authoring'] 13 presented Issues in Semantic Web Logic to 6.898 addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0013.md ['breadcrumbs'] 14 xchat RFE: "mail a log of this chat to mbox@domain" macro addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0014.md ['breadcrumbs'] 15 U.S. papertrail: the federal register addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0015.md ['breadcrumbs'] 16 XHTML for computer science research papers and bibliographies addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0016.md ['breadcrumbs'] 17 ISWC buzz addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0017.md ['breadcrumbs'] 18 Why isn't bill payee set-up integrated with address book or yellow pages lookup? 
addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0018.md ['breadcrumbs'] 23 RDF Calendar, GRDDL, Microformats, and all that at XML2005 in Atlanta addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0023.md ['breadcrumbs', 'quality'] 24 SKOS, SIOC, and drupal taxonomy addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0024.md ['breadcrumbs'] 25 sorry about overriding your font size addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0025.md ['breadcrumbs'] 26 Ray Ozzie's take on diff/sync addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0026.md ['breadcrumbs'] 27 a fly-by of XACML addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0027.md ['breadcrumbs'] 28 MathML as a rule interchange format addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0028.md ['breadcrumbs'] 29 GRDDL transform wanted: National Information Exchange Model (NIEM) addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0029.md ['breadcrumbs'] 30 Go-Karting rush tainted by lack of OpenID for bug reporting about hypertext editing addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0030.md ['breadcrumbs'] 45 Toward richtext syndicated feed addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0045.md ['breadcrumbs'] 46 Toward better documentation of some schemas for the W3C digital library addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0046.md ['breadcrumbs'] 47 Brought my hockey skates with me this time addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0047.md ['breadcrumbs'] 52 Connecting DIG Student Projects to the MIT UROP listing addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0052.md ['breadcrumbs'] 55 Drupal, OpenID, and the Mac OS X Keychain addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0055.md ['breadcrumbs'] 56 Wikicompany? 
addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0056.md ['breadcrumbs'] 57 upgrade to CivicSpace? addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0057.md ['breadcrumbs'] 61 frbr:embodiment is enough without frbr:embodimentOf, no? addPage: /home/connolly/sites/madmode-blog/pages/2005/breadcrumbs_0061.md ['breadcrumbs'] 63 On Google, Jabber, and Jingle and good and evil in IM and IP networks addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0063.md ['breadcrumbs'] 66 Arpeggio in D, a little three chord ditty addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0066.md ['breadcrumbs'] 69 Fun with Policy Aware Web at UMD, AFS/SVN at CSAIL addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0069.md ['breadcrumbs'] 70 Using truth maintenance techniques in RDF stores? addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0070.md ['breadcrumbs'] 77 MadScientistMode addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0077.md ['breadcrumbs'] 78 RSS is dead; long live RSS addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0078.md ['breadcrumbs'] 82 python, javascript, and PHP, oh my! addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0082.md ['breadcrumbs', 'installation', 'javascript', 'python', 'quality', 'testing', 'programming'] 84 tabulator use cases: when can we meet? 
and PathCross addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0084.md ['breadcrumbs'] 85 bnf2turtle -- write a turtle version of an EBNF grammar addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0085.md ['breadcrumbs'] 86 formally closing the feedback loop addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0086.md ['breadcrumbs'] 87 Using RDF and OWL to model language evolution addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0087.md ['breadcrumbs'] 88 Toward integration of cwm's proof structures with InferenceWeb's PML addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0088.md ['breadcrumbs'] 89 Investigating logical reflection, constructive proof, and explicit provability addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0089.md ['breadcrumbs'] 90 Fun with Embedded RDF and DOAP for the GRDDL profile addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0090.md ['breadcrumbs'] 91 Toward Semantic Web data from Wikipedia addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0091.md ['breadcrumbs', u'connolly'] 92 Reflections on the W3C Technical Plenary week addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0092.md ['breadcrumbs', 'NCE'] 93 Getting (dis)organized for SxSWi in Austin addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0093.md ['breadcrumbs', 'Austin'] 94 Dates in drupal vs planetrdf addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0094.md ['breadcrumbs'] 96 Getting my Personal Finance data back with hCalendar and hCard addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0096.md ['breadcrumbs'] 97 A look at emerging Web security architectures from a Semantic Web perspective addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0097.md ['breadcrumbs'] 98 a quick take on Kiko, a nifty looking calendar service addPage: 
/home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0098.md ['breadcrumbs'] 99 using JSON and templates to produce microformat data addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0099.md ['breadcrumbs'] 100 geocoding and hCards for airports from wikipedia addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0100.md ['breadcrumbs', 'geo'] 101 time, context, quoting, and reification addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0101.md ['breadcrumbs'] 102 no more life in a textarea: MozEx and emacs to the rescue! addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0102.md ['breadcrumbs'] 107 hacking soccer schedules into hCalendar and into my sidekick addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0107.md ['breadcrumbs'] 123 A step forward with python and sshagent, and a walk around gnome security tools addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0123.md ['breadcrumbs', 'web', 'policy', 'security', 'python', 'programming'] 124 Consensus and community review in open source and open standards addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0124.md ['breadcrumbs'] 127 busy day in #microformats addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0127.md ['breadcrumbs'] 129 Access control and version control: an over-constrained problem? 
addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0129.md ['breadcrumbs'] 130 citing W3C specs from WWW conference papers addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0130.md ['breadcrumbs'] 131 On GData, SPARQL update, and RDF Diff/Sync addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0131.md ['breadcrumbs', 'diff', 'sync', 'sparql', 'calendar', 'web+architecture'] 133 RDF, Microformats, and Javascript hacking in person at the 'tute addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0133.md ['breadcrumbs', 'mobile', 'javascript', 'microformats', 'travel', 'calendar', 'BOS', 'bos'] 135 webizing TaskJuggler addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0135.md ['breadcrumbs', 'calendar'] 139 WWW2006 in Edinburgh: Identity, Reference, and Meaning addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0139.md ['breadcrumbs', 'www2006', 'EDI', 'travel', 'web+architecture', 'URI'] 140 Exporting databases in the Semantic Web with SPARQL, D2R, dbview, ARC, and such addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0140.md ['breadcrumbs', 'www2006', 'EDI', 'travel', 'sparql'] 141 Equality and inconsistency in the rules layer addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0141.md ['breadcrumbs'] 142 fun with flock addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0142.md ['breadcrumbs', u'flock', u'writing', u'editing', u'drupal'] 146 converting vcard .vcf syntax to hcard and catching up on CALSIFY addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0146.md ['breadcrumbs'] 148 a walk thru the tabulator calendar view addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0148.md ['breadcrumbs', 'calendar', 'SeedApplications'] 151 Choosing flight itineraries using tabulator and data from Wikipedia addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0151.md ['breadcrumbs'] 154 OpenID, 
verisign, and my life: mediawiki, bugzilla, mailman, roundup, ... addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0154.md ['breadcrumbs'] 155 tabulator maps in Argentina addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0155.md ['breadcrumbs'] 156 how much do I want to know about drupal? addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0156.md ['breadcrumbs'] 157 on Wikimania 2006, from a few hundred miles away addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0157.md ['breadcrumbs'] 158 Stitching the Semantic Web together with OWL at AAAI-06 addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0158.md ['breadcrumbs', 'RdfAndSql', 'AAAI', 'public-sparql-dev', 'citation'] 159 On the Future of Research Libraries at U.T. Austin addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0159.md ['breadcrumbs', 'Austin', 'URI', 'web+architecture'] 160 ACL 2 seminar at U.T. Austin: Toward proof exchange in the Semantic Web addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0160.md ['breadcrumbs', 'Austin', 'semantic', 'web', 'logic', 'research'] 161 Talking with U.T. Austin students about the Microformats, Drug Discovery, the Tabulator, and the Semantic Web addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0161.md ['breadcrumbs', 'Austin', 'semantic', 'web'] 162 Wishing for XOXO microformat support in OmniOutliner addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0162.md ['breadcrumbs'] 163 Trip reporting with flock addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0163.md ['breadcrumbs'] 164 Adding Shoenfield, Brachman books to my bookshelf? 
addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0164.md ['breadcrumbs'] 165 Now is a good time to try the tabulator addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0165.md ['breadcrumbs'] 171 Celebrating OWL interoperability and spec quality addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0171.md ['breadcrumbs'] 172 A new Basketball season brings a new episode in the personal information disaster addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0172.md ['breadcrumbs'] 178 Modelling HTTP cache configuration in the Semantic Web addPage: /home/connolly/sites/madmode-blog/pages/2006/breadcrumbs_0178.md ['breadcrumbs'] 179 She's a witch and I have the proof (in N3) addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0179.md ['breadcrumbs'] 180 A design for web content labels built from GRDDL and rules addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0180.md ['breadcrumbs'] 187 The Mercurial SCM: great for lots of stuff, but not the holy grail addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0187.md ['breadcrumbs', 'python+scm'] 192 Collaboration and crime at a distance at HASTAC, WWW2007 addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0192.md ['breadcrumbs', 'openid', 'hastac', 'Duke', 'RDU', 'digital+media'] 193 IKL by Hayes et al. provides a semantics for N3? addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0193.md ['breadcrumbs'] 194 Linked Data at WWW2007: GRDDL, SPARQL, and Wikipedia, oh my! 
addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0194.md ['breadcrumbs', u'banff', u'grddl', u'www2007', u'travel'] 198 Units of measure and property chaining addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0198.md ['breadcrumbs'] 201 Soccer schedules, flight itineraries, timezones, and python web frameworks addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0201.md ['breadcrumbs'] 206 FOAF and OpenID: two great tastes that taste great together addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0206.md ['breadcrumbs'] 207 brainstorming, issue tracking, and problem reporting... with tabulator? addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0207.md ['breadcrumbs'] 214 Free Culture: Why buy the Amazon Kindle when you can give and get an OLPC XO-1 for the same price? addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0214.md ['breadcrumbs'] 221 I can only imagine... addPage: /home/connolly/sites/madmode-blog/pages/2007/breadcrumbs_0221.md ['breadcrumbs'] 228 hAudio for microformats mixtapes, in progress addPage: /home/connolly/sites/madmode-blog/pages/2008/breadcrumbs_0228.md ['breadcrumbs'] 229 sidekick calendar subscription for SXSW addPage: /home/connolly/sites/madmode-blog/pages/2008/breadcrumbs_0229.md ['breadcrumbs'] 240 The details of data in documents; GRDDL, profiles, and HTML5 addPage: /home/connolly/sites/madmode-blog/pages/2008/breadcrumbs_0240.md ['breadcrumbs'] 246 OpenID "Hello World" on apache still deep magic addPage: /home/connolly/sites/madmode-blog/pages/2009/breadcrumbs_0246.md ['breadcrumbs'] 250 DIG losing the battle with spammers again addPage: /home/connolly/sites/madmode-blog/pages/2009/breadcrumbs_0250.md ['breadcrumbs'] 251 migrating from danger/sidekick to android/G1 addPage: /home/connolly/sites/madmode-blog/pages/2009/breadcrumbs_0251.md ['breadcrumbs'] 252 Existentials in ACL2 and Milawa make sense; how about level breakers? 
addPage: /home/connolly/sites/madmode-blog/pages/2010/breadcrumbs_0252.md ['breadcrumbs'] 253 Map and Territory in RDF APIs addPage: /home/connolly/sites/madmode-blog/pages/2010/breadcrumbs_0253.md ['breadcrumbs'] </code></pre> <h2>PyData Tools</h2> <pre><code class="python">import pandas as pd
dict(pandas=pd.__version__)
</code></pre> <pre><code>{'pandas': u'0.17.1'} </code></pre> <pre><code class="python">items = pd.DataFrame.from_records(breadcrumbs_xmlrpc.values())
items.postid = items.postid.astype(int)
items = items.set_index('postid')
print(items.dtypes)
items[['title', 'dateCreated']].sort_values('dateCreated').head()
</code></pre> <pre><code>content              object
dateCreated          object
description          object
link                 object
mt_allow_comments     int64
mt_convert_breaks    object
permaLink            object
title                object
userid               object
dtype: object
</code></pre> <div> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>title</th> <th>dateCreated</th> </tr> <tr> <th>postid</th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>4</th> <td>On OpenID and comment policies</td> <td>20051024T23:28:49</td> </tr> <tr> <th>5</th> <td>little burst of PAW demo hacking</td> <td>20051026T20:12:18</td> </tr> <tr> <th>6</th> <td>DIG blog wish list</td> <td>20051026T20:14:27</td> </tr> <tr> <th>7</th> <td>Fire at Southampton... 
hope everything's alrig...</td> <td>20051031T11:59:08</td> </tr> <tr> <th>9</th> <td>Reflecting blog structure into the Semantic We...</td> <td>20051031T13:18:51</td> </tr> </tbody> </table> </div> <pre><code class="python">items.loc[[228], ['title', 'dateCreated']] </code></pre> <div> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>title</th> <th>dateCreated</th> </tr> <tr> <th>postid</th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>228</th> <td>hAudio for microformats mixtapes, in progress</td> <td>20080306T17:00:05</td> </tr> </tbody> </table> </div> Fri, 29 Dec 2017 00:00:00 +0000 http://www.madmode.com//2017/breadcrumbs-migrate/ Making Secure IoT: seL4 on my Raspberry Pi 3B http://www.madmode.com//2017/sel4-rpi/ <p>I got seL4 running on my Raspberry Pi 3B tonight.</p> <p>Even though I worked with Dale Dougherty in the early '90s, I've been on the sidelines of the whole maker thing until September when Micro Center bundled a <a href="http://www.microcenter.com/site/content/google_aiy_preorder.aspx">Google AIY VOICE KIT with Raspberry Pi 3B for $35</a>.</p> <p><a data-flickr-embed="true" data-footer="true" href="https://www.flickr.com/photos/dckc/26502865629/in/album-72157690394355946/" title="AIY Kit"><img src="https://farm5.staticflickr.com/4517/26502865629_a8f62d67b5.jpg" width="500" height="305" alt="AIY Kit"></a> <script async src="//embedr.flickr.com/assets/client-code.js" charset="utf-8"></script></p> <p>After I verified that it works as a voice device, I remembered the tantalizing <a href="https://research.csiro.au/tsblog/sel4-raspberry-pi-3/">seL4 on the Raspberry Pi 3</a> item from early this year. The build dependencies were greatly simplified by the <a href="https://github.com/SEL4PROJ/seL4-CAmkES-L4v-dockerfiles">seL4 dockerfiles</a>. 
Building the seL4-Test project went without a hitch:</p> <pre><code>mkdir sel4test &amp;&amp; cd sel4test
repo init --config-name -u git@github.com:seL4/sel4test-manifest.git
repo sync
make rpi3_debug_xml_defconfig &amp;&amp; make
...
[GEN_IMAGE] sel4test-driver-image-arm-bcm2837
</code></pre> <p>The only way to see the seL4 test project do its thing is via the serial console. Before I overwrote the working voice kit SD card, I wanted to test connectivity. I have plenty of experience with RS-232 serial cables (I even had a job in high school where I helped a tech by putting together serial terminal cables) but <a href="https://www.sparkfun.com/tutorials/215">RS-232 vs. TTL serial</a> is not just a matter of wires and connectors; the voltage levels are different. USB to TTL cables usually go for around $15, which is more than half of what I paid for the Pi!</p> <p>Meanwhile, this summer Micro Center had a beaglebone green wireless, which usually goes for around $50, on clearance for $20, and I couldn't pass it up. 
The beaglebone uses the same TTL levels and works fine as an ssh server, so I put together a cable:</p> <p><a data-flickr-embed="true" data-footer="true" href="https://www.flickr.com/photos/dckc/38223910826/in/album-72157690394355946/" title="IMG_20171106_212900488"><img src="https://farm5.staticflickr.com/4569/38223910826_a8fe8f7bdf.jpg" width="500" height="281" alt="IMG_20171106_212900488"></a><script async src="//embedr.flickr.com/assets/client-code.js" charset="utf-8"></script></p> <p>After some <code>config-pin</code> tinkering, I confirmed that I could get UART1 and UART2 on the beaglebone to talk to each other (UART0 is debug output only), but I couldn't get any console output from the Pi to show up.</p> <p>After discovering a <a href="https://www.amazon.com/gp/product/B00QT7LQ88/">JBtek Raspberry Pi USB to TTL Serial Cable</a> on Amazon for $7, I ordered it and set the project aside.</p> <p>That didn't work either until I connected an HDMI monitor and keyboard and used <code>raspi-config</code> to enable the serial console. <em>I wonder if the beaglebone would have worked given that fix.</em></p> <p>With the serial console hardware issues sorted out, I loaded the SD card as instructed. The first few boot stages worked, but I struggled to <code>Hit any key to stop autoboot</code>. Minicom (remember minicom?) showed "Offline" so I turned off hardware flow control. Bingo:</p> <pre><code>U-Boot&gt; fatls mmc 0
50248 bootcode.bin
2818372 start.elf
35 config.txt
393408 u-boot.bin
4064956 sel4test-driver-image-arm-bcm2837
U-Boot&gt; fatload mmc 0 0x100000 sel4test-driver-image-arm-bcm2837
reading sel4test-driver-image-arm-bcm2837
4064956 bytes read in 328 ms (11.8 MiB/s)
U-Boot&gt; bootelf 0x100000
...
Test suite passed. 119 tests passed. 42 tests disabled.
All is well in the universe </code></pre> Thu, 09 Nov 2017 00:00:00 +0000 http://www.madmode.com//2017/sel4-rpi/ My Capability Security 2017 Wish-List http://www.madmode.com//2017/ocap-wish-list/ <p>Computers are getting faster, smaller, more connected, and more capable, but when it comes to security, <a href="https://medium.com/message/everything-is-broken-81e5f33a24e1">everything is broken</a>. Along with correct-by-construction software (e.g. <a href="http://adam.chlipala.net/cpdt/">Certified Programming with Dependent Types</a>), the best weapon I see is <strong>object capability discipline</strong>.</p> <p>Before I get into my wish list of projects and issues, I'd like to point out <a href="https://github.com/dckc/awesome-ocap">dckc/awesome-ocap</a>, my list of capability security technology that is ready to use today, including everything from <a href="https://sel4.systems/">seL4</a>, an open source operating-system kernel with an end-to-end proof of implementation correctness and security enforcement, to <a href="https://sandstorm.io/">Sandstorm</a>, a self-hostable SaaS platform.</p> <h2>Secure module loading for node.js</h2> <p>The good parts of JavaScript line up well with object capability discipline, but support in node.js for some of the best parts is lacking, and hence there's no guarantee that calling an enhanced <code>sqrt()</code> from some npm module will not send an HTTP request to launch missiles.</p> <p>Mark Miller demonstrated the feasibility of secure loading as far back as 2011 with <a href="https://github.com/google/caja/blob/master/src/com/google/caja/ses/makeSimpleAMDLoader.js">makeSimpleAMDLoader.js</a>. 
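</p> <p>A minimal sketch of the discipline itself, in Python rather than JavaScript and not taken from Miller's loader: a function gets authority only through the arguments it is handed. The names <code>make_sqrt</code>, <code>make_reporting_sqrt</code>, and <code>report</code> are hypothetical, and plain Python does not enforce any of this (nothing stops a module from importing a network library), which is the gap a secure loader is meant to close.</p>
<pre><code class="python">import math


def make_sqrt():
    '''Return a sqrt function that is handed no I/O authority at all.'''
    def sqrt(x):
        return math.sqrt(x)
    return sqrt


def make_reporting_sqrt(report):
    '''Return a sqrt that may also call report(), the one capability it was given.

    report is a hypothetical callable; the caller decides what it can reach.
    '''
    def sqrt(x):
        result = math.sqrt(x)
        report('sqrt(%r) = %r' % (x, result))
        return result
    return sqrt


# The caller grants authority explicitly; here it is only "append to this list".
log = []
plain_sqrt = make_sqrt()
noisy_sqrt = make_reporting_sqrt(log.append)
</code></pre>
<p>Here the caller, not the library, decides that reporting means appending to a list; handing over a function that sends HTTP requests instead would be an explicit, auditable choice.</p> <p>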
I'm trying to fully understand node's incomplete support for <code>Object.defineProperty</code> in <a href="https://github.com/drses/ses/issues/6">ses/issues/6</a>.</p> <p>Meanwhile, I'm having fun with Capper; see <a href="https://bitbucket.org/DanC/finquick">finquick</a>.</p> <h2>Sandstorm dev tools on Ubuntu 16.04</h2> <p>Ubuntu 16.04's kernels handle pid namespaces in a way that interferes with the sandstorm dev tools. Tracking issue: <a href="https://github.com/sandstorm-io/sandstorm/issues/2526">2526</a>.</p> <p>I figured out <a href="https://gist.github.com/dckc/2e6b5c8029246ab38c16e254fc3d3f4d">how to build sandstorm apps with nix</a>, but without the dev tools, the edit/test/debug cycle time is too long.</p> <h2>Capability security for mainstream linux with CloudABI, capsicum</h2> <p>While considering alternative kernels, I ran into <a href="https://github.com/dckc/madmode-blog/issues/20">a linking issue</a> when trying to build a linux kernel that supports capsicum and CloudABI. (There was a PPA for capsicum for a while...)</p> <p>CloudABI or capsicum at work would be <em>so great</em>. But it's a long way off... we're struggling to migrate to SuSE 12 so we can try out Docker.</p> <p>For example, a research workflow app I maintain needs to be able to send mail, but</p> <ul> <li>only from one address</li> <li>only using templated bodies</li> <li>only to users who have in some way asked for it</li> </ul> <p>Design sketch: at investigator request time, the user grants "capability to send app-template mail to addresses X, Y, Z".</p> <p>As a demonstration, I'm <a href="https://github.com/dckc/ZeroVault/tree/cloudabi_wsgi">porting ZeroVault to CloudABI</a> using a <a href="https://atlas.hashicorp.com/freebsd/boxes/FreeBSD-11.0-STABLE">FreeBSD vagrant box VM</a>. 
It's pretty fun since Ed fixes <a href="https://github.com/NuxiNL/cloudabi-ports/issues?utf8=%E2%9C%93&amp;q=%20is%3Aissue%20author%3Adckc%20">my issues</a> within a few hours of when I report them.</p> <h3>bus1, Capability-based IPC for Linux</h3> <p>I'm heartened by <a href="https://lwn.net/Articles/697191/">momentum around bus1</a>.</p> <p>On top of the lack of composability in the chmod/chgrp permission model, there's a mounting kludge tower of stuff like SELinux and (to a lesser extent?) AppArmor. I was doing a storage audit and learning how <code>lsblk</code> gets the serial number of my drives. I had heard of udev and systemd, but I had no idea it uses <a href="https://en.wikipedia.org/wiki/Netlink">netlink</a> ("a more flexible alternative to ioctl") to communicate with the kernel.</p> <h2>Object capability discipline for Docker</h2> <p>Is this even possible? I can't get my head around the Docker security story.</p> <h2>Uniform, composable FFI and stdlib for pony</h2> <p>Pony aims to be a high-performance capability-secure language. I would love to see it make some inroads on golang: while go addresses (many of) the memory-safety issues of C/C++, its standard library is full of ambient authority and its type system dooms developers to lots of boilerplate maintenance.</p> <p>I'm struggling (mostly for time) to convince the pony community that a reasonably simple <a href="https://github.com/dckc/rfcs/blob/ffi-taming/text/0000-ffi-taming.md">policy</a> can eradicate ambient authority from the standard library.</p> <p>In discussion of my <a href="https://github.com/ponylang/ponyc/pull/301">network API PR</a>, I learned that the pony designers don't (yet) see interposition as a key component of robust composition.</p> <h2>Safe systems programming on seL4 and genode</h2> <p>The <a href="https://genode.org/documentation/release-notes/16.05">genode May 2016 release</a> included initial support for rust. I haven't managed to try it out. 
As far as I know, support for pony on genode has gotten no further than a <a href="https://twitter.com/ponylang/status/671971997753212928">Dec 2015 twitter exchange</a>.</p> <p><a href="https://robigalia.org/">Robigalia</a> aims to be a persistent capability OS built on seL4 and rust.</p> <p>Rust on seL4 is pretty bleeding-edge: <a href="https://github.com/SEL4PROJ/rust-camkes-samples/issues/1">SEL4PROJ/rust-camkes-samples/issues/1</a> documents my trials and tribulations. <a href="https://github.com/seL4/refos">seL4/refos</a> looks interesting.</p> <h2>Object capability discipline support in rust</h2> <p>We supporters of the <a href="https://github.com/rust-lang/rust/issues/3094#issuecomment-9589749">2012 proposals to isolate ambient authority in the rust stdlib</a> didn't make our case well enough for the 1.0 cut-off, but there is renewed interest in <a href="https://internals.rust-lang.org/t/refactoring-std-for-ultimate-portability/4301">refactoring std for ultimate portability</a>. One result of this could be a std alternative with no ambient authority.</p> Mon, 02 Jan 2017 00:00:00 +0000 http://www.madmode.com//2017/ocap-wish-list/ Can Google and Facebook robots help the Web promote the truth? http://www.madmode.com//2016/deep-learning-promote-truth/ <blockquote> <p>A critical, independent and investigative press is the lifeblood of any democracy. -- Nelson Mandela</p> </blockquote> <p>Newspaper revenue has traditionally come from advertising. 
Craigslist wiped out classified advertising, and Google and Facebook get the lion's share of digital advertising revenue<sup><a href="https://www.bloomberg.com/news/articles/2016-04-22/google-and-facebook-lead-digital-ad-industry-to-revenue-record">1</a></sup>, which dwarfs newspaper advertising revenue by 3x<sup><a href="https://www.statista.com/topics/979/advertising-in-the-us/">2</a></sup>.</p> <p>Our kids can't tell the difference between real and fake news<sup><a href="http://www.npr.org/sections/thetwo-way/2016/11/23/503129818/study-finds-students-have-dismaying-inability-to-tell-fake-news-from-real">3</a></sup>.</p> <p>The deep-learning robots at Google can caption photos as well as humans<sup><a href="http://www.eetimes.com/document.asp?doc_id=1325712">4</a></sup>. How hard could it be to teach them to tell true stories from fake? That is:</p> <ul> <li>determine the main claims of an article</li> <li>determine the level of support for these claims</li> <li>determine whether promotion of the article is more likely to lead to the spread of true or false information</li> </ul> <p>The issue of bias via incomplete truth is perhaps a harder problem.</p> <h2>References</h2> <ol> <li><a href="https://www.bloomberg.com/news/articles/2016-04-22/google-and-facebook-lead-digital-ad-industry-to-revenue-record">Google and Facebook Lead Digital Ad Industry to Revenue Record</a><br /> by Aleksandra Gjorgievska<br /> April 21, 2016</li> <li><a href="https://www.statista.com/topics/979/advertising-in-the-us/">U.S. 
Advertising Industry - Statistics &amp; Facts</a></li> <li><a href="http://www.npr.org/sections/thetwo-way/2016/11/23/503129818/study-finds-students-have-dismaying-inability-to-tell-fake-news-from-real">Students Have 'Dismaying' Inability To Tell Fake News From Real, Study Finds</a><br /> npr.org November 23, 2016 </li> <li><a href="http://www.eetimes.com/document.asp?doc_id=1325712">Microsoft, Google Beat Humans at Image Recognition Deep learning algorithms compete at ImageNet challenge</a><br /> R. Colin Johnson<br /> EE|Times 2/18/2015</li> </ol> Fri, 30 Dec 2016 00:00:00 +0000 http://www.madmode.com//2016/deep-learning-promote-truth/ Google Fiber vs. Kansas Jayhawks http://www.madmode.com//2016/google-fiber-vs-jayhawks/ <p>Google Fiber finally came to my neighborhood. The deadline to sign up was November 3.</p> <p>But Time Warner has an exclusive on Kansas Jayhawks basketball games. And their 300MB service is not bad.</p> <p>So I passed on Google Fiber. I can hardly believe it.</p> Wed, 28 Dec 2016 00:00:00 +0000 http://www.madmode.com//2016/google-fiber-vs-jayhawks/ Introducing Capabilities to the Next Generation http://www.madmode.com//2016/03-park-talk/ <p>The consequences of hooking stuff up to the Internet without sufficient care are going up all the time:</p> <ul> <li><a href="http://time.com/4270728/iran-cyber-attack-dam-fbi/">Iranian Cyber Attack on New York Dam Shows Future of War</a> Mark Thompson @MarkThompson_DC March 24, 2016</li> </ul> <p>As an open source advocate, I initially bristle at this...</p> <blockquote> <p>These sectors may be particularly vulnerable to cyberattack because they rely on open-source software or hardware, third-party utilities, and interconnected networks</p> </blockquote> <p>but it is a factor: it lets people hook their stuff up to interconnected networks without going up the management chain to authorize a purchase.</p> <p>Meanwhile, it's going to get worse before it gets better, from every indication I see. 
This sort of accountability might actually be healthy:</p> <ul> <li><a href="https://www.onthewire.io/ftc-demands-info-from-pci-auditors/">FTC Demands Info From PCI Auditors On Breached Companies' Compliance</a></li> </ul> <p>I can imagine demand for software audits will increase as a result. Perhaps that provides an opportunity, since object capability discipline facilitates software audits. The effort to get the value of ocap recognized widely in the security and compliance community is daunting, but I sure hope it happens.</p> <p>I managed to do a bit. I was invited to speak to a small C.S. class at a nearby college while the regular professor was away. I took the opportunity to review and re-package two of Mark Miller's talks from 2011, prefaced with the "giant bags of mostly water" slides. It was fun!</p> <ul> <li><a href="https://docs.google.com/presentation/d/1YApMNX-2LmERrDTtGNrvpAgOwXIZ-ppp44QXhu9dTc8/edit#slide=id.p">Web Security: Patterns of Cooperation Without Vulnerability</a></li> </ul> Thu, 08 Dec 2016 00:00:00 +0000 http://www.madmode.com//2016/03-park-talk/ Etherium and DAO tokens: an experience report http://www.madmode.com//2016/eth-dao/ <p><strong>Tada! I own 227.27 <a href="https://daohub.org/">DAO</a> tokens.</strong> Why? As a student of capability security and computer-supported collaboration in general, I'm naturally interested in smart contracts. When an autonomous smart-contract platform raises <a href="https://bitcoinmagazine.com/articles/the-dao-raises-more-than-million-in-world-s-largest-crowdfunding-to-date-1463422191">$100M+ in a week</a>, I figure I should know how it works.</p> <p>The buzzwords run thick and fast:</p> <ul> <li>The DAO is <strong>Code</strong>.</li> <li>The DAO is <strong>Autonomous</strong>.</li> <li>The DAO is <strong>Revolutionary</strong>.</li> <li>The DAO is <strong>Rewarding</strong>.</li> </ul> <p>Which of these do I believe?</p> <ul> <li><strong>Code</strong>: Check. The evidence is clear and compelling. 
The bytes are <code>0x60606040523615...</code> and a <a href="https://github.com/slockit/DAO/wiki/The-DAO-v1.0-Code#verifying-the-dao-code.">straightforward verification process</a> establishes that the <a href="https://github.com/slockit/DAO">source code</a> compiles to this output.</li> <li><strong>Autonomous</strong>: It autonomously does <em>something</em> (inasmuch as miners keep the Ethereum distributed VM going). An <a href="https://blog.slock.it/deja-vu-dao-smart-contracts-audit-results-d26bc088e32e#.6wtj3lwqg">audit</a> has vouched that the contract is "secure" and, I gather, faithful to the <a href="https://download.slock.it/public/DAO/WhitePaper.pdf">DAO whitepaper</a>. I haven't digested the argument that what it does is fair and not controlled by any one or few actors. I'm still digesting basics such as <a href="https://github.com/ethereum/wiki/wiki/Patricia-Tree">patricia trees</a>, actually.</li> <li><strong>Revolutionary</strong>: Perhaps $100M in a week constitutes a revolution. But whether there will be any lasting effect is unclear to me. The argument from <a href="https://www.reddit.com/r/ethereum/comments/4jnem4/is_the_dao_going_to_be_doa_by_dan_larimer_of/">BitShares experience</a> that voter apathy and mis-aligned incentives will result in failure is more substantive than any argument I found in favor of the DAO.</li> <li><strong>Rewarding</strong>: Finding <em>any</em> substance behind this claim was quite a challenge. I looked for a simple 3-point argument that there's some ROI in here... Is that too much to ask? Apparently so. The whitepaper didn't elucidate much for me; it started with a bit of history of smart contracts (citing Szabo 1997 and Miller 1997 was good to see) and then immediately dove into details of the values of various constants in the contract algorithm. 
Buried several layers into the web site, I found that proposal 1 includes a sort of <a href="https://forum.daohub.org/t/slock-it-proposal-1-discussion-thread/539">generalized, automated airbnb</a>. Ok, that's at least somewhat plausible. Follow-the-money seems to lead to slock.it. I'm sure it's rewarding for <em>them</em>.</li> </ul> <p>I get anxious reading the code: too many of the security properties seem to rely on programmer diligence:</p> <pre><code>function transfer(address _to, uint256 _amount) noEther returns (bool success) { if (balances[msg.sender] &gt;= _amount &amp;&amp; _amount &gt; 0) { balances[msg.sender] -= _amount; balances[_to] += _amount; Transfer(msg.sender, _to, _amount); return true; } </code></pre> <p>Compare the above from <a href="https://github.com/slockit/DAO/blob/master/Token.sol">Token.sol</a> to the elegant simplicity of <a href="http://erights.org/elib/capability/ode/ode-capabilities.html">simple money in E</a>:</p> <pre><code> to deposit(amount :int, src) :void { unsealer.unseal(src.getDecr())(amount) balance += amount } </code></pre> <p>OK... so how do I do it?</p> <blockquote> <p>To obtain DAO tokens, ... send ETH from your Ethereum Wallet ... to The DAO’s address below. <code>0xbb9bc244d798123fde783fcc1c72d3bb8c189413</code></p> </blockquote> <p>... and there's a wizard... it recommends paying with ETH. But I don't have any. So I choose USD, at which point they refer me to bity.</p> <ul> <li>The register button wouldn't light up when I used a password manager to enter a password<ul> <li>I eventually found a work-around: manually type a character and then delete it.</li> </ul> </li> <li>After filling in all the info to order some ETH, they gave me international bank transfer instructions. I have no idea how to execute such a transfer, but I'm quite sure it's not something I can do now, when my bank is closed.</li> </ul> <p>So I back-track and try the recommended wallet. Mist is a node+webkit style app. 
When I start it up, it says it has to sync with the blockchain and stays like that for longer than my attention span. <em>How about some advance notice that this is going to take hours and GB of disk space?</em> I guess one should not expect good road signs in the wild west.</p> <p>Back-track again... Searching turned up a <a href="http://ethereum.stackexchange.com/a/1916">How do I buy Ethereum with USD? answer</a> Mar 8 at 5:38 by niksmac:</p> <ol> <li>Buy BTC with a debit card at coinbase.<ul> <li>The experience is much more what I expected. I did the SMS callback verification dance and exchanged $30 for BTC using a debit card within 10 or 15 minutes. I switched the 2FA on my account from SMS to TOTP (google authenticator) in the process.</li> </ul> </li> <li>Exchange BTC for ETH using shapeshift.<ul> <li>This presumes I have an ETH address. <a href="https://ethereumwallet.com/index.html">EthereumWallet by Krypokit</a> lets you make one right in your browser in a minute or so. While I'm sure a full blockchain sync is more secure, I'm only risking a few dollars here and "more" is probably a difference between getting struck by lightning once and getting struck twice. Do I really care?</li> <li>shapeshift result: <a href="https://etherscan.io/tx/0x77fef130c7b576e188018602206141723bd11ff47cb8baadaab370fc29892618">receipt for 2.55384881 Ether</a></li> </ul> </li> <li>Send ETH to the DAO contract address.<ul> <li>I got plenty of confirmations on <a href="https://etherscan.io/tx/0x1be4715b9fc0e3b6c6793f6d27ba1ca00d4a47d9b4ab1fe51fa456a33adde355">transaction 0x1be4715b...</a>, so I thought I was all set. The last step of the DAO wizard was to confirm on <a href="https://daohub.org/creation.html">the creation page</a>, but I kept getting 0 tokens for my address there.</li> <li>Eventually I learned the <a href="https://forum.daohub.org/t/out-of-gas-could-not-get-doa/2148">out of gas</a> warning really matters. 
That Krypokit wallet worked fine for sending ETH around, but it didn't add any gas, so non-trivial contracts didn't work.</li> <li>I back-tracked to <a href="https://www.myetherwallet.com/#the-dao">MyEtherWallet</a>, which added sufficient gas to run the contract. Bingo! I did a smaller transaction to be sure I had enough for gas and then another for the rest:<ul> <li><a href="https://etherscan.io/tx/0x06bee04bc3f3286557e0aa9e12313a169184f074a41b3c650b0fa13c69daa9f6">1 ETH</a></li> <li><a href="https://etherscan.io/tx/0x7340c3f8b91cb64c811b9079b0c716d86fd7568cd23f390b0333d467d337c858">1.5 ETH</a></li> </ul> </li> </ul> </li> </ol> <p>I eventually did a full blockchain sync. I kept starting over thinking I was doing something wrong. But no, it really iterates through all 1.5M blocks on the blockchain <em>twice</em>, which takes a few hours and uses about 2GB using <code>geth --fast</code>.</p> Wed, 18 May 2016 00:00:00 +0000 http://www.madmode.com//2016/eth-dao/ A Look Back at TEDxKC 2015 http://www.madmode.com//2015/tedx-kc/ <p>Looking back over the year, one of the highlights was <a href="http://tedxtalks.ted.com/browse/talks-by-event/TEDxKC">TEDxKC</a>. In its seventh year, Kansas City's TEDx is the largest in the country. It lived up to the hype. 
The Kauffman Center is a great venue, we went with some good friends, and the program was excellent.</p> <p><a data-flickr-embed="true" href="https://www.flickr.com/photos/mikeandwillow/9371792162/in/photolist-fh9Tiu-y1FJWU-bLfN4p-bLfN3D-d2duMY-d2dsUE-d2duaY-d2dt3y-oEEqdb-d2dv4s-d2dtVS-d2dv6L-d2dtHN-d2dsS3-d2dsKS-oEtHHL-oo78if-ooesUX-bo3UB1-oCy7a5-bAZFgH-oEAtoR-oEGdJf-ooeowi-oEJ3aP-ooehgo-oEspWT-oEssJM-oof5mB-ooecP7-oEGino-oEHZFR-oGtSU8-iepBfK-iepBoF-iepYof-iepBke-iepRco-ieqfqe-bLfN6a-iepYEN-iepYny-ieqfnP-iepYzN-oCEh5d-iepYyL-iepYps-iepRd5-iepYBw-iepBov" title="venue"><img src="https://farm8.staticflickr.com/7454/9371792162_973c31cf32_n.jpg" width="320" height="213" alt="venue"></a><script async src="//embedr.flickr.com/assets/client-code.js" charset="utf-8"></script></p> <p><a data-flickr-embed="true" href="https://www.flickr.com/photos/dckc/20390087964/in/album-72157661941165530/" title="TEDxKC SWAG"><img src="https://farm1.staticflickr.com/586/20390087964_c0a3335b38_n.jpg" width="237" height="320" alt="TEDxKC SWAG"></a><script async src="//embedr.flickr.com/assets/client-code.js" charset="utf-8"></script></p> <p>The most inspiring speakers were:</p> <ul> <li><a href="https://www.youtube.com/watch?v=PnMs_qLwaes">Tommy Caldwell - What are you up against?</a></li> <li><a href="https://www.youtube.com/watch?v=WD1IX1AFRZg">Martin Pistorius - My Way Back to Words</a></li> </ul> <p>Both of them conveyed extremes of the human condition, but Pistorius was simply spell-binding.</p> <p><a data-flickr-embed="true" href="https://www.flickr.com/photos/tedxkc/21011170416/in/photolist-y1FJWU-bLfN4p-bLfN3D-d2duMY-d2dsUE-d2duaY-d2dt3y-oEEqdb-d2dv4s-d2dtVS-d2dv6L-d2dtHN-d2dsS3-d2dsKS-oEtHHL-oo78if-ooesUX-bo3UB1-oCy7a5-bAZFgH-oEAtoR-oEGdJf-ooeowi-oEJ3aP-ooehgo-oEspWT-oEssJM-oof5mB-ooecP7-oEGino-oEHZFR-oGtSU8-iepBfK-iepBoF-iepYof-iepBke-iepRco-ieqfqe-bLfN6a-iepYEN-iepYny-ieqfnP-iepYzN-oCEh5d-iepYyL-iepYps-iepRd5-iepYBw-iepBov-a7mNKz" title="TEDxKC 2015: REIMAGINE"><img 
src="https://farm1.staticflickr.com/782/21011170416_e6e0411659_n.jpg" width="320" height="213" alt="TEDxKC 2015: REIMAGINE"></a><script async src="//embedr.flickr.com/assets/client-code.js" charset="utf-8"></script></p> <p><a href="https://www.youtube.com/watch?v=GhKYUPY64p0&amp;feature=youtu.be">The Yawpers jam</a> was mindless fun. The Kauffman Center is a naturally high-brow scene, so when the front man opened with "I'm gonna play some really stupid rock 'n roll for you," and the drummer started banging away, it was good to get outside your head and roll with it.</p> <p>Speaking of fun, at one of the booths, I got my own <a href="https://www.google.com/get/cardboard/">google cardboard</a> viewer, courtesy of Southwest Airlines.</p> <p>Scott Hamilton was a name and a face we all knew from Olympic skating, but he was there to talk about <a href="https://www.youtube.com/watch?v=Nzg4fMJI6jk">Surviving the cancer cure</a>. He was promoting <a href="https://en.wikipedia.org/wiki/Proton_therapy">proton therapy</a>, an alternative that the insurance companies seem to be resisting without good cause, given the falling costs and the quality of outcomes. Currently there are 14 operating proton therapy centers throughout the U.S. with another 10 under construction. The news that there's also a proton therapy center slated for Kansas City got a big round of applause.</p> <p>The local challenge winner was <a href="https://www.youtube.com/watch?v=kj9HYBD-knQ">Audrey Odom - Fighting drug resistant infections with your breath</a>. The increasing resistance to antibiotics is pretty serious stuff, so her simple diagnostic tool looks pretty important.</p> <p>And after all that, there's an after-party on the lawn, and all the speakers join in. 
Next time, we'll know to plan to stay for more than just a few minutes of that!</p> <p><a data-flickr-embed="true" href="https://www.flickr.com/photos/tedxkc/20414765554/in/photolist-x6Z1ys-xLpae1-oq9kDb-oq9kwN-xLo7yf-x786Az-y38hv5-y1FLHj-x788ic-y41aXX-oq9kTE-xLuMaF-y4FewD-oq9Fiv-y4FdWR-y41cMZ-y41d4v-y4FdEt-x786ec-oq9EUe-y1FLPG-y41b4i-xLo7VY-oq9kmC-oq99ZM-fBXWom-oGB1py-bAXLsp-y38fZ9-oGCH14-oq9awD-y1FMm3-oEATvj-y4Ffkx-oq9ajp-y1FLVo-oEATAQ-oq9kHQ-y4Ff9F-xLuM1n-bo3UBS-9noyVw-oq9asR-x787vF-oGAZMb-bAXLqc-8KPYXC-y38gzY-y4FdFR-xLpb1b" title="TEDxKC 2015: REIMAGINE"><img src="https://farm1.staticflickr.com/686/20414765554_f01d95c742_n.jpg" width="320" height="213" alt="TEDxKC 2015: REIMAGINE"></a><script async src="//embedr.flickr.com/assets/client-code.js" charset="utf-8"></script></p> Thu, 17 Dec 2015 00:00:00 +0000 http://www.madmode.com//2015/tedx-kc/ jukekb - Browse iTunes libraries and upload playlists to Google Music http://www.madmode.com//2015/jukekb/ <p><em>originally published as <a href="https://bitbucket.org/DanC/palmagent/src/a2245686a4c19b43a3607a8265f2b1e43b7dd41a/jukekb/?at=default">jukekb on bitbucket</a></em></p> <p>My digital music collection has two parts:</p> <ul> <li>the audio files themselves, which are<ul> <li>somewhat large</li> <li>readily, though not freely replaceable</li> <li>only licensed copies, not mine to re-publish as I please</li> </ul> </li> <li>the playlists, ratings, play logs and counts, which are<ul> <li>irreplaceable</li> <li>small</li> </ul> </li> </ul> <p>My family has mostly used iTunes over the years, but our libraries have become somewhat fragmented and redundant. Plus, we carry android devices. I should be able to get at my old playlists from my new phone. But how? Then I discovered <a href="http://beets.radbox.org/">beets</a>, the "media library management system for obsessive-compulsive music geeks." 
I prefer the title Eric Miller gave me, "closet librarian," but still, it struck a chord.</p> <p>Apple, Google, and Amazon all offer cloud music services. Mostly I figure the audio files might as well live in the cloud while I just have cached copies, though there's no garage sale market in digital media -- the first sale doctrine is much more clear with physical CDs.</p> <p>Apple doesn't interoperate with android. My wife sometimes buys new CDs from Amazon, which come with cloud storage copies. But they only let you upload 250 songs for free. I'm a little leery about Google these days, but when I found a <a href="https://unofficial-google-music-api.readthedocs.org/en/latest/reference/mobileclient.html">python API for Google Music</a>, I figured I could get my data back, so I decided to dive in.</p> <h2>Design</h2> <p>My <a href="https://bitbucket.org/DanC/palmagent/src/f166e71cf023/jukekb/jukekb.py">first step</a> was a hello-world <a href="http://www.tornadoweb.org/en/stable/web.html">tornado</a> service (tweaked for <a href="http://www.madmode.com/search/label/capabilities/">object capability discipline</a>). While modern javascript looks pretty cool, I'm not nearly as productive with it yet as I am with python. And tornado's turn-based architecture is pretty close to node.js.</p> <p>Browsing iTunes metadata was straightforward. Sorting by date added provided a fun trip down memory lane!</p> <p>The main challenge was dealing with the evolution of iTunes's file organization approach. 
<code>ituneslib.Library.path(track)</code> resolves <code>Location</code> data:</p> <pre><code>def path(self, track, fixes=['Music', 'iTunes Media', 'iTunes Media/Music', 'iTunes Media/Movies', 'iTunes Media/Podcasts', 'iTunes Music', 'iTunes Music/Music']): </code></pre> <p>BTW: yay <a href="https://pypi.python.org/pypi/pathlib/">pathlib</a>!</p> <p>I tried a couple approaches to finding duplicate iTunes tracks via musicbrainz IDs: first I let the beets tagger grind over my collection over night. But I was confused by the results. Then I incorporated the <a href="http://python-musicbrainzngs.readthedocs.org/en/latest/api/#searching">musicbrainz API</a> to interactively match iTunes albums to musicbrainz releases. I was disappointed to learn that beets records release groups, not releases.</p> <p>I also let the Google Music uploader do its thing overnight. But of course I was left with no record of which of my local copies matched which item in the cloud, so I'm left with the duplicate problem all over again.</p> <p>At this point I was juggling any number of metadata web services, but then switching to a local app to actually play the song to check that I had the right one (though beets has a web interface). Reviewing the state of the art in musicbrainz tools, I re-discovered <a href="https://quodlibet.readthedocs.org/">quodlibet</a>, which has evidently gotten steadily more awesome since I originally found it. Using its fingerprinting and musicbrainz lookup plugins, I started to see all sorts of problems with my metadata.</p> <p>When it came to <em>Graceland</em>, one of my all-time favorites, I went and tracked down the CD jewel case itself to use the barcode to figure out which was the relevant release. I started a <a href="http://www.discogs.com/user/dckc/collection">dckc discogs collection</a> in the process. 
Cool!</p> <p>Quodlibet has <a href="https://quodlibet.readthedocs.org/en/latest/guide/browse/playlists.html#playlists">playlist support</a>, but just .m3u and .pls, which leave me with the same problem: they're just lists of filenames, which don't have the UI benefits of HTML or even CSV, let alone the ability to survive re-organization of audio files.</p> <p>I thought about robust filenames for use in such playlists. What would be the top of the hierarchy? i.e. the major sort key?</p> <ul> <li>by artist?<ul> <li>That's how they're on the bookshelf.</li> <li>what about "Various Artists"?</li> <li>We can always re-create a view by artist.</li> </ul> </li> <li>by release date?<ul> <li>more stable over time</li> </ul> </li> </ul> <p>And spaces in filenames are a pain. So something like: <code>release-1986-billy+joel-52nd+street-mbrain3897293/01-movin+out-mbrain2098324</code>.</p> <p>I still hope to get there. But meanwhile...</p> <p>I built a quodlibet plug-in to "reload" my tags from iTunes metadata after using the edit tags feature to erase all tags in one go. Whee!</p> <p>And I started my Google Music collection fresh and worked out (most of) the kinks of incremental upload with records of which Google Music server ids correspond to which iTunes Persistent IDs.</p> <p>I'm still thinking about workflows for new music. And I haven't actually solved the problem of duplicates across iTunes libraries yet. But when I do, my upload logs should let me clean up my Google Music collection too.</p> <h2>Usage</h2> <p><em>See requirements.txt for prerequisites.</em></p> <p>Get an OAuth token for uploading:</p> <pre><code>$ gmbox oauth </code></pre> <p>Provide password for metadata access:</p> <pre><code>$ export GOOGLE_MUSIC_PASSWORD=... $ # I like to do: $ export GOOGLE_MUSIC_PASSWORD=`ssh-ask-password` </code></pre> <p>Start the service:</p> <pre><code>$ jukekb --db=DB --gmusic=EMAIL LIBRARY... </code></pre> <p>... 
where LIBRARY is an iTunes library directory and DB is for upload logs.</p> <p>The service will report its web address. From there you can</p> <ul> <li>browse libraries</li> <li>browse albums and artists within libraries</li> <li>search</li> <li>browse playlists</li> <li>upload playlists (with the "match" button)<ul> <li>already-uploaded songs are added to the Google Music playlist without uploading again</li> <li>TODO: cross-library duplicate detection, e.g. using MusicBrainz IDs</li> </ul> </li> </ul> <p>The scripts have more usage details. Yay <a href="https://pypi.python.org/pypi/docopt/">docopt</a>!</p> Sat, 11 Jul 2015 00:00:00 +0000 http://www.madmode.com//2015/jukekb/ Syncing a 5 Year iPhoto Library with flickr http://www.madmode.com//2015/photo-flickr-explore/ <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>Yay! jawj's <a href="https://github.com/jawj/iphoto-flickr">iphoto-flickr</a> sync'd a 30GB iPhoto library to flickr. Not only did it upload the images, but it made a map from iPhoto metadata to flickr metadata that lets me continue on with the flickr API, syncing dates and such, during and after the upload.</p> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="OS-X-Yosemite-runs-iPhoto,-reluctantly">OS X Yosemite runs iPhoto, reluctantly<a class="anchor-link" href="#OS-X-Yosemite-runs-iPhoto,-reluctantly">&#182;</a></h2><p>Having replaced it with a newer model, I gave <strong>airbook</strong>, our late 2008 MacBook Air MB543LL/A, a complete lobotomy and installed OS X Yosemite. 
When I tried to re-introduce it to our photo archive, Apple told me iPhoto is no longer; Photos is the new thing, complete with iCloud hip-ness.</p> <p>So I'm faced with another "to Mac or not to Mac?" moment.</p> <p>This time I figure no, I lead a multi-platform life and I want something more web-native.</p> <p>I spent a bunch of time trying to downgrade to Mavericks. Just when I had given up trying to do it myself and ordered Mavericks on a USB flash drive via eBay, I learned that if you <a href="http://www.simplehelp.net/2015/05/01/how-to-install-iphoto-in-yosemite-os-x-10-10/#comment-2097851737">start iPhoto from a command prompt</a>, it runs on Yosemite after all.</p> <p>The only complication was a dangling reference to <code>iLifeSlideshow.framework</code> in <code>/System/Library/PrivateFrameworks</code>. (Thank you <a href="https://bombich.com/">Carbon Copy Cloner</a> for a complete backup!)</p> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h3 id="Home-directory-in-encrypted-sparsebundle">Home directory in encrypted sparsebundle<a class="anchor-link" href="#Home-directory-in-encrypted-sparsebundle">&#182;</a></h3><p>This photo library is on an external USB drive, in an encrypted sparsebundle. The <a href="https://discussions.apple.com/thread/2082558?tstart=0">sparsebundle support discussion</a> said all I have to do is double-click it, but it's hidden (filename starts with a dot). 
Command-line to the rescue, again: <code>open /Users/.maryc</code>.</p> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Flickr-is-the-web-photo-service-for-the-closet-librarian">Flickr is the web photo service for the closet librarian<a class="anchor-link" href="#Flickr-is-the-web-photo-service-for-the-closet-librarian">&#182;</a></h2><p>As an Android mobile user, Google's photo offerings were tempting. Then I discovered browsing photos of person X only works within one album. And to put a photo in multiple albums, you have to copy it, i.e. maintain the tags and such twice.</p> <p>My friends-and-family photo sharing community mostly uses facebook these days, but for curating an archive, flickr is a much better match. Imagine my horror when I downloaded some of my photos from facebook and discovered they were only available at reduced resolution. Perhaps they've addressed that since, but I still haven't seen any support for date-taken as separate from date-uploaded on facebook. There's little, if any, support for quietly curating without notifications firing every which way.</p> <p>My photostream on flickr goes back to <a href="https://www.flickr.com/photos/dckc/archives/date-posted/2004/12/calendar/">Dec 2004</a> when it was big in the open web community. I could never bring myself to go premium, but in May 2013 when they announced the terabyte storage offer, I dusted it off. 
Re-establishing my long lost yahoo credentials was no small feat, but I managed.</p> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Flickr-Backup-from-Mac-App-Store-was-a-Bust">Flickr Backup from Mac App Store was a Bust<a class="anchor-link" href="#Flickr-Backup-from-Mac-App-Store-was-a-Bust">&#182;</a></h2><p>A quick search of the Mac App store turned up promising results:</p> <ul> <li><a href="https://itunes.apple.com/us/app/backup-to-flickr-for-iphoto/id733300407?mt=12">Backup to Flickr for iPhoto</a><br> By Sonia Bohelay</li> </ul> <p>It was just a few bucks, so I went ahead. But oops: <strong>Your iPhoto library is either too old (iPhoto version &lt; 9.0) or no photo found</strong>. Indeed, my library is from 8.1.2. I might have been able to upgrade the library, but with Apple pushing Photos over iPhoto, I didn't want to bet on it.</p> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="iPhoto--&gt;-Flickr-in-350-lines-of-code">iPhoto -&gt; Flickr in 350 lines of code<a class="anchor-link" href="#iPhoto--&gt;-Flickr-in-350-lines-of-code">&#182;</a></h2><p>I was thinking about rolling my own with the flickr API when I discovered a kindred spirit had already been down this path and come up with <a href="https://github.com/jawj/iphoto-flickr">iPhoto -&gt; Flickr</a>.</p> <p>It worked in one go, so the incremental upload support wasn't necessary for the initial bulk upload, but to further sync the metadata, the resulting <code>uploaded-photo-ids-map.txt</code> is critical. 
In fact, I had to wrestle with iPhoto a bit to get ids that are useful without iPhoto running.</p> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h3 id="Just-Add-API-Key-and-Authorize-with-OAuth">Just Add API Key and Authorize with OAuth<a class="anchor-link" href="#Just-Add-API-Key-and-Authorize-with-OAuth">&#182;</a></h3> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>It works pretty much like it says on the tin. (<em>The <a href="https://github.com/jawj/iphoto-flickr/issues/22">colorize dependency issue</a> was easy enough to figure out.</em>)</p> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <pre><code>airbook:src connolly$ git clone https://github.com/jawj/flickrbackup.git # e339169212 airbook:flickrbackup connolly$ sudo gem install flickraw-cached colorize Successfully installed flickraw-0.9.8 Successfully installed flickraw-cached-20120701 Successfully installed colorize-0.7.7 airbook:flickrbackup connolly$ ruby flickrbackup.rb Flickr API key: 0481... Flickr API shared secret: 897... Authorise access to your Flickr account: press [Return] when ready Authorisation code: 162-... 
2015-07-04 13:47:28 -0500 Authenticated as: DanC 2015-07-04 13:47:44 -0500 8057 photos and 78 standard albums in iPhoto library 2015-07-04 13:47:44 -0500 8057 photos not yet uploaded to Flickr</code></pre> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Platform-Independent-Data">Platform Independent Data<a class="anchor-link" href="#Platform-Independent-Data">&#182;</a></h2><p>The kernel for this notebook is on my linux desktop, but iPhoto is running on <strong>airbook</strong>.</p> <p><em>Since spaces in filenames are a royal pain over ssh, I made a convenient symlink.</em></p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[190]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="o">!</span>ssh airbook.local ls -l Pictures/flickrbackup </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stdout output_text"> <pre>lrwxr-xr-x 1 connolly staff 57 Jul 5 09:34 Pictures/flickrbackup -&gt; /Users/connolly/Library/Application Support/flickrbackup/ </pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Upload-Map-DataFrame">Upload Map DataFrame<a class="anchor-link" href="#Upload-Map-DataFrame">&#182;</a></h2><p>It carefully logs the correspondence to support incremental update:</p> <pre><code>2015-07-04 13:47:44 -0500 (1/8057) Uploading '...2002/Sep 25, 2002/....jpg' ... 
4294967334 -&gt; 19226418710</code></pre> <p>Let's make sure we have redundant copies of the map. And let's use ordinary CSV rather than the funky <code>-&gt;</code> format.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[191]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">display</span><span class="p">,</span> <span class="n">Image</span> <span class="kn">import</span> <span class="nn">pandas</span> <span class="kn">as</span> <span class="nn">pd</span> <span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span> <span class="nb">dict</span><span class="p">(</span><span class="n">pandas</span><span class="o">=</span><span class="n">pd</span><span class="o">.</span><span class="n">__version__</span><span class="p">,</span> <span class="n">numpy</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">__version__</span><span class="p">)</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[191]:</div> <div class="output_text output_subarea output_execute_result"> <pre>{&apos;numpy&apos;: &apos;1.9.2&apos;, &apos;pandas&apos;: &apos;0.14.1&apos;}</pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[112]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">upload_map</span> <span class="o">=</span> <span class="o">!</span>ssh airbook.local cat Pictures/flickrbackup/uploaded-photo-ids-map.txt <span class="n">upload_map</span> <span class="o">=</span> <span 
class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="nb">dict</span><span class="p">(</span><span class="n">apple</span><span class="o">=</span><span class="nb">int</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="n">flickr</span><span class="o">=</span><span class="nb">int</span><span class="p">(</span><span class="n">f</span><span class="p">))</span> <span class="k">for</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">f</span><span class="p">)</span> <span class="ow">in</span> <span class="p">[</span><span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s">&#39; -&gt; &#39;</span><span class="p">)</span> <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="n">upload_map</span><span class="p">])</span> <span class="n">upload_map</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[112]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>apple</th> <th>flickr</th> </tr> </thead> <tbody> <tr> <th>0</th> <td> 4294967334</td> <td> 19226418710</td> </tr> <tr> <th>1</th> <td> 4294976544</td> <td> 18793385783</td> </tr> <tr> <th>2</th> <td> 4294976530</td> <td> 18793388523</td> </tr> <tr> <th>3</th> <td> 4294976542</td> <td> 18791506174</td> </tr> <tr> <th>4</th> <td> 4294971867</td> <td> 19414016905</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt 
input_prompt">In&nbsp;[187]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">upload_map</span><span class="o">.</span><span class="n">to_csv</span><span class="p">(</span><span class="s">&#39;uploaded-photo-ids-map.csv&#39;</span><span class="p">)</span> </pre></div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Flickr-metadata-access">Flickr metadata access<a class="anchor-link" href="#Flickr-metadata-access">&#182;</a></h2><p>Let's take a look at the results on Flickr. I experimented with Python Flickr APIs; the main one seems to be <a href="http://stuvel.eu/media/flickrapi-docs/documentation/index.html">Python Flickr</a>. <a href="flickdata.py">flickdata.py</a> (in <a href="http://bitbucket.org/DanC/palmagent/">palmagent</a>) is a least-authority packaging of that API.</p> <p><em>TODO: use a separate Photo object for setDates.</em></p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[168]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="kn">import</span> <span class="nn">flickdata</span> <span class="nb">reload</span><span class="p">(</span><span class="n">flickdata</span><span class="p">)</span> <span class="n">flickdata</span><span class="o">.</span><span class="n">__version__</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[168]:</div> <div class="output_text output_subarea output_execute_result"> <pre>&apos;0.4&apos;</pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div 
class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>To make a <code>flickdata.Account</code>, we use the privileged iPython notebook environment to get network access and the API key (<em>and OAuth credentials... where do they get squirrelled away?</em>) and pass it to <code>flickdata</code>:</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[45]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="kn">import</span> <span class="nn">json</span> <span class="kn">import</span> <span class="nn">logging</span> <span class="n">logger</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">getLogger</span><span class="p">()</span> <span class="n">logger</span><span class="o">.</span><span class="n">setLevel</span><span class="p">(</span><span class="n">logging</span><span class="o">.</span><span class="n">INFO</span><span class="p">)</span> <span class="n">logging</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s">&#39;We try to log I/O.&#39;</span><span class="p">)</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stderr output_text"> <pre>INFO:root:We try to log I/O. 
</pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[75]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="k">def</span> <span class="nf">myFlickrAcct</span><span class="p">(</span><span class="n">user_id</span><span class="o">=</span><span class="s">&#39;14874637@N00&#39;</span><span class="p">):</span> <span class="kn">import</span> <span class="nn">pathlib</span> <span class="kn">import</span> <span class="nn">flickrapi</span> <span class="n">api_secret</span> <span class="o">=</span> <span class="n">pathlib</span><span class="o">.</span><span class="n">Path</span><span class="p">(</span><span class="s">&#39;flickr_api_secret&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">open</span><span class="p">()</span><span class="o">.</span><span class="n">read</span><span class="p">()</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span> <span class="k">return</span> <span class="n">flickdata</span><span class="o">.</span><span class="n">Read</span><span class="o">.</span><span class="n">make</span><span class="p">(</span><span class="n">flickrapi</span><span class="p">,</span> <span class="n">api_secret</span><span class="p">,</span> <span class="n">user_id</span><span class="p">)</span> <span class="n">myAcct</span> <span class="o">=</span> <span class="n">myFlickrAcct</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stderr output_text"> <pre>INFO:flickdata:authenticating... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.auth.oauth.checkToken&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickrapi.core:REST Parser: using xml.etree.cElementTree </pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h3 id="Photostream-confused-about-recent-photos">Photostream confused about recent photos<a class="anchor-link" href="#Photostream-confused-about-recent-photos">&#182;</a></h3><p>A bunch of old photos and videos are showing up as recent in my photostream as the upload progresses.</p> <p>Flickr seems to set datetaken = date uploaded when there's no EXIF date, so let's look at these supposedly recent photos.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[172]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">records</span> <span class="o">=</span> <span class="p">[</span> <span class="n">r</span> <span class="k">for</span> <span class="n">page</span> <span class="ow">in</span> <span class="n">myAcct</span><span class="o">.</span><span class="n">getPhotos</span><span class="p">(</span> <span class="n">min_taken_date</span><span class="o">=</span><span class="s">&#39;2015-07&#39;</span><span class="p">,</span> <span class="n">max_taken_date</span><span class="o">=</span><span class="s">&#39;2015-08&#39;</span><span class="p">,</span> <span class="n">sort</span><span class="o">=</span><span class="s">&#39;date-taken-asc&#39;</span><span class="p">)</span> <span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">page</span><span 
class="p">]</span> <span class="n">photo</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">records</span><span class="p">)</span> <span class="n">photo</span><span class="p">[</span><span class="s">&#39;id&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">photo</span><span class="o">.</span><span class="n">id</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="nb">int</span><span class="p">)</span> <span class="c"># odd... even in JSON format, ids come back as strings</span> <span class="n">photo</span> <span class="o">=</span> <span class="n">photo</span><span class="o">.</span><span class="n">set_index</span><span class="p">(</span><span class="s">&#39;id&#39;</span><span class="p">)</span> <span class="nb">len</span><span class="p">(</span><span class="n">photo</span><span class="p">)</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stderr output_text"> <pre>INFO:flickdata:getPhotos page 1 cirt: {&apos;sort&apos;: &apos;date-taken-asc&apos;, &apos;min_taken_date&apos;: &apos;2015-07&apos;, &apos;max_taken_date&apos;: &apos;2015-08&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.people.getPhotos&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:page 1 of 1 </pre> </div> </div> <div class="output_area"><div class="prompt output_prompt">Out[172]:</div> <div class="output_text output_subarea output_execute_result"> <pre>78</pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h3 id="Photo-URLs:-Thumbnail">Photo URLs: Thumbnail<a class="anchor-link" href="#Photo-URLs:-Thumbnail">&#182;</a></h3><p>Flickr's <a href="https://www.flickr.com/services/api/misc.urls.html">URLs API</a> uses "secrets" so they serve as nice tasty <a href="http://www.w3.org/TR/capability-urls/">capability URLs</a> for the photos. 
So we can see this thumbnail from this iPython notebook even though we're not logged in to Flickr here.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[173]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">Image</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="n">photo</span><span class="o">.</span><span class="n">iloc</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">url_t</span><span class="p">)</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[173]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <img src="https://farm1.staticflickr.com/540/19350464562_2ff03b551d_t.jpg"/> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h3 id="Photo-info-fields">Photo info fields<a class="anchor-link" href="#Photo-info-fields">&#182;</a></h3><p><em>Note: these aren't all the fields available from <a href="https://www.flickr.com/services/api/flickr.people.getPhotos.html">getPhotos</a></em>.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[174]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">photo</span><span class="o">.</span><span class="n">iloc</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt 
output_prompt">Out[174]:</div> <div class="output_text output_subarea output_execute_result"> <pre>accuracy 16 context 0 datetaken 2015-07-02 11:28:40 datetakengranularity 0 datetakenunknown 0 dateupload 1435854531 description {u&apos;_content&apos;: u&apos;&apos;} farm 1 geo_is_contact 0 geo_is_family 0 geo_is_friend 0 geo_is_public 1 height_k 1516 height_o 2368 height_t 74 height_z 474 isfamily 0 isfriend 0 ispublic 0 latitude 39.054611 longitude -94.611264 machine_tags owner 14874637@N00 place_id _zncmCVTVLmQsuBlEg secret 2ff03b551d server 540 tags title IMG_20150702_112835 url_k https://farm1.staticflickr.com/540/19350464562... url_o https://farm1.staticflickr.com/540/19350464562... url_t https://farm1.staticflickr.com/540/19350464562... url_z https://farm1.staticflickr.com/540/19350464562... width_k 2048 width_o 3200 width_t 100 width_z 640 woeid 26342889 Name: 19350464562, dtype: object</pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>Ah. Good. 
When the display defaults date taken to upload date, the underlying data tells us so:</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[176]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="kn">import</span> <span class="nn">datetime</span> <span class="n">photo</span><span class="p">[</span><span class="s">&#39;upload_date&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">datetime64</span><span class="p">(</span><span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">fromtimestamp</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">ts</span><span class="p">)))</span> <span class="k">for</span> <span class="n">ts</span> <span class="ow">in</span> <span class="n">photo</span><span class="o">.</span><span class="n">dateupload</span><span class="p">]</span> <span class="n">photo</span><span class="p">[</span><span class="n">photo</span><span class="o">.</span><span class="n">datetakenunknown</span> <span class="o">==</span> <span class="s">&#39;1&#39;</span><span class="p">][[</span><span class="s">&#39;datetaken&#39;</span><span class="p">,</span> <span class="s">&#39;upload_date&#39;</span><span class="p">,</span> <span class="s">&#39;title&#39;</span><span class="p">,</span> <span class="s">&#39;width_o&#39;</span><span class="p">,</span> <span class="s">&#39;height_o&#39;</span><span class="p">]]</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[176]:</div> <div class="output_html rendered_html output_subarea 
output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>datetaken</th> <th>upload_date</th> <th>title</th> <th>width_o</th> <th>height_o</th> </tr> <tr> <th>id</th> <th></th> <th></th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>19426161436</th> <td> 2015-07-05 20:23:04</td> <td>2015-07-05 20:23:04</td> <td> segway tour - 4</td> <td> 466</td> <td> 630</td> </tr> <tr> <th>19456508621</th> <td> 2015-07-05 20:23:10</td> <td>2015-07-05 20:23:10</td> <td> segway tour - 28</td> <td> 851</td> <td> 630</td> </tr> <tr> <th>19266094319</th> <td> 2015-07-05 20:23:12</td> <td>2015-07-05 20:23:12</td> <td> segway tour - 2</td> <td> 466</td> <td> 630</td> </tr> <tr> <th>19264673498</th> <td> 2015-07-05 20:23:16</td> <td>2015-07-05 20:23:16</td> <td> segway tour - 33</td> <td> 466</td> <td> 630</td> </tr> <tr> <th>19445946592</th> <td> 2015-07-05 20:23:47</td> <td>2015-07-05 20:23:47</td> <td> Brennan baby tub</td> <td> 2351</td> <td> 2945</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>These were uploaded very soon after being taken; I suppose I turned on auto-upload on my phone:</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[192]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">photo</span><span class="p">[</span><span class="n">photo</span><span class="o">.</span><span class="n">datetakenunknown</span> <span class="o">==</span> <span class="s">&#39;0&#39;</span><span class="p">][[</span><span class="s">&#39;datetaken&#39;</span><span class="p">,</span> <span 
class="s">&#39;upload_date&#39;</span><span class="p">,</span> <span class="s">&#39;title&#39;</span><span class="p">,</span> <span class="s">&#39;width_o&#39;</span><span class="p">,</span> <span class="s">&#39;height_o&#39;</span><span class="p">]]</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[192]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>datetaken</th> <th>upload_date</th> <th>title</th> <th>width_o</th> <th>height_o</th> </tr> <tr> <th>id</th> <th></th> <th></th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>19350464562</th> <td> 2015-07-02 11:28:40</td> <td>2015-07-02 11:28:51</td> <td> IMG_20150702_112835</td> <td> 3200</td> <td> 2368</td> </tr> <tr> <th>19425609241</th> <td> 2015-07-04 13:17:22</td> <td>2015-07-04 19:49:20</td> <td> </td> <td> 1122</td> <td> 712</td> </tr> <tr> <th>19235236159</th> <td> 2015-07-04 13:17:33</td> <td>2015-07-04 19:49:22</td> <td> </td> <td> 458</td> <td> 46</td> </tr> <tr> <th>18800793223</th> <td> 2015-07-04 13:17:43</td> <td>2015-07-04 19:49:23</td> <td> </td> <td> 546</td> <td> 100</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>I verified the most recent upload dates against iphoto-flickr logs to be sure there were no timezone issues:</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[177]:</div> <div class="inner_cell"> <div 
class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">photo</span><span class="p">[[</span><span class="s">&#39;datetaken&#39;</span><span class="p">,</span> <span class="s">&#39;upload_date&#39;</span><span class="p">,</span> <span class="s">&#39;title&#39;</span><span class="p">,</span> <span class="s">&#39;width_o&#39;</span><span class="p">,</span> <span class="s">&#39;height_o&#39;</span><span class="p">]]</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="s">&#39;upload_date&#39;</span><span class="p">,</span> <span class="n">ascending</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[177]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>datetaken</th> <th>upload_date</th> <th>title</th> <th>width_o</th> <th>height_o</th> </tr> <tr> <th>id</th> <th></th> <th></th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>19453379905</th> <td> 2015-07-05 21:18:18</td> <td>2015-07-05 21:18:18</td> <td> 11223919_10103150053978467_4385673914430987089_n</td> <td> 960</td> <td> 639</td> </tr> <tr> <th>19457659531</th> <td> 2015-07-05 21:18:17</td> <td>2015-07-05 21:18:17</td> <td> Dining_Room_Turned_Office</td> <td> 550</td> <td> 400</td> </tr> <tr> <th>19447074092</th> <td> 2015-07-05 21:18:04</td> <td>2015-07-05 21:18:04</td> <td> 11261973_10103150053743937_5354724846552149885_n</td> <td> 960</td> <td> 639</td> </tr> <tr> <th>19447070902</th> <td> 2015-07-05 21:17:59</td> <td>2015-07-05 21:17:59</td> <td> 
11143223_10103150052730967_2134381211714517553_n</td> <td> 960</td> <td> 639</td> </tr> <tr> <th>19457651521</th> <td> 2015-07-05 21:17:55</td> <td>2015-07-05 21:17:55</td> <td> 11265214_10103150052591247_2966764364887797640_n</td> <td> 960</td> <td> 639</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="iPhoto,-give-me-my-data-back!">iPhoto, give me my data back!<a class="anchor-link" href="#iPhoto,-give-me-my-data-back!">&#182;</a></h2><p>flickrbackup found the library I'm interested in even though it's not in the default path. Ah... it's using AppleScript.</p> <p>iPhoto uses fairly nice .xml and .db files with a nice, sturdy uuid for each photo. But the id <code>flickrbackup.rb</code> got via AppleScript is nowhere to be found in there!</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[16]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="o">!</span>ssh airbook.local grep <span class="o">{</span>photo.index<span class="o">[</span>0<span class="o">]}</span> Pictures/flickrbackup/uploaded-photo-ids-map.txt </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stdout output_text"> <pre>4294967334 -&gt; 19226418710 </pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[17]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="o">!</span>ssh airbook.local grep <span class="m">4294967334</span> 
Pictures/iphoto-maryc/AlbumData.xml </pre></div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[18]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="o">!</span>ssh airbook.local sqlite3 Pictures/iphoto-maryc/iPhotoMain.db .dump <span class="p">|</span> grep <span class="m">4294967334</span> </pre></div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>The <a href="http://www.mugginsoft.com/html/kosmictask/ASDictionaryDocs/Apple/iPhoto/OS-X-10.7/iPhoto-9.2.3/html/">iPhoto script dictionary</a> doesn't show a uid property. <strong>Darn.</strong> We'll have to use file paths or something.</p> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h3 id="iPhoto-takes-orders-in-JavaScript">iPhoto takes orders in JavaScript<a class="anchor-link" href="#iPhoto-takes-orders-in-JavaScript">&#182;</a></h3> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>Ooh! We can use <a href="https://developer.apple.com/library/mac/releasenotes/InterapplicationCommunication/RN-JavaScriptForAutomation/">JXA</a>, JavaScript for Automation. Somebody made a nice <a href="https://github.com/dtinth/JXA-Cookbook/wiki/Using-JavaScript-for-Automation">cookbook</a> on GitHub.</p> <p>The first API I found for writing a string to a file was this <code>$.NSString</code> Objective-C bridge thing. Oh well. 
It works.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[39]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">export_keys</span> <span class="o">=</span> <span class="s">&#39;&#39;&#39;</span> <span class="s">#!/usr/bin/env osascript -l JavaScript</span> <span class="s">function save(info, where) {</span> <span class="s"> console.log(&#39;saving to...&#39;, where)</span> <span class="s"> var str = $.NSString.alloc.initWithUTF8String(JSON.stringify(info));</span> <span class="s"> str.writeToFileAtomicallyEncodingError(where, true, $.NSUTF8StringEncoding, null); </span> <span class="s">}</span> <span class="s">function getKeys(iPhoto) {</span> <span class="s"> var photos = iPhoto.photoLibraryAlbum().photos;</span> <span class="s"> return {</span> <span class="s"> id: photos.id(),</span> <span class="s"> date: photos.date(),</span> <span class="s"> width: photos.width(),</span> <span class="s"> height: photos.height(),</span> <span class="s"> originalPath: photos.originalPath(),</span> <span class="s"> imagePath: photos.imagePath()</span> <span class="s"> };</span> <span class="s">}</span> <span class="s">function run(argv) {</span> <span class="s"> out = argv[0];</span> <span class="s"> iPhoto = Application(&#39;iPhoto&#39;);</span> <span class="s"> save(getKeys(iPhoto), out)</span> <span class="s">}</span> <span class="s">&#39;&#39;&#39;</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span> </pre></div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>Let's save it, <code>scp</code> it over, run it, and <code>scp</code> the results back:</p> </div> </div> </div> <div class="cell border-box-sizing code_cell 
rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[40]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="k">def</span> <span class="nf">save_script</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">text</span><span class="p">):</span> <span class="kn">from</span> <span class="nn">pathlib</span> <span class="kn">import</span> <span class="n">Path</span> <span class="k">with</span> <span class="n">Path</span><span class="p">(</span><span class="n">name</span><span class="p">)</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s">&#39;wb&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">out</span><span class="p">:</span> <span class="n">out</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">text</span><span class="p">)</span> <span class="n">save_script</span><span class="p">(</span><span class="s">&#39;photo_keys.js&#39;</span><span class="p">,</span> <span class="n">export_keys</span><span class="p">)</span> </pre></div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[41]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="o">!</span>scp photo_keys.js airbook.local:Pictures/ </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stdout output_text"> <pre>photo_keys.js 100% 687 0.7KB/s 00:00 </pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[42]:</div> <div class="inner_cell"> <div class="input_area"> <div 
class=" highlight hl-ipython2"><pre><span class="o">!</span>ssh airbook.local osascript -l JavaScript Pictures/photo_keys.js Pictures/keys.json </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stdout output_text"> <pre>saving to... Pictures/keys.json </pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[43]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="o">!</span>scp airbook.local:Pictures/keys.json . </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stdout output_text"> <pre>keys.json 100% 1697KB 848.6KB/s 00:02 </pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>Now we can ground these ids in key information such as file paths, dates, and image sizes that we can join with other sources:</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[47]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">pk</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">json</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="nb">open</span><span class="p">(</span><span class="s">&#39;keys.json&#39;</span><span class="p">)))</span><span class="o">.</span><span 
class="n">set_index</span><span class="p">(</span><span class="s">&#39;id&#39;</span><span class="p">)</span> <span class="n">pk</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[47]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>date</th> <th>height</th> <th>imagePath</th> <th>originalPath</th> <th>width</th> </tr> <tr> <th>id</th> <th></th> <th></th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>4294967334</th> <td> 2002-09-25T15:14:23.000Z</td> <td> 600</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 800</td> </tr> <tr> <th>4294976544</th> <td> 2003-07-21T03:52:05.000Z</td> <td> 1385</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 1332</td> </tr> <tr> <th>4294976530</th> <td> 2003-08-08T20:09:51.000Z</td> <td> 1459</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Modifie...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 1647</td> </tr> <tr> <th>4294976542</th> <td> 2003-08-08T20:09:51.000Z</td> <td> 1536</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 2048</td> </tr> <tr> <th>4294971867</th> <td> 2006-02-01T17:05:22.000Z</td> <td> 377</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 495</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt 
input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>Applescript reports the full image paths, but we'll need library-relative paths for our work below.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[62]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">libloc</span> <span class="o">=</span> <span class="s">&#39;/Volumes/maryc/Pictures/iPhoto Library/&#39;</span> <span class="c"># TODO: get from applescript?</span> <span class="n">pk</span><span class="p">[</span><span class="s">&#39;relativePath&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="p">[</span><span class="n">p</span><span class="p">[</span><span class="nb">len</span><span class="p">(</span><span class="n">libloc</span><span class="p">):]</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">pk</span><span class="o">.</span><span class="n">imagePath</span><span class="p">]</span> <span class="n">pk</span><span class="p">[(</span><span class="n">pk</span><span class="o">.</span><span class="n">date</span> <span class="o">&gt;=</span> <span class="s">&#39;2002-09&#39;</span><span class="p">)</span> <span class="o">&amp;</span> <span class="p">(</span><span class="n">pk</span><span class="o">.</span><span class="n">date</span> <span class="o">&lt;</span> <span class="s">&#39;2002-10&#39;</span><span class="p">)][[</span><span class="s">&#39;date&#39;</span><span class="p">,</span> <span class="s">&#39;height&#39;</span><span class="p">,</span> <span class="s">&#39;width&#39;</span><span class="p">,</span> <span class="s">&#39;relativePath&#39;</span><span class="p">]]</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt 
output_prompt">Out[62]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>date</th> <th>height</th> <th>width</th> <th>relativePath</th> </tr> <tr> <th>id</th> <th></th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>4294967334</th> <td> 2002-09-25T15:14:23.000Z</td> <td> 600</td> <td> 800</td> <td> Originals/2002/Sep 25, 2002/Santa Cecelia gran...</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="iPhoto-data-without-iPhoto">iPhoto data without iPhoto<a class="anchor-link" href="#iPhoto-data-without-iPhoto">&#182;</a></h2><p>iPhoto keeps nice sqlite3 databases.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[207]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="k">def</span> <span class="nf">my_photo_db</span><span class="p">(</span><span class="n">path</span><span class="o">=</span><span class="s">&#39;maryc-airbook-iphoto-meta/iPhotoMain.db&#39;</span><span class="p">):</span> <span class="kn">import</span> <span class="nn">sqlite3</span> <span class="k">return</span> <span class="n">sqlite3</span><span class="o">.</span><span class="n">connect</span><span class="p">(</span><span class="n">path</span><span class="p">)</span> <span class="n">db1</span> <span class="o">=</span> <span class="n">my_photo_db</span><span class="p">()</span> <span class="n">q</span> <span class="o">=</span> <span class="s">&#39;&#39;&#39;</span> <span class="s">select count(distinct uid) from SqPhotoInfo</span> <span 
class="s">&#39;&#39;&#39;</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_sql</span><span class="p">(</span><span class="n">q</span><span class="p">,</span> <span class="n">db1</span><span class="p">)</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[207]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>count(distinct uid)</th> </tr> </thead> <tbody> <tr> <th>0</th> <td> 8607</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[212]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">q</span> <span class="o">=</span> <span class="s">&#39;&#39;&#39;</span> <span class="s">select count(*) qty, year from (</span> <span class="s"> select substr(datetime(photoDate + julianday(&#39;2000-01-01 00:00:00&#39;)), 1, 4) year</span> <span class="s"> from SqPhotoInfo</span> <span class="s">) t</span> <span class="s">group by year</span> <span class="s">having count(*) &gt; 10</span> <span class="s">&#39;&#39;&#39;</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_sql</span><span class="p">(</span><span class="n">q</span><span class="p">,</span> <span class="n">db1</span><span class="p">)</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[212]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr 
style="text-align: right;"> <th></th> <th>qty</th> <th>year</th> </tr> </thead> <tbody> <tr> <th>0</th> <td> 306</td> <td> 2011</td> </tr> <tr> <th>1</th> <td> 4368</td> <td> 2012</td> </tr> <tr> <th>2</th> <td> 1564</td> <td> 2013</td> </tr> <tr> <th>3</th> <td> 2190</td> <td> 2014</td> </tr> <tr> <th>4</th> <td> 161</td> <td> 2015</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h3 id="Cameras">Cameras<a class="anchor-link" href="#Cameras">&#182;</a></h3> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[195]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">q</span> <span class="o">=</span> <span class="s">&#39;&#39;&#39;</span> <span class="s">select qty,</span> <span class="s"> datetime(min_date + julianday(&#39;2000-01-01 00:00:00&#39;)) min_date,</span> <span class="s"> datetime(max_date + julianday(&#39;2000-01-01 00:00:00&#39;)) max_date,</span> <span class="s"> cameraModel from (</span> <span class="s">select count(*) qty, min(photoDate) min_date, max(photoDate) max_date, cameraModel</span> <span class="s"> from</span> <span class="s">sqphotoinfo</span> <span class="s">where photoDate &gt; julianday(&#39;1993-01-01&#39;) - julianday(&#39;2000-01-01 00:00:00&#39;)</span> <span class="s">group by cameraModel</span> <span class="s">)</span> <span class="s">where qty &gt;= 10</span> <span class="s">order by 1 desc</span> <span class="s">&#39;&#39;&#39;</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_sql</span><span class="p">(</span><span class="n">q</span><span class="p">,</span> <span class="n">db1</span><span class="p">)</span> </pre></div> </div> </div> </div> <div 
class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[195]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>qty</th> <th>min_date</th> <th>max_date</th> <th>cameraModel</th> </tr> </thead> <tbody> <tr> <th>0 </th> <td> 5568</td> <td> 2009-11-25 17:28:32</td> <td> 2013-11-04 11:05:02</td> <td> Canon PowerShot SD1100 IS</td> </tr> <tr> <th>1 </th> <td> 872</td> <td> 2013-12-04 07:02:10</td> <td> 2015-04-18 15:50:41</td> <td> FinePix S4850</td> </tr> <tr> <th>2 </th> <td> 641</td> <td> 2006-02-01 11:05:22</td> <td> 2015-05-23 09:47:31</td> <td> None</td> </tr> <tr> <th>3 </th> <td> 401</td> <td> 2012-09-21 11:06:43</td> <td> 2015-01-02 09:22:16</td> <td> Galaxy Nexus</td> </tr> <tr> <th>4 </th> <td> 399</td> <td> 2014-08-13 15:55:36</td> <td> 2014-08-19 11:15:30</td> <td> NIKON D800</td> </tr> <tr> <th>5 </th> <td> 273</td> <td> 2014-05-16 16:59:18</td> <td> 2014-06-28 14:09:51</td> <td> NIKON D3200</td> </tr> <tr> <th>6 </th> <td> 137</td> <td> 2012-11-14 17:35:19</td> <td> 2013-11-29 13:29:43</td> <td> FinePix S5Pro </td> </tr> <tr> <th>7 </th> <td> 90</td> <td> 2012-10-19 17:55:45</td> <td> 2015-03-12 11:37:07</td> <td> iPhone 4S</td> </tr> <tr> <th>8 </th> <td> 78</td> <td> 2012-09-15 18:21:05</td> <td> 2012-10-21 18:17:14</td> <td> Canon PowerShot ELPH 100 HS</td> </tr> <tr> <th>9 </th> <td> 65</td> <td> 2011-04-18 12:19:47</td> <td> 2012-05-05 15:40:20</td> <td> NIKON D90</td> </tr> <tr> <th>10</th> <td> 28</td> <td> 2014-06-27 22:17:35</td> <td> 2014-06-28 01:07:04</td> <td> Canon PowerShot SD1200 IS</td> </tr> <tr> <th>11</th> <td> 28</td> <td> 2014-10-02 13:22:41</td> <td> 2014-10-02 15:31:08</td> <td> SGH-T999</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing 
text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h3 id="Photos-and-Images">Photos and Images<a class="anchor-link" href="#Photos-and-Images">&#182;</a></h3><p>The model is nice and clean: it separates photos (photo-taking events) from image files, relates any number of possibly-edited images to each photo, and issues a uuid to each photo-taking event.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[139]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">q</span> <span class="o">=</span> <span class="s">&#39;&#39;&#39;</span>
<span class="s">select photo.primaryKey, photo.uid, datetime(photo.photoDate + julianday(&#39;2000-01-01 00:00:00&#39;)) as photoDate,</span>
<span class="s">       photo.cameraModel, photo.archiveFilename,</span>
<span class="s">       fi.imageWidth, fi.imageHeight, fi.fileSize, fi.imageType, fi.version,</span>
<span class="s">       fl.relativePath, fl.aliasPath</span>
<span class="s">       -- TODO: decode fl.format</span>
<span class="s">from SqPhotoInfo photo</span>
<span class="s">join SqFileImage fi on fi.photoKey = photo.primaryKey</span>
<span class="s">join SqFileInfo fl on fi.sqFileInfo = fl.primaryKey</span>
<span class="s">where fileSize &gt; 0</span>
<span class="s">order by photo.photoDate desc</span>
<span class="s">&#39;&#39;&#39;</span>
<span class="n">pdb</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_sql</span><span class="p">(</span><span class="n">q</span><span class="p">,</span> <span class="n">db1</span><span class="p">)</span>
<span class="n">pdb</span><span class="o">.</span><span class="n">head</span><span class="p">()</span>
</pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt
output_prompt">Out[139]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>primaryKey</th> <th>uid</th> <th>photoDate</th> <th>cameraModel</th> <th>archiveFilename</th> <th>imageWidth</th> <th>imageHeight</th> <th>fileSize</th> <th>imageType</th> <th>version</th> <th>relativePath</th> <th>aliasPath</th> </tr> </thead> <tbody> <tr> <th>0</th> <td> 9272</td> <td> 4BC69B9E-3AEF-4FA6-BFCC-5476CC6CAC0D</td> <td> 2015-05-23 09:47:31</td> <td> None</td> <td> Dining_Room_Turned_Office.jpg</td> <td> 550</td> <td> 400</td> <td> 34651</td> <td> 6</td> <td> 100</td> <td> Originals/2015/May 23, 2015/Dining_Room_Turned...</td> <td> None</td> </tr> <tr> <th>1</th> <td> 9272</td> <td> 4BC69B9E-3AEF-4FA6-BFCC-5476CC6CAC0D</td> <td> 2015-05-23 09:47:31</td> <td> None</td> <td> Dining_Room_Turned_Office.jpg</td> <td> 360</td> <td> 262</td> <td> 52460</td> <td> 5</td> <td> 100</td> <td> Data/2015/May 23, 2015/Dining_Room_Turned_Offi...</td> <td> None</td> </tr> <tr> <th>2</th> <td> 9282</td> <td> 11F6E0C9-90F8-43E3-8427-EFD1F98D5ECD</td> <td> 2015-05-20 09:24:50</td> <td> None</td> <td> 11223919_10103150053978467_4385673914430987089...</td> <td> 960</td> <td> 639</td> <td> 101115</td> <td> 6</td> <td> 100</td> <td> Originals/2015/May 20, 2015/11223919_101031500...</td> <td> None</td> </tr> <tr> <th>3</th> <td> 9282</td> <td> 11F6E0C9-90F8-43E3-8427-EFD1F98D5ECD</td> <td> 2015-05-20 09:24:50</td> <td> None</td> <td> 11223919_10103150053978467_4385673914430987089...</td> <td> 360</td> <td> 240</td> <td> 67356</td> <td> 5</td> <td> 100</td> <td> Data/2015/May 20, 2015/11223919_10103150053978...</td> <td> None</td> </tr> <tr> <th>4</th> <td> 9289</td> <td> 88DC7357-96A2-4C29-88CD-6DF9E39E4CBB</td> <td> 2015-05-20 09:24:41</td> <td> None</td> <td> 11261973_10103150053743937_5354724846552149885...</td> 
<td> 960</td> <td> 639</td> <td> 97108</td> <td> 6</td> <td> 100</td> <td> Originals/2015/May 20, 2015/11261973_101031500...</td> <td> None</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Joining-sqlite-data-with-flickr-via-applescript-key-info">Joining sqlite data with flickr via applescript key info<a class="anchor-link" href="#Joining-sqlite-data-with-flickr-via-applescript-key-info">&#182;</a></h2><p>Ah... excellent... even though there are more image files than photos, we get an exact 1-1 match when we join with our photo keys (implicitly on <code>relativePath</code>).</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[109]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="nb">len</span><span class="p">(</span><span class="n">pdb</span><span class="p">),</span> <span class="nb">len</span><span class="p">(</span><span class="n">pk</span><span class="p">),</span> <span class="nb">len</span><span class="p">(</span><span class="n">pdb</span><span class="o">.</span><span class="n">merge</span><span class="p">(</span><span class="n">pk</span><span class="p">))</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[109]:</div> <div class="output_text output_subarea output_execute_result"> <pre>(19127, 8057, 8057)</pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[150]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">pkdb</span> <span 
class="o">=</span> <span class="n">pk</span><span class="o">.</span><span class="n">reset_index</span><span class="p">()</span><span class="o">.</span><span class="n">merge</span><span class="p">(</span><span class="n">pdb</span><span class="p">)</span><span class="o">.</span><span class="n">set_index</span><span class="p">(</span><span class="s">&#39;id&#39;</span><span class="p">)</span> <span class="n">pkdb</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[150]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>date</th> <th>height</th> <th>imagePath</th> <th>originalPath</th> <th>width</th> <th>relativePath</th> <th>primaryKey</th> <th>uid</th> <th>photoDate</th> <th>cameraModel</th> <th>archiveFilename</th> <th>imageWidth</th> <th>imageHeight</th> <th>fileSize</th> <th>imageType</th> <th>version</th> <th>aliasPath</th> </tr> <tr> <th>id</th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>4294967334</th> <td> 2002-09-25T15:14:23.000Z</td> <td> 600</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 800</td> <td> Originals/2002/Sep 25, 2002/Santa Cecelia gran...</td> <td> 38</td> <td> 7281F4D1-E140-4832-B759-60D5B9DF78B1</td> <td> 2002-09-25 10:14:23</td> <td> PDR-3320</td> <td> Santa Cecelia granite.jpg</td> <td> 800</td> <td> 600</td> <td> 63092</td> <td> 6</td> <td> 100</td> <td> None</td> </tr> <tr> <th>4294976544</th> <td> 2003-07-21T03:52:05.000Z</td> <td> 1385</td> 
<td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 1332</td> <td> Originals/2003/Jul 20, 2003/brothers.jpg</td> <td> 9248</td> <td> 7EFC6CF1-F3A5-42DD-ADBC-52561476CC50</td> <td> 2003-07-20 22:52:05</td> <td> hp photosmart 720</td> <td> brothers.jpg</td> <td> 1332</td> <td> 1385</td> <td> 615319</td> <td> 6</td> <td> 100</td> <td> None</td> </tr> <tr> <th>4294976530</th> <td> 2003-08-08T20:09:51.000Z</td> <td> 1459</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Modifie...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 1647</td> <td> Modified/2003/Aug 8, 2003/justin car.jpg</td> <td> 9234</td> <td> E3D5AD74-7FC1-4916-A9DA-6C2CB47B5D16</td> <td> 2003-08-08 15:09:51</td> <td> hp photosmart 720</td> <td> justin car.jpg</td> <td> 1647</td> <td> 1459</td> <td> 828662</td> <td> 6</td> <td> 100</td> <td> None</td> </tr> <tr> <th>4294976542</th> <td> 2003-08-08T20:09:51.000Z</td> <td> 1536</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 2048</td> <td> Originals/2003/Aug 8, 2003_2/justin car.jpg</td> <td> 9246</td> <td> CCD2008C-4CB5-46BF-B99D-4BAE8999D8AD</td> <td> 2003-08-08 15:09:51</td> <td> hp photosmart 720</td> <td> justin car.jpg</td> <td> 2048</td> <td> 1536</td> <td> 738791</td> <td> 6</td> <td> 100</td> <td> None</td> </tr> <tr> <th>4294971867</th> <td> 2006-02-01T17:05:22.000Z</td> <td> 377</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 495</td> <td> Originals/2006/Feb 1, 2006/NY-Skyline-new-york...</td> <td> 4571</td> <td> 5D1426FC-5F86-457D-9EB7-6F761607C8FA</td> <td> 2006-02-01 11:05:22</td> <td> None</td> <td> NY-Skyline-new-york-1138029_495_377.jpeg</td> <td> 495</td> <td> 377</td> <td> 159106</td> <td> 6</td> <td> 100</td> <td> None</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> 
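<div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>The merge above relies on pandas joining on whatever column names the two frames share, which here is just <code>relativePath</code>. As a minimal sketch (not part of the original notebook, and assuming a pandas version recent enough to support the <code>validate</code> argument of <code>merge</code>), the key can be named explicitly and the 1-1 expectation checked by pandas itself. The two toy frames are made-up stand-ins for <code>pk</code> and <code>pdb</code>:</p>
<pre><code class="python">import pandas as pd

# Hypothetical miniature versions of the notebook's frames.
pk = pd.DataFrame({
    "id": [1, 2],
    "relativePath": ["Originals/a.jpg", "Modified/b.jpg"],
}).set_index("id")
pdb = pd.DataFrame({
    "primaryKey": [10, 10, 11],
    "relativePath": ["Originals/a.jpg", "Data/a.jpg", "Modified/b.jpg"],
})

# Name the join key explicitly and ask pandas to verify that each
# relativePath appears at most once on each side of the join.
pkdb = (pk.reset_index()
          .merge(pdb, on="relativePath", how="inner", validate="one_to_one")
          .set_index("id"))
</code></pre>
<p>If either side repeated a path, this merge would raise a <code>MergeError</code> instead of quietly duplicating rows, which is a handy guard before using the result as an id-to-id map.</p> </div> </div>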
</div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>Merging with the <code>upload_map</code> gives us a clear correspondence between iPhoto applescript ids and flickr ids.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[151]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">upkdb</span> <span class="o">=</span> <span class="n">upload_map</span><span class="o">.</span><span class="n">merge</span><span class="p">(</span><span class="n">pkdb</span><span class="p">,</span> <span class="n">left_on</span><span class="o">=</span><span class="s">&#39;apple&#39;</span><span class="p">,</span> <span class="n">right_index</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="nb">len</span><span class="p">(</span><span class="n">upkdb</span><span class="p">)</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[151]:</div> <div class="output_text output_subarea output_execute_result"> <pre>8057</pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[152]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">upkdb</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[152]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div 
style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>apple</th> <th>flickr</th> <th>date</th> <th>height</th> <th>imagePath</th> <th>originalPath</th> <th>width</th> <th>relativePath</th> <th>primaryKey</th> <th>uid</th> <th>photoDate</th> <th>cameraModel</th> <th>archiveFilename</th> <th>imageWidth</th> <th>imageHeight</th> <th>fileSize</th> <th>imageType</th> <th>version</th> <th>aliasPath</th> </tr> </thead> <tbody> <tr> <th>0</th> <td> 4294967334</td> <td> 19226418710</td> <td> 2002-09-25T15:14:23.000Z</td> <td> 600</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 800</td> <td> Originals/2002/Sep 25, 2002/Santa Cecelia gran...</td> <td> 38</td> <td> 7281F4D1-E140-4832-B759-60D5B9DF78B1</td> <td> 2002-09-25 10:14:23</td> <td> PDR-3320</td> <td> Santa Cecelia granite.jpg</td> <td> 800</td> <td> 600</td> <td> 63092</td> <td> 6</td> <td> 100</td> <td> None</td> </tr> <tr> <th>1</th> <td> 4294976544</td> <td> 18793385783</td> <td> 2003-07-21T03:52:05.000Z</td> <td> 1385</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 1332</td> <td> Originals/2003/Jul 20, 2003/brothers.jpg</td> <td> 9248</td> <td> 7EFC6CF1-F3A5-42DD-ADBC-52561476CC50</td> <td> 2003-07-20 22:52:05</td> <td> hp photosmart 720</td> <td> brothers.jpg</td> <td> 1332</td> <td> 1385</td> <td> 615319</td> <td> 6</td> <td> 100</td> <td> None</td> </tr> <tr> <th>2</th> <td> 4294976530</td> <td> 18793388523</td> <td> 2003-08-08T20:09:51.000Z</td> <td> 1459</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Modifie...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 1647</td> <td> Modified/2003/Aug 8, 2003/justin car.jpg</td> <td> 9234</td> <td> E3D5AD74-7FC1-4916-A9DA-6C2CB47B5D16</td> <td> 2003-08-08 15:09:51</td> <td> hp photosmart 
720</td> <td> justin car.jpg</td> <td> 1647</td> <td> 1459</td> <td> 828662</td> <td> 6</td> <td> 100</td> <td> None</td> </tr> <tr> <th>3</th> <td> 4294976542</td> <td> 18791506174</td> <td> 2003-08-08T20:09:51.000Z</td> <td> 1536</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 2048</td> <td> Originals/2003/Aug 8, 2003_2/justin car.jpg</td> <td> 9246</td> <td> CCD2008C-4CB5-46BF-B99D-4BAE8999D8AD</td> <td> 2003-08-08 15:09:51</td> <td> hp photosmart 720</td> <td> justin car.jpg</td> <td> 2048</td> <td> 1536</td> <td> 738791</td> <td> 6</td> <td> 100</td> <td> None</td> </tr> <tr> <th>4</th> <td> 4294971867</td> <td> 19414016905</td> <td> 2006-02-01T17:05:22.000Z</td> <td> 377</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> /Volumes/maryc/Pictures/iPhoto Library/Origina...</td> <td> 495</td> <td> Originals/2006/Feb 1, 2006/NY-Skyline-new-york...</td> <td> 4571</td> <td> 5D1426FC-5F86-457D-9EB7-6F761607C8FA</td> <td> 2006-02-01 11:05:22</td> <td> None</td> <td> NY-Skyline-new-york-1138029_495_377.jpeg</td> <td> 495</td> <td> 377</td> <td> 159106</td> <td> 6</td> <td> 100</td> <td> None</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Fixing-Dates">Fixing Dates<a class="anchor-link" href="#Fixing-Dates">&#182;</a></h2><p>Let's grab flickr photos with unknown date taken (with upload date, title, and original size).</p> <p>Then merge with the date information from the sqlite3 db.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[178]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">tofix</span> <span
class="o">=</span> <span class="n">photo</span><span class="p">[</span><span class="n">photo</span><span class="o">.</span><span class="n">datetakenunknown</span> <span class="o">==</span> <span class="s">&#39;1&#39;</span><span class="p">][[</span><span class="s">&#39;datetaken&#39;</span><span class="p">,</span> <span class="s">&#39;upload_date&#39;</span><span class="p">,</span> <span class="s">&#39;title&#39;</span><span class="p">,</span> <span class="s">&#39;width_o&#39;</span><span class="p">,</span> <span class="s">&#39;height_o&#39;</span><span class="p">]]</span> <span class="n">fixed</span> <span class="o">=</span> <span class="n">tofix</span><span class="o">.</span><span class="n">merge</span><span class="p">(</span><span class="n">upkdb</span><span class="p">,</span> <span class="n">left_index</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">right_on</span><span class="o">=</span><span class="s">&#39;flickr&#39;</span><span class="p">)[</span> <span class="p">[</span><span class="s">&#39;date&#39;</span><span class="p">,</span> <span class="s">&#39;photoDate&#39;</span><span class="p">,</span> <span class="s">&#39;upload_date&#39;</span><span class="p">,</span> <span class="s">&#39;title&#39;</span><span class="p">,</span> <span class="s">&#39;archiveFilename&#39;</span><span class="p">,</span> <span class="s">&#39;width_o&#39;</span><span class="p">,</span> <span class="s">&#39;imageWidth&#39;</span><span class="p">,</span> <span class="s">&#39;height_o&#39;</span><span class="p">,</span> <span class="s">&#39;imageHeight&#39;</span><span class="p">,</span> <span class="s">&#39;flickr&#39;</span><span class="p">,</span> <span class="s">&#39;uid&#39;</span><span class="p">]]</span><span class="o">.</span><span class="n">set_index</span><span class="p">(</span><span class="s">&#39;flickr&#39;</span><span class="p">)</span> <span class="k">print</span> <span class="nb">len</span><span 
class="p">(</span><span class="n">tofix</span><span class="p">),</span> <span class="nb">len</span><span class="p">(</span><span class="n">fixed</span><span class="p">)</span> <span class="n">fixed</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stdout output_text"> <pre>74 74 </pre> </div> </div> <div class="output_area"><div class="prompt output_prompt">Out[178]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>date</th> <th>photoDate</th> <th>upload_date</th> <th>title</th> <th>archiveFilename</th> <th>width_o</th> <th>imageWidth</th> <th>height_o</th> <th>imageHeight</th> <th>uid</th> </tr> <tr> <th>flickr</th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>19426161436</th> <td> 2014-10-20T18:47:27.000Z</td> <td> 2014-10-20 13:47:27</td> <td>2015-07-05 20:23:04</td> <td> segway tour - 4</td> <td> segway tour - 4.jpg</td> <td> 466</td> <td> 466</td> <td> 630</td> <td> 630</td> <td> 93061866-7680-43EA-9D58-8DE25E554B43</td> </tr> <tr> <th>19456508621</th> <td> 2014-10-20T18:48:03.000Z</td> <td> 2014-10-20 13:48:03</td> <td>2015-07-05 20:23:10</td> <td> segway tour - 28</td> <td> segway tour - 28.jpg</td> <td> 851</td> <td> 851</td> <td> 630</td> <td> 630</td> <td> BFCD692A-52C7-4059-8A9F-F6264133C155</td> </tr> <tr> <th>19266094319</th> <td> 2014-10-20T18:52:50.000Z</td> <td> 2014-10-20 13:52:50</td> <td>2015-07-05 20:23:12</td> <td> segway tour - 2</td> <td> segway tour - 2.jpg</td> <td> 466</td> <td> 291</td> <td> 630</td> <td> 438</td> <td> 
B854FE26-841E-4968-BDF2-975C18AB3B05</td> </tr> <tr> <th>19264673498</th> <td> 2014-10-20T18:48:22.000Z</td> <td> 2014-10-20 13:48:22</td> <td>2015-07-05 20:23:16</td> <td> segway tour - 33</td> <td> segway tour - 33.jpg</td> <td> 466</td> <td> 338</td> <td> 630</td> <td> 519</td> <td> 8976E385-2BE0-46AB-8E9E-2FA573574ED4</td> </tr> <tr> <th>19445946592</th> <td> 2014-10-31T13:45:47.000Z</td> <td> 2014-10-31 08:45:47</td> <td>2015-07-05 20:23:47</td> <td> Brennan baby tub</td> <td> Brennan baby tub.jpg</td> <td> 2351</td> <td> 2351</td> <td> 2945</td> <td> 2945</td> <td> BA448463-3932-40C2-811C-B30017C68087</td> </tr> </tbody> </table> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>For this, we need write access.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[169]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="k">def</span> <span class="nf">myFlickrEdit</span><span class="p">(</span><span class="n">user_id</span><span class="o">=</span><span class="s">&#39;14874637@N00&#39;</span><span class="p">):</span> <span class="kn">import</span> <span class="nn">pathlib</span> <span class="kn">import</span> <span class="nn">flickrapi</span> <span class="n">api_secret</span> <span class="o">=</span> <span class="n">pathlib</span><span class="o">.</span><span class="n">Path</span><span class="p">(</span><span class="s">&#39;flickr_api_secret&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">open</span><span class="p">()</span><span class="o">.</span><span class="n">read</span><span class="p">()</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span> <span class="k">return</span> <span 
class="n">flickdata</span><span class="o">.</span><span class="n">Write</span><span class="o">.</span><span class="n">make</span><span class="p">(</span><span class="n">flickrapi</span><span class="p">,</span> <span class="n">api_secret</span><span class="p">,</span> <span class="n">user_id</span><span class="p">)</span> <span class="n">edit</span> <span class="o">=</span> <span class="n">myFlickrEdit</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stderr output_text"> <pre>INFO:flickdata:authenticating... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.auth.oauth.checkToken&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickrapi.core:REST Parser: using xml.etree.cElementTree </pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>Let's work with one photo at first, verifying with the flickr web UI as we go.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[179]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">Image</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="n">photo</span><span class="o">.</span><span class="n">loc</span><span class="p">[</span><span class="mi">19426161436</span><span class="p">]</span><span class="o">.</span><span class="n">url_t</span><span class="p">)</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div 
class="output_area"><div class="prompt output_prompt">Out[179]:</div> <div class="output_html rendered_html output_subarea output_execute_result"> <img src="https://farm1.staticflickr.com/308/19426161436_28c40cd67d_t.jpg"/> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[180]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">fixed</span><span class="o">.</span><span class="n">loc</span><span class="p">[</span><span class="mi">19426161436</span><span class="p">]</span><span class="o">.</span><span class="n">photoDate</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt output_prompt">Out[180]:</div> <div class="output_text output_subarea output_execute_result"> <pre>u&apos;2014-10-20 13:47:27&apos;</pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[181]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="n">edit</span><span class="o">.</span><span class="n">setDates</span><span class="p">(</span><span class="mi">19426161436</span><span class="p">,</span> <span class="n">date_taken</span><span class="o">=</span><span class="n">fixed</span><span class="o">.</span><span class="n">loc</span><span class="p">[</span><span class="mi">19426161436</span><span class="p">]</span><span class="o">.</span><span class="n">photoDate</span><span class="p">)</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stderr output_text"> <pre>INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-10-20 13:47:27&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com </pre> </div> </div> <div class="output_area"><div class="prompt output_prompt">Out[181]:</div> <div class="output_text output_subarea output_execute_result"> <pre>{u&apos;stat&apos;: u&apos;ok&apos;}</pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>Now we can iterate over all the fixes.</p> <p>Incremental updates came in handy here. At first, I forgot to rate-limit my requests, and flickr noticed after a few hundred. I went back, re-fetched the metadata for recent photos in my photostream, and finished off the rest.</p> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[182]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="kn">import</span> <span class="nn">time</span> </pre></div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell rendered"> <div class="input"> <div class="prompt input_prompt">In&nbsp;[185]:</div> <div class="inner_cell"> <div class="input_area"> <div class=" highlight hl-ipython2"><pre><span class="k">def</span> <span class="nf">do_fixes</span><span class="p">():</span> <span class="k">for</span> <span class="n">pid</span><span class="p">,</span> <span class="n">photo</span> <span class="ow">in</span> <span class="n">fixed</span><span class="o">.</span><span class="n">iterrows</span><span class="p">():</span> <span class="n">edit</span><span class="o">.</span><span class="n">setDates</span><span class="p">(</span><span class="n">pid</span><span 
class="p">,</span> <span class="n">date_taken</span><span class="o">=</span><span class="n">photo</span><span class="o">.</span><span class="n">photoDate</span><span class="p">)</span> <span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mf">0.5</span><span class="p">)</span> <span class="n">do_fixes</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="output_wrapper"> <div class="output"> <div class="output_area"><div class="prompt"></div> <div class="output_subarea output_stream output_stderr output_text"> <pre>INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-10-20 13:47:27&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-10-20 13:48:03&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-10-20 13:52:50&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-10-20 13:48:22&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-10-31 08:45:47&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-10-31 09:07:10&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-10-31 08:59:05&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-10-31 09:07:30&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-01 07:06:54&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:27:35&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:28:06&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:27:35&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:28:40&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:28:16&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:29:05&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:30:01&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:36:10&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:31:59&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:35:29&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:35:15&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 18:39:51&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:26:15&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:26:30&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:26:42&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:27:04&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:28:15&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:28:56&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:28:36&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:29:09&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:30:34&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:30:45&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:30:56&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:31:13&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:32:18&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:32:18&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:32:41&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:32:30&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:32:52&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:33:23&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:35:17&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:34:47&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:34:56&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:35:08&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:36:40&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:35:38&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 19:36:51&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 20:04:18&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 20:07:33&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 20:07:44&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-11-08 20:08:11&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-12-27 14:51:54&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-12-27 14:52:04&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-12-27 14:52:16&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-12-27 14:52:34&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2014-12-27 14:52:48&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-01-09 14:22:41&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-03-03 17:19:17&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-04-09 19:53:29&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-04-09 19:47:33&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-04-09 19:49:17&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-04-09 19:55:26&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-04-11 09:03:06&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-04-09 20:09:42&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-04-11 09:11:11&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-04-18 21:51:53&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-05-20 09:23:33&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-05-20 09:23:42&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-05-20 09:24:28&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-05-20 09:23:52&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-05-20 09:24:08&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-05-20 09:24:17&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-05-20 09:24:41&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-05-23 09:47:31&apos;}... INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com INFO:flickdata:setDates: {&apos;date_taken&apos;: u&apos;2015-05-20 09:24:50&apos;}... 
INFO:flickrapi.core:Calling {&apos;nojsoncallback&apos;: 1, &apos;method&apos;: &apos;flickr.photos.setDates&apos;, &apos;format&apos;: &apos;parsed-json&apos;} INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api.flickr.com </pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Future-Work">Future Work<a class="anchor-link" href="#Future-Work">&#182;</a></h2><ul> <li>make an album of all the photos uploaded in this process?</li> <li>tag flickr photos with uids<ul> <li>don't lose "untagged" state, though! capture untagged-ness in an album or something.</li> </ul> </li> <li>sync events... using photosets?</li> <li>sync faces</li> </ul> </div> </div> </div> <div class="cell border-box-sizing text_cell rendered"> <div class="prompt input_prompt"> </div> <div class="inner_cell"> <div class="text_cell_render border-box-sizing rendered_html"> <p>Additional notes bookmarked under: <a href="https://www.diigo.com/user/dckc-madmode/mac%20photos">mac photos</a>, <a href="https://www.diigo.com/user/dckc-madmode/mac%20sysadmin">mac sysadmin</a></p> </div> </div> </div> Tue, 07 Jul 2015 00:00:00 +0000 http://www.madmode.com//2015/photo-flickr-explore/ studying Knuth's Mastermind Solver with rust http://www.madmode.com//2015/rust1/ <p>To celebrate rust turning 1.0, here's what I learned with <a href="https://github.com/dckc/mmind5">mmind5</a>, a study of <a href="http://en.wikipedia.org/wiki/Mastermind_%28board_game%29#Five-guess_algorithm">Knuth's five guess algorithm for mastermind</a>.</p> <p>While I had my share of frustration with the borrow-checker, the rust type system is expressive enough that code is typically correct once it compiles. 
I had only one bug fix in the whole project.</p> <p>My first commit was <em>codemaker chooses a random Pattern of CodePegs</em>:</p> <pre><code> #[derive_Rand] #[derive(Debug)] enum CodePeg { Red, Orange, Yellow, Green, Blue, White } #[derive(Debug)] struct Pattern { pegs: [CodePeg; 4] } impl Rand for Pattern { fn rand&lt;R: Rng&gt;(rng: &amp;mut R) -&gt; Self { Pattern { pegs: [CodePeg::rand(rng), CodePeg::rand(rng), CodePeg::rand(rng), CodePeg::rand(rng)] } } } </code></pre> <p>But as soon as I started working on scoring a guess vs. the code and wanted to iterate over the pegs, my next commit was <em>abandon fixed sized array in favor of vec</em>.</p> <p>Then we get nice functional code for scoring blacks:</p> <pre><code> let rightColorAndPlace = (0..Pattern::size()).map(|pos| { if g[pos] == s[pos] { Some(KeyPeg::Black) } else { None } }).collect(); </code></pre> <p>White scoring is more involved; I worked it out using sloppy <code>println!()</code> debugging. More on testing below.</p> <p>As I started to study the algorithm, I changed the representation of <code>Pattern</code> to a single <code>u32</code> representing the (lexicographic) index of the pattern, converting to a vector of pegs as needed for scoring. And I punted on deriving <code>Rand</code>.</p> <p>The rust <code>std::collections::BitSet</code> was a good match for <em>... the set S of 1296 possible codes, 1111,1112,.., 6666.</em></p> <p>I got the first five steps of the algorithm working on the first night; or so I thought. On the second night, I got the final minmax step coded up and fixed that bug in step 5 (<code>remove_mismatches</code>) and tada! It works.</p> <p>Once I got it working, I found any number of ways to clarify the code by refactoring. 
Each time, it was a matter of making one isolated change and letting the compiler guide me through the rest of the places in the code that needed fixing.</p> <p>For example, I had conversion from patterns to indexes and back mixed in with scoring logic:</p> <pre><code> let mut guesses_with_score = HashMap::new(); for guess_ix in 0..Pattern::cardinality() { if !self.guessed.contains(&amp;Pattern::ith(guess_ix)) { let score = guess_score(guess_ix); let guess = Pattern::ith(guess_ix); guesses_with_score.entry(score).or_insert(vec![]).push(guess) } } </code></pre> <p>I was able to factor out <code>PatternSet</code>, hiding the <code>BitSet</code> of indexes, so the solver logic looks like:</p> <pre><code> let mut guesses_with_score = HashMap::new(); for guess in Pattern::range() { if !self.guessed.contains(&amp;guess) { let score = guess_score(guess); guesses_with_score.entry(score).or_insert(vec![]).push(guess) } } </code></pre> <p>Implementing <code>Iterator</code> for <code>Solver</code> worked nicely, but doing <code>IntoIterator</code> for <code>PatternSet</code> stumped me. It's frustrating: all I wanted to do was factor out the expression <code>Pattern::range().filter(|p| self.s.contains(p))</code> as <code>into_iter</code> on <code>s</code>, but its type is a monster to write out and I never did get the associated types and lifetimes figured out.</p> <p>It seems to make two or three guesses per second, which seems pretty speedy, considering it seems to be O(N^2) where N = 1296. Now that I think about it, those were debug builds. 
A release build takes a small part of a second to solve the whole thing:</p> <pre><code>mmind$ time target/release/mmind codemaker: 4112 turn 1: 1122 BBW turn 2: 1223 WW turn 3: 4115 BBB turn 4: 4112 BBBB real 0m0.026s user 0m0.026s sys 0m0.000s </code></pre> <p>This is what I would see during development:</p> <pre><code>mmind$ time target/debug/mmind codemaker: 1553 turn 1: 1122 B turn 2: 1344 BW turn 3: 4524 B turn 4: 1336 BW turn 5: 1553 BBBB real 0m3.804s user 0m3.793s sys 0m0.004s </code></pre> <p>Another thing that felt slow was example documentation tests. Having support for them is great; python doctest got me addicted to this style. But testing them seems to rely on having the crate built; i.e. <code>cargo test</code> isn't enough; I had to do <code>cargo build; cargo test</code>. And to see the documentation, it becomes <code>cargo build; cargo test; cargo doc</code>.</p> <p>I'm also addicted to emacs and flycheck-mode. flycheck-rust works pretty well but helping it find the crate root is a little fidgety.</p> Wed, 20 May 2015 00:00:00 +0000 http://www.madmode.com//2015/rust1/ Installing a web IDE for postgress: three hours of woe http://www.madmode.com//2015/phpPgAdmininstallwoes/ <blockquote class="twitter-tweet" lang="en"> <p>finally! after 3 hours, got phpPgAdmin working. I want to like postgres over mysql, but the initial experience is dreadful.</p>&mdash; Dan Connolly (@dckc) <a href="https://twitter.com/dckc/status/584229649535737856">April 4, 2015</a></blockquote> <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script> <blockquote class="twitter-tweet" lang="en"><p><a href="https://twitter.com/dckc">@dckc</a> I (and some others in the community) would be interested in a more detailed writeup if you were willing.</p>&mdash; Robert Treat (@robtreat2) <a href="https://twitter.com/robtreat2/status/584523279056162816">April 5, 2015</a></blockquote> <!-- hide the 1st tweet if js is turned on? 
--> <p>OK, I'm willing.</p> <p>My goal that evening was: give peers in a multi-site research project a web-based IDE to access a postgres database underneath a jboss app running on CentOS on AWS.</p> <p>We've been using ssh tunnels and public keys, but creating those accounts, not to mention using them, is tedious. We'd like to delegate account provisioning to Jenkins, but we don't give Jenkins blanket root access. I realized that something like phpMyAdmin would obviate the need for unix accounts altogether.</p> <p>Is there such a thing for postgres? Yes: <a href="http://phppgadmin.sourceforge.net/doku.php">phpPgAdmin</a></p> <p>I downloaded it and checked the <a href="https://raw.githubusercontent.com/phppgadmin/phppgadmin/master/INSTALL">INSTALL</a> doc:</p> <blockquote> <ol> <li>Unpack your download ...</li> <li>Configure phpPgAdmin - edit phpPgAdmin/conf/config.inc.php ...</li> <li>Ensure the statistics collector is enabled in PostgreSQL. phpPgAdmin will display table, index performance, and usage statistics if you have enabled the PostgreSQL statistics collector. While this is normally enabled by default, ...</li> <li>Browse to the phpPgAdmin installation using a web browser.</li> <li><strong>IMPORTANT - SECURITY</strong><br /> PostgreSQL by default does not require you to use a password to log in. We STRONGLY recommend that you enable md5 passwords for local connections in your pg_hba.conf, and set a password for the default superuser account.<br /> Due to the large number of phpPgAdmin installations that have not set passwords on local connections, there is now a configuration file option called 'extra_login_security', which is TRUE by default. 
&#160;While this option is enabled, you will be unable to log in to phpPgAdmin as the 'root', 'administrator', 'pgsql' or 'postgres' users and empty passwords will not work.<br /> Once you are certain you have properly secured your database server, you can then disable 'extra_login_security' so that you can log in as your database administrator using the administrator password.</li> </ol> </blockquote> <p>I don't know why step 2 is there. The defaults look OK as far as I can tell, so I'm already not sure I'm doing it right. <em>If the defaults are OK in typical cases, move step 2 to a troubleshooting FAQ section later. Likewise step 3, since (a) the statistics collector is on by default, and (b) statistics doesn't seem like a critical "getting started" feature.</em></p> <p>The fact that the security step comes <em>after</em> the service is available on the net threw me. I immediately tried to figure out what was going on there.</p> <p>The reference to "your <strong>pg_hba.conf</strong>" was frustrating. I tried to find it with <strong>locate</strong>. No joy. From <code>rpm -qa | grep postgres</code> I recall the main package is <strong>postgresql91</strong>. But <code>rpm -ql postgresql91|grep pg_hba</code> turns up empty. I get as far as <code>pg_config --sysconfdir</code>, which says <strong>/etc/sysconfig/pgsql</strong>, but nope; empty too.</p> <p>Some relevant-looking docs were easy enough to find with a quick web search: <a href="http://www.postgresql.org/docs/9.1/static/auth-pg-hba-conf.html">19.1. The pg_hba.conf File</a> says:</p> <blockquote> <p>A default pg_hba.conf file is installed when the data directory is initialized by initdb.</p> </blockquote> <p>Ah&#8230; <code>initdb</code>&#8230; that seems familiar. So I pore over notes from setting up the database, and I find it: in <strong>/var/lib/pgsql/9.1/data/pg_hba.conf</strong>. 
A google search for that path turns up 5,840 results, but it's not there in section 19.1 of the official documentation, nor do I win if I follow the link to <a href="http://www.postgresql.org/docs/9.1/static/runtime-config-file-locations.html#GUC-HBA-FILE">18.2. File Locations</a>. <strong>Before you tell me "It is possible to place the authentication configuration file elsewhere &#8230;" how about you tell me, in concrete, literal terms, where it typically is?!?!?!?</strong></p> <p>Now that I found it, I don't understand what exactly I'm supposed to change. "We STRONGLY recommend that you enable md5 passwords for local connections in your pg_hba.conf, and set a password for the default superuser account." But not so strongly as to spell out how to do it nor cite documentation on how to do it. More on that below.</p> <p>The current configuration seems fail-safe, though, so I go ahead with step 4 and try to browse. Bzzzt:</p> <blockquote> <p>Your PHP installation does not support PostgreSQL. You need to recompile PHP using the <code>--with-pgsql</code> configure option.</p> </blockquote> <p>Then I vaguely remember php's mysql support is packaged separately, so I got hunting, and surprise! CentOS actually supports phpPgAdmin itself:</p> <pre><code>$ yum search php | grep -i postgres php-pear-MDB2-Driver-pgsql.noarch : PostgreSQL MDB2 driver php-pgsql.x86_64 : A PostgreSQL database module for PHP phpPgAdmin.noarch : Web-based PostgreSQL administration </code></pre> <p>So...</p> <pre><code>$ sudo yum install phpPgAdmin Installed: phpPgAdmin.noarch 0:5.1-1.rhel6 Dependency Installed: php-pdo.x86_64 0:5.3.3-40.el6_6 php-pgsql.x86_64 0:5.3.3-40.el6 </code></pre> <p>and try to browse. No joy: some sort of HTTP forbidden error.</p> <p><code>rpm -ql</code> turns up <strong>/etc/httpd/conf.d/phpPgAdmin.conf</strong>, where we find "By default this application is only accessible from the local host." OK, fair enough. 
I tweak that apache config file and now I see a phpPgAdmin web page showing one server, PostgreSQL. Hmm. I choose it and I get a username/password prompt. I enter my linux credentials. No joy. "Login failed".</p> <p>So I go looking for clues in apache log files (<code>ssl_error_log</code>, <code>error_log</code>, <code>access_log</code>), linux/CentOS log files (<code>/var/log/messages</code>), and postgres log files (<code>/var/lib/pgsql/9.1/data/pg_log/postgresql-Fri.log</code>). None to be had. Is the <code>php.ini</code> config suppressing them? Not as far as I can tell.</p> <p>So I begin guessing what the problem is.</p> <p>Between <code>phpPgAdmin/conf/config.inc.php</code> and <code>pg_hba.conf</code>, I must have tried a dozen combinations. In several cases, postgres wouldn't start at all. In <strong>no case</strong> were there <strong>any relevant diagnostics</strong> in <strong>any log file</strong> that I could find. I found logs of SQL syntax errors from ordinary select statements, but no connection error logs.</p> <p>That <code>phpPgAdmin/conf/config.inc.php</code> file says:</p> <pre><code> // Hostname or IP address for server. Use '' for UNIX domain socket // use 'localhost' for TCP/IP connection on this computer $conf['servers'][0]['host'] = ''; </code></pre> <p>but what worked was changing the <code>auth-method</code> in <strong>pg_hba.conf</strong> for <code>host</code> 127.0.0.1 to <code>md5</code>.</p> <p>Meanwhile, problems setting up passwords undermined my confidence in setting up md5 authentication. Stackoverflow discussion or something suggested the <strong>createuser</strong> utility, but it kept giving me "already exists" errors. 
I stumbled across the <code>-e</code> flag, which spit out the <code>CREATE ROLE &#8230;</code> SQL; I changed that to <code>ALTER ROLE &#8230;</code> and it worked.</p> <p><a href="http://www.postgresql.org/docs/9.1/static/auth-pg-hba-conf.html">Section 19.1</a> presents an exhaustive enumeration of the authentication methods of postgres where I would have appreciated successive elaboration: start with the simplest, most typical setup, which seems to be peer. Then have sections in increasing complexity, where the complexity is motivated by related issues; e.g. "md5 for local connections," "passwords with SSL," and then LDAP, and then rocket-science like kerberos and such. In each section, show one complete worked example ending with an actual SQL query that worked, even if that worked example doesn't exercise all of the options. The less typical options can be explained reference style without an example.</p> <p>The root of many of the problems I ran into is perhaps not with postgres itself but the way it's packaged for CentOS, the phpPgAdmin documentation, or even apache or php logging configuration. But the community around mysql is such that concretely documented solutions to these integration issues are, at most, a web search away.</p> Sun, 05 Apr 2015 00:00:00 +0000 http://www.madmode.com//2015/phpPgAdmininstallwoes/ Pebble beats out Garmin Vivofit for my wrist http://www.madmode.com//2014/12-watch1/ <p>My wife, big on walking but not usually a gadget freak, got so addicted to tracking with her Nike+ that when the web site stopped working, it was a major problem. 
She asked for a fitbit for her birthday.</p> <p>The sleep tracking would be a nice bonus.</p> <p>She talked about getting me one for my birthday too so that I could join her fitbit friends leaderboard.</p> <p>Why not a smartwatch while I'm at it?</p> <p>I have been tracking the market for years, but the Casio 3090 that I already have is good enough that it would take something pretty special to get me to switch:</p> <ol> <li>At about $50, I can afford to replace it every five years or so when I break it or lose it.</li> <li>It's maintenance free. It sets itself from WWV radio every night; it's water resistant; and it's solar powered, so it <strong>never needs charging</strong>.</li> <li>It does one thing really well: keeps time. To the small part of a second. And date. And weekday. And timezones.</li> </ol> <p>A smartwatch that I have to charge every night loses out to the fitbit on sleep tracking. That and the price rules out the current crop (Android Wear, Apple Watch).</p> <p>A friend recommended the Garmin Vivofit ($75). It interoperates with the fitbit web site, tells time, and claims a battery life of around a year.</p> <p>Another acquaintance loves his Pebble ($100). So do lots of other reviewers, while some gripe about style and some report its fitness tracking features don't really cut it.</p> <p>I ordered them both to see which one I like better. I opened the Pebble first.</p> <ul> <li>It's not as big or bulky as I expected. It's actually smaller and lighter than my Casio.</li> <li>Sleep tracking just works. The fitbit took a few nights of fidgeting to figure out how to get it in and out of sleep tracking mode. The Pebble featured a MisFit app when I turned it on. I said sure, go ahead. That night I didn't bother to figure out how to turn on sleep tracking, but when I awoke, lo, there were the data.</li> </ul> <p>I did hit one glitch where the sleep tracker kicked in when I was watching a movie. 
And there's no web site integration; Android sync is <a href="https://apps.getpebble.com/applications/53a898a2cfee2a02c900006c">"coming soon"</a>.</p> <p>Speaking of sleep, I didn't get much that first night because...</p> <ul> <li><strong>Developer support is amazing</strong>. I knew about their open platform with a C SDK, but what blew me away was <a href="http://developer.getpebble.com/guides/js-apps/pebble-js/">Pebble.js</a> and <a href="https://cloudpebble.net/">CloudPebble</a>. Zero install. Just sign up, grab the example javascript, and hit <em>Install and Run</em>, and there it is, running on your watch. It seems like magic, but it's all just open source.</li> </ul> <p>Two hours later, I had a <a href="https://github.com/dckc/watch1">working prototype</a> of an app I've been thinking about since I re-joined the world of commuting: <em>On my way home, figure out my ETA and send it to my wife so she can figure it in to dinner plans.</em> And if I weren't new to ordinary stuff like <a href="http://stackoverflow.com/a/1197947/846824">javascript date handling</a>, the <a href="https://developer.mozilla.org/en-US/docs/Web/API/Geolocation/Using_geolocation">geolocation API</a>, and the <a href="https://developers.google.com/maps/documentation/directions/">google maps API</a>, it would have been a lot less than two hours.</p> <p>Quite a contrast from trying out <a href="http://blog.jetbrains.com/idea/2014/09/developer-tools-for-phonegap-cordova-and-ionic-in-intellij-idea-14/">IntelliJ's IDE support for Cordova</a>; two hours wasn't even enough to get the underlying Android SDK installed. I ran out of space on my devtools and had to grow it. Twice.</p> <p>While I have the ETA on the Pebble now, I haven't figured out how to actually send it. 
I can see how to do it with Ajax to a service like Twilio, but that seems silly; surely I can just get the phone to send the text.</p> <p>I might build an app to sync my MisFit data to the fitbit API (though I'd rather somebody else did that for me). And I'm itching to try embedded app development with <a href="https://github.com/franc0is/RustyPebble">RustyPebble</a>, too.</p> <p>Anyway, it's game over. I returned the Vivofit without even opening it. The Pebble beat out my Casio watch too:</p> <ol> <li>It's a good value, though if I lost or broke it, I'm not sure whether I'd replace it right away or make do with the old watch for a while. Time will tell.</li> <li>It's pretty low maintenance. It charges quickly enough that there's no conflict with sleep tracking. I just charge it for an hour or so every few days while I commute or while I'm at my desk. The defaults for notifications were a little overwhelming, but it's fine now that I played with the options a bit.</li> <li>It tells time and date, plus the weather and steps and sleep and appointments and texts and email and anything else developers like me can dream up. Once a boring email message got in the way of job 1 when I looked for the time. But I'm happy with the trade. I never miss a phone call now. 
And thanks to Android Smart Lock integration, it saves me keying in my PIN but about once a day.</li> </ol> Fri, 19 Dec 2014 00:00:00 +0000 http://www.madmode.com//2014/12-watch1/ Rust-Sqlite3 -- Rustic bindings for sqlite3 http://www.madmode.com//2014/08-rust-sqlite3/ <p>I was looking into <a href="https://sandstorm.io/">sandstorm</a>, a personal cloud platform with an architecture based on the wonderful <a href="http://www.erights.org/elib/capability/ode/ode-capabilities.html">capability security</a> paradigm, and I found a rust application, <a href="https://github.com/dwrensha/acronymy">acronymy</a>, that uses the native API rather than the traditional POSIX environment.</p> <p>I started poring over the code and followed the dependency link to linuxfood's <a href="https://github.com/linuxfood/rustsqlite">rustsqlite</a>. I started working on a <a href="https://github.com/linuxfood/rustsqlite/issues/92">memory safety issue</a> etc. but soon found a number of large-scale API design issues that I wasn't sure how to approach with the upstream developers. I was also inspired by <code>FromSql</code>, <code>ToSql</code> and such from sfackler's <a href="https://github.com/sfackler/rust-postgres">rust-postgres</a> API.</p> <p>So I started from scratch, using <a href="https://github.com/crabtw/rust-bindgen">bindgen</a>, <code>Result</code> (sum types) etc.</p> <p>Thanks to <strong>apoelstra</strong> and others in the <a href="http://www.rust-lang.org/">rust community</a> IRC channel, I'm making pretty good progress. The API isn't stable yet; testing continues to turn up issues at a pretty high rate. But it's getting there.</p> <p>Open source collaboration and QA tools are great these days. 
Not only do we have <a href="https://github.com/dckc/rust-sqlite3">rust-sqlite3 on github</a>, but every time I push there:</p> <ul> <li><a href="http://www.rust-ci.org/dckc/rust-sqlite3">rust-sqlite3 on travis-ci</a> runs all the tests and builds the docs and pushes them to</li> <li><a href="http://www.rust-ci.org/dckc/rust-sqlite3/doc/sqlite3/">rust-sqlite3 on rust-ci</a>, where docs are published.</li> </ul> Wed, 13 Aug 2014 00:00:00 +0000 http://www.madmode.com//2014/08-rust-sqlite3/ Capability Security Advances: seL4, sandstorm, Rserve http://www.madmode.com//2014/08-ocap-here-and-there/ <p>They did it. It's done. General Dynamics C4 Systems and NICTA released <a href="http://sel4.systems/">seL4</a> under an open source license. I've been wishing for this since 2012, when I bookmarked the <a href="http://www.ertos.nicta.com.au/research/sel4/">seL4 papers</a> under <a href="https://www.diigo.com/user/dckc-madmode/security%20research?sort=created">security research</a>.</p> <p>Oh for time to study all the details! The formal model of C, the translator from C to Isabelle, and on and on.</p> <p>I wonder what it would take to put node-js on seL4, with no linux kernel in between. Then, using <a href="http://research.google.com/pubs/pub40673.html">secure-ecmascript and such</a>, we might have a complete capability based platform.</p> <p>Meanwhile, <a href="https://sandstorm.io/">sandstorm</a> is an emerging personal cloud platform with a capability security architecture (<a href="http://kentonv.github.io/capnproto/">cap'n proto</a> etc.). I pitched in to the kickstarter campaign because I really want to use it but I doubt I'll find time to host it myself.</p> <p>But what really surprised me was that while wrestling with some python/R foreign function interface issues, I discovered this bit of <a href="http://rforge.net/Rserve/dev.html">rserve news</a>:</p> <blockquote> <h2>Additions in version 1.7</h2> <p>... 
Another major change is the new, optional object capability mode in which all commands are disabled except for CMD_OCcall. In this mode the server does not send an ID string, but instead sends a regular QAP1 message with CMD_OCinit. This message is guaranteed to have at least 16 bytes of payload so it will satisfy the read for an ID string. The command has been chosen to correspond to "RsOC" (in little-endian) as to identify this mode. The payload is DT_SEXP which holds all initial capabilities that can be used in CMD_OCcall. Each CMD_OCcall is DT_SEXP encoding a call (i.e., LANGSXP) with an OCref object in place of the closure. Rserve will de-reference it before calling eval. The main purpose of this mode is to create a basis for a secure interface where arbitrary evaluation is not possible. Only code exposed by capabilities can be executed.</p> </blockquote> <p>The <a href="https://github.com/att/rcloud/issues/73">rcloud move to Rserve OCAP mode</a> seems to be done. I filed a <a href="https://github.com/ralhei/pyRserve/issues/5">wish for support in pyRserve</a>.</p> Wed, 13 Aug 2014 00:00:00 +0000 http://www.madmode.com//2014/08-ocap-here-and-there/ A Start in the Craft of Quality Software Development http://www.madmode.com//2014/06-pada1/ <p>I've taken on an open source software development apprentice.</p> <p>He's passionate about music and gaming, so we looked at <a href="http://music-suite.github.io/docs/ref/">The Music Suite</a> as an introduction to Haskell, but it's too bleeding edge: the <em>Hello World</em> example has an extraneous dependency on <a href="http://hackage.haskell.org/package/unix">unix</a>, which won't fly on his Windows development machine. I looked into re-arranging the dependencies, but even on linux, the released version doesn't install cleanly. 
We looked at haskell games, but installing OpenGL doesn't look like instant gratification either.</p> <p>It turns out he has an idea for a web site to automate some game player-ranking stuff that he does.</p> <p>He's done some Java development, so I thought perhaps using <a href="http://code.google.com/p/joe-e/">Joe-E</a> would be a good way to expose him to <a href="http://erights.org/elib/capability/ode/ode-capabilities.html">object capability security</a>, but Joe-E evidently went fallow in 2011. I can't get it to work with any handy version of Eclipse.</p> <p><a href="http://www.scala-lang.org/documentation/getting-started.html">Starting with Scala</a> seems reasonable; it was <a href="../2010/advogato_entry0071.html">my bridge from python to functional programming</a>, after all.</p> <p>As I had hoped, the tools have matured. While I'm an emacs addict, I don't think I should infect the next generation, so I'm happy to find that <a href="http://www.jetbrains.com/idea/download/download_thanks.jsp">IntelliJ is open source</a> and its plug-in support does as well or better at things like:</p> <ul> <li><a href="http://www.jetbrains.com/idea/webhelp/publishing-a-project-on-github.html">Publishing a Project on GitHub</a></li> <li><a href="http://plugins.jetbrains.com/plugin/5970?pr=phpStorm">Markdown syntax support</a></li> </ul> <p><a href="http://confluence.jetbrains.com/display/IntelliJIDEA/Getting+Started+with+SBT">IntelliJ gets along with SBT</a> now too, granting wishes for software from our peers via <a href="http://search.maven.org/">maven</a>.</p> <p>The raw data for the player ranking is on the Web, so our first two wishes were:</p> <ul> <li>an HTTP client library (<a href="http://dispatch.databinder.net/Dispatch.html">dispatch</a>) and</li> <li>an HTML parse/query library (<a href="http://jsoup.org/">jsoup</a>).</li> </ul> <p>Dispatch makes good use of <a href="http://en.wikipedia.org/wiki/Functional_programming#Type_systems">types</a>, and using jsoup 
involved only a little <a href="http://stackoverflow.com/questions/1072784/how-can-i-convert-a-java-iterable-to-a-scala-iterable">adaptation between Java and Scala types</a>. He was pretty excited when he saw the program extracting various bits of information about each match.</p> <p>The next episode looks to be <a href="http://www.jetbrains.com/idea/features/google_app_engine.html">deployment with Google App Engine</a>.</p> Sat, 28 Jun 2014 00:00:00 +0000 http://www.madmode.com//2014/06-pada1/ Talking to the web http://www.madmode.com//2014/01-talking-to-the-web/ <p>Text chat is wonderfully near-real-time: you can think out loud and somebody might respond right away or quite a while later or not at all.</p> <blockquote> <p>The random rewards of gambling are much more seductive than a more predictable reward cycle. &mdash; <a href="http://www.boston.com/news/globe/ideas/articles/2007/08/19/your_brain_on_gambling/">Your brain on gambling</a></p> </blockquote> <p>I'm usually in the #swig IRC channel when I'm hacking, and <a href="http://chatlogs.planetrdf.com/swig/2006-03-29.html#T04-45-32">this nugget</a> from Dave Beckett hit the nail on the head:</p> <pre><code>&lt;DanC&gt; lots about mouse clicks and files and folders. sigh. I think folks want TOC items like: instant messagining, email, and web browsing. maybe photo editing, CD ripping &lt;DanC&gt; "Configuring a CD Database" is under "9. Using Preference Tools" &lt;DanC&gt; er... wha? how do I actually see the documents that these search results refer to? http://gnomedesktop.org/search/node/keyring &lt;DanC&gt; oh... wierd... the headlines are invisible &lt;Ontogon_&gt; dan, are you talking to yourself? 
&lt;dajobe&gt; he's talking to the web </code></pre> Thu, 16 Jan 2014 00:00:00 +0000 http://www.madmode.com//2014/01-talking-to-the-web/ Exploring the Web Hosting Marketplace http://www.madmode.com//2013/fs-tt/web-host-shopping/ <div class="text_cell_render border-box-sizing rendered_html"> <p>The client I develop <a href="https://bitbucket.org/DanC/hh-office">hh-office</a> for is seeing poor performance, so I'm shopping for web hosting alternatives.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="LAMP-on-Dreamhost:-it's-easier-than-thinking">LAMP on Dreamhost: it's easier than thinking<a class="anchor-link" href="#LAMP-on-Dreamhost:-it's-easier-than-thinking">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>The app is an ordinary PHP/MySQL app with a few python bits and bobs. When I originally deployed it in late 2011, I knew:</p> <ol style="list-style-type: decimal"> <li>Lots of people used dreamhost <ul> <li>including <a href="http://impressive.net/archives/fogo/20070109173322.GO5388@impressive.net">Gerald, a world-class sysadmin, back in 2007</a></li> </ul></li> <li>Lots of people <em>complain</em> about dreamhost.</li> </ol> <p>But it's not clear that the complaints about dreamhost indicate anything other than popularity. 
After all, as the <a href="http://indiewebcamp.com/web_hosting">IndieWebCamp folks say</a>, picking a web hosting service is in some ways like picking a cell phone provider, and we all complain about our cell phone providers, don't we?</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Shared-Hosting,-VPS,-and-System-Administration">Shared Hosting, VPS, and System Administration<a class="anchor-link" href="#Shared-Hosting,-VPS,-and-System-Administration">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>The first performance remedy I tried was the dreamhost $15/month <a href="http://www.dreamhost.com/servers/vps/">VPS hosting</a> upgrade, but it made little difference. I didn't read the documentation carefully enough to notice that <strong>the database is on a separate server</strong>. I think I saw some database-related upgrade options, but in a <a href="https://drupal.org/node/120736">drupal performance discussion</a>, dreamhost is notorious for poor MySQL architecture and hence performance. So I went shopping for alternatives.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>I tried the <a href="http://bitnami.com/stack/lamp">LAMP Stack by bitnami</a> on Amazon EC2, but the hourly fees started to add up. One VM is more than enough for this app and I guess I should have known that <strong>scalable cloud hosting isn't cost-effective if I'm not using the scalability</strong>.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>So I returned to exploring the overcrowded shared hosting marketplace. justhost and site5 were nominated in the drupal discussion, as was arvixe. I looked to see if any of them had free trials, and I was reminded of <a href="http://owncloud.org/providers/">ownCloud hosts</a> with free plans, one of which was arvixe. 
I might have liked to chat with <a href="http://www.meetup.com/kcphpug/">other KC PHP devs</a>, but I was too impatient to wait for the next meeting, so based on arvixe's delightful sign-up experience and its hostjury reviews, I went ahead and paid a few dollars to experiment with their shared hosting for a month.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>What jumped out at me, after experimenting with a couple VPS platforms, is the economy of scale in <strong>system administration services</strong> with shared hosting. For just a few dollars a month, not only will they install wordpress or phpBB with a few clicks, but they will administer mailing lists, backups, log files, and databases.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>I suppose I could have tried NearlyFreeSpeech.Net. For less than a dollar a month, it works fine for static sites like this blog. But while they support everything this app needs (PHP, MySQL and installing python from source), I wouldn't expect performance to be better than dreamhost on their shoestring budget.</p> <p>Plus, I could see my boys making good use of the one-click phpBB installer and such.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>In the long term, here's hoping <a href="http://www.docker.io/">docker</a> drives the price of PAAS services like Heroku down to this price range and diversifies their feature set to include mailing lists.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Arvixe-over-Dreamhost?">Arvixe over Dreamhost?<a class="anchor-link" href="#Arvixe-over-Dreamhost?">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>As I explored arvixe, I had a little hiccup getting shell access, but <strong>live chat support</strong> took care of it fast enough that I didn't really lose stride.</p> <p>After <a 
href="http://www.arvixe.com/linux_web_hosting">claims</a> that they &quot;provide the latest Python version,&quot; I was a little disappointed to see that this means they let you <a href="http://blog.arvixe.com/create-your-own-python-enviroment-locally-in-your-shared-hosting-account/">build it from source</a>. Oh well; I had to do that on dreamhost too; it seems to be par for the course and it takes just a few minutes.</p> <p>Bandwidth seemed OK, not that downloading on the server is relevant to my app:</p> <pre><code>[~]# wget http://python.org/ftp/python/2.7.6/Python-2.7.6.tgz ... 14,725,931 2.76M/s in 9.5s </code></pre> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>After copying the ~120K records about ~10K clients, the app did seem to respond a little quicker on arvixe, though I don't have any hard data. I did see that PHP and the MySQL database were running on the same server.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="A-Summary-of-My-Experience-with-Web-Hosting-Services">A Summary of My Experience with Web Hosting Services<a class="anchor-link" href="#A-Summary-of-My-Experience-with-Web-Hosting-Services">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>I missed out on the first ten or fifteen years of this marketplace, since I was in something of a bubble provided by the W3C systems team.</p> <p>Since then, these are the services I have used, with the one I'd most likely use again on top:</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <ul> <li>domain registrar: <ul> <li>namecheap</li> <li>nearlyfreespeech.net <ul> <li>A consequence of their transparent pricing model is deposit fees. 
Namecheap's scale seems to hide such things.</li> </ul></li> <li>gandi <ul> <li>International credit card transactions are a little inconvenient.</li> </ul></li> </ul></li> <li>DNS management (connecting domains to hosts) <ul> <li>namecheap</li> <li>nearlyfreespeech.net</li> <li>Amazon Route 53</li> <li>zonedit.com <ul> <li>Was great when the only alternative was <code>bind</code> config files!</li> </ul></li> </ul></li> <li>Shared hosting: <ul> <li>arvixe</li> <li>NearlyFreeSpeech.Net <ul> <li>Image bandwidth seems limited, for understandable reasons.</li> </ul></li> <li>dreamhost <ul> <li>Docs suggest a long, heavy legacy (reminds me of W3C in that way).</li> </ul></li> </ul></li> <li>blogging SAS: <ul> <li>wordpress</li> <li>blogger <ul> <li>Cheaper than wordpress for bring-your-own-domain service, but you get what you pay for.</li> </ul></li> </ul></li> <li>PAAS <ul> <li><a href="http://www.docker.io/">docker</a></li> </ul></li> <li>VPS: <ul> <li>AWS EC2</li> <li>dreamhost</li> </ul></li> </ul> </div> Sat, 04 Jan 2014 00:00:00 +0000 http://www.madmode.com//2013/fs-tt/web-host-shopping/ The Aaron Swartz Movie Match Pledge http://www.madmode.com//2014/aaronsw-match/ <p>As we mark the new year, I'm <a href="https://twitter.com/waxpancake/status/418224535003869184">right there with Andy Baio</a>:</p> <blockquote> <p>For me, 2013 will always be the year we lost Aaron. One of the hardest things I've ever written. <a href="http://waxy.org/2013/01/aaron/">waxy.org/2013/01/aaron/</a></p> </blockquote> <p>In <a href="http://www.aaronsw.com/weblog/000765">Aaron's account</a> of the 2002 Creative Commons license launch, he wrote:</p> <blockquote> <p>when I go to a movie, I donate money in the amount I spent to the <a href="https://www.eff.org/">EFF</a>.</p> </blockquote> <p>If 2013 is the year we lost Aaron, let 2014 be the year we pick up where he left off. 
In the <a href="http://archive.rootstrikers.org/www.rootstrikers.org/aaron_swartz.html">words of Lawrence Lessig</a>:</p> <blockquote> <p>... our fight was his fight. And [...] while only he was Aaron Swartz, we are all now Aaron Swartz.</p> </blockquote> <p>Every time you go to a movie, think about the impact on <a href="http://randomfoo.net/oscon/2002/lessig/">free culture</a>:</p> <blockquote> <ul> <li>Creativity and innovation always builds on the past.</li> <li>The past always tries to control the creativity that builds upon it.</li> <li>Free societies enable the future by limiting this power of the past.</li> <li>Ours is less and less a free society.</li> </ul> </blockquote> <p>I'd like to see lots of other people <strong>take <a href="http://www.pledgebank.com/aaronsw-match">this pledge</a></strong>, but I've been matching my movie spending with donations in Aaron's memory (to the EFF, <a href="http://www.rootstrikers.org/">rootstrikers</a>, etc.) for about a year now, and I plan to continue regardless of how many people join me. I picked 26 because that's how old Aaron was when we lost him:</p> <blockquote> <p>I will <a href="http://www.pledgebank.com/aaronsw-match">match my spending on movies with donations in memory of Aaron Swartz</a> but only if <strong>26</strong> other people will do the same.</p> <p>— Dan Connolly, Open Web Advocate (contact)</p> </blockquote> Wed, 01 Jan 2014 00:00:00 +0000 http://www.madmode.com//2014/aaronsw-match/ Writing Madmode Articles with IPython and Docker http://www.madmode.com//2013/fs-tt/nbpub/ <div class="text_cell_render border-box-sizing rendered_html"> <p>When I was doing <a href="http://www.madmode.com/2012/light-runner-spelunking.html">exploratory signal processing</a> a year ago, the <a href="#Perez07">IPython notebook</a> was obviously a good tool. I tried it again recently for <a href="http://www.madmode.com/2013/fs-tt/fs86.html">bringing old math notes back to life</a>, and that went well too. 
So I'm putting a little effort into tooling support.</p> <p>Hypertext editing with markdown works pretty well, especially a cell at a time. I was a little concerned that I'd miss the ability to select/cut/copy/paste multiple cells like Mathematica or do file-wide search and replace like emacs, but so far I haven't needed to.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Installing-IPython-notebook-via-a-docker-container">Installing IPython notebook via a docker container<a class="anchor-link" href="#Installing-IPython-notebook-via-a-docker-container">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>The Ubuntu 12.04 ipython notebook package isn't up to the task, and between these episodes, my manual installation bit-rotted. I got it running again and jotted down some rough notes for future reference:</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <pre><code>#!/bin/sh &gt; virtualenv ~/pyenv/pynb &gt; . ~/pyenv/pynb/bin/activate (pynb)&gt; pip install ipython (pynb)&gt; sudo apt-get install libzmq-dev (pynb)&gt; pip install pyzmq ZMQ version detected: 2.1.11 Warning: Detected ZMQ version: 2.1.11, but pyzmq targets ZMQ 4.0.3. Warning: libzmq features and fixes introduced after 2.1.11 will be unavailable. (pynb)&gt; pip install jinja2 Downloading Jinja2-2.7.1.tar.gz (377Kb): 377Kb downloaded (pynb)&gt; pip install tornado Downloading tornado-3.1.1.tar.gz (374Kb): 374Kb downloaded (pynb)&gt; ipython notebook</code></pre> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>Those notes quickly get out of date. For example, nbconvert requires pandoc.</p> <p>Then I realized a docker container would be just the thing. 
And lo, <a href="https://index.docker.io/u/dckc/ipython-docker/">dckc/ipython-docker</a> is born:</p> <pre><code>$ sudo docker run -p 8123:8888 -v `/bin/pwd`:/notebooks -t dckc/ipython-docker 2013-12-31 04:28:05.305 [NotebookApp] Created profile dir: u&#39;/.ipython/profile_default&#39; 2013-12-31 04:28:05.308 [NotebookApp] Using MathJax from CDN: http://cdn.mathjax.org/mathjax/latest/MathJax.js 2013-12-31 04:28:05.320 [NotebookApp] Serving notebooks from local directory: /notebooks 2013-12-31 04:28:05.320 [NotebookApp] The IPython Notebook is running at: http://0.0.0.0:8888/ 2013-12-31 04:28:05.321 [NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).</code></pre> <p>Note http://0.0.0.0:8888/ is an address from inside the container. From outside the container, we use port 8123.</p> <p>The Control-C message needs some context too: you'd have to attach the container to send it signals via the keyboard. I typically just use <code>sudo docker kill</code> to stop the service. I haven't bothered with the details of starting at boot and such.</p> <p>Of course, after I got it all working, I found several other <a href="https://index.docker.io/search?q=ipython">ipython images</a> in the index. But I'm not sorry I worked it out for myself.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h3 id="Getting-started-with-docker">Getting started with docker<a class="anchor-link" href="#Getting-started-with-docker">&#182;</a></h3> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>Docker is moving rapidly, but it's considerably more polished now than when I first looked at it. The <a href="http://docs.docker.io/en/latest/installation/ubuntulinux/">docker apt-repositories for Ubuntu</a> (key fingerprint: 36A1 D786 9245 C895 0F96 6E92 D857 6A8B A88D 21E9) work just fine. 
My only issue getting it started this time was that I had an old installation lying around in <code>/usr/local/bin</code> and it was getting in the way, and the diagnostics were a little mysterious:</p> <pre><code>$ sudo docker run -p :8888 -t ipython-notebook WARNING: The mapping to public ports on your host has been deprecated. Use -p to publish the ports.</code></pre> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Generating-a-static-HTML-version-of-a-notebook">Generating a static HTML version of a notebook<a class="anchor-link" href="#Generating-a-static-HTML-version-of-a-notebook">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>IPython supports conversion to HTML, but out-of-the-box, you either get:</p> <ol style="list-style-type: decimal"> <li>a stand-alone HTML document <ul> <li>with all sorts of CSS that may or may not conflict with a blog style</li> <li>with no links to blog context</li> </ul></li> <li>a stripped-down HTML document body with <ul> <li>no style</li> <li>no syntax highlighting</li> </ul></li> </ol> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>Fortunately, the API for custom renditions is straightforward and well documented. My <a href="/static/code/ipynb_pub/mm_ipy.py">mm_ipy.py</a> is serviceable, though I'm still working through some issues with <a href="https://github.com/ipython/ipython/pull/4682">pygments vs. 
javascript code highlighting</a> and such.</p> <p>Let's import it to take a look:</p> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[1]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="kn">import</span> <span class="nn">imp</span> <span class="n">mm_ipy</span> <span class="o">=</span> <span class="n">imp</span><span class="o">.</span><span class="n">load_source</span><span class="p">(</span><span class="s">&#39;mm_ipy&#39;</span><span class="p">,</span> <span class="s">&#39;code/ipynb_pub/mm_ipy.py&#39;</span><span class="p">)</span> </pre></div> </div> </div> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>Then let's connect the dots between rst markup in documentation and HTML renditions of values in ipython notebook cell outputs:</p> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[2]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="kn">from</span> <span class="nn">docutils.core</span> <span class="kn">import</span> <span class="n">publish_parts</span> <span class="k">class</span> <span class="nc">Doc</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span> <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">it</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">it</span> <span class="o">=</span> <span class="n">it</span> <span class="k">def</span> <span class="nf">_repr_html_</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="k">return</span> <span class="n">publish_parts</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">it</span><span 
class="o">.</span><span class="n">__doc__</span><span class="p">,</span> <span class="n">writer_name</span><span class="o">=</span><span class="s">&#39;html&#39;</span><span class="p">)[</span><span class="s">&#39;html_body&#39;</span><span class="p">]</span> </pre></div> </div> </div> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[3]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="n">Doc</span><span class="p">(</span><span class="n">mm_ipy</span><span class="p">)</span> </pre></div> </div> </div> <div class="vbox output_wrapper"> <div class="output vbox"> <div class="hbox output_area"><div class="prompt output_prompt"> Out[3]:</div> <div class="box-flex1 output_subarea output_pyout"> <div class="output_html rendered_html"> <div class="document" id="mm-ipy-convert-ipython-notebook-to-markdown-for-madmode-blog"> <h1 class="title">mm_ipy -- convert ipython notebook to markdown for madmode blog</h1> <p>Usage:</p> <pre class="literal-block"> $ python article_in.ipynb article_out.md </pre> <p>See <span class="func">article_meta</span> for conventions for title, date, tags, etc.</p> <div class="note"> <p class="first admonition-title">Note</p> <p class="last">IPython.nbconvert.HTMLExporter has late-binding dependencies on pandoc, pygments, etc.</p> </div> <div class="section" id="acknowledgements"> <h1>Acknowledgements</h1> <blockquote> <ul class="simple"> <li><a class="reference external" href="http://nbviewer.ipython.org/github/Carreau/posts/blob/master/06-NBconvert-Doc-Draft.ipynb#noqa">How to Use NBConvert</a> (<a class="reference external" href="https://github.com/Carreau/posts/blob/master/06-NBconvert-Doc-Draft.ipynb">source</a>) Matthias Bussonnier (Carreau) Dec 01, 2013</li> </ul> </blockquote> </div> </div> </div> </div> </div> </div> </div> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>The notebook should have some 
article metadata in a markdown cell surrounded by a certain kind of pre tags:</p> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[4]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="n">Doc</span><span class="p">(</span><span class="n">mm_ipy</span><span class="o">.</span><span class="n">article_meta</span><span class="p">)</span> </pre></div> </div> </div> <div class="vbox output_wrapper"> <div class="output vbox"> <div class="hbox output_area"><div class="prompt output_prompt"> Out[4]:</div> <div class="box-flex1 output_subarea output_pyout"> <div class="output_html rendered_html"> <div class="document"> <p>Collect article metadata from a notebook.</p> <blockquote> <p>The title is taken from the (first) heading level 1 cell.</p> <p>Other metadata is taken from the (first) cell that starts with:</p> <pre class="literal-block"> &gt;&gt;&gt; print article_meta.func_defaults[0] &lt;pre class=&quot;about yaml&quot;&gt; </pre> <p>Metadata is written in YAML-ish name: value style (see <span class="func">grok_yaml</span> for details).</p> <p>The closing tag is ignored:</p> <pre class="literal-block"> &gt;&gt;&gt; print article_meta.func_defaults[1] &lt;/pre&gt; </pre> </blockquote> </div> </div> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[5]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="n">Doc</span><span class="p">(</span><span class="n">mm_ipy</span><span class="o">.</span><span class="n">grok_yaml</span><span class="p">)</span> </pre></div> </div> </div> <div class="vbox output_wrapper"> <div class="output vbox"> <div class="hbox output_area"><div class="prompt output_prompt"> Out[5]:</div> <div class="box-flex1 output_subarea output_pyout"> <div class="output_html rendered_html"> <div class="document"> 
<p>Quick-n-dirty YAML parser.</p> <blockquote> <pre class="doctest-block"> &gt;&gt;&gt; grok_yaml(&quot;&quot;&quot;&lt;pre&gt; ... date: 2001-01-01 ... tags: ['travel', 'humor'] ... &lt;/pre&gt;&quot;&quot;&quot;, excludes=['&lt;']) [('date', '2001-01-01'), ('tags', &quot;['travel', 'humor']&quot;)] </pre> <div class="note"> <p class="first admonition-title">Note</p> <p class="last">TODO: handle continuation lines properly.</p> </div> <pre class="doctest-block"> &gt;&gt;&gt; grok_yaml(&quot;&quot;&quot;&lt;pre&gt; ... summary: What I did ... this summer. ... &lt;/pre&gt;&quot;&quot;&quot;, excludes=['&lt;']) [('summary', 'What I did'), (' this summer.',)] </pre> </blockquote> </div> </div> </div> </div> </div> </div> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Packages-from-apt-and-PyPI">Packages from apt and PyPI<a class="anchor-link" href="#Packages-from-apt-and-PyPI">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>The container doesn't have access to packages installed in the host system via pip or apt-get. But I can install from pypi within the container.</p> <p>Installing from within the Dockerfile makes the package part of the container, but it involves killing and re-starting the container. And it feels less minimal/modular somehow.</p> <p>Installing from within a notebook (e.g. <code>!pip install docutils</code>) is handy, but once the container is stopped, the installation goes away (unless the container is committed and the image kept handy somehow).</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="File-layout-limitations">File layout limitations<a class="anchor-link" href="#File-layout-limitations">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>The IPython notebook service can only see notebooks in one directory. I wish it were more web-like, i.e. 
it expected to be <a href="http://www.w3.org/DesignIssues/Principles.html#TOII">part of a larger whole</a>. I'd like to use it to edit <code>.ipynb</code> files under the various date-oriented subdirectories of my blog. Linking from a notebook to files elsewhere in the blog is also pretty awkward.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="References">References<a class="anchor-link" href="#References">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p><em>TODO: Zotero integration</em></p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <ul> <li><a name="Perez07">Pérez, Fernando, and Brian E. Granger. 2007</a>. “IPython: a System for Interactive Scientific Computing.” Computing in Science &amp; Engineering 9 (3): 21–29. doi:10.1109/MCSE.2007.53. URL: http://ipython.org</li> </ul> </div> Mon, 30 Dec 2013 00:00:00 +0000 http://www.madmode.com//2013/fs-tt/nbpub/ Digital Restoration for Math Notes: Natural Deduction http://www.madmode.com//2013/fs-tt/fs86/ <div class="text_cell_render border-box-sizing rendered_html"> <p>My <a href="/search/label/archive-math-notes/">quest</a> to find a good digital preservation technique for my college math and computer science notebooks has been rekindled most recently by <a href="http://www.idris-lang.org/">idris</a> and earlier by <a href="http://us.metamath.org/index.html">metamath</a> and <a href="https://pypi.python.org/pypi/proofcheck">proofcheck</a>.</p> <p>Meanwhile, the IPython notebook with MathJax and FLiP makes for an interesting editing and collaboration tool.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Rendering-Inference-Rules-with-MathJax">Rendering Inference Rules with MathJax<a class="anchor-link" href="#Rendering-Inference-Rules-with-MathJax">&#182;</a></h2> </div> <div class="text_cell_render 
border-box-sizing rendered_html"> <p>This is a typical page of my notes:</p> <figure> <img src="https://lh6.googleusercontent.com/-q_hDdsOim7k/Ur9JxjlmwnI/AAAAAAAABg4/i0UTdHqQcq0/w614-h613-no/M373K_notebook_pg2-e.png" /> <figcaption>M373K notebook page 2</figcaption> </figure> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>Math and C.S. literature is traditionally written with \(\TeX\). I've used it only occasionally and reluctantly, by <a href="http://www.w3.org/2004/04/xhlt91/">writing in HTML and converting</a>, but MathJax seems to do a great job of integrating it into the web.</p> <p>The first definition on that page of number theory notes is &quot;a divides b&quot;:</p> <figure> <img src="https://lh6.googleusercontent.com/-CmJUpTOqcKc/Urz6D_3W5zI/AAAAAAAABd4/Dx7cZZrdznw/w500-h123-no/defn_a_divides_b.png" alt="a divides b iff ..." /> <figcaption>Definition of &quot;a divides b&quot;</figcaption> </figure> <p>Using MathJax, I can get a pretty good rendition:</p> <blockquote> <p>\(\operatorname{Def^n}\) \(a | b\) &quot;a divides b&quot;</p> <p>Note: universe = \(\Bbb Z\)</p> <p>\[ a|b \iff \exists k (b = ka)\]</p> </blockquote> <p>I looked all over for a way to render bi-directional inference rules, but I could only find MathJax support for one direction:</p> <blockquote> <p>\[ \frac{a|b}{\exists k (b = ka)}\]</p> </blockquote> <p>Perhaps using this notation for &quot;equivalent by definition&quot; would be better than bi-directional implication in some ways, though it's not much better for visual pattern matching:</p> <blockquote> <p>\[ a|b \equiv \exists k (b = ka)\]</p> </blockquote> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Proofcheck:-Formalizing-TeX-proofs-with-Morris-Logic">Proofcheck: Formalizing TeX proofs with Morris Logic<a class="anchor-link" href="#Proofcheck:-Formalizing-TeX-proofs-with-Morris-Logic">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing 
rendered_html"> <p>The point is not just to <em>render</em> the notes nicely but to <em>capture the knowledge</em> in a way I (and my collaborators) can exploit by machine.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>Proofs in the \(\TeX\) dialect in Bob Neveln's <a href="http://cs.widener.edu/proofcheck/">ProofCheck</a> system can be checked by a few thousand lines of python code. The dialect imposes very little in the way of logical constraints over and above the way articles are typically written:</p> <figure> <img src="http://cs.widener.edu/proofcheck/examples/divides1.png" alt="" /> <figcaption>formal proof: Divisibility is Transitive</figcaption> </figure> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>I tried to figure out how it works from the python source code but quickly got lost. While attempting a scala port (<a href="https://bitbucket.org/DanC/pfmorris">pfmorris</a>) to sort out the latent types, I realized the python source wasn't the best explanation of what's going on. The <a href="http://cs.widener.edu/proofcheck/commonnotions.html">common notions</a> file explains the use of <a href="#Alps-Neveln81">Morris Logic</a> with second-order schemators and <a href="https://en.wikipedia.org/wiki/Epsilon_calculus">Hilbert's epsilon</a> for indefinite description.</p> <p>Working on the scala port got sufficiently repetitive and tedious that I wondered if automating it might work better. The byproduct is <a href="https://bitbucket.org/DanC/py2scala">py2scala</a>, which turns out to be more directly useful for python refactoring than porting. 
More on that in another episode, I hope.</p> <p>Meanwhile, back to the quest to preserve my notebooks...</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Natural-Deduction-and-Fitch-Diagrams">Natural Deduction and Fitch Diagrams<a class="anchor-link" href="#Natural-Deduction-and-Fitch-Diagrams">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>The notation I used for formal proofs throughout my time at U.T. Austin, and to this day for similar tasks, comes from the <em>Philosophy 313K: Logic, Sets and Functions</em> course. The instructors were Kant and Bonevac. I remember Kant giving most of the lectures, but Bonevac wrote the <a href="#Bonevac86">text</a>:</p> <figure> <a href="https://plus.google.com/photos/112068148589999713385/albums/5961914947303558753/5962551333982176850?pid=5962551333982176850&amp;oid=112068148589999713385"> <img src="https://lh5.googleusercontent.com/LPbzFM_hRIeV3kr3y4kMv4-uObT_lfN8ys8G5AN3CO0=w159-h207-p-no" alt="book and notebook" /> <figcaption>Proof text by Bonevac and my PHL313k notebook</figcaption> </a> </figure> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>The system is introduced on p. 98:</p> <blockquote> <p>The idea of an axiomatic system is old, dating at least from the time of Euclid. The Stoics, who were Greek philosophers of the third century B.C. were the first logicians to organize logic axiomatically. In contrast, natural deduction systems are relatively new; Gerhard Gentzen, a German logician, and Stanislaw Jaskowski, a Polish student of Jan Lukasiewicz, independently proposed the first natural deduction systems in 1934. The system of this book owes a great deal, as well, to innovations by the American logicians Willard van Orman Quine, Frederic B. Fitch, Donald Kalish and Richard Montague.</p> </blockquote> <p>The history was lost on me at the time, but it took on practical relevance as I looked at metamath. 
I could read the proofs fairly well, but when I tried to write even a simple one, I was stuck. It wasn't until I discovered <a href="http://wiki.planetmath.org/cgi-bin/wiki.pl/Natural_deduction_based_metamath_system">a natural deduction based metamath system</a> that I realized the conventional <a href="http://us.metamath.org/mpegif/mmset.html#traditional">metamath proofs are written Hilbert-style</a> and the system I learned is a natural deduction system, and converting Hilbert-style to natural deduction is notoriously difficult.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>The modern rendition of the text seems to be a more polished book, <a href="http://bonevac.info/deduction/About_the_Book.html">Deduction</a>. The text isn't handy for me to link to, but the system seems to be based on <a href="https://en.wikipedia.org/wiki/Fitch-style_calculus">fitch diagrams</a>, which are supported by various tools around the web.</p> <p>Consider the first example from chapter 4, <em>Formal Proof</em>, of the '86 text:</p> <figure> <img src="https://lh6.googleusercontent.com/-lKLjhVZbU5Q/Ur876JuXUCI/AAAAAAAABfU/HimdrgZ6iCI/w768-h290-no/bonevac86_p108_ex.jpg" alt="Show p and q implies q and p" /> <figcaption>First example from chapter 4, <em>Formal Proof</em></figcaption> </figure> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>It looks like this in <a href="http://www.proofmood.com/index_en.php">proofmood</a>:</p> <figure> <img src="https://lh4.googleusercontent.com/-FPbInyr_wcY/Ur876b0Ej2I/AAAAAAAABfI/WWfxFItLbNw/w538-h394-no/fitch-screenshot.png" alt="Proofmood screenshot" /> <figcaption>screenshot of Proofmood verifying proof of P ^ Q -&gt; Q ^ P</figcaption> </figure> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>In their Input/Output syntax:</p> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[1]: </div> <div 
class="input_area box-flex1"> <div class="highlight"><pre><span class="n">derivation</span> <span class="o">=</span> <span class="s">&quot;&quot;&quot;</span> <span class="s">[ entails [ p &amp; q entails q :&amp; elim 2 ;p :&amp; elim 2 ;q &amp; p :&amp; intro 3,4 ] ;(p &amp; q) $ (q &amp; p) :$ intro 2-5 ] !line_cnc5&quot;&quot;&quot;</span> </pre></div> </div> </div> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Formal-Logic-in-Python-(FLiP)">Formal Logic in Python (FLiP)<a class="anchor-link" href="#Formal-Logic-in-Python-(FLiP)">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>I'm increasingly happy with the IPython notebook for immersive hypertext editing with integrated computation. I'd like to integrate it with the Idris REPL, but meanwhile, a search for <strong>python</strong> and <strong>natural deduction</strong> turned up <a href="https://pypi.python.org/pypi/FLiP/">FLiP</a>.</p> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p><code>import *</code> is usually not a good idea, but for interactive use, it makes sense:</p> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[2]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="kn">from</span> <span class="nn">flip.logic.fol_session</span> <span class="kn">import</span> <span class="o">*</span> </pre></div> </div> </div> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>Python syntax is used to enter formulas and proof steps. 
FLiP then generates a more traditional notation:</p> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[3]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="n">clear</span><span class="p">()</span>
<span class="n">checkp</span><span class="p">(</span><span class="n">And</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">q</span><span class="p">),</span> <span class="n">assume</span><span class="p">)</span>
</pre></div> </div> </div> <div class="vbox output_wrapper"> <div class="output vbox"> <div class="hbox output_area"><div class="prompt"></div> <div class="box-flex1 output_subarea output_stream output_stdout"> <pre>
|p &amp; q (0) Assumption
</pre> </div> </div> </div> </div> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>After the assumption, we can apply some elimination and introduction rules:</p> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[4]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="n">apropos</span><span class="p">(</span><span class="n">And</span><span class="p">)</span> </pre></div> </div> </div> <div class="vbox output_wrapper"> <div class="output vbox"> <div class="hbox output_area"><div class="prompt"></div> <div class="box-flex1 output_subarea output_stream output_stdout"> <pre> [(&apos;ai&apos;, [&apos;m1&apos;, &apos;m2&apos;, &apos;And(m1,m2)&apos;]), (&apos;aer&apos;, [&apos;And(m1,m2)&apos;, &apos;m1&apos;]), (&apos;ael&apos;, [&apos;And(m1,m2)&apos;, &apos;m2&apos;])] </pre> </div> </div> </div> </div> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[5]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="n">rapply</span><span 
class="p">(</span><span class="n">ael</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
<span class="n">rapply</span><span class="p">(</span><span class="n">aer</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
<span class="n">rapply</span><span class="p">(</span><span class="n">ai</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
</pre></div> </div> </div> <div class="vbox output_wrapper"> <div class="output vbox"> <div class="hbox output_area"><div class="prompt"></div> <div class="box-flex1 output_subarea output_stream output_stdout"> <pre>
|q (1) And-Elimination (Left) (0)
|p (2) And-Elimination (Right) (0)
|q &amp; p (3) And-Introduction (1) (2)
</pre> </div> </div> </div> </div> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>That much was pretty obvious, but to use subproofs, I had to read the documentation:</p> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[6]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="n">checkp</span><span class="p">(</span><span class="n">Impl</span><span class="p">(</span><span class="n">And</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">q</span><span class="p">),</span> <span class="n">And</span><span class="p">(</span><span class="n">q</span><span class="p">,</span> <span class="n">p</span><span class="p">)),</span> <span class="n">impli</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span> </pre></div> </div> </div> <div class="vbox output_wrapper"> <div class="output vbox"> <div class="hbox output_area"><div class="prompt"></div> <div class="box-flex1 output_subarea output_stream output_stdout"> <pre> (p &amp; q) -&gt; (q &amp; 
p) (4) Implication-Introduction (0) (3) </pre> </div> </div> </div> </div> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>The whole proof looks pretty much like the example from chapter 4:</p> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[7]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="n">pp</span><span class="p">()</span> </pre></div> </div> </div> <div class="vbox output_wrapper"> <div class="output vbox"> <div class="hbox output_area"><div class="prompt"></div> <div class="box-flex1 output_subarea output_stream output_stdout"> <pre>
|p &amp; q (0) Assumption
|q (1) And-Elimination (Left) (0)
|p (2) And-Elimination (Right) (0)
|q &amp; p (3) And-Introduction (1) (2)
(p &amp; q) -&gt; (q &amp; p) (4) Implication-Introduction (0) (3)
</pre> </div> </div> </div> </div> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="Quantification,-Rules,-and-a-hint-of-Type-Theory">Quantification, Rules, and a hint of Type Theory<a class="anchor-link" href="#Quantification,-Rules,-and-a-hint-of-Type-Theory">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>This natural deduction system is not just propositional but first-order, with quantification. 
The &quot;hello world&quot; example we used for <a href="http://www.w3.org/2000/10/swap/">semantic web research</a> was <a href="http://www.w3.org/2000/10/swap/test/reason/socrates.n3">socrates.n3</a>:</p> <p>\[ \operatorname{Man}(\operatorname{socrates}) \\ \forall x (\operatorname{Man}(x) \implies \operatorname{Mortal}(x)) \\ \therefore \operatorname{Mortal}(\operatorname{socrates}) \]</p> <p>Kludging the naming a bit, it looks like this in FLiP:</p> </div> <div class="cell border-box-sizing code_cell vbox"> <div class="input hbox"> <div class="prompt input_prompt"> In&nbsp;[8]: </div> <div class="input_area box-flex1"> <div class="highlight"><pre><span class="n">clear</span><span class="p">()</span>
<span class="n">Man</span> <span class="o">=</span> <span class="n">P</span>
<span class="n">Mortal</span> <span class="o">=</span> <span class="n">Q</span>
<span class="n">socrates</span> <span class="o">=</span> <span class="n">a</span>
<span class="n">checkp</span><span class="p">(</span><span class="n">Man</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="n">given</span><span class="p">)</span>
<span class="n">checkp</span><span class="p">(</span><span class="n">A</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">Impl</span><span class="p">(</span><span class="n">Man</span><span class="p">(</span><span class="n">x</span><span class="p">),</span> <span class="n">Mortal</span><span class="p">(</span><span class="n">x</span><span class="p">))),</span> <span class="n">given</span><span class="p">)</span>
<span class="n">rapply</span><span class="p">(</span><span class="n">Ae</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">a</span><span class="p">)</span>
<span class="n">rapply</span><span class="p">(</span><span class="n">imple</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">0</span><span 
class="p">)</span> </pre></div> </div> </div> <div class="vbox output_wrapper"> <div class="output vbox"> <div class="hbox output_area"><div class="prompt"></div> <div class="box-flex1 output_subarea output_stream output_stdout"> <pre>
P(a) (0) Given
Ax.(P(x) -&gt; Q(x)) (1) Given
P(a) -&gt; Q(a) (2) A-Elimination (1), with a
Q(a) (3) Implication-Elimination (Modus Ponens) (2) (0)
</pre> </div> </div> </div> </div> </div> <div class="text_cell_render border-box-sizing rendered_html"> <p>Using Idris and dependent types is another episode altogether, but to give a hint...</p> <p>We can state the theorem this way:</p> <pre><code>thm1 : {thing: Type} -&gt; {Man, Mortal: thing -&gt; Type} -&gt; ((x: thing) -&gt; (Man x -&gt; Mortal x)) -&gt; (socrates: thing) -&gt; (Man socrates) -&gt; (Mortal socrates)</code></pre> <p>and the proof is really simple:</p> <pre><code>thm1 all_men_mortal socrates socrates_a_man = all_men_mortal socrates socrates_a_man</code></pre> </div> <div class="text_cell_render border-box-sizing rendered_html"> <h2 id="References">References<a class="anchor-link" href="#References">&#182;</a></h2> </div> <div class="text_cell_render border-box-sizing rendered_html"> <ul> <li><a name="Alps-Neveln81">Alps, Robert A., and Robert C. Neveln. 1981</a>. “A Predicate Logic Based on Indefinite Description and Two Notions of Identity.” Notre Dame J. Formal Logic 22 (3): 251–263. doi:10.1305/ndjfl/1093883460. http://projecteuclid.org/euclid.ndjfl/1093883460.</li> <li><a name="Bonevac86">Bonevac, Daniel. 1986</a>. <em>Proof: A Text for Philosophy 313K Logic, Sets and Functions</em>. Austin, Texas: Department of Philosophy, The University of Texas at Austin.</li> </ul> <p><em>cf. <a href="https://www.zotero.org/connolly/items/tag/fs86">fs86</a> tag in my Zotero library.</em></p> </div> Sun, 29 Dec 2013 00:00:00 +0000 http://www.madmode.com//2013/fs-tt/fs86/