Dan Connolly's tinkering lab notebook

## "Light Runner", an exercise in digital preservation

In my teens, I wrote a Tron work-alike called Light Runner in assembly for the Color Computer; I even got Prickly Pear Software to release it commercially. I can't find my source code (Grrr!), but I have a copy of the product on cassette tape. I didn't quite get it restored to working order, but I had some fun trying.

The analog-to-digital-signal bit was only complicated by the fact that the cassette player couldn't be connected to my desktop:

1. Put the cassette in the boombox.
2. Hook the boombox to the netbook.
3. Use "Sound Recorder" to make lr.wma.
4. Move lr.wma to my desktop via a thumb drive.
5. Use Audacity to chop off the dead air and save in .wav format.

Processing the digital signal involved a steep learning curve. As explained on pg. 10 of The FACTS:

G. Cassette Interface - Cassette data is stored onto the tape using a format called Frequency Shift Keying (FSK). This means that two sine waves of differing frequency are used to represent zeroes and ones on the tape. A sine wave of 2400 hz is used to store a one, and a sine wave of 1200 Hertz is used to store a zero.
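
To make the FSK description concrete, here is a minimal sketch of an encoder in plain Python. It assumes one full sine cycle per bit and a 44100 Hz sample rate; neither detail is in the quote above, and `fsk_encode` is a name made up for illustration.

```python
import math

RATE0 = 1200  # Hz: stores a zero, per The FACTS
RATE1 = 2400  # Hz: stores a one, per The FACTS

def fsk_encode(bits, framerate=44100):
    """Return samples in [-1, 1] for a bit sequence.

    Assumption: each bit is one full sine cycle at its frequency.
    """
    samples = []
    for bit in bits:
        freq = RATE1 if bit else RATE0
        n = int(round(framerate / float(freq)))  # samples in one cycle
        samples.extend(math.sin(2 * math.pi * i / n) for i in range(n))
    return samples

# A one takes about half as many samples as a zero.
print(len(fsk_encode([1])), len(fsk_encode([0])))
```

Decoding is the reverse problem: measure how long each cycle lasts and decide which of the two frequencies it is closer to.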

I surfed around, wondering whether to make this an R project or a numpy project. The clincher was a StackOverflow clue on detecting zero crossings by Jim Brissom Oct 1 2010:

In [2]:
def zero_crossings(signal):
    return numpy.where(numpy.diff(numpy.sign(signal)))[0]

a = [1, 2, 1, 1, -3, -4, 7, 8, 9, 10, -2, 1, -3, 5, 6, 7, -10]
zero_crossings(a)

Out[2]:
array([ 3,  5,  9, 10, 11, 12, 15])

## Aside/Colophon

I'm trying out IPython as an authoring tool. I'm fond of the interactive notebook idea. The 0.12 version that comes with Ubuntu didn't support inline plotting, and cell selection was glitchy. But I'm using 0.14dev, and while it doesn't quite feel as mature as RStudio, it's getting pretty close.

See coco.ipynb notebook source.

In [1]:
%pylab inline

Welcome to pylab, a matplotlib-based Python environment [backend: module://IPython.zmq.pylab.backend_inline].


Making a numpy array out of a .wav file is a piece of cake. Scaling the amplitude to fit in an 8-bit DAC like the CoCo's helps eliminate some high-frequency noise (see cloadm.py for full details):

In [3]:
import wave
tape_fn = 'lr-cut.wav'
dest_fn = 'light-runner'
tape = wave.open(tape_fn, 'r')
dest = open(dest_fn, 'w')
amp_max = 128  # 8 bit signed
# NOTE: the lines that set framerate and signal are not shown in this cell; see cloadm.py.
signal = signal * amp_max / max(signal)
framerate, len(signal), signal[:5]

Out[3]:
(44100, 1223945, array([16, 17, 17, 17, 17]))
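
The cell above leaves out how the samples are read. As a minimal sketch of that step using only the standard library, assuming a 16-bit mono recording: `read_mono16` and `scale` are hypothetical helpers, not the code in cloadm.py, and `scale` uses the largest magnitude as the peak rather than `max(signal)`.

```python
import io
import struct
import wave

def read_mono16(wav_file):
    """Return (framerate, samples) from a 16-bit mono WAV file or file object."""
    tape = wave.open(wav_file, 'rb')
    try:
        if tape.getnchannels() != 1 or tape.getsampwidth() != 2:
            raise ValueError('this sketch only handles 16-bit mono')
        framerate = tape.getframerate()
        nframes = tape.getnframes()
        frames = tape.readframes(nframes)
    finally:
        tape.close()
    # WAV samples are little-endian signed 16-bit integers.
    return framerate, list(struct.unpack('<%dh' % nframes, frames))

def scale(samples, amp_max=128):
    """Scale so the largest magnitude maps to amp_max, truncating toward zero."""
    peak = max(abs(s) for s in samples)
    return [int(s * amp_max / float(peak)) for s in samples]

# Round-trip a tiny in-memory WAV to show the two steps together.
buf = io.BytesIO()
out = wave.open(buf, 'wb')
out.setnchannels(1)
out.setsampwidth(2)
out.setframerate(44100)
out.writeframes(struct.pack('<4h', 0, 1000, -2000, 500))
out.close()
buf.seek(0)

framerate, samples = read_mono16(buf)
print(framerate, scale(samples))
```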

The next step is to find the long sequence of alternating 0s and 1s that marks the beginning of a sequence of bytes. In Audacity, I could hear the tone right around 2 seconds in. The signal at this point looks about right:

In [4]:
t = arange(2.0, 2.01, 1.0/framerate)
start = int(t[0] * framerate)  # slice indices must be integers
_ = plot(t, signal[start:start + len(t)])
grid()


The pattern of bits was close, but not quite there, no matter how I played with the threshold:

In [5]:
freqs, wave_ix = c.waves(zero_crossings(signal), framerate)
threshold = 1400  # experimental; cf. (CoCo.rate0 + CoCo.rate1) / 2
bits = (freqs > threshold) + 0
bits[wave_ix > 2.0 * framerate][:40]

Out[5]:
array([1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0,
1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0])

I pored over the disassembly from The FACTS, wondering what I'd missed:

    *** LOOK FOR THE SYNC BYTES - RETURN WITH ACCA = 0 IF SYNC’ED
    *** ON HI - LO TRANSITION, ACCA = $A0 IF SYNC’ED ON THE
    *** LO - HI TRANSITION OF THE INPUT SIGNAL FROM THE CASSETTE.
    CASON  ORCC   #$50      DISABLE IRQ,FIRQ
           BSR    LA7CA     TURN ON TAPE DECK MOTOR
           CLR    CPULWD    RESET UP TO SPEED COUNTER
    LA782  BSR    LA763     WAIT FOR LO-HI TRANSITION
    LA784  BSR    LA7AD     WAIT FOR HI-LO TRANSITION
           BHI    LA797     CASSETTE SPEED IN RANGE FOR 1200 HZ
    LA788  BSR    LA7A7     WAIT FOR LO-HI TRANSITION
           BCS    LA79B     CASSETTE SPEED IN RANGE FOR 2400 HZ
           DEC    CPULWD    DECREMENT UP TO SPEED COUNTER IF SYNC’ED ON LO-HI
           LDA    CPULWD    GET IT
           CMPA   #-96      HAVE THERE BEEN 96 CONSECUTIVE 1-0-1-0 PATTERNS
    LA792  BNE    LA782     NO
           STA    CBTPHA    SAVE WHICH TRANSITION (HI-LO OR LO-HI)
           RTS


Aside: I haven't found The FACTS online, but the disassemblies are also available in Color Basic Unravelled, also by Spectral Associates, digitally restored by Aaron Wolfe.

I finally found the bug by focussing on an even smaller section of the signal, together with the calculated frequencies and the derived bits:

In [6]:
lo = int(framerate * 0.560)
hi = int(framerate * 0.580)
lo, hi

ix = intersect1d(where(wave_ix > lo)[0], where(wave_ix < hi)[0])

ones = ix[bits[ix] == 1]
zeros = ix[bits[ix] == 0]
_ = plot(arange(lo, hi), signal[lo:hi], 'b-',
         wave_ix[ix], (freqs/20)[ix], 'g*',
         wave_ix[ones], bits[ones], 'r+',
         wave_ix[zeros], bits[zeros], 'ro')


Do you see it?

The crossings start lo-hi for the first five bits, but somehow the next bit is marked at a hi-lo transition.

I added a line of code to say that they should all go the same direction:

    assert((numpy.sign(signal[z[::2]]) == numpy.sign(signal[z[0]])).all())


When the signal crossed zero by actually hitting zero, it threw things off. So I tweaked the zero_crossings algorithm to use sign() > 0 so that we have just +/-, rather than +1/0/-1:

In [7]:
def zero_crossings(signal):
    return numpy.where(numpy.diff(numpy.sign(signal) > 0))[0]

zero_crossings([2, 1, 0, -1, -2, 1, 4, -4])

Out[7]:
array([1, 4, 6])
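
To see why an exact zero matters, here is a plain-Python comparison of the two approaches on the same input. The helper names are made up for illustration, and the list comprehensions stand in for numpy.diff and numpy.where.

```python
def sign(x):
    """Return -1, 0 or 1, like numpy.sign for a single number."""
    return (x > 0) - (x < 0)

def crossings_three_state(signal):
    """Original approach: a sample of exactly 0 is its own state."""
    s = [sign(x) for x in signal]
    return [i for i in range(len(s) - 1) if s[i + 1] != s[i]]

def crossings_two_state(signal):
    """Fixed approach: every sample is either positive or not."""
    s = [x > 0 for x in signal]
    return [i for i in range(len(s) - 1) if s[i + 1] != s[i]]

sig = [2, 1, 0, -1, -2, 1, 4, -4]
print(crossings_three_state(sig))
print(crossings_two_state(sig))
```

With three states, the single descent through zero is reported twice (entering zero, then leaving it), so every later crossing is paired with the wrong partner.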

Now the 1's and 0's alternate nicely:

In [8]:
freqs, wave_ix = c.waves(zero_crossings(signal), framerate)
threshold = 1400  # experimental; cf. (CoCo.rate0 + CoCo.rate1) / 2
bits = (freqs > threshold) + 0
bits[wave_ix > 2.0 * framerate][:40]

Out[8]:
array([1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0])

This allows us to find the sync pattern:

In [9]:
c.find_sync(bits)

Out[9]:
1145
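
`find_sync` is part of cloadm.py and isn't shown here. As a rough sketch of the idea, modeled loosely on the "96 CONSECUTIVE 1-0-1-0 PATTERNS" comment in the ROM listing, the hypothetical `find_leader` below reports where a run of alternating bits first becomes long enough. The run length it counts (bits rather than patterns) and the meaning of its return value are assumptions and may differ from what `find_sync` does.

```python
def find_leader(bits, min_run=96):
    """Return the index of the bit that completes a run of min_run
    alternating bits, or -1 if there is no such run."""
    run = 1
    for i in range(1, len(bits)):
        # Extend the run while bits keep alternating; otherwise start over.
        run = run + 1 if bits[i] != bits[i - 1] else 1
        if run >= min_run:
            return i
    return -1

noisy_start = [1, 1, 1] + [0, 1] * 50  # three junk bits, then a clean leader
print(find_leader(noisy_start))
```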

Putting it all together, we see that we get a certain distance before falling out of sync:

In [10]:
import logging
logging.basicConfig(level=logging.INFO)
import StringIO
tape = wave.open(tape_fn, 'r')
dest = StringIO.StringIO()
try:
    c.decode(tape, dest)
except ValueError:
    print "oops!"

INFO:cloadm:block type: 0 length: 15
WARNING:cloadm:expected check $88 does not match found $8
WARNING:cloadm:expected check $6d does not match found $ed
WARNING:cloadm:expected check $f0 does not match found $70
WARNING:cloadm:expected check $6 does not match found $86
WARNING:cloadm:expected block start $55 does not match found $ab
WARNING:cloadm:expected block start $3c does not match found $78

oops!
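
The log messages mention a block start, a block type, a length, and a check byte. Assuming the usual description of the CoCo tape format (it is not spelled out in this notebook, so treat it as an assumption), bytes arrive least significant bit first, and a block's check byte is the sum of its type, length, and data bytes modulo 256. The helpers below are hypothetical illustrations of those two rules, not the code in cloadm.py.

```python
def bits_to_byte(bits):
    """Pack 8 bits into a byte, assuming the first bit is the least significant."""
    return sum(bit << i for i, bit in enumerate(bits[:8]))

def check_byte(block_type, data):
    """Assumed checksum: type + length + data bytes, modulo 256."""
    return (block_type + len(data) + sum(data)) & 0xFF

leader = bits_to_byte([1, 0, 1, 0, 1, 0, 1, 0])
sync = bits_to_byte([0, 0, 1, 1, 1, 1, 0, 0])
print(hex(leader), hex(sync))
```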


The filename and copyright statement confirm that we're on the right track:

In [11]:
content = dest.getvalue()
print "filename:", content[:8]
print
print content[229:298].replace('\x00', '\n')

filename: RUNNER

LIGHT RUNNER
COPR(C) 1983 BY CO&CO SOFTWARE
WRITTEN BY: DAN CONNOLLY



## Related Work

Browsing around the CoCo mailing list, I found a CASIN.EXE program by Jeff Vavasour circa 1994. I got it running under DOSBox; the first thing it said was:

This .WAV file is not a 11025Hz/mono/8-bit sample.


So sox to the rescue:

$ sox ../../lr-cut.wav -r 11025 -b 8 lr8.wav


Then CASIN made some progress, but didn't get very far. It decoded even less data than my code did. I might have felt a little silly poring over all the low-level details if there were a well-known solution that worked out of the box. But data analysis and signal processing are a growing part of my day job, and learning the scientific Python toolset has been on my todo list for quite some time, so I'm glad I did.