This is the first post in a series.
- Part 1: Studying an Old E-Reader for Fun
- Part 2: Studying an Old E-Reader for Fun : Text Compression 1
- Part 3: Studying an Old E-Reader : Compressing a Dictionary with Huffman Codes
- Part 4: Studying an Old E-Reader : Compressing a Dictionary with a Prefix Tree (coming soon)
I've recently picked up an old E-Reader, specifically an electronic version of the Bible, that was made around 1990. I remember my dad having one of these when I was growing up, and over time it got misplaced. I was reminded of it this past year, when I was using a very modern equivalent to it, only to be frustrated by the lack of some very basic features, namely reasonable search capabilities.
Over the coming months, as time permits for me to write, I want to go into some details about this fascinating little device. Now you, the reader of this blog, may not be a person of faith, and might not care about an electronic device specifically made for people of faith. But I want to explore both the Computer Science side and the User Experience side of how this works, and hopefully extract some things that might be worth consideration 30 years later.
I am a Christian, and so are my parents. I grew up in the era of the personal computer coming into being, and grew up in a household that was affluent enough to afford one. My dad was interested in new technology, and was also someone who spent a lot of time studying his faith. He believes he bought this device not too long after it was released on the market - some time in 1990. This was before we had a PC in our house. When we did buy that first PC in 1992, a 486 DX2-66, one of the first pieces of software Dad installed on it was a version of the Bible (probably PC Study Bible, which came on a series of floppy disks).
Throughout the years I have been fascinated by file formats and data compression. For my senior project in college I worked on bringing support for ISO-9660 (the format that most CD-ROMs are written in) to NetBSD. For a big portion of my time at my first job outside of college, I worked on a project that involved parsing a proprietary file format with records stored in 2KB blocks. I've also worked with text searching tools such as Lucene and Elasticsearch.
Franklin Electronics and Proximity Technologies
The device that I am talking about, a KJ-21, was made by Franklin Electronics. They made two other variants, the RS-22 and NIV-20 (RS: Revised Standard, and NIV: New International Version - both of these are different translations of the Bible). Over time they made a newer version based on some of their more modern hardware (the Bookman series), and have made many other electronic book products. In fact, in high school I had an English-Spanish dictionary (which has also been misplaced) made by Franklin, which was also built on the Bookman platform. Recently we were at a family member's house and my 3-year-old found the speaking English dictionary that one of their daughters owned, which was also made by Franklin.
The person who appears to be responsible for much of the underlying technology behind all of this is Peter Yianilos. He is a computer scientist who founded a company called Proximity Technologies. Proximity supplied some of the text compression and search hardware and software to Franklin's early handheld devices. One of their first products was a wildcard word search tool.
"But the user can also type in blanks for letters that are not known -- and herein lies its crossword-solving prowess. Type '-u--d--y,' for instance, and in a few seconds the computer will retrieve all the words that fit that pattern, which in this case happen to be cupidity, humidify, humidity, lucidity, quiddity and quandary. The machine contains an 80,000-word dictionary stored on 128,000 bytes of read-only memory and pattern matching technology supplied by Proximity Technology Inc. of Fort Lauderdale, Fla."
(Source: BUSINESS TECHNOLOGY; AT LAST, HELP ON THE CROSSWORD PUZZLE, New York Times, 1987)
Ultimately Proximity merged with Franklin in 1988. Peter's website and the patents that he wrote are fascinating reads, and I will likely bring them up in future posts.
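To make the wildcard search from the Times excerpt concrete, here is a minimal sketch in Python. It is not how Proximity's product worked (I don't know that yet); it is a brute-force scan over a plain word list, and the names `matches`, `wildcard_search` and `WORDS` are my own placeholders. The word list is a tiny stand-in that I made up around the words in the quote.

```python
def matches(pattern, word, blank="-"):
    """Return True if `word` fits `pattern`, where `blank` stands for any one letter."""
    return len(pattern) == len(word) and all(
        p == blank or p == c for p, c in zip(pattern, word)
    )


def wildcard_search(pattern, words):
    """Brute-force scan: keep every word that fits the pattern, in list order."""
    return [w for w in words if matches(pattern, w)]


# Hypothetical stand-in for the device's 80,000-word dictionary.
WORDS = [
    "cupidity", "humidify", "humidity", "humility", "lucidity",
    "puzzle", "quantity", "quiddity", "quandary",
]

if __name__ == "__main__":
    for word in wildcard_search("-u--d--y", WORDS):
        print(word)
```

On this stand-in list the scan returns the six words from the quote and skips near misses like "humility" and "quantity". The interesting part is what the sketch leaves out: going by the quoted figures, 80,000 words in 128,000 bytes works out to 1.6 bytes (12.8 bits) per word, so the real device could not have stored anything like a plain list of strings.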
I'm glad I purchased a "new" device off of eBay for about $40, fresh in its original packaging. The original box and manual that my dad owned were lost many moves ago.
The manual for the KJ-21 conveniently contains its technical specifications, saving me the trouble of opening it up and trying to identify the chips on the main board.
- 1.125 Megabytes of [read only] memory, containing:
- Complete text of the Bible
- Footnote text
- Pronunciation database
- Word form database (inflections)
- Thesaurus database
- Help database
- Executable code
- 2,048 bytes of RAM
- High-speed 16-bit model V20 together with a Franklin proprietary master-control chip.
- LCD Display: "Super-twist" type - 4 lines of text plus 66 book names.
- Keyboard: 50 keys with permanent molded legends and positive tactile response
Why Research All This?
This is a reasonable question. Why study a 30-year-old device, which would appear to be a niche product?
In the modern world of XML, JSON, YAML and the like, we can easily take for granted the days when storage came at a premium. With super-fast 64-bit CPUs that can scan through gigabytes of data in under a second, we forget the tradeoffs that our predecessors made when making their applications and devices perform fast enough to meet user expectations.
When I saw how much data was squeezed into the limited memory on this device, and what it was capable of doing with such modest hardware, I was impressed and wanted to figure out how it might have been implemented.
There are a few things that I have found interesting that I want to explore:
- How do you implement a rather advanced text search, specifically by inflections, synonyms and a variable search width, on limited hardware? I have not seen another e-reader that does this, though that could be due to patents. It appears that many of the relevant patents should have expired around 2010, but I am admittedly not a patent lawyer.
- How do you compress or encode the raw text of a large book like the Bible, which comes in around 4.5MB as raw text, down into less than 1MB, along with the data structures for the inflection and thesaurus databases? For reference, ZIP can only do about 1.3MB, 7z can do 1.04MB, and a state of the art algorithm (zpaq) can do 0.7MB. All of those require a lot more CPU and working RAM than this device has.
- What are some unique user experience features of this device?
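For anyone who wants to reproduce the kind of compression comparison in the list above, here is a rough sketch using only Python's standard library. It assumes `zlib` (DEFLATE, the method ZIP archives typically use) and `lzma` (the family of algorithms 7z uses by default) are reasonable stand-ins for the actual tools; zpaq has no standard-library equivalent, so it is left out. The function names and the sample text are placeholders of mine, not measurements of the real Bible text.

```python
import lzma
import zlib


def compressed_sizes(data):
    """Return the raw size and the compressed size under each stand-in codec."""
    return {
        "raw": len(data),
        "zlib": len(zlib.compress(data, level=9)),
        "lzma": len(lzma.compress(data, preset=9)),
    }


def bits_per_char(raw_size, compressed_size):
    """Average number of compressed bits spent per byte of original text."""
    return 8 * compressed_size / raw_size


if __name__ == "__main__":
    # Placeholder input: one verse repeated, which compresses far better
    # than real prose would. Swap in the bytes of a real text file instead.
    sample = b"In the beginning God created the heaven and the earth. " * 1000
    sizes = compressed_sizes(sample)
    for name, size in sizes.items():
        print(f"{name:5s} {size:8d} bytes  {bits_per_char(sizes['raw'], size):.2f} bits/char")
```

Plugging in the figures from the list, squeezing 4.5MB of text into 1MB works out to roughly 1.8 bits per character, and the device has to manage that while still being able to jump to and decode any verse with only 2,048 bytes of RAM.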
As time permits in the coming months, I plan on writing about each of these things and maybe more.