The Story of EtreRef
I have been a big fan of the built-in Dictionary program since MacOS X 10.4 Tiger. When Apple released MacOS X 10.5 Leopard, they provided a Dictionary Development Kit so that third-party software vendors (such as Etresoft) could create additional databases. These databases are fairly easy to build and fun for people who like XML and XSLT transformations.
The hard part was finding some free content for the database. It has to be something that has a well-defined way to look up content. For these reasons, religious texts were a close match. They are frequently used as reference material and have a well-defined lookup mechanism.
About the databases
Each database has its own story as each presented a unique set of challenges.
- Le Dictionnaire de l'Académie Française (8ème Édition 1935)
So maybe it isn't the most up-to-date French dictionary in the world, but it is public domain. I found this via a StarDict version. I had also found a utility to automatically convert any StarDict dictionary into MacOS X Dictionary.app format. Unfortunately, the utility has some bugs and can't properly convert this particular dictionary. I fixed them pretty easily, but then found the XDXF version that I would rather use anyway. This dictionary was probably the most straightforward conversion I've done. I had to write a little Perl to extract the masculine/feminine endings and alternate forms. I have a few additional French databases to convert in the future.
- Holy Bible (King James edition)
There is already a well-established community of Bible researchers who have created Bible databases in many different formats. I chose to start with the USFX XML version of the Bible. While I think USFX is the best of the different XML encodings of the Bible, it certainly isn't easy to work with. Most of the XML tags identify only the start of a particular section. To make matters worse, it also includes tags to indicate paragraphs. These paragraph tags are hierarchical in nature, which is good, but they contain chapter and verse start tags, which is a mess. The USFX XSLT transformation I created for this project is, without doubt, the most complicated XSLT transformation I have ever seen.
- Holy Bible (American Standard edition)
In theory, if I can construct a dictionary from one USFX file, I can do it for any USFX file. So to test said theory, I used a different version of the Bible. Unfortunately, programming is a bit different from the real world. What do you call a program that works on 99 out of 100 test cases? A failure.
Eventually I did manage to get both dictionaries building with virtually identical transformations.
- Holy Quran (text edition)
My interest in these databases lies in the MacOS X Dictionary and XML. I wanted to be fair and not just do Christian works. The Quran added an additional dimension of Unicode into the mix. Unlike the Christian Bible, a well-formatted XML version of the Quran is fairly easy to find. Still, Muslims are very particular about the Quran. It has to be an accurate representation of the words of Allah and it may take years to verify a particular digital Quranic encoding. To make matters worse, some people don't feel that Unicode can accurately represent the cursive text of the Quran. No problem! They just hack up a new version of the Unicode standard and create a new font to match. Unfortunately, MacOS X can't handle an OpenType font pushed to the limits like that. Eventually, I was able to find Unicode text that can be represented on a Mac and is, apparently, an accurate transcription. At least I now have quite a bit of supporting material for future, multimedia-enhanced, versions of the Quran.
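To illustrate why the start-only tags described for the USFX Bibles are awkward, here is a minimal Python sketch. It is not the XSLT used for EtreRef, only an illustration of the underlying problem. It assumes a simplified markup in the spirit of USFX, where empty c and v elements mark the point at which a chapter or verse starts and p elements wrap paragraphs. The sample text, the exact element and attribute names, and the collect_verses function are assumptions made for this example. Because a verse has no enclosing element, its text has to be gathered by walking the document in order and remembering the most recent markers.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: verse 2 starts in one paragraph and ends in the next.
SAMPLE = """<book id="GEN">
<c id="1"/>
<p><v id="1"/>In the beginning God created the heaven and the earth.
<v id="2"/>And the earth was without form,</p>
<p>and void.
<v id="3"/>And God said, Let there be light.</p>
</book>"""


def collect_verses(xml_text):
    """Return {(chapter, verse): text} from start-marker style markup."""
    verses = {}
    chapter = None
    verse = None

    def add(text):
        # Text only belongs to a verse once both markers have been seen.
        if text and chapter is not None and verse is not None:
            key = (chapter, verse)
            verses[key] = verses.get(key, "") + text

    def walk(elem):
        nonlocal chapter, verse
        if elem.tag == "c":        # chapter start marker (assumed name)
            chapter, verse = elem.get("id"), None
        elif elem.tag == "v":      # verse start marker (assumed name)
            verse = elem.get("id")
        add(elem.text)
        for child in elem:
            walk(child)
            # Text that follows a marker is stored as the marker's tail.
            add(child.tail)

    walk(ET.fromstring(xml_text))
    # Collapse line breaks and repeated spaces left over from the markup.
    return {key: " ".join(text.split()) for key, text in verses.items()}
```

With the sample above, collect_verses(SAMPLE)[("1", "2")] returns the complete text of verse 2 even though the verse crosses a paragraph boundary, which is exactly the case that makes a purely hierarchical transformation difficult.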
About EtreRef
EtreRef 1.0.1 is freeware, not shareware. All of the content comes from various open-source databases. I cannot charge money for a reformatted version of an open-source database. These databases are quite large, however. If their download causes Etresoft to exceed its web hosting bandwidth, I reserve the right to charge for the downloads, but hopefully that won't be necessary.
My immediate list of enhancements to EtreRef is as follows:
- Additional databases.
- Enhancements to the Quran - pronunciation, GIF images for Unicode validation, recitals.
- Preferences to allow user-defined content.
If you would like to see a new feature or two in a future version of EtreRef, please let me know! As many Decoder users can attest, I’m quite happy to add features, even for a single user. You won’t know until you ask. If it is possible in the Dictionary program, I will try to do it.
