Archive for the ‘Code’ Category

Bible Reference Parser Code Update

Tuesday, November 20th, 2012

The semiannual release schedule of the JavaScript Bible Passage Reference Parser continues. This release:

  • Improves support for parentheses.
  • Adds some alternate versification systems.
  • Supports French book names.
  • Removes the “docs” folder because it was getting unwieldy; the source itself remains commented.
  • Reorganizes some of the source code.
  • Increases the number of real-world passage-reference strings included with the source from 200,000 to 370,000. To produce the list, I ran the parser on all 85 million tweets and Facebook posts in the Realtime Bible Search database.

One of the parser's main goals is to give you a starting point for building your own, so the source is thoroughly documented and includes many tests you can use to validate your code.

Try a demo or browse the source on GitHub.

A Javascript Bible Passage Reference Parser

Friday, November 18th, 2011

Browse the GitHub repository of a new Bible-reference parser written in CoffeeScript / JavaScript (it understands text like “John 3:16”), try a demo, or review the annotated source. You can use the parser as-is or as a starting point for building your own: the source code includes 200,000 real-world passage references to give you a head start. It’s designed to handle how people actually type Bible references (typos and all) and tries hard to make sense of any input you give it.
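
To make the idea concrete, here is a minimal sketch of what recognizing text like “John 3:16” involves. It is not the parser's actual code or API: the `BOOKS` table, the `REFERENCE` pattern, and the `findReferences` function are hypothetical names made up for illustration, and the dotted output format is just one assumed way to normalize a reference. The real parser handles far more, including the typos and alternate versification systems mentioned in these posts.

```javascript
// Illustrative sketch only: NOT the real parser's API or algorithm.
// BOOKS, REFERENCE, and findReferences are hypothetical names.

// Tiny lookup from lowercase book names/abbreviations to a normalized name.
const BOOKS = new Map([
  ["genesis", "Gen"], ["gen", "Gen"],
  ["john", "John"], ["jn", "John"],
  ["romans", "Rom"], ["rom", "Rom"],
]);

// Matches "<word> <chapter>" with an optional ":<verse>" or ".<verse>".
const REFERENCE = /\b([A-Za-z]+)\.?\s*(\d+)(?:\s*[:.]\s*(\d+))?/g;

function findReferences(text) {
  const found = [];
  for (const match of text.matchAll(REFERENCE)) {
    const book = BOOKS.get(match[1].toLowerCase());
    if (!book) continue; // Skip words that are not known book names.
    const parts = [book, Number(match[2])];
    if (match[3] !== undefined) parts.push(Number(match[3]));
    found.push(parts.join(".")); // Assumed normalized form: Book.chapter.verse
  }
  return found;
}

console.log(findReferences("Reading John 3:16 and rom 8.28 today"));
// → [ 'John.3.16', 'Rom.8.28' ]
```

Even this toy version shows why a large set of real-world strings matters: every abbreviation, separator, and spacing variant people actually type has to be accounted for somewhere.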

From the readme:

This is the fourth complete Bible reference parser that I’ve written. It’s how I try out new programming languages: the first one was in PHP (2002), which saw production usage on a Bible search website from 2002-2011; the second in Perl (2007), which saw production usage on a Bible-related site starting in 2007; and the third in Ruby (2009), which never saw production usage because it was way too slow. This CoffeeScript parser (at least on V8) is faster than the Perl one and 100 times faster than the Ruby one.

I chose CoffeeScript out of curiosity: does it make JavaScript that much more pleasant to work with? From a programming perspective, the easy loops and array comprehensions alone practically justify its use. From a readability perspective, the code is easier to follow (and come back to months later) than the equivalent JavaScript; the tests, in particular, are much easier to follow without all the JavaScript punctuation.

My main interest in open-sourcing and thoroughly documenting this code lies in giving future programmers data and code that they can use to build better parsers. While this code reflects my experience, it’s hardly the last word on the subject.