Archive for the ‘Technology’ Category

“Hacking the Bible” in Christianity Today

Thursday, March 6th, 2014

This month’s Christianity Today cover story, “The Bible in the Original Geek,” talks about how programmers are using technology to change how we read, study, and interpret the Bible. If you’re interested in the Bible and technology (and if you’re reading this blog, you probably are), then you should go read it.

(Ted Olsen, the author of the article, also did an AMA on Reddit about it.)

The article talks about the “academic priesthood,” and I think it’s particularly interesting that so few universities are interested in “digital theology” (for lack of a better term). You can study at Durham (like John Dyer is) or King’s College London, or you can try to work a biblical emphasis into a digital rhetoric Ph.D. But I’m surprised that more institutions, especially evangelical seminaries, aren’t at the forefront of the kind of research described in the article.

Religious Interest among Facebook Users

Thursday, April 25th, 2013

I have a post on the Bible Gateway blog that briefly looks at how religious interest among Facebook users varies with age.

In particular, eighteen-year-old women appear to have an especially strong interest in religion, which drops off sharply during their 20s. (Barna in 2003 published findings that corroborate the dropoff.)

The post makes some possibly unwarranted inferences from the original data published yesterday by Stephen Wolfram:

[Chart: quotes + life philosophy data by age]

How to Train Your Franken-Bible

Saturday, March 16th, 2013

This is the outline of a talk that I gave earlier today at the BibleTech 2013 conference.

  • There are two parts to this talk
    • Inevitability of algorithmic translations. An algorithmic translation (or “Franken-Bible”) means a translation that’s at least partly done by a computer.
    • How to produce one with current technology.

  • The number of English translations has grown over the past five centuries, and that growth has accelerated recently. It mirrors the overall trend in publishing as costs have diminished and publishers have proliferated.
  • I argue that this trend of ever-more translations will not only continue but will accelerate, that we’re heading for a post-translation world, where the number of English translations becomes so high, and the market so fragmented, that Bible translations as distinct identities will have much less meaning than they do today.
    • This trend isn’t specific to Bibles, but Bible translations participate in the larger shift to more diverse and fragmented cultural expressions.

  • Like other media, Bible translations are subject to a variety of pressures.
    • Linguistic (e.g., as English becomes more gender-inclusive, pressure rises on Bible translations to also become more inclusive).
    • Academic (discoveries that shed new light on existing translations). My favorite example is Matthew 17:15, where some older translations say, “My son is a lunatic,” while newer translations say, “My son has epilepsy.”
    • Theological / doctrinal (conform to certain understandings or agendas).
    • Social (decline in public religious influence leads to a loss of both shared stories and religious vocabulary).
    • Moral (pressure to address whatever the pressing issue of the day is).
    • Institutional (internally, where Bible translators want control over their translations; and externally, with the wider, increasing distrust of institutions).
    • Market. Bible translators need to make enough money (directly or indirectly) to sustain translation operations.
    • Technological. Technological pressure increases the variability and intensity of other pressures and is the main one we’ll be investigating today.

  • If you’re familiar with Clayton Christensen’s The Innovator’s Dilemma, you know that the basic premise is that existing dominant companies are eventually eclipsed by newer companies who release an inferior product at low prices. The dominant companies are happy to let the new company operate in this low-margin area, since they prefer to focus on their higher-margin businesses. The new company then steadily encroaches on the territory previously staked out by the existing companies until the existing companies have a much-diminished business. Eventually the formerly upstart company becomes the incumbent, and the cycle begins again.
    • One of the main drivers of this disruption is technology, where a technology that’s vastly inferior to existing methods in terms of quality comes at a much lower price and eventually supersedes existing methods.
  • I argue that English Bible translation is ripe for disruption, and that this disruption will take the form of large numbers of specialized translations that are, from the point of view of Bible translators, vastly inferior. But they’ll be prolific and easy to produce and will eventually supplant existing modern translations.

  • For an analogy, let’s look at book publishing. (The same story applies to other media, like news, music, or movies, but book publishing is what I’m most familiar with, so it’s what I’ll talk about.) In the past twenty years, it’s gone through two major disruptions.
    • The first is a disruption in distribution, with Amazon.com and other web retailers. National bookstore chains consolidated or folded as they struggled to figure out how to compete with the lower prices and wider selection offered online.
    • This change hurt existing retailers but didn’t really affect the way that content creators like publishers and authors did business. From their perspective, selling through Amazon isn’t that different from selling through Barnes & Noble.
    • The second change is more disruptive to content creators: this change is the switch away from print books to ebooks. At first, this change seems more like a difference in degree rather than in kind. Again, from a publisher’s perspective, it seems like selling an ebook through Amazon is just a more-convenient way of selling a print book through Amazon.
    • But ebooks actually allow whole new businesses to emerge.

  • I’d argue that these are the main functions that publishers serve for authors–in other words, why would I, as an author, want a publisher to publish my book in exchange for some small cut of the profit?
    • Gatekeeping (by publishing a book, they’re saying it’s worth your time and has a certain level of quality: it’s been edited and vetted).
    • Marketing (making sure that people know about and buy books that are interesting to them).
    • Distribution (historically, shipping print books to bookstores).
  • Ebooks most-obviously remove the distribution pillar of this model–when producing and distributing an epub only involves a few clicks, it’s hard to argue that a publisher is adding a whole lot of value in distribution.
  • That leaves gatekeeping and marketing, which I’ll return to later in the context of Bibles.

  • But beyond just affecting these pillars, ebooks also allow new kinds of products:
    • First, a wider variety of content becomes more economically viable–content that’s too long or too short to print as a book can work great as an ebook, for example.
    • Second, self-publishing becomes more attractive: when traditional publishing shuns you because, say, your book is terrible, just go direct and let the market decide just how terrible it is. And if no one will buy it, you can always give it away–it’s not like it’s costing you anything.
  • So ebooks primarily allow large numbers of low-quality, low-priced books into the market, which fits the definition of disruption we talked about earlier.

  • Let’s talk specifically about Bible translations.
  • Traditionally, Bible translations have been expensive endeavors, involving teams of dozens of people working over several years.
    • The result is a high-quality product that conforms to the translation’s intended purpose.
    • In return for this high level of quality, Bible publishers charge money to, at a minimum, recoup their costs.
  • What would happen if we applied the lessons from the ongoing disruption in book publishing to Bible translations?
    • First, like Amazon and physical bookstores, we disrupt distribution.
    • Bible Gateway first disrupted distribution on the web in 1993 by letting people browse translations for free.
    • YouVersion took that disruption a step further on mobile in 2010 by letting people download and own translations for free.
    • But we’re really only talking about a disruption in distribution here. Just like with print books, this type of disruption doesn’t affect the core Bible-translation process.
  • That second type of disruption is still to come and will eventually arrive; I’m going to argue that, as with ebooks, the disruption to translations themselves will be largely technological and will result in an explosion of new translations.
    • I believe that this disruption will take the form of partially algorithmic personalized Bible translations, or Franken-Bibles.
    • Because Bible translation is a specialized skill, these Franken-Bibles won’t arise from scratch–instead, they’ll build on existing translations and will be tailored to a particular audience–either a single individual or a group of people.

  • In its simplest form, a Franken-Bible could involve swapping out a footnote reading for the reading that’s in the main text.

  • A typical Bible has around 1,100 footnotes indicating alternate translations. What if a translation allowed you to decide which reading you preferred–either by setting a policy (for example, you might say, “always translate adelphoi in a particular way”) or by deciding on a case-by-case basis?
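As a sketch of how mechanical this swap could be, the snippet below stores a verse with its main reading and footnote alternative and applies a reader’s preference (the verse data structure and function names are hypothetical, for illustration only):

```python
# Hypothetical sketch: applying a reader's footnote preference to a verse.
# The data format here is invented; a real system would use a marked-up text.

verses = {
    "Matthew 17:15": {
        "text": "my son {0}",
        # each entry is (main-text reading, footnote reading)
        "variants": [("has epilepsy", "is a lunatic")],
    },
}

def render(ref, prefer_footnote=False):
    """Return the verse text using either the main or the footnote reading."""
    verse = verses[ref]
    choice = 1 if prefer_footnote else 0
    readings = [pair[choice] for pair in verse["variants"]]
    return verse["text"].format(*readings)

print(render("Matthew 17:15"))                        # main-text reading
print(render("Matthew 17:15", prefer_footnote=True))  # footnote reading
```

A policy (“always prefer the footnote for this Greek word”) would just decide `prefer_footnote` per variant instead of per verse.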

  • By including footnote variants at all, translations have already set themselves up for this approach. Some translations go even further–the Amplified and Expanded Bibles embrace variants by embedding them right in the text. Here we see a more-extensive version of the same idea.
  • But that’s the simplest approach. A more-radical approach, from a translation-integrity perspective, would allow people to edit the text of the translation itself, not merely to choose among pre-approved alternatives at given points.
    • In many ways, the pastor who says in a sermon, “This verse might better be translated as…” is already doing this; it just isn’t propagated back into the translation.
    • People also do this on Twitter all the time, where they alter a few words in a verse to make it more meaningful to them.
    • The risk here, of course, is that people will twist the text beyond all recognition, either through incompetence or malice.

  • That risk brings us back to one of the other two functions that publishers serve: gatekeeping.
    • A translation is nothing if not gatekeeping: a group of people have gotten together and declared, “This is what Scripture says. These are our translation principles, and here’s where we stand in relation to other translations.”
    • What happens to gatekeeping–to the seal of authority and trust–when anyone can change the text to suit themselves?

  • In other words, what happens if the “Word of God” becomes the “Wiki of God?”
    • After all, people who aren’t interested in translating their own Bible still want to be able to trust that they’re reading an accurate, or at least non-heretical, translation.

  • I suggest that new axes of trust will form. Whereas today a translation “brand”–NIV, ESV, or whatever–carries certain trust signals, those signals will shift to new parties, and in particular to groups that people already trust.
    • The question of whom you’d trust to steward a Bible translation probably isn’t that different from the people you already trust theologically, a group that probably includes some of the following:
      • Social network.
      • Teachers or elders in your church.
      • Pastors in your church.
      • Your denomination.
      • More indirectly, a megachurch pastor you trust.
      • A parachurch organization or other nonprofit.
      • Maybe even a corporation.
    • The point is not that trust in a given translation would go away, but rather that trust would become more networked, complicated, and fragmented, before eventually solidifying.
      • We already see this happening somewhat with existing Bible translations, where certain groups declare some translations as OK and others as questionable. Like your choice of cellphone, the translation you use becomes an indicator of group identity.
      • The proliferation of translations will allow these groups who are already issuing imprimaturs to go a step further and advance whole translations that fit their viewpoints.
      • In other words, they’ll be able to act as their own Magisteriums. Their own self-published Magisteriums.
      • This, by the way, also addresses the third function that publishers serve: marketing. By associating a translation with a group you already identify with, you reduce the need for marketing.

  • I said earlier that I think technology will bring about this situation, but there are a couple ways it could happen.
    • First, an existing modern translation could open itself up to such modifications. A church could say, “I like this translation except for these ten verses.” Or, “This translation is fine except that it should translate this Greek word this way instead of this other way.”
    • Such a flexible translation could provide the tools to edit itself–with enough usage, it’s possible that useful improvements could be incorporated back into the original translation.
  • A second alternative is to use technology to produce a new, cheap, low-quality translation with editing in mind from the get-go, to provide a base from which a monstrous hydra of translations can grow. Let’s take a look at what such a hydra translation, or a Franken-Bible, could look like.

  • The basic premise is this: there are around thirty modern, high-quality translations of the Bible into English. Can we combine these translations algorithmically into something that charts the possibility space of the original text?
    • Bible translators already consult existing translations to explore nuances of meaning. What I propose is to consult these translations computationally and lay bare as many nuances as possible.

  • You can explore the output of what I’m about to discuss at www.adaptivebible.com. I’m going to talk about the process that went into creating the site.
  • This part of the talk is pretty technical.

  • In all, there are fifteen steps, broken into two phases: alignment of existing translations and generation of a new translation.
  • It took about ten minutes of processor time for each verse to produce the result.
  • The total cost in server time on Amazon EC2 to translate the New Testament was about $10. Compared to the millions of dollars that a traditional translation costs, that’s a big savings–five or six orders of magnitude.

  • The first phase is alignment.
  • First step. Collect as many English translations as possible. Obviously there are copyright implications with doing that, so it’s important to deal with only one verse at a time, which is something that all translations explicitly allow. For this project, we used around thirty translations.
  • Second. Normalize the text as much as possible. For this project, we’re not interested in any formatting, for example, so we can deal with just the plain text.
  • Third. Tokenize the text and run basic linguistic analysis on it.
    • Off-the-shelf open-source software from Stanford called Stanford CoreNLP tokenizes the text, identifies lemmas (base forms of words) and analyzes how words are related to each other syntactically.
    • In general, it’s about 90% accurate, which is fine for our purposes; we’ll be trying to enhance that accuracy later.
  • Fourth. Identify Wordnet similarities between translations.
    • Wordnet is a giant database of word meanings that computers can understand.
    • We take the lemmas from step 3 and identify how close in meaning they are to each other. The thinking is that even when translations use different words for the same underlying Greek word, the words they choose will at least be similar in meaning.
    • For this step, we used Python’s Natural Language Toolkit.
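To illustrate the idea behind path-based Wordnet similarity without the full database, here is a toy version that scores two words by how far apart they sit in a hand-built hypernym (is-a) tree; the words and tree are invented, and the real project would query Wordnet through NLTK instead:

```python
# Toy illustration of Wordnet-style path similarity over a hand-built
# hypernym tree. A real implementation would use nltk.corpus.wordnet.

hypernyms = {           # word -> its parent (more general) concept
    "heal": "treat",
    "cure": "treat",
    "treat": "act",
    "speak": "communicate",
    "communicate": "act",
}

def ancestors(word):
    """Return the chain [word, parent, grandparent, ...] up to the root."""
    chain = [word]
    while chain[-1] in hypernyms:
        chain.append(hypernyms[chain[-1]])
    return chain

def path_similarity(a, b):
    """1 / (1 + number of edges between a and b via their common ancestor)."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    for i, node in enumerate(chain_a):
        if node in chain_b:
            return 1.0 / (1 + i + chain_b.index(node))
    return 0.0

print(path_similarity("heal", "cure"))   # siblings: short path, high score
print(path_similarity("heal", "speak"))  # distant: long path, low score
```

Words that translate the same Greek term (“heal”/“cure”) score high; unrelated words score low, which is exactly the signal the aligner wants.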
  • Fifth. Run an off-the-shelf translation aligner.
    • We used another open-source program called the Berkeley Aligner, which is designed to use statistics to align content between different languages. But it works just as well for different translations of the same content in the same language. It takes anywhere from two to ten minutes for each verse to run.
  • Sixth. Consolidate all this data for future processing.
    • By this point, we have around 4MB of data for each verse, so we consolidate it into a format that’s easy for us to access in later steps.
  • Seventh. Run a machine-learning algorithm over the data to identify the best alignment between single words in each pair of translations.
    • We used another Python module, scikit-learn, to execute the algorithm.
    • In particular, we used Random Forest, which is a supervised-learning system. That means we need to feed it some data we know is good so that it can learn the patterns in the data.
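A minimal sketch of that supervised step, using scikit-learn’s RandomForestClassifier on made-up data: each row stands in for the feature vector of one candidate word pair, and the label says whether the pair truly aligns. The feature values and labels here are invented for illustration.

```python
# Toy sketch of the alignment classifier. Real feature vectors have the
# ten factors described in the talk; these two fake features might be
# something like (position distance, meaning dissimilarity).
from sklearn.ensemble import RandomForestClassifier

X = [
    [0.1, 0.9], [0.2, 0.8], [0.0, 1.0],   # close position, similar meaning
    [0.9, 0.1], [0.8, 0.0], [1.0, 0.2],   # far apart, dissimilar
]
y = [1, 1, 1, 0, 0, 0]                    # 1 = aligned, 0 = not aligned

clf = RandomForestClassifier(n_estimators=50, bootstrap=False, random_state=0)
clf.fit(X, y)

# A new pair that is nearby and similar in meaning should be labeled aligned.
print(clf.predict([[0.15, 0.85]]))
```

Once trained on the hand-aligned pairs, the classifier scores every candidate word pair in every verse.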

  • Where did we get this good data? We wrote a simple drag-and-drop aligner to feed the algorithm: it shows two lists of words, and you drag matching words on top of each other. It’s actually kind of fun; if you juiced it up a little, I could totally see it becoming a game called “Translations with Friends.”
    • In total, we hand-aligned around 30 pairs of translations across 25 verses. There are about 8,000 verses in the New Testament, so it doesn’t need a lot of training to get good results.

  • What the algorithm actually runs on is a big vector matrix. These are the ten factors we included in our matrix.
    • 1. One translation might begin a verse with the words “Jesus said,” while another might put that same phrase at the end of the verse. All things being equal, though, translations tend to put words in similar positions in the verse. When all else fails, it’s worth taking position into account.
    • 2. Similarly, even when translations rearrange words, they’ll often keep them in the same sentence. Again, all things being equal, it’s more likely that the same word will appear in the same sentence position across translations.
    • 3. If we know that a particular word is in a prepositional phrase, for example, it’s reasonably likely that it will serve a similar grammatical role in another translation.
    • 4. If words in different translations are both nouns or both verbs, it’s more likely that they’re translating the same word than if one’s a noun and another’s an adverb.
    • 5. Here we use the output from the Berkeley Aligner we ran earlier. The aligner is bidirectional, so if we’re comparing the word “Jesus” in one translation with the word “he” in another, we look both at what the Berkeley Aligner says “Jesus” should line up with in one translation and with what “he” should line up with in the other translation. It provides a fuller picture than just going in one direction.
    • 6. Here we go more general. Even if the Berkeley Aligner didn’t match up “Jesus” and “he” in the two translations we’re currently looking at, if other translations use “he” and the Aligner successfully aligned them with “Jesus”, we want to take that into account.
    • 7. This is similar to grammatical context but looks specifically at dependencies, which describe direct relationships between words. For example, if a word is the subject of a sentence in one translation, it’s likely to be the subject of a sentence in another translation.
    • 8. Wordnet similarity looks at the similarities we calculated earlier–words with similar meanings are more likely to reflect the same underlying words.
    • 9. This step strips out all words that aren’t nouns, pronouns, adjectives, verbs, and adverbs and compares their sequence–if a different word appears between two identical words across translations, there’s a good chance that it means the same thing.
    • 10. Finally, we look at any dependencies between major words; it’s a coarser version of what we did in #7.
    • The end result is a giant matrix of data–a ten-factor vector for every word combination in every translation in every verse–and we run our machine-learning algorithm on it, which produces an alignment between every word in every translation.
    • At this point, we’ve generated between 50 and 250MB of data for every verse.

  • Eighth. Now that we have the direct alignment, we supplement it with indirect alignment data across translations. In other words, to reuse our earlier example, the alignment between two translations may not align “Jesus” and “he,” but alignments in other translations might strongly suggest that the two should be aligned.
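The indirect step can be sketched in a few lines: if translation A aligns a word with translation C, and C aligns that word with translation B, we propose an A-B alignment even when the direct A-B aligner missed it. The translation names and word pairs below are invented for illustration.

```python
# Toy sketch of indirect (transitive) alignment across translations.
# direct[(X, Y)] holds word pairs the direct aligner found between X and Y.

direct = {
    ("T1", "T3"): {("Jesus", "Jesus")},
    ("T3", "T2"): {("Jesus", "he")},
    ("T1", "T2"): set(),           # the direct aligner missed this pair
}

def indirect(a, b, via):
    """Word pairs between a and b implied by alignments through `via`."""
    implied = set()
    for w1, bridge in direct[(a, via)]:
        for bridge2, w2 in direct[(via, b)]:
            if bridge == bridge2:
                implied.add((w1, w2))
    return implied

print(indirect("T1", "T2", via="T3"))  # recovers the missed Jesus/he pair
```

Summing this evidence over all thirty translations makes each pairwise alignment much more robust than the direct aligner alone.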
  • At this point, we have a reasonable alignment among all the translations. It’s not perfect, but it doesn’t have to be. Now we shift to the second phase: generating a range of possible translations from this data.

  • First. Consolidate alignments into phrases, where we look for runs of parallel words. You can see that we’re only looking at lemmas here–dealing with every word creates a lot of noise that doesn’t add much value, so we ignore the less-important words. In this case, the first two have identical phrases even though the words differ slightly, while the third structures the sentence differently.
  • Second. Arrange translations into clusters based on how similar they are to each other structurally. In this example, the first two form a cluster, and the third would be part of a different cluster.
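Clustering by structure reduces to grouping translations whose lemma-phrase sequences match. A toy version, with invented translation names and phrase sequences:

```python
# Toy sketch: cluster translations by phrase structure. Each translation
# is reduced to a tuple of lemmatized keyword phrases; translations with
# identical structure land in the same cluster.
from collections import defaultdict

structures = {
    "T1": ("jesus", "say", "follow me"),
    "T2": ("jesus", "say", "follow me"),   # same structure as T1
    "T3": ("follow me", "jesus", "say"),   # different word order
}

clusters = defaultdict(list)
for translation, structure in structures.items():
    clusters[structure].append(translation)

for structure, members in clusters.items():
    print(members, "->", structure)
```

A real implementation would use fuzzier matching than exact tuple equality, but the grouping idea is the same.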

  • Third. Insert actual keyword text. I’ve been using words in the examples I’ve been giving, but in the actual program, we use numerical ids assigned to each word. Here we start to introduce actual words.
  • Fourth. Fill in the gaps between keywords. We add in small words like conjunctions and prepositions that are key to producing recognizable English.
  • Fifth. Add in punctuation. Up to this point, we’ve been focusing on the commonalities among translations. Now we’re starting to focus on differences to produce a polished output.
  • Sixth. Reduce the possibility space to accept only valid bigrams. “Bigrams” just means two words in a row. We remove any two-word combinations that, based on our algorithm thus far, look like they should work but don’t. We check each pair of words to see whether they exist anywhere in one of our source translations. If they don’t, we get rid of them.
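The bigram check itself is simple: collect every adjacent word pair attested in the source translations, then reject any generated sequence containing an unattested pair. The source sentences below are invented for illustration.

```python
# Toy sketch of bigram filtering: keep a generated word sequence only if
# every adjacent word pair occurs somewhere in a source translation.

sources = [
    "jesus said be healed",
    "jesus said to him",
]

def bigrams(words):
    """Set of adjacent word pairs in a word list."""
    return set(zip(words, words[1:]))

attested = set()
for sentence in sources:
    attested |= bigrams(sentence.split())

def passes(candidate):
    """True if every adjacent pair in the candidate is an attested bigram."""
    return bigrams(candidate.split()) <= attested

print(passes("jesus said be healed"))  # every pair attested
print(passes("be healed jesus said"))  # ("healed", "jesus") is unattested
```

As the limitations section notes, this local check can still approve globally ridiculous sentences, since each pair is validated in isolation.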

  • Seventh. Produce rendered output.

  • In this case, the output is just for the Adaptive Bible website. It shows the various translation possibilities for each verse.
    • Hovering over a reading shows what the site thinks are valid next words based on what you’re hovering over. (That’s the yellow.)
    • You can click a particular reading if you think it’s the best one, and the other readings disappear. Clicking again restores them. (That’s the green.)
    • The website shows a single sentence structure that it thinks has the best chance of being valid, but most verses have multiple valid structures that we don’t bother to show here.

  • To consider a verse successfully translated, this process has to produce readings supported by two independent translation streams (e.g., having a reading supported only by ESV and RSV doesn’t count because ESV is derived from RSV).
    • Using this metric, the process I’ve described produces valid output for 96% of verses in the New Testament.
    • On the current version of adaptivebible.com, I use stricter criteria, so only 91% of verses show up.

  • Limitations
    • Just because a verse passes the test, that doesn’t mean it’s actually grammatical, and it certainly doesn’t mean that every alternative presented within a verse is valid.
    • Because we use bigrams for validity, we can get into situations like what you see here, where all these are valid bigrams, but the result (“Jesus said, ‘Be healed,’ Jesus said”) is ridiculous.
    • There’s no handling of inter-verse transitions; even if a verse is totally valid, it may not read smoothly into the next verse.
    • Since we removed all formatting at the beginning of the process, there’s no formatting.
  • Despite those limitations, the process produced a couple of mildly interesting byproducts.

  • Probabilistic Strongs-to-Wordnet sense alignment. Given a single Strong’s alignment and a variety of translations, we can explore the semantic range of a Strong’s number. Here we have dunamis. This seems like a reasonably good approximation of its definition in English.
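The byproduct amounts to counting which English words the various translations align to a given Strong’s number. A toy version with invented gloss data:

```python
# Toy sketch: approximate a Strong's number's semantic range by counting
# the English words different translations align to it. The glosses here
# are invented; real counts would come from the alignment data.
from collections import Counter

# words various translations align to Strong's G1411 (dunamis) in one verse
glosses = ["power", "power", "might", "miracle", "strength", "power"]

print(Counter(glosses).most_common())
```

Normalizing those counts gives a rough probability distribution over senses, which is the “probabilistic” part of the alignment.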

  • Identifying translation similarity. This slide explores how structurally similar translations are to each other, based on the phrase clusters we produced. The results are pretty much what I’d expect: translations that are derived from each other tend to be similar to each other.

  • What I’ve just described is one pretty basic approach to what I think is inevitable: the explosion of translations into Franken-Bibles as technology gets better. In the future, we won’t be talking about particular translations anymore but rather about trust networks.
  • To be clear, I’m not saying that I think this development is a particularly great one for the church, and it’s definitely not good for existing Bible translations. But I do think it’s only a matter of time until Franken-Bibles arrive. At first they’ll be unwieldy and ridiculously bad, but over time they’ll adapt, improve, and will need to be taken seriously.

Rise of the Robosermon

Sunday, April 29th, 2012

In a recent issue of Wired, Steven Levy writes about Narrative Science, a company that uses data to write automated news stories. Right now, they mostly deal in data-intensive fields like sports and finance, but the company is confident that it will easily expand into other areas—the company’s co-founder even predicts that an algorithm will win a Pulitzer Prize in the next five years.

In February 2012, I attended a session at the TOC Conference given in part by Kristian Hammond, the CTO and co-founder of Narrative Science. During the session, Hammond mentioned that sports stories have a limited number of angles (e.g., a “blowout win” or a “come-from-behind victory”)—you can probably sit down and think up a fairly comprehensive list in short order. Even in fictional sports stories, writers only use around sixty common tropes as part of the narrative. Once you have an angle (or your algorithm has decided on one), you just slot in the relevant data, add a little color commentary, and you have your story.
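The “pick an angle, slot in the data” approach Hammond describes can be caricatured in a few lines; the angle templates, threshold, and game data below are all invented for illustration, and real systems like Narrative Science’s are of course far more sophisticated.

```python
# Caricature of data-driven story generation: choose an angle from the
# data, then fill in a template with the relevant numbers.

templates = {
    "blowout win": "{winner} crushed {loser} {ws}-{ls} in a blowout.",
    "comeback":    "{winner} rallied past {loser} {ws}-{ls} after trailing late.",
}

def pick_angle(game):
    """Choose a story angle from the box score (toy 10-run threshold)."""
    return "blowout win" if game["ws"] - game["ls"] >= 10 else "comeback"

def write_story(game):
    return templates[pick_angle(game)].format(**game)

game = {"winner": "Cubs", "loser": "Mets", "ws": 12, "ls": 1}
print(write_story(game))
```

Swap box scores for Bible passages and angles for sermon structures, and the same skeleton describes a robosermon generator.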

At the time, I was struggling to understand how automated content could apply to Bible study; Levy’s article leads me to think that robosermons, or sermons automatically generated by a computer program, are the way of the future.

Parts of a Robosermon

[Image: Futurama’s robot preacher. I’ve never seen these episodes, so hopefully this image isn’t terribly heretical.]

After all, from a data perspective, sermons don’t differ much from sports stories. In particular, they have three components:

First, as with sports stories, sermons follow predictable structures and patterns. David Schmitt of Concordia Theological Seminary suggests a taxonomy of around thirty sermon structures. Even if this list isn’t comprehensive, it would probably take, at most, 100 to 200 structures to categorize nearly all sermons.

Second, sermons deal with predictable content: whereas sports have box scores, sermons have Bible texts and topics. A sermon will probably deal with a passage from the Bible in some way—the 31,000 verses in the Bible comprise a large but manageable set of source material (especially since most sermons involve a passage, not a single verse; you can probably cut this list down to around 2,000 sections). Topically, SermonCentral.com lists only 500 sermon topics in their database of 120,000 sermons. The power-law popularity distribution (i.e., the 80/20 rule) of verses preached on (SermonCentral.com has 1,200 sermons on John 1 compared to seven on Numbers 35) and topics (1,400 sermons on “Jesus’ teachings” vs. four on “morning”) means that you can categorize most sermons using a small portion of the available possibilities.

Third, sermons generally involve illustrations or stories, much like the color commentary of sports stories. Finding raw material for illustrations shouldn’t present a problem to a computer program; a quick search on Amazon turns up 1,700 books on sermon illustrations and an additional 10,000 or so on general anecdotes. You can probably extract hundreds of thousands of illustrations from just these sources. Alternately, if a recent news story relates to your topic, the system can add the relevant parts to your sermon with little trouble (especially if a computer wrote the news story to begin with).

Application

You end up being able to say, “I want to preach a sermon on Philippians 2 that emphasizes Christ’s humility as a model for us.” Then—and here’s the part that doesn’t exist yet but that technology like Narrative Science’s will provide—an algorithm suggests, say, an amusing but poignant anecdote to start with, followed by three points of exegesis, exhortation, and application, and finishing with a trenchant conclusion. You tweak the content a bit, throwing in a shout-out to a behind-the-scenes parishioner who does a lot of work but rarely receives recognition, and call it done.

Why limit sermons to pastors, though? Why shouldn’t churchgoers be able to ask for custom sermons that fit exactly their circumstances? “I’d like a ten-part audio sermon series on Revelation from a dispensational perspective where each sermon exactly fits the length of my commute.” “Give me six weeks of premarital devotions for my boyfriend and me. I’ve always been a fan of Charles Spurgeon, so make it sound like he wrote them.”

Levy opens his Wired article with an anecdote about how grandparents would find articles about their grandchildren’s Little League games just as interesting as “anything on the sports pages.” He doesn’t mention that what they really want is a recap with their grandchild as the star (or at least as a strong supporting character—it’s like one of those children’s books where you customize the main character’s name and appearance). Robosermons let you tailor the sermon’s content so that your specific problems or questions form the central theme.

The logical end of this technology is a sermonbot that develops a following of eager listeners and readers, in the same way that an automated newspaper reporter would create fans on its way to winning a Pulitzer.

You may argue that robosermons diminish the role of the Holy Spirit in preparing sermons, or that they amount to plagiarism. I’m not inclined to disagree with you.

Conclusion

Building a robosermon system involves five components: (1) sermon structures; (2) Bible verses; (3) topics; (4) illustrations; and (5) technology like Narrative Science’s to put everything together coherently. It would also be helpful to have (6) a large set of existing sermons to serve as raw data. It’s a complicated problem but hardly an insurmountable one over the next ten years, should someone want to tackle it.

I’m not sure they should; that way lies robopologetics and robovangelism.

If you’re not an algorithm and you want to know how to prepare and deliver a sermon, I suggest listening to this 29-part course on preaching by Bryan Chapell at Biblical Training. It’s free and full of homiletic goodness.

The Topical Index and the Living Index

Monday, January 2nd, 2012

The New York Times writes about the first-ever topical index for the Talmud. It looks like a topical Bible and contains 6,600 topics and 27,000 subtopics. (For comparison, Nave’s Topical Bible contains about 5,300 topics and 20,000 subtopics.)

The first page of the Talmud topical index shows entries for Aaron (seven subtopics), Abandon (five subtopics), and Abba (part of two subtopics).

The Sabbath

Two parts of the article stand out. First:

For three decades, Talmud students have been able to use a Nexis-like CD search engine, the Responsa Project, created by Bar Ilan University in Israel…. Bar Ilan officials acknowledged that the CD had one major disadvantage: it cannot be accessed on the Sabbath, when much learning takes place. It also costs $790.

Of course it makes sense that you wouldn’t be able to use a digital study tool on the Sabbath; it had just never occurred to me. The evangelical analog might be having to use a print Bible in church instead of a mobile or projected version.

The Living Index

The second highlight from the article is:

Rabbi Benjamin Blech, professor of Talmud at Yeshiva University, said the rabbis believed that study should not be made too easy. “We want people to struggle with the text because by figuring it out you will have a deeper comprehension,” he said. “They wanted a living index, not a printed index.”

Bible software, websites, and apps are all working to create a “living index” (or at least a “responsive index”) of the Bible that lets you find comprehensive answers to every question that pops into your head while studying the Bible. But will this work devalue the actual experts (pastors and Bible teachers) who currently serve as living indexes?

The book The Lean Startup provides a framework for answering this question. It quotes Farbood Nivi, founder of test-preparation website Grockit:

“Whether you’re studying for the SAT or you’re studying for algebra, you study in one of three ways. You spend some time with experts, you spend some time on your own, and you spend some time with your peers. Grockit offers these three same formats of studying. What we do is we apply technology and algorithms to optimize those three forms.”

Or, in evangelical terms:

Study Type → Current Source
Time on your own → Personal Bible study; daily quiet time
Time with your peers → Small group Bible study; Sunday school (not taught by clergy)
Time with experts → Sunday morning sermon; radio and television programs; in-person academic Bible classes

Bible software has historically augmented “time on your own” by tying together study materials: connecting documents to each other.

Recently, Bible software has expanded into “time with your peers” by mixing in the social layer that is enveloping the wider world of technology—“away from a web that connects documents together to a web that connects people together,” as Paul Adams puts it in his book Grouped. Bible software has three options when embedding social technologies: (1) Inject technology into existing offline practices (e.g., automating the irritating or the expensive); (2) Copy technology from secular sources (e.g., foursquare-style checkins); or (3) Come up with something new. The most likely outcome will involve a combination of these options.

Eventually, Bible software will delve into “time with experts,” as well, whether those experts are your local pastor or nationally recognized figures. Biblical Training is pioneering this approach in the field of Bible studies and theology, while Stanford and MIT are leading the way in other fields.

Will Bible software someday become an “expert” itself, giving you custom answers to your questions and a personal study plan? It’s certainly possible. Khan Academy is already disrupting math education and assessment. Someone will undoubtedly explore whether a similar approach works for Bible studies.

Of course, the open question is just how much we want Bible software to function as a living index; the rabbis who preferred that students “struggle” with text have a point that learning and wisdom come with effort. In a future where Bible software can provide time for you to study on your own, with peers, and with experts, I guess we’ll find out just how “easy” we want Bible study to become.

Apologetics and Anti-Apologetics Apps

Friday, July 2nd, 2010

The New York Times discusses the latest iPhone trend: apps that provide talking points for the existence of God, pro and con.

In a dozen new phone applications, whether faith-based or faith-bashing, the prospective debater is given a primer on the basic rules of engagement — how to parry the circular argument, the false dichotomy, the ad hominem attack, the straw man — and then coached on all the likely flashpoints of contention….

Users can scroll from topic to topic to prepare themselves or, in the heat of a dispute, search for the point at hand — and the perfect retort.

I expect that, eventually, we’ll just let our phones argue the basic questions of existence among themselves, and then they’ll let us know what they’ve decided.

A screenshot from the LifeWay Fast Facts app

“Printing” 3D Maps and Bible Objects

Thursday, December 13th, 2007

The Wall Street Journal yesterday had an article about 3D printers turning web creations into physical models (the link may or may not work for you). Here’s the paragraph that got my attention:

A World of Warcraft figurine created by a 3D printer.

In Redmond, Wash., a start-up called 3D Outlook Corp. this month will begin using software from NASA to sell 3-D models of mountains and other terrain priced at under $100, says Tom Gaskins, the company’s chief executive officer. Mr. Gaskins says hikers, resorts and real-estate firms are likely customers for 3-D maps and models that show the topographic contours of ski slopes, golf courses and other landscapes.

So you could, in theory, take Google Earth satellite and elevation data, combine it with the Bible geocoding data, and produce a custom 3D physical model of the Holy Land with key places labeled—or you could focus on one area, like Jerusalem, and show how it changed over time.

Or you could take a 3D model of Herod’s Temple and turn it into a 3D “printout.”

Anything you can represent in three dimensions—for example, the Ark of the Covenant or an ancient synagogue—can become a 3D model. In the future, I imagine technically minded Bible students producing a 3D model of something in the Bible, backed by research, as a final project in a class.

Imagine a 3D recreation of Nineveh in the time of Jonah. Or a complete reconstruction of a partially excavated artifact uncovered on an archaeological dig. Or a way to recreate a variety of pottery vessels based on location and period (to help archaeology students familiarize themselves with identifying pottery from sherds—maybe you could “print” several vessels, break them, mix up the remains, and have students identify the location and period of each sherd; it sounds like an interesting exam to me). Or a way for museums to share exact replicas of items in their collections with universities, so students can examine the items more closely than they can the originals. And if you break one? Just print another copy.

Obviously the possibilities extend beyond the realm of biblical studies, but affordable 3D printing opens a lot of intriguing doors. Hopefully it’ll get cheaper and more widespread soon.

Bible Microformats

Saturday, May 26th, 2007

Sean from Blogos proposes a microformat for marking up Bible references on the web.

About Microformats

Microformats are a way of marking up semantic data in HTML without inventing new elements or attributes. For example, here’s how you might mark up geographic coordinates:

<div class="geo">GEO: <span class="latitude">37.386013</span>, <span class="longitude">-122.082932</span></div>

In this way, computer programs can figure out without any ambiguity that the above sequence of numbers refers to latitude and longitude. Browsers, for example, might automatically link the coordinates to Google Maps or your mapping application of choice. Firefox 3 is evolving along these lines.

Bible Microformat

I’ve been thinking for a while about the best syntax to use for a Bible microformat. The problem I kept running into was coming up with the One True Representation of a Bible verse (i.e., is it “John 3:16” or “Jn 3:16” or “John.3.16” or something else?).

Sean neatly sidesteps the problem with a “good enough” solution. He proposes a format akin to the following:

<abbr class="bibleref" title="John 3:16">Jn 3:16</abbr>

The crucial aspect is that it doesn’t matter exactly how you specify the Bible verse—once you do the hard part of indicating that a string of text is a verse reference (the class="bibleref"), any decent reference parser should be able to figure out which verses you mean. It’s so simple it’s brilliant.

Now let’s push it a little further.

I suggest that the microformat should take advantage of the underused and, in this case, semantically more meaningful <cite> tag rather than the <abbr> tag. You are, after all, citing the Bible.

<cite class="bibleref">John 3:16</cite>

However, you also have to account for people who link the verse to their favorite online Bible. You could double up the tags:

<a href="…"><cite class="bibleref">John 3:16</cite></a>

But that’s messier than it needs to be. Since the practice of linking is widespread, why not overload the <a> tag with the appropriate class:

<a href="…" class="bibleref">John 3:16</a>

Both <cite> and <a> have a title attribute in which you can place a human- (and machine-) readable version of the verse if you choose. The title is optional as long as the verse reference is the only text inside the tag. Indeed, a title is required only if the element’s text is ambiguous (a verse without a chapter and book, for example, or completely unrelated text). (The practice of not recording duplicate information is the Don’t Repeat Yourself principle.) For example:

<p>God <a href="…" class="bibleref" title="John 3:16">loves</a> us.</p>
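To see how a consumer might act on this markup, here’s a minimal Python sketch of an extractor (the class name and the simplified handling of nested tags are my own illustration, not part of the proposal). It honors the rule above: a title attribute, when present, takes precedence over the element’s text:

```python
from html.parser import HTMLParser

class BiblerefExtractor(HTMLParser):
    """Collect verse references from elements marked class="bibleref".

    Simplification for this sketch: bibleref elements are assumed
    not to nest other <cite>/<a>/<abbr> elements inside themselves.
    """
    def __init__(self):
        super().__init__()
        self.refs = []
        self._title = None   # title of the open bibleref element, if any
        self._text = None    # accumulated text; None when not inside one

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "bibleref" in attrs.get("class", "").split():
            self._title = attrs.get("title")
            self._text = ""

    def handle_data(self, data):
        if self._text is not None:
            self._text += data

    def handle_endtag(self, tag):
        if self._text is not None and tag in ("cite", "a", "abbr"):
            # The title attribute takes precedence over the element text.
            self.refs.append(self._title or self._text.strip())
            self._title = None
            self._text = None

extractor = BiblerefExtractor()
extractor.feed('<p>God <a href="#" class="bibleref" title="John 3:16">loves</a> '
               'us (<cite class="bibleref">Jn 3:17</cite>).</p>')
print(extractor.refs)  # → ['John 3:16', 'Jn 3:17']
```

The extractor deliberately doesn’t normalize the references it collects (“Jn 3:17” stays as written); that’s the downstream reference parser’s job, which is exactly the division of labor that makes the “good enough” proposal work.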

Corner Cases

So how would you specify a Bible translation if a specific translation were germane to the citation’s context? (In theory, when you don’t specify a translation, people consuming the microformat could choose to see the passage in the translation of their choice, similar to how some people prefer to look up an address in Google Maps, while others prefer Yahoo Maps.) I’m sympathetic to the OSIS practice of indicating the translation first, followed by the reference. For example:

<cite class="bibleref" title="ESV: John 3:16">Jn 3:16</cite>

This practice follows the logical progression of going from general to specific:

[Implied Language] → Translation → Book → Chapter → Verse

The title is also human-readable, though it departs from the standard practice of placing the translation identifier after the reference.

Sean mentions two other cases of note: verse ranges (e.g., “John 3:16-17”) and compound verses (e.g., “John 3:16, 18, 20”). Personally, I see no reason for a biblerefrange attribute as he suggests. A Bible reference parser should be able to handle a continuous range as easily as a single verse.

But compound verses present a more complex problem. How do you mark them up? The above examples all stand on their own, which is one of the principles of microformats—you parse the element and get everything you need. But let’s say you have the text “John 3:16, 18.” Treating the range as a unit is easy:

<cite class="bibleref">John 3:16, 18</cite>

Any parser will handle that text; though it could be ambiguous (do you mean John 3:16 and John 18?), in practice it rarely is. But what if you mark them up separately?

<cite class="bibleref">John 3:16</cite>, <cite class="bibleref">18</cite>

In this case, the “18” doesn’t communicate enough information to the parser. The parser could maintain state and know that the previous reference was John 3:16, but state requirements increase the parser’s complexity, which defeats the purpose of the microformat in the first place. In such cases, then, I would argue that a title attribute is necessary:

<cite class="bibleref">John 3:16</cite>, <cite class="bibleref" title="John 3:18">18</cite>

Putting It All Together

Here’s my Bible microformat proposal:

Citing a Bible Verse without Linking to It

<cite class="bibleref">[reference]</cite>

Citing a Bible Verse while Linking to It

<a href="…" class="bibleref">[reference]</a>

Citing a Bible Verse Indirectly (or When the Text Is Ambiguous) without Linking to It

<cite class="bibleref" title="[reference]">[any text]</cite>

Citing a Bible Verse Indirectly (or When the Link Text Is Ambiguous) while Linking to It

<a href="…" class="bibleref" title="[reference]">[any text]</a>

Verse Reference Format

The [reference] in the above examples refers to a machine-parsable and human-readable representation of a single verse, a range of verses, or a series of verses. You should use unambiguous abbreviations if you use abbreviations. See Appendix C in the OSIS spec (pdf) for a list of possible abbreviations.

When you’re in doubt about whether the reference text is parsable, use the title attribute to encode a fuller representation. In particular, when the reference doesn’t include all the text necessary to produce an unambiguous book/chapter/verse reference, place an unambiguous reference in title.

About the title Attribute

The title attribute, when present, takes precedence over the contents of the element (<cite> or <a>). When the title is not present, the contents of the element are assumed to be the verse reference. The title attribute contains an unambiguous machine-parsable representation of the verse reference.

The attribute can also contain an optional translation identifier at the beginning of the value, followed by a colon. Appendix D in the OSIS spec (pdf) has a list of translation identifiers. For example:

<cite title="ESV: John 3:16">…</cite>

To be comprehensive, you would ideally include a language identifier (e.g., “en:ESV: John 3:16”) before the translation identifier. I would argue that a language identifier is only necessary if you’re using a non-standard abbreviation.

However, you should only include a translation identifier if it is important that your readers see a particular translation or language. Otherwise, you should allow the parsing software to use your readers’ preferred translation and language.

Here is a Perl regex for allowed formats in the title. $1 is the optional language identifier. $2 is the optional translation identifier. $3 is the verse reference, which is deliberately wide-open to accommodate many different reference formats.

title="([\w\-]+:)?([\w\-]+:)?\s*([^"]+)"
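In Python terms, the same pattern could unpack a title value like this (the parse_title helper, and the rule that a single identifier is read as the translation rather than the language, are my own reading of the convention above, not part of the spec):

```python
import re

# Python translation of the Perl pattern above, anchored to the bare
# attribute value rather than the title="..." wrapper.
# Group 1: optional identifier; group 2: optional second identifier;
# group 3: the verse reference, deliberately wide-open.
TITLE_RE = re.compile(r'^([\w\-]+:)?([\w\-]+:)?\s*([^"]+)$')

def parse_title(title):
    """Split a bibleref title into (language, translation, reference).

    When only one identifier precedes the reference, it is treated as
    the translation ("ESV: John 3:16"); both identifiers are present
    only in the fully qualified form ("en:ESV: John 3:16").
    """
    m = TITLE_RE.match(title)
    if not m:
        return None
    lang, trans, ref = m.groups()
    if lang and not trans:       # single identifier: it's the translation
        lang, trans = None, lang

    def clean(s):
        return s.rstrip(":") if s else None

    return (clean(lang), clean(trans), ref.strip())

print(parse_title("John 3:16"))          # → (None, None, 'John 3:16')
print(parse_title("ESV: John 3:16"))     # → (None, 'ESV', 'John 3:16')
print(parse_title("en:ESV: John 3:16"))  # → ('en', 'ESV', 'John 3:16')
```

Note that the space after the identifier’s colon matters: because `[\w\-]+` can’t cross a space, a plain reference like “John 3:16” never matches the identifier groups, which is what keeps the wide-open reference group from swallowing too little.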

Bible Reading Tech

Saturday, May 12th, 2007

Bob Pritchett from Logos Bible Software points to Live Ink, a technology that takes normally formatted text and breaks its clauses into lines to improve reading comprehension. It looks like this:

A sports report is broken into lines: Hal Atkinson, / Mickey Walters / and rookie sensation…

Having been an English major, I keep trying to assign meaning to the line breaks—you pay close attention to line breaks when explicating poetry, and it’s hard to stop noticing them everywhere once you start noticing them in poetry. Live Ink has research to back up their reading claims; I’m not one to question their findings. So maybe I’m an anomaly, but studying the Bible in a similar format would distract me.

Speaking of things that distract me, a few years ago I came across an example (now lost on the hard drive of a deceased computer) of an interesting way of reading the Bible: one word at a time. The program cycles through the text of the Bible, showing you only one word on the screen—blink and you miss it. This Flash demo I just whipped up gives you the general idea:

You may need to read this post outside an RSS reader to see the demo. Not inclined? Here’s a screenshot. Picture the words rapidly scrolling upward:

The words “at my watchpost and station” are visible, with “watchpost” being the most prominent.

Honestly, I can’t decide if it’s a good idea or a bad one. The hardest part for me is that it requires more sustained concentration than I usually devote to reading. It might be useful if someone could come up with an intuitive way to control the speed and allow time to blink. Like Live Ink, it may simply take some getting used to.

(The way the demo scrolls kind of reminds me of William Shatner’s line delivery in Star Trek—it pauses slightly on longer words, which was apparently the key to Shatner’s acting in the 1960s. Actually, the line breaks in the Live Ink screenshot above remind me of how he would read a box score, too. Maybe he was just ahead of his time.)

Virtual Tours with Photosynth

Wednesday, April 4th, 2007

Photosynth is a Microsoft technology preview that creates 3D spaces from a bunch of photographs of the same place. Here’s a screenshot of the application in the space it’s constructed from dozens of photographs of the Piazza San Marco in Venice:

A 3D space showing a photograph of the tower with pointillistic representations of the rest of the piazza.

In the live application you can move around and highlight different photos taken from all sorts of angles. Check out the screencast if you’re hesitant about installing an ActiveX control on your computer.

I can see using this technology to reconstruct highly photographed places like Jerusalem in 3D. It’s not quite as good as building your own virtual-reality model of Jerusalem, but it’s a lot easier. Consider that flickr has about 9200 photos tagged “Jerusalem” under a Creative Commons license, and you can imagine the possibilities.

I look forward to being able to upload my own (or others’) photos and have the application create a space from them. Right now you can only use predefined photo collections.

Via Digital Media Minute.