Archive for the ‘Technology’ Category

BibleTech 2019

Friday, January 11th, 2019

If you’re reading this blog, then you’re probably interested in attending the BibleTech conference, held on April 11-12, 2019, in Seattle.

You may even be interested in submitting a proposal for a talk; if so, the deadline is January 31.

Here’s what I plan to talk about if they accept me:

Designing for Agency in Bible Study

This talk explores the theory and practice of designing a Bible study experience so that the distinctive property of digital media–interactivity at scale–enhances rather than constrains the participant’s agency, or ability to act. We’ll discuss how people’s psychological needs for competence, relatedness, and autonomy affect their approach to and expectations of the Bible and church life, and how developers can support these needs by considering agency during the design process. We’ll also look at a specific application that HarperCollins Christian Publishing has developed to put these ideas into practice and promote agency in the context of daily Bible reading, explaining how and why we transformed a product that wasn’t a good fit for print into one that feels digitally native.

“Data and the Bible Online” Interviews

Monday, November 21st, 2016

BigBible has been running a series of interviews called “Data and the Bible Online.” The latest is with me.

A recent interview in the series is with John Dyer. In part of his interview, he says, “I do worry that even for those of us who value Scripture as part of their spiritual lives, it’s easy to confuse access to the Bible (i.e., having an app installed) with wisdom, maturity, or formation.” Just today I read “This $1,500 Toaster Oven Is Everything That’s Wrong With Silicon Valley Design,” with the tagline, “Automated yet distracting. Boastful yet mediocre. Confident yet wrong.” That tagline could well describe the future of digital Bibles.

In the article, Mark Wilson writes:

June [the oven] is taking something important away from the cooking process: the home cook’s ability to observe and learn. The sizzle of a steak on a pan will tell you if it’s hot enough. The smell will tell you when it starts to brown. These are soft skills that we gain through practice over time. June eliminates this self-education. Instead of teaching ourselves to cook, we’re teaching a machine to cook. And while that might make a product more valuable in the long term for a greater number of users, it’s inherently less valuable to us as individuals, if for no other reason than that even in the best-case scenarios of machine learning, we all have individual tastes. And what averages out across millions of people may end up tasting pretty . . . average.

Bible software can fall into these same traps, especially if an AI is involved. As Nicholas Carr likes to remind us, the more you automate something, the worse you become at it. Or, in Idiocracy terms:

Smartspeek: We think for you. We speek for you.
Image: 20th Century Fox

On the other hand, I can’t say that the American church succeeded in developing believers’ “wisdom, maturity, and formation” even before digital technology increased the Bible’s availability and immediacy. Matthew Block writes in First Things that evangelicals in particular misapply the Reformation idea of Sola Scriptura:

many Christians seem to think saying Sola Scriptura is the ultimate authority somehow means it is my personal “solo” reading of Scripture that is authoritative. They reject the witness of the Church down through the ages in favor of a personal, private understanding of Scripture (which is not at all what the reformers meant by the term “Scripture alone”).

Digital Bibles will accelerate this process by emphasizing personal application and understanding of Scripture, possibly–but not necessarily–building a theological echo chamber in which an AI can present you internally consistent interpretations that nevertheless fall outside what most would consider the bounds of orthodoxy.

Dyer argues elsewhere that embodied practices could serve as one antidote to these virtualization trends seeping into culture and consequently the church. We already see a secular reaction against the always-connected mentality with organizations like Time Well Spent, and Dyer suggests that the church has the tools it needs to speak to people both discomfited and empowered by technology.

As the U.S. moves to a post-industrial Christianity likely characterized by increased fragmentation and polarization, and as gatekeepers shift–Christianity Today recently chronicled the rise of parachurch women’s ministries that counterpoint the decline of traditional Protestant denominations–developing (or rediscovering) a solid theology around embodiment and presence will become increasingly important to the church. I lament in my BigBible interview the paucity of formal theological education related to digital engagement: the need to equip pastors and others to understand and respond deeply to people’s relationship with a technology-mediated existence. I believe the American church will need to grapple with this relationship in the near future–if only because the wider culture will also struggle with it.

Your Next Bible Will Be a Hologram

Friday, January 23rd, 2015

Or, how Microsoft may have just invented the future of intensive Bible study.

Microsoft this week unveiled HoloLens, an augmented-reality headset that overlays text and images on the real world and, in particular, anchors them to precise locations in space, as if they were real objects. Here’s one of Microsoft’s promotional shots to give you an idea of what wearing HoloLens is like:

A man is wearing HoloLens in his kitchen.

In this image, the man is apparently so obsessed with going to Maui that he maintains a Sims-like vacation paradise on his counter. The TV, “Recipes” button, Maui simulation, and to-do list are all virtual: using the device on his head, only he can see whether his Sims manage to find a staircase to the beach or if instead they simply leap the fifteen feet off the cliff to the sand.

At this year’s BibleTech conference, I’m going to discuss why the idea of the “digital library” doesn’t appeal to certain kinds of people, and one aspect of the discussion involves the tension between print books and digital ones, each of which has advantages and disadvantages. Microsoft’s holographic technology (I recognize that one, they’re not really holograms, and two, what I’m describing here may go beyond what’s possible in the first devices) presents an intriguing way to bridge the physical and digital worlds of Bible study.

Certain kinds of people prefer to study from print Bibles, and for them digital resources serve as study augmentations: parallel Bibles and commentaries feature prominently in this kind of study practice. The melding of physical and digital has always been awkward for this type of person, although tablet computers have eased this awkwardness somewhat. Still, the main limitation of digital resources for this person is space; small screens (compared to the size of a desk) don’t provide enough room to look at very many resources simultaneously, forcing them to toggle between resources. Edward Tufte calls this phenomenon being “stacked in time” rather than “adjacent in space,” saying that the latter is generally preferable.

Holograms remove this space limitation by expanding your working area to your entire physical desktop:

An open Bible appears in the middle of a desk with holographic text around it.

In this image, only the physical Bible and the desk are actually there. The rest of the text appears to float on top of the desk, providing enough room to engage in the kind of deep study that you might crave. Here I imagine that you, wearing holographic goggles, have tapped Psalm 27:1 in your print Bible. The goggles recognize the gesture, draw a box around the text in your Bible, and provide all sorts of supplementary material in which you’ve previously expressed interest: photos for some sort of illustration, various commentary and exegetical helps, and cross-references. The digital resources displayed on the desk are interactive, letting you tap and scroll much as you would on a tablet computer. It’s a tablet without a tablet.

Of course, if you have a whole lot of material, there’s no need to limit supplementary material to a desktop; the whole room is available to you:

Holographic text appears on the walls of a room.

This image limits content to walls, but Microsoft’s HoloLens demo shows that the content could just as easily exist as three-dimensional objects in the middle of the room. And while I focus on low-density information displays here, there could easily be hundreds of information cards. Do you want to conduct a keyword search with hundreds of results? You can see all of them at once, all around you, rather than paging through them a few at a time.

Further, holograms give you the opportunity to merge print and digital resources in new ways. Suppose you’re studying Psalm 27:1, as above, and vaguely remember something you read once in one of your books. If you look over at your bookshelves, you might see something like this:

Bookshelves appear behind holographic text showing search results from three books on the shelves.

Here the holographic goggles have identified relevant books for you and show you where they are on your bookshelves, in addition to providing relevant excerpts for you to peruse. (The goggles know the page numbers either because you own the same volume digitally or because you originally read the book with your goggles on, and the goggles remember everything you read, even if you don’t; it’s like a super-Evernote.) The goggles surface passages related to the verse you’re reading and even remember passages you’ve highlighted (the yellow lines in the image). You can interact further with the holograms, looking through more search results, perhaps, or you can pull one of the books off the shelf and physically peruse it.

Finally, and most obviously, holograms push the 3D models, timelines, and maps that are now study-Bible staples into new dimensions of interactivity. They can literally pop off the page and expand into space, letting you manipulate them in ways that are impossible in the 2D space of a screen.

Holographic technology neatly sidesteps several limitations of current digital Bible study and could usher in widespread, transformative, digitally assisted Bible study. Or the goggles may be just too geeky-looking. We’ll have to see.

Photo credits: endyk, Hc_07, 4thglryofgod, worshipbackgrounds, listentothemountains, coloneljohnbritt, 4thglryofgod, titobalangue, quoteseverlasting, steven_jamesP, nlcwood, netzanette, and williamhook on Flickr. The terrible Photoshopping is all my fault, not theirs.

“Hacking the Bible” in Christianity Today

Thursday, March 6th, 2014

This month’s Christianity Today cover story, “The Bible in the Original Geek,” talks about how programmers are using technology to change how we read, study, and interpret the Bible. If you’re interested in the Bible and technology (and if you’re reading this blog, you probably are), then you should go read it.

(Ted Olsen, the author of the article, also did an AMA on Reddit about the article.)

The article talks about the “academic priesthood,” and I think it’s particularly interesting that so few universities are interested in “digital theology” (for lack of a better term). You can study at Durham (like John Dyer is) or King’s College London, or you can try to work a biblical emphasis into a digital rhetoric Ph.D. But I’m surprised that more institutions, especially evangelical seminaries, aren’t at the forefront of the kind of research described in the article.

Religious Interest among Facebook Users

Thursday, April 25th, 2013

I have a post on the Bible Gateway blog that briefly looks at how religious interest among Facebook users varies with age.

In particular, eighteen-year-old women appear to have an especially strong interest in religion, which drops off sharply during their 20s. (Barna in 2003 published findings that corroborate the dropoff.)

The post makes some possibly unwarranted inferences from the original data published yesterday by Stephen Wolfram:

quotes + life philosophy data by age

How to Train Your Franken-Bible

Saturday, March 16th, 2013

This is the outline of a talk that I gave earlier today at the BibleTech 2013 conference.

  • There are two parts to this talk
    • Inevitability of algorithmic translations. An algorithmic translation (or “Franken-Bible”) means a translation that’s at least partly done by a computer.
    • How to produce one with current technology.

  • The number of English translations has grown over the past five centuries and has accelerated recently. This growth mirrors the overall trend in publishing as costs have diminished and publishers have proliferated.
  • I argue that this trend of ever-more translations will not only continue but will accelerate, that we’re heading for a post-translation world, where the number of English translations becomes so high, and the market so fragmented, that Bible translations as distinct identities will have much less meaning than they do today.
    • This trend isn’t specific to Bibles, but Bible translations participate in the larger shift to more diverse and fragmented cultural expressions.

  • Like other media, Bible translations are subject to a variety of pressures.
    • Linguistic (e.g., as English becomes more gender-inclusive, pressure rises on Bible translations to also become more inclusive).
    • Academic (discoveries that shed new light on existing translations). My favorite example is Matthew 17:15, where some older translations say, “My son is a lunatic,” while newer translations say, “My son has epilepsy.”
    • Theological / doctrinal (conform to certain understandings or agendas).
    • Social (decline in public religious influence leads to a loss of both shared stories and religious vocabulary).
    • Moral (pressure to address whatever the pressing issue of the day is).
    • Institutional (internally, where Bible translators want control over their translations; and externally, with the wider, increasing distrust of institutions).
    • Market. Bible translators need to make enough money (directly or indirectly) to sustain translation operations.
    • Technological. Technological pressure increases the variability and intensity of other pressures and is the main one we’ll be investigating today.

  • If you’re familiar with Clayton Christensen’s The Innovator’s Dilemma, you know that the basic premise is that existing dominant companies are eventually eclipsed by newer companies who release an inferior product at low prices. The dominant companies are happy to let the new company operate in this low-margin area, since they prefer to focus on their higher-margin businesses. The new company then steadily encroaches on the territory previously staked out by the existing companies until the existing companies have a much-diminished business. Eventually the formerly upstart company becomes the incumbent, and the cycle begins again.
    • One of the main drivers of this disruption is technology, where a technology that’s vastly inferior to existing methods in terms of quality comes at a much lower price and eventually supersedes existing methods.
  • I argue that English Bible translation is ripe for disruption, and that this disruption will take the form of large numbers of specialized translations that are, from the point of view of Bible translators, vastly inferior. But they’ll be prolific and easy to produce and will eventually supplant existing modern translations.

  • For an analogy, let’s look at book publishing (or any media, really, like news, music, or movies. But book publishing is what I’m most familiar with, so it’s what I’ll talk about). In the past twenty years, it’s gone through two major disruptions.
    • The first is a disruption in distribution, with Amazon and other web retailers. National bookstore chains consolidated or folded as they struggled to figure out how to compete with the lower prices and wider selection offered online.
    • This change hurt existing retailers but didn’t really affect the way that content creators like publishers and authors did business. From their perspective, selling through Amazon isn’t that different from selling through Barnes & Noble.
    • The second change is more disruptive to content creators: this change is the switch away from print books to ebooks. At first, this change seems more like a difference in degree rather than in kind. Again, from a publisher’s perspective, it seems like selling an ebook through Amazon is just a more-convenient way of selling a print book through Amazon.
    • But ebooks actually allow whole new businesses to emerge.

  • I’d argue that these are the main functions that publishers serve for authors–in other words, why would I, as an author, want a publisher to publish my book in exchange for some small cut of the profit?
    • Gatekeeping (by publishing a book, they’re saying it’s worth your time and has a certain level of quality: it’s been edited and vetted).
    • Marketing (making sure that people know about and buy books that are interesting to them).
    • Distribution (historically, shipping print books to bookstores).
  • Ebooks most-obviously remove the distribution pillar of this model–when producing and distributing an epub only involves a few clicks, it’s hard to argue that a publisher is adding a whole lot of value in distribution.
  • That leaves gatekeeping and marketing, which I’ll return to later in the context of Bibles.

  • But beyond just affecting these pillars, ebooks also allow new kinds of products:
    • First, a wider variety of content becomes more economically viable–content that’s too long or too short to print as a book can work great as an ebook, for example.
    • Second, self-publishing becomes more attractive: when traditional publishing shuns you because, say, your book is terrible, just go direct and let the market decide just how terrible it is. And if no one will buy it, you can always give it away–it’s not like it’s costing you anything.
  • So ebooks primarily allow large numbers of low-quality, low-priced books into the market, which fits the definition of disruption we talked about earlier.

  • Let’s talk specifically about Bible translations.
  • Traditionally, Bible translations have been expensive endeavors, involving teams of dozens of people working over several years.
    • The result is a high-quality product that conforms to the translation’s intended purpose.
    • In return for this high level of quality, Bible publishers charge money to, at a minimum, recoup their costs.
  • What would happen if we applied the lessons from the ongoing disruption in book publishing to Bible translations?
    • First, like Amazon and physical bookstores, we disrupt distribution.
    • Bible Gateway in 1993 on the web first disrupted distribution by letting people browse translations for free.
    • YouVersion in 2010 on mobile then took that disruption a step further by letting people download and own translations for free.
    • But we’re really only talking about a disruption in distribution here. Just like with print books, this type of disruption doesn’t affect the core Bible-translation process.
  • That second type of disruption is still to come and will eventually arrive; I’m going to argue that, as with ebooks, the disruption to translations themselves will be largely technological and will result in an explosion of new translations.
    • I believe that this disruption will take the form of partially algorithmic personalized Bible translations, or Franken-Bibles.
    • Because Bible translation is a specialized skill, these Franken-Bibles won’t arise from scratch–instead, they’ll build on existing translations and will be tailored to a particular audience–either a single individual or a group of people.

  • In its simplest form, a Franken-Bible could involve swapping out a footnote reading for the reading that’s in the main text.

  • A typical Bible has around 1100 footnotes indicating alternate translations. What if a translation allowed you to decide which reading you preferred–either by setting a policy (for example, you might say, “always translate adelphoi in a particular way”) or by deciding on a case-by-case basis?
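
To make the footnote-swapping idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Variant` class, the `{v1}` slot names, and the sample sentence are invented for illustration and don't come from any real translation or tool. It only shows how a per-word policy and a case-by-case override could coexist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One spot where a translation footnotes an alternate reading."""
    source_word: str  # underlying original-language word, e.g. "adelphoi"
    main_text: str    # reading printed in the main text
    footnote: str     # alternate reading given in the footnote


def choose(slot, variant, policy, overrides):
    """Pick a reading: a per-slot override wins over the per-word policy."""
    choice = overrides.get(slot) or policy.get(variant.source_word, "main")
    return variant.footnote if choice == "footnote" else variant.main_text


def render(template, variants, policy=None, overrides=None):
    """Fill each {slot} in the template with the chosen reading."""
    policy = policy or {}
    overrides = overrides or {}
    readings = {
        slot: choose(slot, variant, policy, overrides)
        for slot, variant in variants.items()
    }
    return template.format(**readings)


# Invented sample sentence, not quoted from any real translation.
template = "Therefore, {v1}, stand firm."
variants = {"v1": Variant("adelphoi", "brothers", "brothers and sisters")}

default_text = render(template, variants)
policy_text = render(template, variants, policy={"adelphoi": "footnote"})
override_text = render(
    template, variants, policy={"adelphoi": "footnote"}, overrides={"v1": "main"}
)
```

With no policy, the main-text reading is kept; the policy swaps in the footnote everywhere that word occurs; the override restores the main text at one spot.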

  • By including footnote variants at all, translations have already set themselves up for this approach. Some translations go even further–the Amplified and Expanded Bibles embrace variants by embedding them right in the text. Here we see a more-extensive version of the same idea.
  • But that’s the simplest approach. A more-radical approach, from a translation-integrity perspective, would allow people to edit the text of the translation itself, not merely to choose among pre-approved alternatives at given points.
    • In many ways, the pastor who says in a sermon, “This verse might better be translated as…” is already doing this; it just isn’t propagated back into the translation.
    • People also do this on Twitter all the time, where they alter a few words in a verse to make it more meaningful to them.
    • The risk here, of course, is that people will twist the text beyond all recognition, either through incompetence or malice.

  • That risk brings us back to one of the other two functions that publishers serve: gatekeeping.
    • A translation is nothing if not gatekeeping: a group of people have gotten together and declared, “This is what Scripture says. These are our translation principles, and here’s where we stand in relation to other translations.”
    • What happens to gatekeeping–to the seal of authority and trust–when anyone can change the text to suit themselves?

  • In other words, what happens if the “Word of God” becomes the “Wiki of God?”
    • After all, people who aren’t interested in translating their own Bible still want to be able to trust that they’re reading an accurate, or at least non-heretical, translation.

  • I suggest that new axes of trust will form. Whereas today a translation “brand”–NIV, ESV, or whatever–carries certain trust signals, those signals will shift to new parties, and in particular to groups that people already trust.
    • The people you’d trust to steward a Bible translation probably aren’t that different from the people you already trust theologically, a group that probably includes some of the following:
      • Social network.
      • Teachers or elders in your church.
      • Pastors in your church.
      • Your denomination.
      • More indirectly, a megachurch pastor you trust.
      • A parachurch organization or other nonprofit.
      • Maybe even a corporation.
    • The point is not that trust in a given translation would go away, but rather that trust would become more networked, complicated, and fragmented, before eventually solidifying.
      • We already see this happening somewhat with existing Bible translations, where certain groups declare some translations as OK and others as questionable. Like your choice of cellphone, the translation you use becomes an indicator of group identity.
      • The proliferation of translations will allow these groups who are already issuing imprimaturs to go a step further and advance whole translations that fit their viewpoints.
      • In other words, they’ll be able to act as their own Magisteriums. Their own self-published Magisteriums.
      • This, by the way, also addresses the third function that publishers serve: marketing. By associating a translation with a group you already identify with, you reduce the need for marketing.

  • I said earlier that I think technology will bring about this situation, but there are a couple ways it could happen.
    • First, an existing modern translation could open itself up to such modifications. A church could say, “I like this translation except for these ten verses.” Or, “This translation is fine except that it should translate this Greek word this way instead of this other way.”
    • Such a flexible translation could provide the tools to edit itself–with enough usage, it’s possible that useful improvements could be incorporated back into the original translation.
  • A second alternative is to use technology to produce a new, cheap, low-quality translation with editing in mind from the get-go, to provide a base from which a monstrous hydra of translations can grow. Let’s take a look at what such a hydra translation, or a Franken-Bible, could look like.

  • The basic premise is this: there are around thirty modern, high-quality translations of the Bible into English. Can we combine these translations algorithmically into something that charts the possibility space of the original text?
    • Bible translators already consult existing translations to explore nuances of meaning. What I propose is to consult these translations computationally and lay bare as many nuances as possible.

  • You can explore the output of what I’m about to discuss on the Adaptive Bible website. I’m going to talk about the process that went into creating the site.
  • This part of the talk is pretty technical.

  • In all, there are fifteen steps, broken into two phases: alignment of existing translations and generation of a new translation.
  • It took about ten minutes of processor time for each verse to produce the result.
  • The total cost in server time on Amazon EC2 to translate the New Testament was about $10. Compared to the millions of dollars that a traditional translation costs, that’s a big savings–five or six orders of magnitude.
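
The “five or six orders of magnitude” figure is just a base-10 logarithm of the cost ratio. A quick check, assuming that “millions of dollars” means somewhere between $1 million and $10 million (those bounds are my assumption; the $10 figure is the one quoted above):

```python
import math

ALGORITHMIC_COST = 10  # dollars of EC2 time, from the talk
# "Millions of dollars" is not pinned down in the talk; these two bounds are
# assumptions chosen only to bracket the claim.
TRADITIONAL_LOW = 1_000_000
TRADITIONAL_HIGH = 10_000_000


def orders_of_magnitude(expensive, cheap):
    """Base-10 logarithm of the cost ratio."""
    return math.log10(expensive / cheap)


low = round(orders_of_magnitude(TRADITIONAL_LOW, ALGORITHMIC_COST), 2)
high = round(orders_of_magnitude(TRADITIONAL_HIGH, ALGORITHMIC_COST), 2)
```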

  • The first phase is alignment.
  • First step. Collect as many English translations as possible. Obviously there are copyright implications with doing that, so it’s important to deal with only one verse at a time, which is something that all translations explicitly allow. For this project, we used around thirty translations.
  • Second. Normalize the text as much as possible. For this project, we’re not interested in any formatting, for example, so we can deal with just the plain text.
  • Third. Tokenize the text and run basic linguistic analysis on it.
    • Off-the-shelf open-source software from Stanford called Stanford CoreNLP tokenizes the text, identifies lemmas (base forms of words) and analyzes how words are related to each other syntactically.
    • In general, it’s about 90% accurate, which is fine for our purposes; we’ll be trying to enhance that accuracy later.
  • Fourth. Identify Wordnet similarities between translations.
    • Wordnet is a giant database of word meanings that computers can understand.
    • We take the lemmas from step 3 and identify how close in meaning they are to each other. The thinking is that even when translations use different words for the same underlying Greek word, the words they choose will at least be similar in meaning.
    • For this step, we used Python’s Natural Language Toolkit.
  • Fifth. Run an off-the-shelf translation aligner.
    • We used another open-source program called the Berkeley Aligner, which is designed to use statistics to align content between different languages. But it works just as well for different translations of the same content in the same language. It takes anywhere from two to ten minutes for each verse to run.
  • Sixth. Consolidate all this data for future processing.
    • By this point, we have around 4MB of data for each verse, so we consolidate it into a format that’s easy for us to access in later steps.
  • Seventh. Run a machine-learning algorithm over the data to identify the best alignment between single words in each pair of translations.
    • We used another Python module, scikit-learn, to execute the algorithm.
    • In particular, we used Random Forest, which is a supervised-learning system. That means we need to feed it some data we know is good so that it can learn the patterns in the data.
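
As a rough illustration of the second and third steps (normalizing and tokenizing), here is a standard-library sketch. It is a stand-in, not what the project ran: Stanford CoreNLP did the real tokenizing and also produced lemmas and syntactic relations, none of which this attempts. The bracketed footnote-marker convention and the sample verse text are assumptions made for the example.

```python
import re
import unicodedata

# Map typographic quotes to ASCII and dashes to spaces (an assumed convention).
_PUNCT = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
    "\u2013": " ", "\u2014": " ",
})
# A word (optionally with internal apostrophes) or a single non-space symbol.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*|[^\sA-Za-z]")


def normalize(verse):
    """Reduce a verse to plain text: no footnote markers, no odd spacing."""
    text = unicodedata.normalize("NFKC", verse).translate(_PUNCT)
    text = re.sub(r"\[[^\]]*\]", " ", text)  # drop markers such as [a]
    return re.sub(r"\s+", " ", text).strip()


def tokenize(verse):
    """Split a normalized verse into word and punctuation tokens."""
    return TOKEN_RE.findall(normalize(verse))


# Sample text adapted for the example, with a made-up footnote marker.
sample = "The LORD is my light[a] and my salvation\u2014whom shall I fear?"
tokens = tokenize(sample)
```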

  • Where did we get this good data? We wrote a simple drag-and-drop aligner to feed the algorithm, where there are two lists of words and you drag them on top of each other if they match; it’s actually kind of fun: if you juiced it up a little, I can totally see it becoming a game called “Translations with Friends.”
    • In total, we hand-aligned around 30 pairs of translations across 25 verses. That’s a tiny fraction of the roughly 8,000 verses in the New Testament, so the algorithm doesn’t need a lot of training to get good results.
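
A supervised learner needs both positive and negative examples. One plausible way to turn a hand alignment into training labels (a sketch under my own assumptions, with a hypothetical `labeled_pairs` helper; the talk doesn't describe the actual data format) is to treat every dragged-together word pair as a match and every other pairing in the verse as a non-match:

```python
from itertools import product


def labeled_pairs(words_a, words_b, aligned):
    """Label every word pairing between two translations of one verse.

    `aligned` is the set of (index_in_a, index_in_b) pairs a person matched
    by hand; every other pairing becomes a negative example.
    """
    return [
        ((i, j), (i, j) in aligned)
        for i, j in product(range(len(words_a)), range(len(words_b)))
    ]


# Invented two-word example.
words_a = ["Jesus", "said"]
words_b = ["he", "replied"]
examples = labeled_pairs(words_a, words_b, aligned={(0, 0), (1, 1)})
positives = [pair for pair, is_match in examples if is_match]
```

Because every pairing in a verse yields an example, even a few dozen hand-aligned verse pairs produce a large number of labeled rows.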

  • What the algorithm actually runs on is a big vector matrix. These are the ten factors we included in our matrix.
    • 1. One translation might begin a verse with the words “Jesus said,” while another might put that same phrase at the end of the verse. All things being equal, though, translations tend to put words in similar positions in the verse. When all else fails, it’s worth taking position into account.
    • 2. Similarly, even when translations rearrange words, they’ll often keep them in the same sentence. Again, all things being equal, it’s more likely that the same word will appear in the same sentence position across translations.
    • 3. If we know that a particular word is in a prepositional phrase, for example, it’s not unlikely that it will serve a similar grammatical role in another translation.
    • 4. If words in different translations are both nouns or both verbs, it’s more likely that they’re translating the same word than if one’s a noun and another’s an adverb.
    • 5. Here we use the output from the Berkeley Aligner we ran earlier. The aligner is bidirectional, so if we’re comparing the word “Jesus” in one translation with the word “he” in another, we look both at what the Berkeley Aligner says “Jesus” should line up with in one translation and with what “he” should line up with in the other translation. It provides a fuller picture than just going in one direction.
    • 6. Here we go more general. Even if the Berkeley Aligner didn’t match up “Jesus” and “he” in the two translations we’re currently looking at, if other translations use “he” and the Aligner successfully aligned them with “Jesus”, we want to take that into account.
    • 7. This is similar to grammatical context but looks specifically at dependencies, which describe direct relationships between words. For example, if a word is the subject of a sentence in one translation, it’s likely to be the subject of a sentence in another translation.
    • 8. Wordnet similarity looks at the similarities we calculated earlier–words with similar meanings are more likely to reflect the same underlying words.
    • 9. This step strips out all words that aren’t nouns, pronouns, adjectives, verbs, and adverbs and compares their sequence–if a different word appears between two identical words across translations, there’s a good chance that it means the same thing.
    • 10. Finally, we look at any dependencies between major words; it’s a coarser version of what we did in #7.
    • The end result is a giant matrix of data–ten vectors for every word-combination in every translation in every verse–and we run our machine-learning algorithm on it, which produces an alignment between every word in every translation.
    • At this point, we’ve generated between 50 and 250MB of data for every verse.
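
To show what one row of that matrix might look like, here is a sketch that computes a handful of the factors for a single word pair: position in the verse, same sentence, same part of speech, and the two directions of the off-the-shelf aligner. The `Token` fields, the feature encoding, and the aligner link format are all assumptions for illustration; the real project used ten factors and its own representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    index: int     # 0-based position in the verse
    sentence: int  # 0-based sentence number within the verse
    pos: str       # coarse part of speech, e.g. "NOUN", "VERB", "PRON"


def relative_position(token, verse_length):
    """Scale a token's position to the range 0..1."""
    return token.index / (verse_length - 1) if verse_length > 1 else 0.0


def pair_features(a, len_a, b, len_b, forward_links, backward_links):
    """Feature vector for one candidate alignment between tokens a and b.

    forward_links holds (index_in_a, index_in_b) pairs from aligning A to B;
    backward_links holds (index_in_b, index_in_a) pairs from aligning B to A.
    """
    closeness = 1.0 - abs(relative_position(a, len_a) - relative_position(b, len_b))
    return [
        closeness,                                             # similar position
        1.0 if a.sentence == b.sentence else 0.0,              # same sentence
        1.0 if a.pos == b.pos else 0.0,                        # same part of speech
        1.0 if (a.index, b.index) in forward_links else 0.0,   # aligner, A to B
        1.0 if (b.index, a.index) in backward_links else 0.0,  # aligner, B to A
    ]


jesus = Token("Jesus", 0, 0, "NOUN")
he = Token("he", 0, 0, "PRON")
row = pair_features(jesus, 4, he, 4, forward_links={(0, 0)}, backward_links=set())
```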

  • Eighth. Now that we have the direct alignment, we supplement it with indirect alignment data across translations. In other words, to reuse our earlier example, the alignment between two translations may not align “Jesus” and “he,” but alignments in other translations might strongly suggest that the two should be aligned.
  • At this point, we have a reasonable alignment among all the translations. It’s not perfect, but it doesn’t have to be. Now we shift to the second phase: generating a range of possible translations from this data.
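
The indirect-alignment idea in the eighth step can be sketched as counting “bridges”: third translations in which some word aligns both with the word in the first translation and with the word in the second. The data layout (a dict of index-pair sets keyed by translation pair) and the function names are hypothetical; how the project weighted this evidence isn’t described in the talk.

```python
def links(alignments, x, y):
    """Return the (index_in_x, index_in_y) links, whichever order was stored."""
    if (x, y) in alignments:
        return alignments[(x, y)]
    return {(j, i) for i, j in alignments.get((y, x), set())}


def indirect_support(alignments, a, i, b, j, translations):
    """Count third translations that bridge word i in a to word j in b."""
    support = 0
    for c in translations:
        if c in (a, b):
            continue
        reached_from_a = {k for i2, k in links(alignments, a, c) if i2 == i}
        reaching_b = {k for k, j2 in links(alignments, c, b) if j2 == j}
        if reached_from_a & reaching_b:
            support += 1
    return support


# Invented alignments among four translations; A and B are never linked directly.
alignments = {
    ("A", "C"): {(0, 0)},
    ("C", "B"): {(0, 1)},
    ("A", "D"): {(0, 2)},
    ("B", "D"): {(1, 2)},
}
support = indirect_support(alignments, "A", 0, "B", 1, ["A", "B", "C", "D"])
```

Here both C and D bridge word 0 of A to word 1 of B, so the pair gets a support count of 2 even though no direct link exists.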

  • First. Consolidate alignments into phrases, where we look for runs of parallel words. You can see that we’re only looking at lemmas here–dealing with every word creates a lot of noise that doesn’t add much value, so we ignore the less-important words. In this case, the first two have identical phrases even though the words differ slightly, while the third structures the sentence differently.
  • Second. Arrange translations into clusters based on how similar they are to each other structurally. In this example, the first two form a cluster, and the third would be part of a different cluster.
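Here is one way the clustering step might look in Python. The similarity measure (Jaccard overlap of phrase sets), the 0.5 threshold, and the greedy grouping are all assumptions chosen to keep the sketch short; the talk doesn’t specify which clustering method the program actually uses.

```python
def jaccard(phrases_a, phrases_b):
    """Overlap between two collections of phrases, from 0.0 to 1.0."""
    a, b = set(phrases_a), set(phrases_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_translations(phrases_by_translation, threshold=0.5):
    """Greedily group translations whose phrase sets are similar enough.

    Each translation joins the first cluster whose founding member it
    resembles; otherwise it starts a new cluster.
    """
    clusters = []
    for name, phrases in phrases_by_translation.items():
        for members in clusters:
            founder = phrases_by_translation[members[0]]
            if jaccard(phrases, founder) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Toy lemma phrases: T1 and T2 share a structure, T3 orders things differently.
phrases = {
    "T1": [("jesus", "say"), ("be", "heal")],
    "T2": [("jesus", "say"), ("be", "heal")],
    "T3": [("be", "heal", "jesus", "say")],
}
clusters = cluster_translations(phrases)  # [["T1", "T2"], ["T3"]]
```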

  • Third. Insert actual keyword text. I’ve been using words in the examples I’ve been giving, but in the actual program, we use numerical ids assigned to each word. Here we start to introduce actual words.
  • Fourth. Fill in the gaps between keywords. We add in small words like conjunctions and prepositions that are key to producing recognizable English.
  • Fifth. Add in punctuation. Up to this point, we’ve been focusing on the commonalities among translations. Now we’re starting to focus on differences to produce a polished output.
  • Sixth. Reduce the possibility space to accept only valid bigrams. “Bigrams” just means two words in a row. We remove any two-word combinations that, based on our algorithm thus far, look like they should work but don’t. We check each pair of words to see whether they exist anywhere in one of our source translations. If they don’t, we get rid of them.
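The bigram check in the sixth step is simple enough to sketch directly. The function names and the lowercase, whitespace-based tokenizing are assumptions for this example, and the sentences are invented rather than taken from any real translation.

```python
def bigrams(tokens):
    """Return the set of adjacent word pairs in a token list."""
    return set(zip(tokens, tokens[1:]))


def allowed_bigrams(source_sentences):
    """Collect every bigram that occurs in at least one source sentence."""
    allowed = set()
    for sentence in source_sentences:
        allowed |= bigrams(sentence.lower().split())
    return allowed


def is_valid(candidate, allowed):
    """A candidate passes only if every one of its bigrams was seen in a source."""
    return bigrams(candidate.lower().split()) <= allowed


# Invented source sentences standing in for the source translations.
sources = ["he went up on the mountain", "he climbed up the hill"]
allowed = allowed_bigrams(sources)

ok = is_valid("he went up the hill", allowed)   # True: every pair appears somewhere
bad = is_valid("he went the hill", allowed)     # False: "went the" never appears
```

Note that the first candidate passes even though no single source contains it; each pair of neighboring words only has to appear somewhere in the sources.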

  • Seventh. Produce rendered output.

  • In this case, the output is just for the Adaptive Bible website. It shows the various translation possibilities for each verse.
    • Hovering over a reading shows what the site thinks are valid next words based on what you’re hovering over. (That’s the yellow.)
    • You can click a particular reading if you think it’s the best one, and the other readings disappear. Clicking again restores them. (That’s the green.)
    • The website shows a single sentence structure that it thinks has the best chance of being valid, but most verses have multiple valid structures that we don’t bother to show here.

  • To consider a verse successfully translated, this process has to produce readings supported by two independent translation streams (e.g., having a reading supported only by ESV and RSV doesn’t count because ESV is derived from RSV).
    • Using this metric, the process I’ve described produces valid output for 96% of verses in the New Testament.
    • On the current version of the site, I use stricter criteria, so only 91% of verses show up.
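The “two independent translation streams” test can be expressed as a small lookup. Only the ESV/RSV relationship comes from the description above; the stream labels, the placeholder translations X and Y, and the function names are hypothetical.

```python
# Map each translation to the stream (family) it belongs to.
# ESV and RSV share a stream because ESV is derived from RSV;
# "X" and "Y" are placeholder translations invented for this sketch.
STREAM_OF = {
    "ESV": "rsv-stream",
    "RSV": "rsv-stream",
    "X": "stream-x",
    "Y": "stream-y",
}


def supporting_streams(translations, stream_of=STREAM_OF):
    """Return the distinct streams behind a set of supporting translations."""
    return {stream_of[name] for name in translations}


def is_independently_supported(translations, minimum=2, stream_of=STREAM_OF):
    """A reading counts only if at least `minimum` distinct streams support it."""
    return len(supporting_streams(translations, stream_of)) >= minimum


same_family = is_independently_supported({"ESV", "RSV"})  # False: one stream
mixed = is_independently_supported({"ESV", "X"})          # True: two streams
```

Raising `minimum` would be one way to apply stricter criteria, though the stricter rules actually used on the site aren’t specified here.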

  • Limitations
    • Just because a verse passes the test, that doesn’t mean it’s actually grammatical, and it certainly doesn’t mean that every alternative presented within a verse is valid.
    • Because we use bigrams for validity, we can get into situations like what you see here, where all these are valid bigrams, but the result (“Jesus said, ‘Be healed,’ Jesus said”) is ridiculous.
    • There’s no handling of inter-verse transitions; even if a verse is totally valid, it may not read smoothly into the next verse.
    • Since we removed all formatting at the beginning of the process, there’s no formatting.
  • Despite those limitations, the process produced a couple of mildly interesting byproducts.

  • Probabilistic Strong’s-to-WordNet sense alignment. Given a single Strong’s alignment and a variety of translations, we can explore the semantic range of a Strong’s number. Here we have dunamis. This seems like a reasonably good approximation of its definition in English.

  • Identifying translation similarity. This slide explores how structurally similar translations are to each other, based on the phrase clusters we produced. The results are pretty much what I’d expect: translations that are derived from each other tend to be similar to each other.

  • What I’ve just described is one pretty basic approach to what I think is inevitable: the explosion of translations into Franken-Bibles as technology gets better. In the future, we won’t be talking about particular translations anymore but rather about trust networks.
  • To be clear, I’m not saying that I think this development is a particularly great one for the church, and it’s definitely not good for existing Bible translations. But I do think it’s only a matter of time until Franken-Bibles arrive. At first they’ll be unwieldy and ridiculously bad, but over time they’ll adapt and improve, and they’ll need to be taken seriously.

Rise of the Robosermon

Sunday, April 29th, 2012

In a recent issue of Wired, Steven Levy writes about Narrative Science, a company that uses data to write automated news stories. Right now, they mostly deal in data-intensive fields like sports and finance, but the company is confident that it will easily expand into other areas—the company’s co-founder even predicts that an algorithm will win a Pulitzer Prize in the next five years.

In February 2012, I attended a session at the TOC Conference given in part by Kristian Hammond, the CTO and co-founder of Narrative Science. During the session, Hammond mentioned that sports stories have a limited number of angles (e.g., a “blowout win” or a “come-from-behind victory”)—you can probably sit down and think up a fairly comprehensive list in short order. Even in fictional sports stories, writers only use around sixty common tropes as part of the narrative. Once you have an angle (or your algorithm has decided on one), you just slot in the relevant data, add a little color commentary, and you have your story.

At the time, I was struggling to understand how automated content could apply to Bible study; Levy’s article leads me to think that robosermons, or sermons automatically generated by a computer program, are the way of the future.

Parts of a Robosermon

Futurama has a robot preacher. I've never seen these episodes, so hopefully this image isn't terribly heretical.

After all, from a data perspective, sermons don’t differ much from sports stories. In particular, they have three components:

First, as with sports stories, sermons follow predictable structures and patterns. David Schmitt of Concordia Theological Seminary suggests a taxonomy of around thirty sermon structures. Even if this list isn’t comprehensive, it would probably take, at most, 100 to 200 structures to categorize nearly all sermons.

Second, sermons deal with predictable content: whereas sports have box scores, sermons have Bible texts and topics. A sermon will probably deal with a passage from the Bible in some way, and the 31,000 verses in the Bible comprise a large but manageable set of source material (especially since most sermons involve a passage, not a single verse; you can probably cut this list down to around 2,000 sections). Topically, one online sermon collection lists only 500 sermon topics in its database of 120,000 sermons. The power-law popularity distribution (i.e., the 80/20 rule) of verses preached on (there are 1,200 sermons on John 1 compared to seven on Numbers 35) and topics (1,400 sermons on “Jesus’ teachings” vs. four on “morning”) means that you can categorize most sermons using a small portion of the available possibilities.

Third, sermons generally involve illustrations or stories, much like the color commentary of sports stories. Finding raw material for illustrations shouldn’t present a problem to a computer program; a quick search on Amazon turns up 1,700 books on sermon illustrations and an additional 10,000 or so on general anecdotes. You can probably extract hundreds of thousands of illustrations from just these sources. Alternately, if a recent news story relates to your topic, the system can add the relevant parts to your sermon with little trouble (especially if a computer wrote the news story to begin with).


You end up being able to say, “I want to preach a sermon on Philippians 2 that emphasizes Christ’s humility as a model for us.” Then—and here’s the part that doesn’t exist yet but that technology like Narrative Science’s will provide—an algorithm suggests, say, an amusing but poignant anecdote to start with, followed by three points of exegesis, exhortation, and application, and finishing with a trenchant conclusion. You tweak the content a bit, throwing in a shout-out to a behind-the-scenes parishioner who does a lot of work but rarely receives recognition, and call it done.

Why limit sermons to pastors, though? Why shouldn’t churchgoers be able to ask for custom sermons that fit exactly their circumstances? “I’d like a ten-part audio sermon series on Revelation from a dispensational perspective where each sermon exactly fits the length of my commute.” “Give me six weeks of premarital devotions for my boyfriend and me. I’ve always been a fan of Charles Spurgeon, so make it sound like he wrote them.”

Levy opens his Wired article with an anecdote about how grandparents would find articles about their grandchildren’s Little League games just as interesting as “anything on the sports pages.” He doesn’t mention that what they really want is a recap with their grandchild as the star (or at least as a strong supporting character—it’s like one of those children’s books where you customize the main character’s name and appearance). Robosermons let you tailor the sermon’s content so that your specific problems or questions form the central theme.

The logical end of this technology is a sermonbot that develops a following of eager listeners and readers, in the same way that an automated newspaper reporter would create fans on its way to winning a Pulitzer.

You may argue that robosermons diminish the role of the Holy Spirit in preparing sermons, or that they amount to plagiarism. I’m not inclined to disagree with you.


Building a robosermon system involves five components: (1) sermon structures; (2) Bible verses; (3) topics; (4) illustrations; and (5) technology like Narrative Science’s to put everything together coherently. It would also be helpful to have (6) a large set of existing sermons to serve as raw data. It’s a complicated problem but hardly an insurmountable one over the next ten years, should someone want to tackle it.

I’m not sure they should; that way lies robopologetics and robovangelism.

If you’re not an algorithm and you want to know how to prepare and deliver a sermon, I suggest listening to this 29-part course on preaching by Bryan Chapell at Biblical Training. It’s free and full of homiletic goodness.

The Topical Index and the Living Index

Monday, January 2nd, 2012

The New York Times writes about the first-ever topical index for the Talmud. It looks like a topical Bible and contains 6,600 topics and 27,000 subtopics. (For comparison, Nave’s Topical Bible contains about 5,300 topics and 20,000 subtopics.)

The first page of the Talmud topical index shows entries for Aaron (seven subtopics), Abandon (five subtopics), and Abba (part of two subtopics).

The Sabbath

Two parts of the article stand out. First:

For three decades, Talmud students have been able to use a Nexis-like CD search engine, the Responsa Project, created by Bar Ilan University in Israel…. Bar Ilan officials acknowledged that the CD had one major disadvantage: it cannot be accessed on the Sabbath, when much learning takes place. It also costs $790.

Of course it makes sense that you wouldn’t be able to use a digital study tool on the Sabbath; it had just never occurred to me. The evangelical analog might be having to use a print Bible in church instead of a mobile or projected version.

The Living Index

The second highlight from the article is:

Rabbi Benjamin Blech, professor of Talmud at Yeshiva University, said the rabbis believed that study should not be made too easy. “We want people to struggle with the text because by figuring it out you will have a deeper comprehension,” he said. “They wanted a living index, not a printed index.”

Bible software, websites, and apps are all working to create a “living index” (or at least a “responsive index”) of the Bible that lets you find comprehensive answers to every question that pops into your head while studying the Bible. But will this work devalue the actual experts (pastors and Bible teachers) who currently serve as living indexes?

The book The Lean Startup provides a framework for answering this question. It quotes Farbood Nivi, founder of test-preparation website Grockit:

“Whether you’re studying for the SAT or you’re studying for algebra, you study in one of three ways. You spend some time with experts, you spend some time on your own, and you spend some time with your peers. Grockit offers these three same formats of studying. What we do is we apply technology and algorithms to optimize those three forms.”

Or, in evangelical terms:

Study Type | Current Source
Time on your own | Personal Bible study; daily quiet time
Time with your peers | Small group Bible study; Sunday school (not taught by clergy)
Time with experts | Sunday morning sermon; radio and television programs; in-person academic Bible classes

Bible software has historically augmented “time on your own” by tying together study materials: connecting documents to each other.

Recently, Bible software has expanded into “time with your peers” by mixing in the social layer that is enveloping the wider world of technology—“away from a web that connects documents together to a web that connects people together,” as Paul Adams puts it in his book Grouped. Bible software has three options when embedding social technologies: (1) Inject technology into existing offline practices (e.g., automating the irritating or the expensive); (2) Copy technology from secular sources (e.g., foursquare-style checkins); or (3) Come up with something new. The most likely outcome will involve a combination of these options.

Eventually, Bible software will delve into “time with experts,” as well, whether those experts are your local pastor or nationally recognized figures. Biblical Training is pioneering this approach in the field of Bible studies and theology, while Stanford and MIT are leading the way in other fields.

Will Bible software someday become an “expert” itself, giving you custom answers to your questions and a personal study plan? It’s certainly possible. Khan Academy is already disrupting math education and assessment. Someone will undoubtedly explore whether a similar approach works for Bible studies.

Of course, the open question is just how much we want Bible software to function as a living index; the rabbis who preferred that students “struggle” with text have a point that learning and wisdom come with effort. In a future where Bible software can provide time for you to study on your own, with peers, and with experts, I guess we’ll find out just how “easy” we want Bible study to become.

Apologetics and Anti-Apologetics Apps

Friday, July 2nd, 2010

The New York Times discusses the latest iPhone trend: apps that provide talking points for the existence of God, pro and con.

In a dozen new phone applications, whether faith-based or faith-bashing, the prospective debater is given a primer on the basic rules of engagement — how to parry the circular argument, the false dichotomy, the ad hominem attack, the straw man — and then coached on all the likely flashpoints of contention….

Users can scroll from topic to topic to prepare themselves or, in the heat of a dispute, search for the point at hand — and the perfect retort.

I expect that, eventually, we’ll just let our phones argue the basic questions of existence among themselves, and then they’ll let us know what they’ve decided.

A screenshot from the LifeWay Fast Facts app

“Printing” 3D Maps and Bible Objects

Thursday, December 13th, 2007

The Wall Street Journal yesterday had an article about 3D printers turning web creations into physical models (the link may or may not work for you). Here’s the paragraph that got my attention:

A World of Warcraft figurine created by a 3D printer.

In Redmond, Wash., a start-up called 3D Outlook Corp. this month will begin using software from NASA to sell 3-D models of mountains and other terrain priced at under $100, says Tom Gaskins, the company’s chief executive officer. Mr. Gaskins says hikers, resorts and real-estate firms are likely customers for 3-D maps and models that show the topographic contours of ski slopes, golf courses and other landscapes.

So you could, in theory, take Google Earth satellite and elevation data, combine it with the Bible geocoding data, and produce a custom 3D physical model of the Holy Land with key places labeled—or you could focus on one area, like Jerusalem, and show how it changed over time.

Or you could take a 3D model of Herod’s Temple and turn it into a 3D “printout.”

Anything you can represent in three dimensions—for example, the Ark of the Covenant or an ancient synagogue—can become a 3D model. In the future, I imagine technically minded Bible students producing a 3D model of something in the Bible, backed by research, as a final project in a class.

Imagine a 3D recreation of Nineveh in the time of Jonah. Or a complete reconstruction of a partially uncovered artifact found on an archaeological dig. Or a way to recreate a variety of pottery vessels based on location and period (to help archaeology students familiarize themselves with identifying pottery from sherds: maybe you could “print” several vessels, then break them, mix up the remains, and have to identify the location and period of several sherds; it sounds like an interesting exam to me). Or a way for museums to share exact replicas of items in their collections with universities, so students can examine the items more closely than they can the originals. And if you break one? Just print another copy.

Obviously the possibilities extend beyond the realm of biblical studies, but affordable 3D printing opens a lot of intriguing doors. Hopefully it’ll get cheaper and more widespread soon.