
Religious Interest among Facebook Users

April 25th, 2013

I have a post on the Bible Gateway blog that briefly looks at how religious interest among Facebook users varies with age.

In particular, eighteen-year-old women appear to have an especially strong interest in religion, which drops off sharply during their 20s. (Barna in 2003 published findings that corroborate the dropoff.)

The post makes some possibly unwarranted inferences from the original data published yesterday by Stephen Wolfram:

quotes + life philosophy data by age

How to Train Your Franken-Bible

March 16th, 2013

This is the outline of a talk that I gave earlier today at the BibleTech 2013 conference.

  • There are two parts to this talk
    • Inevitability of algorithmic translations. An algorithmic translation (or “Franken-Bible”) means a translation that’s at least partly done by a computer.
    • How to produce one with current technology.

  • The number of English translations has grown over the past five centuries and has accelerated recently. This growth mirrors the overall trend in publishing as costs have diminished and publishers have proliferated.
  • I argue that this trend of ever-more translations will not only continue but accelerate. We’re heading for a post-translation world, where the number of English translations becomes so high, and the market so fragmented, that Bible translations as distinct identities will have much less meaning than they do today.
    • This trend isn’t specific to Bibles, but Bible translations participate in the larger shift to more diverse and fragmented cultural expressions.

  • Like other media, Bible translations are subject to a variety of pressures.
    • Linguistic (e.g., as English becomes more gender-inclusive, pressure rises on Bible translations to also become more inclusive).
    • Academic (discoveries that shed new light on existing translations). My favorite example is Matthew 17:15, where some older translations say, “My son is a lunatic,” while newer translations say, “My son has epilepsy.”
    • Theological / doctrinal (conform to certain understandings or agendas).
    • Social (decline in public religious influence leads to a loss of both shared stories and religious vocabulary).
    • Moral (pressure to address whatever the pressing issue of the day is).
    • Institutional (internally, where Bible translators want control over their translations; and externally, with the wider, increasing distrust of institutions).
    • Market. Bible translators need to make enough money (directly or indirectly) to sustain translation operations.
    • Technological. Technological pressure increases the variability and intensity of other pressures and is the main one we’ll be investigating today.

  • If you’re familiar with Clayton Christensen’s The Innovator’s Dilemma, you know that the basic premise is that existing dominant companies are eventually eclipsed by newer companies that release an inferior product at low prices. The dominant companies are happy to let the new company operate in this low-margin area, since they prefer to focus on their higher-margin businesses. The new company then steadily encroaches on the territory previously staked out by the existing companies until the existing companies have a much-diminished business. Eventually the formerly upstart company becomes the incumbent, and the cycle begins again.
    • One of the main drivers of this disruption is technology, where a technology that’s vastly inferior to existing methods in terms of quality comes at a much lower price and eventually supersedes existing methods.
  • I argue that English Bible translation is ripe for disruption, and that this disruption will take the form of large numbers of specialized translations that are, from the point of view of Bible translators, vastly inferior. But they’ll be prolific and easy to produce and will eventually supplant existing modern translations.

  • For an analogy, let’s look at book publishing. (The same goes for any media, really, like news, music, or movies, but book publishing is what I’m most familiar with, so it’s what I’ll talk about.) In the past twenty years, it’s gone through two major disruptions.
    • The first is a disruption in distribution, with Amazon.com and other web retailers. National bookstore chains consolidated or folded as they struggled to figure out how to compete with the lower prices and wider selection offered online.
    • This change hurt existing retailers but didn’t really affect the way that content creators like publishers and authors did business. From their perspective, selling through Amazon isn’t that different from selling through Barnes & Noble.
    • The second change is more disruptive to content creators: this change is the switch away from print books to ebooks. At first, this change seems more like a difference in degree rather than in kind. Again, from a publisher’s perspective, it seems like selling an ebook through Amazon is just a more-convenient way of selling a print book through Amazon.
    • But ebooks actually allow whole new businesses to emerge.

  • I’d argue that these are the main functions that publishers serve for authors–in other words, why would I, as an author, want a publisher to publish my book in exchange for some small cut of the profit?
    • Gatekeeping (by publishing a book, they’re saying it’s worth your time and has a certain level of quality: it’s been edited and vetted).
    • Marketing (making sure that people know about and buy books that are interesting to them).
    • Distribution (historically, shipping print books to bookstores).
  • Ebooks most-obviously remove the distribution pillar of this model–when producing and distributing an epub only involves a few clicks, it’s hard to argue that a publisher is adding a whole lot of value in distribution.
  • That leaves gatekeeping and marketing, which I’ll return to later in the context of Bibles.

  • But beyond just affecting these pillars, ebooks also allow new kinds of products:
    • First, a wider variety of content becomes more economically viable–content that’s too long or too short to print as a book can work great as an ebook, for example.
    • Second, self-publishing becomes more attractive: when traditional publishing shuns you because, say, your book is terrible, just go direct and let the market decide just how terrible it is. And if no one will buy it, you can always give it away–it’s not like it’s costing you anything.
  • So ebooks primarily allow large numbers of low-quality, low-priced books into the market, which fits the definition of disruption we talked about earlier.

  • Let’s talk specifically about Bible translations.
  • Traditionally, Bible translations have been expensive endeavors, involving teams of dozens of people working over several years.
    • The result is a high-quality product that conforms to the translation’s intended purpose.
    • In return for this high level of quality, Bible publishers charge money to, at a minimum, recoup their costs.
  • What would happen if we applied the lessons from the ongoing disruption in book publishing to Bible translations?
    • First, like Amazon and physical bookstores, we disrupt distribution.
    • Bible Gateway in 1993 on the web first disrupted distribution by letting people browse translations for free.
    • YouVersion in 2010 on mobile then took that disruption a step further by letting people download and own translations for free.
    • But we’re really only talking about a disruption in distribution here. Just like with print books, this type of disruption doesn’t affect the core Bible-translation process.
  • That second type of disruption is still to come. I’m going to argue that, as with ebooks, the disruption to translations themselves will be largely technological and will result in an explosion of new translations.
    • I believe that this disruption will take the form of partially algorithmic personalized Bible translations, or Franken-Bibles.
    • Because Bible translation is a specialized skill, these Franken-Bibles won’t arise from scratch–instead, they’ll build on existing translations and will be tailored to a particular audience–either a single individual or a group of people.

  • In its simplest form, a Franken-Bible could involve swapping out a footnote reading for the reading that’s in the main text.

  • A typical Bible has around 1100 footnotes indicating alternate translations. What if a translation allowed you to decide which reading you preferred, either by setting a policy (for example, you might say, “always translate adelphoi in a particular way”) or by deciding on a case-by-case basis?
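
As a rough sketch of the footnote-swapping idea (not code from any existing translation or site), a verse can be stored as segments, some of which carry footnote alternates. The `Segment` class, the `render` function, and the sample wording are all hypothetical; the wording is illustrative and not quoted from a particular translation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One stretch of verse text; `alternates` holds its footnote readings."""
    text: str
    source_word: Optional[str] = None  # transliterated original word, if tagged
    alternates: List[str] = field(default_factory=list)


def render(segments, policy=None, choices=None):
    """Build the verse, preferring case-by-case choices, then the policy.

    policy  maps a source word (e.g. "adelphoi") to a preferred reading.
    choices maps a segment index to a preferred reading.
    A preference is honored only if the translation already offers it,
    either in the main text or in a footnote.
    """
    policy = policy or {}
    choices = choices or {}
    readings = []
    for index, seg in enumerate(segments):
        approved = [seg.text] + seg.alternates
        wanted = choices.get(index, policy.get(seg.source_word))
        readings.append(wanted if wanted in approved else seg.text)
    return " ".join(readings)


verse = [
    Segment("Greet all the"),
    Segment("brothers", source_word="adelphoi",
            alternates=["brothers and sisters"]),
    Segment("with a holy kiss."),
]

print(render(verse))
print(render(verse, policy={"adelphoi": "brothers and sisters"}))
```

Restricting the swap to readings the translators already approved is what keeps this the "simplest form": the reader chooses, but only among options the translation itself put on the table.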

  • By including footnote variants at all, translations have already set themselves up for this approach. Some translations go even further–the Amplified and Expanded Bibles embrace variants by embedding them right in the text. Here we see a more-extensive version of the same idea.
  • But that’s the simplest approach. A more-radical approach, from a translation-integrity perspective, would allow people to edit the text of the translation itself, not merely to choose among pre-approved alternatives at given points.
    • In many ways, the pastor who says in a sermon, “This verse might better be translated as…” is already doing this; it just isn’t propagated back into the translation.
    • People also do this on Twitter all the time, where they alter a few words in a verse to make it more meaningful to them.
    • The risk here, of course, is that people will twist the text beyond all recognition, either through incompetence or malice.

  • That risk brings us back to one of the other two functions that publishers serve: gatekeeping.
    • A translation is nothing if not gatekeeping: a group of people have gotten together and declared, “This is what Scripture says. These are our translation principles, and here’s where we stand in relation to other translations.”
    • What happens to gatekeeping–to the seal of authority and trust–when anyone can change the text to suit themselves?

  • In other words, what happens if the “Word of God” becomes the “Wiki of God?”
    • After all, people who aren’t interested in translating their own Bible still want to be able to trust that they’re reading an accurate, or at least non-heretical, translation.

  • I suggest that new axes of trust will form. Whereas today a translation “brand”–NIV, ESV, or whatever–carries certain trust signals, those signals will shift to new parties, and in particular to groups that people already trust.
    • The question of whom you’d trust to steward a Bible translation probably isn’t that different from whomever you already trust theologically, a group that probably includes some of the following:
      • Social network.
      • Teachers or elders in your church.
      • Pastors in your church.
      • Your denomination.
      • More indirectly, a megachurch pastor you trust.
      • A parachurch organization or other nonprofit.
      • Maybe even a corporation.
    • The point is not that trust in a given translation would go away, but rather that trust would become more networked, complicated, and fragmented, before eventually solidifying.
      • We already see this happening somewhat with existing Bible translations, where certain groups declare some translations as OK and others as questionable. Like your choice of cellphone, the translation you use becomes an indicator of group identity.
      • The proliferation of translations will allow these groups who are already issuing imprimaturs to go a step further and advance whole translations that fit their viewpoints.
      • In other words, they’ll be able to act as their own Magisteriums. Their own self-published Magisteriums.
      • This, by the way, also addresses the third function that publishers serve: marketing. By associating a translation with a group you already identify with, you reduce the need for marketing.

  • I said earlier that I think technology will bring about this situation, but there are a couple ways it could happen.
    • First, an existing modern translation could open itself up to such modifications. A church could say, “I like this translation except for these ten verses.” Or, “This translation is fine except that it should translate this Greek word this way instead of this other way.”
    • Such a flexible translation could provide the tools to edit itself–with enough usage, it’s possible that useful improvements could be incorporated back into the original translation.
  • A second alternative is to use technology to produce a new, cheap, low-quality translation with editing in mind from the get-go, to provide a base from which a monstrous hydra of translations can grow. Let’s take a look at what such a hydra translation, or a Franken-Bible, could look like.

  • The basic premise is this: there are around thirty modern, high-quality translations of the Bible into English. Can we combine these translations algorithmically into something that charts the possibility space of the original text?
    • Bible translators already consult existing translations to explore nuances of meaning. What I propose is to consult these translations computationally and lay bare as many nuances as possible.

  • You can explore the output of what I’m about to discuss at www.adaptivebible.com. I’m going to talk about the process that went into creating the site.
  • This part of the talk is pretty technical.

  • In all, there are fifteen steps, broken into two phases: alignment of existing translations and generation of a new translation.
  • It took about ten minutes of processor time for each verse to produce the result.
  • The total cost in server time on Amazon EC2 to translate the New Testament was about $10. Compared to the millions of dollars that a traditional translation costs, that’s a big savings–five or six orders of magnitude.
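
The "five or six orders of magnitude" figure is just a base-10 logarithm of the cost ratio. The check below assumes "millions of dollars" means somewhere between $1 million and $10 million; that range is an assumption, not a number from the talk.

```python
import math


def orders_of_magnitude(traditional_cost, machine_cost):
    """Base-10 logarithm of how many times cheaper the machine run is."""
    return math.log10(traditional_cost / machine_cost)


machine_cost = 10  # dollars of EC2 time, from the talk
for traditional_cost in (1_000_000, 10_000_000):  # assumed range for "millions"
    orders = orders_of_magnitude(traditional_cost, machine_cost)
    print(f"${traditional_cost:,} vs. ${machine_cost}: about {orders:.0f} orders of magnitude")
```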

  • The first phase is alignment.
  • First step. Collect as many English translations as possible. Obviously there are copyright implications with doing that, so it’s important to deal with only one verse at a time, which is something that all translations explicitly allow. For this project, we used around thirty translations.
  • Second. Normalize the text as much as possible. For this project, we’re not interested in any formatting, for example, so we can deal with just the plain text.
  • Third. Tokenize the text and run basic linguistic analysis on it.
    • Off-the-shelf open-source software from Stanford called Stanford CoreNLP tokenizes the text, identifies lemmas (base forms of words) and analyzes how words are related to each other syntactically.
    • In general, it’s about 90% accurate, which is fine for our purposes; we’ll be trying to enhance that accuracy later.
  • Fourth. Identify Wordnet similarities between translations.
    • Wordnet is a giant database of word meanings that computers can understand.
    • We take the lemmas from step 3 and identify how close in meaning they are to each other. The thinking is that even when translations use different words for the same underlying Greek word, the words they choose will at least be similar in meaning.
    • For this step, we used Python’s Natural Language Toolkit.
  • Fifth. Run an off-the-shelf translation aligner.
    • We used another open-source program called the Berkeley Aligner, which is designed to use statistics to align content between different languages. But it works just as well for different translations of the same content in the same language. It takes anywhere from two to ten minutes for each verse to run.
  • Sixth. Consolidate all this data for future processing.
    • By this point, we have around 4MB of data for each verse, so we consolidate it into a format that’s easy for us to access in later steps.
  • Seventh. Run a machine-learning algorithm over the data to identify the best alignment between single words in each pair of translations.
    • We used another Python module, scikit-learn, to execute the algorithm.
    • In particular, we used Random Forest, which is a supervised-learning system. That means we need to feed it some data we know is good so that it can learn the patterns in the data.
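
To make the fourth step more concrete, here is a minimal sketch of a path-based similarity score of the kind Wordnet tools offer: 1 / (1 + shortest path length) between two words in an "is-a" graph. The tiny graph below is invented for illustration and stands in for Wordnet itself, and the talk doesn't say which similarity measure the project actually used, so treat the choice of measure as an assumption.

```python
from collections import deque

# Toy "is-a" links (child -> parent) standing in for Wordnet; invented for illustration.
HYPERNYMS = {
    "rabbi": "teacher",
    "teacher": "person",
    "master": "person",
    "boat": "vessel",
    "ship": "vessel",
    "vessel": "artifact",
    "person": "entity",
    "artifact": "entity",
}


def neighbors(word):
    """Words one is-a link away, in either direction."""
    up = [HYPERNYMS[word]] if word in HYPERNYMS else []
    down = [child for child, parent in HYPERNYMS.items() if parent == word]
    return up + down


def path_similarity(a, b):
    """Return 1 / (1 + shortest path length), or None if the words aren't connected."""
    if a == b:
        return 1.0
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        word, dist = queue.popleft()
        for nxt in neighbors(word):
            if nxt == b:
                return 1.0 / (dist + 2)  # path length is dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


for pair in [("teacher", "rabbi"), ("teacher", "master"), ("rabbi", "boat")]:
    print(pair, path_similarity(*pair))
```

Words that translators might plausibly swap for each other ("teacher" and "rabbi") score higher than unrelated ones ("rabbi" and "boat"), which is the signal the later machine-learning step can use.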

  • Where did we get this good data? We wrote a simple drag-and-drop aligner to feed the algorithm: it shows two lists of words, and you drag words on top of each other if they match. It’s actually kind of fun; if you juiced it up a little, I can totally see it becoming a game called “Translations with Friends.”
    • In total, we hand-aligned around 30 pairs of translations across 25 verses. That’s a tiny sample of the roughly 8,000 verses in the New Testament, so the algorithm doesn’t need a lot of training to get good results.

  • What the algorithm actually runs on is a big vector matrix. These are the ten factors we included in our matrix.
    • 1. One translation might begin a verse with the words “Jesus said,” while another might put that same phrase at the end of the verse. All things being equal, though, translations tend to put words in similar positions in the verse. When all else fails, it’s worth taking position into account.
    • 2. Similarly, even when translations rearrange words, they’ll often keep them in the same sentence. Again, all things being equal, it’s more likely that the same word will appear in the same sentence position across translations.
    • 3. If we know that a particular word is in a prepositional phrase, for example, it’s not unlikely that it will serve a similar grammatical role in another translation.
    • 4. If words in different translations are both nouns or both verbs, it’s more likely that they’re translating the same word than if one’s a noun and another’s an adverb.
    • 5. Here we use the output from the Berkeley Aligner we ran earlier. The aligner is bidirectional, so if we’re comparing the word “Jesus” in one translation with the word “he” in another, we look both at what the Berkeley Aligner says “Jesus” should line up with in one translation and with what “he” should line up with in the other translation. It provides a fuller picture than just going in one direction.
    • 6. Here we go more general. Even if the Berkeley Aligner didn’t match up “Jesus” and “he” in the two translations we’re currently looking at, if other translations use “he” and the Aligner successfully aligned them with “Jesus”, we want to take that into account.
    • 7. This is similar to grammatical context but looks specifically at dependencies, which describe direct relationships between words. For example, if a word is the subject of a sentence in one translation, it’s likely to be the subject of a sentence in another translation.
    • 8. Wordnet similarity looks at the similarities we calculated earlier–words with similar meanings are more likely to reflect the same underlying words.
    • 9. This step strips out all words that aren’t nouns, pronouns, adjectives, verbs, and adverbs and compares their sequence–if a different word appears between two identical words across translations, there’s a good chance that it means the same thing.
    • 10. Finally, we look at any dependencies between major words; it’s a coarser version of what we did in #7.
    • The end result is a giant matrix of data–ten vectors for every word-combination in every translation in every verse–and we run our machine-learning algorithm on it, which produces an alignment between every word in every translation.
    • At this point, we’ve generated between 50 and 250MB of data for every verse.
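
A small sketch may help show what one row of that matrix could look like. It computes four of the ten factors (1, 4, 5, and 8) for a single pair of words. The `Token` class, the `pair_features` function, and the way each factor is scaled to a number are assumptions made for illustration; the talk doesn't spell out the actual encoding.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    pos: str        # coarse part-of-speech tag, e.g. "NOUN"
    index: int      # 0-based position in the verse
    verse_len: int  # number of tokens in the verse


def pair_features(a, b, aligner_links, similarity):
    """Return one feature row for a candidate word pair.

    aligner_links: set of (index_in_a, index_in_b) pairs proposed by an
                   external aligner (hypothetical input format).
    similarity:    a precomputed meaning-similarity score from 0.0 to 1.0.
    """
    rel_a = a.index / max(a.verse_len - 1, 1)
    rel_b = b.index / max(b.verse_len - 1, 1)
    return [
        1.0 - abs(rel_a - rel_b),                              # factor 1: position in verse
        1.0 if a.pos == b.pos else 0.0,                        # factor 4: same part of speech
        1.0 if (a.index, b.index) in aligner_links else 0.0,   # factor 5: aligner agrees
        similarity,                                            # factor 8: meaning similarity
    ]


jesus = Token("Jesus", "NOUN", index=0, verse_len=4)
he = Token("he", "PRON", index=2, verse_len=5)
print(pair_features(jesus, he, aligner_links={(0, 2)}, similarity=0.0))
```

Rows like this, paired with the hand-aligned answers, are the kind of input a supervised classifier such as scikit-learn's `RandomForestClassifier` is trained on.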

  • Eighth. Now that we have the direct alignment, we supplement it with indirect alignment data across translations. In other words, to reuse our earlier example, the alignment between two translations may not align “Jesus” and “he,” but alignments in other translations might strongly suggest that the two should be aligned.
  • At this point, we have a reasonable alignment among all the translations. It’s not perfect, but it doesn’t have to be. Now we shift to the second phase: generating a range of possible translations from this data.
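
The indirect alignment in the eighth step can be sketched as counting "pivot" translations: if a word in translation A and a word in translation B both align to the same word in some third translation, that's a vote for aligning them with each other. The data layout and the `indirect_support` function are hypothetical, and plain strings stand in for the numeric word ids a real run would use.

```python
from collections import Counter


def indirect_support(alignments, src, dst):
    """Count pivot translations that link each (src word, dst word) pair.

    alignments maps (translation_x, translation_y) to a set of
    (word_in_x, word_in_y) pairs that were aligned directly.
    """
    support = Counter()
    names = {name for pair in alignments for name in pair}
    for pivot in names - {src, dst}:
        src_links = alignments.get((src, pivot), set())
        dst_links = alignments.get((dst, pivot), set())
        for src_word, pivot_word in src_links:
            for dst_word, other_pivot_word in dst_links:
                if pivot_word == other_pivot_word:
                    support[(src_word, dst_word)] += 1
    return support


alignments = {
    ("A", "B"): set(),                    # the direct pass missed "Jesus" / "he"
    ("A", "C"): {("Jesus", "Jesus")},
    ("B", "C"): {("he", "Jesus")},
    ("A", "D"): {("Jesus", "Jesus")},
    ("B", "D"): {("he", "Jesus")},
}
print(indirect_support(alignments, "A", "B"))
```

Here two pivots (C and D) both suggest that A's "Jesus" and B's "he" belong together, even though the direct A-to-B alignment found nothing.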

  • First. Consolidate alignments into phrases, where we look for runs of parallel words. You can see that we’re only looking at lemmas here–dealing with every word creates a lot of noise that doesn’t add much value, so we ignore the less-important words. In this case, the first two have identical phrases even though the words differ slightly, while the third structures the sentence differently.
  • Second. Arrange translations into clusters based on how similar they are to each other structurally. In this example, the first two form a cluster, and the third would be part of a different cluster.
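
One simple way to picture the clustering step, offered as a sketch rather than the method the site uses: treat each translation as an ordered list of phrases, score pairs with `difflib.SequenceMatcher`, and greedily group translations whose score clears a threshold. The phrase lists, the translation names, and the 0.8 threshold are all made up for illustration.

```python
from difflib import SequenceMatcher


def structural_similarity(a, b):
    """difflib's ratio: 2 * (phrases matched in order) / (total phrases)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster(structures, threshold=0.8):
    """Greedily assign each translation to the first cluster it resembles."""
    clusters = []  # each cluster is a list of translation names
    for name, phrases in structures.items():
        for members in clusters:
            representative = structures[members[0]]
            if structural_similarity(phrases, representative) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])  # nothing close enough; start a new cluster
    return clusters


# Hypothetical lemma phrases for three translations of one verse.
structures = {
    "T1": ["jesus say", "be heal"],
    "T2": ["jesus say", "be heal"],
    "T3": ["be heal", "jesus say"],
}
print(cluster(structures))
```

The first two translations share a structure and end up together, while the third, which puts the speaker at the end, starts its own cluster.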

  • Third. Insert actual keyword text. I’ve been using words in the examples I’ve been giving, but in the actual program, we use numerical ids assigned to each word. Here we start to introduce actual words.
  • Fourth. Fill in the gaps between keywords. We add in small words like conjunctions and prepositions that are key to producing recognizable English.
  • Fifth. Add in punctuation. Up to this point, we’ve been focusing on the commonalities among translations. Now we’re starting to focus on differences to produce a polished output.
  • Sixth. Reduce the possibility space to accept only valid bigrams. “Bigrams” just means two words in a row. We remove any two-word combinations that, based on our algorithm thus far, look like they should work but don’t. We check each pair of words to see whether they exist anywhere in one of our source translations. If they don’t, we get rid of them.
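
The bigram check in the sixth step is easy to sketch: collect every two-word sequence that appears in any source translation, then reject a candidate reading if it contains a pair that never appears. The function names and the sample word lists below are hypothetical.

```python
def source_bigrams(translations):
    """Collect every adjacent word pair found in any source translation."""
    seen = set()
    for words in translations:
        seen.update(zip(words, words[1:]))
    return seen


def is_valid(candidate, allowed):
    """A candidate passes only if every adjacent pair exists in some source."""
    return all(pair in allowed for pair in zip(candidate, candidate[1:]))


# Made-up source word lists for illustration.
sources = [
    ["jesus", "said", "be", "healed"],
    ["be", "healed", "jesus", "said"],
]
allowed = source_bigrams(sources)

print(is_valid(["jesus", "said", "be", "healed"], allowed))
print(is_valid(["healed", "be", "said"], allowed))
print(is_valid(["jesus", "said", "be", "healed", "jesus", "said"], allowed))
```

The last candidate passes even though it repeats "Jesus said" at both ends. Checking only two words at a time can't catch that kind of problem, which is the limitation discussed later in the talk.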

  • Seventh. Produce rendered output.

  • In this case, the output is just for the Adaptive Bible website. It shows the various translation possibilities for each verse.
    • Hovering over a reading shows what the site thinks are valid next words based on what you’re hovering over. (That’s the yellow.)
    • You can click a particular reading if you think it’s the best one, and the other readings disappear. Clicking again restores them. (That’s the green.)
    • The website shows a single sentence structure that it thinks has the best chance of being valid, but most verses have multiple valid structures that we don’t bother to show here.

  • To consider a verse successfully translated, this process has to produce readings supported by two independent translation streams (e.g., having a reading supported only by ESV and RSV doesn’t count because ESV is derived from RSV).
    • Using this metric, the process I’ve described produces valid output for 96% of verses in the New Testament.
    • On the current version of adaptivebible.com, I use stricter criteria, so only 91% of verses show up.
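
The "two independent translation streams" test can be sketched as a lookup followed by a count. The `STREAM` table below is hypothetical: only the ESV-from-RSV relationship comes from the talk, and "X" and "Y" are placeholder names for unrelated translations.

```python
# Maps a translation to the stream it belongs to. Only the ESV -> RSV link
# is from the talk; "X" and "Y" are placeholders for unrelated translations.
STREAM = {"ESV": "RSV", "RSV": "RSV", "X": "X", "Y": "Y"}


def independent_streams(supporters, stream=STREAM):
    """Collapse the supporting translations into their distinct streams."""
    return {stream.get(name, name) for name in supporters}


def is_supported(supporters, minimum=2):
    """A reading counts only if enough independent streams back it."""
    return len(independent_streams(supporters)) >= minimum


print(is_supported({"ESV", "RSV"}))  # same stream twice
print(is_supported({"ESV", "X"}))    # two different streams
```

Raising `minimum` is one way a stricter criterion could be expressed, though the talk doesn't say what the stricter criteria on the site actually are.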

  • Limitations
    • Just because a verse passes the test, that doesn’t mean it’s actually grammatical, and it certainly doesn’t mean that every alternative presented within a verse is valid.
    • Because we use bigrams for validity, we can get into situations like what you see here, where all these are valid bigrams, but the result (“Jesus said, ‘Be healed,’ Jesus said”) is ridiculous.
    • There’s no handling of inter-verse transitions; even if a verse is totally valid, it may not read smoothly into the next verse.
    • Since we removed all formatting at the beginning of the process, there’s no formatting.
  • Despite those limitations, the process produced a couple of mildly interesting byproducts.

  • Probabilistic Strong’s-to-Wordnet sense alignment. Given a single Strong’s alignment and a variety of translations, we can explore the semantic range of a Strong’s number. Here we have dunamis. This seems like a reasonably good approximation of its definition in English.

  • Identifying translation similarity. This slide explores how structurally similar translations are to each other, based on the phrase clusters we produced. The results are pretty much what I’d expect: translations that are derived from each other tend to be similar to each other.

  • What I’ve just described is one pretty basic approach to what I think is inevitable: the explosion of translations into Franken-Bibles as technology gets better. In the future, we won’t be talking about particular translations anymore but rather about trust networks.
  • To be clear, I’m not saying that I think this development is a particularly great one for the church, and it’s definitely not good for existing Bible translations. But I do think it’s only a matter of time until Franken-Bibles arrive. At first they’ll be unwieldy and ridiculously bad, but over time they’ll adapt and improve, and they’ll need to be taken seriously.

What Twitterers Are Giving up for Lent (2013 Edition)

February 16th, 2013

The top 100 things that people on Twitter are giving up for Lent in 2013.

This year saw a lot of churn in the top 100 things people were giving up for Lent.

The pope announced his resignation on Monday, leading many to say that he was giving up “being pope” for Lent. It came in at #1. (Related, at #18, people said they were giving up “the pope” for Lent.)

Specific social networking sites like Twitter and Facebook generally dropped this year, with the generic term “social networking” (#4) taking over as a catchall. Instagram (#10), Pinterest (#52), and Snapchat (#78) were all new to the top 100.

With Valentine’s Day falling on the day after Ash Wednesday this year, “Valentine’s Day” came in at #13. My wife suggests that the timing may also have contributed to the drop in “chocolate” from #2 last year to #17 this year. “Valentines” is #97.

“Horse meat” (#20) refers to the ongoing European scandal.

The only celebrity to make the list was British boy band One Direction, up substantially at #41.

I learned several new words this year: “twerking” (#34), a type of dance move; “selfies” (#46), or self-shot photos taken with a phone; “subtweeting” (#57), or tweeting about someone without mentioning them by name; “oomf” (#71), or “one of my [Twitter] followers”; and “Nando’s” (#76), a chicken restaurant.

This list draws from 263,000 tweets from February 10-15, 2013, and excludes most retweets.

Rank Word Count Change from last year’s rank
1. Being pope 5,654  
2. Swearing 4,944 +1
3. Soda 2,648 +2
4. Social networking 2,264 +19
5. Alcohol 2,217 -1
6. Chips 1,690 +8
7. Virginity 933 +23
8. Marijuana 784 +17
9. Fast food 776 -2
10. Instagram 755 +270
11. Twitter 672 -10
12. Cookies 643 +19
13. Valentine’s day 514  
14. Masturbation 510 +18
15. Takeout 465 +59
16. Sweets 444 -7
17. Chocolate 417 -15
18. The pope 394 +10,224
19. Facebook 380 -13
20. Horse meat 375  
21. Junk food 362 -8
22. Smoking 355 -3
23. My swag 331 +373
24. Desserts 325 +21
25. Life 325 +40
26. New year’s resolutions 313 +47
27. My boyfriend 309 +99
28. Catholicism 255 +11
29. Straightening my hair 228 +89
30. Fried food 225 +5
31. Netflix 216 +255
32. Work 216 -5
33. Sobriety 213 +4
34. Twerking 185 +698
35. The playoffs 184 +3,556
36. French fries 173 +19
37. Coke 168 +1
38. Feelings 168 +207
39. Laziness 160 +28
40. Meat 158 -30
41. Onedirection 155 +103
42. You 154 -24
43. Procrastination 153 +1
44. Makeup 150 +16
45. Internet 149 +61
46. Selfies 149 +2,328
47. Exercise 144 +58
48. School 141 -36
49. My phone 135 +15
50. Classes 129 +84
51. Dip 127 +132
52. Pinterest 125 +133
53. Church 124 +33
54. Emotions 122 +397
55. Going to school 119 +163
56. My girlfriend 111 +207
57. Subtweeting 110 +253
58. College 106 +5
59. My face 106 +4,168
60. Ice cream 106 -27
61. McDonald’s 102 -32
62. Being ugly 101 +256
63. Snacking 99 +19
64. Spending 96 +89
65. Dunkin Donuts 96 +475
66. Chew 95 +418
67. Eating out 94 +28
68. Elevators 94 +99
69. Food 93 -47
70. Moaning 93 +123
71. Oomf 93 +78
72. Chick Fil A 90 +135
73. Healthy food 88 +180
74. Football 87 +145
75. Swimming 87 +200
76. Nando’s 86 +72
77. DVDs 84 +1,326
78. Snapchat 84  
79. Broccoli 83 +206
80. Ranch 81 +250
81. The snooze button 80 +176
82. Crystal meth 80 +219
83. Dignity 79 +116
84. Cake 77 -13
85. Unhealthy food 77 +34
86. Homework 76 -65
87. Busyness 75  
88. Schoolwork 74 +88
89. Chemistry 74 +34,949
90. Frozen yogurt 72 +480
91. iPhone 72 +100
92. FIFA 71 +143
93. Betting 70 +315
94. Doing homework 69 +158
95. Myself 68 +267
96. Supermarkets 67 +1,797
97. Valentines 66  
98. Domino’s 63 +323
99. Being negative 63 +212
100. Hookah 63 +340

Categories

Rank Category Number of Tweets
1. food 11,642
2. habits 8,083
3. religion 6,519
4. technology 4,782
5. smoking/drugs/alcohol 3,928
6. sex 1,771
7. relationship 1,399
8. health/hygiene 1,270
9. school work 1,095
10. irony 792
11. sports 648
12. entertainment 392
13. celebrity 246
14. clothes 235
15. money 133
16. shopping 111

The image is a Wordle.

Track What People Are Giving Up in 2013 for Lent in Real Time

February 11th, 2013

See the top 100 things people are giving up in 2013 for Lent on Twitter, continually updated until February 15, 2013.

As I write this post, with about 5,000 tweets analyzed, the new hot topics so far this year are: meowing, Valentine’s Day, and Snapchat.

Look for the usual post-mortem on February 16, 2013.

Street View through Israel

January 17th, 2013

Google yesterday announced that they’ve expanded their street view functionality throughout Israel, including a number of sites that you’d visit on any tour of the Holy Lands. Of particular note are the archaeological sites they walked around and photographed. Here, for example, is Megiddo:



Previously available were many places in Jerusalem, like the Via Dolorosa. But the new imagery covers much more area–I can imagine it being particularly useful in Sunday School and classroom settings, where a semi-immersive environment communicates more than static photographs.

Via Biblical Studies and Technological Tools.

Bible Reference Parser Code Update

November 20th, 2012

The semiannual release schedule of the Javascript Bible Passage Reference Parser continues. This release:

  • Improves support for parentheses.
  • Adds some alternate versification systems.
  • Supports French book names.
  • Removes the “docs” folder because it was getting unwieldy; the source itself remains commented.
  • Reorganizes some of the source code.
  • Increases the number of real-world strings from 200,000 to 370,000. I ran the parser on all 85 million tweets and Facebook posts in the Realtime Bible Search database to produce the list.

One of the main goals of this parser is to give you a starting point to build your own parser, so the source is thoroughly documented and has many tests you can use to validate your code.

Try a demo or browse the source on GitHub.

Zoomable Map of the Greco-Roman World

October 4th, 2012

The Pelagios Project has produced a lovely zoomable map of the Greco-Roman world. Below, for example, is a static view of Israel during the Roman period.

A blog post about the map discusses how they created it (plus bonus technical background). I’m most impressed by how attractive the maps are—a lot of online maps present you the data but don’t try to be beautiful; this map succeeds on both counts.

More generally, the Pelagios Project, which I admit I hadn’t heard of before today, incorporates linked data to help people study the ancient world. It encompasses a variety of efforts (need to search for an inscription from ancient Palestine? No problem)—it’s all fascinating.

Terrain-shaded map of Roman Palestine (Israel) showing topography, cities, roads, and other features.
(Note the “Mortuum Mare” instead of the “Dead Sea.”)

Via O’Reilly Radar.

Calculating the Time and Cost of Paul’s Missionary Journeys

July 5th, 2012

Stanford University recently unveiled ORBIS, a site that lets you calculate the time and cost required to travel by road or ship around the Roman world in A.D. 200. It takes into account a lot of factors—my favorite is that it models ancient sea routes based on historical sources and wave height.

A view of the Mediterranean, including Roman cities and roads, from ORBIS.

The apostle Paul went on three missionary journeys from A.D. 46 to 57, traveling around much of Asia Minor and Greece. In 60, he was also taken to Rome. ORBIS allows us to calculate how long these journeys would have taken in pure travel time (excluding time spent at each destination) and how much they would have cost.

Journey   Distance (miles)   Travel Time (days)   Cost per Person (denarii)*
First     1,581              53                   237
Second    3,050              100                  314
Third     3,307              92                   481
Rome      2,344              36                   699

* Ship travel only. According to Wikipedia, the denarius of A.D. 200, used here, is roughly 22% weaker than a denarius from the mid-first century.

I conclude a few things from this exercise:

  1. The journeys get progressively costlier as more of each journey happens by ship. Sailing is fast but expensive—of course, Paul and his companions may not have had to pay the full fare.
  2. I like to imagine that Paul’s overnight escape from Thessalonica to Berea was partially by riverboat (though the costs above assume it was by road).
  3. Not much of the route of Paul’s journey is in doubt—Luke describes the trips pretty precisely in Acts. About the only question is whether Paul traveled from Berea to Athens by ship or by road. The above costs follow the ESV Study Bible and assume it was by ship.
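
A rough way to check the first conclusion against the table is to divide each journey's distance by its travel time. The sketch below uses only the figures from the table; the function and variable names are my own, and the resulting pace blends road and sea legs, so treat it as a back-of-the-envelope comparison rather than anything ORBIS itself reports.

```python
# Figures copied from the table above:
# (journey, distance in miles, travel time in days, cost per person in denarii).
JOURNEYS = [
    ("First", 1581, 53, 237),
    ("Second", 3050, 100, 314),
    ("Third", 3307, 92, 481),
    ("Rome", 2344, 36, 699),
]


def miles_per_day(miles, days):
    """Average pace over a whole journey, mixing road and sea legs."""
    return miles / days


def totals(journeys):
    """Sum distance, travel days, and cost across all journeys."""
    return (
        sum(j[1] for j in journeys),
        sum(j[2] for j in journeys),
        sum(j[3] for j in journeys),
    )


if __name__ == "__main__":
    for name, miles, days, cost in JOURNEYS:
        print(f"{name:<6} {miles_per_day(miles, days):5.1f} miles/day")
    print(totals(JOURNEYS))
```

By this measure the trip to Rome averages about 65 miles a day, roughly twice the pace of the first journey (about 30), while also carrying the highest cost in the table.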

For more about Paul’s missionary journeys, Dale Bargmann has written a good walkthrough with maps and photos.

Download the raw data (Excel).

Rise of the Robosermon

April 29th, 2012

In a recent issue of Wired, Steven Levy writes about Narrative Science, a company that uses data to write automated news stories. Right now, they mostly deal in data-intensive fields like sports and finance, but the company is confident that it will easily expand into other areas—the company’s co-founder even predicts that an algorithm will win a Pulitzer Prize in the next five years.

In February 2012, I attended a session at the TOC Conference given in part by Kristian Hammond, the CTO and co-founder of Narrative Science. During the session, Hammond mentioned that sports stories have a limited number of angles (e.g., a “blowout win” or a “come-from-behind victory”)—you can probably sit down and think up a fairly comprehensive list in short order. Even in fictional sports stories, writers only use around sixty common tropes as part of the narrative. Once you have an angle (or your algorithm has decided on one), you just slot in the relevant data, add a little color commentary, and you have your story.

At the time, I was struggling to understand how automated content could apply to Bible study; Levy’s article leads me to think that robosermons, or sermons automatically generated by a computer program, are the way of the future.

Parts of a Robosermon

Futurama has a robot preacher. I've never seen these episodes, so hopefully this image isn't terribly heretical.

After all, from a data perspective, sermons don’t differ much from sports stories. In particular, they have three components:

First, as with sports stories, sermons follow predictable structures and patterns. David Schmitt of Concordia Theological Seminary suggests a taxonomy of around thirty sermon structures. Even if this list isn’t comprehensive, it would probably take, at most, 100 to 200 structures to categorize nearly all sermons.

Second, sermons deal with predictable content: whereas sports have box scores, sermons have Bible texts and topics. A sermon will probably deal with a passage from the Bible in some way, and the 31,000 verses in the Bible make up a large but manageable set of source material (especially since most sermons involve a passage, not a single verse; you can probably cut this list down to around 2,000 sections). Topically, SermonCentral.com lists only 500 sermon topics in its database of 120,000 sermons. The power-law popularity distribution (i.e., the 80/20 rule) of verses preached on (SermonCentral.com has 1,200 sermons on John 1 compared to seven on Numbers 35) and topics (1,400 sermons on “Jesus’ teachings” vs. four on “morning”) means that you can categorize most sermons using a small portion of the available possibilities.

Third, sermons generally involve illustrations or stories, much like the color commentary of sports stories. Finding raw material for illustrations shouldn’t present a problem to a computer program; a quick search on Amazon turns up 1,700 books on sermon illustrations and an additional 10,000 or so on general anecdotes. You can probably extract hundreds of thousands of illustrations from just these sources. Alternatively, if a recent news story relates to your topic, the system can add the relevant parts to your sermon with little trouble (especially if a computer wrote the news story to begin with).

Application

You end up being able to say, “I want to preach a sermon on Philippians 2 that emphasizes Christ’s humility as a model for us.” Then—and here’s the part that doesn’t exist yet but that technology like Narrative Science’s will provide—an algorithm suggests, say, an amusing but poignant anecdote to start with, followed by three points of exegesis, exhortation, and application, and finishing with a trenchant conclusion. You tweak the content a bit, throwing in a shout-out to a behind-the-scenes parishioner who does a lot of work but rarely receives recognition, and call it done.

Why limit sermons to pastors, though? Why shouldn’t churchgoers be able to ask for custom sermons that fit exactly their circumstances? “I’d like a ten-part audio sermon series on Revelation from a dispensational perspective where each sermon exactly fits the length of my commute.” “Give me six weeks of premarital devotions for my boyfriend and me. I’ve always been a fan of Charles Spurgeon, so make it sound like he wrote them.”

Levy opens his Wired article with an anecdote about how grandparents would find articles about their grandchildren’s Little League games just as interesting as “anything on the sports pages.” He doesn’t mention that what they really want is a recap with their grandchild as the star (or at least as a strong supporting character—it’s like one of those children’s books where you customize the main character’s name and appearance). Robosermons let you tailor the sermon’s content so that your specific problems or questions form the central theme.

The logical end of this technology is a sermonbot that develops a following of eager listeners and readers, in the same way that an automated newspaper reporter would create fans on its way to winning a Pulitzer.

You may argue that robosermons diminish the role of the Holy Spirit in preparing sermons, or that they amount to plagiarism. I’m not inclined to disagree with you.

Conclusion

Building a robosermon system involves five components: (1) sermon structures; (2) Bible verses; (3) topics; (4) illustrations; and (5) technology like Narrative Science’s to put everything together coherently. It would also be helpful to have (6) a large set of existing sermons to serve as raw data. It’s a complicated problem but hardly an insurmountable one over the next ten years, should someone want to tackle it.
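
To make the first four components a little more concrete, here is a toy sketch of how they might fit together as data. Everything in it is hypothetical: the names, the single "three-point" structure, and the placeholder illustration are my own inventions for illustration, and it makes no attempt at component (5), the part that turns an outline into coherent prose, which is the hard part.

```python
from dataclasses import dataclass


@dataclass
class SermonRequest:
    passage: str    # component 2, e.g. "Philippians 2"
    topic: str      # component 3, e.g. "humility"
    structure: str  # component 1: a key into STRUCTURES


# Hypothetical structure templates; a real system would need many more.
STRUCTURES = {
    "three-point": [
        "opening illustration",
        "point 1",
        "point 2",
        "point 3",
        "conclusion",
    ],
}

# Hypothetical illustration store keyed by topic (component 4).
ILLUSTRATIONS = {
    "humility": ["(anecdote about an unrecognized volunteer)"],
}


def outline(request):
    """Fill each slot of the chosen structure with placeholder content."""
    illustrations = ILLUSTRATIONS.get(request.topic, [])
    lines = []
    for slot in STRUCTURES[request.structure]:
        if slot == "opening illustration":
            text = illustrations[0] if illustrations else "(no illustration found)"
        else:
            text = f"{request.topic} in {request.passage}"
        lines.append(f"{slot}: {text}")
    return lines


if __name__ == "__main__":
    for line in outline(SermonRequest("Philippians 2", "humility", "three-point")):
        print(line)
```

The output is only a skeleton of labeled slots; generating the actual sentences for each slot is where technology like Narrative Science’s would have to come in.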

I’m not sure they should; that way lies robopologetics and robovangelism.

If you’re not an algorithm and you want to know how to prepare and deliver a sermon, I suggest listening to this 29-part course on preaching by Bryan Chapell at Biblical Training. It’s free and full of homiletic goodness.

Bible Verses for the Pinterest Set

March 24th, 2012

Jonathan Ogden runs Typographic Verses, a collection of about seventy-five Pinterest-friendly Bible verses designed as posters. Here are three of my favorites:

Romans 8:38 crosses out all the things that can't separate us from the love of God. Matthew 6:22 is rendered as an eye chart. Psalm 48:1 uses striking, infographic-style type.

Also see Jim LePage’s Illustrations of Every Bible Book.