
Archive for the ‘Visualizations’ Category

Visualizing Pericope Similarity in the New Testament

Monday, September 13th, 2010

This diagram plots the pericopes (sections) of the New Testament according to their linguistic similarity in Greek:

Blue = Gospels, Purple = Acts, Green = Paul’s Epistles, Red = General Epistles, Gray = Revelation

If you don’t have Silverlight installed (or are reading this post via RSS–I suggest you click through to the original post), here’s a thumbnail:

Pericope similarity in the New Testament (thumbnail).

Download the full-size PDF (300KB) or PNG (22 MB, 12,000 pixels wide).

Do we actually learn anything from this kind of diagram? The most interesting part to me is how the gospels on the right flow primarily through the Gospel of John to the epistles on the left. I wonder why that is.


I calculated the cosine similarity between the full text of the pericopes using the Greek lemmas (after removing about forty stopwords). The pericope titles come from the ESV. I produced the diagram with Cytoscape. The widget at the top of the post comes from Microsoft’s Deep-Zoom-as-a-Service.

Bill Mounce’s excellent free New Testament Greek dictionary served as the source of the lemmas.
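
For readers who want to try the calculation, here is a minimal sketch of cosine similarity over lemma counts. It is an illustration, not the code behind the diagram: the stopword set, the transliterated lemmas, and the function names `lemma_vector` and `cosine_similarity` are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical stopword set (transliterated). The post removed about forty
# Greek stopwords but does not list them.
STOPWORDS = {"ho", "kai", "de", "en"}


def lemma_vector(lemmas):
    """Count each lemma in a pericope, skipping stopwords."""
    return Counter(lemma for lemma in lemmas if lemma not in STOPWORDS)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[lemma] for lemma, count in a.items() if lemma in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty pericope is similar to nothing
    return dot / (norm_a * norm_b)


# Toy lemma lists, not real pericope data.
first = lemma_vector(["ho", "logos", "theos", "logos"])
second = lemma_vector(["theos", "agape", "kai", "theos"])
print(round(cosine_similarity(first, second), 3))
```

Each pair of pericopes gets a score between 0 (no shared lemmas) and 1 (identical proportions of lemmas). Presumably the higher-scoring pairs become the edges in a graph tool like Cytoscape.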

Bible Cross References Visualization

Friday, April 16th, 2010

Here’s a visualization of 340,000 Bible cross references:

Visualization of Bible cross references.
Larger version (2,000 x 1,600 pixels).

Does anything strike you as intriguing? A few trends jump out at me:

  1. The frequency of dense New Testament streaks in the Old Testament, especially in Leviticus and Deuteronomy; I didn’t expect to see them there.
  2. The loops in Samuel / Kings / Chronicles and in the Gospels indicating parallel stories.
  3. The sudden increased density of New Testament references in Psalms through Isaiah.
  4. The eschatological references in Isaiah and Daniel.
  5. The density of references from the Minor Prophets back to both the Major Prophets and earlier in the Old Testament.
  6. The surprising density of cross references in Hebrews-Jude.
  7. The asymmetry. If verse A cites verse B, verse B doesn’t necessarily cite verse A. I wonder if I should make the data symmetrical.
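
On the last point, here is a rough sketch of what making the data symmetrical could look like, assuming the cross references are stored as (citing verse, cited verse) pairs. The sample pairs and the name `symmetrize` are invented for illustration and do not come from the actual data set.

```python
def symmetrize(references):
    """Return the pairs plus the reverse of every pair."""
    symmetric = set(references)
    symmetric.update((target, source) for source, target in references)
    return symmetric


# Illustrative directed pairs: (citing verse, cited verse).
directed = {
    ("Matt 4:4", "Deut 8:3"),
    ("Deut 8:3", "Matt 4:4"),
    ("Heb 11:3", "Gen 1:1"),
}

# Pairs whose reverse is missing are the source of the asymmetry.
one_way = {(a, b) for a, b in directed if (b, a) not in directed}
print(sorted(one_way))
print(len(symmetrize(directed)))
```

Counting the one-way pairs first shows how much the picture would change before committing to the symmetrical version.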

You can also download the full-size image (10,000 x 8,000 pixels, 75 MB PNG). It’s a very large image that could crash your browser. If you want it, I strongly recommend that you save it to your computer rather than trying to open it in your browser.

This visualization uses data from the Bible Cross References project. I used PHP’s GD library to create the graphic.

Inspired by Chris Harrison and Christian Swinehart’s wonderful Choose Your Own Adventure work.

Presentation on Tweeting the Bible

Friday, March 26th, 2010

Here’s a presentation I just gave at the BibleTech 2010 conference about how people tweet the Bible:

Also: PowerPoint, PDF.

I distributed the following handout at the presentation, showing the popularity of Bible chapters and verses cited on Twitter. It displays a lot of data: darker chapters are more popular, the number in the middle of each box is the most popular verse in the chapter, and sparklines in each box show the distribution of the popularity in each chapter. (Genesis 1:1 is by far the most popular verse in Genesis 1, while Genesis 3:15 is only a little more popular than other verses in the chapter.)

The grid shows the popularity of chapters and verses in the Bible as cited on Twitter.
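
The three things each box in the grid displays could be derived from a list of cited verses roughly as follows. This is a sketch under assumptions: the (book, chapter, verse) tuple format, the name `summarize_chapters`, and the sample citations are hypothetical, not the data behind the handout.

```python
from collections import Counter, defaultdict


def summarize_chapters(citations):
    """Group (book, chapter, verse) citations into per-chapter summaries."""
    by_chapter = defaultdict(Counter)
    for book, chapter, verse in citations:
        by_chapter[(book, chapter)][verse] += 1

    summary = {}
    for key, verses in by_chapter.items():
        top_verse, _count = verses.most_common(1)[0]
        last_verse = max(verses)  # highest verse number that was cited
        summary[key] = {
            "total": sum(verses.values()),  # drives how dark the box is
            "top_verse": top_verse,         # number shown in the box
            # Citation counts by verse number; this feeds the sparkline.
            "distribution": [verses[v] for v in range(1, last_verse + 1)],
        }
    return summary


# Invented citations for illustration.
tweets = [("Genesis", 1, 1)] * 5 + [
    ("Genesis", 1, 3),
    ("Genesis", 3, 15),
    ("Genesis", 3, 15),
    ("Genesis", 3, 14),
]
summary = summarize_chapters(tweets)
print(summary[("Genesis", 1)])
```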

Delving into Lent Data

Sunday, March 7th, 2010

Let’s look a little more at some of the data on what Twitterers are giving up for Lent.

Categories of Things Given up by Location

As I only track in English what people are giving up, there are concentrations in English-speaking countries.

Categories by Country
Size indicates the relative number of Twitterers in each country giving up something for Lent.

Categories by Location

Categories of Things Given up by State

These visualizations show the differences (or lack thereof) in what people are giving up among U.S. states.

Categories by State
Size indicates the relative number of Twitterers in each state giving up something for Lent. Sorry, Alaska and Hawaii.

Categories by State (%)
The composition of each state’s categories of tweets shows mostly minor variations among states. Some states (like Wyoming on the far right) have small numbers of tweets. I would have liked to use opacity or width to indicate this disparity but couldn’t figure out how to do it.

Comparison between 2009 and 2010

This treemap shows how the data changed between 2009 and 2010. The size of the box shows the number of people giving up each category and thing, while color indicates the percentage change from last year: dark blue indicates the steepest drop; dark orange indicates the steepest rise. The second chart shows the same data more conventionally expressed.

Categories and Terms: Term Changes: 2009-2010


About the Visualizations

I created these charts mostly to explore how the new data-analysis software Tableau Public works. One of its claims to fame is that you can publish interactive visualizations to the web, a feature I didn’t take advantage of here. Tableau doesn’t do treemaps, so I used Many Eyes to create the treemap; the closest Tableau equivalent appears below the treemap.

What Twitterers Are Giving up for Lent (2010 Edition)

Tuesday, February 23rd, 2010

The top 100 things that Twitterers are giving up for Lent in 2010.

Snow makes the list this year, understandable given the Snowpocalypse and Snowmageddon that gripped much of the Eastern U.S. in the weeks preceding Ash Wednesday. iPods also made the list after the Bishop of Liverpool asked people to consider praying instead of listening to them. This year a celebrity, Justin Bieber, cracks the top 100. He beat out the Jonas Brothers, 64 votes to 11; draw your own conclusions.

The list largely tracks last year’s list. It draws from 40,000 tweets retrieved February 14-20, 2010.

Complete List of the Top 100

Rank Word Count Change from last year’s rank
1. Twitter 2089 +1
2. Facebook 1874 -1
3. Chocolate 1323 0
4. Alcohol 1258 +1
5. Swearing 1158 +5
6. Soda 1126 0
7. Lent 792 -3
8. Meat 720 0
9. Sex 701 +7
10. Fast food 695 +7
11. Sweets 627 0
12. Coffee 445 -5
13. iPod 437  
14. Candy 325 +18
15. Religion 305 -6
16. Catholicism 264 -4
17. Smoking 254 +5
18. Junk food 251 +34
19. Giving up things 241 -6
20. Beer 241 -5
21. Chips 234 +24
22. You 233 +13
23. Stuff 217 -3
24. Fried food 199 +33
25. Red meat 193 +19
26. Bread 187 +13
27. Sugar 183 -8
28. Work 176 -14
29. Shopping 174 +11
30. Food 162 -7
31. Shame 150  
32. Social networking 147 -2
33. Caffeine 136 -6
34. Rice 136 +44
35. Procrastination 127 -11
36. Internet 126 -11
37. Cheese 120 +1
38. Coke 120 +41
39. Starbucks 119 +14
40. School 118 +36
41. Ice cream 118 +13
42. Booze 117 -21
43. Texting 114 +28
44. Masturbation 111  
45. Cookies 110 +11
46. TV 97 -18
47. Christianity 96 0
48. Snow 96  
49. Wine 92 -13
50. Pizza 91 +12
51. MySpace 91 +4
52. Men 90 +31
53. Giving up 89 -19
54. Sobriety 89 -13
55. Liquor 87  
56. Desserts 87  
57. Lint 87 -20
58. Pancakes 82 -29
59. Homework 81 +28
60. Marijuana 80  
61. Diet Coke 80 -28
62. Hope 78 +15
63. Virginity 76  
64. French fries 75 -15
65. Laziness 71 +5
66. Boys 67  
67. Nothing 67 -19
68. Carbs 66 -4
69. Justin Bieber 64  
70. Pork 64  
71. Porn 63 +9
72. Me 62 0
73. Sleep 61 -42
74. Complaining 58 -16
75. Eating out 58 -8
76. Jesus 55 -26
77. McDonald’s 55  
78. Beef 54 +18
79. Church 54 +6
80. God 53 -21
81. Abstinence 53 -39
82. Cake 52  
83. Negativity 52  
84. Him 49  
85. Juice 47  
86. Celibacy 44 +13
87. Chicken 42  
88. Lying 42  
89. New Year’s resolutions 42 -29
90. Sarcasm 42 -39
91. Snacking 41  
92. My wife 39  
93. Tea 37  
94. iPhone 37  
95. Exercise 36 -6
96. Sweet tea 35  
97. People 35  
98. Vegetables 34  
99. Pasta 33  
100. Self control 33  
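
The last column can be reproduced from two rank tables. The sketch below uses a handful of ranks taken from the 2010 and 2009 lists on this page. The function name `rank_changes` is an assumption, as is the choice to mark terms missing from the 2009 list with `None` (the blank cells in the table).

```python
def rank_changes(this_year, last_year):
    """Map each term to last year's rank minus this year's rank.

    Positive numbers mean the term moved up the list; None means the
    term has no rank in last year's list.
    """
    changes = {}
    for term, rank in this_year.items():
        previous = last_year.get(term)
        changes[term] = None if previous is None else previous - rank
    return changes


# A few ranks from the 2010 and 2009 lists.
ranks_2010 = {"Twitter": 1, "Facebook": 2, "Chocolate": 3, "Swearing": 5, "iPod": 13}
ranks_2009 = {"Facebook": 1, "Twitter": 2, "Chocolate": 3, "Swearing": 10}

changes = rank_changes(ranks_2010, ranks_2009)
print(changes)
```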

Image created using Wordle.

Interview with Yingyan Huang

Wednesday, August 26th, 2009

Yingyan Huang, who created the two visualizations on the unity of Scripture from the last post, graciously agreed to answer a few questions for me about her work:

Where did you get the inspiration for your pieces? What was your motivation to create them?

Initially I had started with ideas of creating information visualizations on Paul’s journeys and the spread of Christianity. As I explored the topic further, I realized that the focus, as a Christian, should not be on Paul but really on Christ Himself, since the objective of Paul’s journeys and ministry was Christ. So I began exploring the Gospels and the role of Christ in the Bible. Ultimately my motivation is to visually demonstrate, given the Bible as a historical relic and literature, Christ’s centrality in the Bible and His reality because of the sheer number of references to Him found in the OT even before His birth.

What was the process you used to create them? Where did you get the data? What tools did you use?

My data was obtained through much reading and primarily comes from the Tyndale Life Application Study Bible, ESV Study Bible, and The Christ of the Covenants by O. Palmer Robertson. I cross-referenced the passages from ESV’s history of salvation and Palmer’s book against the 250 events in the life of Christ from the Tyndale study Bible. Some computer programs I used include Adobe Illustrator, Acrobat, and Microsoft Excel.

With infographics, there’s often a tension between making something attractive and clearly portraying the information you want to communicate. As a designer, how did you resolve that tension with these pieces?

As mentioned above, I basically cross-referenced the bible verses/passages to create data points. I did not create my own data, merely synthesized available data that have been around for centuries to create the data sets.

Why do graphic designers create infographics about the Bible?

Perhaps the Bible remains a choice subject matter because it is rich with intricacies and complexities that come together so harmoniously.

What are you up to these days?

I am currently interning/working at a magazine as an art/photo assistant and assisting in teaching an information design class at Parsons. I am also cooking and baking to test and create recipes. Additionally, I am involved in starting a new college ministry (City Campus Ministry) and am a teaching assistant for the Children’s Ministry at my church, Redeemer Presbyterian Church in NYC.

Two Bible Visualizations on the Unity of Scripture

Sunday, August 2nd, 2009

Yingyan Huang, a graphic designer, recently created two Bible visualizations that emphasize the unity of Scripture.

Harmony of the Gospels. The first visualization, “The Harmony of the Gospels—250 Events in the Life of Christ,” identifies 250 events recorded in the Gospels, arranges them chronologically, and plots them to reveal which events occurred in which Gospels. In effect, it’s a visualization of the Composite Gospel Index, though the visualization is apparently using a different data set.

A Single Story. The second visualization, “A Single Story—The Bible through the Lens of 250 Events,” starts with the same 250 events but extends references to them into the Old Testament and the remainder of the New Testament. This visualization shows the centrality of Christ in the Bible.

Both visualizations convey useful information. I wish they were available in larger sizes, but even the low-resolution versions give you a sense of the message behind the images.

Phrase Net Bible Visualizations

Wednesday, March 25th, 2009

The social data visualization site Many Eyes just unveiled a new visualization type, the Phrase Net, which illustrates phrase connections in textual content. The hard part is finding the right phrase to produce a good visualization.

Here are a couple of visualizations of the Old and New Testaments in the KJV Bible, using the pattern “[word] of [word].”

Old Testament

The Old Testament visualization illustrates the centrality of ideas like “Children of Israel,” “King of Israel,” and “Land of Israel.”

See the Old Testament visualization at Many Eyes (requires Java)

New Testament

The New Testament visualization shows the shift in emphasis to “Son of God” and “Kingdom of God.”

See the New Testament visualization at Many Eyes (requires Java)

Both visualizations require Java.

Top 100 Things Twitterers Are Giving Up for Lent

Friday, February 27th, 2009

A Wordle of the words below shows the relative frequency of each one.

Some you’d expect (alcohol, chocolate), some are ironic (giving up Lent for Lent, giving up giving up things), some are odd (pants, lint), some are anti-religious (religion, Catholicism), and some are tech-related (Facebook, Twitter—even “Facebook and Twitter” makes the list).

Complete List

  1. Facebook (654)
  2. Twitter (317)
  3. Chocolate (272)
  4. Lent (216)
  5. Alcohol (187)
  6. Soda (139)
  7. Coffee (129)
  8. Meat (126)
  9. Religion (102)
  10. Swearing (94)
  11. Sweets (92)
  12. Catholicism (90)
  13. Giving up things (80)
  14. Work (70)
  15. Beer (60)
  16. Sex (59)
  17. Fast food (57)
  18. Facebook and twitter (57)
  19. Sugar (45)
  20. Stuff (43)
  21. Booze (41)
  22. Smoking (39)
  23. Food (39)
  24. Procrastination (38)
  25. Internet (37)
  26. Cursing (36)
  27. Caffeine (35)
  28. TV (33)
  29. Pancakes (33)
  30. Social networking (33)
  31. Sleep (32)
  32. Candy (32)
  33. Diet Coke (29)
  34. Giving up (29)
  35. You (28)
  36. Wine (28)
  37. Lint (28)
  38. Cheese (28)
  39. Bread (26)
  40. Shopping (26)
  41. Sobriety (26)
  42. Abstinence (24)
  43. Cussing (24)
  44. Red meat (24)
  45. Chips (23)
  46. Internet porn (22)
  47. Christianity (22)
  48. Nothing (21)
  49. French fries (21)
  50. Jesus (21)
  51. Sarcasm (19)
  52. Junk food (19)
  53. Starbucks (18)
  54. Ice cream (18)
  55. MySpace (18)
  56. Cookies (18)
  57. Fried food (17)
  58. Complaining (17)
  59. God (16)
  60. New years resolutions (15)
  61. Social media (15)
  62. Pizza (14)
  63. Tweeting (14)
  64. Carbs (13)
  65. MySpace and Facebook (13)
  66. Carbon (13)
  67. Eating out (13)
  68. Stress (13)
  69. Flaky guys (12)
  70. Laziness (12)
  71. Texting (12)
  72. Me (11)
  73. Some of your money (11)
  74. Annoying me (11)
  75. Sacrifice (11)
  76. School (11)
  77. Hope (10)
  78. Rice (10)
  79. Coke (10)
  80. Porn (10)
  81. The snooze button (10)
  82. Guilt (10)
  83. Men (9)
  84. Obama (9)
  85. Church (9)
  86. My job (9)
  87. Homework (9)
  88. Self denial (9)
  89. Moderation (9)
  90. Exercise (8)
  91. Bacon (8)
  92. Dieting (8)
  93. Paying taxes (8)
  94. Dr Pepper (8)
  95. Gossip (8)
  96. Beef (8)
  97. Pants (7)
  98. My sanity (7)
  99. Celibacy (7)
  100. Shaving (7)


Created using the Twitter Search API and Wordle. Data based on analysis of 15,000 tweets from February 22-26, 2009.

New Feature: Bible Verse Photo Composites

Monday, February 4th, 2008

Try Bible Verse Photo Composites. Move your mouse over the image to see a photo composite for a particular verse, and click to see composites for the words in that verse.

An example:
Genesis 1:1 shows images over the words “beginning,” “God,” “created,” “heavens,” and “earth.”

Here’s the idea: Use the Flickr API to find photos matching each of the words in the Bible. Then download the photos for each word and layer them on top of each other to produce a composite image for each word. Once you do that, layer the important words from each verse on top of each other to produce a composite image for each verse.

Then put all the verses together in sequence to create the orangeish image you see above. About 300,000 images representing 13,000 words make up the image. Each verse occupies about six pixels.

Technical Background

For layering the photos, I wanted the brightest (most-saturated) colors possible, so I weighted each pixel with a simple formula: the difference between its brightest and darkest channels. Brighter colors tend to have bigger differences.

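function weight_pixel($pixel)
{
    // Split the packed 24-bit color into its red, green, and blue channels.
    $rgb = array(($pixel >> 16) & 0xFF, ($pixel >> 8) & 0xFF, $pixel & 0xFF);
    sort($rgb, SORT_NUMERIC);
    // Weight = brightest channel minus darkest channel.
    return $rgb[2] - $rgb[0];
}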

This formula differs from the usual conversion formula from RGB to HSL. I didn’t like the results of the RGB-HSL formula as much; the images were slightly darker.
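
To make the difference concrete, here is a small comparison, written in Python rather than the PHP above, of the channel-spread weight against HSL saturation as computed by Python's standard `colorsys` module. The helper names are made up for the example. The link to darker images is only a guess at one contributing factor, not something verified against the original photos.

```python
import colorsys


def channel_spread(r, g, b):
    """Brightest channel minus darkest channel (0-255), like weight_pixel."""
    return max(r, g, b) - min(r, g, b)


def hsl_saturation(r, g, b):
    """HSL saturation in the range 0.0-1.0."""
    # colorsys returns hue, lightness, saturation in that order.
    _hue, _lightness, saturation = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return saturation


samples = [
    ("pure red", (255, 0, 0)),
    ("dark red", (64, 0, 0)),
    ("gray", (128, 128, 128)),
]
for name, rgb in samples:
    print(name, channel_spread(*rgb), round(hsl_saturation(*rgb), 2))
```

HSL saturation rates a dark red as fully saturated, the same as a pure red, while the channel spread ranks the dark red far lower. A weighting based on HSL saturation would therefore let dark pixels compete with bright ones, which may be part of why those results looked darker.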

Then I placed the darkest pixels on the bottom layer of the composite image, layering brighter (but about 50% transparent) pixels on top. The results for each word resemble abstract art:

Word composites for “11,” “begrudge,” “liquid,” “sharp,” and “waterfalls.”
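
The layering step might look something like this for a single pixel position, again sketched in Python rather than PHP. It assumes that "darkest" means the lowest weight from the formula above and that "about 50% transparent" means an ordinary alpha blend at 0.5. The names `weight` and `composite_pixel` and the sample colors are hypothetical.

```python
def weight(pixel):
    """Brightest channel minus darkest channel of an (r, g, b) tuple."""
    return max(pixel) - min(pixel)


def composite_pixel(pixels, alpha=0.5):
    """Stack one pixel from each photo, lowest weight at the bottom."""
    ordered = sorted(pixels, key=weight)
    result = ordered[0]
    for top in ordered[1:]:
        # Alpha-blend the next (higher-weight) layer over what is below it.
        result = tuple(alpha * t + (1 - alpha) * r for t, r in zip(top, result))
    return tuple(round(channel) for channel in result)


# The same pixel position in three hypothetical photos.
stack = [(200, 40, 40), (100, 100, 100), (60, 120, 200)]
print(composite_pixel(stack))
```

Because each new layer contributes half of the running result, the highest-weight photos dominate the final color. That fits with the vivid look of the word composites.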

I followed a similar procedure to produce verse composites, layering the important words from each verse on top of each other. The results are generally less spectacular (in my opinion) than the word composites, as many of the verses look like each other. I’d like to explore other compositing algorithms to try to differentiate the verses more. (Feel free to leave a comment if you have any suggestions for an algorithm!)

Verse composites for Gen 1:2, Josh 3:10, and Matt 5:4.

The hardest part was the waiting. It took a long time to download the 300,000 images, and nearly as long to process them all. Plus, it takes a while to upload half a gigabyte of data after processing the images.

The verse composites omit images for about 200 common words (as listed in the Crossway Comprehensive Concordance of the ESV). I didn’t want common words to overwhelm the important ones.

From a coding standpoint, this project let me try out the jQuery JavaScript library, which I’ve been wanting to do for some time.

Future Directions

Well, the result is awfully orange. As I said above, I would’ve liked to see some more differentiation. (I tried a few different formulas but didn’t come up with anything that works much better.) You can see some bands that are more orange than others, but I’m not sure how significant they are.

It would be interesting to try to calculate cross-references based on the similarity of the verse composites—are verses that look alike actually similar?

Any work of literature would lend itself to this kind of project. Project Gutenberg has lots of public-domain e-texts available. It would be interesting to compare the composite footprint of, say, Moby Dick to the Bible’s.


The inspiration for this project comes from the 80 Million Tiny Images project by Antonio Torralba, Rob Fergus, and William T. Freeman at MIT. They created a map of nouns in the English language by downloading images from search engines, combining the images, and then arranging them into a graphic based on the words’ semantic distance from each other. Fascinating stuff. Via ReadWriteWeb.