Brain Pickings

From Galileo to Google: How Big Data Illuminates Human Culture


“Through our scopes, we see ourselves. Every new lens is also a new mirror.”

Given my longtime fascination with the so-termed digital humanities and with data visualization, and my occasional dabbles in the intersection of the two, I’ve followed the work of data scholars Erez Aiden and Jean-Baptiste Michel with intense interest since its public beginnings. Now, they have collected and contextualized their findings in the compelling Uncharted: Big Data as a Lens on Human Culture (public library), a stimulating record of their seven-year quest to quantify cultural change through the dual lens of history and digital data. They analyze the contents of the more than 30 million books digitized by Google, using Google’s Ngram Viewer to explore how the usage frequency of specific words changes over time and what that might reveal about corresponding shifts in our cultural values and beliefs about economics, politics, health, science, the arts, and more.
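For a feel of what “usage frequency over time” means in practice, here is a minimal sketch of the underlying idea: for each year, divide the number of times a word appears by the total number of words published that year. The two-year toy corpus and the function name are hypothetical, invented for illustration; this is not Google’s implementation or Aiden and Michel’s actual pipeline.

```python
from collections import Counter

# Hypothetical toy corpus: year -> all the text "published" that year.
corpus = {
    1900: "the war ended and the peace began",
    1950: "the war the war and more war",
}

def relative_frequency(word, text):
    """Share of whitespace-separated tokens in `text` equal to `word`."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return Counter(tokens)[word.lower()] / len(tokens)

# One data point per year, in chronological order.
trend = {
    year: relative_frequency("war", text)
    for year, text in sorted(corpus.items())
}

for year, share in trend.items():
    print(year, round(share, 3))
```

Normalizing by the yearly total matters because far more books are published in some years than in others; raw counts alone would mostly track the size of the library.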

Aiden and Michel, who met at Harvard’s Program for Evolutionary Dynamics and dubbed their field of research “culturomics,” contextualize the premise:

At its core, this big data revolution is about how humans create and preserve a historical record of their activities. Its consequences will transform how we look at ourselves. It will enable the creation of new scopes that make it possible for our society to more effectively probe its own nature. Big data is going to change the humanities, transform the social sciences, and renegotiate the relationship between the world of commerce and the ivory tower.

And big data is indeed big — humongous, even. Each of us, on average, has an annual data footprint of nearly one terabyte, and together we amount to a staggering five zettabytes per year. Since each byte consists of eight bits — short for “binary digits,” with each bit representing a binary yes-no question answered either by a 1 (“yes”) or a 0 (“no”) — humanity’s aggregate annual data footprint is equivalent to a gobsmacking forty sextillion (40,000,000,000,000,000,000,000) bits. Aiden and Michel humanize these numbers, so challenging for the human brain to grasp, with a pause-giving analog analogy:

If you wrote out the information contained in one megabyte by hand, the resulting line of 1s and 0s would be more than five times as tall as Mount Everest. If you wrote out one gigabyte by hand, it would circumnavigate the globe at the equator. If you wrote out one terabyte by hand, it would extend to Saturn and back twenty-five times. If you wrote out one petabyte by hand, you could make a round trip to the Voyager 1 probe, the most distant man-made object in the universe. If you wrote out one exabyte by hand, you would reach the star Alpha Centauri. If you wrote out all five zettabytes that humans produce each year by hand, you would reach the galactic core of the Milky Way. If instead of sending e-mails and streaming movies, you used your five zettabytes as an ancient shepherd might have—to count sheep—you could easily count a flock that filled the entire universe, leaving no empty space at all.
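The forty-sextillion figure quoted earlier is easy to verify with a few lines of arithmetic. The sketch below assumes decimal (SI) prefixes, which the text does not specify, and the variable names are mine. The implied headcount at the end is only a consistency check derived from the two quoted figures, not a number from the book.

```python
# Assumption: decimal (SI) prefixes, i.e. 1 terabyte = 10**12 bytes
# and 1 zettabyte = 10**21 bytes.
BITS_PER_BYTE = 8
TERABYTE = 10**12
ZETTABYTE = 10**21

# Humanity's aggregate annual data footprint, as quoted: five zettabytes.
annual_bytes = 5 * ZETTABYTE
annual_bits = annual_bytes * BITS_PER_BYTE

# How many one-terabyte footprints fit into five zettabytes.
implied_people = annual_bytes // TERABYTE

print(f"{annual_bits:,} bits per year")
print(f"{implied_people:,} one-terabyte footprints")
```

The first line printed is 40 followed by twenty-one zeros, matching the forty sextillion bits cited above. The second comes to five billion, which squares with a per-person footprint of “nearly” one terabyte spread across a somewhat larger population.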

But what makes our age unlike any preceding era is precisely that this information exists not as handwritten documents but as digital data, which opens up wholly new frontiers of making sense of the meaning embedded in these seemingly meaningless strings of 1s and 0s. Aiden and Michel put it beautifully:

Like an optic lens, which makes it possible to reliably transform and manipulate light, digital media make it possible to reliably transform and manipulate information. Given enough digital records and enough computing power, a new vantage point on human culture becomes possible, one that has the potential to make awe-inspiring contributions to how we understand the world and our place in it.

Aiden and Michel have focused their efforts on one particular, and particularly important, aspect of the big-data universe: books. More specifically, the more than 30 million books digitized by Google, or roughly a quarter of humanity’s existing books. They call this digital library “one of the most fascinating datasets in the history of history,” and it certainly is, not least because of its scale, which exceeds the collections of any university library, from Oxford’s 11 million volumes to Harvard’s 17 million, as well as the National Library of Russia with its 15 million and the National Library of China with its 26 million. At the outset of Aiden and Michel’s project, the only analog library still greater than the Google Books collection was the Library of Congress, which contains 33 million, but Google may well have surpassed that number by now.

Still, big data presents a number of problems. For one, it’s messy — something that doesn’t sit well with scientists’ preference for “carefully constructed questions using elegant experiments that produce consistently accurate results,” Aiden and Michel point out. By contrast, a big dataset tends to be “a miscellany of facts and measurements, collected for no scientific purpose, using an ad hoc procedure … riddled with errors, and marred by numerous, frustrating gaps.”

To further complicate things, big data doesn’t comply with the basic premise of the scientific method: rather than yielding causal relationships born out of pre-existing hypotheses, it presents a seemingly bottomless pit of correlations awaiting discovery, often through a combination of doggedness and serendipity, an approach diametrically opposed to hypothesis-driven research. But that, arguably, is exactly what makes big data so alluring. As Stuart Firestein has argued in his fantastic case for why ignorance rather than certitude drives science, modern science could use what the scientific establishment so readily dismisses as “curiosity-driven research”: exploratory, hypothesis-free investigations of processes, relationships, and phenomena.

Michel and Aiden address these biases of science:

As we continue to stockpile unexplained and underexplained patterns, some have argued that correlation is threatening to unseat causation as the bedrock of scientific storytelling. Or even that the emergence of big data will lead to the end of theory. But that view is a little hard to swallow. Among the greatest triumphs of modern science are theories, like Einstein’s general relativity or Darwin’s evolution by natural selection, that explain the cause of a complex phenomenon in terms of a small set of first principles. If we stop striving for such theories, we risk losing sight of what science has always been about. What does it mean when we can make millions of discoveries, but can’t explain a single one? It doesn’t mean that we should give up on explaining things. It just means that we have our work cut out for us.

Such curiosity-driven inquiries speak to the heart of science — the eternal question of what science actually is — which Michel and Aiden capture elegantly:

What makes a problem fascinating? No one really agrees. It seemed to us that a fascinating question was something that a young child might ask, that no one knew how to answer, and for which a few person-years of scientific exploration — the kind of effort we could muster ourselves — might result in meaningful progress. Children are a great source of ideas for scientists, because the questions they ask, though superficially simple and easy to understand, are so often profound.

Indeed, indeed.

The promise of big data, it seems, is at once to return us to the roots of our childlike curiosity and to advance science to new frontiers of understanding the world. Much like the invention of the telescope transformed modern science and empowered thinkers like Galileo to spark a new understanding of the world, the rise of big data, Aiden and Michel argue, offers to “create a kind of scope that, instead of observing physical objects, would observe historical change” — and, in the process, to catapult us into unprecedented heights of knowledge:

The great promise of a new scope is that it can take us to uncharted worlds. But the great danger of a new scope is that, in our enthusiasm, we too quickly pass from what our eyes see to what our mind’s eye hopes to see. Even the most powerful data yields to the sovereignty of its interpreter. … Through our scopes, we see ourselves. Every new lens is also a new mirror.

They illustrate this with an example by way of Galileo himself, who began a series of observations of Mars in the fall of 1610 and soon noticed something remarkably curious: Mars seemed to be getting smaller and smaller as the months progressed, shrinking down to a third of its September size by December. This, of course, indicated that the planet was drifting farther and farther from Earth, which went on to become an essential piece of evidence demonstrating that the Ptolemaic idea of the geocentric universe was wrong: Earth wasn’t at the center of the cosmos, and the planets were moving according to their own orbits.

But Galileo, with his primitive telescope, couldn’t see any detail of the red planet’s surface. That didn’t happen until centuries later, when an astronomer by the name of Giovanni Schiaparelli aimed his far more powerful telescope at Mars. Suddenly, before his eyes were mammoth ridges that covered the planet’s surface like painted lines. These findings made their way to a man named Percival Lowell and impressed him so that in 1894, he built an entire observatory in Flagstaff, Arizona, equipped with a yet more powerful telescope, so that he could observe those mysterious lines. Lowell and his team went on to painstakingly record and map Mars’s mesh of nearly 700 criss-crossing “canals,” all the while wondering how they might have been created.

One of Lowell's drawings of the Martian canals.

Turning to the previous century’s theory that Mars’s scarce water reserves were contained in the planet’s frozen poles, Lowell assumed that the lines were a meticulous network of canals made by the inhabitants of a perishing planet in an effort to rehydrate it back to life. Based solely on his telescopic observations and the hypotheses of yore, Lowell concluded that Mars was populated by intelligent life — a “discovery” that at once excited and riled the scientific community, and even permeated popular culture. Even Henry Norris Russell, the unofficial “dean of American astronomers,” called Lowell’s ideas “perhaps the best of the existing theories, and certainly the most stimulating to the imagination.” And so they were — by 1898, H.G. Wells had penned The War of the Worlds.

While Lowell’s ideas dwindled in the decades that followed, they still held their appeal. It wasn’t until NASA’s landmark Mariner mission beamed back close-up photos of Mars (the significance of which Carl Sagan, Ray Bradbury, and Arthur C. Clarke famously debated) that the anticlimactic reality set in: There were no fanciful irrigation canals, and no little green men who built them.

The moral, as Aiden and Michel point out, is that “Martians didn’t come from Mars: They came from the mind of [Lowell].”

What big data offers, then, is hope for unbridling some of our cultural ideas and ideologies from the realm of myth and anchoring them instead to the spirit of science — which brings us to the crux of the issue:

Digital historical records are making it possible to quantify our human collective as never before.


Human history is much more than words can tell. History is also found in the maps we drew and the sculptures we crafted. It’s in the houses we built, the fields we kept, and the clothes we wore. It’s in the food we ate, the music we played, and the gods we believed in. It’s in the caves we painted and the fossils of the creatures that came before us. Inevitably, most of this material will be lost: Our creativity far outstrips our record keeping. But today, more of it can be preserved than ever before.

What makes Aiden and Michel’s efforts particularly noteworthy, however, is that they are as much a work of scrupulous scholarship as of passionate advocacy. They are doing for big data in the humanities what Neil deGrasse Tyson has been doing for space exploration, instigating both cultural interest and government support. They remind us that in today’s era of big science, where the Human Genome Project’s price tag was $3 billion and the Large Hadron Collider’s quest for the Higgs boson cost $9 billion, there is an enormous disconnect between the cultural value of the humanities and the actual price we put on better understanding human history — by contrast to such big science enterprises, the entire annual budget of the National Endowment for the Humanities is a mere $150 million. Michel and Aiden remind us just what’s at stake:

The problem of digitizing the historical record represents an unprecedented opportunity for big-science-style work in the humanities. If we can justify multibillion-dollar projects in the sciences, we should also consider the potential impact of a multibillion-dollar project aimed at recording, preserving, and sharing the most important and fragile tranches of our history to make them widely available for ourselves and our children. By working together, teams of scientists, humanists, and engineers can create shared resources of extraordinary power. These efforts could easily seed the Googles and Facebooks of tomorrow. After all, both these companies started as efforts to digitize aspects of our society. Big humanities is waiting to happen.

And yet the idea is nothing new. Count on the great Isaac Asimov to have presaged it, much like he did online education, the fate of space exploration, and even Carl Sagan’s rise to stardom. In his legendary Foundation trilogy, Asimov conceives his hero, Hari Seldon, as a masterful mathematician who can predict the future through complex mathematical equations rooted in aggregate measurements about the state of society at any given point in time. Like Seldon, who can’t anticipate what any individual person will do but can foreshadow larger cultural outcomes, big data, Aiden and Michel argue, is the real-life equivalent of Asimov’s idea, which he termed “psychohistory” — an invaluable tool for big-picture insight into our collective future.

Perhaps more than anything, however, big data holds the promise of righting the balance of quality over quantity in our culture of information overabundance, helping us to extract meaning from (digital) matter. In a society that tweets more words every hour than all of the surviving ancient Greek texts combined, we certainly could use that.

Uncharted is an excellent and timely read in its entirety, both as a curious window into the secret life of language and as an important piece of advocacy for the value of the digital humanities in the age of data. Sample the project with Aiden and Michel’s entertaining and illuminating TED talk.

Donating = Loving

Bringing you (ad-free) Brain Pickings takes hundreds of hours each month. If you find any joy and stimulation here, please consider becoming a Supporting Member with a recurring monthly donation of your choosing, between a cup of tea and a good dinner:

You can also become a one-time patron with a single donation in any amount:

Brain Pickings has a free weekly newsletter. It comes out on Sundays and offers the week’s best articles. Here’s what to expect. Like? Sign up.

Party Like It’s 1903: Virginia Woolf on the Ecstasy of Music and Dance


“Dance music … stirs some barbaric instinct — lulled asleep in our sober lives — you forget centuries of civilization in a second, & yield to that strange passion which sends you madly whirling round the room.”

“Oh, how wonderful! How like the mind it is!” Helen Keller exclaimed in her moving first experience of dance. “Even poetry, Sweet Patron Muse forgive me the words, is not what music is,” young Edna St. Vincent Millay wrote in a letter to a friend. “Twyla Tharp reconciles me to being a woman … Non-sexist dancing — strong women with their own energy, subjects not objects, playful with men — not afraid of them,” Susan Sontag mused in her diary.

From A Passionate Apprentice: The Early Journals, 1897–1909 (public library) — the same wonderfully rich volume that gave us young Virginia Woolf on imitation and the arts and the glory of the human mind — comes a glimpse of a lesser-known side of the seemingly reserved author: Her love of music and dance.

In an essayistic entry from 1903, titled “A Dance at Queen’s Gate” and reproduced here with her original spelling, 21-year-old Virginia writes:

About two hours ago, when I went to bed, I heard what I took to be signs of merry making in the mews. A violin squeaked, there was a noise of loud voices & laughter. It reminded me how once, as a child, I woke at dead of night: it seemed to me — 8 or 9 I suppose really & I heard strange & horrible music as of a midnight barrel organ, & was so frightened that I had to crawl to the cot next mine for sympathy. But I am too old for that kind of blind terror; my critical mind when awake enough to think at all about it, decided that the fiddle squeaking &c. was token of a ball — not in our street — but in Queens Gate — the tall row of houses that makes a background to the mews. The music grew so loud, so rhythmic — as the night drew on & the London roar lessened, that I threw up my window, leant out into the cool air, & saw the illuminations which told surely from what house the music came.

Now I have been listening for an hour. The music stops — I hear the chatter, the light laughter of womens voices — the deeper notes of festive males. I can almost see the couples wandering out from the ball rooms to the balconies which are starred with small lamps. They look straight across the mews to me. The music has begun again — oh dear — the swing & the lilt of that waltz makes me almost feel as though I could jump from my bed & dance to it too. That is the quality which dance music has — no other: it stirs some barbaric instinct — lulled asleep in our sober lives — you forget centuries of civilization in a second, & yield to that strange passion which sends you madly whirling round the room — oblivious of everything save that you must keep swaying with the music — in & out, round & round — in the eddies & swirls of the violins. It is as though some swift current of water swept you along with it. It is magic music. Here the bars run low, passionate, regretful, but always in the same pulse. We dance as though we knew the vanity of dancing. We dance to drown our sorrows — but dance, dance — If you stop you are lost. This one night we will be mad — dance lightly — raise our hearts as the beat strengthens, grows buoyant — careless, defiant. What matters anything so long as ones step is in time — so long as one’s whole body & mind are dancing too — what shall end it?

Dinomania: (n) irresistible urge to dance

Artwork by Polly M. Law from her Word Project.

After a short contemplation of the fabric of the music, noting “the very height of the rhythm, some strange, solitary sound,” Woolf finds herself exhausted and consumed by the dense darkness of the night sky, then returns to the exhilaration of dance — but this time as a melancholy observer, painting an ominous, zombie-like picture of the dancing throng:

The music again! I begin to think someone has wound up this weary waltz & it will go on at intervals all thro’ the night. Nobody is dancing in time to it now I am sure — or they dance as pale phantoms because so long as the music sounds they must dance — no help for them. Surely the music that seemed to ebb before, has gathered strength — it sounds louder & louder — it swings faster & faster — no one can stop dancing now. They are sucked in by the music. And how weary they look — pale men — fainting women — crumpled silks & trampled flowers. They are no longer masters of the dance — it has taken possession of them. And all joy & life has left it, & is diabolical, a twisting livid serpent, writhing in cold sweat & agony, & crushing the frail dancers in its contortions. What has brought about the change? It is the dawn.

Complement A Passionate Apprentice with the only surviving recording of Woolf’s voice and her timeless meditations on how to read a book, the language of film, and the creative benefits of keeping a diary.

Kurt Vonnegut on the Secret of Happiness: An Homage to Joseph Heller’s Wisdom


The meaning of life, in a short verse.

“Don’t make stuff because you want to make money — it will never make you enough money. And don’t make stuff because you want to get famous — because you will never feel famous enough,” John Green advised aspiring writers. “If you worship money and things … then you will never have enough. Never feel you have enough. It’s the truth,” David Foster Wallace admonished in his timeless commencement address on the meaning of life. But what does it really mean to “have enough?”

There is hardly a better answer than the one implicitly given by Kurt Vonnegut (man of discipline, champion of literary style, modern sage, one wise dad) in a poem he wrote for The New Yorker in May of 2005, reprinted in Robert Sutton’s The No Asshole Rule: Building a Civilized Workplace and Surviving One That Isn’t (public library) with Vonnegut’s permission:


True story, Word of Honor:
Joseph Heller, an important and funny writer
now dead,
and I were at a party given by a billionaire
on Shelter Island.

I said, “Joe, how does it make you feel
to know that our host only yesterday
may have made more money
than your novel ‘Catch-22’
has earned in its entire history?”
And Joe said, “I’ve got something he can never have.”
And I said, “What on earth could that be, Joe?”
And Joe said, “The knowledge that I’ve got enough.”
Not bad! Rest in peace!

Complement with Vonnegut on how to write with style, the writer’s responsibility and the limitations of the brain, the shapes of stories, his daily routine, his heart-warming advice to his children, and his favorite erotic illustrations.
