Never heard of Culturomics? No worries, just get a feel for its power by checking your favorite words in the Google Labs Ngram Viewer.
“Culturomics is a form of computational lexicology that studies human behavior and cultural trends through the analysis of digitized texts. Researchers data mine large digital archives to investigate cultural phenomena reflected in language and word usage. The term is an American neologism first described in a 2010 ‘Science’ article called ‘Quantitative Analysis of Culture Using Millions of Digitized Books,’ co-authored by Harvard researchers Jean-Baptiste Michel and Erez Lieberman Aiden.”
In this first talk, entitled “What can we learn from 5 million books”, Jean-Baptiste Michel and Erez Lieberman Aiden introduce the concept of Ngrams, a text-mining tool, and apply it to the analysis of human historical and cultural trends.
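To make the idea concrete: an Ngram is simply a run of n consecutive words, and counting how often each run occurs is the basic operation behind this kind of analysis. Here is a minimal Python sketch, assuming naive whitespace tokenization and a made-up sample sentence. The function names are illustrative and are not taken from the Google Ngram tooling.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n):
    """Count the Ngrams in a text, using lowercase whitespace-split tokens."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, n))


# Hypothetical sample text, for illustration only.
sample = "the quick brown fox jumps over the quick brown dog"
bigram_counts = ngram_counts(sample, 2)

# Show the two most frequent 2-grams (bigrams) with their counts.
print(bigram_counts.most_common(2))
```

A real corpus would need proper tokenization (punctuation, casing, hyphenation) and, to plot trends over time, counts grouped by publication year and normalized by the total number of words in that year.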
As an extension of these concepts, Kalev H. Leetaru introduces Culturomics 2.0. By applying Ngrams to digital and social media, he presents a set of fascinating conclusions, such as anticipating the Arab Spring uprisings or pinpointing Bin Laden’s hideout.
Ngrams are also becoming enormously important in genomics, machine learning and natural language processing. If you would like to know more, have a look at the following links:
- N-gram analysis of 970 microbial organisms reveals presence of biological language models
- A visual framework for sequence analysis using n-grams and spectral rearrangement
- Daily Ngrams by WordPress
- The Berkeley Natural Language Processing Group
- Bigger, Better Google Ngrams: Brace Yourself for the Power of Grammar