
Creating a Global Dashboard. The GDELT Project

There is probably no bigger dream for a data geek like myself than creating the ultimate data dashboard or scorecard of the world, one that summarizes all the data in the world and makes it possible to analyze it.

Well, for those of you who have also dreamt about this, Kalev H. Leetaru, a senior fellow at the George Washington University Center for Cyber & Homeland Security, has tapped into your dreams and is working on something in this realm. Leetaru, whom some have called “The Wizard of Big Data,” is developing a platform for monitoring and better understanding how human society works.

The project, called the Global Database of Events, Language, and Tone, or simply The GDELT Project, is an ambitious endeavor created to “crack” the social numbers of the world, with the aim of improving our understanding of human society.

As described by the folks at GDELT:

The GDELT Project came from a desire to better understand global human society and especially the connection between communicative discourse and physical societal-scale behavior. The vision of the GDELT Project is to codify the entire planet into a computable format using all available open information sources that provides a new platform for understanding the global world.

To do this, The GDELT Project has collected information dating back to 1979 and updates it regularly, so its catalogs are always fresh. According to GDELT, the project already holds more than a quarter billion event records in more than 300 categories. It also maintains a massive network diagram that connects each individual with the entities and events around them, such as locations, organizations, themes, emotions, and other data.

Information is gathered from many sources, including Google, Google Ideas, Google News, the Internet Archive, and BBC Monitoring, among many others.

So what makes The GDELT Project so interesting?

Well, it’s a perfect opportunity for data aficionados to lay their hands on social data from around the world in three different ways:

  1. Using the GDELT Analysis Service, a free cloud-based offering that includes tools and services to visualize, explore, and export the data.
  2. Using the complete dataset available on Google’s BigQuery service.
  3. Downloading the data in CSV format.

This allows different types of users to get their hands on the data in the way that suits them best. So, for immediate consumption and analysis, users can go with the first option. Users with more specific requirements or with complex projects can use the data provided by the second or third option.
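For readers who go with the download option, here is a minimal sketch in Python of what a first pass over an exported file might look like: counting event records per category. The sample data, the column names (`event_id`, `event_date`, `category`), and the helper `count_by_column` are all hypothetical and only there for illustration. The real GDELT files may use a different delimiter and may not include a header row, so check the project's documentation before adapting this.

```python
import csv
import io
from collections import Counter


def count_by_column(text_stream, column, delimiter=","):
    """Count rows per distinct value of `column` in a delimited file
    that has a header row. Hypothetical helper, not part of GDELT."""
    reader = csv.DictReader(text_stream, delimiter=delimiter)
    return Counter(row[column] for row in reader)


# Hypothetical sample data standing in for a downloaded file.
# The column names and values are assumptions, not the GDELT schema.
SAMPLE = (
    "event_id,event_date,category\n"
    "1,19790101,protest\n"
    "2,19790101,statement\n"
    "3,19790102,protest\n"
)

counts = count_by_column(io.StringIO(SAMPLE), "category")

# Print categories from most to least frequent.
for category, n in counts.most_common():
    print(category, n)
```

To run the same count on a file on disk, pass an open file handle in place of the `io.StringIO` wrapper, and set `delimiter` to match the file.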

Whichever way you choose to access the worldwide data, this could be a great opportunity for you, my dear data junkie, to explore and embark upon a data deluge journey for a new school, entrepreneurial, or just playtime project.

This is just a brief intro to a really cool project. I’ll update you on major advancements of The GDELT Project as they come along.

In the meantime, I would encourage you to have a look at this nice 20-minute video about The GDELT Project.

As always, you can also drop me a line below. Enjoy.

