Crisis-mapping technology has emerged in the past five years as a tool to help humanitarian organizations deliver assistance to victims of civil conflicts and natural disasters. Crisis-mapping platforms display eyewitness reports submitted via e-mail, text message, and social media. The reports are then plotted on interactive maps, creating a geospatial record of events in real time.
The first generation of these humanitarian technologies was powered by free, open-source software produced by organizations such as InSTEDD, Sahana, and Ushahidi. For example, Ushahidi (the name means “witness” or “testimony” in Swahili) developed an interactive-mapping platform linked to a live multimedia inbox and used it to document violence that erupted in Kenya after the disputed presidential election of late 2007. Eyewitnesses sent reports of ethnic attacks and other violent incidents to the Ushahidi Web site via e-mail and text message. Ushahidi then plotted the location of each incident on a Google map, creating a public record of events.
The Ushahidi platform was later used to crowdsource a live crisis map of the 2010 earthquake in Haiti. In the days and weeks following the earthquake, eyewitnesses submitted a large volume of text messages, tweets, photographs, video, and Web-based reports to the Ushahidi in-box. Once these reports were manually collated and plotted on the Ushahidi platform, they became a live crisis map of urgent humanitarian needs. For example, the map showed exactly where victims lay buried under the rubble of collapsed buildings, and where medical supplies needed to be delivered. The US Marine Corps, one of the first responders to the earthquake, has stated that the map helped save hundreds of lives. The Ushahidi platform has since been used in response to dozens of other disasters worldwide.
The pioneers behind the first wave of crisis-mapping technology were typically gifted hackers from the dynamic open-source community. Creating the next generation of these technologies will require additional skills in data analytics, artificial intelligence, machine learning, and social computing. This kind of expertise exists today in world-class research institutes staffed by experts who have the wherewithal to carry out cutting-edge R&D in multiple areas of advanced computing.
To understand what the next generation of humanitarian technology will look like, it helps to understand the limitations of today’s crisis-mapping platforms. I served as director of crisis mapping at Ushahidi, where I led a number of major crisis deployments, starting with the Haiti earthquake. Within a few hours of the earthquake, I started mapping Twitter and other social-media traffic related to the disaster and building the code that would allow our system to accept text messages about the quake. A few days later, hundreds of texts from disaster-affected communities in Port-au-Prince started landing in the Ushahidi in-box. Each incoming text had to be manually categorized and geotagged. For example, texts about earthquake victims buried under rubble were tagged as “trapped individuals” and georeferenced to the locations where individuals were thought to be buried.
We quickly realized that our platform was not equipped to handle this high volume and velocity of urgent information. For example, we had hundreds of volunteers available to process text messages, but our system could only accommodate a half-dozen volunteers at any one time. To handle the huge number of text messages pouring into Ushahidi’s in-box, we had to work outside our platform. We customized a third-party ticketing system to track incoming texts. That system allowed many more volunteers to categorize and tag urgent text messages at the same time, but it meant that we had to manually import messages from the Ushahidi platform and then export them back to Ushahidi after processing. While not ideal, this was the only working solution that we could rapidly deploy. Yet even with this approach in place, the backlog of unprocessed text messages grew larger with every passing day.
Fast-forward to the Japanese earthquake and tsunami in 2011, when eyewitnesses and other observers posted more than 300,000 tweets every minute during the disaster and its aftermath. In the fall of 2012, Hurricane Sandy struck the eastern seaboard of the United States, eliciting more than 20 million tweets. Welcome to the world of big (crisis) data, in which disaster-affected locations are increasingly becoming digital communities, thanks to the proliferation of social media and smartphones.
After three years with the Ushahidi team, I began to look for a new home where I could help create the next generation of humanitarian-technology solutions. I found this home at the Qatar Computing Research Institute (QCRI) in Doha. QCRI was launched two years ago to carry out world-class R&D in multiple areas of advanced computing, including big-data analytics, distributed systems, and social computing. As a member of the Qatar Foundation, QCRI has a mandate that includes social impact. I was brought on as director of social innovation and given the task of harnessing the world-class expertise at QCRI to address major humanitarian challenges. We have 80 researchers on staff and may double our team in 2014, and again in 2016. My colleagues come from both industry and academia, hailing from institutions such as Microsoft Research, IBM, Yahoo Research, MIT, Georgia Tech, and the Max Planck Institutes.
One of my first moves at QCRI was to set up a crisis-computing team. Our first order of business? Finding a solution to exploding Ushahidi in-boxes. After months of data-driven research on the operational value of Twitter for crisis response, plus conversations with the United Nations Office for the Coordination of Humanitarian Affairs, we decided to create a “Twitter Dashboard for Disaster Response.” We are developing Twitter “classifiers,” algorithms that can automatically identify relevant and informative tweets during crises. Individual classifiers will automatically capture eyewitness reports, infrastructure-damage assessments, casualties, humanitarian needs, offers of help, and so forth.
Our initial results have been promising, with accuracy rates ranging from 70 to 90 percent. This means that our algorithms are able to tag at least 70 percent of tweets correctly. We believe we can improve these accuracy rates, but there’s a catch. Each classifier is optimized for the type of disaster it was “trained” on. In other words, an automatic classifier for infrastructure damage developed using historical Twitter data from the 2011 New Zealand earthquake will not work very well for Twitter data from Hurricane Sandy.
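The basic idea behind such classifiers can be pictured with a toy supervised model. Below is a minimal from-scratch sketch of multinomial Naive Bayes over bag-of-words features; the category labels and tweet texts are invented for illustration, and a production system would of course use far richer features and models than this.

```python
# Toy tweet classifier: multinomial Naive Bayes with add-one smoothing.
# Labels and example tweets are illustrative, not real crisis data.
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


class NaiveBayesClassifier:
    def fit(self, texts, labels):
        self.label_counts = Counter(labels)          # how often each label appears
        self.word_counts = defaultdict(Counter)      # per-label word frequencies
        self.vocab = set()
        for text, label in zip(texts, labels):
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        words = tokenize(text)
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.label_counts.items():
            # log prior + summed log likelihoods (add-one smoothing)
            score = math.log(count / total)
            label_total = sum(self.word_counts[label].values())
            for word in words:
                score += math.log(
                    (self.word_counts[label][word] + 1)
                    / (label_total + len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


tweets = [
    "bridge collapsed road impassable near the port",
    "walls cracked building collapsed after the quake",
    "we need drinking water and food in the camp",
    "medical supplies urgently needed at the clinic",
]
labels = [
    "infrastructure_damage", "infrastructure_damage",
    "humanitarian_needs", "humanitarian_needs",
]

clf = NaiveBayesClassifier().fit(tweets, labels)
print(clf.predict("hospital building collapsed downtown"))  # → infrastructure_damage
```

Because the model only counts word co-occurrences from its training set, its vocabulary and word statistics are tied to the disaster it learned from, which is exactly why a classifier trained on one event transfers poorly to another.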
My colleagues and I have therefore been collecting multiple Twitter data sets from different disasters. This has been challenging, because Twitter’s current terms of service, like those of many social-media firms, are naturally written for commercial uses and prohibit direct sharing of data sets. Of course, we can’t collect every single tweet for every disaster from the past five years or we’ll never get to actually developing the dashboard. Besides, some of the most interesting Twitter data sets have emerged from recent disasters. Before 2010, for example, US users dominated the Twitter platform. Twitter’s international coverage has since increased, along with the number of new Twitter users, which almost doubled in 2012 alone. As Twitter becomes a larger and more global platform, its value as a data source for crisis mapping will increase.
Our dashboard will include a number of predeveloped classifiers based on as many data sets as we can get our hands on. The dashboard will also allow users to create their own classifiers on the fly by leveraging real-time machine learning. Assume, for example, that an earthquake strikes Indonesia and that no classifiers exist for a disaster of this kind in that country. Using our dashboard, users can train the algorithm to recognize tweets about, say, infrastructure damage.
This simply entails the manual tagging of 50-plus tweets about infrastructure damage to teach the algorithm what to look for. The new classifier will then automatically tag new tweets accordingly.
The classifier will not identify every tweet correctly. But the beauty of this technology is that it continues to learn and improve over time, as users “teach” the classifier not to make the same mistakes. And once the disaster-response efforts in Indonesia are over, this new classifier joins the library of existing ones for use by humanitarian organizations in similar future crises.
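One way to picture this learn-from-corrections loop is a simple online perceptron: each time a human tags a tweet and the model’s guess was wrong, the weights shift toward the correct answer. The sketch below is self-contained and purely illustrative; the `OnlinePerceptron` class, labels, and tweets are assumptions for the example, not the dashboard’s actual implementation.

```python
# Toy online learner: a multiclass perceptron over bag-of-words features.
# Every human correction nudges the weights; nothing here is real crisis data.
from collections import defaultdict


class OnlinePerceptron:
    def __init__(self, labels):
        self.labels = labels
        # One weight vector (word -> weight) per label.
        self.weights = {label: defaultdict(float) for label in labels}

    def score(self, label, words):
        return sum(self.weights[label][w] for w in words)

    def predict(self, text):
        words = text.lower().split()
        return max(self.labels, key=lambda lbl: self.score(lbl, words))

    def learn(self, text, true_label):
        """Predict, then adjust the weights only if the guess was wrong."""
        words = text.lower().split()
        guess = self.predict(text)
        if guess != true_label:
            for w in words:
                self.weights[true_label][w] += 1.0
                self.weights[guess][w] -= 1.0
        return guess


clf = OnlinePerceptron(["infrastructure_damage", "other"])

# A volunteer tags a handful of tweets; each correction updates the model.
labeled = [
    ("bridge collapsed on the main road", "infrastructure_damage"),
    ("power lines down across the city", "infrastructure_damage"),
    ("sending thoughts to everyone affected", "other"),
    ("watching the news coverage tonight", "other"),
]
# A couple of quick passes over the small labeled set lets the weights settle.
for _ in range(2):
    for text, label in labeled:
        clf.learn(text, label)

print(clf.predict("another bridge collapsed near the port"))  # → infrastructure_damage
```

The key property mirrored here is that the model improves with each correction rather than requiring a full retraining run, which is what makes training a new classifier mid-disaster practical.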
Ultimately we envision these classifiers as individual apps that can be created, dragged, and dropped on an intuitive, widget-like dashboard with multiple data-visualization options. The dashboard will be freely accessible and open source. We also plan to develop classifiers for other languages besides English, including Arabic, French, and Spanish. Although we hope to have a working prototype soon, for now the entire project is experimental. That’s one of the biggest advantages of working at a well-funded advanced research institute such as QCRI.
We have the luxury of leveraging world-class expertise to carry out basic research in the hope of solving major humanitarian challenges. Onward!