Second bytes: How to re-use data for the common good

Media Fellowship

Zettabytes of data are being accumulated worldwide every year and could be applied towards improving public health and the environment. But instead the majority goes unused, and there are still numerous hurdles to sharing and donating it.

Remnant from a low-tech era: air quality measuring station in Hamburg, Germany

New Year’s Day is a bad day for air quality in German cities. On January 1, 2020, the measuring station on Hamburg’s Stresemannstrasse registered 112 micrograms of fine particulate matter per cubic meter, almost five times as much as on a normal day, because of the fireworks the night before. In contrast, nitrogen oxides, the most common by-product of car exhaust fumes, reached their record levels in autumn: 69 micrograms of NO2 in mid-September and 134 micrograms of NO at the end of November.

If you want, you can find even more data on the air quality in Hamburg's Stresemannstrasse. The measuring station with the internal abbreviation 17SM digitally records the pollution on the busy, four-lane arterial road and continuously updates it for local residents, who can check on the measurements online before they leave the house. Is this what a progressive citizen service looks like?

Not at all, says Robert Heinecke, founder of Breeze Technologies. His start-up uses sensors and artificial intelligence to analyze the pollution of indoor and public places – and he would like to see more data being made available. Often the public stations only measure one of numerous substances in the air: ozone, ammonia, volatile organic compounds, sulfur dioxide. With fine particulate matter, also called PM2.5, it is common to publish the mean value for the last 24 hours. But this is of little significance because it does not reveal what time of day has the highest concentration, says Heinecke. Take New Year's Eve. Once the fireworks are used up after midnight, the curve drops rapidly.
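Heinecke's point about the 24-hour mean can be illustrated with a minimal sketch. The hourly readings below are invented for illustration and are not measurements from the Hamburg station; the function name `summarize_day` is likewise hypothetical. The sketch simply compares a daily mean with the peak hour it conceals.

```python
from statistics import mean

def summarize_day(hourly_readings):
    """Return the daily mean, the peak value and the hour of the peak."""
    peak = max(hourly_readings)
    return {
        "mean": mean(hourly_readings),
        "peak": peak,
        "peak_hour": hourly_readings.index(peak),  # 0 = midnight to 1 a.m.
    }

# Invented PM2.5 values in micrograms per cubic meter (assumption, not real data):
# a sharp spike right after midnight, then a quiet rest of the day.
hourly_pm25 = [400, 250, 120, 60] + [20] * 20

summary = summarize_day(hourly_pm25)
print(summary)
```

In this made-up series the published daily figure would be far lower than the concentration in the first hour after midnight, which is exactly the information a resident planning a walk would want.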

It would make more sense to evaluate the data the measuring stations collect every minute. However, the city deletes this data without anyone ever looking at it. “This data would be a treasure for us. With its help we could build new business models and improve public health,” says Heinecke. The willingness of authorities to share this data has so far been rather limited. When he asked the responsible officials to release it, they told him what they would charge for their work: 600 euros.

The EU wants to promote innovation and advance the common good

Data accumulates constantly and everywhere, in traffic, at home or at work, through every Instagram post and every streamed video. According to forecasts, the global volume of data will increase dramatically, from 33 zettabytes (2018) to 175 zettabytes (2025). One zettabyte is a 1 followed by 21 zeros, counted in bytes. So far, only a fraction of this has been evaluated. According to a study in 2012, it was less than one percent at the time. Even if that share has multiplied since then, the vast majority remains unused on computers and servers, especially in private companies.

European policymakers want to change that. The European Commission is working on a strategy to reorganize access to data and to develop a standard for easier sharing between corporations, founders, research institutions, and citizens. The hope is that amassing large amounts of data will spur innovation similar to the model of dominant digital companies in the United States and China. But the Commission also wants to advance the common good, sustainability goals and climate protection. The debate revolves around how data can be shared and used responsibly, via data donations by altruistic citizens, via data trustees who act as neutral intermediaries, or via data cooperatives whose members exchange data with one another. In addition, policymakers are discussing introducing obligations to share data in the private sector, including with competitors.

For the time being, Robert Heinecke from Breeze would be content with a “regulatory sandbox,” to be able to experiment in a temporary real-world laboratory. “Then we could develop a tool that helps asthmatics, retirees or parents with small children map out the safest route for a walk,” he says. Yet for now he is not allowed to access the real-time data he would need for this. And even if he could access it, the existing measuring infrastructure is insufficient. In his city of Hamburg, where 1.8 million people live, there are just 15 stations in operation.

Can citizens gain back control of “their” data?

Air is a hyperlocal issue. In a city, pollution levels can differ from one street to the next. Breeze is therefore building its own network using white, cylindrical sensors, twenty centimeters high and fourteen centimeters in diameter, which are “up to 1,000 times cheaper and 50,000 times smaller” than stationary units, according to Heinecke. Breeze calibrates the built-in sensors to meet the World Health Organization’s parameters for urban air quality. The company asks citizens for permission to install sensors in homes, on balconies, or on terraces. Due to the rising public interest in the state of local air quality, it was easy to find citizens who were willing to make their apartment or house available.

Placing citizens at the center of decisions over how and for what purpose data should be collected and used is not a new idea. The concept is at the core of the right to informational self-determination, which the EU has enshrined in its General Data Protection Regulation (GDPR). Whereas this principle has taken a back seat in recent years as tech companies, secret services and governments have collected more and more data, sometimes without clear purpose or users’ consent, the number of initiatives that aim to put people in charge of their data and encourage voluntary data sharing is now increasing.

Members of the Swiss cooperative Midata provide their personal data for medical research and clinical studies. In Germany, the NGO AlgorithmWatch is looking for data donors for its “DataSkop” project to shed light on algorithmic decision-making systems used by social media, sales platforms, and financial information agencies. In Barcelona and Amsterdam, the EU-supported “Decode” project set up secure neighborhood portals and organized democratic participation with digital tools.

In New York, the GovLab, which is part of New York University, started organizing “data assemblies.” In three separate meetings, so-called “mini-publics” consisting of data holders, politicians, civil society representatives and citizens of all five boroughs came together to express their needs, wishes, expectations, concerns and criticism. Last summer, the first assembly focused on how existing data could be used to get the Covid-19 crisis under control. Future assemblies could deal, for example, with financial challenges. When a bank realizes that a customer is left with only $400 in the account, is there a secure way of offering financial advice so that he or she is still able to put something aside for retirement?

“One of the biggest tragedies of the data space is that we don't use all of our assets in order to improve people’s lives,” says GovLab’s co-founder and chief research and development officer Stefaan Verhulst. Verhulst has promoted data collaborations and the “re-use” of data for years. “So far, people have either been for or against sharing data – but there are hardly any public debates in which the pros and cons are weighed up and the gray areas in between are explored.”

“Data philanthropy will be an important part of corporate social responsibility”

There are many problems that need to be resolved. One of them is the legal framework. The donation of data is not even regulated in the EU’s GDPR, which came into force in 2018. On the contrary: some provisions in the law make the sharing of data more difficult. A second hurdle: Who is responsible for the implementation of data-sharing mechanisms in corporations, NGOs or other organizations? Even large companies often lack clear protocols for that, Verhulst says. He calls for creating positions for “data stewards,” data administrators who can serve as contact persons if, for example, a humanitarian organization asks for digital support. Verhulst thinks it is only a matter of time before such collaborations become a reality. “In the digital age, data philanthropy will be an important part of corporate social responsibility.”

Robert Heinecke uses his business model to test what something like this might look like. Breeze Technologies operates at the interface between for-profit and non-profit. On the one hand, it rents its sensors to companies that want to monitor the air quality in factories and around their industrial plants. On the other hand, Breeze does not charge residents for this service. The solution: “Customers pay a lower subscription price if they make the data collected from them available for the common good and have it published via a freely accessible citizen portal,” says Heinecke.

So transparency is rewarded. And even if the commercial customers decide not to share their data, Breeze uses it in the background to train its artificial intelligence and to better calibrate the publicly available sensors.

Using this model, Robert Heinecke hopes to be able to equip entire cities with sensors free of charge and finance his business on the basis of the data obtained. He has already won several German communities, such as Neckarsulm, Moers and Hennef, as partners. In his home city of Hamburg, though, he still has some persuading to do. Why is the city hesitating? “Because we uncover problems that should actually be dealt with by politicians.”

For the time being, Hamburg continues to rely on its stationary measuring stations, weathered boxes the size of bus shelters, some of which were set up in the 1980s. But with increasing public demand for better and more accessible data, it is difficult to imagine that they will hold out as standard air surveillance instruments for another 40 years.