At the Digital Development: From Principles to Practice Forum, ICT4D practitioners came together to discuss the inherent tension between making data open and interoperable for transparency and performance improvement and protecting vulnerable populations’ privacy while respecting community concerns around data security.
The session started with a high-level overview of the pros and cons of open data (as seen through a privacy/security lens). Then breakout groups dove deeper into three subcategories – transparency, data for decision-making, and interoperability. Each group looked at definitions and misunderstandings, pros/cons, and possible solutions, and presented definitions and key takeaways for further discussion.
Major opportunities for open, interoperable and shared data in international development included:
However, the room quickly found that most of the pros were also cons, and vice versa. For example, open data can improve accountability but it can also increase liability. Tracking personally identifiable information can mean improved transparency but also greater vulnerability.
Many additional concerns were raised, such as the fact that a lot of data is already being collected but much of it is never used, and yet the international development community keeps asking for more and more data. Participants highlighted that much of the available data is hard to use because it is of poor quality, lacks information about methodology and approach, or is not interoperable.
In addition, key questions arose about who owns the data (especially when it is not owned by the people whose information is contained in it) and how they use it. There are considerable concerns that data without context or nuance can be misleading or biased. There are also issues with the lack of a “sunset” policy for much collected data – future anonymity is not assured, especially as technology keeps changing and the ability to access, analyze and combine data sets keeps growing.
Finally, there was acknowledgement that data privacy and security is a new problem that is now part of our lives. No one has “figured it out” yet, anywhere, and it is only going to get more complex, especially as individuals globally rely more and more on their digital identities for daily living.
3 Key takeaways
The first group looked at one of the explicit goals of open data – improved transparency. They asked the clarifying question of “transparency by whom and for whom?” There are different forms of transparency (and different responses required) based on whether we are talking about the economic markets, government services, or scientific research.
Major questions that need to be asked around data for transparency include: who owns the data, for whom is the data transparency intended, and for what use?
Part of the definition of transparency included access to information and accountability for the information results, and ways to assess the risks/benefits. There is a lack of standardized policies and approaches across the international development community to address many of these questions.
Positives around open data for transparency focused on increasing access to information, which can lead to accountability, increased engagement by citizens and other stakeholders, and innovative ideas through analysis. Challenges revolved around privacy concerns (especially for already vulnerable populations), data errors and quality, the possibility of manipulation and misuse (such as of market pricing), and unintended consequences.
A deeper discussion around informed consent (especially related to government services or health information) and public information (especially for scientific research) is required.
The group on data for decision-making started by asking the clarifying question of who is making the decision and for whom. Data can be used by a wide variety of groups, including NGOs, businesses, governments, militaries, citizens, financial institutions, small businesses, and criminals. They also specified that there is a continuum of data – data may be in different states, such as raw, aggregated, filtered, or analyzed.
One key concern is how to use data without losing the context – i.e. critical information on what the data means. Decisions without this context would be highly misguided or incorrect. The example given was that of Ebola data and Liberia. If you just looked at economic statistics during the Ebola crisis without realizing the crisis was occurring in the background, the numbers would be misleading as to the issues in the country.
Another concern is having poor-quality data and not realizing it. Poor data can lead to poor decisions. Also, raw data is not the same as analyzed data – the analysis involves processing the data and, ideally, taking into account quality, context, etc. Getting access to the analyzed data used in decision-making can be as useful as access to the raw data.
There is a need for clear policy and standards (from donors, organizations, etc.), preferably saying ‘open it’ or ‘share’ by default, with incentives for sharing (especially data used to perform an analysis).
Interoperability of data means that a set of facts and figures can be interchanged (aggregated, cross-referenced, and layered). This requires a standardized format/structure and definitions (including the methodology of how the data is collected, the time period covered, etc.). It is also important to note that “open data” is not the same as “big data.” A lot of the data being discussed is small to sizable data.
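To make the definition above concrete, here is a minimal Python sketch of what "aggregated, cross-referenced, and layered" might look like once two organizations share a structure. Everything in it is an illustrative assumption rather than something presented at the session: the field names (`district`, `period`, `clinic_visits`, `school_enrollment`), the district codes, and the figures are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from two organizations that have agreed on the same
# key fields ("district", "period") and the same code/period formats.
health_records = [
    {"district": "D01", "period": "2015-Q1", "clinic_visits": 1200},
    {"district": "D02", "period": "2015-Q1", "clinic_visits": 800},
]
education_records = [
    {"district": "D01", "period": "2015-Q1", "school_enrollment": 5400},
    {"district": "D02", "period": "2015-Q1", "school_enrollment": 3100},
]

KEY_FIELDS = ("district", "period")


def layer(*datasets):
    """Cross-reference datasets on the shared (district, period) key."""
    merged = defaultdict(dict)
    for dataset in datasets:
        for record in dataset:
            key = tuple(record[field] for field in KEY_FIELDS)
            for field, value in record.items():
                if field not in KEY_FIELDS:
                    merged[key][field] = value
    return dict(merged)


def aggregate(dataset, field):
    """Sum one indicator across all records in a dataset."""
    return sum(record[field] for record in dataset)


layered = layer(health_records, education_records)
total_visits = aggregate(health_records, "clinic_visits")
```

The layering only works because both datasets use identical keys. If one organization reported by village and the other by district, or one by month and the other by quarter, the join would need an agreed mapping first, which is the kind of standardized definition the group was describing.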
The pros were too many to mention and have been outlined by most in the open data community. Some of the challenges outlined included the lack of incentives for many organizations to make their data interoperable by default, especially when interoperability can be seen as undermining their competitiveness or as being too costly.
Many organizations are concerned about a lack of flexibility if they are required to follow a standard – they feel that the standard will constrain their ability to collect data for their specific needs. There is also the concern that increased interoperability of data leads naturally to increased vulnerability, as it becomes easier to “reassemble” stripped data, which creates privacy concerns.
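The "reassembly" concern can be illustrated with a small Python sketch. It assumes a hypothetical survey with names stripped out and a hypothetical second dataset that still carries names, both sharing the fields `village`, `birth_year`, and `sex`. The datasets, field names, and values are invented for illustration and do not come from the session.

```python
# Hypothetical "stripped" survey: names removed, but other attributes kept.
stripped_survey = [
    {"village": "A", "birth_year": 1980, "sex": "F", "health_condition": "X"},
    {"village": "A", "birth_year": 1975, "sex": "M", "health_condition": "Y"},
    {"village": "B", "birth_year": 1980, "sex": "F", "health_condition": "Z"},
]

# Hypothetical second dataset that includes names and the same attributes.
beneficiary_list = [
    {"name": "Person 1", "village": "A", "birth_year": 1980, "sex": "F"},
    {"name": "Person 2", "village": "A", "birth_year": 1975, "sex": "M"},
    {"name": "Person 3", "village": "B", "birth_year": 1980, "sex": "F"},
    {"name": "Person 4", "village": "B", "birth_year": 1980, "sex": "F"},
]

SHARED_FIELDS = ("village", "birth_year", "sex")


def reidentify(stripped, named, keys=SHARED_FIELDS):
    """Return (name, record) pairs where the shared fields match exactly one person."""
    index = {}
    for row in named:
        index.setdefault(tuple(row[k] for k in keys), []).append(row["name"])

    matches = []
    for row in stripped:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-attaches a name
            matches.append((candidates[0], row))
    return matches


reidentified = reidentify(stripped_survey, beneficiary_list)
reidentified_names = [name for name, _ in reidentified]
```

In this toy data, two of the three stripped records match exactly one named person and so can be re-attached to a name, while the third is shielded only because two people share the same village, birth year, and sex. The more fields two datasets have in common, the fewer people share each combination, which is why greater interoperability can raise this risk.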
In conclusion, the following is a brief list of recommended topics for the foreign assistance community to address in regard to the tension between open data and privacy and security.