data protection

For data sharing between organisations to be straight forward, there needs to a common understanding of basic policy and practice.

Many organisations are coming up with their own internal policy and guidelines for data sharing. However, for data sharing between organisations to be straight forward, there needs to a common understanding of basic policy and practice. During her time as an OII Visiting Associate, Alison Holt developed a pragmatic solution in the form of a Voluntary Code, anchored in the developing ISO standards for the Governance of Data. She discusses the voluntary code, and the need to provide urgent advice to organisations struggling with policy for sharing data. Collecting, storing and distributing digital data is significantly easier and cheaper now than ever before, in line with predictions from Moore, Kryder and Gilder. Organisations are incentivised to collect large volumes of data with the hope of unleashing new business opportunities or maybe even new businesses. Consider the likes of Uber, Netflix, and Airbnb and the other data mongers who have built services based solely on digital assets. The use of this new abundant data will continue to disrupt traditional business models for years to come, and there is no doubt that these large data volumes can provide value. However, they also bring associated risks (such as unplanned disclosure and hacks) and they come with constraints (for example in the form of privacy or data protection legislation). Hardly a week goes by without a data breach hitting the headlines. Even if your telecommunications provider didn’t inadvertently share your bank account and sort code with hackers, and your child wasn’t one of the hundreds of thousands of children whose birthdays, names, and photos were exposed by a smart toy company, you might still be wondering exactly how your data is being looked after by the banks, schools, clinics, utility companies, local authorities and government departments that are so quick to collect your digital details. Then there are the companies who have invited you to sign away the rights to your data and possibly your…

As dementia is believed to be influenced by a wide range of social, environmental and lifestyle-related factors, this behavioural data has the potential to improve early diagnosis

Image by K. Kendall of "Sights and Scents at the Cloisters: for people with dementia and their care partners"; a program developed in consultation with the Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Alzheimer's Disease Research Center at Columbia University, and the Alzheimer's Association.

Dementia affects about 44 million individuals, a number that is expected to nearly double by 2030 and triple by 2050. With an estimated annual cost of USD 604 billion, dementia represents a major economic burden for both industrial and developing countries, as well as a significant physical and emotional burden on individuals, family members and caregivers. There is currently no cure for dementia or a reliable way to slow its progress, and the G8 health ministers have set the goal of finding a cure or disease-modifying therapy by 2025. However, the underlying mechanisms are complex, and influenced by a range of genetic and environmental influences that may have no immediately apparent connection to brain health. Of course medical research relies on access to large amounts of data, including clinical, genetic and imaging datasets. Making these widely available across research groups helps reduce data collection efforts, increases the statistical power of studies and makes data accessible to more researchers. This is particularly important from a global perspective: Swedish researchers say, for example, that they are sitting on a goldmine of excellent longitudinal and linked data on a variety of medical conditions including dementia, but that they have too few researchers to exploit its potential. Other countries will have many researchers, and less data. ‘Big data’ adds new sources of data and ways of analysing them to the repertoire of traditional medical research data. This can include (non-medical) data from online patient platforms, shop loyalty cards, and mobile phones — made available, for example, through Apple’s ResearchKit, just announced last week. As dementia is believed to be influenced by a wide range of social, environmental and lifestyle-related factors (such as diet, smoking, fitness training, and people’s social networks), and this behavioural data has the potential to improve early diagnosis, as well as allow retrospective insights into events in the years leading up to a diagnosis. For example, data on changes in shopping…

Examining the voluntary provision by commercial sites of information privacy protection and control under the self-regulatory policy of the U.S. Federal Trade Commission (FTC).

Ed: You examined the voluntary provision by commercial sites of information privacy protection and control under the self-regulatory policy of the U.S. Federal Trade Commission (FTC). In brief, what did you find? Yong Jin: First, because we rely on the Internet to perform almost all types of transactions, how personal privacy is protected is perhaps one of the important issues we face in this digital age. There are many important findings: the most significant one is that the more popular sites did not necessarily provide better privacy control features for users than sites that were randomly selected. This is surprising because one might expect “the more popular, the better privacy protection”—a sort of marketplace magic that automatically solves the issue of personal privacy online. This was not the case at all, because the popular sites with more resources did not provide better privacy protection. Of course, the Internet in general is a malleable medium. This means that commercial sites can design, modify, or easily manipulate user interfaces to maximise the ease with which users can protect their personal privacy. The fact that this is not really happening for commercial websites in the U.S. is not only alarming, but also suggests that commercial forces may not have a strong incentive to provide privacy protection. Ed: Your sample included websites oriented toward young users and sensitive data relating to health and finance: what did you find for them? Yong Jin: Because the sample size for these websites was limited, caution is needed in interpreting the results. But what is clear is that just because the websites deal with health or financial data, they did not seem to be better at providing more privacy protection. To me, this should raise enormous concerns from those who use the Internet for health information seeking or financial data. The finding should also inform and urge policymakers to ask whether the current non-intervention policy (regarding commercial websites…

How will we keep healthy? How will we live, learn, work and interact in the future? How will we produce and consume and how will we manage resources?

On October 6 and 7, the European Commission, with the participation of Portuguese authorities and the support of the Champalimaud Foundation, organised in Lisbon a high-level conference on “The Future of Europe is Science”. Mr. Barroso, President of the European Commission, opened the meeting. I had the honour of giving one of the keynote addresses. The explicit goal of the conference was twofold. On the one hand, we tried to take stock of European achievements in science, engineering, technology and innovation (SETI) during the last 10 years. On the other hand, we looked into potential future opportunities that SETI may bring to Europe, both in economic terms (growth, jobs, new business opportunities) and in terms of wellbeing (individual welfare and higher social standards). One of the most interesting aspects of the meeting was the presentation of the latest report on “The Future of Europe is Science” by the President’s Science and Technology Advisory Council (STAC). The report addresses some very big questions: How will we keep healthy? How will we live, learn, work and interact in the future? How will we produce and consume and how will we manage resources? It also seeks to outline some key challenges that will be faced by Europe over the next 15 years. It is well written, clear, evidence-based and convincing. I recommend reading it. In what follows, I wish to highlight three of its features that I find particularly significant. First, it is enormously refreshing and reassuring to see that the report treats science and technology as equally important and intertwined. The report takes this for granted, but anyone stuck in some Greek dichotomy between knowledge (episteme, science) and mere technique (techne, technology) will be astonished. While this divorcing of the two has always been a bad idea, it is still popular in contexts where applied science, e.g. applied physics or engineering, is considered a Cinderella. During my talk, I referred to Galileo as a paradigmatic scientist who had to be innovative in terms of…

People are very often unaware of how much data is gathered about them—let alone the purposes for which it can be used.

MEPs failed to support a Green call to protect Edward Snowden as a whistleblower, in order to allow him to give his testimony to the European Parliament in March. Image by greensefa.

Computers have developed enormously since the Second World War: alongside a rough doubling of computer power every two years, communications bandwidth and storage capacity have grown just as quickly. Computers can now store much more personal data, process it much faster, and rapidly share it across networks. Data is collected about us as we interact with digital technology, directly and via organisations. Many people volunteer data to social networking sites, and sensors—in smartphones, CCTV cameras, and “Internet of Things” objects—are making the physical world as trackable as the virtual. People are very often unaware of how much data is gathered about them—let alone the purposes for which it can be used. Also, most privacy risks are highly probabilistic, cumulative, and difficult to calculate. A student sharing a photo today might not be thinking about a future interview panel; or that the heart rate data shared from a fitness gadget might affect future decisions by insurance and financial services (Brown 2014). Rather than organisations waiting for something to go wrong, then spending large amounts of time and money trying (and often failing) to fix privacy problems, computer scientists have been developing methods for designing privacy directly into new technologies and systems (Spiekermann and Cranor 2009). One of the most important principles is data minimisation; that is, limiting the collection of personal data to that needed to provide a service—rather than storing everything that can be conveniently retrieved. This limits the impact of data losses and breaches, for example by corrupt staff with authorised access to data—a practice that the UK Information Commissioner’s Office (2006) has shown to be widespread. Privacy by design also protects against function creep (Gürses et al. 2011). When an organisation invests significant resources to collect personal data for one reason, it can be very tempting to use it for other purposes. While this is limited in the EU by data protection law, government agencies are in a…

It is simply not possible to consider public policy today without some regard for the intertwining of information technologies with everyday life and society.

We can't understand, analyse or make public policy without understanding the technological, social and economic shifts associated with the Internet. Image from the (post-PRISM) "Stop Watching Us" Berlin Demonstration (2013) by mw238.

In the journal’s inaugural issue, founding Editor-in-Chief Helen Margetts outlined what are essentially two central premises behind Policy & Internet’s launch. The first is that “we cannot understand, analyse or make public policy without understanding the technological, social and economic shifts associated with the Internet” (Margetts 2009, 1). It is simply not possible to consider public policy today without some regard for the intertwining of information technologies with everyday life and society. The second premise is that the rise of the Internet is associated with shifts in how policy itself is made. In particular, she proposed that impacts of Internet adoption would be felt in the tools through which policies are effected, and the values that policy processes embody. The purpose of the Policy and Internet journal was to take up these two challenges: the public policy implications of Internet-related social change, and Internet-related changes in policy processes themselves. In recognition of the inherently multi-disciplinary nature of policy research, the journal is designed to act as a meeting place for all kinds of disciplinary and methodological approaches. Helen predicted that methodological approaches based on large-scale transactional data, network analysis, and experimentation would turn out to be particularly important for policy and Internet studies. Driving the advancement of these methods was therefore the journal’s third purpose. Today, the journal has reached a significant milestone: over one hundred high-quality peer-reviewed articles published. This seems an opportune moment to take stock of what kind of research we have published in practice, and see how it stacks up against the original vision. At the most general level, the journal’s articles fall into three broad categories: the Internet and public policy (48 articles), the Internet and policy processes (51 articles), and discussion of novel methodologies (10 articles). The first of these categories, “the Internet and public policy,” can be further broken down into a number of subcategories. One of the most prominent of these streams…

Key to successful adoption of Internet-based health records is how much a patient places trust that data will be properly secured from inadvertent leakage.

In an attempt to reduce costs and improve quality, digital health records are permeating health systems all over the world. Internet-based access to them creates new opportunities for access and sharing—while at the same time causing nightmares to many patients: medical data floating around freely within the clouds, unprotected from strangers, being abused to target and discriminate people without their knowledge? Individuals often have little knowledge about the actual risks, and single instances of breaches are exaggerated in the media. Key to successful adoption of Internet-based health records is, however, how much a patient places trust in the technology: trust that data will be properly secured from inadvertent leakage, and trust that it will not be accessed by unauthorised strangers. Situated in this context, my own research has taken a closer look at the structural and institutional factors influencing patient trust in Internet-based health records. Utilising a survey and interviews, the research has looked specifically at Germany—a very suitable environment for this question given its wide range of actors in the health system, and often being referred to as a “hard-line privacy country”. Germany has struggled for years with the introduction of smart cards linked to centralised Electronic Health Records, not only changing its design features over several iterations, but also battling negative press coverage about data security. The first element to this question of patient trust is the “who”: that is, does it make a difference whether the health record is maintained by either a medical or a non-medical entity, and whether the entity is public or private? I found that patients clearly expressed a higher trust in medical operators, evidence of a certain “halo effect” surrounding medical professionals and organisations driven by patient faith in their good intentions. This overrode the concern that medical operators might be less adept at securing the data than (for example) most non-medical IT firms. The distinction between public and private operators is…

Measuring the mobile Internet can expose information about an individual’s location, contact details, and communications metadata.

Four of the 6.8 billion mobile phones worldwide. Measuring the mobile Internet can expose information about an individual's location, contact details, and communications metadata. Image by Cocoarmani.

Ed: GCHQ / the NSA aside, who collects mobile data and for what purpose? How can you tell if your data are being collected and passed on? Ben: Data collected from mobile phones is used for a wide range of (divergent) purposes. First and foremost, mobile operators need information about mobile phones in real-time to be able to communicate with individual mobile handsets. Apps can also collect all sorts of information, which may be necessary to provide entertainment, location specific services, to conduct network research and many other reasons. Mobile phone users usually consent to the collection of their data by clicking “I agree” or other legally relevant buttons, but this is not always the case. Sometimes data is collected lawfully without consent, for example for the provision of a mobile connectivity service. Other times it is harder to substantiate a relevant legal basis. Many applications keep track of the information that is generated by a mobile phone and it is often not possible to find out how the receiver processes this data. Ed: How are data subjects typically recruited for a mobile research project? And how many subjects might a typical research data set contain? Ben: This depends on the research design; some research projects provide data subjects with a specific app, which they can use to conduct measurements (so called ‘active measurements’). Other apps collect data in the background and, in effect, conduct local surveillance of the mobile phone use (so called passive measurements). Other research uses existing datasets, for example provided by telecom operators, which will generally be de-identified in some way. We purposely do not use the term anonymisation in the report, because much research and several case studies have shown that real anonymisation is very difficult to achieve if the original raw data is collected about individuals. Datasets can be re-identified by techniques such as fingerprinting or by linking them with existing, auxiliary datasets. The size…

As Africa goes digital, the challenge for policymakers becomes moving from digitisation to managing and curating digital data in ways that keep people’s identities and activities secure.

Africa is in the midst of a technological revolution, and the current wave of digitisation has the potential to make the continent’s citizens a rich mine of data. Intersection in Zomba, Malawi. Image by john.duffell.

After the last decade’s exponential rise in ICT use, Africa is fast becoming a source of big data. Africans are increasingly emitting digital information with their mobile phone calls, internet use and various forms of digitised transactions, while on a state level e-government starts to become a reality. As Africa goes digital, the challenge for policymakers becomes what the WRR, a Dutch policy organisation, has identified as ‘i-government’: moving from digitisation to managing and curating digital data in ways that keep people’s identities and activities secure. On one level, this is an important development for African policymakers, given that accurate information on their populations has been notoriously hard to come by and, where it exists, has not been shared. On another, however, it represents a tremendous challenge. The WRR has pointed out the unpreparedness of European governments, who have been digitising for decades, for the age of i-government. How are African policymakers, as relative newcomers to digital data, supposed to respond? There are two possible scenarios. One is that systems will develop for the release and curation of Africans’ data by corporations and governments, and that it will become possible, in the words of the UN’s Global Pulse initiative, to use it as a ‘public good’—an invaluable tool for development policies and crisis response. The other is that there will be a new scramble for Africa: a digital resource grab that may have implications as great as the original scramble amongst the colonial powers in the late 19th century. We know that African data is not only valuable to Africans. The current wave of digitisation has the potential to make the continent’s citizens a rich mine of data about health interventions, human mobility, conflict and violence, technology adoption, communication dynamics and financial behaviour, with the default mode being for this to happen without their consent or involvement, and without ethical and normative frameworks to ensure data protection or to weigh…

As the cost and size of devices falls and network access becomes ubiquitous, it is evident that not only major industries but whole areas of consumption, public service and domestic life will be capable of being transformed.

The 2nd Annual Internet of Things Europe 2010: A Roadmap for Europe, 2010. Image by Pierre Metivier.

On 17 April 2013, the US Federal Trade Commission published a call for inputs on the ‘consumer privacy and security issues posed by the growing connectivity of consumer devices, such as cars, appliances, and medical devices’, in other words, about the impact of the Internet of Things (IoT) on the everyday lives of citizens. The call is in large part one for information to establish what the current state of technology development is and how it will develop, but it also looks for views on how privacy risks should be weighed against potential societal benefits. There’s a lot that’s not very new about the IoT. Embedded computing, sensor networks and machine to machine communications have been around a long time. Mark Weiser was developing the concept of ubiquitous computing (and prototyping it) at Xerox PARC in 1990.  Many of the big ideas in the IoT—smart cars, smart homes, wearable computing—are already envisaged in works such as Nicholas Negroponte’s Being Digital, which was published in 1995 before the mass popularisation of the internet itself. The term ‘Internet of Things’ has been around since at least 1999. What is new is the speed with which technological change has made these ideas implementable on a societal scale. The FTC’s interest reflects a growing awareness of the potential significance of the IoT, and the need for public debate about its adoption. As the cost and size of devices falls and network access becomes ubiquitous, it is evident that not only major industries but whole areas of consumption, public service and domestic life will be capable of being transformed. The number of connected devices is likely to grow fast in the next few years. The Organisation for Economic Co-operation and Development (OECD) estimates that while a family with two teenagers may have 10 devices connected to the internet, in 2022 this may well grow to 50 or more. Across the OECD area the number of…