New Voluntary Code: Guidance for Sharing Data Between Organisations

Many organisations are coming up with their own internal policy and guidelines for data sharing. However, for data sharing between organisations to be straightforward, there needs to be a common understanding of basic policy and practice. During her time as an OII Visiting Associate, Alison Holt developed a pragmatic solution in the form of a Voluntary Code, anchored in the developing ISO standards for the Governance of Data. She discusses the voluntary code, and the need to provide urgent advice to organisations struggling with policy for sharing data.

Collecting, storing and distributing digital data is significantly easier and cheaper now than ever before, in line with predictions from Moore, Kryder and Gilder. Organisations are incentivised to collect large volumes of data with the hope of unleashing new business opportunities or maybe even new businesses. Consider the likes of Uber, Netflix, and Airbnb and the other data mongers who have built services based solely on digital assets.

The use of this new abundant data will continue to disrupt traditional business models for years to come, and there is no doubt that these large data volumes can provide value. However, they also bring associated risks (such as unplanned disclosure and hacks) and they come with constraints (for example in the form of privacy or data protection legislation). Hardly a week goes by without a data breach hitting the headlines. Even if your telecommunications provider didn’t inadvertently share your bank account and sort code with hackers, and your child wasn’t one of the hundreds of thousands of children whose birthdays, names, and photos were exposed by a smart toy company, you might still be wondering exactly how your data is being looked after by the banks, schools, clinics, utility companies, local authorities and government departments that are so quick to collect your digital details.

Then there are the companies who have invited you to sign away the rights to your data and possibly your privacy too—the ones that ask you to sign the Terms and Conditions for access to a particular service (such as a music or online shopping service) or have asked you for access to your photos. And possibly you are one of the “worried well” who wear or carry a device that collects your health data and sends it back to storage in a faraway country, for analysis.

So unless you live in a lead-lined concrete bunker without any internet-connected devices, never pass webcams or sensors, and never use public transport or public services, your data is being collected and shared. And for the majority of the time, you benefit from this enormously. The bus stop tells you exactly when the next bus is coming, you have easy access to services and entertainment fitted very well to your needs, and you can do most of your bank and utility transactions online in the peace and quiet of your own home. Beyond you as an individual, there are organisations “out there” sharing your data to provide you with better healthcare, education, smarter city services and secure and efficient financial services, and generally matching the demand for services with the people needing them.

So we most likely all have data that is being shared and it is generally in our interest to share it, but how can we trust the organisations responsible for sharing our data? As an organisation, how can I know that my partner and supplier organisations are taking care of my client and product information?

Organisations taking these issues seriously are coming up with their own internal policy and guidelines. However, for data sharing between organisations to be straightforward, there needs to be a common understanding of basic policy and practice. During my time as a visiting associate at the Oxford Internet Institute, University of Oxford, I have developed a pragmatic solution in the form of a Voluntary Code. The Code has been produced using the guidelines for voluntary code development produced by the Office of Community Affairs, Industry Canada. More importantly, the Code is anchored in the developing ISO standards for the Governance of Data (the 38505 series). These standards apply the governance principles and model from the 38500 standard and introduce the concept of a data accountability map, highlighting six focus areas for a governing body to apply governance. The early stage standard suggests considering the aspects of Value, Risk and Constraint for each area, to determine what practice and policy should be applied to maximise the value from organisational data, whilst applying constraints as set by legislation and local policy, and minimising risk.

I am Head of the New Zealand delegation to the ISO group developing IT Service Management and IT Governance standards, SC40, and am leading the development of the 38505 series of Governance of Data standards, working with a talented editorial team of industry and standards experts from Australia, China and the Netherlands. I am confident that the robust ISO consensus-led process involving subject matter experts from around the world, will result in the publication of best practice guidance for the governance of data, presented in a format that will have relevance and acceptance internationally.

In the meantime, however, I see a need to provide urgent advice to organisations struggling with policy for sharing data. I have used my time at Oxford to interview policy, ethics, smart city, open data, health informatics, education, cyber security and social science experts and users, owners and curators of large data sets, and have come up with a “Voluntary Code for Data Sharing”. The Code takes three areas from the data accountability map in the developing ISO standard 38505-1; namely Collect, Store, Distribute, and applies the aspects of Value, Risk and Constraint to provide seven maxims for sharing data. To assist with adoption and compliance, the Code provides references to best practice and examples. As the ISO standards for the Governance of Data develop, the Code will be updated. New examples of good practice will be added as they come to light.
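The mapping the Code applies can be pictured as a simple grid: three lifecycle areas crossed with three aspects. The sketch below is illustrative only; the area and aspect names come from the text above, but the generated questions and the code itself are hypothetical and not part of the 38505 standard or the Code.

```python
# Illustrative sketch of the data accountability idea: crossing the three
# lifecycle areas taken from 38505-1 with the aspects of Value, Risk and
# Constraint yields a grid of questions a governing body should answer.
AREAS = ["Collect", "Store", "Distribute"]
ASPECTS = ["Value", "Risk", "Constraint"]

def accountability_grid(areas, aspects):
    """Return one placeholder policy question per (area, aspect) pair."""
    return {
        (area, aspect): f"What {aspect.lower()} applies when we {area.lower()} data?"
        for area in areas
        for aspect in aspects
    }

grid = accountability_grid(AREAS, ASPECTS)
print(len(grid))  # 3 areas x 3 aspects = 9 cells to consider
```

In this framing, the seven maxims of the Code would each distil the answers from one or more cells of such a grid into a single statement of practice.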

[A permanent home for the voluntary code is currently being organised; please email me in the meantime if you are interested in it:]

The Code is deliberately short and succinct, but it does provide links for those who need to read more to understand the underpinning practices and standards, and for those tasked with implementing organisational data policy and practice. It cannot guarantee good outcomes. With new security threats arising daily, nobody can fully guarantee the safety of your information. However, if you deal with an organisation that is compliant with the Voluntary Code, then you can at least have assurance that the organisation has considered how it is using your data now and how it might want to reuse it in the future, how and where your data will be stored, and finally how your data will be distributed or discarded. And that’s a good start!

Alison Holt was an OII Academic Visitor in late 2015. She is an internationally acclaimed expert in the Governance of Information Technology and Data, heading up the New Zealand delegations to the international standards committees for IT Governance and Service Management (SC40) and Software and Systems Engineering (SC7). The British Computer Society published Alison’s first book on the Governance of IT in 2013.

How big data is breathing new life into the smart cities concept

“Big data” is a growing area of interest for public policy makers: for example, it was highlighted in UK Chancellor George Osborne’s recent budget speech as a major means of improving efficiency in public service delivery. While big data can apply to government at every level, the majority of innovation is currently being driven by local government, especially cities, who perhaps have greater flexibility and room to experiment and who are constantly on a drive to improve service delivery without increasing budgets.

Work on big data for cities is increasingly incorporated under the rubric of “smart cities”. The smart city is an old(ish) idea: give urban policymakers real time information on a whole variety of indicators about their city (from traffic and pollution to park usage and waste bin collection) and they will be able to improve decision making and optimise service delivery. But the initial vision, which mostly centred around adding sensors and RFID tags to objects around the city so that they would be able to communicate, has thus far remained unrealised (big up front investment needs and the requirements of IPv6 are perhaps the most obvious reasons for this).

The rise of big data—large, heterogeneous datasets generated by the increasing digitisation of social life—has however breathed new life into the smart cities concept. If all the cars have GPS devices, all the people have mobile phones, and all opinions are expressed on social media, then do we really need the city to be smart at all? Instead, policymakers can simply extract what they need from a sea of data which is already around them. And indeed, data from mobile phone operators has already been used for traffic optimisation, Oyster card data has been used to plan London Underground service interruptions, sewage data has been used to estimate population levels, the examples go on.

However, at the moment these examples remain largely anecdotal, driven forward by a few cities rather than adopted worldwide. The big data driven smart city faces considerable challenges if it is to become a default means of policymaking rather than a conversation piece. Getting access to the right data; correcting for biases and inaccuracies (not everyone has a GPS, phone, or expresses themselves on social media); and communicating it all to executives remain key concerns. Furthermore, especially in a context of tight budgets, most local governments cannot afford to experiment with new techniques which may not pay off instantly.

This is the context of two current OII projects in the smart cities field: UrbanData2Decide (2014-2016) and NEXUS (2015-2017). UrbanData2Decide joins together a consortium of European universities, each working with a local city partner, to explore how local government problems can be resolved with urban generated data. In Oxford, we are looking at how open mapping data can be used to estimate alcohol availability; how website analytics can be used to estimate service disruption; and how internal administrative data and social media data can be used to estimate population levels. The best concepts will be built into an application which allows decision makers to access these concepts in real time.

NEXUS builds on this work. A collaborative partnership with BT, it will look at how social media data and some internal BT data can be used to estimate people movement and traffic patterns around the city, joining these data into network visualisations which are then displayed to policymakers in a data visualisation application. Both projects fill an important gap by allowing city officials to experiment with data driven solutions, providing proof of concepts and showing what works and what doesn’t. Increasing academic-government partnerships in this way has real potential to drive forward the field and turn the smart city vision into a reality.

OII Research Fellow Jonathan Bright is a political scientist specialising in computational and ‘big data’ approaches to the social sciences. His major interest concerns studying how people get information about the political process, and how this is changing in the internet era.

How can big data be used to advance dementia research?

Image by K. Kendall of “Sights and Scents at the Cloisters: for people with dementia and their care partners”; a program developed in consultation with the Taub Institute for Research on Alzheimer’s Disease and the Aging Brain, Alzheimer’s Disease Research Center at Columbia University, and the Alzheimer’s Association.

Dementia affects about 44 million individuals, a number that is expected to nearly double by 2030 and triple by 2050. With an estimated annual cost of USD 604 billion, dementia represents a major economic burden for both industrial and developing countries, as well as a significant physical and emotional burden on individuals, family members and caregivers. There is currently no cure for dementia or a reliable way to slow its progress, and the G8 health ministers have set the goal of finding a cure or disease-modifying therapy by 2025. However, the underlying mechanisms are complex, and shaped by a range of genetic and environmental factors that may have no immediately apparent connection to brain health.

Of course, medical research relies on access to large amounts of data, including clinical, genetic and imaging datasets. Making these widely available across research groups helps reduce data collection efforts, increases the statistical power of studies and makes data accessible to more researchers. This is particularly important from a global perspective: Swedish researchers say, for example, that they are sitting on a goldmine of excellent longitudinal and linked data on a variety of medical conditions including dementia, but that they have too few researchers to exploit its potential. Other countries will have many researchers, and less data.

‘Big data’ adds new sources of data and ways of analysing them to the repertoire of traditional medical research data. This can include (non-medical) data from online patient platforms, shop loyalty cards, and mobile phones — made available, for example, through Apple’s ResearchKit, just announced last week. As dementia is believed to be influenced by a wide range of social, environmental and lifestyle-related factors (such as diet, smoking, fitness training, and people’s social networks), this behavioural data has the potential to improve early diagnosis, as well as allow retrospective insights into events in the years leading up to a diagnosis. For example, data on changes in shopping habits (accessible through loyalty cards) may provide an early indication of dementia.
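As a purely illustrative sketch of that loyalty-card idea (the data, window and threshold are all invented here; real epidemiological work would need far more care around confounders and consent), a change in shopping habits could be surfaced by comparing each month's spend against a trailing average:

```python
# Toy change detector over monthly loyalty-card spend: flag months where
# behaviour drifts far from the trailing mean. Purely illustrative.
def flag_changes(monthly_spend, window=3, threshold=0.5):
    """Flag indices whose spend deviates from the trailing mean by more
    than `threshold` (as a fraction of that mean)."""
    flags = []
    for i in range(window, len(monthly_spend)):
        baseline = sum(monthly_spend[i - window:i]) / window
        if baseline and abs(monthly_spend[i] - baseline) / baseline > threshold:
            flags.append(i)
    return flags

spend = [120, 118, 125, 122, 60, 58, 55]  # a sudden, sustained drop
print(flag_changes(spend))  # → [4]: the month the habit changed
```

In practice a flagged change would of course only ever be one weak signal among many, not a diagnosis.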

However, there are many challenges to using and sharing big data for dementia research. The technology hurdles can largely be overcome, but there are also deep-seated issues around the management of data collection, analysis and sharing, as well as underlying people-related challenges in relation to skills, incentives, and mindsets. Change will only happen if we tackle these challenges at all levels jointly.

As data are combined from different research teams, institutions and nations—or even from non-medical sources—new access models will need to be developed that make data widely available to researchers while protecting the privacy and other interests of the data originator. Establishing robust and flexible core data standards that make data more sharable by design can lower barriers for data sharing, and help avoid researchers expending time and effort trying to establish the conditions of their use.

At the same time, we need policies that protect citizens against undue exploitation of their data. Consent needs to be understood by individuals—including the complex and far-reaching implications of providing genetic information—and should provide effective enforcement mechanisms to protect them against data misuse. Privacy concerns about digital, highly sensitive data are important and should not be de-emphasised as a subordinate goal to advancing dementia research. Beyond releasing data in protected environments, allowing people to voluntarily “donate data”, and making consent understandable and enforceable, we also need governance mechanisms that safeguard appropriate data use for a wide range of purposes. This is particularly important as the significance of data changes with its context of use, and data will never be fully anonymisable.

We also need a favourable ecosystem with stable and beneficial legal frameworks, and links between academic researchers and private organisations for exchange of data and expertise. Legislation needs to take account of the growing importance of global research communities in terms of funding and making best use of human and data resources. Also important is sustainable funding for data infrastructures, as well as an understanding that funders can have considerable influence on how research data, in particular, are made available. One of the most fundamental challenges in terms of data sharing is that there are relatively few incentives or career rewards that accrue to data creators and curators, so ways to recognise the value of shared data must be built into the research system.

In terms of skills, we need more health-/bioinformatics talent, as well as collaboration with those disciplines researching factors “below the neck”, such as cardiovascular or metabolic diseases, as scientists increasingly find that these may be associated with dementia to a larger extent than previously thought. Linking in engineers, physicists or innovative private sector organisations may prove fruitful for tapping into new skill sets to separate the signal from the noise in big data approaches.

In summary, everyone involved needs to adopt a mindset of responsible data sharing, collaborative effort, and a long-term commitment to building two-way connections between basic science, clinical care and the healthcare in everyday life. Fully capturing the health-related potential of big data requires “out of the box” thinking in terms of how to profit from the huge amounts of data being generated routinely across all facets of our everyday lives. This sort of data offers ways for individuals to become involved, by actively donating their data to research efforts, participating in consumer-led research, or engaging as citizen scientists. Empowering people to be active contributors to science may help alleviate the common feeling of helplessness faced by those whose lives are affected by dementia.

Of course, to do this we need to develop a culture that promotes trust between the people providing the data and those capturing and using it, as well as an ongoing dialogue about new ethical questions raised by collection and use of big data. Technical, legal and consent-related mechanisms to protect individuals’ sensitive biomedical and lifestyle-related data against misuse may not always be sufficient, as the recent Nuffield Council on Bioethics report has argued. For example, we need a discussion around the direct and indirect benefits to participants of engaging in research, when it is appropriate for data collected for one purpose to be put to other uses, and to what extent individuals can make decisions particularly on genetic data, which may have more far-reaching consequences for their own and their family members’ professional and personal lives if health conditions, for example, can be predicted by others (such as employers and insurance companies).

Policymakers and the international community have an integral leadership role to play in informing and driving the public debate on responsible use and sharing of medical data, as well as in supporting the process through funding, incentivising collaboration between public and private stakeholders, creating data sharing incentives (for example, via taxation), and ensuring stability of research and legal frameworks.

Dementia is a disease that concerns all nations in the developed and developing world, and just as diseases have no respect for national boundaries, neither should research into dementia (and the data infrastructures that support it) be seen as a purely national or regional priority. The high personal, societal and economic importance of improving the prevention, diagnosis, treatment and cure of dementia worldwide should provide a strong incentive for establishing robust and safe mechanisms for data sharing.

Read the full report: Deetjen, U., E. T. Meyer and R. Schroeder (2015) Big Data for Advancing Dementia Research. Paris, France: OECD Publishing.

Why does the Open Government Data agenda face such barriers?

Advocates hope that opening government data will increase government transparency, catalyse economic growth, and address social and environmental challenges. Image by the UK’s Open Data Institute.

Advocates of Open Government Data (OGD)—that is, data produced or commissioned by government or government-controlled entities that can be freely used, reused and redistributed by anyone—talk about the potential of such data to increase government transparency, catalyse economic growth, address social and environmental challenges and boost democratic participation. This heady mix of potential benefits has proved persuasive to the UK Government (and governments around the world). Over the past decade, since the emergence of the OGD agenda, the UK Government has invested extensively in making more of its data open. This investment has included £10 million to establish the Open Data Institute and a £7.5 million fund to support public bodies overcome technical barriers to releasing open data.

Yet the transformative impacts claimed by OGD advocates, in government as well as NGOs such as the Open Knowledge Foundation, still seem a rather distant possibility. Even the more modest goal of integrating the creation and use of OGD into the mainstream practices of government, businesses and citizens remains to be achieved. In my recent article Barriers to the Open Government Data Agenda: Taking a Multi-Level Perspective (Policy & Internet 6:3) I reflect upon the barriers preventing the OGD agenda from making a breakthrough into the mainstream. These reflections centre on the five key findings of a survey exploring where key stakeholders within the UK OGD community perceive barriers to the OGD agenda. The key messages from the UK OGD community are that:

1. Barriers to the OGD agenda are perceived to be widespread 

Unsurprisingly, given the relatively limited impact of OGD to date, my research shows that barriers to the OGD agenda are perceived to be widespread and numerous in the UK’s OGD community. What I find rather more surprising is the expectation, amongst policy makers, that these barriers ought to just melt away when exposed to the OGD agenda’s transparently obvious value and virtue. Given that the breakthrough of the OGD agenda (in actual fact) will require changes across the complex socio-technical structures of government and society, many teething problems should be expected, and considerable work will be required to overcome them.

2. Barriers on the demand side are of great concern

Members of the UK OGD community are particularly concerned about the wide range of demand-side barriers, including the low level of demand for OGD across civil society and the public and private sectors. These concerns are likely to have arisen as a legacy of the OGD community’s focus on the supply of OGD (such as public spending, prescription and geospatial data), which has often led the community to overlook the need to nurture initiatives that make use of OGD: for example innovators such as Carbon Culture who use OGD to address environmental challenges.

Adopting a strategic approach to supporting niches of OGD use could help overcome some of the demand-side barriers. For example, such an approach could foster the social learning required to overcome barriers relating to the practices and business models of data users. Whilst there are encouraging signs that the UK’s Open Data Institute (a UK Government-supported not-for-profit organisation seeking to catalyse the use of open data) is supporting OGD use in the private sector, there remains a significant opportunity to improve the support offered to potential OGD users across civil society. It is also important to recognise that increasing the support for OGD users is not guaranteed to result in increased demand. Rather the possibility remains that demand for OGD is limited for many other reasons—including the possibility that the majority of businesses, citizens and community organisations find OGD of very little value.

3. The structures of government continue to act as barriers

Members of the UK OGD community are also concerned that major barriers remain on the supply side, particularly in the form of the established structures and institutions of government. For example, barriers were perceived in the forms of the risk-averse cultures of government organisations and the ad hoc funding of OGD initiatives. Although resilient, these structures are dynamic, so proponents of OGD need to be aware of emerging ‘windows of opportunity’ as they open up. Such opportunities may take the form of tensions within the structures of government (e.g. where restrictions on data sharing between different parts of government present an opportunity for OGD to create efficiency savings); and external pressures on government (e.g. the pressure to transition to a low carbon economy could create opportunities for OGD initiatives and demand for OGD).

4. There are major challenges to mobilising resources to support the open government data agenda

The research results also showed that members of the UK’s OGD community see mobilising the resources required to support the OGD agenda as a major challenge. Concerns around securing funding are predictably prominent, but concerns also extend to developing the skills and knowledge required to use OGD across civil society, government and the private sector. These challenges are likely to persist whilst the post-financial crisis narrative of public deficit reduction through public spending reduction dominates the political agenda. This leaves OGD advocates to consider the politics and ethics of calling for investment in OGD initiatives, whilst spending reductions elsewhere are leading to the degradation of public services provision to vulnerable and socially excluded individuals.

5. The nature of some barriers remains contentious within the OGD community

OGD is often presented by advocates as a neutral, apolitical public good. However, my research highlights the important role that values and politics play in how individuals within the OGD community perceive the agenda and the barriers it faces. For example, there are considerable differences in opinion, within the OGD community, on whether or not a private sector focus on exploiting financial value from OGD is crowding out the creation of social and environmental value. So benefits may arise from advocates being more open about the values and politics that underpin and shape the agenda. At the same time, OGD-related policy and practice could create further opportunities for social learning that brings together the diverse values and perspectives that coexist within the OGD community.

Having considered the wide range of barriers to the breakthrough of OGD agenda, and some approaches to overcoming these barriers, these discussions need setting in a broader political context. If the agenda does indeed make a breakthrough into the mainstream, it remains unclear what form this will take. Will the OGD agenda make a breakthrough by conforming with, and reinforcing, prevailing neoliberal interests? Or will the agenda stretch the fabric of government, the economy and society, and transform the relationship between citizens and the state?

Read the full article: Martin, C. (2014) Barriers to the Open Government Data Agenda: Taking a Multi-Level Perspective. Policy & Internet 6 (3) 217-240.

Designing Internet technologies for the public good

MEPs failed to support a Green call to protect Edward Snowden as a whistleblower, in order to allow him to give his testimony to the European Parliament in March. Image by greensefa.

Computers have developed enormously since the Second World War: alongside a rough doubling of computer power every two years, communications bandwidth and storage capacity have grown just as quickly. Computers can now store much more personal data, process it much faster, and rapidly share it across networks.

Data is collected about us as we interact with digital technology, directly and via organisations. Many people volunteer data to social networking sites, and sensors—in smartphones, CCTV cameras, and “Internet of Things” objects—are making the physical world as trackable as the virtual. People are very often unaware of how much data is gathered about them—let alone the purposes for which it can be used. Also, most privacy risks are highly probabilistic, cumulative, and difficult to calculate. A student sharing a photo today might not be thinking about a future interview panel, or about how the heart rate data shared from a fitness gadget might affect future decisions by insurance and financial services companies (Brown 2014).

Rather than organisations waiting for something to go wrong, then spending large amounts of time and money trying (and often failing) to fix privacy problems, computer scientists have been developing methods for designing privacy directly into new technologies and systems (Spiekermann and Cranor 2009). One of the most important principles is data minimisation; that is, limiting the collection of personal data to that needed to provide a service—rather than storing everything that can be conveniently retrieved. This limits the impact of data losses and breaches, for example by corrupt staff with authorised access to data—a practice that the UK Information Commissioner’s Office (2006) has shown to be widespread.
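Data minimisation can be sketched very simply in code. In this hypothetical example (all field names invented), the service declares the fields it genuinely needs and everything else is dropped before anything reaches storage, so a later breach can only expose what was strictly necessary:

```python
# Minimal sketch of data minimisation: keep only the fields a service
# actually needs, rather than storing everything conveniently retrievable.
REQUIRED_FIELDS = {"account_id", "postcode"}  # what the service truly needs

def minimise(record, required=REQUIRED_FIELDS):
    """Drop every field not on the allow-list before storage."""
    return {k: v for k, v in record.items() if k in required}

raw = {
    "account_id": "A123",
    "postcode": "OX1 1AA",
    "date_of_birth": "1980-01-01",   # easy to collect, but not needed
    "browsing_history": ["..."],     # likewise: never stored
}
stored = minimise(raw)
print(sorted(stored))  # only account_id and postcode survive
```

The design choice here is the allow-list: new fields are excluded by default and must be explicitly justified before they are retained, which is the opposite of the store-everything default the paragraph above criticises.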

Privacy by design also protects against function creep (Gürses et al. 2011). When an organisation invests significant resources to collect personal data for one reason, it can be very tempting to use it for other purposes. While this is limited in the EU by data protection law, government agencies are in a good position to push for changes to national laws if they wish, bypassing such “purpose limitations.” Nor do these rules tend to apply to intelligence agencies.

Another key aspect of putting users in control of their personal data is making sure they know what data is being collected, how it is being used—and ideally being asked for their consent. There have been some interesting experiments with privacy interfaces, for example helping smartphone users understand who is asking for their location data, and what data has been recently shared with whom.

Smartphones have enough storage and computing capacity to do some tasks, such as showing users adverts relevant to their known interests, without sharing any personal data with third parties such as advertisers. This kind of user-controlled data storage and processing has all kinds of applications—for example, with smart electricity meters (Danezis et al. 2013), and congestion charging for roads (Balasch et al. 2010).
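A hedged sketch of that on-device idea (all names, ads and tags here are invented): the advertiser ships its whole catalogue to the phone, and matching against the user's interests happens locally, so the interest profile never leaves the device and only the chosen advert is rendered.

```python
# Illustrative on-device ad selection: the interest profile stays local;
# no personal data is sent to the advertiser.
ADS = [  # catalogue delivered to the device in bulk
    {"id": "ad1", "tags": {"cycling", "outdoors"}},
    {"id": "ad2", "tags": {"finance"}},
    {"id": "ad3", "tags": {"cooking", "outdoors"}},
]

def pick_ad(local_interests, ads=ADS):
    """Choose the ad with the largest tag overlap, entirely on-device."""
    return max(ads, key=lambda ad: len(ad["tags"] & local_interests))["id"]

# The interest set lives only in local storage; only the ad id is used.
print(pick_ad({"outdoors", "cycling"}))  # → ad1
```

The same pattern—send the computation to the data rather than the data to the computation—underlies the smart metering and congestion charging schemes cited above.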

What broader lessons can be drawn about shaping technologies for the public good? What is the public good, and who gets to define it? One option is to look at opinion polling about public concerns and values over long periods of time. The European Commission’s Eurobarometer polls reveal that in most European countries (including the UK), people have had significant concerns about data privacy for decades.

A more fundamental view of core social values can be found at the national level in constitutions, and between nations in human rights treaties. As well as the protection of private life and correspondence in the European Convention on Human Rights’ Article 8, the freedom of thought, expression, association and assembly rights in Articles 9-11 (and their equivalents in the US Bill of Rights, and the International Covenant on Civil and Political Rights) are also relevant.

This national and international law restricts how states use technology to infringe human rights—even for national security purposes. There are several US legal challenges to the constitutionality of NSA communications surveillance, with a federal court in Washington DC finding that bulk access to phone records is against the Fourth Amendment [1] (but another court in New York finding the opposite [2]). The UK campaign groups Big Brother Watch, Open Rights Group, and English PEN have taken a case to the European Court of Human Rights, arguing that UK law in this regard is incompatible with the Human Rights Convention.

Can technology development be shaped more broadly to reflect such constitutional values? One of the best-known attempts is the European Union’s data protection framework. Privacy is a core European political value, not least because of the horrors of the Nazi and Communist regimes of the 20th century. Germany, France and Sweden all developed data protection laws in the 1970s in response to the development of automated systems for processing personal data, followed by most other European countries. The EU’s Data Protection Directive (95/46/EC) harmonises these laws, and has provisions that encourage organisations to use technical measures to protect personal data.

An update of this Directive, which the European Parliament has been debating over the last year, more explicitly includes this type of regulation by technology. Under this General Data Protection Regulation, organisations that process personal data will have to implement appropriate technical measures to protect data subjects’ rights. By default, organisations should only collect the minimum personal data they need, and allow individuals to control the distribution of their personal data. The Regulation would also require companies to make it easier for users to download all of their data, so that it could be uploaded to a competitor service (for example, one with better data protection)—bringing market pressure to bear (Brown and Marsden 2013).

This type of technology regulation is not uncontroversial. Viviane Reding, the European Commissioner responsible for the Data Protection Regulation until July, said that she had seen unprecedented and “absolutely fierce” lobbying against some of its provisions. Legislators would clearly be foolish to try to micro-manage the development of new technology. But the EU’s principles-based approach to privacy has been internationally influential, with over 100 countries now having adopted the Data Protection Directive or similar laws (Greenleaf 2014).

If the EU can find the right balance in its Regulation, it has the opportunity to set the new global standard for privacy-protective technologies—a very significant opportunity indeed in the global marketplace.

[1] Klayman v. Obama, 2013 WL 6571596 (D.D.C. 2013)

[2] ACLU v. Clapper, No. 13-3994 (S.D. New York December 28, 2013)


Balasch, J., Rial, A., Troncoso, C., Preneel, B., Verbauwhede, I. and Geuens, C. (2010) PrETP: Privacy-preserving electronic toll pricing. 19th USENIX Security Symposium, pp. 63–78.

Brown, I. (2014) The economics of privacy, data protection and surveillance. In J.M. Bauer and M. Latzer (eds.) Research Handbook on the Economics of the Internet. Cheltenham: Edward Elgar.

Brown, I. and Marsden, C. (2013) Regulating Code: Good Governance and Better Regulation in the Information Age. Cambridge, MA: MIT Press.

Danezis, G., Fournet, C., Kohlweiss, M. and Zanella-Beguelin, S. (2013) Smart Meter Aggregation via Secret-Sharing. ACM Smart Energy Grid Security Workshop.

Greenleaf, G. (2014) Sheherezade and the 101 data privacy laws: Origins, significance and global trajectories. Journal of Law, Information & Science.

Gürses, S., Troncoso, C. and Diaz, C. (2011) Engineering Privacy by Design. Computers, Privacy & Data Protection.

Haddadi, H., Hui, P., Henderson, T. and Brown, I. (2011) Targeted Advertising on the Handset: Privacy and Security Challenges. In Müller, J., Alt, F. and Michelis, D. (eds) Pervasive Advertising. Heidelberg: Springer, pp. 119–137.

Information Commissioner’s Office (2006) What price privacy? HC 1056.

Spiekermann, S. and Cranor, L.F. (2009) Engineering Privacy. IEEE Transactions on Software Engineering 35 (1).

Read the full article: Keeping our secrets? Designing Internet technologies for the public good, European Human Rights Law Review 4: 369-377. This article is adapted from Ian Brown’s 2014 Oxford London Lecture, given at Church House, Westminster, on 18 March 2014, supported by Oxford University’s Romanes fund.

Professor Ian Brown is Associate Director of Oxford University’s Cyber Security Centre and Senior Research Fellow at the Oxford Internet Institute. His research is focused on information security, privacy-enhancing technologies, and Internet regulation.

Responsible research agendas for public policy in the era of big data

Last week the OII went to Harvard. Against the backdrop of a gathering storm of interest around the potential of computational social science to contribute to the public good, we sought to bring together leading social science academics with senior government agency staff to discuss its public policy potential. Supported by the OII-edited journal Policy and Internet and its owners, the Washington-based Policy Studies Organization (PSO), this one-day workshop facilitated a thought-provoking conversation between leading big data researchers such as David Lazer, Brooke Foucault-Welles and Sandra Gonzalez-Bailon, e-government experts such as Cary Coglianese, Helen Margetts and Jane Fountain, and senior agency staff from US federal bureaus including the Bureau of Labor Statistics, the Census Bureau, and the Office of Management and Budget.

It’s often difficult to appreciate the impact of research beyond the ivory tower, but what this productive workshop demonstrated is that policy-makers and academics share many similar hopes and challenges in relation to the exploitation of ‘big data’. Our motivations and approaches may differ, but insofar as the youth of the ‘big data’ concept explains the lack of common language and understanding, there is value in mutual exploration of the issues. Although it’s impossible to do justice to the richness of the day’s interactions, some of the most pertinent and interesting conversations arose around the following four issues.

Managing a diversity of data sources. In a world where our capacity to ask important questions often exceeds the availability of data to answer them, many participants spoke of the difficulties of managing a diversity of data sources. For agency staff this issue comes into sharp focus when available administrative data that is supposed to inform policy formulation is either incomplete or inadequate. Consider, for example, the challenge of regulating an economy in a situation of fundamental data asymmetry, where private sector institutions track, record and analyse every transaction, whilst the state only has access to far more basic performance metrics and accounts. Such asymmetric data practices also affect academic research, where once again private sector tech companies such as Google, Facebook and Twitter often offer access only to portions of their data. In both cases participants gave examples of creative solutions using merged or blended data sources, which raise significant methodological and ethical difficulties that merit further attention. The Berkman Center’s Rob Faris also noted the challenges of combining ‘intentional’ and ‘found’ data, where the former allow far greater certainty about the circumstances of their collection.

Data dictating the questions. If participants expressed the need to expend more effort on getting the most out of available but diverse data sources, several also cautioned against the dangers of letting data availability dictate the questions that could be asked. As we’ve experienced at the OII, for example, the availability of Wikipedia or Twitter data means that questions of unequal digital access (to political resources, knowledge production etc.) can often be addressed through the lens of these applications or platforms. But these data can provide only a snapshot, and large questions of great social or political importance may not easily be answered through such proxy measurements. Similarly, big data may be very helpful in providing insights into policy-relevant patterns or correlations, such as identifying early indicators of seasonal diseases or neighbourhood decline, but seem ill-suited to answer difficult questions regarding say, the efficacy of small-scale family interventions. Just because the latter are harder to answer using currently vogue-ish tools doesn’t mean we should cease to ask these questions.

Ethics. Concerns about privacy are frequently raised as a significant limitation of the usefulness of big data. Given that with two or more data sets even supposedly anonymous data subjects may be identified, the general consensus seems to be that ‘privacy is dead.’ Whilst all participants recognised the importance of public debate around this issue, several academics and policy-makers expressed a desire to get beyond this discussion to a more nuanced consideration of appropriate ethical standards. Accountability and transparency are often held up as more realistic means of protecting citizens’ interests, but one workshop participant also suggested it would be helpful to encourage more public debate about acceptable and unacceptable uses of our data, to determine whether some uses might simply be deemed ‘off-limits’, whilst other uses could be accepted as offering few risks.
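The re-identification risk raised here, that joining two or more datasets can unmask nominally anonymous records, can be sketched in a few lines. This is a hypothetical toy with invented records, echoing well-known linkage attacks that join on quasi-identifiers such as postcode, birth date and sex:

```python
# A toy "anonymised" hospital extract: direct identifiers (names) removed,
# but quasi-identifiers (ZIP code, date of birth, sex) retained.
medical = [
    {"zip": "02138", "dob": "1945-07-31", "sex": "F", "diagnosis": "hypertension"},
    {"zip": "02139", "dob": "1962-02-14", "sex": "M", "diagnosis": "asthma"},
]

# A second, public dataset (e.g. an electoral roll) sharing those fields.
voters = [
    {"name": "Jane Doe", "zip": "02138", "dob": "1945-07-31", "sex": "F"},
    {"name": "John Roe", "zip": "02140", "dob": "1980-01-01", "sex": "M"},
]

# Joining the two tables on the quasi-identifiers re-attaches names to
# diagnoses, defeating the "anonymisation".
reidentified = [
    (v["name"], m["diagnosis"])
    for m in medical
    for v in voters
    if (m["zip"], m["dob"], m["sex"]) == (v["zip"], v["dob"], v["sex"])
]
print(reidentified)  # [('Jane Doe', 'hypertension')]
```

In realistic datasets a large fraction of a population is uniquely identified by just such a small set of quasi-identifiers, which is why removing names alone offers weak protection.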

Accountability. Following on from this debate about the ethical limits of our uses of big data, discussion exposed the starkly differing standards to which government and academics (to say nothing of industry) are held accountable. As agency officials noted on several occasions it matters less what they actually do with citizens’ data, than what they are perceived to do with it, or even what it’s feared they might do. One of the greatest hurdles to be overcome here concerns the fundamental complexity of big data research, and the sheer difficulty of communicating to the public how it informs policy decisions. Quite apart from the opacity of the algorithms underlying big data analysis, the explicit focus on correlation rather than causation or explanation presents a new challenge for the justification of policy decisions, and consequently, for public acceptance of their legitimacy. As Greg Elin of Gitmachines emphasised, policy decisions are still the result of explicitly normative political discussion, but the justifiability of such decisions may be rendered more difficult given the nature of the evidence employed.

We could not resolve all these issues over the course of the day, but they served as pivot points for honest and productive discussion amongst the group. If nothing else, they demonstrate the value of interaction between academics and policy-makers in a research field where the stakes are set very high. We plan to reconvene in Washington in the spring.

*We are very grateful to the Policy Studies Organization (PSO) and the American Public University for their generous support of this workshop. The workshop “Responsible Research Agendas for Public Policy in the Era of Big Data” was held at the Harvard Faculty Club on 13 September 2013.

Also read: Big Data and Public Policy Workshop by Eric Meyer, workshop attendee and PI of the OII project Accessing and Using Big Data to Advance Social Science Knowledge.

Victoria Nash completed a First Class BA (Hons) Degree in Politics, Philosophy and Economics, received her M.Phil in Politics from Magdalen College in 1996, and went on to complete a D.Phil in Politics at Nuffield College, Oxford University in 1999. She was a Research Fellow at the Institute of Public Policy Research prior to joining the OII in 2002. As Research and Policy Fellow at the OII, her work seeks to connect OII research with policy and practice, identifying and communicating the broader implications of OII’s research into Internet and technology use.

Time for debate about the societal impact of the Internet of Things

The 2nd Annual Internet of Things Europe 2010: A Roadmap for Europe, 2010. Image by Pierre Metivier.

On 17 April 2013, the US Federal Trade Commission published a call for inputs on the ‘consumer privacy and security issues posed by the growing connectivity of consumer devices, such as cars, appliances, and medical devices’, in other words, about the impact of the Internet of Things (IoT) on the everyday lives of citizens. The call is in large part one for information to establish what the current state of technology development is and how it will develop, but it also looks for views on how privacy risks should be weighed against potential societal benefits.

There’s a lot that’s not very new about the IoT. Embedded computing, sensor networks and machine to machine communications have been around a long time. Mark Weiser was developing the concept of ubiquitous computing (and prototyping it) at Xerox PARC in 1990. Many of the big ideas in the IoT—smart cars, smart homes, wearable computing—are already envisaged in works such as Nicholas Negroponte’s Being Digital, which was published in 1995 before the mass popularisation of the internet itself. The term ‘Internet of Things’ has been around since at least 1999. What is new is the speed with which technological change has made these ideas implementable on a societal scale. The FTC’s interest reflects a growing awareness of the potential significance of the IoT, and the need for public debate about its adoption.

As the cost and size of devices falls and network access becomes ubiquitous, it is evident that not only major industries but whole areas of consumption, public service and domestic life will be capable of being transformed. The number of connected devices is likely to grow fast in the next few years. The Organisation for Economic Co-operation and Development (OECD) estimates that while a family with two teenagers may have 10 devices connected to the internet today, by 2022 this may well grow to 50 or more. Across the OECD area the number of connected devices in households may rise from an estimated 1.7 billion today to 14 billion by 2022. Programmes such as smart cities, smart transport and smart metering will begin to have their effect soon. In other countries, notably in China and Korea, whole new cities are being built around smart infrastructure, giving technology companies the opportunity to develop models that could be implemented subsequently in Western economies.

Businesses and governments alike see this as an opportunity for new investment both as a basis for new employment and growth and for the more efficient use of existing resources. The UK Government is funding a strand of work under the auspices of the Technology Strategy Board on the IoT, and the IoT is one of five themes that are the subject of the Department for Business, Innovation & Skills (BIS)’s consultation on the UK’s Digital Economy Strategy (alongside big data, cloud computing, smart cities, and eCommerce).

The enormous quantity of information that will be produced will provide further opportunities for collecting and analysing big data. There is consequently an emerging agenda about privacy, transparency and accountability. There are challenges too to the way we understand and can manage the complexity of interacting systems that will underpin critical social infrastructure.

The FTC is not alone in looking to open public debate about these issues. In February, the OII and BCS (the Chartered Institute for IT) ran a joint seminar to help the BCS’s consideration about how it should fulfil its public education and lobbying role in this area. A summary of the contributions is published on the BCS website.

The debate at the seminar was wide ranging. There was no doubt that the train has left the station as far as this next phase of the Internet is concerned: the scale of major corporate investment, government encouragement and entrepreneurial enthusiasm is not to be deflected. In many sectors of the economy there are changes that are already being felt by consumers, or will be soon enough; smart metering, smart grid, and transport automation (including cars) are all examples. A lot of the discussion focused on risk. In a society which places high value on audit and accountability, it is perhaps unsurprising that early implementations have often used sensors and tags to track processes and monitor activity. This is especially attractive in industrial structures that have high degrees of subcontracting.

Wider societal risks were also discussed. As for the FTC, the privacy agenda is salient. There is real concern that the assumptions which underlie the data protection regime, especially its reliance on data minimisation, will not be adequate to protect individuals in an era of ubiquitous data. Nor is it clear that the UK’s regulator, the Information Commissioner, will be equipped to deal with the volume of potential business. Alongside privacy, there is also concern for security and the protection of critical infrastructure. The growth of reliance on the IoT will make cybersecurity significant in many new ways. There are issues too about complexity and the unforeseen, and arguably unforeseeable, consequences of the interactions between complex, large, distributed systems acting in real time, and with consequences that go very directly to the wellbeing of individuals and communities.

There are great opportunities and a pressing need for social research into the IoT. Data about social impacts have hitherto been limited, given the relatively few systems so far deployed. This will change rapidly. As governments consult and bodies like the BCS seek to advise, it’s very desirable that public debate about privacy and security, access and governance, take place on the basis of real evidence and sound analysis.

eHealth: what is needed at the policy level? New special issue from Policy and Internet

The explosive growth of the Internet and its omnipresence in people’s daily lives has facilitated a shift in information seeking on health, with the Internet now a key information source for the general public, patients, and health professionals. The Internet also has obvious potential to drive major changes in the organisation and delivery of health services, and many initiatives are harnessing technology to support user empowerment. For example, current health reforms in England are leading to a fragmented, marketised National Health Service (NHS), where competitive choice designed to drive quality improvement and efficiency savings is informed by transparency and patient experiences, with the notion of an empowered health consumer at its centre.

Is this aim of achieving user empowerment realistic? In their examination of health queries submitted to the NHS Direct online enquiry service, John Powell and Sharon Boden find that while patient empowerment does occur in the use of online health services, it is constrained and context dependent. Policymakers wishing to promote greater choice and control among health system users should therefore take account of the limits to empowerment as well as barriers to participation. The Dutch government’s online public national health and care portal similarly aims to facilitate consumer decision-making and to increase transparency and accountability, thereby improving the quality of care and the functioning of health markets. Interestingly, Hans Ossebaard, Lisette van Gemert-Pijnen and Erwin Seydel find the influence of the Dutch portal on choice behaviour, awareness, and empowerment of users to actually be small.

The Internet is often discussed in terms of empowering (or even endangering) patients through broadening of access to medical and health-related information, but there is evidence that concerns about serious negative effects of using the Internet for health information may be ill-founded. The cancer patients in the study by Alison Chapple, Julie Evans and Sue Ziebland gave few examples of harm from using the Internet or of damage caused to their relationships with health professionals. While policy makers have tended to focus on regulating the factual content of online information, in this study it was actually the consequences of stumbling on factually correct (but unwelcome) information that most concerned the patients and families; good practice guidelines for health information may therefore need to pay more attention to website design and user routing, as well as to the accuracy of content.

Policy makers and health professionals should also acknowledge the often highly individual strategies people use to access health information online, and understand how these practices are shaped by technology—the study by Astrid Mager found that the way people collected and evaluated online information about chronic diseases was shaped by search engines as much as by their individual medical preferences.

Many people still lack the necessary skills to navigate online content effectively. Eszter Hargittai and Heather Young examined the experiences of a diverse group of young adults looking for information about emergency contraception online, finding that the majority of the study group could not identify the most efficient way of acquiring emergency contraception in a time of need. Given the increasing trend for people to turn to the Internet for health information, users must possess the necessary skills to make effective and efficient use of it; an important component of this may concern educational efforts to help people better navigate the Web. Improving general e-Health literacy is one of several recommendations by Maria De Jesus and Chenyang Xiao, who examined how Hispanic adults in the United States search for health information online. They report a striking language divide, with English proficiency of the user largely predicting online health information-seeking behavior.

Lastly, but no less importantly, is the policy challenge of addressing the issue of patient trust. The study by Ulrike Rauer on the structural and institutional factors that influence patient trust in Internet-based health records found that while patients typically considered medical operators to be more trustworthy than non-medical ones, there was no evidence of a “public–private” divide; patients perceived physicians and private health insurance providers to be more trustworthy than the government and corporations. Patient involvement in terms of access and control over their records was also found to be trust enhancing.

A lack of policy measures is a common barrier to success of eHealth initiatives; it is therefore essential that we develop measures that facilitate the adoption of initiatives and that demonstrate their success through improvement in services and the health status of the population. The articles presented in this special issue of Policy & Internet provide the sort of evidence-based insight that is urgently needed to help shape these policy measures. The empirical research and perspectives gathered here will make a valuable contribution to future efforts in this area.