
Big data technology offers unprecedented opportunities to society as a whole and also to its individual members. At the same time, this technology poses significant risks to those it overlooks. In this article, we give an overview of recent technical work on diversity, particularly in selection tasks, discuss connections between diversity and fairness, and identify promising directions for future work that will position diversity as an important component of a data-responsible society. We argue that diversity should come to the forefront of our discourse, for reasons that are both ethical (to mitigate the risks of exclusion) and utilitarian (to enable more powerful, accurate, and engaging data analysis and use).

Cite this article as: Drosou M, Jagadish HV, Pitoura E, Stoyanovich J (2017) Diversity in big data: a review. Big Data 5:2, 73–84, DOI: 10.1089/big.2016.0054.

Published In

Big Data
Volume 5, Issue 2, June 2017
Pages: 73–84
PubMed: 28632443


Published in print: June 2017
Published online: 1 June 2017


Marina Drosou
Department of Computer Science, University of Ioannina, Ioannina, Greece.
H.V. Jagadish
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan.
Evaggelia Pitoura
Department of Computer Science, University of Ioannina, Ioannina, Greece.
Julia Stoyanovich*
Department of Computer Science, Drexel University, Philadelphia, Pennsylvania.


Address correspondence to: Julia Stoyanovich, Department of Computer Science, Drexel University, Philadelphia, PA 19104

