Ricardo Baeza-Yates: Bias on the Web
Abstract: The Web is the most powerful communication medium and the largest public data repository that humankind has created. Its content ranges from great reference sources such as Wikipedia to ugly fake news. Indeed, social (digital) media is just an amplifying mirror of ourselves. Hence, the main challenge for search engines and other websites that rely on web data is to assess the quality of that data. However, because all people have their own biases, web content as well as our web interactions are tainted with many biases. Data bias includes redundancy and spam, while interaction bias includes activity and presentation bias. In addition, algorithms sometimes add bias, particularly in the context of search and recommendation systems. As bias generates bias, we stress the importance of debiasing data as well as using context and other techniques, such as explore & exploit, to break the filter bubble. The main goal of this talk is to make people aware of the different biases that affect all of us on the Web. Awareness is the first step toward fighting and reducing the vicious cycle of web bias. For more details, see the article of the same title in Communications of the ACM, June 2018.
Bio: Ricardo Baeza-Yates is a Research Professor at Northeastern University's Roux Institute for Experiential AI, based at the Silicon Valley Campus, where he was Director of Data Science between 2017 and 2020. Before that, he was VP of Research at Yahoo Labs, based in Barcelona, Spain, and later in Sunnyvale, California, from 2006 to 2016. Until 2005 he was an ICREA Research Professor at Universitat Pompeu Fabra in Catalonia, and until 2004 he was Professor and founding director of the Center for Web Research at the University of Chile. He obtained a Ph.D. in CS from the University of Waterloo, Canada, in 1989. He is co-author of the best-selling textbook Modern Information Retrieval, published by Addison-Wesley in 2011 (2nd ed.), which won the ASIST 2012 Book of the Year award. From 2002 to 2004 he served on the elected Board of Governors of the IEEE Computer Society, and from 2012 to 2016 on the elected ACM Council. Since 2010 he has been a founding member of the Chilean Academy of Engineering. He was named ACM Fellow in 2009 and IEEE Fellow in 2011, among other awards and distinctions. His areas of expertise are web search and data mining, information retrieval, AI fairness, data science, and their associated algorithms.
Ceren Budak: How to Regulate Disinformation and Toxicity Online
Abstract: Problems such as incivility and disinformation are commonplace online. While there have been efforts to combat these problems—such as the use of moral suasion to curb incivility and media literacy to curb misinformation—these approaches generally lack a unified theoretical framework that allows for a systematic exploration of the solution space. In this talk, I will walk us through two of these problems—disinformation and toxicity—and examine them through the lens of a socioeconomic theory of regulation originally introduced by Larry Lessig. According to this model, four modalities regulate behavior both online and offline: norms, law, market, and architecture. Norms constrain through the sanctions or rules of a community. Law regulates through the punishment of the state. Markets constrain through price. Finally, architecture—the built environment, or code in online spaces—constrains through the structural burdens it imposes. We will discuss results of two large-scale studies on how markets can be used to curb misinformation and how norms shape toxicity online.
Bio: Ceren Budak is an Assistant Professor of Information at the School of Information and an Assistant Professor of Electrical Engineering and Computer Science at the University of Michigan. Her research interests lie in the area of computational social science. She utilizes network science, machine learning, and crowdsourcing methods and draws from scientific knowledge across multiple social science communities to contribute computational methods to the field of political communication.
Munmun De Choudhury: Interdisciplinary and Collaborative Approaches to Digital Mental Health: A Tale of Engaging with Three Stakeholders
Abstract: Digital traces, such as social media data, supported by advances in computer science, are increasingly being used to understand the mental health of individuals and populations. With these approaches offering promise to change the status quo in mental health for the first time since the mid-20th century, interdisciplinary collaborations have been greatly emphasized. But what are some models of engagement for computer scientists that augment existing capabilities while minimizing the risk of harm? This talk will describe experiences from working with three different stakeholders on projects relating to digital mental health – first with a federal agency, second with healthcare providers, and third with a non-profit organization. The talk hopes to present some lessons learned from these engagements, and to reflect on the approaches we need to realize a dream of many computer scientists: having their research contribute to positive societal impacts.
Bio: Munmun De Choudhury, PhD, is an Associate Professor of Interactive Computing at Georgia Tech. Dr. De Choudhury is best known for laying the foundation of a new line of research that develops computational techniques to responsibly and ethically employ social media in improving our mental health. To do this work, she adopts a highly interdisciplinary approach, combining machine learning with social and clinical science. Dr. De Choudhury was recognized with the Complex Systems Society Junior Scientific Award in 2019 and several best paper and honorable mention awards, and her work has been featured in the popular press, including the New York Times, NPR, and the BBC. Earlier, Dr. De Choudhury was a faculty associate with the Berkman Klein Center for Internet & Society at Harvard and a postdoc at Microsoft Research; she obtained her PhD in Computer Science from Arizona State University.
Sheldon Jacobson: Models for Generating NCAA Men's Basketball Tournament Bracket Pools
Abstract: Each year, the NCAA Division I Men’s Basketball Tournament attracts popular attention, including bracket challenges where fans seek to pick the winners of the tournament’s games. However, the quantity and unpredictable nature of the games suggest that a single bracket will likely select some winning teams incorrectly, even if created with insightful and sophisticated methods. Hence, a participant may wish to create a pool of brackets that is likely to contain at least one high-scoring bracket. We propose the Power Model to estimate the probability mass function of all possible tournament outcomes based on past tournament data. Bracket pools are generated for the 2013–2018 tournaments using five variations of the Power Model. The generated brackets are assessed under the ESPN scoring system and compared to those produced by a traditional pick-favorite approach, as well as to the highest-scoring brackets in the ESPN Tournament Challenge for each year. More information on this and related research can be found at bracketodds.cs.illinois.edu.
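The pool-generation idea above can be sketched in a few lines of Python. Note this is only an illustration of sampling bracket outcomes from seed-based win probabilities: the power-law form of `win_prob`, the parameter `alpha`, and the single-region simulation are hypothetical assumptions for exposition, not the paper's fitted Power Model or its five variations.

```python
import random

def win_prob(seed_a, seed_b, alpha=1.0):
    """Hypothetical power-law estimate of P(seed_a beats seed_b).

    Lower seeds (stronger teams) get higher strength; alpha tunes how
    strongly seed differences matter. Not the paper's fitted model.
    """
    sa, sb = seed_a ** -alpha, seed_b ** -alpha
    return sa / (sa + sb)

def simulate_region(seeds, alpha=1.0, rng=random):
    """Simulate one single-elimination region; return the winning seed."""
    teams = list(seeds)
    while len(teams) > 1:
        winners = []
        for a, b in zip(teams[::2], teams[1::2]):
            # Sample each game's outcome from the estimated win probability.
            winners.append(a if rng.random() < win_prob(a, b, alpha) else b)
        teams = winners
    return teams[0]

def bracket_pool(n, alpha=1.0, seed=None):
    """Generate a pool of n sampled region outcomes (standard 16-seed order)."""
    rng = random.Random(seed)
    seeds = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]
    return [simulate_region(seeds, alpha, rng) for _ in range(n)]
```

Because each bracket is an independent sample from the estimated outcome distribution, a large enough pool is likely to contain at least one high-scoring bracket, which is the rationale behind generating pools rather than a single "best guess" bracket.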
Bio: Sheldon H. Jacobson, Ph.D., is a Founder Professor of Computer Science at the University of Illinois (Urbana-Champaign). He has a B.Sc. in Mathematics from McGill University and a Ph.D. in Operations Research from Cornell University. His research interests span theory and practice, covering decision-making under uncertainty using optimization-based AI and online optimization models, with applications in aviation security, health care, election forecasting, and sports. He has studied the analytics of NCAA basketball bracketology since 2006. His research has been quoted and featured by CBS Sports, NBC Sports, Bleacher Report, the Chicago Tribune, USA Today, and Bloomberg, and is highlighted on the website bracketodds.cs.illinois.edu.
Rishab Nithyanand: Privacy: Where We Are and What the Future Holds
Abstract: This talk provides an overview of the state of privacy in society today, the technical and social developments that got us here, and the challenges we are likely to face in the future if we continue down our current path. Finally, we will discuss the obstacles to navigating toward a more privacy-friendly future, and our Internet measurement research that works toward circumventing them.
Bio: Rishab is an Assistant Professor in the Department of Computer Science at the University of Iowa, where he heads the SPARTA Lab. Prior to joining the faculty at UIowa, he was a Ford-Mozilla Open Web Fellow at the Data & Society Research Institute and a visiting researcher at the International Computer Science Institute at UC Berkeley; he obtained his PhD from Stony Brook University. His research interests are in security, privacy, and Internet measurement, and in understanding the impact of the Internet on today's sociopolitical realities.
Alexandra Olteanu: Measuring Objectionable Behaviors by Humans and Machines
Abstract: There is a rich and long-standing literature on detecting and mitigating a wide range of biased, objectionable, or deviant content and behaviors, including hateful and offensive speech, misinformation, and discrimination. There is also a growing literature on fairness, accountability, and transparency in computational systems that is concerned with how such systems may inadvertently engender, reinforce, and amplify such behaviors. While many systems have become increasingly proficient at identifying clear cases of objectionable content and behaviors---by both humans and machines---many challenges persist. Existing efforts tend to focus on issues that we know to look for; techniques for preempting future issues that may not yet be on product teams' and the research community's radar are not nearly as well developed or understood. Addressing this gap requires deep dives into specific application areas.
In this talk, I will share reflections on why many of these challenges continue to linger, including issues arising from data generation and collection practices, reliance on proxy measurements, the construction of training and testing data sets, and the design of evaluation metrics. I will ground our discussion in some of our recent research on designing frameworks for auditing predictive text in downstream applications, focusing on web search autocomplete suggestions and email response suggestions.
Bio: Alexandra Olteanu is part of Microsoft Research’s Fairness, Accountability, Transparency, and Ethics (FATE) group. Her research interests are in computational social science, social computing, crisis computing, social systems, social media, data biases, data quality, algorithmic discrimination, and social good applications.