- Our Work
Marina Beshai, COS (completed IW as a junior)
Title of Project: Political Movements in the Age of Social Media: An Analysis of Twitter’s Role in the Egyptian Crisis
Governments worldwide and the media often blame social media companies, rather than the individuals involved, for civil unrest. Claiming social media to be a threat to democracy, governments heavily moderate platforms and suppress activists. Drawing on more than six million #Egypt tweets published during the 2011 Egyptian crisis, this study explores the relationship between the in-person demonstrations and the online Twitter movement to observe how the two complemented and influenced one another. The rise and fall of #Mubarak on Twitter, followed by the rise of #noscaf (No Supreme Council of the Armed Forces), shows how the topics trending online mirrored the grievances of protesters: to a certain extent, the protesters were controlling the narrative. And the sheer number of associated country hashtags (154 of the 195 present-day countries were associated with #Egypt), to say nothing of how they were used, implies a connected, worldwide community. Natural Language Processing (NLP) showed that English speakers were consistently more negative in their tweets than their Arabic-speaking counterparts; not once did Arabic users express a more negative outlook. Despite the large gap between the two groups, the correlation coefficient between the Arabic and English scores was 0.46, indicating a moderate positive linear relationship between the general moods of the two parties. A holistic analysis of tweets during the internet blackout in Egypt showed that many users around this time were increasingly concerned for the safety of protesters. On January 28, 2011, the first day of the blackout, tweet frequency spiked by 82,020 tweets, about 1% of all 2011 tweets containing #Egypt. Topic modeling showed that on seven of the ten days of the blackout, ‘freeEgypt’ or ‘freedom’ was among the most frequently used words, aptly capturing users’ general attitude.
In all, results suggested that there is a give and take relationship whereby users inside the country greatly influence the platform at the start of demonstrations, and in turn, receive support and aid from users outside of the country later on.
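The language-level sentiment comparison lends itself to a short illustration. The sketch below computes a Pearson correlation between two daily sentiment series; the scores are invented placeholders, not the study’s #Egypt data, and the real analysis derived its scores with NLP rather than hand-entered values.

```python
# Sketch: comparing daily mean sentiment of two language groups, as in
# the English-vs-Arabic comparison above. All numbers are invented.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical daily mean sentiment (more negative = more negative tone).
english = [-0.42, -0.35, -0.50, -0.28, -0.44, -0.39, -0.47]
arabic  = [-0.20, -0.15, -0.30, -0.10, -0.25, -0.18, -0.27]

r = pearson(english, arabic)
# English is more negative on every day, yet the two series co-move,
# which is the pattern a positive correlation coefficient captures.
assert all(e < a for e, a in zip(english, arabic))
print(round(r, 2))
```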
Yana Mihova, SPI (completed IW as a junior)
Title of Project: Bill Gates, Drinking Bleach and 5G Radiation: The Role of Right-Wing Media in Spreading Coronavirus Misinformation
From the initial reporting of the COVID-19 virus in early 2020, misinformation fueled the pandemic by spreading doubts about its authenticity. Because the pandemic was so new, there was a gap in research on how the type of media an individual consumes affects their belief in misinformation about it. Since I was interested in how media consumption can shape societal perspectives on a particular topic, I investigated the relationship between the spread of COVID-19 misinformation on media platforms and its consequences in society. I examined this relationship by observing the type of news source an individual consumed and their likelihood of endorsing COVID-19 misinformation, as measured by belief in COVID-19 conspiracy theories and distrust of public health officials. My analysis found a statistically significant positive relationship between consuming only right-leaning media and the tendency to endorse COVID-19 misinformation. Taken in context with previous research indicating that right-leaning media reported significantly more COVID-19 misinformation than moderate and left-leaning media, my findings indicate a correlation between the reporting of false information and the likelihood of endorsing COVID-19 misinformation. This study brings to light the dangers of factless reporting and its detrimental effects on societal outcomes.
Betsy Pu, COS (completed IW as a junior)
Title of Project: Analyzing Traffic Analysis Attacks on Video Streaming
Nowadays, the vast majority of web traffic, including video streaming traffic, is encrypted using HTTPS to ensure the privacy and security of transmitted data. However, some information about the traffic is leaked through the unencrypted and publicly visible metadata of packets transferred across the internet. “Video fingerprinting” attacks apply traffic analysis to video streaming traffic to identify the exact video being streamed from its traffic metadata alone, and represent a little-studied risk to the privacy of anyone who streams videos online. This project contains the design of “bounding experiments” that shed light upon the video fingerprinting landscape in several ways. These experiments make it possible to evaluate various video fingerprinting methods, normalize and compare their performance, and provide theoretical bounds for the hardness of the fingerprinting problem. We identify the most performant video fingerprinting techniques to be approaches that make inferences directly on packet metadata rather than training machine learning models, infer the true size of video chunks being delivered across the network, and use local sequence matching techniques to handle resolution changes during a streaming session. Furthermore, we identify promising distance functions for selecting the most likely video being streamed. Finally, bounding the difficulty of the fingerprinting problem enables future researchers to reason about defensive guarantees against video fingerprinting attacks.
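The chunk-based matching idea can be sketched simply. The snippet below is a toy illustration, not the project’s actual method: it identifies a video by nearest match between an observed chunk-size sequence and a database of known fingerprints, and all titles, sizes, and the distance function are invented.

```python
# Sketch: video fingerprinting by nearest-match on chunk-size sequences.
# Real attacks infer chunk sizes from packet metadata and use richer
# local-alignment matching; everything here is illustrative.

def distance(observed, reference):
    """Mean relative per-chunk size difference over the aligned prefix."""
    n = min(len(observed), len(reference))
    if n == 0:
        return float("inf")
    return sum(abs(o - r) / max(o, r) for o, r in zip(observed, reference)) / n

def identify(observed, database):
    """Return the title of the known video whose fingerprint is closest."""
    return min(database, key=lambda title: distance(observed, database[title]))

# Hypothetical per-chunk byte counts for three known videos.
database = {
    "video_a": [120_000, 95_000, 130_000, 110_000],
    "video_b": [300_000, 280_000, 310_000, 290_000],
    "video_c": [60_000, 55_000, 62_000, 58_000],
}

observed = [118_500, 96_200, 129_000, 111_300]  # noisy capture of video_a
print(identify(observed, database))  # → video_a
```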
Henry Vecchione, COS (completed IW as a junior)
Title of Project: Pan-app-ticon: What to Do About Ring’s Partnerships with Police Departments
I take issue with how Ring, the Internet-connected security camera company owned by Amazon, has pursued mutually beneficial partnerships with local police departments. The partnerships incentivize police to distribute Ring devices in their communities and grant the police access to a “Law Enforcement Portal” that enables them to select an area on a map, specify up to a 12-hour window of time, and send requests for footage from those hours to Ring owners in that area. I argue that cost and inefficiency are a significant barrier to surveillance creep and that this interface reduces that cost too much. I support this argument with two Supreme Court cases, U.S. v. Knotts (1983) and U.S. v. Jones (2012), whose comparison illustrates how technological advancements can fundamentally change one’s expectation of privacy and the invasiveness of criminal investigation. I then examine the ACLU’s Community Control Over Police Surveillance (CCOPS) model bill and real legislation based on it, which require public approval for new surveillance technologies. I find that much of this legislation does not adequately protect against connected surveillance devices like Ring because they are not a “new technology,” but rather an old technology made harmful by how it is used and the efficiency it creates. This allows Ring to bypass approval. I then propose changes that Ring can make to its products and changes that legislatures can make to their bills to minimize harm. I suggest that Ring could use image recognition to blur faces in video sent to police, only removing the blur on order from a judge, or it could change the law enforcement interface to prohibit bulk requests or require more information. Legislatures should also alter how “new technology” is defined, requiring reapproval if a technology increases surveillance efficiency by a meaningful degree even if it resembles an approved technology on the surface.
Melody Zheng, COS (completed IW as a junior)
Title of Project: Analyzing the Digital Divide: A Quantitative and Qualitative Study of Six United States Cities
As society grows increasingly dependent on information and communication technologies (ICTs), it becomes crucial to address the digital divide still present in many communities. In my work, I focus on identifying a digital access policy that the city of Oakland, California should adopt. To do so, I compared the initiatives of three cities in the same population range that have been “successful,” in that the rate of Internet and computer access for historically underserved populations increased from 2015 to 2019, with the initiatives of three cities that have not been successful. Using data from the U.S. Census Bureau’s American Community Survey, I tracked the rates of Internet and computer access for five different demographics over the five-year period and chose the three cities with the best average rate and the three with the lowest average rate. I then analyzed qualitative data to identify whether the selected cities focused on Internet access, computer access, and/or digital literacy training in their digital access initiatives, although two of the three unsuccessful cities had little to no such information publicly available. When comparing the three successful cities to the remaining unsuccessful city, Oakland, I found that while the four cities generally addressed all three aspects, the successful cities had a greater focus on community resources. Therefore, I argue that Oakland should invest more resources in digital literacy programs and publicly available ICTs, especially since it was not able to offer entirely free home Internet plans and digital devices. Community resources and technology classes would be more accessible to a greater number of households and would hopefully lead to improved financial situations as well.
Yaw Asante, COS
Title of Project: Evaluating and Contextualizing Network-Based Analysis of Drug Response in Cancer Dependency Genes
Computationally assessing which genes cancer needs to propagate itself is a much-researched topic at the juncture of computational and medical science. To contribute to this area, I sought to build a software tool capable of assessing cancer dependency by extending the foundation of a tool called NetMix, which solves a related problem. Additionally, I sought to examine the broader context in which tools like these may be applied in clinical medicine. For my first contribution, I designed a NetMix-based software process called CADEGA and compared its performance to that of a peer algorithm called NETPHIX. This work demonstrated CADEGA’s limited performance overall, though with a potential for finding functional correlations that differed from those of NETPHIX. For my second contribution, I conducted an overview of the real-world context in which methods like CADEGA and NETPHIX would apply to the field of data-enabled healthcare. This analysis demonstrated the expansive efforts being made or planned in technical infrastructure, as well as the blind spots present in existing laws surrounding genetic data and in the equitable development of these resources for rural facilities.
Bevin Benson, COS
Title of Project: Restricted Content: A Technical Guide to Internet Censorship in the Age of Social Media
The growth of social media platforms poses new challenges to governments seeking to control information online. Historically, governments have relied on a toolkit of technical methods to censor content on the web, such as IP blocking, DNS tampering, and deep-packet filtering. These methods are ineffective at blocking specific content on social media platforms. As a result, many governments have turned to sending “content removal requests” to these platforms as a means of restricting material they consider objectionable. My independent work outlines the technical methods of Internet censorship, focusing on how governments can block content using IP/port blocking, DNS tampering, and deep-packet filtering, and examines the relationship between governments and three major Internet platforms – Facebook, Twitter, and Google – vis-à-vis content removal requests. It conducts an exploratory analysis, in Python, of the datasets in the transparency reports released by the three platforms to uncover what data the platforms release to the public. It finds that the platforms, particularly Facebook, lack transparency about the requests they receive and their guidelines for content removal. Twitter releases the greatest amount of data on content removal requests, including links to the content in question, yet this data is difficult to access and poorly organized. Additionally, it examines trends in the number of content removal requests from a subset of 13 countries, chosen for geographic diversity, size, Internet freedom, and the number of content removal requests submitted. Specifically, it finds no significant correlation between Internet freedom and the number of content removal requests, but that Turkey and Russia send the greatest number of content removal requests to Internet platforms.
Justin Chang, COS
Title of Project: The Role of International Consensus in Cyber Attribution
With so many people relying on the critical infrastructures and data housed in cyberspace, cyber attacks have the potential to harm extremely large numbers of civilians. Yet international regulations on these attacks remain largely nonexistent, as there exists no binding agreement on what states can or cannot do in cyberspace. My independent work explores the role that international consensus can have in cyber attribution, a necessity for maintaining a secure cyberspace. By looking at examples of past attacks, I present the inherent limitations of technical attribution tools and techniques, arguing that international collaboration can reduce the time and improve the efficiency of attribution. In response to the difficulties in achieving such a consensus, I argue for the creation of an international body tasked with attributing cyber attacks, as such a body can still improve the process of cyber attribution even without the support of all major cyber actors.
Edward Elson, CLA
Title of Project: The Idea of Progress in Antiquity
My independent work investigates whether or not (and how) an idea of technological progress might have been understood by Mediterranean societies in early and late antiquity. Some scholars have posited that an idea of progress simply did not exist in the ancient world, and that the institutional capabilities of Ancient Greek and early Roman societies were perceived by their people to be static, neither developing nor accelerating over time. My independent work refutes that argument, drawing from the “Golden Age” theory of Hesiod, the lesser-known personal accounts of Xenophanes (whose allusions to a collective cultural and intellectual evolution quite clearly demonstrate that an idea of progress was, at the very least, in his own mind), the philosophical works of Plato, tragic excerpts from Sophocles, and finally, a poem of human history provided by the Roman Lucretius. My analysis consists of a series of close readings of these texts, supplemented by the existent but scant philological scholarship on the subject, and ultimately makes clear that an idea of progress certainly did exist in antique thought, but not in the way it might exist today. Technological and institutional achievement was thought, I argue, not to better nor worsen the overall conditions of the ancient human experience, but to complicate it exponentially into the future. With added achievement came, according to the ancients, an added depth of problems, ambitions, interests, and values, many of which were thought to conflict with each other. I draw these ideas from the fragments of Xenophanes and demonstrate how they echoed through Plato’s Laws, Sophocles’ “Ode to Man,” and Lucretius’ On Nature.
Isabella Faccone, ORFE
Title of Project: Tools to Understand 2016 Voter Influence Tactics in Comparison with the 2020 Election: Applications of Network Topology, Information Cascades and Rumor Recurrence
Social media platforms have a fundamental impact on how society receives and exchanges information around political events, especially elections. These networks amplified misinformation during both the 2016 and 2020 elections despite new control mechanisms. This research identifies key frameworks for understanding the network climate that enabled such amplification and misinformation, relying on veracity, amplification, and recurrence to draw distinctions between the 2016 and 2020 elections. To evaluate these criteria, I constructed a 2016 Twitter dataset based on previous research and used a 2020 election Twitter dataset that was updated weekly with keywords, trends, politicians, and new trackers for the duration of the 2020 election period. I used these datasets to evaluate cascades, the main trends on the network, and the sheer mass of activity around politicians and rumors. Critically, this research demonstrates that both the role of individuals in information cascades and the features of the rumors that propagate pervasively have a large impact on the likelihood that a rumor will recur in a given network. It shows that false rumors propagate faster and recur more often than true rumors in both the 2016 and 2020 elections, but draws a distinction between unilateral and interactive information dissemination models to demonstrate the differing effect that amplifiers have on propagation and recurrence. For rumors that disseminate via a unidirectional traditional news outlet shared via links on Twitter, the effect of a high number of verified users is limited. However, for rumors that propagate as retweets and quoted replies, which follow a multi-directional and interactive model, the effect of a high number of verified users participating in the cascade is very pronounced.
Thus the properties of the rumor, its veracity, and the specific subset of the population through which the rumor passes each have an effect on that rumor’s overall impact and exposure within a given network of users. These findings are critical to the future of social media platforms as they grapple with the persistence of misinformation amidst a highly volatile and nuanced digital political arena.
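One simple way to operationalize rumor recurrence, in the spirit of this analysis, is to count separate bursts of activity around a rumor. The sketch below is illustrative only; the timestamps and the 24-hour gap threshold are invented, not parameters from this research.

```python
# Sketch: counting "recurrences" of a rumor as bursts of activity
# separated by quiet gaps. Timestamps (in hours) are invented.

def count_bursts(timestamps, gap=24.0):
    """Number of activity bursts; a silence longer than `gap` hours starts a new burst."""
    if not timestamps:
        return 0
    ts = sorted(timestamps)
    bursts = 1
    for prev, cur in zip(ts, ts[1:]):
        if cur - prev > gap:
            bursts += 1
    return bursts

rumor_hours = [0, 1, 2, 3, 50, 51, 52, 200, 201]  # three separated waves
print(count_bursts(rumor_hours))  # → 3
```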
Kevin Feng, COS
Title of Project: Lowering the Barrier for Web Advertisement Research at Scale
Web advertisements are essential to the day-to-day operation of the internet, providing a key channel of revenue to websites that offer content at little to no cost. However, they are also common sources of deception, scams, and privacy violations. Given their significance, ads are of interest to many different groups of experts, including web researchers, communications scholars, and regulators, but their fleeting nature makes them difficult to study systematically and at scale. This independent work presents AdOculos, a technical system comprising a search interface powered by automated visual analysis tools and a continuously updated, large-scale archive of ads crawled from thousands of popular websites. By using the system to uncover novel research questions, dimensions of analysis, and policy recommendations, I demonstrate how AdOculos and its underlying tools enable expanded possibilities in ad research.
Grace Hong, ECO
Title of Project: The Effect of Google Fiber’s Entry on Student Educational Outcomes in Kansas City
In 2010, the private tech company Google disrupted the broadband market by partnering with individual cities to offer high-speed fiber Internet through Google Fiber. In my research, I study the impact of Google Fiber’s installation in Kansas City in 2011 on student educational outcomes in Missouri’s public schools through two studies: an intra-city study of Kansas City and an inter-city study between Kansas City and St. Louis in pre- and post-fiber periods. The intra-city study used a fixed effects regression model and highlighted mixed effects of Fiber on education. However, the inter-city study, using a difference-in-differences regression, showed that post-Fiber Kansas City had lower percentages of students scoring in the worst category (Below Basic) and higher percentages of students scoring in higher categories (Proficient). As a result, this study illustrates that Fiber’s entry may be correlated with higher test performance, especially for students who started in the lowest categories, and it provides a stronger case for continuing to close the digital divide across the United States.
Gabrielle Jabre, POL
Title of Project: Social Media as a Narrative Battlefield: An Investigation into the 2019 Lebanese Protests
In non-democracies, civil society and the regime battle to dominate the narrative on social media. During the 2019 Lebanese Protests, social media became a place for narrative warfare between civil society and the regime. My research asks: did social media play a sectarian-reducing or a sectarian-enhancing role, and what were its effects on mass mobilization? My research design is twofold: qualitative interviews and a quantitative analysis of a portion of Twitter data. I conducted 13 Zoom interviews with social media activists, physical activists, journalists, politicians, and independent media center directors. The interviewees were asked questions on both the sectarian-reducing and sectarian-enhancing roles of social media. During these interviews, seven broad categories were discussed: social media activism, the importance of social media for the protestors, social media as a narrative battlefield, online information corruption and its impact, whether the regime fought back online, social media’s overall effect on the protests, and freedom of speech. Overall, participants argued that social media played a sectarian-reducing role, disseminated a civic narrative among a global Lebanese network, and facilitated collective and connective action. Furthermore, all interviewees argued that online information corruption was propagated along sectarian narratives that discredited the protests, but that it was not a compelling enough reason to deter mass mobilization. Instead, most interviewees argued that the reduction in collective action was due to violence, the COVID-19 national lockdown, and economic barriers. To further investigate the relationship between sectarianism and online information corruption, I substantiated the interview results with a quantitative analysis. My logistic regression model indicated a statistically significant positive relationship between sectarianism and false narratives online.
Therefore, the data analysis corroborated the interviewees’ insights that false online narratives were sectarian. My results highlight that a civic narrative dominated social media and played a constructive role in greater collective and connective Lebanese action against the regime. On the other hand, the results also show that online information corruption was used as a sectarian narrative tool to discredit the protests, though this was not enough to deter collective action. Thus, social media’s democratic nature benefitted a civic narrative, but also served regime manipulation.
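The shape of the regression relating sectarian language to false narratives can be sketched minimally. The snippet below fits a logistic model by gradient descent on invented labels; the study’s actual model was estimated on real Twitter data with proper significance testing, which this toy omits.

```python
# Sketch: logistic regression of a false-narrative label (y) on a
# sectarian-language indicator (x). Labels are invented placeholders.

import math

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit y ~ sigmoid(b0 + b1*x) by gradient descent; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += (p - y) / n
            g1 += (p - y) * x / n
        b0 -= lr * g0
        b1 -= lr * g1
    return b0, b1

# 1 = sectarian language present / false narrative present (hypothetical).
sectarian = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
false_nar = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

b0, b1 = fit_logistic(sectarian, false_nar)
print(b1 > 0)  # positive coefficient: sectarian tweets more likely false
```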
Watson Jia, COS
Title of Project: Consistency and Distributed Gateways in IoT Environments
Distributed systems have become ubiquitous in our modern computing world, with applications ranging from telecommunications to computer networks. The Internet of Things (IoT) has integrated technology with many physical objects in our everyday lives, with applications ranging from smart home technologies to medical applications. My independent work attempts to combine two increasingly important fields in modern computing – distributed systems and the IoT – and investigates applications of distributed systems in IoT environments by leveraging multiple IoT gateways as a distributed system. This project explores fault tolerance and data consistency, which have large implications for reliability and scalability in applications that rely on both distributed systems and the IoT. This could especially impact industrial systems and infrastructural applications. I aimed to modify multiple Mozilla WebThings smart home gateways to act as a distributed system, implement a fault tolerance scheme, and identify consistency issues in smart home IoT devices within this system. Quality of service metrics of the distributed system in the form of latency measures show consistent, reasonable delays between gateways, with no large deviations from the mean. A fault tolerance scheme, in which one gateway takes over the IoT devices of another gateway that had gone offline, was able to add all devices from the offline gateway to the new gateway, and the new gateway was able to control half of the devices added. Consistency issues caused by network connectivity problems and event reorderings were identified and possible solutions were found.
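The takeover scheme can be sketched as a heartbeat check between gateways. This is an illustrative reconstruction, not the WebThings implementation; the gateway names, device names, and timeout value are all invented.

```python
# Sketch: heartbeat-based failover, where a surviving gateway adopts
# the devices of a peer that has missed its heartbeat window.

import time

HEARTBEAT_TIMEOUT = 3.0  # seconds of silence before takeover (invented)

class Gateway:
    def __init__(self, name, devices):
        self.name = name
        self.devices = set(devices)
        self.last_heartbeat = {}

    def record_heartbeat(self, peer):
        """Note that we just heard from this peer."""
        self.last_heartbeat[peer.name] = time.monotonic()

    def check_peer(self, peer, now=None):
        """If the peer has gone silent past the timeout, adopt its devices."""
        now = time.monotonic() if now is None else now
        last = self.last_heartbeat.get(peer.name, now)
        if now - last > HEARTBEAT_TIMEOUT:
            self.devices |= peer.devices
            peer.devices = set()

gw_a = Gateway("gw_a", {"lamp", "thermostat"})
gw_b = Gateway("gw_b", {"lock", "camera"})
gw_a.record_heartbeat(gw_b)

# Simulate gw_b going silent well past the timeout.
gw_a.check_peer(gw_b, now=time.monotonic() + 10)
print(sorted(gw_a.devices))  # → ['camera', 'lamp', 'lock', 'thermostat']
```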
Christy Lee, COS
Title of Project: When a Virus Goes Viral: A Study on the Efficacy of Using Twitter Analysis to Forecast COVID-19 Cases
In a short period of time, COVID-19 has completely transformed the landscape of global health, economics, and society. Given the enormity of this impact, it has become crucial to more effectively prepare for and act against COVID-19; improving our ability to forecast case counts is one method of doing so. My independent work discusses a forecasting model which aims to quantify an aspect of social response in order to build a more well-rounded predictor of case trends. Because COVID-19 is spread primarily through person-to-person contact, shifts in social response to the virus can affect social behavior, and thus subsequent case numbers. By analyzing Twitter data for sentiment and frequency, the model takes into account one measure of social attitudes and behaviors towards COVID-19. This data is considered in conjunction with reported COVID-19 case data and state demographic information, inputted into a feedforward neural network model for regression, and ultimately used to forecast positive cases 3, 7, and 14 days into the future.
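The model’s overall shape can be sketched as a small feedforward pass. Everything below (the features, weights, and layer sizes) is invented for illustration; the actual model was trained on real Twitter, case, and demographic data.

```python
# Sketch: a one-hidden-layer feedforward regression mapping a feature
# vector (tweet sentiment, tweet volume, recent cases, demographics)
# to a forecast value. Weights and inputs are invented placeholders.

def relu(v):
    return [max(0.0, x) for x in v]

def dense(inputs, weights, biases):
    """One fully connected layer: weights is [out][in]."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forecast(features, w1, b1, w2, b2):
    hidden = relu(dense(features, w1, b1))
    return dense(hidden, w2, b2)[0]  # single regression output

# features: [mean sentiment, tweet count (scaled), cases today (scaled),
#            population density (scaled)] -- all hypothetical
features = [-0.3, 0.8, 0.6, 0.5]
w1 = [[0.2, 0.5, 0.9, 0.1],
      [-0.4, 0.3, 0.7, 0.2]]
b1 = [0.0, 0.1]
w2 = [[1.2, 0.8]]
b2 = [0.05]

print(round(forecast(features, w1, b1, w2, b2), 3))  # → 1.95
```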
Austin Mejia, IND
Title of Project: Lucky Break: Regulating Loot Boxes in Video Games
Over the past four years, loot boxes have skyrocketed in popularity. These virtual crates of in-game items have become a mainstay of the video game industry, generating over $30 billion in 2019. However, with their meteoric rise come concerns over their impact on gamers, as a growing body of evidence suggests that loot boxes are addictive. Though other nations like China and Hungary have already introduced legislation to regulate loot boxes, the U.S. has yet to establish a policy response, with no viable regulations foreseeable within the next year. My research seeks to propose a compelling and comprehensive policy response to loot boxes. Whereas many proposals focus solely on combating potential addiction, this recommendation additionally examines the structure of loot boxes and how they are embedded with “dark patterns,” or designs intended to trick players into spending more money. Ultimately, I recommend new regulations that create a stricter digital marketplace, requiring developers to disclose the odds of their loot boxes and implementing strict currency expectations. This recommendation hopes to lay a flexible foundation upon which future regulation can build as our understanding of loot boxes continues to progress.
Sean-Wyn Ng, COS
Title of Project: Pose2Pose2: Pose Selection and Transfer for Full-Body Character Animation
To convert a video of a real-life human subject into an animation, artists often watch an original performance video of the subject many times in order to determine which body poses they tend to hold. Artists must also choose optimal points of transition between body poses within the animation. In this project, I explored the design of animation systems that are less manually intensive, which could potentially make animation more accessible to the general public by lowering its time-consuming barrier to entry. I created Pose2Pose2, a system inspired by Pose2Pose, a tool that automatically extracted and clustered two-dimensional upper-body pose data from a subject within a video, displaying the poses on a user interface in order of frequency of occurrence for more efficient visualization. However, Pose2Pose2 also has the ability to track full-body poses, as well as identify both two- and three-dimensional pose data within a video featuring a human subject. Pose2Pose2 also includes additional features within the user interface, such as grouping rotation-normalized three-dimensional poses together and marking poses that are visually similar to poses selected by the user. Users select poses from the interface and use them as reference to draw cartoonized versions of the subject holding the selected poses. Pose2Pose2 uses the drawings to convert a new video featuring the same subject into an animation.
Vedika Patwari, COS
Title of Project: Evaluating the Impact of Data Localization on Technological Innovation in India
Cross-border data flows are playing an increasingly important role in supporting a globally digitized economy and yet, countries are attempting to regulate the flow of data through localization mandates. Using India as a case study, I examine the impact of localization on company operations and technological innovation. India’s dynamic policy environment, and its unique combination of a large digital economy and an emerging data center industry, offer insight into how localization impacts growth and innovation in countries with relatively lower levels of digital infrastructure. Given that emerging economies are also turning towards restricting the free flow of data, this is an important context within which to study data localization. Existing studies analyze the impact of localization at the national level and there is a need to better understand how localization plays out at a company level. Thus, I conducted semi-structured interviews with executives at various financial technology companies in India to understand the impacts of localization. I find that there is a high level of compliance with the localization mandate across all company sizes. Additionally, companies with local operations are able to localize their data with greater ease when compared to companies with global operations. This varied impact of localization is not addressed in existing literature and may cause multinational companies to opt out of markets with localization restrictions. I also identify an over-reliance on data centers located in Mumbai; this geographic centralization of data is a key vulnerability in the Indian financial ecosystem. In order to mitigate some of the identified risks, I recommend public and private investments to increase the availability and geographic spread of India’s data center infrastructure. Further research with more companies and in different countries is necessary to build upon my findings and to better inform future policies on data localization.
Carlotta Platt, SPI
Title of Project: Containing the Contagion: Determinants of Government Response to the First Wave of the COVID-19 Pandemic in Europe
The COVID-19 Pandemic has spared no country in the world, causing almost three million deaths in its first year. Yet governments were unprepared for and responded very differently to iterations of the same virus. My research uses quantitative and qualitative analysis to investigate what factors determined national variation of first-wave policy response to the COVID-19 pandemic in European countries, and what led to response effectiveness. I hypothesize that four groups of factors (Governmental, Political, Societal and Economic) will be significant in explaining (1) national variations in response intensity, (2) national variations in response quality, and will interact to determine (3) response effectiveness. Overall, I find that these four groups of factors explain variation in response intensity and quality, and that they interact to determine an effective response. Among them, (1) decentralization with strong intergovernmental coordination and central guidance, with reliance on few but communicative scientific experts, (2) strong leadership with low pressure from the opposition, (3) high trust and low media misinformation, and (4) a strong economy that is able to quickly increase healthcare capacity, combine to determine an effective response: one which best couples intensity with quality to avoid high numbers of cases and deaths. On a societal level specifically, I use Instagram and Twitter analysis to study how high media coverage of the pandemic, when used to spread misinformation, related to a less effective response.
Harmit (Hari) Raval, COS
Title of Project: Security and reliability implications of imprecise programming language specifications: A case-study of GPU schedulers
Programming language specifications are the rules that govern how programs behave; developers use these specifications to reason about program properties, including safety and security concerns. When specifications are imprecise, programmers develop applications that can behave in surprising ways, and malicious actors can exploit these surprising behaviors, causing significant societal impact. One of the most widespread parallel computing devices is the graphics processing unit (GPU). While these devices were classically used for graphics computations, they can now handle more general-purpose compute applications. Rapid evolution, coupled with the increasing diversity of these devices, leads to underspecified programming languages. Given how widespread GPUs are, this situation is ripe for security vulnerabilities. My research focuses on an underspecified area of the GPU specification: the scheduler. By creating a thorough GPU testing framework and automatically constructing hundreds of multi-threaded test cases, we discovered instances where the scheduler can lock up. When the GPU locks up, we found that many different behaviors can be exhibited, ranging from a simple graphics reboot to the machine freezing completely. The latter provides a direct pathway to a security vulnerability. By embedding our litmus tests in a mobile device application, we demonstrate that such an application can leverage its low-level system access to cause visual information leakage. Overall, our work identifies serious security concerns in modern GPU devices. These concerns have severe societal implications given the prevalence of GPUs in modern systems: most people interact with their most private information, e.g., their daily usage on a smartphone, using devices that contain these powerful yet underspecified processors.
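The project describes automatically constructed multi-threaded litmus tests that probe scheduler behavior. As an illustrative sketch only (CPU threads in Python, not an actual GPU, and not the authors’ testing framework), the harness below runs a classic message-passing litmus test many times and tallies which outcomes the scheduler and memory system actually produce; real GPU litmus testing follows the same run-many-times-and-tally shape:

```python
import threading

def message_passing_litmus(iterations=200):
    """Run a two-thread message-passing litmus test repeatedly and tally
    the (flag, data) outcomes observed by the reading thread."""
    outcomes = {}
    for _ in range(iterations):
        shared = {"data": 0, "flag": 0}
        seen = {}

        def producer():
            shared["data"] = 1   # write the payload first
            shared["flag"] = 1   # then publish it

        def consumer():
            flag = shared["flag"]   # read the publish flag
            data = shared["data"]   # then read the payload
            seen["obs"] = (flag, data)

        t0 = threading.Thread(target=producer)
        t1 = threading.Thread(target=consumer)
        t0.start(); t1.start()
        t0.join(); t1.join()
        outcomes[seen["obs"]] = outcomes.get(seen["obs"], 0) + 1
    return outcomes
```

An outcome such as flag observed as 1 but data as 0 would signal a reordering that the specification may or may not permit; tallying which outcomes occur in practice is how underspecified scheduler behavior is surfaced.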
Lauren Tang, COS
Title of Project: Towards the Democratization of Finance in the Context of Stock Trading
There is a growing market of millennials becoming more interested in personal finance and stock trading, especially with the onset of the COVID-19 pandemic. Many stock market brokers are making changes to their platforms to capture the millennial market. Robinhood, a high-tech trading platform, strives to bring novice investors onto its platform, stating that its mission is to “democratize finance for all.” This raises two questions: what does democratizing finance truly mean, and has this goal been reached? I argue that the democratization of finance requires two parts: access and education. People need to be able to access financial systems and have the financial literacy to understand how to skillfully navigate them. I analyze differences between Robinhood and older incumbent brokers such as Charles Schwab to determine whether access to markets has increased. Additionally, I explore what financial literacy tools exist for investors and what more brokers can do. Robinhood has made great strides toward increasing access through its introduction of commission-free trading to the industry, which has led other brokers to follow suit. Robinhood also utilizes gamification tactics and an emphasis on UI/UX; examples can be seen in its sign-up and trading process when compared to older incumbents. Issues arise when inexperienced investors do not understand certain dangers associated with trading, such as tax liabilities and the downsides of commission-free trading (specifically on Robinhood). Commission-free trading poses financial harm to investors because Robinhood practices payment for order flow (PFOF), which can result in users on its platform receiving worse prices on trade execution. While other brokers also engage in this practice, they pass along PFOF benefits to their users; it is unclear whether Robinhood does this.
Across brokers in general, tax liability is another issue for novice investors, who may not understand how stock trading is taxed on a per-transaction basis and may be unaware of tax-offsetting practices. A clear example of these dangers arose in the GameStop short squeeze in early 2021. Brokers provide access to markets but fail to provide tools for financial literacy. New investors seeking financial advice often independently look to other investors through Reddit forums, enabling a community of financial learning, but this channel of information is not always reliable. To continue making progress toward the democratization of finance for all, Robinhood and other brokers need to play an active role in educating their users by providing them with tools for learning and smarter investing.
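The PFOF concern above comes down to simple arithmetic: a zero commission can be offset by a worse execution price. A minimal sketch (with hypothetical prices, not Robinhood data) of that hidden per-trade cost:

```python
def execution_shortfall(shares, executed_price, midpoint_price):
    """Hidden cost to a buyer whose order is filled above the quoted
    market midpoint: even a commission-free trade costs the investor
    money when the execution price is worse than the midpoint."""
    return shares * (executed_price - midpoint_price)
```

For example, buying 100 shares at $10.02 when the quoted midpoint is $10.00 costs the investor about $2.00 despite the trade being “free.”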
Ethan Thai, ELE
Title of Project: Dr. AI: Adapting CNN Classification for the Technical and Social Challenges of Medical Diagnosis
Diagnosing medical images is a time-, cost-, and labor-intensive task traditionally undertaken only by an expert few. Fortunately, through the development of artificial intelligence (AI) and increased accessibility of medical datasets, convolutional neural networks (CNNs) have become increasingly suitable for learning to conduct computer-assisted diagnosis (CAD). However, learning for medical classification comes with unique technical challenges, such as low data volume, class imbalance, inconsistent labeling, and fine image details differentiating multiple diagnoses, as well as the social challenges of respecting patient data usage and combating algorithmic bias. In this independent work, I designed, with others on this research project, a training methodology specifically tailored to the medical domain by integrating transfer learning, dataset cleaning, and synthetic data augmentation techniques. Through evaluation of color channel variations in images used to pre-train a model, implementation of an iterative dataset cleaning scheme, and use of DeepInversion to synthesize patient-decoupled training data, small but compounding improvements to classification performance are shown. Finally, drawing on the experience of developing a CAD methodology and contextualizing medical AI research within prevalent social and legal discussions, a set of privacy- and bias-conscious design principles is introduced.
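One of the technical challenges named above is class imbalance. A common generic remedy (a sketch of standard practice, not necessarily the specific method used in this project) is to weight the training loss inversely to class frequency, so rare diagnoses are not drowned out by common ones:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class loss weights inversely proportional to class frequency.
    A perfectly balanced dataset yields a weight of 1.0 for every class;
    rarer classes receive proportionally larger weights."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    # weight_c = total / (n_classes * count_c)
    return {c: total / (n_classes * n) for c, n in counts.items()}
```

For a dataset with 90 benign and 10 malignant examples, the malignant class gets weight 5.0 and the benign class roughly 0.56, so misclassifying a rare case costs the model far more during training.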
Ryan Yao, COS
Title of Project: Safeguarding Consumer Privacy: Analysis of Data Obfuscation Mechanisms to Prevent Ubiquitous Network Tracking
The rapid rise of modern consumer Internet platforms has largely been enabled by the development of lucrative targeted network advertising models. However, these platforms have engaged in unprecedented user data collection and extensive network tracking, which collectively threaten individual consumer privacy. In the absence of sufficient general data privacy regulation, a new class of user-oriented data obfuscation privacy tools has quickly grown. Using two popular data obfuscation tools — TrackMeNot, a search obfuscation tool, and AdNauseam, an ad clickstream obfuscation tool — as a lens, my independent work examines the ways in which data obfuscation — the production and inclusion of fake data to mask real data — can be applied to anonymize user data, deny data collection, and fundamentally disrupt excessive and unsanctioned network tracking. Grounded in a review of recent literature, my research explores data obfuscation as a potential alternative to the otherwise narrow and exclusive focus on privacy regulation that has dominated previous work. Analysis of the ability of data obfuscation to prevent ubiquitous network tracking positions it as a means of incentivizing the adoption of more responsible data collection practices and advertising models that respect existing and future privacy standards. Ultimately, I recommend new policy initiatives, including the implementation of regulatory protections for consumer data obfuscation tools, the prevention of exclusive platform self-regulation, and the creation of regulation that works in conjunction with data obfuscation. These recommendations aim to serve as a principled foundation for the use of data obfuscation in safeguarding the future of consumer privacy.
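The core obfuscation idea (producing fake data to mask real data) can be sketched in a few lines. This is an illustrative toy in the spirit of search-obfuscation tools like TrackMeNot, not its actual algorithm: each real query is hidden inside a shuffled batch of randomly drawn decoys, so an observer of the combined stream cannot easily separate real intent from noise:

```python
import random

def obfuscate_queries(real_queries, decoy_pool, decoys_per_real=3, seed=None):
    """Interleave each real search query with randomly drawn decoy
    queries, shuffling each batch so the real query's position gives
    nothing away. Returns the combined query stream an observer sees."""
    rng = random.Random(seed)
    stream = []
    for query in real_queries:
        batch = [query] + rng.sample(decoy_pool, decoys_per_real)
        rng.shuffle(batch)  # hide the real query's position in the batch
        stream.extend(batch)
    return stream
```

The observer still sees every real query, but each one is now statistically indistinguishable from three plausible decoys, degrading the tracker’s profile of the user.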
Anika Yardi, ORFE
Title of Project: Using Monte Carlo Markov Chain Methods to Understand the Mathematics and Visualization of Gerrymandering in Politically Competitive Districts
Gerrymandering is the manipulation of district lines to give one political party an unfair advantage by diluting the voting power of its opponents. Known for their bizarre shapes, gerrymandered districts are thought to be easily recognizable. However, this is not always the case, and it can be incredibly difficult to tell whether victories in particular areas result from legislative wrongdoing or are a natural political outcome. This is where mathematics can help. In my research, I developed a framework for analyzing the redistricting plans of Maryland, Pennsylvania, and North Carolina. First, using the powerful technique of Markov chain Monte Carlo (MCMC) methods, I concluded that while all three of my selected states have elements of gerrymandering in their redistricting plans, North Carolina and Pennsylvania are extreme examples of the technique. Furthermore, I compared court-mandated redistricting plans for these two states and determined that the implemented remedial plans were not the fairest and least extreme of the proposed options. Finally, I approached gerrymandering from a policy perspective and isolated effective anti-gerrymandering elements from bills proposed in North Carolina and Pennsylvania, including independent commissions, required mechanisms for public hearings, and fairness criteria such as compactness.
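The MCMC approach works by generating a large ensemble of alternative plans and asking whether the enacted plan is an outlier. The toy sketch below (my own illustration, not the project’s code) shows the basic “flip” walk: repeatedly reassign one unit to another district and record each sampled plan’s seat count. Real redistricting chains also enforce contiguity, population balance, and compactness constraints, which are omitted here for brevity:

```python
import random

def seats_won(units, assignment):
    """Count districts where party A outpolls party B.
    units maps unit -> (a_votes, b_votes); assignment maps unit -> district."""
    tally = {}
    for unit, (a_votes, b_votes) in units.items():
        d = assignment[unit]
        da, db = tally.get(d, (0, 0))
        tally[d] = (da + a_votes, db + b_votes)
    return sum(1 for a, b in tally.values() if a > b)

def flip_walk(units, assignment, n_districts, steps=1000, seed=0):
    """Toy 'flip' MCMC over redistricting plans: at each step, move one
    unit to a random district, rejecting moves that would empty a
    district, and record the resulting seat count for the ensemble."""
    rng = random.Random(seed)
    plan = dict(assignment)
    ensemble = []
    for _ in range(steps):
        unit = rng.choice(list(plan))
        new_d = rng.randrange(n_districts)
        old_d = plan[unit]
        if new_d != old_d:
            # keep every district non-empty
            if sum(1 for d in plan.values() if d == old_d) > 1:
                plan[unit] = new_d
        ensemble.append(seats_won(units, plan))
    return ensemble
```

If the enacted plan’s seat count sits far in the tail of the ensemble’s distribution, that is quantitative evidence of gerrymandering that bizarre shapes alone cannot provide.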
Noa Zarur, COS
Title of Project: Profit Maximization and Food Waste Reduction
Almost half of the food in the United States goes to waste. The current solutions bakeries and restaurants have for avoiding food waste are not sufficient. The goal of my project is to algorithmically reduce food waste and maximize profit for bakeries, focusing on bread as my primary commodity of study. A challenge in creating such an algorithm is that profit depends on the number of customers who show up at a bakery and buy bread at each point in the day. My approach recursively calls a function that returns the optimal number of loaves to bake in each time slot. By recursively calling this core function, I run through every possible profitable outcome given all combinations of regular customers, leftover customers, and regular loaves. To test my algorithm, I used average statistics for the number of time slots per day in which to bake bread, the number of customers per day, and the cost of making each loaf. The algorithm also allows users to customize it by inputting the maximum number of customers that typically come to their bakery, their price for regular bread, their price for leftover bread, and how many leftover loaves they start the day with. The results give the optimal number of regular loaves to bake, the expected profit from regular and leftover bread, and the total profit at each time point. Next steps include adding features to make this algorithm easily usable by bakeries and ready for deployment. Further applications might include expanding its usage to restaurants and pharmacies.
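The recursive structure described above can be sketched as a memoized recursion over time slots. This is a simplified illustration under stated assumptions (uniform customer demand per slot, unsold loaves discarded rather than resold as leftovers), not the project’s actual implementation:

```python
from functools import lru_cache

def plan_baking(slots, max_bake, max_customers, price, cost):
    """For each time slot, choose the number of loaves to bake that
    maximizes expected profit, assuming demand per slot is uniform on
    0..max_customers. Returns (total expected profit, per-slot plan)."""
    demand_probs = [1 / (max_customers + 1)] * (max_customers + 1)

    def expected_sales(bake):
        # expected loaves sold if we bake `bake` loaves this slot
        return sum(p * min(bake, d) for d, p in enumerate(demand_probs))

    @lru_cache(maxsize=None)
    def best(slot):
        if slot == slots:
            return 0.0, []
        choices = []
        for bake in range(max_bake + 1):
            profit = price * expected_sales(bake) - cost * bake
            future_profit, future_plan = best(slot + 1)
            choices.append((profit + future_profit, [bake] + future_plan))
        return max(choices, key=lambda c: c[0])

    return best(0)
```

With a $2 price, $1 cost per loaf, and up to 4 customers per slot, the planner bakes 2 loaves per slot: baking more risks waste, baking less forgoes sales, which is exactly the waste-versus-profit trade-off the project targets.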