T&S Certificate – IT Track

Certificate Student Independent Work

2025 Certificate Graduates Independent Work

Ahmed, Abani (COS), completed IW as a junior
Title of Project: Analyzing YouTube Comment Sentiment, Post Dislike Counter Removal

In November 2021, YouTube removed the dislike counter so that the number of dislikes a video received was not publicly visible. My research examines the impact of YouTube’s decision to privatize the dislike counter on user interactions within the comment sections of YouTube videos across various genres. The analysis utilizes a combination of sentiment analysis and statistical methods to compare comment sentiments before and after the removal. Sentiment analysis was done with Python’s NLTK and TextBlob packages. By analyzing comments from selected YouTube channels within ten video genres, the research seeks to understand how the sentiments of comments have changed and whether they have grown more positive, neutral, or negative. The findings suggest a general increase in positivity across most genres, with notable exceptions in gaming and shopping categories. This research contributes to understanding the broader implications of platform design changes on user behavior and interaction, highlighting how the removal of a simple feature like the dislike counter can shift the dynamics of viewer engagement and sentiment on a major video-sharing platform.

Aliu, Aminah (COS), completed IW as a junior
Title of Project: Vendi Scores For Dynamic Diverse Data Curation Short

Generative AI models perform poorly on prompts related to Black hair, largely due to biased training data. I present a novel approach to diverse data curation, aiming to address the computational challenges and biases inherent in the process. The proposed algorithm, Vendi data curation, dynamically curates datasets by adding one new data point at each time step and removing one data point that contributes the least to the diversity of the overall dataset. The technologies used include machine learning techniques, such as gradient descent, and Python (libraries include PyTorch, numpy, SciPy, sklearn). Through a series of experiments and analysis, the effectiveness of the algorithm is demonstrated, showcasing its ability to improve dataset diversity incrementally. Furthermore, I discuss the implications of identity and bias in dataset curation, and present the Black Hairstyle Dataset, a collection of images representing three Black hairstyles. This paper highlights the fact

Block, Fletcher (SOC), completed IW as a junior
Title of Project: Is ChatGPT Funny?

How do audience perceptions of humor differ between AI-generated stand-up comedy and comedy made by human comedians? To examine this, I deployed an experimental study comparing combinations of human-written, human-delivered, ChatGPT-written, and ChatGPT-delivered stand-up comedy routines. A total of 149 randomly selected Princeton undergraduates participated in the study, which found that human-written, human-delivered comedy is perceived to have better material and better delivery, and to be funnier overall, than any comedy routine with an AI element. My research revolves around the use of AI (ChatGPT) in a creative (specifically, comedy) application. Additionally, ChatGPT was employed to assist in the creation of visualizations with RStudio, and other AI software was used to edit the comedy clips participants listened to.

Kalap, Katharine (ELE), completed IW as a junior
Title of Project: Neural Nets to Improve Rheumatoid Arthritis Patient Outcomes

Rheumatoid arthritis is a chronic autoimmune disease that is difficult to diagnose in its early stages. Many regions in the UK struggle to meet target times for appointments with consultant rheumatologists and have low accuracy rates, leading to delays in treatment and poorer patient outcomes. This work builds on my junior independent research, where I created a machine learning model trained on serological data from 1000 patients, achieving an accuracy of 87.6% for binary classification. Using clustering training losses for label error detection helped flag and remove incorrectly labeled data, enhancing accuracy. 

As part of my research, I analyzed the real-world impact of this model, showing its effect on waiting times when used as a scheduling tool. I demonstrated a 41.6% improvement in time to appointments for high-priority patients. The research highlights quantitative effects of the model in enhancing patient outcomes, while simultaneously showing broader implications of AI in healthcare beyond diagnosis. I further discuss nuanced effects of this model’s use as a scheduling tool, highlighting ethical and social implications which need to be addressed by policy makers before real-world use.

Kalogerakos, Anya (ECE), completed IW as a junior
Title of Project: Evaluating the Influence of Internet Communities on Internet Governance

While the Internet remains a critical tool for the daily life of many, with over 5 billion worldwide users, a relatively small proportion of users are aware that various global bodies develop Internet technology standards, policy regulations, and broad Internet recommendations. Several of the most prominent Internet communities, each defined as an organization with a common interest or responsibility related to the operation or use of the Internet, are nothing beyond confusing acronyms to the average Internet user. As such, most Internet users are unaware of who influences the structure and operation of the Internet and what motivates these influencers. The goal of my research is to comparatively study various Internet communities in order to propose a framework for evaluating the influence of an Internet community and to provide a ranking of current prominent Internet communities according to this research. In the process of creating this system, historical data, membership involvement, media engagement, public sentiment analysis, and academic survey responses were gathered and compared across the following Internet communities: ICANN/IANA, IETF, ISOC, ITU, W3C, and ETSI.

Musslewhite, Gia (SPI), completed IW as a junior
Title of Project: Murthy v. Missouri, First Amendment Liberties, and the Freedom to Mislead

Online platforms have emerged as a major global source of news, introducing the risk of misleading content in the wake of recent crises like the COVID-19 pandemic and the Russo-Ukrainian war. This has threatening implications for U.S. citizens’ access to verifiable information. Currently, Section 230 of the Communications Decency Act allows social media platforms like Meta, X, and Wikipedia to self-regulate posts online while generally protecting them from liability. But recently, President Biden’s administration came under fire for a history of direct communications with platforms, requesting that misleading posts be removed. In Murthy v. Missouri, a group of plaintiffs, including the states of Louisiana and Missouri, argued before the Supreme Court that federal officials’ actions violated the users’ First Amendment-guaranteed freedom of speech. This case introduced three policy options. The first two aligned with the two sides of the Murthy case: the Court could either protect or prohibit federal officials’ communications with online platforms. Importantly, collaborative federal communications may strengthen public trust, but may also turn coercive or limit the free spread of news paramount to democracy. The third, middle-ground option seeks to have platforms engage in greater self-regulation while limiting the need for federal oversight. Given these choices, I recommended that Wikimedia and other platforms support the defense’s side in Murthy and support good-faith communications to preserve access to reliable, free knowledge online. However, in the long term, platforms should strengthen their own regulatory models, directing more resources and funding to moderation teams to limit partisan influence and defend their Section 230-protected autonomy.

Neske, Valerie (SOC), completed IW as a junior
Title of Project: A Study of Femcels and Misandrists on Instagram

How does the shift of the femcel movement from niche forums to mainstream social media impact femceldom’s expression by femcels and consumption by ordinary users on Instagram? My research aims to provide an exploratory study of the contemporary femcel community on Instagram, not only to fill a gap in academic knowledge about the incel movement’s alleged female counterpart, but also to deepen an understanding of what happens to movements when they migrate from niche to mainstream. Using an inductive approach, I conducted a digital ethnography of Instagram’s femcel community to observe their methods of self-expression and audience engagement. My study found that the shift to a visual-based platform transformed the expression of femceldom from deeply personal anecdotes and opinions to briefer, performative confessions with a strong emphasis on visual aesthetics. It also broadened and diminished ‘femcel’ as an identity, with similar implications for other identity-based social movements. Overall, I believe that in contrast to the previous generation of forum femcels, who focused on community building around shared experiences, the contemporary generation of social media femcels focuses on personal identity building around a more marketable neoliberal post-feminist femceldom.

Rahman, Amber (COS), completed IW as a junior
Title of Project: Imperial Preoccupations: A Black Internationalist Analysis of the Uses of ShotSpotter from the U.S. to Palestine

ShotSpotter is a faulty AI-powered acoustic gunshot detection technology used by over 120 police forces across the United States. The technology relies on secretly placed microphones in Black and Brown neighborhoods to “detect” gunshot noises through a process that involves both artificial intelligence and human review to deploy police to the scene. It is a predictive technology because of its assumption of where “crime” is supposed to happen: communities of color. SoundThinking, the rebranded name of the company, markets itself to “law enforcement and security forces across the globe,” including in the United States, South Africa, Jamaica, the Bahamas, and, as of 2021, occupied Palestine. ShotSpotter’s scope extends beyond local police departments: it is also used, tested, and funded by the U.S. federal government, including the military. These various carceral contexts led me to interrogate the central marketing claim of ShotSpotter: that it exists to reduce gun violence. I use a Black Internationalist framework to assess how the gunshot detection technology ShotSpotter is actually a carceral, settler-colonial tool used by the U.S. police, military, and Israeli police to control, oppress, and surveil “occupied” communities of color both “internal” to the U.S. empire (meaning within the U.S.) and “external” to it. Carceral technologies, meaning digital tools that aid carceral systems in racialized oppression, are increasingly being deployed to expand and exacerbate state violence and surveillance, aided by private companies. Therefore, ShotSpotter is a lens through which we can understand how carceral tools of imperialism are connected and maintained from the United States to Palestine.
I offer the term imperial pre-occupations to illustrate how prediction (pre-) is a tool for the occupation of racialized communities, and how policing racialized populations is a preoccupation of the imperial state, domestically and internationally, to maintain and expand its power, using the ShotSpotter gunshot detection technology as a case study. My paper explores how these imperial preoccupations materialize in three ways: in the similar and differing language and logics of ShotSpotter’s use by U.S. police, military, and the Israeli Occupation Forces due to their related settler-colonial projects; in the tools and techniques employed to make the technology useful to the state in different contexts, whether it be attached to drones in Occupied Palestine, on the streets of the Southside of Chicago, or on U.S. military bases in Afghanistan and Iraq; and in the marketing and funding of the ShotSpotter technology, which is used, tested, and subsidized for local policing agencies by the U.S. federal government. By situating how ShotSpotter functions as an imperial tool both internal and external to the U.S. empire, a portal to envision and enact transnational resistance against ShotSpotter emerges.

Swain, Sujay (ELE), completed IW as a junior
Title of Project: Framework for the Future: Developing a Better System for Hardware Innovation

Over the last half-decade, the United States has realized the importance of investing in hardware innovation and manufacturing. With the passage of the CHIPS and Science Act, it appeared as if the United States had finally stepped back into the ring. The reality, though, is that the CHIPS Act was only an initial step, one focused solely on industrial policy and not on the American consumer. Over the past year, I have developed a framework that would allow everyone from local government to the federal government to develop technology policy legislation that can respond to the needs of its citizens. The goal was to provide a framework that would allow innovation to be balanced with user rights and consumer privacy. After speaking with policy makers and experts in the field, along with a detailed analysis of existing legislation, the following framework was proposed. With the goal of creating simple, effective solutions, the framework has four main pillars. The first pillar is investing in the local community. The second pillar requires that the technology is built around transparency. The third pillar is to use incentives to encourage organizations to provide greater user rights. The fourth pillar is to emphasize and invest in production, not just research. These four pillars would enable better hardware policy to be written, allowing both hardware innovation and the user to flourish.

2024 Certificate Graduates Independent Work

Ajjarapu, Nikhil (COS)
Title of Project: Rescript: An LLM-powered Application to Track Congressional Policy Meetings

Many policy developments occur in public settings such as government meetings and debates. These meetings are important to stakeholders from various sectors, who often need to stay on top of developments that could significantly impact them. My independent work introduces Rescript, a novel application and startup that productionizes Large Language Models (LLMs) to streamline the tracking and analysis of U.S. Congressional Meetings. Rescript automates the extraction of relevant information from video streams, offering users a range of tools, including AI-written professional-quality memos and real-time research assistance accessed through a chat interface. While it is relatively easy to create a promising demo using LLMs, productionizing this new technology is a unique challenge due to its unreliability and tendency to hallucinate. To solve this, I developed a robust speaker identification system and AI orchestration techniques, making Rescript a highly reliable, commercial offering serving dozens of enterprise users and their firms, not just a prototype. Through a combination of user interviews, quantitative evaluation, and an analysis of the startup’s business and product traction, this study evaluates Rescript’s effectiveness in improving work for government relations teams. Rescript’s business has grown 6x since launching in December, while our AI outputs reach the accuracy and quality of human-written outputs. The findings suggest that tightly scoping the application of AI and building redundant systems to ensure reliability is one pathway to ensuring an AI application’s success. I conclude that there is significant potential for AI to transform government engagement by reducing the time and expertise required to identify and act on relevant policy discussions.

Bendarkawi, Jad (ECE)
Title of Project: The Swarm Garden: Human-Swarm Interaction for Self-Adaptive Art and Architecture

Why do our technologies instill so much fear in us? Why do our lives today feel so alienating when our technologies are supposed to improve our well-being? What if machines were more like animals and plants? Questions like these motivate the development of The Swarm Garden: An Interactive Architecture Exhibit, an opportunity for humans and robots to create unique, holistic experiences with technology for beauty and human wellbeing. At the intersection of swarm intelligence, architectural design, art-making, and dance, The Swarm Garden demonstrates an experimental, nature-inspired interactive architecture exhibit where 36 robotic flower modules bloom in response to human presence and can exhibit complex long-range and real-time responses through self-organization. Each module exploits the bistability of confinement – or the ability for a sheet to buckle into flower-like patterns when pulled through a ring. Through direct interaction with the flower modules and a wearable device for capturing dance gestures and movement, visitors are empowered to discover emergent behaviors in the swarm by manipulating the propagation of blooming patterns, LED light directions, and LED colors through various interaction modalities. After the hardware and software development and deployment of this human-swarm system, we evaluated the interaction modalities used in public exhibition and found an overwhelmingly positive response from the audience, demonstrating the success of this technology in delivering beautiful and holistic experiences through human-swarm interaction. A professional dancer’s evaluation of our system also found it successful in its application to human-swarm collaborative dance performance and improvisation, having served as a novel method for emergent, synergistic, and beautiful improvisational and choreographic performance outcomes.
We envision futures where dancers, artists, and performers can utilize architectural swarms like The Swarm Garden as extensions of their artistic works and employ swarm intelligence to create embodied experiences with technology. Overall, the field of human-swarm interaction provides us the technological groundwork to create opportunities for us to reimagine our relationship with technology, and The Swarm Garden serves as a beacon for us to speculate a joyous future of coexistence between humans, machines, and nature through artistic and architectural robotic swarm applications.

Bhakta, Kareena, SPI (completed IW as a junior 2023)
Title of Project: When Sharing Is Not Caring: An Examination of UNHCR’s Data Sharing Policies in Bangladesh and Kenya

The implementation of biometric technology during refugee registration has the potential to speed up the process, improving access to aid and leading to less fraud since it can identify individuals based on unique characteristics such as iris scans or fingerprints. At the same time, this technology has introduced the issue of how organizations such as the United Nations High Commissioner for Refugees (UNHCR) should manage this sensitive data given that these populations are often fleeing conflict, extreme living conditions, and more. I focus on examining how UNHCR can build trust with refugee populations through their data protection policies and how to increase mechanisms for these populations to hold UNHCR accountable with regard to biometric data management. I analyze case studies of data sharing in Bangladesh and Kenya through four major categories: (1) policies governing data, (2) consent and information sharing, (3) accountability and redress, and (4) responding to failures and adaptations. Based on UNHCR’s published policies and first-hand accounts recorded by other scholars, I determine that there are few required actions of due diligence before data sharing (including data sharing agreements and impact assessments) in addition to miscommunication with refugees about the management of their data and limited mechanisms for complaints or redress. To conclude, I propose seven recommendations for UNHCR that are bucketed into three categories: (1) increasing refugee empowerment, (2) strengthening management of data, and (3) decreasing reliance on biometric data.

Castleman, Jane, COS (completed IW as a junior 2023)
Title of Project: Your Answers Are Protected By Law: Evaluating Protections against Reconstruction and Re-identification Attacks on the U.S. Census

In 2021, the U.S. Census Bureau published a report revealing that the privacy protections it used for the 2010 U.S. Decennial Census data were insufficient, saying the results “were alarming” and “provided conclusive evidence” that stronger privacy protections were necessary. My project aimed to build a framework for evaluating the impacts of reconstruction and re-identification attacks on private databases. It connected the characteristics of these attacks to the legal definitions of privacy in Title 13 of the U.S. Code, which states that the Census Bureau cannot publish data from which an individual can be identified. By evaluating the distances between the reconstructed and target database, as well as significant rates of reconstruction and re-identification, policy makers and researchers can better understand whether or not these attacks constitute a violation of individual privacy. In the Census Bureau’s attack, I argue that there was a significant rate of reconstructed individuals. For blocks of size 1-9, 10-49, and 50-99, there exist population uniques that can be exactly reconstructed, therefore violating Title 13 of the U.S. Code. Additionally, a significant rate of re-identification can be determined by setting bounds for the re-identification rate and the accuracy of re-identifications. Given there were 178 million re-identifications in their worst-case attack, I argue the rate and accuracy of re-identifications are beyond significant bounds. Overall, this framework begins to bridge the gap between mathematical formalizations of privacy and legal definitions by creating a method to evaluate reconstruction and re-identification attacks. It also aims to help policymakers understand when and why to apply increased privacy protections to their data.

Chen, Alina, COS
Title of Project: Virtual Ad-demic: Examining Skew in Facebook’s Ad Delivery for Vaccine Information

Social media platforms, such as Facebook with its three billion global users, play a critical role in public health communication. Vaccine hesitancy, a top global health threat as per the World Health Organization, is fueled by misinformation across these networks. As the world grappled with the rapid spread of COVID-19 in 2020, discussions around vaccination, vaccine mandates, and public health measures intensified both online and offline, and heightened the need to understand vaccine discourse on social media. My independent work focused on Facebook’s vaccine-related advertisements from 2018 to 2024, utilizing data from the platform’s Ad Library. I studied 31,362 ads from 1,297 advertisers, analyzing narrative content, advertising strategies, and delivery patterns, with the aim of determining how these elements influence public vaccination perceptions, assessing ad distribution efficacy across demographics, and identifying potential biases in Facebook’s ad delivery algorithms. My findings demonstrate significant misalignments in ad distribution versus intended demographic targets, indicating possible biases that could affect public health communication, and emphasizing the need for transparency in social media algorithms to ensure equitable information distribution.

Dong, Stephen, COS (completed IW as a junior 2023)
Title of Project: SamplAR – Augmented Reality for Family-Style Restaurant Ordering

While there has been research on the commercial benefits of augmented reality (AR) within the restaurant setting, there is limited research exploring the social potential of AR within the restaurant setting, particularly within restaurants that serve family-style. This is especially important because restaurant ordering is inherently social, and many restaurants that serve family-style are Asian, which comes with difficulties such as language barriers and a lack of understanding of specific foods. My hypothesis was that AR could help solve these issues while making family-style food ordering more collaborative, social, and fun. In my independent work, I built a collaborative ordering platform for family-style restaurants called SamplAR and evaluated the application with five dyads to study the impact of the application as well as uncover insights about the dynamics underlying ordering at family-style restaurants. My hope is that my research can help us build digital technology within restaurants that encourages socialization.

Esparraguera, Liam, COS (completed IW as a junior 2023)
Title of Project: a11ystudy: An Explorable History of Web Accessibility

Despite the growing prevalence of web-based technologies, the study and development of accessible online interfaces have remained a challenge and site of ongoing progress since the dawn of the Internet-connected age. Statistical reports on the modern state of web accessibility have been published, yet there exist few detailed reports that span the history of the web, a resource which could serve as a crucial utility in the development of accessible interfaces. In this project, I develop a novel approach to documenting the evolution of digital accessibility through the integration of web content archives with software for the programmatic evaluation of web page accessibility. The final result is a two-tool system for the collection and exploration of time-series data on web content accessibility: a11ystudy, a command-line interface for the evaluation of archived web pages, and a11ystudy-web, a companion web application allowing users to visualize exported data to explore trends in web accessibility. These tools are used in conjunction to generate and visualize a sample dataset that documents the conformance of the top 100 webpages to the Web Content Accessibility Guidelines from 2012 to the present. With this project, I build the foundation for a toolset that enables individuals, regardless of technical expertise, to interrogate the past and present state of web accessibility in order to pursue a more equitable future for digital technologies.

Frascella, Anthony (Ted), SPI
Title of Project: Under the Electric Eye: An Analysis and Assessment of Risk Factors For Abuse of State Surveillance

Surveillance technology is an ever-growing presence in the world, and individuals living under all regime types must reckon with risks to personal privacy and political rights as a result. My independent work explores the impact of surveillance technology on society, focusing on its dual role in enhancing public safety and posing risks to privacy and freedom. It examines how different regime types deploy surveillance systems and their consequent effects on populations. The study utilizes a mixed-methods approach to analyze the correlations and causal relationships between factors such as access to technology, political rights, and security conditions, and their impact on privacy. This research reveals that surveillance integration into daily online activities raises significant privacy concerns, highlighting the societal challenge of balancing security with individual rights. The work underscores the importance of robust legal frameworks and external oversight mechanisms in preventing surveillance abuses and protecting democratic freedoms. It calls for international cooperation and stricter legal guidelines to safeguard privacy and human rights in the digital age, stressing the societal implications of technological misuse and the necessity for policy that upholds human rights.

Gil, Irene, COS
Title of Project: Virtual Museum Hub: A New Lens to Breaking Down the Antiquated Structures of the Arts Industry to Reinvigorate a Social Community

In recent years, the growth of the museum industry in the US has been slow. However, this is not in line with the sentiment of most museums, whose age-old goal has been to nurture interest and education in the arts. This goal has been hindered by the barriers felt by many visitors, whether they be financial, logistical, or simply a lack of exposure. Many of these barriers stem from the antiquated structures that tie down the museum industry. For the quantitative framework of my research, I analyzed data on the top 20 museums in the US to gather insights on museum operation and asked where the barriers to break down come from. For the quantitative framework, I analyzed the Google Arts & Culture feature, using the Google Trends tool, studying how effective the feature is and what impact this technology has on society. Bringing in my IW work of a digital museum reservation platform, the Virtual Museum Hub, I argue that there is a different angle from which we can use technology to further arts education and use museums as a way to cultivate social engagement.

Grover, Ananya, COS
Title of Project: Navigating the News: A Pro-Social LLM Chatbot for Enhancing Viewpoint Diversity in Online News Consumption

In response to concerns about media polarization and harmful machine-generated content, my independent work explores the development and evaluation of PRISM, a socially beneficial chatbot designed to assist individuals in quickly accessing alternative perspectives on news found on the internet and social media platforms. I compare three different Large Language Model (LLM)-based approaches, namely Zero-Shot Learning (ZSL), Chain of Thought (CoT), and Multiagent Group Chat (MGC), at performing the task of writing comparative summaries of up to three news articles, finding that the Chain of Thought method performs the best. This independent work presents both a tool and a set of findings that reinforce its need, highlighting the potential of LLM-based tools in facilitating critical engagement with news media and enhancing users’ access to diverse viewpoints to promote healthier media diets and discourse.

Knoll, Theo, COS (completed IW as a junior 2023)
Title of Project: ARctic Escape: Promoting Social Connection, Teamwork, and Collaboration Using a Co-Located Augmented Reality Escape Room

Escape rooms are interactive games where players collaborate to discover information about their environment to accomplish a shared goal. While physical escape rooms provide groups with fun, social experiences, they require a gameplay venue, props, and a game master to play, all of which detract from their ease of access. Existing augmented reality (AR) escape rooms demonstrate that AR can make escape room experiences easier to access, but many AR escape rooms are single-player, and therefore fail to maintain the social and collaborative elements of their physical counterparts. I created ARctic Escape, a two-person, co-located AR escape room designed to promote social connection, collaboration, and communication. I evaluated ARctic Escape by conducting semi-structured interviews with four dyads to explore the sociological implications of AR technology as applied to escape rooms and to learn about participants’ interpersonal dynamics and experiences during gameplay. I found that participants thought the experience was fun and collaborative, promoted discussion, and inspired new social dynamics.

Lee, Alison (Alice), COS (completed IW as a junior 2023)
Title of Project: Examining Gender Differences in Investor Questions Towards Entrepreneurs: A Shark Tank Case Study

Venture capital allows startups to scale and be successful quickly, yet women have historically been excluded from a majority of this funding. Venture capital funding for all-women teams has hovered around the 2% mark for over a decade. We explore the potential explanation that investors ask entrepreneurs different questions on the basis of gender, perhaps subconsciously revealing a cognitive bias that assumes women are more likely to fail. This is demonstrated by more frequent use of ‘potential’ words like “aspire” and “hope” when talking to men and more frequent use of ‘prevention’ words like “risk” and “careful” when conversing with women. Using a newly created Shark Tank dataset, we use natural language processing to create classifiers that take in investor questions and predict the gender of the entrepreneur. We find that the classifiers are not meaningfully more accurate. However, we find some qualitative evidence of investors asking entrepreneurs different questions.

Liu, Zi Han, SOC
Title of Project: The Datafication of Listening: How “Spotify Artists” and “Spotify Wrapped” Shape Value in the Streaming Music Economy

Streaming has completely transformed the recorded music economy. Building on the research of my senior thesis, which examined how tastemakers and consumers shape value in the streaming music economy, this independent work focused on the sociotechnical changes that have emerged since the rise of Spotify. Specifically, I conducted a comparative content analysis between Spotify’s artist-facing interface (i.e. “Spotify for Artists” or “Spotify Artists”) and its consumer-facing interface (i.e. “Spotify Wrapped”) to show how streaming platforms refashioned the music listening experience by way of behavioral data. In turn, this reorganized music navigation around curated social identities rather than sound. For “Spotify Artists,” this is achieved by segmenting audiences based on listeners’ relational bonds with their favorite artists or songs. Conversely, for “Spotify Wrapped,” consumers build intimate algorithmic identities by engaging with Spotify’s curated listening profiles.

Maynard, Lauren, COS (completed IW as a junior 2023)
Title of Project: Nonprofits Unlock the Decentralized Landscape: Addressing the Fears, Uncertainties, and Doubts of Leveraging Blockchain Technology in Civil Society

As demonstrated in the open-source software (OSS) movement, many sectors were actively shaping and influencing the development and use of this technology. Fears, uncertainties, and doubts initially left nonprofits hesitant to adopt OSS technology. However, as the technology matured and nonprofit organizations saw the cost savings, increased efficiency, scalability, and better performance that OSS offered, they began to recognize the competitive edge and other advantages of using OSS and increasingly adopted it. Unfortunately, by then, many of the organizations that had hesitated to adopt OSS had missed the opportunity to shape and modify earlier stages of OSS. To better understand how civil society can participate in the decentralized future of the web, TechSoup Global conducted qualitative research sponsored by Filecoin, a decentralized storage network. I was recruited to identify a robust use case for decentralized technology to prevent nonprofits from falling further behind in the digital divide. I conducted a mixed-methods analysis that included qualitative interviews, surveys, and a literature review to develop user personas using Figma, a design and prototyping tool. This research framed the landscape analysis and revealed best practices for the nonprofit adoption of decentralized technology. My independent work identified that IT professionals are the subject matter experts leading the way towards accepting decentralized technology, as they increasingly recognize the advantages and can implement them if adequately supported. Therefore, this strategy’s objective was to inform stakeholders about the benefits of decentralized technology and streamline adoption efforts by surfacing IT professionals’ challenges when they advocate for decentralized technology’s use. We hope that with TechSoup’s knowledge and resources, nonprofits will be empowered to leverage and shape the new decentralized space to meet their organizational needs by making the digital future as secure, transparent, and open as possible.

Parikh, Yash, COS
Title of Project: Longitudinal Web Privacy Monitoring: Toward a Regulatory Tool

Recent years have brought unprecedented amounts of regulation on third-party data collection. There have been more state-level privacy laws in the past three years than ever before, and private companies are separately choosing to regulate third-party data collection. Regulators will have a large role in determining what changes should occur and how effective these changes are. To empirically measure the impact of regulatory changes and to enforce their policies, regulators need longitudinal web privacy monitoring tools. Current tools are too complex and/or cannot collect data about privacy violations over time. To address this gap, I conducted user interviews to determine the features regulators need in their web privacy monitoring tools. To meet these needs, I created Longitudinal, Automated Monitoring of Privacy (LAMP), a proof-of-concept automated web scraper for longitudinal web privacy monitoring.

Rubenstein, Alison, ORF
Title of Project: Sparking Interest: Investigating Drivers of Public Interest Through an Analysis of Google Trends for Extreme Weather Events

Understanding drivers of public interest can reveal solutions to collective action problems such as climate change and enable widespread behavioral change. Since search engines, like Google, are easily accessible and frequently used, online search behavior data can serve as an insightful metric to gauge public interest at a given place and time. In this independent work, I used Google Trends data for climate change and natural disasters to understand behavioral patterns related to these topics and to investigate the advantages and disadvantages of using online search behavior data in research. Strong relationships were observed between searches for climate change and searches for heatwaves, as well as between climate change searches and the volume of climate-related news publications. Additionally, the results suggest that extreme weather events, like hurricanes, capture widespread search attention at the times of these events, which has important implications for information messaging campaigns. Cross-regional analysis found more similar behavioral trends between locations that are closer together, have similar weather experiences, and have similar political affiliations and education levels. Through this research, I assessed multiple statistical methods for characterizing and interpreting online search data and demonstrated the importance of contextualizing search data when modeling and analyzing trends. Overall, this research demonstrated that online search behavior data can provide valuable insights related to public interest, but that considering the many possible meanings of user data is essential when matching user search data to scientific data.

Song, Emmy, COS (completed IW as a junior 2023)
Title of Project: Cracking the Bamboo Ceiling: Predictive Factors for Asian American Promotion in the Workplace

Under the model minority myth, Asian Americans are stereotyped as high-achieving, well-educated members of society who are able to find well-paying jobs. They make up only 6.2% of the United States population but are well over-represented in the workplace, composing 13% of working professionals. However, Asian Americans are less often promoted to the upper echelons of leadership, where White Americans dominate. While White Americans make up 69% of the U.S. workforce, they compose 85% of executives, senior officers, and other higher-level managers. On the other hand, Asian Americans make up 13% of the workforce and hold only 6% of top positions. I study why Asian Americans are not promoted at the same rate as White Americans by drawing upon employee responses to surveys about their work experiences. By using a random forest model to identify the most relevant factors, I determine that demographic attributes such as race and gender, as well as level of career support and training, strongly influence Asian American promotional outcomes. In addition, indirect values of grit and job support were more important for advancing promotions in younger age groups, while ambition and monetary motivation came to the forefront for older ones. Finally, my intersectional analysis of race and gender confirms previous hypotheses that being male and more experienced adds an advantage in promotion, regardless of race.

Vuono, Ryan, SPI
Title of Project: Towards Best-in-Class Biometric Data Protection for Refugees: A Comparative Review of UNHCR, Oxfam, and ICRC Policies

Biometrics have rapidly increased in popularity as a method for verifying the identities of refugees, by both governments and humanitarian organizations. With over 100 million individuals falling under its mandate, UNHCR has an immense responsibility to ensure that their personal data is protected, especially in the case of biometrics, which are permanently and irreversibly connected to one’s identity. To ensure that this incredibly sensitive form of data is used exclusively to achieve its primary mission to “safeguard the rights and well-being of refugees,” UNHCR must have best-in-class protection policies. In my paper, I compare UNHCR data protection policies with those of Oxfam and ICRC, two other organizations in the humanitarian space. I analyze the text of each organization’s policy documents and compare them across four main criteria: scope and specificity, data collection and data storage, third-party sharing, and flexibility. This analysis revealed that UNHCR policy could do more to improve the specificity of its language, the stringency of its requirements, and the security of its held data, especially with regard to biometrics. Taking inspiration from the strongest aspects of Oxfam and ICRC policies, I provide five major recommendations to help better achieve the agency’s mission of safeguarding the rights and well-being of refugees: (1) creating clearer delineations between biometrics and other kinds of identifying data; (2) conducting a comprehensive assessment of the risks of collecting, processing, storing and sharing biometrics, and sharing the information with possible data subjects; (3) strengthening point-of-storage protections; (4) developing a stricter data-sharing framework to ensure that biometrics are only shared when absolutely necessary and in very specific ways; and (5) implementing an annual or bi-annual review process to update its policies to keep up with the rapid pace of biometrics’ development. UNHCR has clear opportunities for improvement in order to ensure the sustained protection of refugees in an increasingly technological world.

Waseem, Shanzey, COS (completed IW as a junior 2023)
Title of Project: Video Games: There’s No Time for Violence

Psychological research describes how violent video games cause real-life violence, particularly among youth. While a vast literature investigates various policies to curb the specific harms to this protected population that result from violent gaming, I use guardians’ perceptions to demonstrate the demand for more stringent content regulation and the introduction of time regulation. I designed a detailed survey to analyze guardians’ perceptions in depth. Although psychological research shows that perception is not reality, I compared generalized perceptions to more specific observations and included significance levels to weigh the conclusions. The results showed a significant demand for industry-level content and time moderation policies; however, they also revealed guardians’ lack of awareness of video gaming policy and literature and, hence, of results-based regulations. As such, the policy implications point to changes on the developer and educational ends.

Wilks, Torre, SOC
Title of Project: Pretty Hurts: The Intersectionality of Race, Weight and Socioeconomic Status on Algorithmic Bias

This project draws on qualitative interviews that describe how content creators perceive algorithms as contributors to the mistreatment of people of color on social media. The participants of my study, all with more than 20,000 followers, believe societal factors like race, weight and socioeconomic status create algorithmic bias on social media platforms. As a consequence of this bias, creators of color have a difficult time achieving virality and adequate compensation on social media. Biased algorithms shadowban and moderate Black content creators more harshly than white creators, which makes it harder for people of color to be discovered on apps and causes them to be paid less. Moreover, the AI technology brands use to determine which creators they should sponsor is influenced by racial stereotypes that discourage partnerships with people from low socioeconomic statuses. As social media platforms have progressed, the influence of content creators has permeated beyond the realm of entertainment and now extends to our economic and political decision-making. Thus, if algorithmic bias prevents Black creators from achieving the same amount of power and privilege on these platforms, then social media companies should be responsible for making their technology more transparent and equitable.

Woo, Melissa, ORF
Title of Project: From Black Box to Glass Box: The Impact of Data Complexity on Machine Learning Explainability

My research focuses on enhancing transparency and trustworthiness in credit scoring for lending by evaluating post-hoc feature attribution methods, which explain model decisions by quantifying the significance of each input feature to the model’s output. By analyzing performance across various data complexity contexts such as feature correlation and target expression, I identify the best-performing methods, finding that SHAP-based methods and particularly On-Manifold SHAP are highly effective for explaining model decisions given data with linear target expressions and high feature correlation. These insights help improve model interpretability and decision-making in credit lending, addressing concerns about fairness and accountability in complex machine learning models.

Yang, Katherine (Kathy), COS
Title of Project: Proving Causality in Disparate Impact Housing Discrimination Cases After Inclusive Communities

In the United States, a long history of discrimination has resulted in persistent demographic disparities in housing. The Fair Housing Act was intended to lessen the extent of segregation and provide legal recourse to victims of housing discrimination. Under the act, “disparate impact” claims provide an option to object to insidious policies that are facially neutral but contribute significantly to disparities. In the past two decades, these claims have grown harder for plaintiffs to prove. Notably, the introduction of new causation standards has prevented plaintiffs from moving past the initial “prima facie” stage in court using traditional statistical methods. In parallel to this increased burden is the explosion of a “causal revolution” in the scientific world. My goal was to investigate whether Judea Pearl’s structural causal models—drawn from theoretical computer science—could help plaintiffs better prove causation in disparate impact housing discrimination cases. To that end, I conducted a case study analysis of the seminal Inclusive Communities v. Texas Department of Housing & Community Affairs case—tracing the evolving causation arguments through multiple iterations and constructing a proof-of-concept causal diagram for the facts of the case. I conclude that structural causal models show promise for application in disparate impact housing claims given their flexibility, robustness, and accessibility. In addition, the cross-pollination of computer science and law in this project unearthed fundamental philosophical discrepancies between scientific and legal definitions of causality that remain to be resolved through future work.

Zhang, Jasmine, COS
Title of Project: Click for a Cure: Analyzing Healthcare Advertisements on Facebook and the Role of the Delivery Algorithm

Advertisement delivery algorithms play a uniquely impactful role in our online experiences, determining what topics we see and what perspectives we hear. As ad delivery algorithms are powered by machine learning, their outcomes may be biased or skewed by gender, race, or age due to biases in the training data. In my research, I initiate the conversation in the domain of healthcare. My objective is to generate a holistic understanding of Meta’s healthcare advertisement space, and to study whether healthcare ad delivery may be discriminatory. I collect existing Meta advertisements using the most popular keywords in the current healthcare discussion. I conduct textual analysis on the ads within each keyword, examining popularity, complexity, formality, and point of view used. I further conduct thematic analysis, focusing on the topic of climate anxiety. By coding and identifying themes across a random subset of climate anxiety ads and newspaper articles, I find that the perspectives and themes reflected in the advertisement space parallel the discussion in news media. I further examine the role the delivery algorithm plays in influencing the delivery of such ads across demographic categories by running my own climate anxiety ads in conjunction with ads for nature, ChatGPT, and social media influencers. By comparing the demographic distribution reached across the different ads and observing that they are not substantially different, I conclude that the ad delivery algorithm does not treat climate anxiety ads differently from ads in the other categories I study. Rather, the algorithm delivers all the ads to predominantly older male audiences, suggesting that advertisers aiming to reach gender-balanced or younger audiences should use explicit targeting features indicating these goals.

Previous Certificate Graduates’ Independent Work