Transparency Mechanisms for Hash-based Matching under End-to-End Encryption
(CW for discussion of child sexual abuse)
Many companies inside and outside the U.S. conduct hash-based detection of child sexual abuse material (CSAM) in settings that are not end-to-end encrypted, including messaging, email, and cloud storage. These systems generally use a special “perceptual” hash function to match newly sent or stored images against a database of hashes of known CSAM images, which is held by a national or international CSAM clearinghouse.
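The matching step described above can be sketched in a few lines. This is a minimal illustration only: the "average hash" below is a toy stand-in for a perceptual hash, not the function used by any deployed system, and the names average_hash, hamming_distance, matches_database, and the max_distance parameter are assumptions made for this example.

```python
from typing import List, Set


def average_hash(pixels: List[List[int]]) -> int:
    """Toy perceptual hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def matches_database(h: int, database: Set[int], max_distance: int = 2) -> bool:
    """A hash matches if it is within max_distance bits of any known hash (hypothetical policy)."""
    return any(hamming_distance(h, known) <= max_distance for known in database)


# Tiny 4x4 grayscale grids stand in for images.
original = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
# A uniformly brightened copy: visually similar, different raw bytes.
brightened = [[min(255, p + 20) for p in row] for row in original]
# A different pattern with the same overall brightness.
unrelated = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]

database = {average_hash(original)}
print(matches_database(average_hash(brightened), database))
print(matches_database(average_hash(unrelated), database))
```

Unlike a cryptographic hash, a perceptual hash is meant to stay the same (or nearly the same) under small edits such as brightening, which is why matching is done by distance rather than exact equality in this sketch.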
In July, Apple announced a system that would perform this matching privately, so that the server learns information only once a certain threshold of matches has been reached. Although Apple announced plans to use this system only in the non-end-to-end encrypted iCloud Photos application, such a system could easily be adapted for use in the end-to-end encrypted setting. As a result, Apple’s announcement has rekindled the debate over content moderation in end-to-end encrypted messaging.
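One standard way to obtain a "nothing is learned below the threshold" property is threshold secret sharing. The sketch below uses textbook Shamir secret sharing and is not a reproduction of Apple's protocol: the field prime PRIME and the function names make_shares and recover_secret are assumptions chosen for illustration. The idea is that each match would release one share of a decryption key, and the key can be reconstructed only once a threshold number of shares is available.

```python
import secrets
from typing import List, Tuple

# Illustrative field modulus (the Mersenne prime 2^127 - 1); an assumption for this sketch.
PRIME = 2**127 - 1


def make_shares(secret: int, threshold: int, count: int) -> List[Tuple[int, int]]:
    """Split secret into count shares; any threshold of them suffice to recover it."""
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        return y

    return [(x, evaluate(x)) for x in range(1, count + 1)]


def recover_secret(shares: List[Tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = 123456789
shares = make_shares(key, threshold=3, count=5)
print(recover_secret(shares[:3]) == key)  # three shares reach the threshold
```

With fewer shares than the threshold, interpolation yields a value that is unrelated to the key except with negligible probability, which is the property a threshold-based design relies on.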
This work provides both policy and technical contributions to improve the state of the debate. First, we perform an in-depth analysis of the policy tradeoffs, distinguishing issues that arise from hash-based detection in general from those that arise only in the end-to-end encrypted setting. We find that the vast majority of the policy questions are unlikely to be addressed with cryptography, but we also identify several issues that cryptographic transparency measures can improve. Second, we implement cryptographic transparency tools that improve the situation for the Apple private set intersection (PSI) system; these should be adaptable to other proposed systems as well.
Sarah Scheffler is a postdoctoral research associate at CITP. Her applied cryptography research creates new cryptographic capabilities inspired by the needs of society, law, and policy. Her research includes joint legal-cryptographic work on compelled decryption, end-to-end encrypted messaging, algorithmic fairness, and zero-knowledge proofs.
Sarah received her Ph.D. in computer science from Boston University in 2021, advised by Prof. Mayank Varia. Prior to her work at BU, she worked as assistant research staff at the MIT Lincoln Laboratory and received her B.S. in mathematics and computer science from Harvey Mudd College.
Sarah has also been a frequent volunteer for various outreach programs teaching computer science, programming, and cryptography to high school students.