Assign Experiment Variants at Scale in Online Controlled Experiments
Randomization is the key to establishing causality in controlled experiments. Online controlled experiments (A/B tests) have become the gold standard for learning the impact of new product features at technology companies. Randomization enables causal inference from A/B tests: because the randomized assignment maps end users to experiment buckets and balances user characteristics (both observed and unobserved) between the groups, experimenters can attribute any outcome differences between the experiment groups (control and treatment) to the product feature under test. Technology companies run A/B tests at scale: hundreds if not thousands of A/B tests run concurrently, each with hundreds of millions of users. This scale poses unique challenges for randomization. First, variant assignment must be computationally fast, since the experiment service handles hundreds of thousands of queries per second (QPS), and QPS grows quickly in a hypergrowth company. Second, variant assignments must be independent across the hundreds of experiments a user is assigned to. Third, assignment must be consistent when a user revisits the same experiment or when more users are included in the experiment. We present a novel assignment algorithm and provide statistical tests to validate the randomized assignments. The results show that the algorithm is not only computationally fast but also satisfies the statistical requirements: assignments are unbiased and independent between experiments.
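A common baseline that meets all three requirements (fast, independent across experiments, consistent on revisits) is deterministic bucketing: hash the user ID together with an experiment-specific salt and map the result to a uniform point in [0, 1). The talk's algorithm may differ; the sketch below is only an illustration of the general technique, and the function and parameter names are hypothetical.

```python
import hashlib

def assign_variant(user_id: str, experiment_id: str, weights=(0.5, 0.5)) -> int:
    """Deterministically map a user to a variant index for one experiment.

    Salting the hash with experiment_id makes assignments independent
    across experiments; hashing (rather than storing) makes the same
    user land in the same variant on every revisit.
    """
    digest = hashlib.md5(f"{experiment_id}:{user_id}".encode()).hexdigest()
    # Interpret the first 8 hex characters as a uniform draw in [0, 1).
    point = int(digest[:8], 16) / 0x100000000
    cumulative = 0.0
    for variant, weight in enumerate(weights):
        cumulative += weight
        if point < cumulative:
            return variant
    return len(weights) - 1  # guard against floating-point rounding

# Consistency: the same user and experiment always yield the same variant.
assert assign_variant("user-42", "exp-a") == assign_variant("user-42", "exp-a")
```

Because the assignment is a pure function of (user_id, experiment_id), it needs no storage or coordination and runs in constant time per request, which is what makes this family of approaches attractive at high QPS. Validating such a scheme statistically, as the talk describes, typically involves checking that bucket counts are uniform within an experiment and that assignments are uncorrelated between experiments.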
Max is a senior data scientist at Wish, where he focuses on experimentation (A/B testing) and machine learning. He has been improving the A/B testing platform at Wish on various fronts, including infrastructure, statistical testing, and usability. His passion is empowering data-driven decision-making through the rigorous use of data. Max earned his Ph.D. in Statistical Informatics from the University of Arizona.