Professor David Puelz Says to Challenge Ideology by Following the Data
The statistician and former Goldman Sachs analyst asks questions of counterfactual worlds.
University of Austin Assistant Professor of Data Science David Puelz is a Bayesian statistician working at the intersection of computational data analysis and machine learning. His research and writing span economics, the social sciences, and applied artificial intelligence. He also serves as faculty at The University of Texas at Austin in the School of Civic Leadership and the McCombs School of Business. Puelz was previously a postdoctoral researcher at the University of Chicago Booth School of Business and an analyst for Goldman Sachs & Co. He received his Ph.D. in Statistics from The University of Texas at Austin.
Here, Dr. Puelz discusses his formation as a college student, his predictions for causal AI, his recent project challenging COVID-19 predictions, and much more.
UATX: Where did you grow up and complete your undergraduate degree? Describe a college experience that was formative for you as a researcher.
Dr. David Puelz: I grew up in Dallas, so moving to Austin has been like returning home.
I earned my undergraduate degree in math and physics from Wesleyan University in Connecticut. Like UATX, Wesleyan has a very small student-to-faculty ratio, which encourages student engagement and often leads to exciting research collaborations with professors.
While in college, I worked directly with several Wesleyan faculty, including the physicist Dr. Tsampikos Kottos over one summer at the Max Planck Institute in Germany. That experience beefed up my coding skills and introduced me to theoretical physics research. I also researched as an undergrad at the Institute for Pure and Applied Math at the University of California, Los Angeles.
Engaging in high-level academic research while I was still a college student laid the foundation for my career as an academic.
Could you describe what led you to statistics and your particular subfields of computational data analysis and machine learning? How would you briefly introduce those subfields to those unfamiliar?
Right out of college, I got a job with Goldman Sachs, an investment banking firm in New York City. I worked with a group of Ph.D.-holders in economics, statistics, physics, and mathematics. My work on Wall Street opened my eyes to the many ways of describing our complex world through quantitative models. In particular, it highlighted a major contrast between physical and probabilistic modeling.
In college, I learned to understand the world primarily through physical models, such as “force = mass x acceleration”. However, once I began my career in finance, I learned about the beauty of probability modeling to capture risk, uncertainty, and structure in observed data. This new way of thinking about the world so captured my interest that I decided to pursue a Ph.D. at UT-Austin with an exceptional statistician who became a personal friend, Carlos Carvalho.
From that point forward, I set out to build new computational methods for difficult applied problems. Some questions I tackled: How do you design a predictive model for the entire public stock market based on hundreds of firm characteristics? How do you measure the effect of a randomized controlled trial on a large network, whether spatial or social?
The applications I work on range widely from finance and economics to social science, but methodologically, I wear one hat: I am a statistician designing new machine learning tools.
A major theme in all of my teaching is extreme skepticism and rigorous study of complex problems with data. Right now, I work on research broadly defined as “causal machine learning,” recently described by some as “causal AI.” This turns a prediction problem on its head and asks questions of counterfactual worlds. I ask questions like: “What would have happened to A had B happened or not happened?”
Or, more concretely:
“What would have happened to the California economy had the minimum wage not been increased?”
“What would have happened to COVID-19 spread had lockdowns not been put in place in New York City?”
“What would have happened to tumor growth under a regime of an innovative cancer drug (as opposed to a placebo pill)?”
“If Taylor Swift were a man, would she be wealthier?”
Viewed through the lens of statistical modeling, answering these questions boils down to estimating treatment or causal effects in complex settings. Treatment effects are magnitudes of change in a system that experiences an intervention. Such interventions include randomized controlled trials at large companies, treatment/placebo tests of a new cancer drug, or observed policy changes in a state or country.
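To make the idea of a treatment effect concrete, here is a minimal sketch (not Dr. Puelz's own methods) of the simplest case: in a simulated randomized trial, the difference in mean outcomes between treated and control groups estimates the average treatment effect. All data below are synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical randomized trial: 500 units, each randomly
# assigned to treatment (1) or control (0).
n = 500
treated = rng.integers(0, 2, size=n)

# Simulated outcomes: control units average 10.0, and the
# (assumed) true treatment effect adds 2.0, plus noise.
outcome = 10.0 + 2.0 * treated + rng.normal(0.0, 1.5, size=n)

# Under randomization, the difference in group means is an
# unbiased estimate of the average treatment effect (ATE).
ate_hat = outcome[treated == 1].mean() - outcome[treated == 0].mean()

# Normal-approximation 95% confidence interval for the ATE.
se = np.sqrt(outcome[treated == 1].var(ddof=1) / (treated == 1).sum()
             + outcome[treated == 0].var(ddof=1) / (treated == 0).sum())
ci = (ate_hat - 1.96 * se, ate_hat + 1.96 * se)

print(f"Estimated ATE: {ate_hat:.2f}, 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The counterfactual questions in the interview are hard precisely because they lack this randomized setup: in observational data, treated and control groups differ for reasons other than the treatment, and causal machine learning exists to adjust for those differences.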
My hot take is that causal AI will eventually replace the “large-language-model AI” hype. It presents a much more challenging engineering task, and the value proposition of automated decision-making for businesses and policymakers is much greater than text, image, and video generation.
In addition to your role at UATX, you serve as faculty and Director of the Policy Research Lab in the School of Civic Leadership at The University of Texas at Austin. Could you tell us more about this program and how it relates to your role at UATX?
The Policy Research Lab is a two-part program that first trains students in statistics and data science tools for policy research and then hires a select group of students as research assistants to work with faculty and collaborating companies on cutting-edge projects. The projects span important topics for civic and business leaders, from COVID-19 policy to impact measurement in venture capital to economic forecasting.
We are working to jointly offer PRL to UATX and UT-Austin students in future years.
In addition to my work with the PRL, I teach data science and machine learning courses in the business school at UT-Austin for undergraduates, MBAs, and graduate students in business analytics.
You have researched the weaknesses of COVID-19 models during the pandemic. Could you share a brief overview of your findings and the lessons forecasters and the general public might take from that episode?
I launched this project during the pandemic, inspired by my own skepticism of epidemiological forecasts used to justify policies like lockdowns and school closures. Media and government leaders paraded these scientists and their models through several waves of COVID-19, and I started to wonder about their accuracy.
Three of my brilliant UT students—Ashna Bhansali, John Walkington, and Tarini Sudhakar—assembled a data set pairing forecasts developed by an influential UT-Austin team with realized data for the City of Austin on hospitalizations, ICU patients, and deaths. We found the forecasts wildly inaccurate and alarmist relative to what actually transpired. We were excited to share this information with the world and grateful to get it published in a peer-reviewed journal.
What brought you to UATX, and what will be distinctive about a University of Austin education?
I’m motivated to build, and I’m excited about new alternatives to higher education. I’m especially excited to help design a new STEM curriculum more closely aligned with top companies and industry partners and unshackled from the “old ways” of undergraduate training and recruiting.
My teaching will encourage students to avoid ideological conformity, relish taking unconventional positions, and always follow the data.