Fourth-year PhD candidate Han Shao is researching several problems in machine learning theory. Her advisor is Professor Avrim Blum, who also serves as Chief Academic Officer. Shao received her B.S. from Nanjing University and her MPhil from the Chinese University of Hong Kong.
“I feel very fortunate to be advised by Professor Blum,” Shao said. “He taught me how to conduct research, including how to theoretically formalize a problem and how to generate simple examples to understand the problem intuitively. I really learned a lot from him. Initially, he introduced me to research problems but gradually encouraged me to find new problems by myself.”
One of the machine learning problems that Shao is researching is incentives for collaboration in federated learning.
In federated learning, multiple learning agents (such as smartphones or hospitals with electronic health records) hold their own local data and collaboratively train a centralized model. In recent years, federated learning has been embraced as an approach for enabling collaboration across large populations of learning agents.
However, little is known about how collaboration protocols should account for agents’ incentives when allocating the training burden for communal learning, so that such collaborations can be sustained. For example, if one agent is asked to contribute more, but its contribution only improves the performance of another agent, the first agent may consider the arrangement unfair and lose its incentive to take part in the collaboration.
“These models need a method that incentivizes agents to join this collaborative effort,” Shao said. “I am quite interested in the incentives in this type of communal learning setting so that every agent is only asked for a reasonable amount of contribution and that the learning is sustainable.”
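The federated setting described above can be illustrated with a toy sketch. It is not Shao's protocol and does not model incentives. It only shows the basic pattern, assuming the shared "model" is a single number (a mean) and that each agent sends a local summary rather than its raw data. The agent names, data values, and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical agents, each holding private local data that never leaves the agent.
agents: Dict[str, List[float]] = {
    "hospital_a": [1.0, 2.0, 3.0],
    "hospital_b": [10.0],
    "phone_c": [4.0, 6.0],
}


def local_update(data: List[float]) -> Tuple[float, int]:
    """Compute a local estimate and report it with the local sample count."""
    return sum(data) / len(data), len(data)


def aggregate(updates: List[Tuple[float, int]]) -> float:
    """Combine local estimates, weighting each by how many samples it used."""
    total = sum(n for _, n in updates)
    return sum(estimate * n for estimate, n in updates) / total


# One round: every agent contributes a summary, the server combines them.
updates = [local_update(data) for data in agents.values()]
global_estimate = aggregate(updates)
print(global_estimate)
```

In this toy, `hospital_a` supplies three samples and `hospital_b` only one, yet both receive the same global estimate. That imbalance between contribution and benefit is the kind of question the incentive analysis is concerned with.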
Shao is also interested in the theoretical understanding of some empirical phenomena. One example is the sample complexity of data augmentation under transformation invariances. Transformation invariances are present in many real-world problems. For example, image classification is usually invariant to rotation and color transformation: a rotated car in a different color is still identified as a car. Data augmentation, which adds transformed copies of the data to the training set and trains a model on the augmented data, is a commonly used technique for building these invariances into the learning process.
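As a rough illustration of data augmentation as described above, the sketch below adds rotated copies of each training example while keeping its label. It assumes images are small 2D grids of integers and that the only transformation is rotation by multiples of 90 degrees. The function names and the tiny dataset are hypothetical and are not taken from Shao's work.

```python
from typing import List, Tuple

Image = List[List[int]]


def rotate90(img: Image) -> Image:
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def augment(dataset: List[Tuple[Image, str]]) -> List[Tuple[Image, str]]:
    """Return the dataset plus the 90, 180, and 270 degree rotations of each image.

    The label is copied unchanged, which encodes the assumption that the
    classification is invariant to these rotations.
    """
    augmented = []
    for img, label in dataset:
        current = img
        for _ in range(4):
            augmented.append((current, label))
            current = rotate90(current)
    return augmented


# A single hypothetical training example: a 2x2 "image" labeled "car".
train = [([[1, 0], [0, 0]], "car")]
augmented_train = augment(train)
print(len(augmented_train))
```

A model would then be trained on `augmented_train` instead of `train`.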
“We were trying to understand how data augmentation performs theoretically and what an optimal algorithm is in terms of sample complexity under transformation invariances,” Shao said.
Among the courses she has taken at TTIC, Shao said her favorites were “Statistical and Computational Learning Theory” taught by Professor Nati Srebro, “Introduction to the Theory of Machine Learning” taught by Professor Blum, and “Computational and Metric Geometry” taught by Professor Yury Makarychev.
“[Professor Makarychev’s] computational geometry class was very interesting because I didn’t know very much about computational geometry before,” Shao said. “It’s new and interesting to me.”
As Shao continues through her PhD program, she has started to consider what she would like to do after graduation. While her plans are flexible, she has expressed interest in academia.
“I think the best part of working in academia is that you have more freedom to research the things you are really interested in,” Shao said. “In terms of this, you always have to think about what you want to pursue. I really enjoy thinking about my research because I feel that I learn a lot from each project and I think a side benefit is the ability to be more flexible with your time.”
One piece of advice Shao would give to future PhD candidates is to stay open to problems they are not familiar with.
“Some people can end up limiting themselves if they focus on one thing without being open to other problems or thoughts, which also limits their understanding of the thing they are focusing on,” Shao said. “It’s really helpful to be open to things that you don’t know and sometimes you might learn a new angle to look at the problem you care about.”