WWW 2021 Tutorial, Ljubljana, Slovenia
Date: April 2021, exact dates and times TBD
- Tutorial Summary
- Presenter Biographies
Jennifer Dy, Northeastern University, Electrical and Computer Engineering Department
Stratis Ioannidis, Northeastern University, Electrical and Computer Engineering Department
Ilkay Yildiz, Northeastern University, Electrical and Computer Engineering Department
This tutorial will review classic and recent approaches to the problem of learning from comparisons and, more broadly, learning from ranked data. Particular attention will be paid to the ranking regression setting, in which rankings are regressed from sample features.
Class labels generated by humans are often noisy, as data collected from multiple experts exhibit inconsistencies across labelers. To ameliorate this effect, one approach is to ask labelers to compare or rank samples instead: when class labels are ordered, a labeler presented with two or more samples can rank them with respect to their relative order, as induced by class membership. Comparisons are more informative than class labels, as they capture both inter- and intra-class relationships; the latter are not revealed by class labels alone. In addition, comparison labels are subject to reduced variability in practice: this has been experimentally observed in many domains, and is due to the fact that humans often find it easier to make relative judgements than absolute ones.
Nevertheless, learning from comparisons poses computational challenges: regressing rankings from sample features is a computationally intensive task. Learning from pairwise comparisons among N samples corresponds to inference over O(N^2) comparison labels. More generally, learning from rankings of sample subsets of size K corresponds to inference over O(N^K) labels. This calls for significantly improving the performance of, e.g., maximum likelihood estimation (MLE) algorithms over such datasets. Finally, collecting rankings is also labor intensive, precisely because of the O(N^K) size of the space of potential sets of size K to be labeled.
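As a concrete illustration of MLE over comparison labels, the following sketch fits the classic Bradley-Terry model to pairwise outcomes by gradient ascent on the log-likelihood. The item counts, comparison data, learning rate, and iteration budget are all hypothetical choices for illustration, not the algorithms covered in the tutorial:

```python
import numpy as np

def bradley_terry_mle(n_items, comparisons, lr=0.1, n_iters=500):
    """Fit Bradley-Terry log-skill scores s by maximizing
    sum over observed wins (i beats j) of log sigmoid(s_i - s_j)."""
    s = np.zeros(n_items)
    for _ in range(n_iters):
        grad = np.zeros(n_items)
        for i, j in comparisons:           # each pair records: i beat j
            p_i = 1.0 / (1.0 + np.exp(s[j] - s[i]))  # P(i beats j)
            grad[i] += 1.0 - p_i           # winner gradient
            grad[j] -= 1.0 - p_i           # loser gradient
        s += lr * grad
        s -= s.mean()  # scores are shift-invariant; fix the scale
    return s

# toy data (hypothetical): item 0 usually beats 1, and 1 usually beats 2
comparisons = [(0, 1)] * 8 + [(1, 0)] * 2 + [(1, 2)] * 7 + [(2, 1)] * 3
scores = bradley_terry_mle(3, comparisons)
print(scores)  # recovered ordering: scores[0] > scores[1] > scores[2]
```

Note that even this toy loop touches every observed pair per iteration; with all O(N^2) pairs labeled, each pass costs quadratic time in N, which is the scaling issue the paragraph above describes.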
- Parametric models: Bradley-Terry, Plackett-Luce, Thurstone.
- Non-parametric models: noisy-permutation model, Mallows model, matrix factorization methods.
- Maximum Likelihood Estimation and spectral algorithms.
- Ranking regression and variational inference methods applied to comparisons.
- Sample complexity guarantees for ranking regression.
- Deep neural network models and accelerated learning methods.
- Active learning from comparisons.
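To make the ranking regression setting listed above concrete, one simple instance models each sample's score as a linear function of its features, turning pairwise labels into logistic regression on feature differences (a Bradley-Terry model with parametrized scores). The data, learning rate, and iteration count below are hypothetical, for illustration only:

```python
import numpy as np

def fit_ranking_regression(X, pairs, lr=0.5, n_iters=1000):
    """Learn w such that score(x) = w @ x and
    P(i ranked above j) = sigmoid(w @ (x_i - x_j)),
    by gradient ascent on the pairwise log-likelihood."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        grad = np.zeros_like(w)
        for i, j in pairs:                 # label: sample i ranked above j
            diff = X[i] - X[j]
            p = 1.0 / (1.0 + np.exp(-w @ diff))
            grad += (1.0 - p) * diff
        w += lr * grad / len(pairs)
    return w

# toy data (hypothetical): the true score is the first feature
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
true_scores = X[:, 0]
pairs = [(i, j) for i in range(20) for j in range(20)
         if true_scores[i] > true_scores[j]]
w = fit_ranking_regression(X, pairs)
print(w)  # first coordinate should dominate
```

Unlike plain Bradley-Terry, the learned w generalizes: it assigns a score to any new sample from its features, which is what distinguishes ranking regression from per-item score estimation.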
Jennifer Dy is a Professor at the Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, where she first joined the faculty in 2002. She received her M.S. and Ph.D. in 1997 and 2001, respectively, from the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, and her B.S. degree from the Department of Electrical Engineering, University of the Philippines, in 1993. Her research spans both fundamental research in machine learning and its application to biomedical imaging, health, science, and engineering, with research contributions in unsupervised learning, interpretable models, dimensionality reduction, feature selection/sparse methods, learning from uncertain experts, active learning, Bayesian models, and deep representations. She received an NSF CAREER award in 2004. She has served or is serving as Secretary for the International Machine Learning Society; associate editor/editorial board member for the Journal of Machine Learning Research, the Machine Learning journal, and IEEE Transactions on Pattern Analysis and Machine Intelligence; organizing and/or technical program committee member for premier conferences in machine learning and data mining (ICML, NeurIPS, ACM SIGKDD, AAAI, IJCAI, UAI, AISTATS, ICLR, SIAM SDM); and program co-chair for SIAM SDM 2013 and ICML 2018.
Stratis Ioannidis is an Associate Professor in the Electrical and Computer Engineering Department of Northeastern University, in Boston, MA, where he also holds a courtesy appointment with the College of Computer and Information Science. His research interests span machine learning, distributed systems, networking, optimization, and privacy. He received his B.Sc. (2002) in Electrical and Computer Engineering from the National Technical University of Athens, Greece, and his M.Sc. (2004) and Ph.D. (2009) in Computer Science from the University of Toronto, Canada. Prior to joining Northeastern, he was a research scientist at the Technicolor research centers in Paris, France, and Palo Alto, CA, as well as at Yahoo Labs in Sunnyvale, CA. He is the recipient of an NSF CAREER Award, a Google Faculty Research Award, a Facebook Research Award, and a best paper award at ACM ICN 2017 and IEEE DySPAN 2019.
Ilkay Yildiz is a senior PhD student at the Department of Electrical and Computer Engineering, Northeastern University, Boston, MA. She received her undergraduate degree in Electrical and Electronics Engineering at Bilkent University, Ankara, Turkey in 2017. Her research interests span ranking and preference learning, deep learning, computer vision, probabilistic modeling, and optimization. Her research contributions involve accelerated regression algorithms that learn from choice and ranking labels.
Our work is supported by NIH (R01EY019474), NSF (SCH-1622542 at MGH, SCH-1622536 at Northeastern, SCH-1622679 at OHSU), a Facebook Statistics Fellowship, and unrestricted departmental funding from Research to Prevent Blindness (OHSU).