WWW 2021 Tutorial, Ljubljana, Slovenia
Date: April 14, 2021
Start Time: 2pm CEST/Ljubljana, 8am EDT, 5am PDT
End Time: 6pm CEST/Ljubljana, noon EDT, 9am PDT
Location: Virtual, MiTeam > Tutorials > Learning From Comparisons
Slides: PDF
 Presenters
 Motivation and Summary
 Outline
 Schedule
 Presenter Biographies
 Acknowledgements
 References
Presenters
Jennifer Dy, Northeastern University, Electrical and Computer Engineering Department
Stratis Ioannidis, Northeastern University, Electrical and Computer Engineering Department
Ilkay Yildiz, Northeastern University, Electrical and Computer Engineering Department
Motivation and Summary
Class labels generated by humans are often noisy: data collected from multiple experts exhibit inconsistencies across labelers. One way to ameliorate this effect is to ask labelers to compare or rank samples instead: when class labels are ordered, a labeler presented with two or more samples can rank them w.r.t. their relative order, as induced by class membership. Comparisons are more informative than class labels, as they capture both inter-class and intra-class relationships; the latter are not revealed by class labels alone. In addition, comparison labels are subject to reduced variability in practice: this has been observed experimentally in many domains, and stems from the fact that humans often find it easier to make relative, rather than absolute, judgements.
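To make the ordered-class setting concrete, the following sketch (our illustration; the function name is hypothetical, not from the tutorial) derives pairwise comparison labels from ordered class labels:

```python
# Illustrative sketch (not the tutorial's code): deriving pairwise
# comparison labels from ordered class labels.
from itertools import combinations

def comparisons_from_classes(labels):
    """Return (i, j) pairs meaning 'sample i ranks above sample j',
    as induced by class membership (a higher class label means a
    higher rank). Ties yield no comparison."""
    pairs = []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] > labels[j]:
            pairs.append((i, j))
        elif labels[j] > labels[i]:
            pairs.append((j, i))
    return pairs

pairs = comparisons_from_classes([2, 0, 1])
# each tuple (i, j) reads: sample i outranks sample j
```

Note how intra-class pairs (equal labels) produce no comparison, while every inter-class pair does, which is why the label set grows quadratically in the number of samples.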
Nevertheless, learning from comparisons poses computational challenges: regressing rankings from sample features is a computationally intensive task. Learning from pairwise comparisons between N samples corresponds to inference over O(N^2) comparison labels. More generally, learning from rankings of sample subsets of size K corresponds to inference over O(N^K) labels. This requires significantly improving the performance of, e.g., maximum likelihood estimation (MLE) algorithms over such datasets. Finally, collecting rankings is also labor intensive, precisely because of the O(N^K) size of the space of potential sets of size K to be labeled.
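As a concrete instance of MLE over pairwise comparisons, here is a minimal Bradley-Terry fit by gradient ascent (an illustrative sketch under our own naming, not the presenters' implementation); the dataset it consumes can hold up to O(N^2) pairs:

```python
# Minimal Bradley-Terry MLE sketch (our illustration). Each observation
# (i, j) means sample i beat sample j; the model posits
# P(i beats j) = sigmoid(s_i - s_j) for latent scores s.
import math

def bradley_terry_mle(n, wins, steps=2000, lr=0.05):
    s = [0.0] * n
    for _ in range(steps):
        grad = [0.0] * n
        for i, j in wins:
            p = 1.0 / (1.0 + math.exp(s[j] - s[i]))  # P(i beats j)
            grad[i] += 1.0 - p                       # d log-lik / d s_i
            grad[j] -= 1.0 - p
        s = [si + lr * g for si, g in zip(s, grad)]
        m = sum(s) / n                               # scores are shift-
        s = [si - m for si in s]                     # invariant; center them
    return s

scores = bradley_terry_mle(3, [(0, 1), (0, 2), (1, 2), (0, 1)])
# sample 0 wins every comparison, so it receives the highest score
```

Each gradient pass touches every comparison, which is exactly why scaling MLE to large comparison datasets is the bottleneck the tutorial discusses.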
The tutorial will review classic and recent approaches to the problem of learning from comparisons and, more broadly, learning from ranked data. Particular attention will be paid to the ranking regression setting, in which rankings are regressed from sample features.
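A minimal ranking-regression sketch (our illustration, not the presenters' algorithm) models Bradley-Terry scores as a linear function of features, s_i = ⟨beta, x_i⟩, and fits beta by gradient ascent on the comparison log-likelihood:

```python
# Ranking regression sketch (illustrative; names are ours): regress
# Bradley-Terry scores from sample features via logistic regression
# on feature differences.
import math

def fit_ranking_regression(X, wins, steps=500, lr=0.1):
    d = len(X[0])
    beta = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for i, j in wins:                        # observation: i beat j
            diff = [a - b for a, b in zip(X[i], X[j])]
            z = sum(b * v for b, v in zip(beta, diff))
            p = 1.0 / (1.0 + math.exp(-z))       # P(i beats j)
            for k in range(d):
                grad[k] += (1.0 - p) * diff[k]
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return beta

X = [[1.0], [0.5], [0.0]]                        # one feature per sample
beta = fit_ranking_regression(X, [(0, 1), (1, 2), (0, 2)])
# the learned score <beta, x> can now rank unseen samples by feature
```

Unlike plain Bradley-Terry, the learned parameters generalize: new samples are ranked through their features without collecting fresh comparisons.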
Outline
 Parametric models: Bradley-Terry, Plackett-Luce, Thurstone.
 Nonparametric models: noisy-permutation model, Mallows model, matrix factorization methods.
 Maximum Likelihood Estimation and spectral algorithms.
 Ranking regression and variational inference methods applied to comparisons.
 Sample complexity guarantees for ranking regression.
 Deep neural network models and accelerated learning methods.
 Active learning from comparisons.
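To preview the spectral algorithms above, here is a sketch in the spirit of Rank Centrality (Negahban et al., 2012): build a random walk over the comparison graph that drifts toward winners, and use its stationary distribution as the score vector. The implementation details below are our own illustration.

```python
# Spectral ranking sketch in the spirit of Rank Centrality
# (Negahban et al., 2012); details here are illustrative.
def spectral_rank(n, wins, iters=200):
    """wins[(i, j)] = number of times i beat j. Returns scores: the
    stationary distribution of a lazy random walk that moves from an
    item toward the items that beat it."""
    P = [[0.0] * n for _ in range(n)]
    dmax = float(n)                       # normalizer >= max degree
    for i, j in {tuple(sorted(k)) for k in wins}:   # each pair once
        c_ij = wins.get((i, j), 0)
        c_ji = wins.get((j, i), 0)
        total = c_ij + c_ji
        P[j][i] = c_ij / (total * dmax)   # loser j steps toward winner i
        P[i][j] = c_ji / (total * dmax)
    for i in range(n):
        P[i][i] = 1.0 - sum(P[i])         # lazy self-loop
    pi = [1.0 / n] * n
    for _ in range(iters):                # power iteration: pi <- pi P
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

scores = spectral_rank(
    3, {(0, 1): 3, (1, 0): 1, (1, 2): 3, (2, 1): 1, (0, 2): 3, (2, 0): 1})
```

Because the walk only needs the observed comparisons and a few power iterations, such spectral methods avoid the iterative likelihood optimization of MLE while provably recovering the same score ordering under the Bradley-Terry model.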
Schedule
| CEST | EDT | PDT | Content | Duration |
| --- | --- | --- | --- | --- |
| 2:00pm | 8:00am | 5:00am | Parametric models: Bradley-Terry, Plackett-Luce, Thurstone. Maximum Likelihood Estimation and spectral algorithms. Nonparametric models: noisy-permutation model, Mallows model. | 1h 15min |
| 3:15pm | 9:15am | 6:15am | Q&A + break | 15min |
| 3:30pm | 9:30am | 6:30am | Ranking regression. Deep neural network models and accelerated learning methods. | 1h 30min |
| 5:00pm | 11:00am | 8:00am | Q&A + break | 15min |
| 5:15pm | 11:15am | 8:15am | Variational inference methods applied to comparisons. Active learning from comparisons. Sample complexity. | 30min |
| 5:45pm | 11:45am | 8:45am | Q&A + final discussion | 15min |
Presenter Biographies
Jennifer Dy is a Professor in the Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, where she first joined the faculty in 2002. She received her M.S. and Ph.D. in 1997 and 2001, respectively, from the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, and her B.S. degree from the Department of Electrical Engineering, University of the Philippines, in 1993. Her research spans both fundamental machine learning and its application to biomedical imaging, health, science, and engineering, with contributions in unsupervised learning, interpretable models, dimensionality reduction, feature selection/sparse methods, learning from uncertain experts, active learning, Bayesian models, and deep representations. She received an NSF CAREER award in 2004. She has served or is serving as Secretary of the International Machine Learning Society; as an associate editor/editorial board member for the Journal of Machine Learning Research, the Machine Learning journal, and IEEE Transactions on Pattern Analysis and Machine Intelligence; as an organizing and/or technical program committee member for premier conferences in machine learning and data mining (ICML, NeurIPS, ACM SIGKDD, AAAI, IJCAI, UAI, AISTATS, ICLR, SIAM SDM); and as program co-chair for SIAM SDM 2013 and ICML 2018.
Stratis Ioannidis is an Associate Professor in the Electrical and Computer Engineering Department of Northeastern University, Boston, MA, where he also holds a courtesy appointment with the College of Computer and Information Science. His research interests span machine learning, distributed systems, networking, optimization, and privacy. He received his B.Sc. (2002) in Electrical and Computer Engineering from the National Technical University of Athens, Greece, and his M.Sc. (2004) and Ph.D. (2009) in Computer Science from the University of Toronto, Canada. Prior to joining Northeastern, he was a research scientist at the Technicolor research centers in Paris, France, and Palo Alto, CA, as well as at Yahoo Labs in Sunnyvale, CA. He is the recipient of an NSF CAREER Award, a Google Faculty Research Award, a Facebook Research Award, and best paper awards at ACM ICN 2017 and IEEE DySPAN 2019.
Ilkay Yildiz is a senior PhD student in the Department of Electrical and Computer Engineering, Northeastern University, Boston, MA. She received her undergraduate degree in Electrical and Electronics Engineering from Bilkent University, Ankara, Turkey, in 2017. Her research interests span ranking and preference learning, deep learning, computer vision, probabilistic modeling, and optimization. Her research contributions include accelerated regression algorithms that learn from choice and ranking labels.
Acknowledgements
Our work is supported by NIH (R01EY019474), NSF (SCH-1622542 at MGH, SCH-1622536 at Northeastern, SCH-1622679 at OHSU), a Facebook Statistics Fellowship, and unrestricted departmental funding from Research to Prevent Blindness (OHSU).
References

Ahmed, N. and Campbell, M. (2010). Variational Bayesian data fusion of multi-class discrete observations with applications to cooperative human-robot estimation. In ICRA.

Ammar, A. and Shah, D. (2011). Ranking: Compare, don’t score. In 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 776–783. IEEE.

Azari Soufiani, H., Parkes, D. C., and Xia, L. (2012). Random utility theory for social choice. In Advances in Neural Information Processing Systems, pages 126–134.

Bosman, A. S., Engelbrecht, A., and Helbig, M. (2020). Visualising basins of attraction for the cross-entropy and the squared error neural network loss functions. Neurocomputing.

Bouchard, G. (2007). Efficient bounds for the softmax function, applications to inference in hybrid models. Presentation at a workshop at NIPS.

Boyd, S., Parikh, N., Chu, E., Peleato, B., and Eckstein, J. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine Learning, 3(1):1–122.

Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs: I. the method of paired comparisons. Biometrika, 39(3/4):324–345.

Braverman, M. and Mossel, E. (2008). Noisy sorting without resampling. In Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 268–276. Society for Industrial and Applied Mathematics.

Campbell, J. P., Kalpathy-Cramer, J., Erdoğmuş, D., Tian, P., Kedarisetti, D., Moleta, C., Reynolds, J. D., Hutcheson, K., Shapiro, M. J., Repka, M. X., et al. (2016). Plus disease in retinopathy of prematurity: A continuous spectrum of vascular abnormality as a basis of diagnostic variability. Ophthalmology, 123(11):2338–2344.

Caron, F. and Doucet, A. (2012). Efficient Bayesian inference for generalized Bradley-Terry models. Journal of Computational and Graphical Statistics, 21(1):174–196.

Caruana, R. and Niculescu-Mizil, A. (2004). Data mining in metric space: an empirical analysis of supervised learning performance criteria. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 69–78.

Cattelan, M. (2012). Models for paired comparison data: A review with emphasis on dependent data. Statistical Science, pages 412–433.

Chang, H., Yu, F., Wang, J., Ashley, D., and Finkelstein, A. (2016). Automatic triage for a photo series. ACM Transactions on Graphics (TOG), 35(4):148.

Desarkar, M. S., Saxena, R., and Sarkar, S. (2012). Preference relation based matrix factorization for recommender systems. In International Conference on User Modeling, Adaptation, and Personalization, pages 63–75. Springer.

Doughty, H., Damen, D., and Mayol-Cuevas, W. (2018). Who's better? Who's best? Pairwise deep ranking for skill determination. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6057–6066.

Dubey, A., Naik, N., Parikh, D., Raskar, R., and Hidalgo, C. A. (2016). Deep learning the city: Quantifying urban perception at a global scale. In European Conference on Computer Vision, pages 196–212. Springer.

Fligner, M. A. and Verducci, J. S. (1993). Probability models and statistical analyses for ranking data, volume 80. Springer.

Golik, P., Doetsch, P., and Ney, H. (2013). Cross-entropy vs. squared error training: a theoretical and experimental comparison. In Interspeech, volume 13, pages 1756–1760.

Guiver, J. and Snelson, E. (2009). Bayesian inference for Plackett-Luce ranking models. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 377–384. ACM.

Guo, Y., Tian, P., Kalpathy-Cramer, J., Ostmo, S., Campbell, J. P., Chiang, M. F., Erdoğmuş, D., Dy, J. G., and Ioannidis, S. (2018). Experimental design under the Bradley-Terry model. In IJCAI, pages 2198–2204.

Guo, Y., Dy, J., Erdoğmuş, D., Kalpathy-Cramer, J., Ostmo, S., Campbell, J. P., Chiang, M. F., and Ioannidis, S. (2019). Accelerated experimental design for pairwise comparisons. In Proceedings of the 2019 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics.

Guo, Y., Dy, J., Erdoğmuş, D., Kalpathy-Cramer, J., Ostmo, S., Campbell, J. P., Chiang, M. F., and Ioannidis, S. (2019). Variational inference from ranked samples with features. In Asian Conference on Machine Learning, pages 599–614. PMLR.

Hajek, B., Oh, S., and Xu, J. (2014). Minimaxoptimal inference from partial rankings. In Advances in Neural Information Processing Systems, pages 1475–1483.

Hunter, D. R. (2004). MM algorithms for generalized Bradley-Terry models. The Annals of Statistics, 32(1):384–406.

Jaakkola, T. and Jordan, M. (1997). A variational approach to Bayesian logistic regression models and their extensions. In Sixth International Workshop on Artificial Intelligence and Statistics.

Joachims, T. (2002). Optimizing search engines using clickthrough data. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 133–142. ACM.

Kalpathy-Cramer, J., Campbell, J. P., Erdoğmuş, D., Tian, P., Kedarisetti, D., Moleta, C., Reynolds, J. D., Hutcheson, K., Shapiro, M. J., and Repka, M. X. (2016). Plus disease in retinopathy of prematurity: Improving diagnosis by ranking disease severity and using quantitative image analysis. Ophthalmology, 123(11):2345–2351.

Khetan, A. and Oh, S. (2016). Computational and statistical tradeoffs in learning to rank. In Advances in Neural Information Processing Systems, pages 739–747.

Koren, Y. and Sill, J. (2011). OrdRec: an ordinal model for predicting personalized item rating distributions. In Proceedings of the Fifth ACM Conference on Recommender Systems, pages 117–124. ACM.

Lei, Q., Zhong, K., and Dhillon, I. S. (2016). Coordinate-wise power method. In Advances in Neural Information Processing Systems, pages 2064–2072.

Lu, T. and Boutilier, C. (2011). Learning Mallows models with pairwise preferences. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pages 145–152.

Mao, C., Pananjady, A., and Wainwright, M. J. (2018a). Breaking the 1/√n barrier: Faster rates for permutation-based models in polynomial time. In Conference on Learning Theory, pages 2037–2042.

Mao, C., Weed, J., and Rigollet, P. (2018b). Minimax rates and efficient algorithms for noisy sorting. In Algorithmic Learning Theory, pages 821–847. PMLR.

Marden, J. I. (2014). Analyzing and modeling rank data. Chapman and Hall/CRC.

Maystre, L. and Grossglauser, M. (2015). Fast and accurate inference of Plackett-Luce models. In Advances in Neural Information Processing Systems, pages 172–180.

Minoux, M. (1978). Accelerated greedy algorithms for maximizing submodular set functions. In Optimization Techniques.

Moleta, C., Campbell, J. P., Kalpathy-Cramer, J., Chan, R. P., Ostmo, S., Jonas, K., Chiang, M. F., the Imaging and Informatics in ROP Research Consortium, et al. (2017). Plus disease in retinopathy of prematurity: Diagnostic trends in 2016 vs. 2007. American Journal of Ophthalmology.

Mosteller, F. (2006). Remarks on the method of paired comparisons: I. the least squares solution assuming equal standard deviations and equal correlations. In Selected Papers of Frederick Mosteller, pages 157–162. Springer.

Negahban, S., Oh, S., and Shah, D. (2012). Iterative ranking from pairwise comparisons. In Advances in Neural Information Processing Systems, pages 2474–2482.

Nemhauser, G. L., Wolsey, L. A., and Fisher, M. L. (1978). An analysis of approximations for maximizing submodular set functions. Mathematical Programming.

Pahikkala, T., Tsivtsivadze, E., Airola, A., Järvinen, J., and Boberg, J. (2009). An efficient algorithm for learning to rank from preference graphs. Machine Learning, 75(1):129–165.

Park, S. and Choi, S. (2010). Bayesian aggregation of binary classifiers. In ICDM.

Plackett, R. L. (1975). The analysis of permutations. Applied Statistics, pages 193–202.

Rajkumar, A. and Agarwal, S. (2014). A statistical convergence perspective of algorithms for rank aggregation from pairwise data. In International Conference on Machine Learning, pages 118–126.

Schultz, M. and Joachims, T. (2004). Learning a distance metric from relative comparisons. In Advances in Neural Information Processing Systems, pages 41–48.

Shah, N., Balakrishnan, S., Guntuboyina, A., and Wainwright, M. (2016a). Stochastically transitive models for pairwise comparisons: Statistical and computational issues. In International Conference on Machine Learning, pages 11–20.

Shah, N. B., Balakrishnan, S., Bradley, J., Parekh, A., Ramchandran, K., and Wainwright, M. J. (2016b). Estimation from pairwise comparisons: Sharp minimax bounds with topology dependence. The Journal of Machine Learning Research, 17(1):2049–2095.

Soufiani, H. A., Chen, W., Parkes, D. C., and Xia, L. (2013). Generalized method-of-moments for rank aggregation. In Advances in Neural Information Processing Systems, pages 2706–2714.

Tian, P., Guo, Y., KalpathyCramer, J., Ostmo, S., Campbell, J. P., Chiang, M. F., Dy, J., Erdoğmuş, D., and Ioannidis, S. (2019). A severity score for retinopathy of prematurity. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 1809–1819. ACM.

Vojnovic, M. and Yun, S. (2016). Parameter estimation for generalized Thurstone choice models. In International Conference on Machine Learning, pages 498–506.

Vojnovic, M., Yun, S., and Zhou, K. (2020). Convergence rates of gradient descent and MM algorithms for Bradley-Terry models. In International Conference on Artificial Intelligence and Statistics. PMLR.

Wauthier, F., Jordan, M., and Jojic, N. (2013). Efficient ranking from pairwise comparisons. In International Conference on Machine Learning, pages 109–117.

Yıldız, İ., Dy, J., Erdoğmuş, D., KalpathyCramer, J., Ostmo, S., Campbell, J. P., Chiang, M. F., and Ioannidis, S. (2020). Fast and accurate ranking regression. In International Conference on Artificial Intelligence and Statistics (AISTATS).

Yıldız, İ., Dy, J., Erdoğmuş, D., Ostmo, S., Campbell, J. P., Chiang, M. F., and Ioannidis, S. (2021). Deep spectral ranking. In International Conference on Artificial Intelligence and Statistics (AISTATS).

Yıldız, İ., Tian, P., Dy, J., Erdoğmuş, D., Brown, J., KalpathyCramer, J., Ostmo, S., Campbell, J. P., Chiang, M. F., and Ioannidis, S. (2019). Classification and comparison via neural networks. Neural Networks.

Zheng, Y., Zhang, L., Xie, X., and Ma, W.-Y. (2009). Mining interesting locations and travel sequences from GPS trajectories. In Proceedings of the 18th International Conference on World Wide Web, pages 791–800. ACM.