

American Statistical Association Whitepaper

Cynthia Rudin, David Dunson, Rafael Irizarry, Hongkai Ji, Eric Laber, Jeffrey Leek, Tyler McCormick, Sherri Rose, Chad Schafer, Mark van der Laan, Larry Wasserman, Lingzhou Xue. Discovery with Data: Leveraging Statistics with Computer Science to Transform Science and Society. American Statistical Association, July 2, 2014.
pdf

Interpretable Modeling and Modeling with Rules

Benjamin Letham, Cynthia Rudin, Tyler McCormick and David Madigan. Building Interpretable Classifiers with Rules using Bayesian Analysis. Available as tech report:
pdf bib python code

- Winner of Data Mining Best Student Paper Award, INFORMS 2013.
- Winner of Student Paper Competition sponsored by the Statistical Learning and Data Mining section (SLDM) of the American Statistical Association, 2014.

Shorter Version:
Benjamin Letham, Cynthia Rudin, Tyler McCormick and David Madigan. An Interpretable Stroke Prediction Model using Rules and Bayesian Analysis. Proceedings of AAAI Late Breaking Track, 2013.
pdf bib

Siong Thye Goh and Cynthia Rudin. Box Drawings for Learning with Imbalanced Data. Proceedings of 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2014.
pdf bib

Fulton Wang, Tyler McCormick, and Cynthia Rudin. Modeling Recovery Curves With Application to Prostatectomy. Tech report, 2014.
pdf

- Winner of Best Poster Competition, Statistical Learning and Data Mining section (SLDM) of the American Statistical Association, 2014.

Berk Ustun and Cynthia Rudin. Methods and Models for Interpretable Linear Classification. Available as tech report:
pdf

Berk Ustun, Stefano Traca, and Cynthia Rudin. Supersparse Linear Integer Models for Interpretable Classification. Available as tech report:
pdf bib

Shorter Version:
Berk Ustun, Stefano Traca, and Cynthia Rudin. Supersparse Linear Integer Models for Predictive Scoring Systems. Proceedings of AAAI Late Breaking Track, 2013.
pdf bib

Allison Chang, Cynthia Rudin, and Dimitris Bertsimas. Ordered Rules for Classification: A Discrete Optimization Approach to Associative Classification. Available on DSPACE here: OR 386-11
pdf bib

Cynthia Rudin, Benjamin Letham and David Madigan. Learning Theory Analysis for Association Rules and Sequential Event Prediction. Journal of Machine Learning Research (JMLR), Volume 14, pages 3385-3436, 2013.
pdf bib

Shorter Version:
Cynthia Rudin, Benjamin Letham, Ansaf Salleb-Aouissi, Eugene Kogan and David Madigan. Sequential Event Prediction with Association Rules. Proceedings of the 24th Annual Conference on Learning Theory (COLT), 2011.
pdf bib

Tyler McCormick, Cynthia Rudin, David Madigan. A Hierarchical Model for Association Rule Mining of Sequential Events: An Approach to Automated Medical Symptom Prediction. Annals of Applied Statistics, 2012.
pdf bib

Big Data, Knowledge Discovery, and Social Good Papers (with Accompanying Methodology/Theory)

Energy Grid Projects

Cynthia Rudin, David Waltz, Roger N. Anderson, Albert Boulanger, Ansaf Salleb-Aouissi, Maggie Chow, Haimonti Dutta, Philip Gross, Bert Huang, Steve Ierome, Delfina Isaac, Arthur Kressner, Rebecca J. Passonneau, Axinia Radeva, Leon Wu. Machine Learning for the New York City Power Grid. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, No. 2, February 2012. (Spotlight Paper for the February 2012 Issue.)
pdf bib

Cynthia Rudin, Rebecca Passonneau, Axinia Radeva, Haimonti Dutta, Steve Ierome, Delfina Isaac. A Process for Predicting Manhole Events in Manhattan. Machine Learning, Volume 80, pages 1-31, 2010.
pdf bib

Cynthia Rudin, Rebecca Passonneau, Axinia Radeva, Steve Ierome, Delfina Isaac. 21st-Century Data Miners Meet 19th-Century Electrical Cables. IEEE Computer, volume 44 no. 6, pages 103-105, June 2011. (One of three articles featured on the cover.)
pdf bib

Cynthia Rudin, Seyda Ertekin, Rebecca Passonneau, Axinia Radeva, Ashish Tomar, Boyi Xie, Stanley Lewis, Mark Riddle, Debbie Pangsrivinij, Tyler McCormick. Analytics for Power Grid Distribution Reliability in New York City. Interfaces, Accepted, 2014.
pdf bib

- Accompanies winning entry of the 2012-2013 INFORMS Innovative Applications in Analytics Award.

Seyda Ertekin, Cynthia Rudin, and Tyler McCormick. Predicting Power Failures with Reactive Point Processes. Proceedings of AAAI Late Breaking Track, 2013.
pdf bib
Longer Version, Accepted to Annals of Applied Statistics
pdf
supplement pdf

Ramin Moghaddass and Cynthia Rudin. Analytics Through a Latent State Hazard Model. Working Paper, June 2014.
pdf

Dingquan Wang, Rebecca J. Passonneau, Michael Collins, Cynthia Rudin. Modeling Weather Impact on a Secondary Electrical Grid. Proceedings of the 4th International Conference on Sustainable Energy Information Technology (SEIT-2014), 2014.
pdf

Boyi Xie, Rebecca J. Passonneau, Haimonti Dutta, Jing-Yeu Miaw, Axinia Radeva, Ashish Tomar, and Cynthia Rudin. Progressive Clustering with Learned Seeds: An Event Categorization System for Power Grid. Proceedings of the 24th International Conference on Software Engineering & Knowledge Engineering (SEKE), pages 100-105, 2012.
pdf bib

Rebecca Passonneau, Cynthia Rudin, Axinia Radeva, Ashish Tomar and Boyi Xie. Treatment Effect of Repairs to an Electrical Grid: Leveraging a Machine Learned Model of Structure Vulnerability. Proceedings of the KDD Workshop on Data Mining Applications in Sustainability (SustKDD), 17th Annual ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2011.
pdf bib

Rebecca Passonneau, Cynthia Rudin, Axinia Radeva, Zhi An Liu. Reducing Noise in Labels and Features for a Real World Dataset: Application of NLP Corpus Annotation Methods. Proceedings of the 10th International Conference on Computational Linguistics and Intelligent Text Processing, 2009.
pdf bib

Axinia Radeva, Cynthia Rudin, Rebecca Passonneau and Delfina Isaac. Report Cards for Manholes: Eliciting Expert Feedback for a Machine Learning Task. Proceedings of the International Conference on Machine Learning and Applications, 2009. (Winner of Best Poster Award.)
pdf bib

Haimonti Dutta, Cynthia Rudin, Becky Passonneau, Fred Seibel, Nandini Bhardwaj, Axinia Radeva, Zhi An Liu, Steve Ierome, Delfina Isaac. Visualization of Manhole and Precursor-Type Events for the Manhattan Electrical Distribution System. Workshop on GeoVisualization of Dynamics, Movement and Change, 11th AGILE International Conference on Geographic Information Science, 2008.
pdf bib

Leon Wu, Timothy Teravainen, Gail Kaiser, Roger Anderson, Albert Boulanger, and Cynthia Rudin. Estimation of System Reliability Using a Semiparametric Model. Proceedings of IEEE EnergyTech, 2011.
pdf bib Link

Leon Wu, Gail Kaiser, Cynthia Rudin, David Waltz, Roger Anderson, Albert Boulanger, Ansaf Salleb-Aouissi, Haimonti Dutta, and Manoj Poolery. Evaluating Machine Learning for Improving Power Grid Reliability. Proceedings of the ICML 2011 workshop on "Machine Learning for Global Challenges," International Conference on Machine Learning, 2011.
pdf bib

Leon Wu, Gail Kaiser, Cynthia Rudin, Roger Anderson. Data Quality Assurance and Performance Measurement of Data Mining for Preventive Maintenance of Power Grid. Proceedings of the KDD Workshop on Data Mining for Service and Maintenance (KDD4Service), 17th Annual ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2011.
pdf bib

This project is the winner of the 2013 INFORMS Innovative Applications in Analytics Award.

Popular press articles about this topic:

- Energy Daily article: MIT Sloan Professor's Ranking of Manholes Prioritizes Repairs and Maintenance
- Science News article: Machine vs. Manhole, appearing also in U.S. News and World Report, WIRED Science, Slashdot, Discovery News / Discovery Channel
- CIO Magazine: "Don't blow your top," Finish section, Sept 1 issue, 2010
- Featured in book Big Data: The Data Revolution by Viktor Mayer-Schonberger and Kenneth Cukier

Sports Analytics

Theja Tulabandhula and Cynthia Rudin. Tire Changes, Fresh Air and Yellow Flags: Challenges in Predictive Analytics for Professional Racing. Big Data, Volume 2, Number 2, June 2014.
pdf bib

Information Retrieval and Set Completion

Benjamin Letham, Cynthia Rudin and Katherine Heller. Growing a List. Data Mining and Knowledge Discovery (ECML-PKDD journal track).
pdf bib
Featured on Boston Public Radio (WGBH) "A New Way To Google"

Meeting Analysis

Been Kim and Cynthia Rudin. Learning About Meetings. Data Mining and Knowledge Discovery (ECML-PKDD journal track), February 2014.
pdf bib

Shorter version:
Been Kim and Cynthia Rudin. Machine Learning for Meeting Analysis. Proceedings of AAAI Late Breaking Track, 2013.
pdf bib
This paper was the topic of several popular press articles.

Crime Pattern Detection

Tong Wang, Cynthia Rudin, Daniel Wagner, and Rich Sevieri. Learning to Detect Patterns of Crime. Proceedings of ECML, 2013.
pdf bib
Shorter Version:
Tong Wang, Cynthia Rudin, Daniel Wagner, and Rich Sevieri. Detecting Patterns of Crime with Series Finder. Proceedings of AAAI Late Breaking Track, 2013.
pdf bib
This paper was the topic of several popular press articles.

Jonathan Huggins and Cynthia Rudin. A Statistical Learning Theory Framework for Supervised Pattern Discovery. Proceedings of SIAM Conference on Data Mining (SDM), 2014.
pdf bib

Tong Wang, Cynthia Rudin, Daniel Wagner, and Rich Sevieri. Finding Patterns with a Rotten Core: Data Mining for Crime Series with Core Sets. Accepted with minor revision to Big Data, 2014.
pdf bib talk slides

Collective Intelligence

Seyda Ertekin, Cynthia Rudin, Haym Hirsh. Approximating the Crowd. Data Mining and Knowledge Discovery (ECML-PKDD journal track), Accepted, 2014.
pdf
pdf - Appendix
bib

Seyda Ertekin, Haym Hirsh, Cynthia Rudin. Approximating the Wisdom of the Crowd. Proceedings of the Second Workshop on Computational Social Science and the Wisdom of Crowds (NIPS 2011).
pdf bib Haym's Slides

Seyda Ertekin, Haym Hirsh, Cynthia Rudin. Learning to Predict the Wisdom of Crowds. Proceedings of Collective Intelligence, 2011.
pdf bib

Machine Learning and Decision Making

Cynthia Rudin and Gah-Yi Vahn. The Big Data Newsvendor: Practical Insights from Machine Learning Analysis.
Working Paper.
pdf

Theja Tulabandhula and Cynthia Rudin. Machine Learning with Operational Costs. Journal of Machine Learning Research, Volume 14, pages 1989-2028, 2013.
pdf bib

Shorter version:
Theja Tulabandhula and Cynthia Rudin. The Influence of Operational Cost on Estimation. Proceedings of the International Symposium on Artificial Intelligence and Mathematics (ISAIM), 2012.
pdf bib

Theja Tulabandhula and Cynthia Rudin. On Combining Machine Learning with Decision Making. Machine Learning (ECML-PKDD journal track), Online First, June, 2014.
pdf

Shorter version:
Theja Tulabandhula, Cynthia Rudin, Patrick Jaillet. The Machine Learning and Traveling Repairman Problem. Proceedings of the Second International Conference on Algorithmic Decision Theory (ADT), 2011.
pdf bib

Theja Tulabandhula and Cynthia Rudin. Robust Optimization using Machine Learning for Uncertainty Sets. Proceedings of the International Symposium on Artificial Intelligence and Mathematics (ISAIM), 2014.
pdf bib
Longer Version: arXiv

Theja Tulabandhula and Cynthia Rudin. Generalization Bounds for Learning with Linear, Polygonal, Quadratic, and Conic Side Knowledge.
arXiv

Shorter version:
Theja Tulabandhula and Cynthia Rudin. Generalization Bounds for Learning with Linear and Quadratic Side Knowledge. Proceedings of the International Symposium on Artificial Intelligence and Mathematics (ISAIM), 2014.
pdf bib

Supervised Ranking

Benjamin Letham, Cynthia Rudin, and David Madigan. Sequential Event Prediction. Machine Learning, Volume 93, pages 357-380, 2013.
pdf bib

Allison Chang, Cynthia Rudin, Michael Cavaretta, Robert Thomas and Gloria Chou. How to Reverse-Engineer Quality Rankings.
Machine Learning. Volume 88, Issue 3, pp 369-398, September 2012.
pdf bib

Popular press articles about this topic:

- Businessweek article: How to Improve Product Rankings
- MIT Sloan Experts blog article: Product quality ratings: New research shows secret formulas yield questionable results

Seyda Ertekin and Cynthia Rudin. On Equivalence Relationships Between Classification and Ranking Algorithms. Journal of Machine Learning Research, Volume 12, pages 2905-2929, 2011.
pdf bib

Dimitris Bertsimas, Allison Chang, Cynthia Rudin. A Discrete Optimization Approach to Supervised Ranking. Proceedings of the 5th INFORMS Workshop on Data Mining and Health Informatics (DM-HI 2010), 2010.
pdf bib
Longer Version on DSPACE, paper OR 388-11 here. Finalist for Data Mining Best Student Paper Award, INFORMS 2011.
pdf bib

Cynthia Rudin. The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List. Journal of Machine Learning Research, Volume 10, pages 2233-2271, 2009.
pdf bib

Cynthia Rudin. Ranking with a P-Norm Push. Proceedings of the Nineteenth Annual Conference on Computational Learning Theory (COLT), pages 589-604, 2006.
pdf bib

Heng Ji, Cynthia Rudin, Ralph Grishman. Re-Ranking Algorithms for Name Tagging. In Proc. Human Language Technology conference - North American chapter of the Association for Computational Linguistics annual meeting (HLT-NAACL) Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing, 2006.
pdf bib

Cynthia Rudin and Robert E. Schapire. Margin-Based Ranking and an Equivalence Between AdaBoost and RankBoost. Journal of Machine Learning Research, Volume 10, pages 2193-2232, 2009.
pdf bib

Cynthia Rudin, Corinna Cortes, Mehryar Mohri, Robert E. Schapire. Margin-Based Ranking and Boosting Meet in the Middle. Proceedings of the Eighteenth Annual Conference on Computational Learning Theory (COLT), pages 63-78, 2005.
pdf bib

Convergence of Boosting Algorithms

Cynthia Rudin, Ingrid Daubechies, Robert E. Schapire. Does AdaBoost Always Cycle? JMLR: Workshop and Conference Proceedings, Published as a COLT Open Problem.
pdf bib

Indraneel Mukherjee, Cynthia Rudin, and Robert Schapire. The Rate of Convergence of AdaBoost. Proceedings of the Twenty-fourth Annual Conference on Learning Theory (COLT), 2011.
pdf bib

Indraneel Mukherjee, Cynthia Rudin, and Robert Schapire. The Rate of Convergence of AdaBoost. (Longer version of COLT paper) Journal of Machine Learning Research, Volume 14, pages 2315-2347, August 2013.
Link pdf bib

- Solved published open problem in COLT (Computational Learning Theory), 2013.

Cynthia Rudin, Ingrid Daubechies, Robert E. Schapire. The Dynamics of AdaBoost: Cyclic Behavior and Convergence of Margins. Journal of Machine Learning Research, 5 (Dec): 1557-1595, 2004.
pdf bib
- Solved well-known open theoretical problem as to whether AdaBoost attains maximum margins.

Cynthia Rudin, Ingrid Daubechies, and Rob Schapire. On the Dynamics of Boosting. Advances in Neural Information Processing Systems (NIPS) 16, 2003.
pdf bib

Cynthia Rudin, Robert E. Schapire, Ingrid Daubechies. Analysis of Boosting Algorithms using the Smooth Margin Function. Annals of Statistics, Volume 35, Number 6, pages 2723-2768, 2007.
pdf bib

Cynthia Rudin, Robert E. Schapire, and Ingrid Daubechies. Precise Statements of Convergence for AdaBoost and arc-gv. In Proc. AMS-IMS-SIAM Joint Summer Research Conference: Machine Learning, Statistics, and Discovery 131-145, 2007.
pdf bib

Cynthia Rudin, Robert E. Schapire, and Ingrid Daubechies. Boosting Based on a Smooth Margin. Proceedings of the Seventeenth Annual Conference on Computational Learning Theory (COLT), 2004.
pdf bib

Other Papers

Cynthia Rudin and Kiri L. Wagstaff. Machine Learning for Science and Society. Machine Learning, 2013.
pdf

Cynthia Rudin. Teaching "Prediction: Machine Learning and Statistics." Proceedings of the ICML Workshop on Teaching ML, 2012.
pdf bib

Ryan Roth, Owen Rambow, Nizar Habash, Mona Diab, and Cynthia Rudin. Arabic Morphological Tagging, Diacritization, and Lemmatization Using Lexeme Models and Feature Ranking. The 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL/HLT), 2008.
pdf bib

Cynthia Rudin. Stability of Learning algorithms. Computer Science arXiv.
pdf

Cynthia Rudin and Brian Spencer. Equilibrium Island Arrays in Strained Solid Films. Journal of Applied Physics, November 15, 1999 - Volume 86, Issue 10, pages 5530-5536.
pdf

Edited Collections

Eds. Peter Qian, Yilu Zhou, and Cynthia Rudin. Proceedings of the 2011 INFORMS Data Mining and Health Informatics (DM-HI) Workshop
pdf