Siong Thye Goh
Prediction Analysis Lab, MIT.

I am a graduate student at the Operations Research Center (ORC) at MIT, working under the supervision of Professor Cynthia Rudin of Duke University. My research interests include developing new interpretable machine learning algorithms and statistical methods, and studying statistical learning theory under the causal inference framework. I am also interested in applying submodular structure to improve the efficiency of advertising.

I work at the Prediction Analysis Lab at MIT, alongside many other talented individuals who are also interested in applying accurate yet interpretable machine learning techniques to social good applications, e.g. criminology and medical data.

I can be contacted at

My LinkedIn.

Causal Inference

A Minimax Surrogate Loss Approach to Causal Inference

Causal inference requires both the treatment and the control outcome for the same unit, yet only one of the two is ever observed. To overcome this problem, we propose surrogate loss functions that incorporate both treatment and control data. A specific choice of loss function, namely a type of hinge loss, yields a minimax support vector machine formulation. The approach requires solving only a single convex optimization problem that incorporates both treatment and control units, and it allows the kernel trick to be used for nonlinear (and nonparametric) estimation.

Siong Thye Goh and Cynthia Rudin (2017) A Minimax Surrogate Loss Approach to Causal Inference. Submitted. (pdf and supplement)

Density Estimation

Cascaded High Dimensional Histograms

Histograms are frequently used to understand the distribution of data when the dimension of the data is small. We discuss how to represent a high-dimensional histogram as either a tree or a list (a one-sided tree). Our models seek a balance between representing the data set accurately and keeping the model interpretable.

Siong Thye Goh and Cynthia Rudin (2016) Cascaded High Dimensional Histograms: A Generative Approach to Density Estimation. Submitted. (pdf)

Imbalanced Data Classification

Box Drawings for Learning with Imbalanced Data

We propose two machine learning algorithms to handle highly imbalanced classification problems. The constructed classifiers are unions of axis-parallel rectangles around the positive examples, and thus have the benefit of being interpretable.
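As a rough illustration of why such a classifier is easy to read, the sketch below shows only the prediction rule for a union of axis-parallel boxes. It does not reproduce how the two algorithms learn the boxes; the `Box` representation, the function names, and the example bounds are assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: a box is one (lower, upper) pair per feature.
Box = List[Tuple[float, float]]


def in_box(x: Sequence[float], box: Box) -> bool:
    """Return True if every coordinate of x lies within the box's bounds."""
    return all(lo <= xi <= hi for xi, (lo, hi) in zip(x, box))


def predict(x: Sequence[float], boxes: List[Box]) -> int:
    """Label x as positive (+1) if it falls in any box, otherwise negative (-1)."""
    return 1 if any(in_box(x, b) for b in boxes) else -1


# Two illustrative boxes in a 2-D feature space (made-up bounds).
boxes = [
    [(0.0, 1.0), (0.0, 1.0)],
    [(2.0, 3.0), (2.0, 3.0)],
]

print(predict((0.5, 0.5), boxes))  # inside the first box
print(predict((1.5, 1.5), boxes))  # outside both boxes
```

Each box reads as a conjunction of simple range conditions on the features, so the whole classifier can be stated as a short list of "if feature is between ... and ..." rules.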

Siong Thye Goh and Cynthia Rudin (2014) Box Drawings for Learning with Imbalanced Data. KDD 2014: 333-342. (pdf)

Using Fast Boxes to Predict Gas Turbine Failure

In practice, we rarely know when a machine begins to malfunction. We propose a method to label such unlabeled data. The Fast Boxes algorithm is then applied to predict gas turbine failures.

Siong Thye Goh, Xinmin Cai, Chao Yuan, Amit Chakraborty, and Matthew Evans (2016) Gas Turbine Failure Prediction Utilizing Supervised Learning Methodologies. Patent WO2016040085A1. (link)

Sparse Coding

Sparse Coding for Faulty Sensor Detection using L-1 norm on the Residual

We apply sparse coding to the task of anomaly detection. In particular, we propose post-processing steps that take the time component into consideration to reduce the number of false positives. Our method performs better than a one-class SVM on this task.
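The post-processing steps themselves are not described on this page. Purely as an illustration of how a time component can cut false positives, the sketch below assumes a simple persistence rule: score each time step by the L1 norm of its reconstruction residual and raise an alarm only when the score stays above a threshold for several consecutive steps. The function names, the threshold, and the run length are hypothetical and are not taken from the patent.

```python
from typing import List, Sequence


def l1_norm(residual: Sequence[float]) -> float:
    """L1 norm of one residual vector (signal minus its sparse reconstruction)."""
    return sum(abs(r) for r in residual)


def persistent_alarms(
    residuals: Sequence[Sequence[float]], threshold: float, min_run: int
) -> List[bool]:
    """Flag time steps whose L1 score exceeds `threshold` for >= `min_run` steps in a row.

    Assumed rule for illustration only: isolated spikes are treated as noise.
    """
    scores = [l1_norm(r) for r in residuals]
    alarms = [False] * len(scores)
    run_start = None
    # The trailing sentinel closes a run that lasts until the final time step.
    for t, score in enumerate(scores + [float("-inf")]):
        if score > threshold:
            if run_start is None:
                run_start = t
        else:
            if run_start is not None and t - run_start >= min_run:
                for i in range(run_start, t):
                    alarms[i] = True
            run_start = None
    return alarms


# Made-up residuals: one isolated spike (t=1) and one sustained deviation (t=3..5).
residuals = [
    [0.1, 0.0], [2.0, 1.0], [0.1, 0.1],
    [1.5, 1.0], [2.0, 0.5], [1.0, 1.5], [0.0, 0.2],
]
print(persistent_alarms(residuals, threshold=2.0, min_run=2))
# [False, False, False, True, True, True, False]
```

The isolated spike at t=1 is suppressed, while the sustained deviation is kept, which is the kind of trade-off a time-aware filter is meant to provide.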

Siong Thye Goh, Chao Yuan, Amit Chakraborty, and Matthew Evans (2016) Gas Turbine Sensor Failure Detection Utilizing a Sparse Coding Methodology. Patent WO2016040082A1. (link)

Linear Discriminant Analysis

Null Space Based Linear Discriminant Analysis

Null space based linear discriminant analysis is a common tool for dimension reduction in pattern recognition. Previously, the standard way to perform null space based linear discriminant analysis was to compute a singular value decomposition (SVD), which is slow. We present a new implementation of null space based linear discriminant analysis that does not perform any SVD. The main cost comes from an economy-size QR decomposition with column pivoting, which is much faster than the previous implementation.

Delin Chu and Siong Thye Goh (2010) A New and Fast Implementation for Null Space Based Linear Discriminant Analysis. Pattern Recognition, Vol 34, Issue 4, April 2010, Pages 1373-1379. (paper)

Orthogonal Linear Discriminant Analysis

Traditional LDA (linear discriminant analysis) computations require taking matrix inverses. However, when the problem is undersampled, the inverse of the total scatter matrix is not well defined. Various generalizations of the LDA algorithm exist, but they require the inversion of matrices and the computation of singular value decompositions. We propose a method that performs these computations with only orthogonal transformations, so it is inverse-free and numerically stable.

Delin Chu and Siong Thye Goh (2010) A New and Fast Orthogonal Linear Discriminant Analysis on Undersampled Problems. SIAM J. Sci. Comput., 32(4), 2274-2297. (pdf)

Uncorrelated Linear Discriminant Analysis

We find all solutions to the uncorrelated linear discriminant analysis (ULDA) problem and parametrize them explicitly. Furthermore, we propose new, fast algorithms that compute ULDA without performing a singular value decomposition.

Delin Chu, Siong Thye Goh, and Y.S. Hung (2011) Characterization of All Solutions for Undersampled Uncorrelated Linear Discriminant Analysis Problems. SIAM J. Matrix Anal. Appl., Vol 32, No. 3, pages 820-844. (pdf)


Even-Variable Balanced Boolean Functions with Optimal Algebraic Immunity

We construct six infinite classes of balanced Boolean functions. These six classes of Boolean functions achieve optimal algebraic degree, optimal algebraic immunity, and high nonlinearity. We also prove a lower bound on the nonlinearity of these balanced Boolean functions and prove a better lower bound on the nonlinearity of Carlet-Feng's Boolean function.

Chik-How Tan and Siong-Thye Goh (2011) Several Classes of Even-Variable Balanced Boolean Functions with Optimal Algebraic Immunity. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol.E94-A No.1 pp.165-171. (link)

Determining All Permutations with Linear Translators

Let G be a function on a finite field and consider functions F of the form F = G + aH, where a is a non-zero constant and H is a trace function. We characterize the conditions under which F is a permutation polynomial when G is (1) a permutation polynomial, (2) a k-to-1 function, or (3) a k-even function. Cases (2) and (3) positively answer the open problem proposed by Charpin and Kyureghyan. The technique can be generalized to obtain permutations on any finite commutative ring with identity, where H is a function from the ring to a subring.
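To make the form F = G + aH concrete, the toy sketch below brute-forces the smallest interesting case of (1): G(x) = x and H the absolute trace on GF(8). It only enumerates which constants a yield a permutation and does not reproduce the characterization itself. The choice of field, the modulus x^3 + x + 1, and all function names are assumptions made for this example.

```python
# Elements of GF(8) are encoded as integers 0..7 (bit i = coefficient of x^i).
MODULUS = 0b1011  # x^3 + x + 1, irreducible over GF(2); chosen for this sketch


def gf8_mul(a: int, b: int) -> int:
    """Multiply two GF(8) elements by shift-and-add with modular reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if a & 0b1000:  # degree reached 3, so reduce by the modulus
            a ^= MODULUS
    return result


def trace(x: int) -> int:
    """Absolute trace Tr(x) = x + x^2 + x^4, which takes values in {0, 1}."""
    x2 = gf8_mul(x, x)
    x4 = gf8_mul(x2, x2)
    return x ^ x2 ^ x4


def make_f(a: int):
    """Build F(x) = G(x) + a*H(x) with G(x) = x and H = Tr (toy choices)."""
    return lambda x: x ^ gf8_mul(a, trace(x))


def is_permutation(f) -> bool:
    """Check by exhaustion whether f maps GF(8) onto itself bijectively."""
    return len({f(x) for x in range(8)}) == 8


good = [a for a in range(1, 8) if is_permutation(make_f(a))]
print(good)  # [2, 4, 6]
```

In this toy case the constants that work are exactly the non-zero elements with Tr(a) = 0, which matches the elementary fact that the linear map x + a·Tr(x) has a non-trivial kernel only when Tr(a) = 1.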

Guang Gong, Siong Thye Goh, and Yin Tan (2016) Determining Permutation Polynomials with Linear Translator. Submitted.

Teaching Experience

I worked as a teaching assistant in the Department of Mathematics at the National University of Singapore while pursuing my Master's degree in Mathematics. I have completed the teaching assistant training program at the Center for Development of Teaching and Learning, NUS. I TA-ed the following classes:
I received the Best Teaching Assistant Awards from the Department of Mathematics and Faculty of Science for each of the classes.
I sometimes answer questions on Mathematics Stack Exchange and Cross Validated.
My CV can be found here.