PhD Candidate, MIT EECS
A computational understanding of image memorability
Images carry the attribute of memorability: a predictive measure of whether an image will be later remembered or forgotten. Understanding how image memorability works and what affects it has numerous applications, from better user interfaces and design to smarter image search and education tools. I am interested in gaining a better understanding of memorability from the ground up: to what extent is memorability consistent across individuals? How quickly can an image be forgotten? How can we model the effects of image context on memorability (can we make an image more memorable by changing its context)? Can we use people's eye movements and pupil dilations to make predictions about memorability? Read on for the answers to some of these questions.
Bylinskii, Z., Isola, P., Bainbridge, C., Torralba, A., Oliva, A. "Intrinsic and Extrinsic Effects on Image Memorability", Vision Research 2015 (in press).
Bylinskii, Z. "Computational Understanding of Image Memorability", MIT Master's Thesis 2015.
Vo, M., Gavrilov, Z., and Oliva, A. "Image Memorability in the Eye of the Beholder: Tracking the Decay of Visual Scene Representations". Vision Sciences Society (VSS) 2013.
What makes a visualization memorable, comprehensible, and effective?
A collaboration with Harvard University's visualization group, this line of work aims to understand how people interact with and perceive data visualizations (graphs, charts, infographics, etc.). We are interested in answering the questions: which visualizations are easily remembered and why? What information can people extract from visualizations? How can we measure comprehension? Does chart junk help or hinder understanding and memorability? What do people pay the most attention to?
Kim, N.W., Bylinskii, Z., Borkin, M.A., Oliva, A., Gajos, K.Z., and Pfister, H., 2015, "A Crowdsourced Alternative to Eye-tracking for Visualization Understanding", CHI'15 Extended Abstracts (accepted). [paper pdf] [poster]
Borkin, M., Vo, A., Bylinskii, Z., Isola, P., Sunkavalli, S., Oliva, A., and Pfister, H., 2013, "What Makes a Visualization Memorable?", IEEE Transactions on Visualization and Computer Graphics (Proceedings of InfoVis) 2013. [paper pdf] [supplemental material] [website] [media coverage]
More to come on this front!
I am currently running the MIT Saliency Benchmark. We are constantly updating the results page with new models, and will continue to refine the way models are evaluated to give the most accurate sense of progress in the field of saliency modeling.
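To give a flavor of what model evaluation involves, here is a minimal sketch of one widely used saliency metric, the similarity (SIM) score: both maps are normalized to sum to 1, and the elementwise minimum is summed, so identical distributions score 1 and non-overlapping ones score 0. This is an illustrative toy on flat lists, not the benchmark's official evaluation code.

```python
# Illustrative sketch of the similarity (SIM) metric for saliency maps.
# Not the benchmark's official implementation; inputs are flat lists here
# rather than 2D image maps for simplicity.

def sim_score(pred, truth):
    """pred, truth: flat lists of non-negative saliency values, same length."""
    ps, ts = sum(pred), sum(truth)
    p = [v / ps for v in pred]    # normalize each map to a distribution
    t = [v / ts for v in truth]
    # Histogram intersection: overlap between the two distributions.
    return sum(min(a, b) for a, b in zip(p, t))

identical = sim_score([1, 2, 3], [2, 4, 6])   # same distribution -> 1.0
disjoint = sim_score([1, 0], [0, 1])          # no overlap -> 0.0
```

Note that SIM, like any single metric, rewards a particular notion of agreement; part of refining the benchmark is understanding how metric choice affects model rankings.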
Bylinskii, Z., Judd, T., Borji, A., Itti, L., Durand, F., Oliva, A., and Torralba, A. "MIT Saliency Benchmark", available at: http://saliency.mit.edu
Bylinskii, Z., DeGennaro, E., Rajalingham, R., Ruda, H., Zhang, J. Tsotsos, J.K. "Towards the quantitative evaluation of visual attention models", Vision Research 2015 (in press).
Are all training examples equally valuable?
When learning a new concept, not all training examples prove equally useful: some have higher training value than others. The goal of this paper is to bring the following considerations to the attention of the vision community: (1) some examples are better than others for training detectors or classifiers, and (2) in the presence of better examples, some examples may negatively impact performance, and removing them may be beneficial. We propose an approach for measuring the training value of an example, and use it to rank and greedily sort examples. We test our methods on different vision tasks, models, datasets, and classifiers. Our experiments show that the performance of current state-of-the-art detectors and classifiers can be improved by training on a subset, rather than the whole training set.
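The core idea can be sketched with a toy experiment. This is a hypothetical illustration, not the paper's method: it scores each example by the change in validation accuracy when that example is removed, using a simple nearest-centroid classifier, and then prunes examples whose removal would help.

```python
# Hypothetical sketch: estimate each training example's value as the drop in
# validation accuracy when the example is removed (leave-one-out), then keep
# only examples with non-negative value. The classifier, data, and scoring
# rule are all illustrative stand-ins, not the paper's actual setup.
from collections import defaultdict

def centroid_classifier(train):
    """Nearest-centroid classifier; train is a list of (features, label)."""
    sums, counts = {}, defaultdict(int)
    for x, y in train:
        sums[y] = [a + b for a, b in zip(sums.get(y, [0.0] * len(x)), x)]
        counts[y] += 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}
    def predict(x):
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2
                                     for a, b in zip(x, centroids[y])))
    return predict

def accuracy(train, val):
    predict = centroid_classifier(train)
    return sum(predict(x) == y for x, y in val) / len(val)

def training_values(train, val):
    base = accuracy(train, val)
    # Value of example i = accuracy lost when i is removed (negative value
    # means removing the example actually improves validation accuracy).
    return [base - accuracy(train[:i] + train[i + 1:], val)
            for i in range(len(train))]

# Toy data: the last training point is mislabeled and drags down accuracy.
train = [([0.0, 0.0], 'a'), ([0.1, 0.0], 'a'), ([1.0, 1.0], 'b'),
         ([0.9, 1.0], 'b'), ([0.0, 0.1], 'b')]   # last point is mislabeled
val = [([0.35, 0.35], 'a'), ([0.95, 0.95], 'b')]

values = training_values(train, val)
pruned = [ex for ex, v in zip(train, values) if v >= 0]
```

On this toy set, the mislabeled point receives negative value and is pruned, and the classifier trained on the pruned subset beats the one trained on all examples, mirroring observation (2) above.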
Lapedriza, A., Pirsiavash, H., Bylinskii, Z., Torralba, A. "Are all training examples equally valuable?", arXiv (1311.6510 [cs.CV]) 2013.[paper pdf]
Detecting Reduplication in Videos of American Sign Language
Abstract: A framework is proposed for the detection of reduplication in digital videos of American Sign Language (ASL). In ASL, reduplication is used for a variety of linguistic purposes, including overt marking of plurality on nouns, aspectual inflection on verbs, and nominalization of verbal forms. Reduplication involves the repetition, often partial, of the articulation of a sign. In this paper, the apriori algorithm for mining frequent patterns in data streams is adapted for finding reduplication in videos of ASL. The proposed algorithm can account for varying weights on items in the apriori algorithm's input sequence. In addition, the apriori algorithm is extended to allow for inexact matching of similar hand motion subsequences and to provide robustness to noise. The formulation is evaluated on 105 lexical signs produced by two native signers. To demonstrate the formulation, overall hand motion direction and magnitude are considered; however, the formulation should be amenable to combining these features with others, such as hand shape, orientation, and place of articulation.
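The level-wise idea behind the apriori algorithm can be sketched on a toy motion stream. This is a simplified illustration, not the paper's formulation: it grows candidate subsequences one symbol at a time and keeps only those occurring often enough, exploiting the apriori property that any extension of an infrequent pattern is also infrequent. The symbols and thresholds are made up for the example; the paper additionally handles item weights and inexact matching.

```python
# Illustrative level-wise (apriori-style) mining of repeated subsequences in
# a stream of quantized hand-motion symbols. Repeated subsequences are the
# kind of signal that flags possible reduplication. The alphabet, data, and
# exact-match counting below are simplifications for illustration only.

def count_occurrences(stream, pattern):
    """Count (possibly overlapping) occurrences of pattern in stream."""
    n, m = len(stream), len(pattern)
    return sum(1 for i in range(n - m + 1) if stream[i:i + m] == pattern)

def frequent_subsequences(stream, min_count=2, max_len=4):
    alphabet = sorted(set(stream))
    frequent, level = [], [(s,) for s in alphabet]
    while level and len(level[0]) <= max_len:
        # Apriori pruning: only patterns meeting the support threshold are
        # extended to the next level, since supersequences of an infrequent
        # pattern cannot be frequent.
        survivors = [p for p in level
                     if count_occurrences(stream, p) >= min_count]
        frequent.extend(survivors)
        level = [p + (s,) for p in survivors for s in alphabet]
    return frequent

# Toy stream of quantized hand-motion directions: Up, Down, Left, Right.
motion = ('U', 'D', 'U', 'D', 'L', 'R')
patterns = frequent_subsequences(motion)
```

Here the repeated up-down stroke ('U', 'D') survives mining, while one-off motions do not, which is the kind of repetition the full framework detects (with weighted items and tolerance for inexact matches).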
Gavrilov, Z., Sclaroff, S., Neidle, C., Dickinson, S. "Detecting Reduplication in Videos of American Sign Language", Proc. Eighth International Conf. on Language Resources and Evaluation (LREC), 2012.
Skeletal Part Learning for Efficient Object Indexing
The goal of this project is to construct an indexing and matching framework operating on graph encodings of object shapes. A parts-based indexing mechanism offers greater robustness to occlusion and part articulation, while the graph-based representation provides angle and size invariance. The idea is to match object graphs pairwise to extract common recurring subgraphs, which then constitute the part vocabulary. Given a novel query object, its graph can be matched to the parts, which vote for object hypotheses. Classifiers can additionally be used to learn associations between object categories and object-to-part similarity values.
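The indexing and voting steps can be sketched as follows. This is a schematic illustration, not the project's implementation: the string "part signatures" stand in for the graph-encoded skeletal parts, and exact matching stands in for subgraph matching.

```python
# Schematic sketch of parts-based indexing and voting. Part names, the
# category vocabulary, and exact-match lookup are illustrative placeholders
# for graph-encoded skeletal parts and subgraph matching.
from collections import Counter

def build_index(category_parts):
    """Invert {category: set of part signatures} into {part: categories}."""
    index = {}
    for category, parts in category_parts.items():
        for part in parts:
            index.setdefault(part, set()).add(category)
    return index

def vote(index, query_parts):
    """Each part matched in the query votes for every category it indexes."""
    votes = Counter()
    for part in query_parts:
        for category in index.get(part, ()):
            votes[category] += 1
    return votes.most_common()   # ranked object hypotheses

# Toy part vocabulary shared across three object categories.
vocab = {'cup': {'handle', 'cylinder'},
         'mug': {'handle', 'cylinder', 'lid'},
         'bottle': {'cylinder', 'neck'}}
index = build_index(vocab)
ranked = vote(index, {'handle', 'cylinder'})
```

A query exposing a handle and a cylinder ranks cup and mug above bottle; in the full framework, the classifier stage would then refine these hypotheses using the object-to-part similarity values.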
Gavrilov, Z., Macrini, D., Zemel, R., Dickinson, S. "Skeletal Part Learning for Efficient Object Indexing", unpublished technical report, 2013.
I have been funded by the Natural Sciences and Engineering Research Council of Canada via: the Undergraduate Summer Research Award (2010-2012), the Julie Payette Research Scholarship (2013), and the Doctoral Postgraduate Scholarship (2014-ongoing).