Political Methodology Research Workshops

Spring 2017

12:00-2:00 p.m.
Millikan Room, E53-482, unless otherwise noted
Having taken Quant 1 (17.800) or the equivalent is sufficient preparation for all the sessions.

March 17 (NEW Venue: E51-095) - Introduction to Python and Web Scraping. Instructor: Soubhik Barari
Contemporary political science often draws on rich datasets from social media posts, government PDF documents, and public APIs. To collect and analyze large amounts of qualitative and quantitative data effectively, we need the right computational tools. In this workshop we will demonstrate how to use Python, a popular programming language for data mining, to efficiently scrape diverse web sources using tools like BeautifulSoup, mechanize, and PDFMiner.
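
As a rough illustration of the kind of scraping task described above, the sketch below pulls link text and URLs out of an HTML page. It is not the workshop's material: it uses Python's built-in html.parser module rather than BeautifulSoup so that it runs without extra installs, and the names LinkExtractor, extract_links, and SAMPLE_HTML are hypothetical, as is the sample page itself.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (link text, href) pairs from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (text, href) pairs
        self._href = None    # href of the <a> tag currently open, if any
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without an href leave _href as None and are skipped.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a list of (text, href) tuples found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content standing in for a real download.
SAMPLE_HTML = """
<html><body>
  <a href="/bills/hr1.pdf">H.R. 1</a>
  <a href="/bills/hr2.pdf">H.R. <b>2</b></a>
  <a name="top">No link here</a>
</body></html>
"""

if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(text, href)
```

In practice the HTML string would come from a web request, and a library such as BeautifulSoup handles malformed markup more gracefully than this minimal parser.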

April 7 - Using the HMDC Cluster for High-Performance Computing. Instructor: Soubhik Barari
When working with big data, how do you scale up computational resources for large-scale statistical analysis? This workshop will introduce the high-performance computing cluster maintained by HMDC (Harvard-MIT Data Center). Topics covered include setting up accounts, submitting jobs that require substantial computing power and memory, and parallel computing.

May 5 (Venue: E51-095) - Spatial Data: GIS Visualization and Analysis. Instructor: Justin de Benedictis-Kessner
In social science, spatial information is often an invaluable component of data. Statistics that incorporate this information, and the ability to visualize it in its geographic context, are excellent tools for the political scientist. This workshop will cover the basics of visualizing geographic data, the convenient tools available for geocoding, and how to do basic spatial statistics, all in R. Participants should be comfortable working in R and should have successfully installed the rgdal and rgeos packages ahead of time. This workshop will be in E51-095.

