Concepts in Multicore Programming
This course is in development for 2013 or beyond. The description below should be taken as an example of content and is subject to change. If you are interested in this course, please sign up for email updates and/or check this course page at a later date.
The Multicore Software Challenge
An irreversible shift towards multicore x86 processors is underway. Building multicore processors delivers on the promise of Moore's Law, but it creates an enormous problem for developers. Multicore processors are parallel computers, and parallel computers are notoriously difficult to program.
We are already seeing desktop and laptop systems based on dual- and quad-core mainstream processors from Intel and AMD, and the number of cores per processor is poised to explode over the next 3 years (Intel is forecasting processors with 80+ cores by 2011).
To develop and deploy reliable applications that fully leverage the new microprocessors, many applications must be written, or rewritten, as parallel or multithreaded applications. This development is difficult, expensive, time-consuming, and error-prone, and it requires new programming skill sets. Organizations need a solution to this multicore software challenge.
This hands-on tutorial is an introduction to key multicore programming concepts impacting development time, performance and reliability. Attendees will learn the fundamental issues in the design of concurrent programs and be introduced to the techniques necessary to make effective use of multicore systems. Both instructors will be present throughout the short course, providing an opportunity for rich interaction during the hands-on labs.
All attendees will receive:
- A copy of the “Multithreaded Algorithms” chapter from the new edition of Introduction to Algorithms
- A copy of the book How to Survive the Multicore Software Revolution (or at Least Survive the Hype)
Fundamentals: Core concepts, understandings and tools (40%)
Latest Developments: Recent advances and future trends (20%)
Industry Applications: Linking theory and real-world (40%)
Lecture: Delivery of material in a lecture format (70%)
Discussion or Groupwork: Participatory learning (10%)
Labs: Demonstrations, experiments, simulations (20%)
Introductory: Appropriate for a general audience (40%)
Specialized: Assumes experience in practice area or field (50%)
Advanced: In-depth explorations at the graduate level (10%)
The participants of this course will be able to:
- Understand key concepts in multicore application development.
- Identify the key questions to ask when going multicore.
- Build and debug multicore-ready applications.
Who Should Attend
This course should appeal to software developers, software architects, software team leaders, and project managers.
Prerequisites: C++ programming experience. A multicore laptop running either Windows (32-bit) with Visual Studio or Linux/x86 (32- or 64-bit) is desired. (Because participants will work in groups of two, only one of each pair actually needs a laptop.) Instructors will provide limited access to a 16-core computer. Prior experience with multicore programming is not required.
Goals for the day: Understand multicore architecture trends (Moore’s law, chip multiprocessors, etc.). Exhibit the ability to compile and run basic Cilk++ programs. Display a hands-on knowledge of basic multicore-programming concepts, including nested and loop parallelism, serial semantics, and race conditions. Describe performance concepts, including work, span, and parallelism. Show an understanding of the practical implications of elementary scheduling theory.
9:00 am Module 1 (1.5 hours) The multicore-software challenge
- Technology trends
- Problems amenable to parallelism
- Chip multiprocessors and cache consistency
- Leading multicore concurrency platforms, including Pthreads/WinAPI threads, OpenMP, Threading Building Blocks, and Cilk++
- Program correctness and race conditions
10:30 am Module 2 (1.5 hours) Lab: Introduction to parallel programming
- Parallelizing quicksort
- Cilkscreen race detector
- Cilk performance analyzer
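To give a flavor of the quicksort exercise, the following is a minimal sketch of nested (fork-join) parallelism written in standard C++ with std::async rather than Cilk++. It is not the lab's actual code: the function name parallel_quicksort and the cutoff parameter are hypothetical and chosen only for illustration. The two recursive calls work on disjoint subranges, which is what keeps the sketch free of race conditions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Sort data[lo, hi) in place. Subranges shorter than `cutoff` are sorted
// serially so that very small tasks do not each pay for a new thread.
void parallel_quicksort(std::vector<int>& data, std::size_t lo, std::size_t hi,
                        std::size_t cutoff = 10000) {
    assert(lo <= hi && hi <= data.size());
    if (hi - lo < 2) return;
    if (hi - lo < cutoff) {
        std::sort(data.begin() + lo, data.begin() + hi);
        return;
    }

    // Three-way partition around the middle element:
    // [lo, left_end) < pivot, [left_end, right_begin) == pivot, [right_begin, hi) > pivot.
    const int pivot = data[lo + (hi - lo) / 2];
    auto first = data.begin() + lo;
    auto last = data.begin() + hi;
    auto mid1 = std::partition(first, last, [pivot](int x) { return x < pivot; });
    auto mid2 = std::partition(mid1, last, [pivot](int x) { return x == pivot; });
    const std::size_t left_end = static_cast<std::size_t>(mid1 - data.begin());
    const std::size_t right_begin = static_cast<std::size_t>(mid2 - data.begin());

    // "Spawn" the left half on another thread, sort the right half on this
    // thread, then wait ("sync") for the left half before returning.
    auto left = std::async(std::launch::async, [&data, lo, left_end, cutoff] {
        parallel_quicksort(data, lo, left_end, cutoff);
    });
    parallel_quicksort(data, right_begin, hi, cutoff);
    left.get();
}
```

A call such as parallel_quicksort(v, 0, v.size()) sorts the whole vector. Note that std::launch::async may start one operating-system thread per spawn, whereas a platform such as Cilk++ schedules spawned work onto a fixed pool of workers, so the cutoff matters more in this sketch than it would there.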
12:00 noon Lunch (1 hour)
1:00 pm Module 3 (1.5 hours) Parallelism and performance
- Nested and loop parallelism
- Serial semantics and composability
- Programming examples
- Measures of work, span, and parallelism
- Scheduling theory
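For orientation, the performance measures named in this module are conventionally defined as follows. The symbols $T_1$, $T_\infty$, $T_P$, and $P$ are the usual textbook notation and are an assumption here, since this outline does not fix a notation; the numbers in the worked example are made up for illustration.

```latex
\begin{align*}
  T_1      &= \text{work: running time on one processor} \\
  T_\infty &= \text{span: running time on unboundedly many processors (the critical path)} \\
  T_P      &\ge T_1 / P             && \text{(work law)} \\
  T_P      &\ge T_\infty            && \text{(span law)} \\
  \text{parallelism} &= T_1 / T_\infty \\
  T_P      &\le T_1 / P + T_\infty  && \text{(any greedy scheduler)}
\end{align*}
% Worked example with hypothetical numbers: T_1 = 1000, T_\infty = 10, P = 16.
% Parallelism = 1000 / 10 = 100, comfortably above P = 16.
% Greedy bound: T_{16} \le 1000/16 + 10 = 72.5, so speedup T_1 / T_{16} \ge 1000 / 72.5 \approx 13.8.
```

The practical reading is that when the parallelism $T_1/T_\infty$ greatly exceeds the number of processors $P$, the $T_1/P$ term dominates the bound and near-linear speedup is achievable.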
2:30 pm Module 4 (2 hours) Lab: Matrix multiplication
4:30 pm Module 5 (0.5 hours) How the Cilk++ concurrency platform works
- Work stealing
5:00 pm Adjourn
Goals for the day: Display a familiarity with advanced parallel programming concepts, such as locking, deadlock, synchronizing through memory, and reducers. Show an ability to deal with hurdles to parallelization, including insufficient parallelism, loop-carried dependences, grain size, burdened parallelism, memory bandwidth, nondeterminism, and legacy threading.
9:00 am Module 6 (1.5 hours) Nonlocal variables and synchronization
- Global and nonlocal variables
- Deadlock, lock contention, convoying
- Synchronizing through memory
- Memory models
- Reducer hyperobjects
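The idea behind a reducer can be sketched in standard C++ without any Cilk++ machinery: instead of having every thread update one shared nonlocal variable (which would be a race without a lock), each worker accumulates into a private view, and the views are combined with an associative operation once the workers have finished. The function name reduce_sum and its parameters below are hypothetical, and this hand-rolled version only approximates what a reducer hyperobject does automatically.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sum `values` using `num_workers` threads. Each worker writes only to its
// own slot in `views`, so no lock is needed; the slots are combined after
// every worker has been joined.
long long reduce_sum(const std::vector<int>& values, std::size_t num_workers) {
    assert(num_workers > 0);
    std::vector<long long> views(num_workers, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (values.size() + num_workers - 1) / num_workers;

    for (std::size_t w = 0; w < num_workers; ++w) {
        const std::size_t begin = std::min(w * chunk, values.size());
        const std::size_t end = std::min(begin + chunk, values.size());
        workers.emplace_back([&values, &views, w, begin, end] {
            long long local = 0;  // private accumulator for this worker
            for (std::size_t i = begin; i < end; ++i) local += values[i];
            views[w] = local;     // single write to this worker's own view
        });
    }
    for (auto& t : workers) t.join();

    // Combine the views. Addition is associative, so the result does not
    // depend on how the input was split among workers.
    return std::accumulate(views.begin(), views.end(), 0LL);
}
```

Accumulating into a local variable and writing the view once also avoids repeated writes to neighboring elements of `views` from different threads, which could otherwise cost performance through cache-line sharing.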
10:30 am Module 7 (1.5 hours) Lab: Nonlocal variables and reducers
12:00 noon Lunch (1 hour)
1:00 pm Module 8 (1 hour) Practical issues in parallelization
- Lack of parallelism
- Loop-carried dependences
- Grain size
- Burdened parallelism
- Memory bandwidth
- Legacy threading
2:00 pm Module 9 (2 hours) Lab: Overcoming parallelization hurdles
4:00 pm Module 10 (1 hour) Multicore Jeopardy!
5:00 pm Adjourn
About The Lecturers
Charles E. Leiserson
Charles is Professor of Computer Science and Engineering at MIT, member of the Theory of Computation research group in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), head of CSAIL's Supercomputing Technologies research group, and founder and CTO of Cilk Arts. His research centers on developing theoretical principles of parallel and distributed computing, especially as they relate to engineering reality. Charles' textbook, Introduction to Algorithms, coauthored with Ronald L. Rivest, Thomas H. Cormen, and Clifford Stein, is the leading textbook on computer algorithms and the most cited reference in all of computer science. Charles was network architect for the Connection Machine Model CM-5 Supercomputer manufactured by Thinking Machines Corporation and Director of System Architecture at Akamai Technologies, where he directly managed a 35-person engineering team that developed a worldwide content-distribution network that now numbers over 20,000 servers. He is a MacVicar Faculty Fellow at MIT, an ACM Fellow, and a senior member of IEEE and SIAM.
Pablo Halpern is Member of the Technical Staff at Cilk Arts. He is a member of the C++ Standards Committee, and the author of the popular book The C++ Standard Library From Scratch. Pablo’s expertise includes C++ programming, language and compiler design, Linux device drivers, embedded systems, network management, and implementation of command-line interfaces. Pablo has spent nearly two decades in software development at Bloomberg L.P., Hewlett Packard, Polygen, Wang Laboratories, Millipore Corporation, BMC Software, and Intersolv. During this time, Pablo has also developed and taught beginning and advanced C++ courses.
This course takes place on the MIT campus in Cambridge, Massachusetts. We can also offer this course for groups of employees at your location. Please contact the Short Programs office for further details.