MEMORANDUM

 

To:       All programmers of Measurement Studio and Automation Division

From:    Widagdo Setiawan

             Sam Madden 11AM 6.033 recitation

Date:    March 2, 2006

Subject: Plans for reducing race condition bugs in multithreaded modules

 

Problems

Multithreaded modules are used intensively in our software products, yet until now we have never had well-defined rules for multithreaded algorithms. The RaceTrack paper [1] from Microsoft Research makes it evident that, even after years of extensive code review, subtle race condition bugs still reside in both their Visual Studio Library and Common Language Runtime modules. Therefore, to avoid or reduce similar problems in our software products, I have decided to enforce the following rules whenever a thread-safe module is constructed.

Rules

1.       Use standard locking mechanisms

All three bugs described in the RaceTrack paper arose because the developers did not use a standard locking mechanism to prevent race conditions. The standard locking mechanisms in the .NET Framework, by contrast, are known to be very effective at preventing them. Therefore, to reduce race condition bugs, programmers must use one of the standard .NET locking mechanisms whenever possible, unless profiling shows that the locking mechanism has become the bottleneck of the module.
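To illustrate Rule 1, here is a minimal sketch of a shared counter guarded by a standard lock. It is written in Python only for brevity: `threading.Lock` with a `with` block is assumed here to play the same role as the `lock` statement or the `Monitor` class in .NET. The names `Counter` and `run_workers` are hypothetical and do not come from any of our modules.

```python
import threading


class Counter:
    """Hypothetical shared counter; every access goes through one standard lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        # The read-modify-write below is not atomic on its own,
        # so it is performed while holding the lock.
        with self._lock:
            self._value += 1

    def value(self):
        # Reads take the same lock so they never observe a partial update.
        with self._lock:
            return self._value


def run_workers(counter, num_threads=8, increments_per_thread=10_000):
    """Increment the counter from several threads and return the final value."""

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value()
```

With the lock in place, `run_workers(Counter())` should always return `num_threads * increments_per_thread`. Removing the `with self._lock:` line in `increment` would make the result depend on thread scheduling, which is the kind of bug this rule is meant to prevent.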

2.       When using a non-standard or specialized locking mechanism, programmers must write documentation of the module's behavior and submit it to our thread analysis committee.

As we all know, threading is not a trivial problem, and using a non-standard or specialized locking mechanism only increases the likelihood of race condition bugs. Such programs are harder to understand because the locking mechanism is unique to that module. Furthermore, since programmers will not have been exposed to the specialized locking mechanism before, the risk of introducing race condition bugs increases. Therefore, programmers must explicitly document how the specialized locking mechanism works and how the program behaves under different conditions. The thread analysis committee will then review this documentation for possible race conditions in the specialized locking mechanism.

Although these rules may seem demanding, I deem them necessary because our products are used in many critical control systems, where a race condition could cause a catastrophic failure.

 

[1] Y. Yu, T. Rodeheffer, and W. Chen. RaceTrack: Efficient Detection of Data Race Conditions via Adaptive Tracking. ACM 1-59593-079-5/05/0010, October 2005.