Sunday, October 9, 2011

Thread articles

I am starting a series of articles on threads and thread safety.

Thread safety, volatile and synchronized

In today's world we can barely write a complex app without using more than one thread. You know: the user interface should not freeze, your computer has more than one core and you want to use their full power, and so on. The point is that as soon as a second thread enters the picture, we programmers are out of our comfort zone. The program flow is no longer straightforward. If you are fairly new to threads, you will ask:

1. What is thread safety all about?
Thread safety is all about shared state: state that is used by more than one thread at the same time and that at least one of them modifies. This is all you have to care about. A typical example is a variable that one thread writes and another thread reads. Sounds simple enough, right?

Since an example is worth a thousand words, take a look at this class:
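public class LongRunningTask {
      // Shared state: written by the thread that calls stop(),
      // read by the thread that runs work().
      private boolean shouldStop;

      public void work() {
           while (!shouldStop) {
                 launchRocketToTheMoon();
           }
      }

      public void stop() {
           shouldStop = true;
      }

      private void launchRocketToTheMoon() {
           // Placeholder for the real work.
      }
}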

Let's say that 100 rockets have been launched to the moon and you are tired of it. The work() method is running in its own thread, so you call stop() from another thread to end the process. The scary thing is that there is no guarantee that your program will ever stop launching rockets. Why?

Lesson learned 1: If two threads access the same variable without any synchronization, there is no guarantee that a change made by one thread will ever be seen by the other.

The reason is that, for efficiency, a value can be cached at different levels: the JIT compiler may keep it in a register or read it only once before the loop, and each processor core has its own caches. As a result the second thread may keep receiving stale data. So, what is the solution?

2. Apply the volatile modifier to the variable.
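When you use the volatile modifier you are basically saying: always give me the latest value written by any thread, don't even try to get it from a cache. More precisely, once a thread has written to a volatile variable, every later read of that variable from any thread is guaranteed to see that write.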


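The following version of the class fixes the above deficiency:
public class LongRunningTask {
      // volatile: a write made by stop() is visible to the loop in work().
      private volatile boolean shouldStop;

      public void work() {
           while (!shouldStop) {
                 launchRocketToTheMoon();
           }
      }

      public void stop() {
           shouldStop = true;
      }

      private void launchRocketToTheMoon() {
           // Placeholder for the real work.
      }
}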

Lesson learned 2: Use the volatile keyword to prevent receiving stale data.
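
To see the volatile flag in action, here is a minimal sketch of how such a task might be driven from two threads. The StopFlagDemo class, the runAndStop helper, the launches field and the 50 ms pause are all made up for this illustration; the launches counter simply stands in for launchRocketToTheMoon().

```java
class StopFlagDemo {

    // Simplified copy of the article's LongRunningTask.
    static class LongRunningTask {
        private volatile boolean shouldStop;
        private long launches; // touched only by the worker thread

        void work() {
            while (!shouldStop) {
                launches++; // stand-in for launchRocketToTheMoon()
            }
        }

        void stop() {
            shouldStop = true;
        }
    }

    // Starts work() on a worker thread, asks it to stop from the calling
    // thread, and reports whether the worker finished within the timeout.
    static boolean runAndStop(long timeoutMillis) {
        LongRunningTask task = new LongRunningTask();
        Thread worker = new Thread(task::work);
        worker.setDaemon(true); // do not keep the JVM alive if it never stops
        worker.start();
        try {
            Thread.sleep(50);          // let the worker spin for a moment
            task.stop();               // volatile write from this thread
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            task.stop();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runAndStop(2000));
    }
}
```

If you remove the volatile modifier from shouldStop, the same program may or may not terminate, depending on the JVM and the hardware. That is exactly the kind of uncertainty we want to avoid.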

At this point you may say: "Ah, clear, so we put volatile in front of every variable used by more than one thread, and we are done with thread safety." Totally wrong. The volatile modifier only guarantees that you won't be reading stale data. This is sufficient only in the most basic cases (like the rocket example discussed above). In more complex situations, the so-called race conditions come to bite you.

3. Race conditions and the synchronized keyword:

Take a look at this simple counter class:
public class IdCounter {
     private volatile int counter;

     public int increment() {
          counter++;
          return counter;
     }
}

Although the class looks innocent, it will fail miserably if used by more than one thread simultaneously. If you run two threads and call the increment method from both of them, sooner or later you won't get the sequence 1, 2, 3, 4, 5, 6, ... but something different like 1, 2, 2, 3, 4, 5, ...

You ask: "Why is that? I am not receiving stale data. I am using volatile."

Let's take a closer look at the counter++ line. It is equivalent to counter = counter + 1. This simple line does three things: it reads the old value of the counter, adds 1 to it, and writes the new value back to the counter. Here is the catch: nothing stops the second thread from reading the old value after the first thread has read it but before the first thread has written its result. If both threads read the old value 1, the first one computes counter = oldValue + 1 = 1 + 1 = 2, and the second one also computes counter = oldValue + 1 = 1 + 1 = 2. So we called increment two times and received the same number. We are totally screwed. This is a classic example of a race condition. The two threads are participating in a race and, depending on their position in the race, different unpredictable things can happen. Race conditions are difficult to debug because they are not reliably reproducible. You run the code for the first time and it works correctly. You run the code for the second time and it doesn't. It is a nightmare.
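
The unlucky interleaving described above can be replayed by hand in a single thread. The following is only a sketch: LostUpdateDemo and the variables readByA and readByB are hypothetical names, and the "threads" A and B are simulated by the order of the statements, so the result is deterministic.

```java
class LostUpdateDemo {

    // Replays two counter++ calls whose steps overlap.
    static int simulateInterleaving() {
        int counter = 1;           // shared value before both calls

        int readByA = counter;     // thread A, step 1: read the old value (1)
        int readByB = counter;     // thread B, step 1: read the old value (1)

        counter = readByA + 1;     // thread A, steps 2 and 3: add, write back
        counter = readByB + 1;     // thread B, steps 2 and 3: add, write back

        return counter;            // one of the two increments is lost
    }

    public static void main(String[] args) {
        System.out.println("counter after two increments: "
                + simulateInterleaving());
    }
}
```

Two increments starting from 1 should end at 3, but both simulated threads computed their result from the same old value, so the counter ends at 2 and both callers would receive the same ID.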

So how do we solve the problem? The synchronized keyword comes to the rescue.
The synchronized keyword can guard either a whole method or only a block of code.
When you apply synchronized to an instance method, only one thread at a time can be inside it for a given object: a second thread that tries to enter the method on the same object has to wait until the first thread exits.

Here is the fixed thread-safe code:
public class IdCounter {
     private int counter;

     public synchronized int increment() {
          counter++;

          return counter;
     }
}

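When one thread is inside increment() and a second thread calls it on the same IdCounter object, the second thread waits for the first one to leave the method before entering it. So there is no danger for the counter now: the read, the addition and the write happen as one unit, so the increment is effectively atomic. Notice also that the volatile modifier was removed. This is because synchronized also gives a visibility guarantee: everything a thread wrote before leaving a synchronized method or block is visible to the next thread that enters a method or block synchronized on the same object. So synchronized also does what volatile does, as long as every access to the variable goes through the same lock.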
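
The block form of synchronized mentioned earlier can be sketched like this, together with a small check that no increments get lost. Treat it as an illustration under assumptions: the private lock object, the current() method and the countWithThreads helper are not part of the original class and are introduced here only for the demo.

```java
class SynchronizedCounterDemo {

    // Variant of IdCounter that guards the counter with a synchronized block.
    static class IdCounter {
        private final Object lock = new Object(); // hypothetical private lock
        private int counter;

        int increment() {
            synchronized (lock) {   // only one thread at a time gets past here
                counter++;
                return counter;
            }
        }

        int current() {
            synchronized (lock) {   // reads go through the same lock
                return counter;
            }
        }
    }

    // Starts the given number of threads, lets each call increment()
    // incrementsPerThread times, and returns the final counter value.
    static int countWithThreads(int threads, int incrementsPerThread) {
        IdCounter ids = new IdCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    ids.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();      // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return ids.current();
    }

    public static void main(String[] args) {
        System.out.println("final counter: " + countWithThreads(4, 10000));
    }
}
```

With four threads doing 10000 increments each, the final value is 40000 on every run, because no two threads can be inside the guarded block at the same time. A synchronized method is the same idea with the object itself (this) used as the lock.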

Lesson learned 3: Use the synchronized keyword to prevent race conditions.

Stay tuned for the second part of the article, which will touch on the Java Concurrent API and cover some more complex examples.
