Thursday, December 25, 2008

Using Java concurrency utilities

The inspiration for this post comes from Jacob Hookom's blog and I can only second the recommendations he gives. Although, as always, I would caution you to test any such implementation properly, to make sure it works well and actually provides a benefit. There are lots of pitfalls and concurrency is tricky even with the excellent utilities provided in Java.

To summarize the interesting problem: parallelize the execution of lengthy tasks in a web request, without creating many threads for each request, but also ensuring that the thread pool is not starved by one request. The idea is to have a reasonably sized thread pool and to limit the number of tasks executing in parallel to a number small enough to allow the expected number of concurrent requests to share the pooled threads.

Essentially, limiting the number of tasks executing in parallel can be done in two ways: limit the number of tasks submitted at one time, or limit the number of workers that execute a set of tasks. Jacob takes the first approach; I will take the second, which seems to make it simpler to manage time-out issues.

Here's some code:

<V> Queue<Future<V>> submit(int numberOfWorkers, Queue<Callable<V>> tasks,
        long timeout, TimeUnit unit)
        throws InterruptedException, TimeoutException {
    Queue<Future<V>> result = new ConcurrentLinkedQueue<Future<V>>();
    List<WorkerTask<V>> workers = new ArrayList<WorkerTask<V>>(numberOfWorkers);
    for (int i = 0; i < numberOfWorkers; i++) {
        workers.add(new WorkerTask<V>(result, tasks));
    }
    // executor is assumed to be an ExecutorService field of the surrounding class.
    List<Future<Object>> deadWorkers
            = executor.invokeAll(workers, timeout, unit);
    for (Future<Object> obituary : deadWorkers) {
        if (obituary.isCancelled()) {
            throw new TimeoutException();
        }
    }
    return result;
}


And the code for a WorkerTask:

private static class WorkerTask<V> implements Callable<Object> {

    private final Queue<Callable<V>> tasks;
    private final Queue<Future<V>> result;

    public WorkerTask(Queue<Future<V>> result, Queue<Callable<V>> tasks) {
        this.result = result;
        this.tasks = tasks;
    }

    public Object call() {
        for (Callable<V> task = tasks.poll(); task != null; task = tasks.poll()) {
            FutureTask<V> future = new FutureTask<V>(task);
            future.run();
            if (Thread.interrupted()) {
                Thread.currentThread().interrupt(); // Restore interrupt.
                break;
            }
            result.add(future);
        }
        return null;
    }
}

Note that it is important for both tasks and result to be thread-safe collections; the method should really ensure that the tasks it is given are in a thread-safe collection, but I'll ignore that for now. Note also the check of whether the thread has been interrupted in the call() method of WorkerTask. That is vital for being able to cancel the task when you don't want to wait for it any longer (i.e. on time-out). If possible, the submitted tasks should also handle interrupts. Note the careful restoration of the interrupt status so that the caller of the method may also be notified.
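
To make the intended usage a bit more concrete, here is a rough sketch of how a request handler might call the submit method above. The executor field, the pool sizing and the loadFromRemoteService/render methods are just assumptions for the sake of illustration:

// Shared, bounded pool for the whole application (sizing is an assumption).
ExecutorService executor = Executors.newFixedThreadPool(20);
...
// Inside a request: queue up the lengthy tasks for this request.
Queue<Callable<String>> tasks = new ConcurrentLinkedQueue<Callable<String>>();
tasks.add(new Callable<String>() {
    public String call() throws Exception {
        return loadFromRemoteService(); // hypothetical lengthy task
    }
});
// ... add the rest of the tasks for this request ...

try {
    // At most 3 pooled threads work on this request, for at most 500 ms.
    Queue<Future<String>> futures = submit(3, tasks, 500, TimeUnit.MILLISECONDS);
    for (Future<String> f : futures) {
        render(f.get()); // hypothetical use of each completed result
    }
} catch (TimeoutException e) {
    // The request as a whole took too long; degrade gracefully.
} catch (ExecutionException e) {
    // One of the tasks threw an exception.
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}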

Monday, November 24, 2008

GC is for Goodstuff Collector

I have over the past few months noticed that there is a fairly common fear of creating objects in Java. When I query the authors, it always seems to boil down to an attempt to create more performant code through avoiding garbage collection.

So why would one want to create more objects, anyway? Well, one good reason would be to get more readable code, e.g. through encapsulating an algorithm or using appropriate helper objects to express an algorithm more clearly.

Even when the code does get put into a helper object, there seems to be a tendency to keep that instance around and reuse it, to avoid garbage collection so that the code performs better. My first comment is always "You should measure that". If I wanted to put it more sharply I would say "If you can't prove it's a performance benefit, then it probably isn't". I have worked with enough optimization to know that what's true for one language will not be true for another. Even using the same language, something that gives a performance benefit on one machine may be a detriment on another kind of machine (or even the same kind of machine with different "tweaks" like page sizes and such).

If you create a temporary object that lives only as long as you need it you gain the following benefits:
  1. Your object is always in a pristine state when you want to use it.
  2. Your code is a big step closer to being thread safe.
  3. Your code is more readable and easier to analyze.
  4. Your garbage collector gets to do less work.
"What?", you say, "The garbage collector does less work by collecting more garbage?".

Indeed it does. When we hear "garbage collector" we tend to think of the work involved in clearing out an overfilled store room, all that work to haul all the garbage out. But the garbage collector doesn't even know the garbage exists, the very definition of garbage is that it can no longer be reached from anywhere. What the garbage collector really does is create a whole new store room and move the stuff you want to keep over to it and then it lets the old store room and all the garbage disappear in a puff of smoke. So all the work done by the garbage collector is really done to keep objects alive, i.e. the GC is really the "goodstuff" collector.

This is obviously a somewhat simplified view and I don't think it holds completely for objects with finalizers (which is probably why finalizers are really bad for performance). Still, every single measurement and microbenchmark I've done confirms that creating and destroying objects is never worse, and often much better, than trying to keep objects around. I've done a few, first because I didn't trust it, then because others were so convinced of the opposite that I had to research it. That isn't absolute proof that it will be true for all scenarios, but I think it's a pretty good indication of where you should be pointing when shooting from the hip.

From an IBM developerWorks article, we get the numbers that allocating an object in Java takes about 10 machine instructions (an order of magnitude better than the best C/C++ allocation) and that throwing an object away costs, you guessed it, absolutely zero. So just remember that GC stands for "Goodstuff Collector" and you'll be on the right track.

Saturday, August 16, 2008

The legacy of inheritance

Is inheritance really useful or is it a feature that causes more problems than it solves? Certainly I can't think of a case where I've been really grateful that I've been able to inherit from a superclass but I can think of several instances where it has caused friction and where the extension mechanism of inheritance tended to lead the programmer the wrong way.

Consider the following code (public fields for brevity):

public class Square {
  public int base;
  public int getArea() {
    return base * base;
  }
}

public class Rectangle extends Square {
  public int height;
  @Override public int getArea() {
    return base * height;
  }
}

public class Parallelogram extends Rectangle {
  public double angle;
}

Now how do we implement a Rhombus? Is it a Square extended with an angle (and overridden area calculation) or a Parallelogram with conditions on setting the properties so that the invariant is preserved (which is why we should have accessors, by the way)?

Well, the correct answer is neither, even though we have a nice sequence of extensions. The problem is that we have been led astray by the extension mechanism and violated the "is a" rule for subclassing and ended up with a corrupt type system. Clearly a Parallelogram is not a Rectangle which equally clearly is not a Square so a subclass instance may not safely be used in place of a superclass instance. Reversing the class hierarchy solves the issue, however it creates subclasses that are restrictions of the superclass rather than extensions.

An acquaintance of mine who is highly experienced in creating standardized components for information interchange once claimed that it only works to inherit properties by restriction. This is well worth considering, although I'm not entirely convinced because it causes a different set of problems when restriction means "that particular property is not used", as it so often does in information interchange scenarios. In that case it is usually not interesting nor useful to know what the superclass is. In our case at hand it will work because the subclasses actually have a meaningful and useful value of the property but it is restricted by an invariant, so it is also useful to be able to view them as instances of the superclass.

Let's try coding again:

public class Parallelogram {
  protected int base;
  protected int height;
  protected double angle;

  public void setAngle(double angle) {
    this.angle = angle;
  }
}

public class Rectangle extends Parallelogram {
  @Override public void setAngle(double angle) {
     // What should we do here?
  }
}
...

Oops, this doesn't seem to work well, either. Obviously we can't just inherit the setAngle method or we could end up with a corrupt Rectangle, nor is there any reasonable action we can take. We can get out of our pickle, however, because constructors are not inherited! Simply make our instances immutable:
public class Parallelogram {
  public final int base;
  public final int height;
  public final double angle;

  public Parallelogram(int base, int height, double angle) {
    this.base = base;
    this.height = height;
    this.angle = angle;
  }

  public int getArea() {
    return base * height;
  }
}

public class Rectangle extends Parallelogram {
  public Rectangle(int base, int height) {
    super(base, height, Math.PI/2);
  }
}

public class Square extends Rectangle {
  public Square(int side) {
    super(side, side);
  }
}

That seems to work, and the implementation of getArea() is actually profitably inherited. But we still have a problem in Java with a Rhombus, since a Square is both a Rectangle and a Rhombus but a Rhombus is not a Rectangle. To solve that would require defining the types as interfaces and having separate implementation classes (now I understand why that is often recommended :-) ).
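
A minimal sketch of what that could look like (the names and the exact split into interfaces are mine, just to illustrate the idea):

public interface Parallelogram { int getArea(); }
public interface Rectangle extends Parallelogram {}
public interface Rhombus extends Parallelogram {}
public interface Square extends Rectangle, Rhombus {}

public final class SquareImpl implements Square {
  private final int side;
  public SquareImpl(int side) { this.side = side; }
  public int getArea() { return side * side; }
}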

If we had chosen to implement Parallelogram with a field for the length of the side instead of the height, the formula for the area would have been base*Math.sin(angle)*side. Even if this would have worked for our Rectangle, it would have been inefficient (although when the fields are immutable it could probably be optimized by the compiler and/or JVM).
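
For illustration, that alternative representation might have looked like this (hypothetical code, not a continuation of the classes above):

public class Parallelogram {
  public final int base;
  public final int side;
  public final double angle;

  public Parallelogram(int base, int side, double angle) {
    this.base = base;
    this.side = side;
    this.angle = angle;
  }

  public int getArea() {
    // height = side * sin(angle), so area = base * sin(angle) * side.
    return (int) Math.round(base * Math.sin(angle) * side);
  }
}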

On the whole, I believe it is seldom profitable to inherit the implementation of a method, if you have a good example to the contrary, please share it.

I think that overriding all public methods would be preferable, even if you just decide to call super's version, at least it would show that you gave some thought to whether it was correct or not. Without having done an exhaustive study, I believe this is especially true for interfaces that indicate that a class "is" something rather than that it "is a" something, i.e. those interfaces whose name usually end in "able". Try Cloneable, Externalizable, Runnable and Printable.

Monday, July 21, 2008

Cedric's coding challenge

Although I came late to the party at Cedric's coding challenge, this is the kind of problem I just can't resist taking at least a little dip into. The idea is to generate all numbers up to a maximum that do not have any repeated digits.

My initial thought was that the pattern of digits for an n-digit number should be the same, you just need to recode the values. I.e. given 3 digits a, b and c from which you should generate all 3-digit sequences without repeated digits, then generating the 2-digit suffixes ac and ca after choosing b as the first digit is the same as choosing a as the first digit, generating bc and cb and then recoding a->b, b->a and c->c.

Anyway, I soon realized that I couldn't come up with an efficient recoding algorithm and also that generating the suffixes didn't cost much more than an iteration over a pre-generated suffix list. So I basically ended up with the same algorithm as crazybob and Jonathan Hawkes' improvement. Happily, my implementation was a bit faster still than both of theirs: 30% faster than Bob's and 20% faster than Jonathan's in Bob's way of counting, and when measured the way Jonathan does it, mine is 65% faster than Bob's and 30% faster than Jonathan's. Here it is:

/**
 * Finds base 10 numbers whose digits don't repeat. Solution to Cedric's coding
 * challenge: http://beust.com/weblog/archives/000491.html
 *
 * @author Torbjorn Gannholm
 */
public class NonRepeatingDigits {
    private final long max;
    private final int base;
    private final Listener listener;
    private final Digit head;

    /**
     * Finds all numbers with non-repeating digits in the supplied base up to max.
     *
     * @param base The base, e.g. octal (8), decimal (10), hexadecimal (16).
     * @param max Upper bound for returned values.
     * @param listener Listener to receive valid numbers.
     */
    public NonRepeatingDigits(int base, long max, Listener listener) {
        this.base = base;
        this.max = max;
        this.listener = listener;
        head = new Digit(base);
    }

    public void run() {
        // One digit numbers.
        for (Digit current = head.next.next;
                current != null;
                current = current.next) {
            int value = current.value;
            if (value > max) return;
            listener.hear(value);
        }
        // Pick first digit (non-zero), then find the rest of the number.
        for (int digitsAfterFirst = 1;
                digitsAfterFirst < base;
                digitsAfterFirst++) {
            for (Digit previous = head.next, current = head.next.next;
                    current != null;
                    // Restore removed digit to sequence.
                    previous.next = current, previous = current, current = current.next) {
                // Remove chosen digit from sequence.
                previous.next = current.next;
                if (followingDigits(digitsAfterFirst, current.value * base)) return;
            }
        }
    }

    /** A digit, part of a linked list of all digits in a base.
     * (Shamelessly paraphrased from crazybob) */
    private static class Digit {
        private Digit next;
        private final int value;

        /** Constructs a non-digit that acts as the head for the list, and the
         * digits in the list. */
        Digit(int base) {
            value = -1;
            next = new Digit(base, 0);
        }

        /** Constructs the real sequence of digits */
        private Digit(int base, int value) {
            this.value = value;
            if (++value < base) {
                next = new Digit(base, value);
            }
        }
    }

    /**
     * Recursively generates digits after the first of a valid number.
     *
     * @param remaining Number of digits left to generate.
     * @param prefixValue Value of previous digits.
     * @return true when values generated are greater than max.
     */
    private boolean followingDigits(int remaining, int prefixValue) {
        if (remaining > 1) {
            for (Digit prev = head, current = head.next;
                    current != null;
                    // Restore removed digit to sequence.
                    prev.next = current, prev = current, current = current.next) {
                // Remove chosen digit from allowable digits.
                prev.next = current.next;
                if (followingDigits(remaining - 1,
                        (prefixValue + current.value) * base))
                    return true;
            }
        } else {
            for (Digit current = head.next; current != null; current = current.next) {
                int value = prefixValue + current.value;
                if (value > max) return true;
                listener.hear(value);
            }
        }
        return false;
    }
}
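
The Listener interface isn't shown above; I'm assuming it is a simple callback along these lines, which also gives a small usage example:

// Assumed shape of the callback interface (not shown in the original post).
public interface Listener {
    void hear(int value);
}

// Count the non-repeating numbers below one million in base 10.
final int[] count = new int[1];
new NonRepeatingDigits(10, 1000000, new Listener() {
    public void hear(int value) {
        count[0]++;
    }
}).run();
System.out.println(count[0] + " numbers without repeated digits");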

Sunday, May 4, 2008

Beautiful enums

Whatever complaints you may have against java, I think you have to grant that at least enums are a thing of beauty. Consider this code line:

Collections.sort(myChildren, Child.Order.ByAge.descending());

The best part is that this started to appear as a side effect of another rational structuring of code, creating a synergy effect of two positive drives together. This kind of synergy always appeals to my aesthetic sense. Here's the Child class:

public final class Child {
    private final Integer age;
    private final String name;
    ...
    public static enum Order implements Comparator<Child> {
        ByAge() {
            public int compare(Child lhs, Child rhs) {
                return lhs.age.compareTo(rhs.age);
            }
        },
        ByName() {
            public int compare(Child lhs, Child rhs) {
                // TODO: Should really use a collator.
                return lhs.name.compareTo(rhs.name);
            }
        };

        public abstract int compare(Child lhs, Child rhs);

        public Comparator<Child> ascending() {
            return this;
        }

        public Comparator<Child> descending() {
            return Collections.reverseOrder(this);
        }
    }
}

Lovely! Other examples of the synergy effect I've seen recently are my earlier posted thread safe state pattern and this blog entry. In both those cases the synergy was better code structure and better performance.

Monday, March 31, 2008

Java Thread-safe State Design Pattern

Thread-safe State


Apart from the things mentioned here, this pattern also conveys the intent and benefits of the GoF State pattern, being an extension thereof. The term "behaviour" is interpreted more strictly so that an object is always considered to change its behaviour when its state changes.

Intent


Make sure that an object is observable only in a valid, consistent state, that transitions between valid consistent states are atomic, and that side-effects occur if and only if the associated transition completes validly.

Motivation


Consider a class where behaviour depends on more than one field and/or side-effects occur as a result of certain state-changes. Such behaviour can be very difficult to analyze for thread-safety and the safe classic approach is to synchronize all methods. Such behaviour can also be difficult to analyze for logical correctness, bringing the motivation for the state pattern into play.

The key idea is to leverage the state pattern and store the current state in an AtomicReference. Current state should be represented by an immutable object. State transitions are performed by a compareAndSet (CAS) operation and a retry on failure. For transitions with side-effects, the CAS is performed to a boilerplate blocking state, the side effects performed, the resultant state set and the blocking state released, whereby each thread released from the blocking state retries the attempted operation on the resulting state. The CAS always compares with the "this" reference of the state instance being transitioned from and can only succeed for one thread, all other threads attempting a simultaneous transition have to retry on the new state.

Applicability


Any object with a method whose interaction with the object's state is more complex than a simple get or a simple set (in which case just marking the field volatile is simpler and more performant).

Structure (given as implementation examples)


Delegate all calls to an immutable state instance:
class Implementation implements Interface {
  private final AtomicReference<Interface> state;

  public ... doSomething(...) {
      return state.get().doSomething(...);
  }

  private class StateA implements Interface {
    private final ... valueX;
    private final ... valueY;

    public ... doSomething(...) {
       .... valueX ... valueY ...
    }
  }
}

Simple state changes performed by CAS or retry:
class Implementation implements Interface {
  private final AtomicReference<Interface> state;

  public void setX(y) {
      state.get().setX(y);
  }

  private class StateA implements Interface {
    private final ... x;

    public void setX(y) {
       if (!state.compareAndSet(this, new StateA(y))) {
         state.get().setX(y);
       }
    }
  }
}

State changes with side-effects CAS to a blocking state first:
class Implementation implements Interface {
  private final AtomicReference<Interface> state;

  public void printAndSetX(y) {
      state.get().printAndSetX(y);
  }

  private class StateA implements Interface {
    private final ... x;

    public void printAndSetX(y) {
       BlockingState guard = new BlockingState();
       if (state.compareAndSet(this, guard)) {
         try {
           print(x);
           state.set(new StateA(y));
         } finally {
           guard.release();
         }
       } else {
         state.get().printAndSetX(y);
       }
    }
  }

  private class BlockingState implements Interface {
    private final CountDownLatch latch = new CountDownLatch(1);

    public void printAndSetX(y) {
      try {
        latch.await();
      } catch (InterruptedException e) {
        // Don't care.
      }
      state.get().printAndSetX(y);
    }

    private void release() {
      latch.countDown();
    }
  }
}

Participants



  • Interface - defines the public interface.

  • Implementation - maintains an AtomicReference to current StateImplementation, implements the public interface.

  • StateImplementation - the implementation of the public interface for a particular state.

  • BlockingState - specialized boilerplate StateImplementation that blocks all calls until released and then delegates to the current state.

Collaborations



  • Implementation delegates relevant (usually all) calls to the current StateImplementation.

  • StateImplementation performs state transitions by a state.compareAndSet(this, nextState) or retry the transition on the current state (which is different from "this" since CAS failed) so that each state object can only be transitioned from once.

  • BlockingState blocks all calls until released, thereupon retries on new current StateImplementation.

Consequences



  1. Interactions with the object always occur against a consistent state of the object; the immutability of the state delegate guarantees a consistent result of the interaction, and each separate interaction is therefore in effect atomic.

  2. State transitions are explicit and atomic.

  3. Performance seems to be as good as or better than a straightforward non-thread-safe implementation with mutable fields. Performance is better than that of synchronization (compare "optimistic locking" versus "read/write locking") for applicable objects if some interactions do not entail state changes, especially when the object is reachable from more than one thread (even if access is never contended).

Sunday, March 23, 2008

Performance and thread-safe state handling

In my previous post I introduced using the state pattern and an AtomicReference to help maintain thread safety. My reason for starting to play with the pattern was really to make it easier to analyze the code and prove correctness, but obviously the question of performance inevitably comes up and it is interesting to know the trade-offs.

I set up a benchmark where ten threads would simultaneously try to execute the same code and each thread would just hammer away for an obscenely large number of times. First up for the test was the simple lamp:

public interface Lamp {
    boolean isOn();
    void setOn(boolean on);
}

The code to execute was:

boolean b = lamp.isOn();
lamp.setOn(!b);
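
The harness itself isn't shown here, but a minimal sketch of the kind of setup described above could look something like this (the names and details are assumptions, not the actual benchmark code):

final Lamp lamp = new VolatileLamp(); // hypothetical implementation under test
final int threads = 10;
final int iterations = 100 * 1000 * 1000; // "an obscenely large number"
final CountDownLatch start = new CountDownLatch(1);
final CountDownLatch done = new CountDownLatch(threads);
for (int t = 0; t < threads; t++) {
    new Thread(new Runnable() {
        public void run() {
            try {
                start.await();
                for (int i = 0; i < iterations; i++) {
                    boolean b = lamp.isOn();
                    lamp.setOn(!b);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                done.countDown();
            }
        }
    }).start();
}
long t0 = System.nanoTime();
start.countDown();
done.await();
System.out.println((System.nanoTime() - t0) / iterations + " ns per loop iteration (wall clock)");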

I had an implementation with just a volatile field, an implementation with both methods synchronized, a hybrid implementation where the field was volatile and only the setter synchronized and an implementation using the state pattern. I also ran the bad non-thread-safe implementation. The results with the average time for one run through the code: volatile (669 ns), bad (697 ns), state (985 ns), hybrid (1570 ns), synchronized (2535 ns). Interestingly enough, the state pattern kicks butt. Actually the state pattern was even faster than volatile in the first tests I ran, but that is probably because I ran too few iterations to pay the full cost of garbage collection.

So how about a predicted worst case, the CoffeeMachine where the state changes on every method call?

public interface CoffeeMachine {
    void fillBeans(int beans);
    void fillWater(int water);
    Coffee makeCoffee(int water, int beans) throws OutOfWaterException, OutOfBeansException;
}

And the test code, same rules as above, ten threads whacking away:

Random r = new Random();
....
int w = r.nextInt(5);
cm.fillWater(w);
int b = r.nextInt(3);
cm.fillBeans(b);
try {
    cm.makeCoffee(w, b);
} catch (OutOfWaterException e) {
    oow.getAndIncrement();
} catch (OutOfBeansException e) {
    oob.getAndIncrement();
}

This was a straight shoot-out between a synchronized implementation and a state pattern implementation. Result: synchronized (5021 ns), state (5333 ns). A slight victory for synchronization, yeah!

So how about adding some getters into the CoffeeMachine, how does that change things? The test code was changed to:

Random r = new Random();
....
int w = r.nextInt(5);
if (cm.checkWater() < w*10) cm.fillWater(w*100);
int b = r.nextInt(3);
if (cm.checkBeans() < b*10) cm.fillBeans(b*100);
try {
    cm.makeCoffee(w, b);
} catch (OutOfWaterException e) {
    oow.getAndIncrement();
} catch (OutOfBeansException e) {
    oob.getAndIncrement();
}

With the getters present it makes sense to try a hybrid approach again. Result: state (2547 ns), hybrid (3675 ns), synchronized (4930 ns).

Conclusion: not only does the state pattern with an AtomicReference to an immutable internal state representation make it easier to manage code complexity, it is also beneficial from a performance point of view, being just slightly slower than synchronization in the worst case and twice as fast in normal cases.

Addendum: My keen colleague, the informed sceptic, pointed out to me that perhaps some more testing is needed before it can be generally said that the state pattern is better performing. He claimed that normal usage is that threads do something else and then hit the shared resource, which results in the locks being mostly uncontended. I suppose he has a point. But how to test it?

First attempt, just run my Lamp benchmark single-threaded: volatile (10 ns), synchronized (10 ns), bad (19 ns), hybrid (22 ns), state (42 ns). Oops! State pattern is 4 times slower than synchronized. Interesting also that plain old non-thread-safe code is slower than volatile and synchronized. The hybrid approach is twice as slow as synchronized. But what is going on here? We've run a long warm-up and then we're just whacking away at our Lamp from a single thread. Obviously HotSpot is able to do some nice things with the synchronized methods in this case (good to know).

Next attempt, run a minimal warmup and a short time: volatile (10 ns), bad (18 ns), state (44 ns), hybrid (63 ns), synchronized (87 ns). So now we're not hitting it hard enough for HotSpot to do all its magic, and synchronized is dog slow again.

Third attempt, hit the class hard from just two threads: bad (120 ns), volatile (121 ns), state (187 ns), hybrid (338 ns), synchronized (752 ns). But we're still hitting it hard. We need to figure out how to simulate the non-contention scenario with many threads.

Fourth attempt, put in a sleep in each loop. No good, there's too much noise coming in from the calls to sleep(). For what it's worth, the numbers are very similar, differences are less than 1% (duh!). We need another angle of attack.

Just for a change of pace I decided to run the CoffeeMachine benchmark in one thread: state (114 ns), hybrid (160 ns) and synchronized (212 ns). Hmm, I realize I didn't really run single-threaded because there were two threads that had access to the CoffeeMachine object. One thread was just waiting for the other to finish, though, so all locks were uncontended. Trying that with Lamp I get: volatile (13 ns), bad (20 ns), state (42 ns), hybrid (65 ns) and synchronized (89 ns). It seems that uncontended locks are just as expensive as contended locks and HotSpot was able to do some powerful magic when the object was only accessible from one thread. Sanity check on CoffeeMachine running truly single-threaded: state (110 ns), hybrid (113 ns), synchronized (208 ns). Surprise! HotSpot couldn't do so much this time! I just have to check the performance against a standard non-thread-safe implementation: standard (123 ns) versus state (110 ns)!!!!

Conclusion 2: The thread-safe state pattern is excellent to use as a standard pattern. As my eager colleague points out so often, the important thing is to be able to analyze the correctness of the code, and performance is usually not so important. Well, the state pattern was originally evolved to make it easier to handle varying behaviour depending on state and transitions between these behavioural states, so that requirement is solidly fulfilled. For anything but the simplest of classes, the thread-safe state pattern also seems to be the most performant way of achieving thread safety and actually performs better than an ordinary non-thread-safe implementation in at least the one case tested here.

Monday, March 17, 2008

Thread-safe state handling

In my previous post I noted that handling mutable state in multi-threaded programs is a problem that should preferably be avoided. Sometimes it may be too complicated to avoid mutable state, and in this post I will look into some ways of handling it, including how the classic State design pattern can help, with a little twist. A good, if old, introduction to threading issues in Java can be found here.

Consider the following code:

public class Lamp {
    private boolean on;
    public boolean isOn() { return on; }
    public void setOn(boolean on) { this.on = on; }
}

No threading problems, right?

Wrong! I had read about it but learnt it again the hard way when one of my unit tests started failing spuriously. I could tell from other measuring points that there was no way that assert could fail, yet it did, sometimes. The issue is that, especially on a multi-core machine, there is no guarantee that other threads will ever get to see changes. To guarantee that, either the same synchronization monitor must be used for both reads and writes, or the variable needs to be volatile.

With that lesson in mind, let's look at another example:

public class WaterDispenser {
    private volatile int water;
    public void fill(int amount) { water += amount; }
    public void get(int amount) throws InsufficientWaterException {
        if (water >= amount) water -= amount;
        else throw new InsufficientWaterException();
    }
}

Unfortunately, += and -= are not atomic operations so two threads filling up at the same time may result in only the one fill being effective. Also there is a gap between the check and the decrement so the class may end up in an invalid state if two threads get water at the same time.

It is always an option just to synchronize the methods and it may well be the best option in this case. It is when you have frequently used methods that just query the state that you want to avoid synchronization because it will give an overhead on every query.
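
For comparison, the fully synchronized version of the dispenser above would simply be (a trivial sketch):

public class WaterDispenser {
    private int water;
    public synchronized void fill(int amount) { water += amount; }
    public synchronized void get(int amount) throws InsufficientWaterException {
        if (water >= amount) water -= amount;
        else throw new InsufficientWaterException();
    }
}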

Just for illustration, let's use an AtomicInteger here, which has the same memory consistency guarantees as volatile plus a bit of extra functionality. The addition in fill is now atomic. In get, however, we also check that we have enough water, so it's not good enough to do an atomic subtraction after the check: the precondition may no longer hold by the time we get to the subtraction. Instead we have to verify atomically that the value didn't change, using compareAndSet, and redo the whole thing if it did.

public class WaterDispenser {
    private final AtomicInteger water = new AtomicInteger();
    public void fill(int amount) { water.addAndGet(amount); }
    public void get(int amount) throws InsufficientWaterException {
        int currentWater;
        int nextWater;
        do {
            currentWater = water.get();
            nextWater = currentWater - amount;
            if (nextWater < 0) throw new InsufficientWaterException();
        } while (!water.compareAndSet(currentWater, nextWater));
    }
}

It becomes more and more complex to analyze and avoid race conditions the more you need to do, consider for example extending the WaterDispenser to become a CoffeeMachine, when you have to use coffee beans as well as water.

The State pattern is used to make it easier to analyze and express state-dependent behaviour. Not surprisingly, I suppose, it can also be used to simplify concurrent state-handling, although perhaps with a stricter interpretation of "state-dependent behaviour".

public class CoffeeMachine {
    private final AtomicReference<CoffeeMachineState> state =
            new AtomicReference<CoffeeMachineState>(new CoffeeMachineState(0, 0));

    public void fillBeans(int beans) {
        state.get().fillBeans(beans);
    }
    public void fillWater(int water) {
        state.get().fillWater(water);
    }
    public Coffee makeCoffee(int water, int beans)
            throws OutOfWaterException, OutOfBeansException {
        return state.get().makeCoffee(water, beans);
    }

    private class CoffeeMachineState {
        private final int water;
        private final int beans;

        CoffeeMachineState(int w, int b) {
            water = w;
            beans = b;
        }
        void fillBeans(int moreBeans) {
            if (!state.compareAndSet(this, new CoffeeMachineState(water, beans + moreBeans))) {
                state.get().fillBeans(moreBeans);
            }
        }
        void fillWater(int moreWater) {
            if (!state.compareAndSet(this, new CoffeeMachineState(water + moreWater, beans))) {
                state.get().fillWater(moreWater);
            }
        }
        Coffee makeCoffee(int w, int b) throws OutOfWaterException, OutOfBeansException {
            if (water < w) throw new OutOfWaterException();
            if (beans < b) throw new OutOfBeansException();
            // Consume the water and beans used for this cup.
            if (!state.compareAndSet(this, new CoffeeMachineState(water - w, beans - b))) {
                return state.get().makeCoffee(w, b);
            }
            return new Coffee(w, b);
        }
    }
}
Again it would probably be better to just synchronize the methods in this case, but it serves to illustrate the pattern: delegate calls to the current state, which either manages to change the state or delegates to the new state if another thread got in first.

Back to the lamp example, which is a classic for the state pattern, but we'll add some functionality. The lamp is placed in a room and when we turn it on we should illuminate the room and when we turn it off we should darken the room. Now we have a gap between the state change and the room operation, a gap where another thread can come in and wreak havoc. The solution is to introduce a transitional state that holds other threads at bay while you complete the transition.

class Lamp {
    private AtomicReference<Lamp> state;
    private final Room room;

    public Lamp(Room room) {
        this.room = room;
        this.state = new AtomicReference<Lamp>(new Off());
    }

    // No-arg constructor used only by the state subclasses below; they operate
    // on the enclosing instance's state and room, never on fields of their own.
    private Lamp() {
        this.room = null;
    }

    public boolean isOn() { return state.get().isOn(); }
    public void setOn(boolean on) { state.get().setOn(on); }

    private class On extends Lamp {
        public boolean isOn() { return true; }
        public void setOn(boolean on) {
            if (on) return;
            synchronized (Lamp.this) {
                TransitionGuard guard = new TransitionGuard();
                if (state.compareAndSet(this, guard)) {
                    room.darken();
                    state.set(new Off());
                } else {
                    state.get().setOn(on);
                }
            }
        }
    }

    private class Off extends Lamp {
        public boolean isOn() { return false; }
        public void setOn(boolean on) {
            if (!on) return;
            synchronized (Lamp.this) {
                TransitionGuard guard = new TransitionGuard();
                if (state.compareAndSet(this, guard)) {
                    room.illuminate();
                    state.set(new On());
                } else {
                    state.get().setOn(on);
                }
            }
        }
    }

    private class TransitionGuard extends Lamp {
        public boolean isOn() {
            synchronized (Lamp.this) {
                return state.get().isOn();
            }
        }
        public void setOn(boolean on) {
            synchronized (Lamp.this) {
                state.get().setOn(on);
            }
        }
    }
}

Synchronizing everything would be easier, but with this pattern we only pay the synchronization cost during complex state transitions. We do pay a certain cost for object creation, but that is generally very low.

One final word of caution: if using an object involves querying more than one of its stateful values, that is also a potential race and you should consider returning an immutable object representing a snapshot of all the queryable state instead.
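
As a sketch of what I mean, a CoffeeMachine like the one above could expose its levels through one snapshot object instead of two separate getters (the names here are mine):

// Immutable snapshot of all queryable state, read from one consistent
// CoffeeMachineState instance.
public final class Levels {
    public final int water;
    public final int beans;
    Levels(int water, int beans) { this.water = water; this.beans = beans; }
}

// In CoffeeMachine:
public Levels getLevels() {
    CoffeeMachineState s = state.get(); // one consistent state
    return new Levels(s.water, s.beans);
}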

Thursday, March 6, 2008

Is synchronisation expensive? Is that the right question?

Sometimes I find that I come across a lot of pieces of information which together end up pointing in a certain direction. If you've ever read "The celestine prophecy" you might term this "synchronicity", which, by coincidence (or synchronicity), fits very well with the current topic :-).

The trigger point for this post was that a colleague of mine got into a discussion with a speaker at jFokus about what happens when you "synchronize" (the speaker even blogged a bit about it). Now I know that my colleague usually knows what he is talking about, so what is it? Is synchronization expensive or not?

When I asked my colleague he said that grabbing an uncontended lock is on the order of 10^-8 seconds. As I had been benchmarking a hash set implementation, I had everything set up to test (and confirm) this. On my machine the overhead for grabbing an uncontended lock (putting the synchronized keyword on a method) was 70 ns. Now that sounds pretty cheap, doesn't it? (I have subsequently found out that there are strange issues when running benchmarks. Running again in a supposedly more correct way, the cost of synchronization would seem to be more like 7 ns, or rather, with other noise involved, between 1 and 30 ns, mostly under 10. So my colleague was dead on.)

Well, considering that I'd just been very happy about reducing the execution time of the method from 30ns to 20ns, the prospect of adding a 70ns overhead was not appealing. So synchronization is expensive, then?

Turning it around again, how much does it cost you to get stale data in some threads (the data may actually never be updated for some threads)? How much does it cost you if your object gets into an inconsistent state? In that perspective synchronization is usually very cheap.

Swing tries to do without synchronization by mandating that all access to display state needs to be done from the EDT. I sometimes see this expressed as all modifications needing to be done on the EDT, but because of how the Java memory model publishes modifications, I am fairly sure that even querying the state of components needs to be done on the EDT. I'm beginning to think that the Swing threading model causes a viral spread of things that must be done on the EDT, which can only be broken by having synchronization at some point. This model is perhaps so subtly difficult to handle that it should be considered broken.

The conclusion must be that you have to do what you have to do to protect yourself against the bad effects of unprotected access to shared mutable state, and asking whether synchronization is expensive does not necessarily lead you in the right direction. A better question to ask would be "Can I do this in some other way than by accessing shared mutable state?"

Accessing shared mutable state is really the issue to look at (and where my synchronicity is leading me) and it is fraught with more dangers than just avoiding race conditions. Spot the bug (and I don't mean the obvious one that the access to lineNumber isn't protected):

final int[] lineNumber = new int[1];
while (in.hasNextLine())
{
    final String line = in.nextLine();
    lineNumber[0]++;
    EventQueue.invokeLater(new Runnable()
    {
        public void run()
        {
            statusLine.setText("" + lineNumber[0]);
            textArea.append(line);
            textArea.append("\n");
        }
    });
}
(Background to this example). An example of avoiding access to shared mutable state can be found in my first blog post.

Thursday, February 28, 2008

I wish languageXYZ had inner classes....

I've been in love with inner classes since I first encountered them in the first java course I attended.

I remember we had an exercise to build a text chat client and one thing we had to do was handle a case where a button behaved differently depending on which state it was in. I proceeded to create two different listeners and removed the one and added the other each time the state changed. This surprised the teacher who had never seen any other solution than a simple if-then-else in the one listener (a solution that didn't even enter my head, strangely enough). Perhaps I was going overboard but it just seemed so much tidier.

The book we got was "Thinking in Java" and, needless to say, I feasted on the chapter about inner classes (greenhouse controls was the example used there) and the realization that the (non-static) inner class lives in the environment formed by the surrounding object. Really cool, I didn't have to pass all state as parameters any more.

And an inner class can extend the class it resides in; there's great potential there. I was quite taken by the beauty of the hypergeometric probability calculator I wrote up to calculate the probability of a bridge player holding a specific pattern of cards:

public class HypergeometricProbability {
    private class SpecificCard extends HypergeometricProbability {
        ...
    }
    ...
    public HypergeometricProbability andHoldsSpecificCard(int ofRemainingCards) {
        /* I wish I knew how to get a reference to the surrounding object
           without having to pass "this" explicitly. */
        return new SpecificCard(ofRemainingCards, this);
    }
    ...
}

And then you have anonymous classes that can capture state from the method invocation they're declared in. I must confess I generally don't use anonymous classes, I find it usually much neater and more readable to create a named private inner class with a proper constructor encapsulating the needed state.

Another cool thing in java that I'm still playing with finding the full potential of is java enums. How great is it that they're actual classes that can implement interfaces, extend classes and completely encapsulate all things about the state they represent?

When researching for my previous blog entry, I came across a blog entry by Daniel Spiewak, Magic closure syntax. This roused my curiosity and got me thinking "Can it be that java is different enough that you have to find the java way of doing things? This won't be exactly like you're used to doing it with your favourite abstraction from languageXYZ, but will it necessarily be less elegant?" I hope to be able to explore this question further in following blog entries and you're welcome to provide me with challenges.

Thursday, February 21, 2008

Java closures and threads

I have always thought that the biggest weakness of the proposed closures for Java is threading. Strangely enough, the impression one gets from other articles is that closures and threading is a good match and that closures will somehow make threading easier. To analyze this issue I'll start from Neal Gafter's blog post about concurrent loops and closures.

Neal proposes the following code using closures:

public Collection getAttendees() {
    List result = new ArrayList();
    for eachConcurrently(EventResponse r : getResponses(), threadPool) {
        if (r.mayAttend()) {
            Principal attendee = r.getAttendee();
            synchronized (result) {
                result.add(attendee);
            }
        }
    }
    return result;
}

Imagining myself coming across this code in some code mass I was analyzing, I would think it looked very neat. Then the synchronized keyword would pop out at me and rouse my curiosity, I would look closer and see for eachConcurrently which explains it nicely and I would move on, no problem.

But what if the synchronized block was missing? How easily would I catch the error?

Therein lies the weakness: there is no clear signal that there may be other issues to consider, in this case access to shared resources. You may sometimes even be submitting your closure for multi-threaded execution without knowing it. A variation of the same flaw was seen in mechanisms like DCOM that allowed transparent use of remote components.

At least you get the idea that something out of the ordinary is happening in the horribly clumsy code presented by Neal as being the current state of affairs without closures, no doubt to emphasize his point and inspire fear in the hearts of the doubters.

But the java.util.concurrent package actually leads you down the right path if you care to follow. A Callable should normally return a result and a Future returns a "canned" execution result from the Callable, or the exception thrown packaged in an ExecutionException, once the processing is done and you are ready to deal with it. Which might lead to the following code instead:

public Collection<Principal> getAttendees() {
    CompletionService<Principal> ecs =
            new ExecutorCompletionService<Principal>(threadPool);
    int numberOfTasks = 0;
    for (final EventResponse r : getResponses()) {
        if (r.mayAttend()) {
            ++numberOfTasks;
            submitGetAttendeeTask(r, ecs);
        }
    }
    return collectGetAttendeeTaskResults(numberOfTasks, ecs);
}

Does that look so formidable? Fair enough, it needs some more code to be complete, but nothing scary there either:

private void submitGetAttendeeTask(final EventResponse r,
        CompletionService<Principal> ecs) {
    ecs.submit(new Callable<Principal>() {
        public Principal call() {
            return r.getAttendee();
        }
    });
}

private Collection<Principal> collectGetAttendeeTaskResults(int numberOfTasks,
        CompletionService<Principal> ecs) {
    final List<Principal> result = new ArrayList<Principal>();
    for (int i = 0; i < numberOfTasks; ++i) {
        try {
            result.add(ecs.take().get());
        } catch (InterruptedException ex) {
            throw new AssertionError(ex); // shouldn't happen
        } catch (ExecutionException ex) {
            throw new AssertionError(ex); // shouldn't happen
        }
    }
    return result;
}

Having come this far, it would certainly be nice to lose some of the boilerplate in creating the Callable and the possible unwrapping of the ExecutionException, so there are certainly some good motivations behind closures. Can this be solved by other means? Certainly: CICE would help with the first, and for the second one could devise a cute way of specifying and parametrizing sets of exceptions rather than having Callable declare throws Exception and Future.get() declare throws ExecutionException. One thing you would still have to deal with is the InterruptedException that is thrown if your main thread is interrupted while waiting for the results. But that is something you probably should think about, or at least might want to make use of in the future; it is one of those threading issues, after all.
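
For what it's worth, one reasonable policy (a sketch of one option, not the only one) is not to swallow the interrupt at all, but to declare it and let the caller decide:

private Collection<Principal> collectGetAttendeeTaskResults(int numberOfTasks,
        CompletionService<Principal> ecs) throws InterruptedException {
    final List<Principal> result = new ArrayList<Principal>();
    for (int i = 0; i < numberOfTasks; ++i) {
        try {
            result.add(ecs.take().get());
        } catch (ExecutionException ex) {
            throw new AssertionError(ex); // shouldn't happen
        }
    }
    return result;
}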