Small Note About Concurrency

I’m reading Scala in Depth. In Section 2.3.2 on concurrency, the author gives the following example of a thread safe “index service” (removed trait definition for brevity):

class ImmutableService[Key, Value] {
  var currentIndex = new ImmutableHashMap[Key,Value]
  def lookUp(k : Key) : Option[Value] = currentIndex.get(k)
  def insert(k : Key, v: Value) : Unit = synchronized {
    currentIndex = currentIndex + ((k, v))
  }
}

The author shows that this implementation, is much faster than an implementation which uses a value reference to a mutable map, and synchronises both lookup and insert operation.

When I was reading this code, I was wondering whether this implementation is indeed safe. After all, the assignment operation to the var itself is not synchronised – do we have a guarantee that it is atomic?

The Java Language Specification answered my question:

For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.

Writes and reads of volatile long and double values are always atomic.

Writes to and reads of references are always atomic, regardless of whether they are implemented as 32-bit or 64-bit values.

Note that in general, updating 64-bit chunks of memory in the JVM may not be atomic, but reference updates specifically must be atomic as per the JLS.

This is a rather delicate point – in the general case, if reference assignments were not guaranteed to be atomic (as is sometimes the case in other languages), this example would not be correct since the reading threads may see a partially initialised reference.

Since this is such a delicate point, I think it’s worth mentioning explicitly in the book’s text.

Advertisements

7 thoughts on “Small Note About Concurrency”

  1. Note, however, that volatile fields have another effect, on per-thread caching. Our reference here is JLS chapter 17, ‘Synchronization’: https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html

    Volatile fields create a happens-before edge both on reading and on writing. Your code example synchronizes only the writing. Does this matter? I believe it does. Consider the following code:

    // Thread 1
    val svc = new ImutableService[Int, Int]
    Future { svc.insert(1, 1) } onComplete { case _ => assert(svc.lookUp(1).isDefined }

    Suppose for the sake of the example that the two futures execute on different threads. The Future mechanism guarantees there is a happens-before edge between two mapped futures. But there is no happens-before edge between writing in insert and reading in lookUp. The synchronized construct only orders itself wrt. other synchronizations, not wrt. regular access to the object owning the monitor. So the assert can read an ‘older’ value, and fail.

    The JLS gives the following related example (17.3):

    For example, in the following (broken) code fragment, assume that this.done is a non-volatile boolean field:

    while (!this.done)
    Thread.sleep(1000);

    The compiler is free to read the field this.done just once, and reuse the cached value in each execution of the loop. This would mean that the loop would never terminate, even if another thread changed the value of this.done.

    I don’t believe there’s a relevant difference between boolean and reference fields.

    BUT: in practice, I’ve been unable to reproduce such a race condition. This may be because I’m missing something in the spec. I think it’s likely, though, that the JVM implementation makes more informal guarantees than the spec, and these programs take advantage of it. I’d still mark my currentIndex field volatile by default, absent proof of a performance issue.

    Would love to be proven wrong, though.

    1. I’m too used to editable comments and don’t proofread enough. Ignore the ‘// Thread 1’ comment. And the relevance of the JSL quote is to point out that a thread doesn’t have to *ever* update its cached copy of a field it once read.

      1. Yes, you are right.
        I had a vague notion that leaving a synchronised block should flush the thread’s local cache to main memory, but digging deeper, I see that the spec doesn’t make any guarantee about if/when other threads that are not synchronising on the same lock will see this change.
        And even if there was some guarantee there, I think that using volatile would be the better – if only to make the assumptions and semantics of the code explicit.

      2. Looks like I was (probably) wrong after all. The JLS chapter 17 is just badly written or incomplete or something.

        The javadoc for package java.util.concurrent says, “An unlock (synchronized block or method exit) of a monitor happens-before every subsequent lock (synchronized block or method entry) of that same monitor. And because the happens-before relation is transitive, all actions of a thread prior to unlocking happen-before all actions subsequent to any thread locking that monitor.”. ‘All actions’ includes writing to shared variables.

        The JSR 133 (new JMM) FAQ (https://www.cs.umd.edu/users/pugh/java/memoryModel/jsr-133-faq.html#synchronization) says: “Synchronization ensures that memory writes by a thread before or during a synchronized block are made visible in a predictable manner to other threads which synchronize on the same monitor. After we exit a synchronized block, we release the monitor, which has the effect of flushing the cache to main memory, so that writes made by this thread can be visible to other threads.”

  2. Egad! I finally understood that the JLS does say this same thing!

    Two actions in the same thread always have a happens-before relationship matching their program execution order. This includes the action-in-a-thread of locking or unlocking a monitor. Unlocking a monitor has a happens-before relationship with locking it in another thread. And in the other thread, locking the monitor happens-before reading the shared variable. And happens-before relations are transitive.

    Conclusion: I was completely wrong and the field does not have to be volatile.

    Although it totally should be private. A public var is just euugh.

    1. I’m not sure I understand.

      Assume we have a writer thread which executes:
      create thread & cache var -> lock -> modify var -> unlock & flush cache
      And a reader thread which executes:
      create thread & cache var -> read var -> read var

      Since the reader thread doesn’t lock on anything, this is a legal ordering:
      1. W: create thread & cache var
      2. R: create thread & cache var
      3. R: read var
      4. W: lock
      5. W: modify var
      6. W: unlock & flush cache
      7. R: read var
      In step 7, the reader is seeing a stale value from its cache.

      Since the reader is not locking the monitor at any point, there’s no happens-before relation between 6 and 7. Am I missing something?

      1. You’re right. I forgot that in your case the reader didn’t synchronize on the monitor. My comments from today don’t apply to your example.

        I started out (this morning) by asking myself, even if both the reader and the writer both synchronized on the same monitor, how did the apparently unrelated shared variable value get from one to the other? It’s a standard assumption that it does, but I couldn’t see at first where the JLS says so. And by the end I forgot that you were talking about a different case entirely.

        My bad.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s