Some Scala Randomness

I thought it’d be a fun exercise to implement a ?: operator for Scala, like the one Java has:

boolValue ? ifTrue : ifFales

I turned it into a ?! operator, though, because Scala didn’t like me trying to define a method called ‘:’.

implicit class FancyBoolean(val b: Boolean) {

	def ?[T] (ifTrue: => T) = if (b) {
		new TrueBranch[T] {
			override def ![A >: T, F <: A](ifFalse: => F): A = ifTrue
		}
	} else {
		new TrueBranch[T] {
			override def ![A >: T, F <: A](ifFalse: => F): A = ifFalse
		}
	}

	trait TrueBranch[T] {

		def ![A >: T, F <: A] (ifFalse: => F): A
	}
}

Let’s see how this stuff behaves:

scala> true ? 1 ! 2
res2: Int = 1

scala> false ? 1 ! 2
res3: Int = 2

scala> false ? 1 ! "Hi!"
res4: Any = Hi!

scala> true ? println("Evaluated true!") ! println("Evaluated false!")
Evaluated true!

Nice :)

Small Note About Concurrency

I’m reading Scala in Depth. In Section 2.3.2 on concurrency, the author gives the following example of a thread safe “index service” (removed trait definition for brevity):

class ImmutableService[Key, Value] {
  var currentIndex = new ImmutableHashMap[Key,Value]
  def lookUp(k : Key) : Option[Value] = currentIndex.get(k)
  def insert(k : Key, v: Value) : Unit = synchronized {
    currentIndex = currentIndex + ((k, v))
  }
}

The author shows that this implementation, is much faster than an implementation which uses a value reference to a mutable map, and synchronises both lookup and insert operation.

When I was reading this code, I was wondering whether this implementation is indeed safe. After all, the assignment operation to the var itself is not synchronised – do we have a guarantee that it is atomic?

The Java Language Specification answered my question:

For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.

Writes and reads of volatile long and double values are always atomic.

Writes to and reads of references are always atomic, regardless of whether they are implemented as 32-bit or 64-bit values.

Note that in general, updating 64-bit chunks of memory in the JVM may not be atomic, but reference updates specifically must be atomic as per the JLS.

This is a rather delicate point – in the general case, if reference assignments were not guaranteed to be atomic (as is sometimes the case in other languages), this example would not be correct since the reading threads may see a partially initialised reference.

Since this is such a delicate point, I think it’s worth mentioning explicitly in the book’s text.

Understanding the Y Combinator

I am now reading Types and Programming Languages, and I’ve been trying to wrap my head around the Y combinator and why it works. This is how I explained it to myself:

We want to define a recursive function, which means the function should be able to refer to itself. Unfortunately, Lambda calculus does not give us this option, so we define a function of the form:

g = \lambda func. \lambda param.\space\text{return something or} func(param')

Where param' is some reduced version of param . In order for this thing to be a recursion, we need to find a parameter func for which:

func = g(func)

We notice that func is a fixed point of g by definition, hence the name fixed point combinators. We also notice that it must be a function of g (otherwise it’s a constant and we can show trivially that it doesn’t work, duh), so we can write:

Y(g) = func = g(func) = g(Y(g))

Cool. Now we are looking for term_0 such that:

Y = \lambda f.\underbrace{\text{ } f(term_0) \text{ }}_{term_0}

Hmmm… Ok. Let’s try to name term_0 and pass it as a parameter to f , will that work?

Y = \lambda f.\underbrace{(\lambda x. f(x))}_{term_0} term_0 =\lambda f.\underbrace{(\lambda x. f(x))}_{term_0}\underbrace{(\lambda x. f(x))}_{term_0}

This sucks. After our change we no longer need to pass term_0 to f , but term_1 :

Y = \lambda f.\underbrace{(\lambda x. f(term_1))(\lambda x. f(term_1))}_{term_1}

Fortunately, can easily express term_1 using our bound variable x as (x x) ! So if we write:

Y = \lambda f.(\lambda x. f(x x))(\lambda x. f(x x))

We win!

We can see that the Y combinator as we defined it is a fixed point of the function g, just like we wanted.
Great success :)

I ignored reduction rules for the sake of simplicity, but it’s interesting to note that the book talks about another type of fixed point combinator, which should be used under different reduction rules:

fix = \lambda f. (\lambda x. f (\lambda y. x x y))(\lambda x. f (\lambda y. x x y))

I wonder how many fixed points g has under each set of reduction rules.

 

Please Don’t Use the Word Voodoo

Not many things irk me as the term Voodoo, when it’s applied to software engineering.

“I found a piece of code that solves this bug, but it’s completely Voodoo”

“Working with this library is like doing Voodoo”

Don’t. Say. That. It makes the problem you are working on seem like there is some unknown force that makes it impossible to solve in logical and analytical means. It makes it sound like the problem cannot be solved at all. It makes you sound like you are powerless against it. It makes you sound like you gave up.

It makes you sound unprofessional.

There are no ghosts in your computer. Everything has a logical explanation, you just haven’t found it yet. And wherever you use the term Voodoo, you can replace it with the words “I don’t understand” and get a perfectly valid sentence, that invites further analytical investigation, questioning and rational decision making.

“I found a piece of code that solves this bug but I don’t understand why it works”

“I don’t understand the library I’m working with”

From here you can continue to ask questions and make informed decisions. You can always invest more time in understanding the piece of technology you are dealing with. But you can also ask how much time it is going to require, what is the risk if you don’t, and if it is worth your time.

Please don’t use the term Voodoo when you talk about software engineering. We all are professionals here.

Tesseract OCR + Open CV 2 on iOS

Disclaimer: This is a quick and messy post that I wrote just so that I will not forget what I did. I am not sure that this is the best way to do things and there may be redundant or missing steps. I will clean this up if I need to do it again. If you happen to go through this procedure and find mistakes, drop me a note and I will fix the post.

The problem

  • I needed to add this build of Tesseract OCR to my iOS project, which uses Open CV 2.
  • Following the installation procedure in the link, caused the following error:

    ‘opencv2/opencv.hpp’ file not found

  • I tried solving it using this stack overflow solution, but it caused a circular dependency issue – every time I removed and added some framework, I got the following error for another framework:

    ld: framework not found <Framework Name>
    clang: error: linker command failed with exit code 1 (use -v to see invocation)

What worked

  • Added the Tesseract OCR iOS project from the link into my workspace
  • Added another build target for the Tesseract project, of type: Cocoa Touch Static Library
  • For my new target:
    • Under build phases, compile sources – added Tesseract.mm (add any file that needs compiling, really)
    • Under build phases, copy files – added Tesseract.h
    • Under link with libraries, added:
      • libstdc++.6.0.9.dylib (Not sure it’s needed here)
      • All the files under the lib folder in the Tesseract OCR project
    • In Build Settings, click “all” so that you can see all settings and look for ‘architecture’ in the search box. Changed “Build Active Architecture Only” to no.
      Made sure I have all the relevant architectures selected.
    • In Build Settings, C++ Standard Library, chose libc++
      (I did the same for the project itself as well, I’m not sure it’s necessary)
  • Selected my new library as a build target and built the thing.
  • For my main project, under the app target, general, linked framework and libraries, added:
    • My new library
    • libstdc++.6.0.9.dylib (Here it’s a must)
  • Added tessdata to the project

Used Tesseract to recognise text in images to my heart’s content.