How to add Options in Scala

Option is a Monad (and therefore also an Applicative), so we can operate on the underlying values in a type-safe way using the Scalaz applicative builder |@|:

  import scalaz._
  import Scalaz._

  val a: Option[BigDecimal] = BigDecimal("0.123").some
  val b: Option[BigDecimal] = BigDecimal("1.234").some
  val c: Option[BigDecimal] = BigDecimal("2.345").some

  println( (a |@| b |@| c) { _ + _ + _ } )

  //yields: Some(3.702)

And if the chain is broken by a None, we don’t have to worry, since the whole expression short-circuits to None:

  import scalaz._
  import Scalaz._

  val a: Option[BigDecimal] = BigDecimal("0.123").some
  val b: Option[BigDecimal] = None
  val c: Option[BigDecimal] = BigDecimal("2.345").some

  println( (a |@| b |@| c) { _ + _ + _ } )

  //yields: None

Type Inference in Scala example

Let’s take a function definition and slim it down, using type inference to our advantage:

def f: (Int => Int) = (x:Int) => x + 1
def f: (Int => Int) =  x      => x + 1
def f: (Int => Int) =            _ + 1
def f               = (x:Int) => x + 1

Spring JPA, Hibernate, Data XML configuration

Here’s a cleaned-up, plug’n’play version of a Spring database XML config, taken from a demo project that uses Spring Data for the repository interfaces and Hibernate behind JPA.

<beans 	xmlns=""

	<bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager"
		  p:entityManagerFactory-ref="entityManagerFactory" />

	<jpa:repositories base-package=""

	<tx:annotation-driven transaction-manager="transactionManager"/>

	<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
		<property name="dataSource" ref="dataSource" />
		<property name="jpaVendorAdapter">
			<bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter"
					p:generateDdl="${database.generateDdl}" />
		</property>
		<property name="jpaProperties">
			<props>
				<prop key="hibernate.dialect">${database.databasePlatform}</prop>
				<prop key="hibernate.max_fetch_depth">3</prop>
				<prop key="hibernate.jdbc.fetch_size">50</prop>
				<prop key="hibernate.jdbc.batch_size">10</prop>
				<prop key="hibernate.show_sql">true</prop>
			</props>
		</property>
		<property name="packagesToScan" value=""/>
	</bean>
	<bean class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close" id="dataSource"


Scala Pimp-My-Library example

Recently, while working on a test in a project, I ran into a name clash on assert between ScalaTest and Scalaz. All I was after was the Scalaz feature of invoking some on any value after import scalaz.Scalaz._

After a while I gave up and wrote it myself, and it turned out surprisingly simple and concise:

class SomeType[T](t: T) {
  def some = Some(t)
}
implicit def typeToSomeType[T](t: T) = new SomeType(t)

Making possible:

scala> 123.some
res7: Some[Int] = Some(123)

scala> "hi".some
res8: Some[String] = Some(hi)

Implicits are a nice way to extend existing types everywhere in scope while staying type-safe at compile time (as long as their use doesn’t grow unwieldy).

In Groovy we could achieve the same effect via the MetaClass object, but without compile-time type safety, as Groovy is a dynamic language and all the magic happens at runtime. Also, we can’t use parametric polymorphism when going through the MetaClass object (or at least I don’t know how to do it!).

groovy:000> String.metaClass.quote = { "*" + delegate + "*" }
===> groovysh_evaluate$_run_closure1@4b51ac10
groovy:000> "hi".quote()
===> *hi*

ConcurrentHashMap computeIfAbsent method in Java 8

Java 8 added the very nifty computeIfAbsent method to the Map interface, and ConcurrentMap overrides it as part of its atomic operations. More precisely, it’s a default method that provides an alternative to what we used to code ourselves:

if (map.get(key) == null) {
   V newValue = mappingFunction.apply(key);
   if (newValue != null)
      return map.putIfAbsent(key, newValue);
}

but this time providing a function as a second argument.

Most often this method will be used in the context of ConcurrentHashMap, in which case the entire invocation is performed atomically, so the mapping function is applied at most once per key.

In terms of usage the method is handy for situations where we want to maintain a thread-safe cache of expensive one-off computed resources.
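As a minimal sketch of such a cache, we can wrap a ConcurrentHashMap and count how many times the expensive computation actually runs. The ComputeOnceCache class, its loader function and its loadCount method are hypothetical names made up for this illustration, not part of the JDK:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache wrapper, for illustration only.
class ComputeOnceCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;                      // the "expensive" computation
    private final AtomicInteger loads = new AtomicInteger();  // how often the loader ran

    ComputeOnceCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // ConcurrentHashMap applies the mapping function at most once per key;
        // later calls for the same key return the stored value.
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        // String::length stands in for an expensive computation.
        ComputeOnceCache<String, Integer> lengths = new ComputeOnceCache<>(String::length);
        System.out.println(lengths.get("expensive")); // 9
        System.out.println(lengths.get("expensive")); // 9, served from the cache
        System.out.println(lengths.loadCount());      // 1
    }
}
```

Asking twice for the same key triggers the loader only once. One caveat from the ConcurrentHashMap documentation: the mapping function should be short and must not try to update other mappings of the same map.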

Here’s another example, holding key-value pairs where the value is a thread-safe counter represented by an AtomicInteger:

private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

private void accumulate(String name) {
    // creates the counter on first use, then increments it atomically
    counters.computeIfAbsent(name, k -> new AtomicInteger()).incrementAndGet();
}

Visitor Design Pattern in Scala

Scala provides built-in support for the Visitor Design Pattern through the use of pattern matching.

The Visitor Design Pattern addresses the problem of never-ending new functionality that would otherwise be implemented by adding new methods throughout the inheritance tree.

With the Visitor pattern, new functionality is decoupled from the inheritance tree and encapsulated in a separate object, which packs together the implementations for every type in the tree, normally by overloading a method for each of the various types.
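For contrast, here is a rough sketch of that classical form in Java. All names (AnimalVisitor, TalkVisitor, accept, visit) are assumptions chosen for illustration; the point is just the one visit overload per type and the accept method that dispatches to it:

```java
// Hypothetical names throughout; a sketch of the classical pattern only.
interface AnimalVisitor<R> {
    R visit(Dog dog);   // one overload per type in the inheritance tree
    R visit(Man man);
}

abstract class Animal {
    // Each subclass passes itself to the visitor, which selects the matching overload.
    abstract <R> R accept(AnimalVisitor<R> visitor);
}

class Dog extends Animal {
    @Override <R> R accept(AnimalVisitor<R> visitor) { return visitor.visit(this); }
}

class Man extends Animal {
    @Override <R> R accept(AnimalVisitor<R> visitor) { return visitor.visit(this); }
}

// New functionality lives in its own object; Animal, Dog and Man stay untouched.
class TalkVisitor implements AnimalVisitor<String> {
    @Override public String visit(Dog dog) { return "wav wav"; }
    @Override public String visit(Man man) { return "hi"; }
}

class VisitorDemo {
    public static void main(String[] args) {
        Animal animal = new Dog();
        System.out.println(animal.accept(new TalkVisitor())); // wav wav
    }
}
```

Every new piece of functionality needs another visitor class, and every type in the tree needs its accept method up front, which is the ceremony that pattern matching removes.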

In Scala, the built-in pattern matching makes this very easy:

abstract class Animal { def walk: String }

class Dog extends Animal { override def walk = "on 4" }

class Man extends Animal { override def walk = "on 2" }

/**
 *  Visitor Pattern provides a solution to never-ending new
 *  functionality that would otherwise be implemented with a new
 *  method in the hierarchy tree.
 */
def talk(animal: Animal) = animal match {
  case _: Dog => "wav wav"
  case _: Man => "hi"
}

/**
 *  New functionality implemented in separate methods that
 *  use pattern matching to implement tailored functionality.
 */
def swim(animal: Animal) = animal match {
  case _: Dog => "on 4"
  case _: Man => "on 4"
}

Strategy Pattern in Scala – A Pragmatic Example

Let’s say we want to create a newsletter grouping together all the Daily Deals coming from popular IT Book Publishers.

We can source the Daily Deal information by web-scraping the publisher website.

We’ll employ the Strategy Pattern because we want to encapsulate the web-scraping algorithm, which is distinct for each publisher website, and keep the flexibility to introduce new publishers in the future without altering the core context.

Essentially where we want to get to is a form like this:

new WebScrappingContext(Strategy).dailyDeal(url)

In the above form, the Strategy can vary and is interchangeable; it is paired with the publisher url, to which the Strategy function will be applied.

Let’s start by defining a helpful type alias for the Strategy, a function that takes a url and produces the Daily Deal:

type WebScrappingStrategy = String => String

Next we’ll create the context that hosts the Strategy function and applies it to the url string:

  case class WebScrap(strategy: WebScrappingStrategy) {
    def dailyDeal(url: String) = strategy( url )
  }

Note that we have used a case class but we could as well use a method to achieve the same effect:

def dailyDeal( url:String, webScrappingStrategy: WebScrappingStrategy ) = webScrappingStrategy( url )

Then we’ll get busy with web-scraping publisher websites.

The Manning Daily Deal web-scraper Strategy:

    def ManningWebScrappingStrategy: WebScrappingStrategy  =
      (url:String) =>
      Jsoup.parse( new WebClient(BrowserVersion.CHROME).getPage( url )
                    .asInstanceOf[HtmlPage].asXml )
        .select("div.dotdbox b").text

Note that the Manning website uses JavaScript to create the Daily Deal section. Jsoup does not execute JavaScript, therefore we load the page with HtmlUnit first.

One step further with the call to the Strategy context case class:

  def ManningDailyDeal = {

    val ManningWebScrappingStrategy: WebScrappingStrategy  =
      (url:String) =>
      Jsoup.parse( new WebClient(BrowserVersion.CHROME).getPage( url )
                    .asInstanceOf[HtmlPage].asXml )
        .select("div.dotdbox b").text

    WebScrap( ManningWebScrappingStrategy ).dailyDeal( "" )
  }

The Strategy Pattern comes into play in the last line, the expression returned by the above method:

WebScrap( ManningWebScrappingStrategy ).dailyDeal( "" )

The url is inherent to the Strategy, and ultimately to the Publisher, hence the grouping of both under the ManningDailyDeal method.

Similarly here are the other Publisher Strategies:

  def OReillyDailyDeal = {

    val OReillyWebScrappingStrategy: WebScrappingStrategy =
      Jsoup.connect(_).get.select("a[href$=DEAL] strong").get(0).text

    WebScrap( OReillyWebScrappingStrategy ).dailyDeal( "" )
  }

  def APressDailyDeal = {

    val APressWebScrappingStrategy: WebScrappingStrategy =
      Jsoup.connect(_).get.select("div.block-dotd").get(0).select("a").text

    WebScrap( APressWebScrappingStrategy ).dailyDeal( "" )
  }

  def SpringerDailyDeal = {

    val SpringerWebScrappingStrategy: WebScrappingStrategy =
      Jsoup.connect(_).get.select("div.block-dotd").get(1).select("a").text

    WebScrap( SpringerWebScrappingStrategy ).dailyDeal( "" )
  }

That’s pretty much it for the Strategy Pattern. We can take it one step further by packing it all up in a nice Factory Method OO Pattern:

  trait Publisher
  object Manning extends Publisher
  object APress extends Publisher
  object Springer extends Publisher
  object OReilly extends Publisher

  object DailyDeal {
    def apply(publisher: Publisher) = publisher match {
      case Manning => ManningDailyDeal
      case APress => APressDailyDeal
      case Springer => SpringerDailyDeal
      case OReilly => OReillyDailyDeal
    }
  }

So we can make calls like so:

DailyDeal(Manning)

Full code on this GitHub repository.

How to Web Scrape an Html page after JS loads

Sometimes Jsoup is not enough: when we want the final version of the Html page, after its JS (redirects etc.) has run, we can use HtmlUnit.

It makes the difference between this:

<div class="dotdbox"> 
 <div style="color: #000000;text-align: center;padding: 3px 2px 0px 2px; font-size: 11px;background-color: #ffffff;"> 
  <script language="JavaScript" src="" type="text/javascript"></script> 
  <p><span style="font-size: 11px;"><a href="/free/dotd.html">Get the Deal of the Day email alert</a></span></p>

and that:

Jsoup.parse( new WebClient(BrowserVersion.CHROME).getPage("")
                        .asInstanceOf[HtmlPage].asXml ).select("div.dotdbox").text
<div class="dotdbox"> 
 <div style="color:#000000;text-align:center;padding:3px 2px 0;font-size:11px;background-color:#ffffff;"> 
                       January 19, 2014 
  <br /> 
  <br /> 
  <b> <a href=""> Practical Data Science with R </a> </b> 
  <br /> 
  <br /> Get half off the eBook or pBook 
  <br /> 
  <br /> Enter dotd040614 in the Promotional Code box when you check out 
  <p> <span style="font-size:11px;"> <a href="/free/dotd.html"> Get the Deal of the Day email alert </a> </span> </p> 

CentOS/RedHat make port 8080 visible

I am a happy DigitalOcean customer, primarily because of the low cost, the SSD drives, the friendly staff and the flexibility with which you can reshape your purchased resources into droplets within the 4 supported DataCenters (2 in NY and 2 in Amsterdam).

That is, until the need for a UK DataCenter arose, which led me to RackSpace.

On both cloud hosting providers I am making a web service available that needs to be accessible on port 8080. The CentOS flavour assembled in DigitalOcean has everything permitted by default in its iptables settings, but the one assembled in RackSpace does not.

When I issue the iptables command I get:

[dimitrisli@lon1 ~]# iptables -L -n --line-numbers
Chain INPUT (policy ACCEPT)
num target prot opt source destination
2 ACCEPT icmp --
3 ACCEPT all --
4 ACCEPT tcp -- state NEW tcp dpt:22
5 REJECT all -- reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
num target prot opt source destination
1 REJECT all -- reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
num target prot opt source destination

Simply appending a rule for port 8080 would place it after the final REJECT rule of the INPUT chain, where it would never be reached. The correct command therefore inserts the rule at the position currently held by that REJECT rule (line 5), pushing the REJECT rule down:

[dimitrisli@lon1 ~]# iptables -I INPUT 5 -m state --state NEW -m tcp -p tcp --dport 8080 -j ACCEPT -m comment --comment "Jetty Server port"

[dimitrisli@lon1 ~]# service iptables save

That finally does the trick.