Let’s say we want to create a newsletter grouping together all the Daily Deals coming from popular IT Book Publishers.
We can source the Daily Deal information by scraping each publisher's website.
We'll employ the Strategy Pattern for two reasons: the web-scraping algorithm is different for each publisher website, so we want to encapsulate it, and we want to stay free to add new publishers in the future without altering the core context.
Essentially, we want to end up with a form like this:
new WebScrappingContext(Strategy).dailyDeal(url)
In the above form, the Strategy can vary and is interchangeable. It is paired with the publisher URL, which is passed to the Strategy function.
Let's start by defining a type alias for the Strategy: a function that takes a URL and produces the Daily Deal:
type WebScrappingStrategy = String => String
Next we'll create the context that hosts the Strategy function and applies it to the URL string:
case class WebScrap(strategy: WebScrappingStrategy) {
  def dailyDeal(url: String): String = strategy( url )
}
Note that we have used a case class, but we could just as well use a plain method to achieve the same effect:
def dailyDeal( url:String, webScrappingStrategy: WebScrappingStrategy ) = webScrappingStrategy( url )
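To see the interchangeability without touching the network, here is a minimal sketch that plugs two stub strategies into the same context. The type alias and the WebScrap case class are the ones defined above; the stub strategies, their return values and the StrategySketch wrapper object are hypothetical and exist only for illustration.

```scala
object StrategySketch {
  // Same alias and context as in the article.
  type WebScrappingStrategy = String => String

  case class WebScrap(strategy: WebScrappingStrategy) {
    def dailyDeal(url: String): String = strategy(url)
  }

  // Hypothetical stub strategies: no real scraping happens here.
  val fixedTitleStub: WebScrappingStrategy = _ => "Some Book Title"
  val echoUrlStub: WebScrappingStrategy = url => s"Deal found at $url"

  def main(args: Array[String]): Unit = {
    // The context stays the same; only the strategy changes.
    println(WebScrap(fixedTitleStub).dailyDeal("http://example.com"))
    println(WebScrap(echoUrlStub).dailyDeal("http://example.com"))
  }
}
```

Because a strategy is just a String => String function, swapping in a stub like this is also a convenient way to test the context without depending on a live website.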
Then we'll get busy scraping the publisher websites.
The Manning Daily Deal web-scraper Strategy:
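import com.gargoylesoftware.htmlunit.{BrowserVersion, WebClient} // HtmlUnit 2.x package names
import com.gargoylesoftware.htmlunit.html.HtmlPage
import org.jsoup.Jsoup

// HtmlUnit loads the page and runs its JavaScript; Jsoup then queries the rendered markup.
def ManningWebScrappingStrategy: WebScrappingStrategy =
  (url: String) =>
    Jsoup.parse( new WebClient(BrowserVersion.CHROME).getPage[HtmlPage]( url ).asXml )
      .select("div.dotdbox b").text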
Note that the Manning website uses JavaScript to create the Daily Deal section. Jsoup does not execute JavaScript, so we use HtmlUnit to render the page first and hand the resulting markup to Jsoup.
One step further, with the call to the Strategy context case class:
def ManningDailyDeal = {
  val ManningWebScrappingStrategy: WebScrappingStrategy =
    (url: String) =>
      Jsoup.parse( new WebClient(BrowserVersion.CHROME).getPage[HtmlPage]( url ).asXml )
        .select("div.dotdbox b").text
  WebScrap( ManningWebScrappingStrategy ).dailyDeal( "http://www.manning.com" )
}
The Strategy Pattern is applied in the last line of the above method, which is also its return value:
WebScrap( ManningWebScrappingStrategy ).dailyDeal( "http://www.manning.com" )
The URL is inherent to the Strategy, and ultimately to the publisher, which is why the two are grouped together under the ManningDailyDeal method.
Similarly here are the other Publisher Strategies:
def OReillyDailyDeal = {
  val OReillyWebScrappingStrategy: WebScrappingStrategy =
    Jsoup.connect(_).get.select("a[href$=DEAL] strong").get(0).text
  WebScrap( OReillyWebScrappingStrategy ).dailyDeal( "http://oreilly.com" )
}
def APressDailyDeal = {
  val APressWebScrappingStrategy: WebScrappingStrategy =
    Jsoup.connect(_).get.select("div.block-dotd").get(0).select("a")
      .get(0).select("img").attr("alt")
  WebScrap( APressWebScrappingStrategy ).dailyDeal( "http://www.apress.com" )
}
def SpringerDailyDeal = {
  val SpringerWebScrappingStrategy: WebScrappingStrategy =
    Jsoup.connect(_).get.select("div.block-dotd").get(1).select("a")
      .get(0).select("img").attr("alt")
  WebScrap( SpringerWebScrappingStrategy ).dailyDeal( "http://www.apress.com" )
}
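Each of these strategies throws an exception if the site is unreachable or the selector no longer matches anything (for example, get(0) on an empty result). One possible way to keep a single failing publisher from breaking the whole newsletter is to wrap a strategy in scala.util.Try. This is only a sketch under assumptions: the safely helper, the fallback text and the stub strategies are hypothetical and not part of the original code.

```scala
import scala.util.Try

object SafeStrategySketch {
  type WebScrappingStrategy = String => String

  // Hypothetical helper: wraps a strategy that may throw and returns
  // the given fallback text instead of propagating the exception.
  def safely(strategy: WebScrappingStrategy, fallback: String): WebScrappingStrategy =
    url => Try(strategy(url)).getOrElse(fallback)

  // Stubs standing in for real scrapers.
  val workingStub: WebScrappingStrategy = _ => "Some Book Title"
  val brokenStub: WebScrappingStrategy =
    _ => throw new IndexOutOfBoundsException("selector matched nothing")

  def main(args: Array[String]): Unit = {
    println(safely(workingStub, "No deal available")("http://example.com"))
    println(safely(brokenStub, "No deal available")("http://example.com"))
  }
}
```

Since the wrapped result is still a String => String, it can be handed to the same context unchanged, which is one of the benefits of modelling the Strategy as a plain function.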
That's pretty much it for the Strategy Pattern. If we want to take it one step further, we can pack it all up in a nice Factory Method OO pattern:
trait Publisher
object Manning extends Publisher
object APress extends Publisher
object Springer extends Publisher
object OReilly extends Publisher
object DailyDeal {
  def apply(publisher: Publisher) = publisher match {
    case Manning  => ManningDailyDeal
    case APress   => APressDailyDeal
    case Springer => SpringerDailyDeal
    case OReilly  => OReillyDailyDeal
  }
}
So we can make calls like so:
DailyDeal(Manning)
DailyDeal(OReilly)
DailyDeal(APress)
DailyDeal(Springer)
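Since the goal is a newsletter, the factory can be mapped over all publishers to collect the deals in one pass. The sketch below makes assumptions that are not in the original: Publisher is declared sealed with case objects so that the compiler can check the match for exhaustiveness, the four DailyDeal methods are replaced by stubs returning placeholder text so the example runs offline, and the publishers list and newsletter method are hypothetical names.

```scala
object NewsletterSketch {
  // Assumption: sealed + case objects, so a missing case is reported at compile time.
  sealed trait Publisher
  case object Manning extends Publisher
  case object APress extends Publisher
  case object Springer extends Publisher
  case object OReilly extends Publisher

  // Stubs standing in for the real scraping methods.
  def ManningDailyDeal: String = "Manning deal placeholder"
  def APressDailyDeal: String = "APress deal placeholder"
  def SpringerDailyDeal: String = "Springer deal placeholder"
  def OReillyDailyDeal: String = "OReilly deal placeholder"

  object DailyDeal {
    def apply(publisher: Publisher): String = publisher match {
      case Manning  => ManningDailyDeal
      case APress   => APressDailyDeal
      case Springer => SpringerDailyDeal
      case OReilly  => OReillyDailyDeal
    }
  }

  // Hypothetical list of publishers to include in the newsletter.
  val publishers: List[Publisher] = List(Manning, OReilly, APress, Springer)

  // One line per publisher, in the form "<publisher>: <deal>".
  def newsletter: String =
    publishers.map(p => s"$p: ${DailyDeal(p)}").mkString("\n")

  def main(args: Array[String]): Unit = println(newsletter)
}
```

Adding a new publisher then means adding one case object, one strategy and one case in the match; the newsletter code itself does not change.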
Full code on this GitHub repository.