Sunday, September 17, 2017

wunderscraping and S3 classes in R

I recently wrote an R package that uses generic functions.  Generic functions are how R implements object-oriented programming in its S3 system, and they are very informal.  Everyone who uses R uses generic functions.  A familiar generic function is summary, which takes an object and returns summary information about it.  How does summary know how to handle each type of object?  It doesn't.  summary is a generic function that passes the object off to a method, which is another function that does know what to do with objects of that class.  Methods MUST be named generic.class.  So, the summary method for lm objects is summary.lm.  Users can see help for summary.lm directly with ?summary.lm.
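As a small illustration of the dispatch described above, the sketch below fits a linear model and compares calling the generic with calling the method directly.  The cars dataset (shipped with base R) and the variable names fit, s1, and s2 are only assumptions chosen for the example.

```r
## fit a model; lm() returns an object whose class attribute is "lm"
fit <- lm(dist ~ speed, data = cars)
class(fit)

## the generic looks at class(fit) and hands the object to summary.lm
s1 <- summary(fit)

## calling the method directly gives the same kind of result
s2 <- summary.lm(fit)
class(s1)
class(s2)

## list a few of the summary methods available in this session
head(methods(summary))
```

Both calls end up in the same function; the generic only decides which method to run.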

In summary (no pun!), all you have to do to add a new method is write a function and name it generic.class.  So, if I made a new class for weather objects named Wx and I wanted to add a summary method, I'd simply name the function summary.Wx.  If I want a completely new generic function, then I need to create it first, using UseMethod.  With UseMethod, I can make a new generic function that will route objects to their appropriate methods.  A simple generic function is foo <- function(x) UseMethod('foo'), for which a simple method is foo.bar <- function(x) print(class(x)) # -> 'bar'.  Be careful that the generic function accepts the arguments that the methods will need.  If foo.bar is foo.bar <- function(x, suffix) print(paste0(x, suffix)) then foo(barObject, 'the third') will fail with an unused argument error, because foo itself accepts only x.  Generic functions that must accept unknown arguments for future methods can use an ellipsis: foo <- function(x, ...) UseMethod('foo').  Notice that foo must accept every argument that any foo method may require, so it's simplest to include the ellipsis unless you are sure the methods will only ever need the class object.
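The difference between the two generic signatures can be sketched as follows.  The object barObject, its contents, and the second generic named narrow are hypothetical, made up only to show the failure and the fix side by side.

```r
## a generic with an ellipsis passes extra arguments on to its methods
foo <- function(x, ...) UseMethod('foo')
foo.bar <- function(x, suffix, ...) paste0(unclass(x), suffix)

barObject <- structure('Henry ', class = 'bar')
res <- foo(barObject, 'the third')
res  # "Henry the third"

## a generic without the ellipsis rejects the extra argument before dispatch
narrow <- function(x) UseMethod('narrow')
narrow.bar <- function(x, suffix) paste0(unclass(x), suffix)

err <- tryCatch(narrow(barObject, 'the third'), error = function(e) e)
conditionMessage(err)  # reports the unused argument
```

The method narrow.bar never runs: the call fails while matching arguments to the generic itself.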

Below is a short concrete example creating a scheduler class and methods to plan, check, clean, and execute the schedule.  The class uses environments and datetime objects, both of which can be unfamiliar to many R users, but with generic functions the scheduler object gets a method interface that is easy to use:

scheduler <- function() { ## constructor function for scheduler object
    e <- structure(new.env(), class='scheduler')
    e$count <- 0
    e$date <- format(Sys.time(), '%Y-%m-%d', tz='America/New_York') ## today's date in New York
    e
}

## generic functions
check <- function(x) UseMethod('check')
clean <- function(x) UseMethod('clean')
plan <- function(x, ...) UseMethod('plan')
schedule <- function(x) UseMethod('schedule')
## default methods
check.default <- function(x) warning(paste0('check cannot handle class ', class(x)))
clean.default <- function(x) warning(paste0('clean cannot handle class ', class(x)))
plan.default <- function(x, ...) warning(paste0('plan cannot handle class ', class(x))) ## accepts ... like its generic
schedule.default <- function(x) warning(paste0('schedule cannot handle class ', class(x)))

## scheduler methods
check.scheduler <- function(scheduler) ls.str(scheduler)

## drop scheduled times that have already passed
clean.scheduler <- function(scheduler) scheduler$schedule <- with(scheduler, schedule[schedule > Sys.time()])

FORMAT <- '%H:%M' ## assumed time format; FORMAT is not defined anywhere in this post

plan.scheduler <- function(scheduler, ...) { # convenience wrapper around seq.POSIXt
    scheduler$schedule <- seq(strptime(0, '%H'), strptime(23, '%H'), ...) # today, 00:00 to 23:00
    scheduler$times <- strftime(scheduler$schedule, FORMAT)
}

schedule.scheduler <- function(scheduler) {} ## execute the schedule

Using the class is simple:

mysch <- scheduler()
plan(mysch, by='90 min') # using seq.POSIXt makes periodic scheduling easy
## clean(mysch) # don't clean to start now, else wait till next period

someFunction(mysch) ## someFunction is a placeholder: write the schedule method for the scheduler and use it from your own function

For an even more concrete example, check out wunderscraper here.  Make a schedule, as discussed above, and use it to scrape Wunderground using main(mysch).  All users will first need to register for a Wunderground API key!

Post Script:
What do you think about S3 classes?  They are very informal, and yet very useful for making R more user friendly and for indicating, but not enforcing, a certain structural expectation about how users should work with a class.  As long as people don't do such insane and anti-social things as changing the class of an R object, the informality of S3 classes is OK.  What is more controversial, perhaps, about S3 classes is that they are method centric, rather than class centric.  Users familiar with Python or C++ understand classes as relatively independent objects, whereas R's generic functions create a framework where all objects are related by a similar set of methods.  Users can completely ignore the fact that a generic function named plot exists for visualizing objects, and instead make a new generic function named graph, or some other synonym, but again that would be rather anti-social coding behaviour, and not something most people are going to do by accident.

R has more formal classes implemented in the S4 and RC systems, but I actually prefer the informality and flexibility of S3.  I particularly find S4 a poor fit with R's already byzantine typing.  RC classes look useful for reference semantics; however, as you can see in the above example, environments already provide a useful container with reference semantics.  What is particularly useful about the environment is that if the function using the environment crashes, the environment state remains as left by the function, and can be inspected or simply reused.  The wunderscraper, for example, must keep count of how many API calls it makes in a day.  If the wunderscraper crashes, it can be restarted with the same environment, and it will pick back up.  For development purposes, I can even change the scraping function, recompile the code, and restart the new code with the old environment, as long as the new code hasn't changed the environment.
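A minimal sketch of that crash-and-resume behaviour follows.  The function scrape, the simulated failure, and the comparison with a plain list are hypothetical stand-ins, not code from wunderscraper.

```r
counter <- new.env()
counter$count <- 0

## hypothetical worker: counts each call, then fails once two calls are made
scrape <- function(env, n) {
    for (i in seq_len(n)) {
        if (env$count >= 2) stop('simulated crash')
        env$count <- env$count + 1
    }
    invisible(env)
}

result <- try(scrape(counter, 5), silent = TRUE)  # crashes partway through
counter$count  # 2: the environment keeps the state reached before the crash

## a list is copied when modified inside a function, so the caller's copy is unchanged
bump <- function(l) { l$count <- l$count + 1; l }
lst <- list(count = 0)
bumped <- bump(lst)
lst$count  # 0
```

After the failure, counter can be inspected or handed to a fixed version of scrape, which would continue from the stored count.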

Darkest Dungeon is a slow descent into madness.  It starts off innocently enough.  A path through the woods to a crumbling hamlet.  A few heroes, a torch, and a dungeon waiting to be explored.  And then dungeon conquered!  We return to a celebratory drink or a prayer at the hamlet.  More heroes show up--wide eyed and ready to explore and rebuild the ruined landscape.  And the party embarks for another dungeon, and another, and another.  But soon the voices start.  The dungeons grow darker, the monsters more brutal, and the heroes more mad.  Weeks pass in the game; hours pass in real life.  The dungeon eventually massacres your favorite heroes.  Such brutality cannot go unanswered!  With maddening, obsessive resolve redoubled, the entire hamlet wails, thrashes, and prepares to lash back.  Every waking moment is dedicated to taking the dungeon.  Sanity begins to slip, for both the heroes and the players, and soon the dungeon, its loot and denizens, is the only thing that matters.  The dungeon occupies every thought until we occupy its halls as our home, having become monsters ourselves.

Darkest Dungeon is a party management roguelike.  Players embark from a home-base hamlet to explore a series of dungeons.  After each dungeon the player returns to the hamlet, where they can heal the wounded heroes in the party, upgrade hero skills and equipment, and recruit new heroes.  Heroes belong to one of thirteen classes, with more classes under development.  Each class has seven skills but only four can be active at a time.  All skills can be upgraded.  All heroes have a weapon and armour that cannot be swapped but can also be upgraded.  Finally, each hero can equip two trinkets that modify their stats; some trinkets can only be equipped by one class.

I've installed and deleted Darkest Dungeon from my computer no fewer than three times.  This is because the game is so good that when I start playing, the only way to stop is to remove the game.  This also requires me to wait half an hour to redownload the game whenever I foolishly decide to start playing it again.  This game is a time suck.  Party management is deep, the dungeons are fun, and the hamlet-dungeon rhythm keeps the player in a "just one more" mindset that makes it hard to quit playing.

Darkest Dungeon pulls off Lovecraftian horror as well as any other game or movie.  The game is further flavoured with an Eastern European air in its trappings: the hamlet, the heroes, and the voice acting.  The narration and quips are appropriately atmospheric and not usually overdone.  When the player kills a monster with poison the narrator speaks in a seething voice:
"slowly, gently, this is how a life is taken"
And when the player takes a critical hit the narrator exclaims:
"mortality clarified...in a single strike"

The strategy is also fun.  The player can craft different strategies around different parties.  Many effective, straightforward party builds exist, but the player can also put together more complex parties where heroes depend heavily on particular skills in the party.  Each party has its own rhythm and critical flaws that become apparent only when trying to apply the build in the dungeon.

...to be continued?