Saturday, October 20, 2012

eValhalla Setup

[Previous in this series: eValhalla Kick Off, Next: eValhalla User Management]

The first step in eValhalla after the official kick off is to set up a development environment with all the selected technologies. That's the goal for this iteration. I'll quickly go through the process of gathering the needed libraries and implement a simple login form that ties everything together.

Technologies

Here are the technologies for this project:
  1. Scala programming language - I had a dilemma. Java has a much larger user base and therefore should have been the language of choice for tutorial/promotional material on HGDB and JSON storage with it. However, this is actually a serious project meant to go live eventually, I needed an excuse to code up something more serious with Scala, and Scala has enough accumulated merits, so Scala it is. I will still show some Java code as well, just in the form of examples equivalent to the main code.
  2. HyperGraphDB with mJson storage - that's a big part of my motivation to document this development. I think HGDB-mJson are a really neat pair and more people should use them to develop webapps. 
  3. Restlet framework REST - this is one of very few implementations of JSR 311, that is sort of lightweight and has some other extras when you need them. 
  4. jQuery - That's a no brainer.
  5. AngularJS - Another risky choice, since I haven't used this before. I've used KnockoutJS and Require.js, both great frameworks and well thought out. I've done some ad hoc customization of HTML tags and tried various template engines; AngularJS promises to give me all of those in a single library. So let's give it a shot.

Getting and Running the Code

Before we go any further, I urge you to get, build and run the code. Besides Java and Scala, I encourage you to get a Git client (Git is now supported on Windows as well), and you need the Scala Build Tool (SBT). Then, on a command console, issue the following commands:
  1. git clone https://github.com/publicvaluegroup/evalhalla.git
  2. cd evalhalla
  3. git checkout phase1
  4. sbt
  5. run
Note the 3rd step of checking out the phase1 Git tag - every blog entry is going to be a separate development phase so you can always get the state of the project at a particular blog entry. If you don't have Git, you can download an archive from:

https://github.com/publicvaluegroup/evalhalla/zipball/phase1

All of the above commands will take a while to execute the first time, especially if you don't have SBT yet. But at the end you should see something like this on your console:

[info] Running evalhalla.Start 
No config file provided, using defaults at root /home/borislav/evalhalla
checkpoint kbytes:0
checkpoint minutes:0
Oct 18, 2012 12:01:01 AM org.restlet.engine.connector.ClientConnectionHelper start
INFO: Starting the internal [HTTP/1.1] client
Oct 18, 2012 12:01:01 AM org.restlet.engine.connector.ServerConnectionHelper start
INFO: Starting the internal [HTTP/1.1] server on port 8182
Started with DB /home/borislav/evalhalla/db

and you should have a running local server accessible at http://localhost:8182. Hit that URL, type in a username and a password and hit login.

Architectural Overview

The architecture is made up of a minimal set of REST services that essentially offer user login and access-controlled data manipulation to a JavaScript client-side application. The key will be to come up with an access policy that deals gracefully with a schema-free database.

The data itself consists of JSON objects stored as a hypergraph using HGDB-mJson. From the client side we can create new objects and store them. We can then query for them or delete them. So it's a bit like the old client-server model from the 90s. HyperGraphDB supports strongly typed data, but we won't be taking advantage of that. Instead, each top-level JSON object will have a special property called entity that will contain the type of the database entity as a string. This way, when we search for all users for example, we'll be searching for all JSON objects with property entity="user".
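To make the entity convention concrete, here is a minimal sketch of that kind of pattern lookup. It does not use HGDB-mJson at all: plain Scala maps stand in for JSON objects, and the names EntityPatternSketch, matches and findAll are hypothetical, made up for this illustration only.

```scala
object EntityPatternSketch {
  // Stand-in for a JSON object: a flat map of property names to values.
  // (Assumption for illustration; the real project stores mjson.Json objects.)
  type Obj = Map[String, Any]

  // An object matches a pattern when it carries every property of the pattern
  // with an equal value. Extra properties on the object are ignored.
  def matches(pattern: Obj, candidate: Obj): Boolean =
    pattern.forall { case (key, value) => candidate.get(key) == Some(value) }

  // All stored objects that match the pattern.
  def findAll(store: Seq[Obj], pattern: Obj): Seq[Obj] =
    store.filter(candidate => matches(pattern, candidate))

  // A tiny in-memory "database" with two users and one other entity.
  val store: Seq[Obj] = Seq(
    Map("entity" -> "user", "email" -> "ann@example.com"),
    Map("entity" -> "user", "email" -> "bob@example.com"),
    Map("entity" -> "comment", "text" -> "hello")
  )

  // Searching for all users means searching for entity == "user".
  val users: Seq[Obj] = findAll(store, Map("entity" -> "user"))
}
```

The point of the sketch is only that the type of an object is ordinary data, so finding all objects of a type is the same operation as finding objects by any other property.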

There are many reasons to go for REST+AJAX rather than, say, servlets. I hardly feel the need to justify it - it's stateless, you don't have to deal with dispatching, you just design an API, it's more responsive, and it will soon be 2013 after all. The use of JSR 311 allows us to switch server implementations easily. It's pretty well-designed: you annotate your classes and methods with the URI paths they must be invoked for. Then a method's parameters can be bound to portions of a URI, to HTTP query parameters, to form fields etc.

I'm not sure yet what the REST services will be exactly, but the idea is to keep them very generic so the core could be just plugged for another webapp and writing future applications could be entirely done in JavaScript.

Project Layout

The root folder contains the SBT build file build.sbt, a project folder with some plugin configurations for SBT and a src folder that contains all the code following Maven naming conventions which SBT adopts. The src/main/html and src/main/javascript folders contain the web application. When you run the server with all default options, that's where it serves up the files from. This way, you can modify them and just refresh the page.  Then src/main/scala contains our program and src/main/java some code for Java programmers to look at and copy & paste. The main point of the Java code is really to help people that want to use all those cool libraries but prefer to code in Java instead.

To load the project in Eclipse,  use SBT to generate project files for you. Here's how:
  1. cd evalhalla
  2. sbt
  3. eclipse
  4. exit
Then you'll have a .project and a .classpath file in the current directory, so you can go to your Eclipse and just do "Import Project". Make sure you've run the code before though, in order to force SBT to download all dependencies.

Code Overview

Ok, let's take a look at the code now, all under src/main. First, look at html/index.html, which gets loaded as the default page. It contains just a login form and the interesting part is the function eValhallaController($scope, $http). This function is invoked by AngularJS due to the ng-controller attribute in the body tag. It provides the data model of the HTML underneath and also a login button event handler, all through the $scope parameter. The properties are associated with HTML elements via ng-model, and buttons are associated with functions via ng-click. An independent tutorial on AngularJS, one of few since it's pretty new, can be found here.

The doLogin function posts to /rest/user/login. That's bound to the evalhalla.user.UserService.authenticate method (see the user package). The binding is done through the standard JSR 311 Java annotations, which also work in Scala. I've actually done an equivalent version of this class in Java at java/evalhalla/UserServiceJava. A REST service is essentially a class where some of the public methods represent HTTP endpoints. An instance of such a class must be completely stateless, because a JSR 311 implementation is free to create fresh instances for each request. The annotations work by deconstructing an incoming request's URI into relative paths at the class level and then at the method level. So we have the @Path("/user") annotation (or @Path("/user1") for the Java version so they don't conflict). Note the @Consumes and @Produces annotations at the class level, which basically say that all methods in that REST service accept submitted JSON content and return JSON results. Note further how the authenticate method takes a single Json parameter and returns a Json value. Now, this is mjson.Json and JSR 311 doesn't know about it, but we can tell it how to convert to and from input/output streams. This is done in the java/evalhalla/JsonEntityProvider.java class (which I haven't ported to Scala yet). This entity provider and the REST services themselves are plugged into the framework at startup, so before examining the implementation of authenticate, let's look at the startup code.

The Start.scala file contains the main program and the JSR 311 eValhalla application implementation class. The application implementation is only required to provide all REST services as a set of classes that the JSR 311 framework introspects for annotations and for the interfaces they implement. So the entity converter mentioned above, together with both the Scala and Java versions of the user service, are listed there. The main program itself contains some boilerplate code to initialize the Restlet framework. It asks the framework to serve up some files from the html and javascript folders, and it also attaches the JSR 311 REST application under the 'rest' relative path.

An important line in main is evalhalla.init(). This initializes the evalhalla package object defined in scala/evalhalla/package.scala. This is where we put all application global variables and utility methods, and where we initialize the HyperGraphDB instance. Let's take a closer look. First, configuration is optionally provided as a JSON-formatted file, the only possible argument to the main program. All properties of that JSON are optional and have sensible defaults. With respect to deployment configuration, there are two important locations: the database location and the web resources location. The database location, specified with dbLocation, is by default taken to be db under the working directory from which the application is run. So, for example, if you've followed the above instructions to run the application from the SBT command prompt for the first time, you'd have a brand new HyperGraphDB instance created under your EVALHALLA_HOME/db. The web resources served up (html, javascript, css, images) are configured with siteLocation, the default being src/main so you can modify source and refresh. So here is how the database is created. You should be able to easily follow Scala code even if you're mainly a Java programmer.

    val hgconfig = new HGConfiguration()
    hgconfig.getTypeConfiguration().addSchema(new JsonTypeSchema())
    graph = HGEnvironment.get(config.at("dbLocation").asString(), hgconfig)
    registerIndexers
    db = new HyperNodeJson(graph)

Note that we are adding a JsonTypeSchema to the configuration before opening the database. This is important for the mJson storage implementation that we are mostly going to rely on. Then we create the graph database, create indices (for now just an empty stub) and last but not least create an mJson storage view on the graph database - a HyperNodeJson instance. Please take a moment to go through the wiki page on HGDB-mJson. The graph and db variables above are global variables that we will be accessing from everywhere in our application. 

Some other points of interest here are the utility methods:

  def ok():Json = Json.`object`("ok", True);
  def ko(error:String = "Error occurred") = Json.`object`("ok", False, "error", error);

Those are global as well and offer some standard result values from REST services that the client side may rely on. Whenever everything went well on the server, it returns an ok() object whose ok property is the boolean true. If something went wrong, the ok property of the JSON returned by a REST call is false and the error property provides an error message. Any other relevant data, on success or failure, is embedded within those ok or ko objects.
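As a rough illustration of that convention, here is a small sketch of how a caller might tell the two kinds of results apart. Plain Scala maps stand in for mjson.Json values, and ResultConventionSketch, describe and the "email" property on the success result are hypothetical names invented for this example.

```scala
object ResultConventionSketch {
  // Stand-in for a JSON result object (assumption: the real code returns mjson.Json).
  type Result = Map[String, Any]

  // Success: the "ok" property is true.
  def ok(): Result = Map("ok" -> true)

  // Failure: the "ok" property is false and "error" carries a message.
  def ko(error: String = "Error occurred"): Result =
    Map("ok" -> false, "error" -> error)

  // Additional data rides along in the same object, next to the "ok" flag.
  val loginResult: Result = ok() + ("email" -> "ann@example.com")

  // What a consumer of the convention does: look at "ok" first, then at "error".
  def describe(result: Result): String =
    if (result.get("ok") == Some(true)) "success"
    else "failure: " + result.getOrElse("error", "unknown error")
}
```

Because every service answers in the same shape, the client needs only one piece of code to decide whether a call succeeded.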

Lastly, it is common to wrap pieces of code in transactions. After all, we are developing a database-backed application and we want to take full advantage of the ACID capabilities of HyperGraphDB. Scala makes this particularly easy because it supports closures. So we have yet another global utility method that takes a closure and runs it as a HGDB transaction:

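  // Runs the given closure inside a new HyperGraphDB transaction and returns its result.
  def transact[T](code:Unit => T) = {
    try {
      // The transaction manager expects a Callable, so we wrap the closure in one.
      graph.getTransactionManager().transact(new java.util.concurrent.Callable[T]() {
        def call:T = {
          return code()
        }
      });
    }
    // Meant to unwrap the value carried by a 'return' executed inside the closure.
    catch { case e:scala.runtime.NonLocalReturnControl[T] => e.value }
  }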

This will always create a new transaction. Because BerkeleyDB JE, which is the default storage engine as of HyperGraphDB 1.2, doesn't support nested transactions, one must make sure that transact is not called within another transaction. So when we are in a situation where we want to have a transaction and we'd happily be embedded in some top-level one, we can call another global utility function: ensureTx, which behaves like transact except it won't create a new transaction if one is already in effect.
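The difference between the two can be sketched without a database. The code below is not the eValhalla implementation and does not call HyperGraphDB: a per-thread flag stands in for "a transaction is in effect", commit and rollback are left out, and EnsureTxSketch, active and started are hypothetical names.

```scala
object EnsureTxSketch {
  // Hypothetical stand-in for the transaction manager's state:
  // true while the current thread is inside a transaction.
  private val active = new ThreadLocal[Boolean] {
    override def initialValue(): Boolean = false
  }

  // Counts how many transactions were actually started (for demonstration only).
  var started = 0

  // Always starts a new transaction; refuses to nest, mirroring a storage
  // engine without nested transaction support.
  def transact[T](code: () => T): T = {
    if (active.get()) throw new IllegalStateException("nested transactions are not supported")
    active.set(true)
    started += 1
    try code()
    finally active.set(false) // a real implementation would commit or roll back here
  }

  // Joins the current transaction if there is one, otherwise starts a new one.
  def ensureTx[T](code: () => T): T =
    if (active.get()) code() else transact(code)
}
```

With this shape, a utility method that needs a transaction can call ensureTx and work both on its own and when invoked from code that is already running inside transact.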

Ok, armed with all these preliminaries, we are now able to examine the authenticate method:

    @POST
    @Path("/login")
    def authenticate(data:Json):Json = {
        return evalhalla.transact(Unit => {
            var result:Json = ok();
            var profile = db.retrieve(jobj("entity", "user", 
                        "email", data.at("email").asString().toLowerCase()))
            if (profile == null) { // automatically register user for now         
              val newuser = data.`with`(jobj("entity", "user"));
              db.addTopLevel(newuser);
            }
            else if (!profile.is("password", data.at("password")))
                result = ko("Invalid user or password mismatch.");
            result;   
        });
    }

The @POST annotation means that this endpoint will be matched only for the HTTP POST method. First we do a lookup for the user profile. We do this by pattern matching. We create a Json object that first identifies that we are looking for an object of type user by setting the entity property to "user". Then we provide another property, the user's email, which we know is supposed to be unique so we can treat it as a primary key. However, note that neither HyperGraphDB nor its Json storage implementation provides any notion of a primary key other than HyperGraphDB atom handles. The HyperNodeJson.retrieve method returns only the first object matching the pattern. If you want an array of all objects matching the pattern, use HyperNodeJson.getAll.

Note the 'jobj' method call in there: this is a rename, in the import section, of the Json.object factory method. It is necessary because in Scala object is a keyword. Besides an import rename, another way to use a keyword as a method name in Scala is to wrap it in backquotes `, as is done with data.`with` above, which is basically an append operation, merging the properties of one Json object into another. The db.addTopLevel is explained in the HGDB-mJson wiki. Also, you may want to refer to the mJson API Javadoc.

One last point about the structure of the authenticate method: there are no local returns inside the closure passed to transact. The result variable contains the result and it is written as the last expression of the closure and therefore returned as its value. I like local returns actually (i.e. a return statement in the middle of the method following if conditions or within loops or whatever), but inside a closure Scala implements them by throwing an exception, scala.runtime.NonLocalReturnControl. This exception gets caught inside the HyperGraphDB transaction, which has a catch-all clause and treats exceptions as true error conditions rather than as a control flow mechanism. This can be fixed inside HyperGraphDB, but avoiding local returns is not such a bad coding practice anyway.
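The interaction between a return statement and a catch-all clause is easy to reproduce in isolation. The sketch below has nothing to do with HyperGraphDB: runWithCatchAll is a hypothetical stand-in for any library method that runs a closure and catches every Throwable, and the other names are made up as well. It assumes Scala 2 style non-local returns (newer Scala 3 compilers still accept them but warn that they are deprecated).

```scala
object NonLocalReturnSketch {
  // Hypothetical stand-in for a transaction runner with a catch-all clause:
  // anything thrown by the closure is treated as an error.
  def runWithCatchAll[T](code: () => T): Either[Throwable, T] =
    try Right(code())
    catch { case t: Throwable => Left(t) }

  // When 'early' is true, the 'return' inside the closure tries to exit
  // 'lookup' itself. Scala implements that by throwing a control-flow
  // exception, which the catch-all above intercepts before it can reach
  // 'lookup'.
  def lookup(early: Boolean): Either[Throwable, String] =
    runWithCatchAll { () =>
      if (early) return Right("early exit")
      "normal exit"
    }
}
```

Calling lookup(false) yields the closure's last expression as a normal result, while lookup(true) comes back as a failure wrapping the control-flow exception instead of the intended value. That is the situation the result variable in authenticate avoids.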

Final Comments

Scala is new to me so take my Scala coding style with a grain of salt. Same goes with AngularJS. I use Eclipse and the Scala IDE plugin from update site  http://download.scala-ide.org/releases-29/stable/site (http://scala-ide.org/download/nightly.html#scala_ide_helium_nightly_for_eclipse_42_juno for Eclipse 4.2). Some of the initial Scala code is translated from equivalent Java code from other projects. If you haven't worked with Scala, I would recommend giving it a try. Especially if, like me, you came to Java from a richer industrial language like C++ and had to give up a lot of expressiveness.

I'll resist the temptation to make this into an in-depth tutorial of everything used to create the project. I'll say more about whatever felt not that obvious and give pointers, but mostly I'm assuming that the reader is browsing the relevant docs alongside reading the code presented here. This blog is mainly a guide.

In the next phase, we'll do the proverbial user registration and put in place some security mechanisms.
