Tuesday, July 28, 2015

HyperGraphDB 1.3 Released

Kobrix Software is pleased to announce the release of HyperGraphDB 1.3.

This is a maintenance release containing many bugs fixes and small improvements. Most of the efforts in this round have gone towards the various application modules built upon the core database facility.

Go directly to the download page.

HyperGraphDB is a general purpose, free open-source data storage mechanism. Geared toward modern applications with complex and evolving domain models, it is suitable for semantic web, artificial intelligence, social networking or regular object-oriented business applications.
This release contains numerous bug fixes and improvements over the previous 1.2 release. A fairly complete list of changes can be found at the Changes for HyperGraphDB, Release 1.3 wiki page.

HyperGraphDB is a Java based product built on top of the Berkeley DB storage library.

Key Features of HyperGraphDB include:
  • Powerful data modeling and knowledge representation.
  • Graph-oriented storage.
  • N-ary, higher order relationships (edges) between graph nodes.
  • Graph traversals and relational-style queries.
  • Customizable indexing.
  • Customizable storage management.
  • Extensible, dynamic DB schema through custom typing.
  • Out of the box Java OO database.
  • Fully transactional and multi-threaded, MVCC/STM.
  • P2P framework for data distribution.
In addition, the project includes several practical domain specific components for semantic web, reasoning and natural language processing. For more information, documentation and downloads, please visit the HyperGraphDB Home Page.

Many thanks to all who supported the project and actively participated in testing and development!

Monday, September 15, 2014

Jayson Skima - Validating JavaScript Object Notation Data

A schema is simply a pattern. A pure form. Computationally it can be used to try to match an instance against the pattern or create an instance from it. That's how XML schema, and even DTD, were traditionally used - mostly for validation but also as an easy way to create a fill-in-the-blanks type of template. Since JSON has been taking over XML by storm, the need for a schema eventually (in my case, finally!) overpowered the simplicity minimalist instinct of JSON lovers. Thus was born JSON Schema. Judging from the earliest activity on the Google Group where the specification committee hung out (https://groups.google.com/forum/#!forum/json-schema), the initial draft was done in 2008 and it was very much inspired by XML Schema. Which is not necessarily a bad thing. For one, the syntax used to define a schema is just JSON. A quick example:

"required":["firstName", "lastName"]

That is a complete, perfectly valid and useless schema, defined to validate some imaginary data describing people. People data will be the running example theme in this post, and if you prefer shopping cart orders, well, sorry to disappoint. So, when we validate against that schema, here is what we are enforcing:
  1. The thing must be a JSON object because we put "type":"object".
  2. If it has a firstName property, the value of that property must be a string.
  3. The value of the age property, if present, must be an number.
  4. The properties firstName and lastName are required.
Fairly straightforward. Nevermind that we haven't defined the format (i.e. the sub-schema) for the lastName property, we are still requiring its presence, we just don't care what the value is going to be. So, this is how it goes: a schema is a JSON object where you specify various constraints. If all the constraints are satisfied by a given JSON datum, the schema matches, otherwise it doesn't. The standard standardizes on what are the possible constraint, but with a few extra keywords to structure large, complex schemas.

Why Do I Care?

I can tell why I care then you can decide for yourself. When working with any sort of data structure, if you can't just assume that the structure has the expected form, it becomes very annoying, paranoia strikes at every corner and you become defensive and all sorts of mental disorders can ensue. That's why we use strongly typed languages. But if you, like me, have been doing the unorthodox thing and using JSON as your main data structure instead of spitting out a jar full of beans for your domain model then this sort of trouble befalls you.

A few years ago I made the decision to drop the beans for a project and since then I've been using the strategy in a few other, smaller scale projects where it works even better. But the popular Java libraries for JSON are a disaster. There has already been one JSR 353 about JSON, a "whatever" API, no wonder it seems dead on arrival, almost as bad as Jackson and Gson. And now Java 9 is promising a "lightweight JSON API" which looks like it might actually be well-designed, albeit it has different goals than what I need and simplicity is not one of them. So I wrote mJson. It is a small, 1 Java file, JSON library. I wanted something simple, elegant and powerful. The first two I think I've achieved, the "powerful" part is only half-way there. For instance, many people expect JSON libraries to have JSON<-> POJOs mappings and mJson doesn't, though it has extension points to do your own easily (frankly it takes 1/2 day to implement this stuff if you need it so much).

Modeling with beans offers the type checker's help on validating that structures have the desired form. If you are using JSON only to convert it to Java beans, I suppose the mapping process is a roundabout way to validate, to a certain extent. Otherwise, you either consent to live with the risk of bugs or to the extra bloat needed to defensively code against a structure that may be broken. To avoid these problems, you can write a schema, sort of like your own type rules and make use of it at strategic points in the program. Like when you are getting data from the internet. Or when you are testing a new module. Not that I'm advocating going for JSON + Schema instead of Java POJOs in all circumstances. But you should try it some time, see where it makes sense. By the way, in addition to a being a validation tool schemas are essentially metadata that represents your model (just like XML Schema).
Good. Now I want to give you a quick...

Crash Course on JSON Schema

First, the constraints are organized by the type of JSON. which is probably your starting point to describe how a JSON looks like:
{"type": "string"|"object"|"array"|"number"|"boolean"|"null"|"integer"|"any"}
As you can see, there are two additional possible types besides what you already expected: to avoid floats and to allow any type (which is the same as omitting a type constraint altogether).

Now, given the type of a JSON entity, there are further constraints available. Let's start with object. With properties you describe the object's properties' format in the form of sub-schemas for each possible property you want to talk about. You don't have to list all of them but if your set is exhaustive, you can state that with the additionalProperties keyword:
  "properties": { "firstName": { "type":"string"}, etc.... },
That keywords is actually quite versatile. Here we are disallowing any other properties besides the ones explicitly stated. If instead we want to allow the object to have other properties, we can set it to true, or not set it altogether. Or, the value of the additionalProperties keyword can alternatively be a sub-schema that specifies the form of all the extra properties in the object.

We saw how to specify required properties in the example above. Two other options constrain the size of an object: minProperties and maxProperties. And for super duper flexibility in naming, you can use regular expression patterns for property names - this could be useful if you have some format made of letters and numbers, or UUIDs for example. The keyword for that is patternProperties:
  "patternProperties": { "j_[a-zA-Z]+[0-9]+": { "type":"object"} },
The above allows 1 to 100 properties whose names follow a j_letters_digits pattern. That's it about objects. That's the biggy.

Validating arrays is mainly about validating their elements so you provide a sub-schema for the elements with the items keyword. Either you give a single schema or an array of schemas. A single schema will apply to all elements of an array while an array has to match element for element (a.k.a. tuple typing). That's the basis. Here are the extras: we have minItems and maxItems to control the size of the array; we have additionalItems which only applies when items is an array and it controls what to do with the extra elements when there are some. Similarly to the additionalProperties keyword, you can put false to disallow extra elements or supply a schema to validate them. Finally you can require that all items of an array be distinct with the uniqueItems keyword. Example:
  "items": { "type": "string" },
Here we are mandating exactly 10 unique strings in an array. That's it for areas. Numbers and strings are pretty simple. For number you can define range and divisibility. They keywords are minimum (for >=), maximum (for <=), exclusiveMinimum (if true, minimum means >), exclusiveMaximum (if true, maximum means <). Strings can be validated through a regex with the pattern keyword and by constraining their length with minLength and maxLength. I hope you don't need examples to see string and number validation in action. The regular expression syntax accepted is ECMA 262 (http://www.ecma-international.org/publications/standards/Ecma-262.htm).

Notice that there aren't any logical operators so far. In a previous iteration of the JSON Schema spec (draft 3), some of those keywords admitted areas as values with the interpretation of an "or". For example, the type could be ["string", "number"] indicating that a value can be either a string or a number. Those have been abandoned in favor of a comprehensive set of logical operators to combine schema into more complex validating behavior. Let's go through them: "and" is allOf, "or" is anyOf , "xor" is oneOf, "not" is not. Those are literally to be interpreted as standard logic: not has to be a sub-schema which must not match for validation to succeed, allOf has to be array of schemas and all of them have to match for the JSON to be accepted. Similarly anyOf is an array of which at least one has to match while oneOf means that exactly one of the schemas in the array must match the target JSON. For example to enforce that a person is married, we could declare that it must either have a husband or a wife property, but not both:
{ "oneOf":[
If you have a predefined list of values, you could use enum. For example, a gender property has to be either "male" or "female":
    "gender":{"enum":["male", "female"]}

With that, you know almost everything there is to know about JSON Schema. Almost. Above I mentioned "a few extra keywords to structure large complex schemas". I exaggerated. Actually there is only one such keyword: $ref (the related keyword id is not really needed). $ref allows you to refer to schemas defined elsewhere instead of having to spell out again the same constructs. For example if there is a standard format for address somewhere on the internet, with a schema defined for it and if that schema can be obtained at http://standardschemas.org/address (a made up url), you could do:
The fun part of $ref is that the URI can be relative to the current schema and and you can use JSON pointer in a URI fragment (the part after the pound # sign) to refer to a portion of the schema within a document! JSON Pointer is a small RFC (http://tools.ietf.org/html/rfc6901) that specs out Unix path-like expressions to navigate through properties and arrays in a complex JSON structure. For example the expression /children/0/gendre refers to the gendre of the first element in a children array property. Note that only the slash is used, no brackets or dots and that's perfectly enough. If you want to escape a slash inside a property name write ~1 and to escape a tilde write ~0. To gets your hands on some presumably rock solid zip code validation, for example, you could do:
So that means you can define the schemas for your RESTful API at a standard location and publish those and/or refer to them on your API responses. Any JSON validator has to be capable of fetching the right sub-schema and a good implementation will cache them so you don't have to worry about network hops. A reference URI can be relative to the current schema so if you have other schemas on the same base location, they can refer to each other irrespective of where they are deployed. As a special case to that, you can resolve fragments relative to the current schema. For example:
  "myschemas": {
     "properName": { "type":"string", "pattern":"[A-Z][a-z]+"}
    "firstName":{ "$ref":"#/myschemas/properName"},
    "lastName":{ "$ref":"#/myschemas/properName"}
Because the JSON Schema specification allows properties that are not keywords, we can just pick up a name, like myschemas here, as a placeholder for sub-schemas that we want to reuse. So we've defined that a proper name must start with a capital letter followed by one or more lowercase letters, and then we can reuse that anywhere we want. This is such a common pattern then the specification has defined a keyword to place such sub-schemas. This is the definitions keyword which must appear at the top-level, has no role in validation per se, but is just a placeholder for inline schemas. So the above example should be properly rewritten as:
  "definitions": {
     "properName": { "type":"string", "pattern":"[A-Z][a-z]+"}
    "firstName":{ "$ref":"#/definitions/properName"},
    "lastName":{ "$ref":"#/definitions/properName"}
To sum up, using the $ref keyword and the definitions placeholder is all you need to structure large schemas, split them into smaller ones, possibly in different documents, refer to standardized schemas over the internet etc.


Now to make use of JSON schema, there aren't actually that many implementations available yet. The popular (and bloated) Jackson supports draft 3 so far, and this part doesn't seem actively maintained. One of the JSON Schema spec authors has implemented full support on top of Jackson: https://github.com/fge/json-schema-validator, so you should know about that implementation especially if you are already a Jackson user. But if you are not, I want to point you to another option available since recently:mJson 1.3 which supports JSON Schema Draft 4 validation:

Json schema = Json.read(new URL("http://mycompany.com/schemas/model");
Json data = Json.object("firstName", "John", "lastName", "Smith").set("children", Json.array().add(/* etc... */));
Json errors = schema.validate(data);
for (Json e: errors.asJsonList())
   System.out.println("JSON validation error:" + e);

In all fairness, some of the other libraries also have support for generating JSON based on a schema, with default values specified by the default keyword which I haven't covered here. mJson doesn't do that yet, but if there's demand I'll put it in. The keywords I haven't covered are title, description (meta data keywords not used during validation) and id. To become and expert, you can always read the spec. Here it is, alongside some other resources:

For Dessert

To part ways, I want to leave you with a little gem, one more resource. Somebody came up with a much more concise language for describing JSON structures, It's called Orderly, it compiles into JSON Schema and I haven't tried it. If you do, please report back. It's at http://orderly-json.org/  and it looks like this:

object {
  string name;
  string description?;
  string homepage /^http:/;
  integer {1500,3000} invented;

Tuesday, August 26, 2014

Where are the JVM Scripting IDEs?

The raise of scripting languages in the past decade has been spectacular. And since the JVM platform is the largest, a few were designed specifically for that platform while many others were also implemented on top. It is thus that we have JRuby, Jython, Groovy, Clojure, Rhino, JavaFX and the more obscure (read more fun) things like Prolog and Scheme implementations. Production code is being written, dynamic language code bases are growing, whole projects don't even have any Java code proper. Yet when it comes to tooling, the space is meager to say the least.

What do we have? In Eclipse world, there's the Dynamic Languages Toolkit which you can explore at http://www.eclipse.org/dltk/, or some individual attempts like http://eclipsescript.org/ for the Rhino JavaScript interpreter or the Groovy plugin at http://groovy.codehaus.org/Eclipse+Plugin. All of those provide means to execute a script inside the Eclipse IDE and possible syntax highlighting and code completion. The Groovy plugin is really advanced in that it offers debugging facilities, which of course is possible because the Groovy implementation itself has support for it. That's great. But frankly, I'm not that impressed. Scripting seems to me a different beast than normal development. Normally you do scripting via a REPL, which is traditionally a very limited form of UI because it's constrained by the limitation of a terminal console. What text editors do to kind of emulate a REPL is let you select the expression to evaluate as a portion of the text, or take everything on a line, or if they are more advanced, then use the language's syntax to get to the smallest evaluate-able expression. It still feels a little awkward. Netbeans' support is similar. Still not impressed. "What more do you want?", you may ask. Well, don't know exactly, but more. There's something I do when I write code in scripting languages, a certain state of mind and a way of approaching problems that is not the same as with the static, verbose languages such as Java.

The truth is the IDE brought something to Java (and Pascal and C++ etc.) that made the vast majority of programmers never want to look back. Nothing of the sort has happened with dynamic languages. What did IDEs bring? Code completion was a late comer, compared to integrated debugging and the project management abilities. Code completion came in at about the same time as tools to navigate large code bases. Both of those need a structured representation of the code and until IDEs got powerful and fast enough to quickly generate and maintain in sync such a representation, we only had an editor+debugger+a project file. Now IDEs also include anything and everything around the development process, all with the idea that the programmer should not leave the environment (nevermind that we prefer to take a walk outside from time to time - I don't care about your integrated browser, Chrome is an Alt-tab away!).

Since I've been coding with scripting languages even before they became so hot, I had that IDE problem a long time ago. That is to say, more than 10 years ago. And there was one UI for scripting that I thought was not only quite original,  but a great match for the kind of scripting I was usually doing, namely exploring and testing APIs, writing utilities, throw away programs, prototypes, lots of activities that occasionally occupy a bigger portion of my time than end-user code.  That UI was the Mathematica notebook. If you have never heard of it, Mathematica (http://www.wolfram.com/mathematicais a commercial system that came out in the 90s and has steadily been growing its user base with even larger ambitions as of late. The heart of it is its term-rewrite programming language, nice graphics and sophisticated math algorithms, but the notion of a notebook, as a better than REPL interface, is applicable to any scripting (i.e. evaluation-based, interpreter) language. A notebook is a structured document that has input cells, output cells, groups of cells, groups of groups of cells etc. The output cells contain anything that the input produces which can be a complex graphic display or even an interactive component. That's perfect! How come we haven't seen it widely applied?

Thus Seco was born. On a first approximation, Seco is just a shell to JVM dynamic languages that imitates Mathematica's notebooks. It has its own ambition a bit beyond that, moving towards an experimental notion of software development as semi-structured evolutionary process. Because of that grand goal, which should not distract you from the practicality of the tool that I and a few friends and colleagues have been using for years, Seco has a few extras, like the fact that your work is always persisted on disk, the more advanced zoomable interface beyond the mere notebook concepts. The best way to see why this is worth blogging about is to play with it a little. Go visit http://kobrix.com/seco.jsp.

Seco was written almost in its entirety by a former Kobrix Software employee, Konstantin Vandev. It is about a decade old, but active development stopped a few years ago. I took a couple of hours here and there in the past months to fix some bugs, started implementing a new feature to have a centralized searchable repository for notebooks so people can backup their work remotely, access it and/or publish it. That feature is not ready, but I'd like to breathe some life into the project by making a release. So consider this an official Seco 0.5 release which besides the aforementioned bug fixes upgrades to the latest version of HyperGraphDB (the backing database where everything get stored) and removes dependency on the BerkeleyDB native library so it's pure Java now.    

Monday, July 21, 2014

Why is the Fundamental Theorem of Software Engineering Fundamental?

Have you heard of the adage "All problems in computer science can be solved by another level of indirection."? If you are a programmer, chances are you have read about it in a book or an article talking about how to best structure the software you write. It's been dubbed the fundamental theorem of software engineering (the FTSE) so you should know what it is about. If you don't then, quickly, go read up on the subject at ... no, I won't do the googling for you.

A common explanation you will find is that the FTSE is talking about abstraction layers: another level of indirection is achieved by raising the abstraction level. That's a good way to view it, but only some of it. Abstraction, when realized in software, often results in a layer so that the details of whatever is on the other side remain hidden, and a layer causes references to go in round about ways to get to the point, hence indirection. However, there are other forms of indirection where one just wants to reduce coupling between two parts. Here we are not really getting rid of any details, but maybe creating a common language about them. Think of the Adaptor pattern for example. Not convinced? Go read the insightful commentary on the subject by Zed Shaw at http://zedshaw.com/essays/indirection_is_not_abstraction.html.

So, just for the fun of it, how would you explain the FTSE to a neophyte? Wikipedia defines indirection as the ability to reference something using a name, reference, or container instead of the value itself. This is both misleading and revelatory. Misleading because this is a much more general definition than the intuition most programmers have about the FTSE. After all, everything relies on naming and references, so what could a theorem stating the obvious have to teach us? But it's also revelatory because it hints at the fact that much of software engineering, and therefore much of computing, is about answering the equally famous question what's in a name. Therefore, to apply the FTSE in practice we need to allow ourselves to answer that question the best we can at any point in time. That is, we need to be able to define the meaning of a name within any given context. The meaning of a symbol in a software system of even moderate complexity quickly starts exhibiting nuances and goes through changes much like words in natural language. This is because just like natural language is a complicated symbolic model of the messy world, a software system can be similarly characterized.

In a sense software, any software, is a model of the world. The symbols we use inside a software program acquire meaning (or semantics if you will) by virtue of what the entities they refer to actually do. A function's meaning is its implementation. The meaning of a piece of data is how it's used, how it interacts with other data. And programming types govern that usage to some extent, but mostly it is about what we do with the data inside the program. For instance, to discover the meaning of a field in a data structure about which you have no documentation, you would trace where and how it is being used. New requirements or changes and adjustments to old requirements are nothing more than a refined model of the world and/or system that needs to be reflected in software. This process of refining a model is enriching the meaning of the terms used to describe things and that translates into modifying the semantics of certain symbols in the software. Which is what we call "changing the implementation". Now, the practice of programming in not seen as creating and refining meaning of symbols, but I believe that is a very important perspective. It is a perspective where the resolution of symbolic references in context is at the foundation of programming, instead of the usual "giving instructions to a machine" view. I came to this conclusion about 15 years ago while designing a programming language and looking at various theoretical constructs in PL design, their limitations and trade-offs. Over the years, I have developed a reflex to see a lot of design problems in terms the important symbols in play, the context, both lexical and runtime (or static and dynamic if you prefer), and what is the resolution process of those symbolic references. I also see a lot of programming language constructs as being in essence a bag of reference resolution tools. That is, programming constructs are in large part tools a programmer has at their disposal to define the meaning of symbolic references. Let's take a look at some.


Variables are the quintessential programming tool. Probably the first construct you ever learn in programming, directly lifted from algebra, it is the simplest form of abstraction - use a name instead of a value. Of course, as the term "variable" suggests, the whole point is for the thing itself to change, something that is able to vary. So in effect a variable both establishes an identity and an interface for changes to be affected, thus bypassing all sorts of metaphysical problems in one nice engineering sweep. Conceptually variables are direct symbolic references, associations between a name and a value "container". Implementation wise, they are usually the same thing as pointers to memory locations and that's how they've always been understood. In fact, this understanding is a consequence of the fact that in compiled languages the name of a variable disappears from the compiled version, it is replaced by the memory location. A benefit of this strategy is that the cost of using a variable is as low as it can be - just a RAM memory cell access. On the other hand, any flexibility in how the name is to be interpreted at runtime is completely gone.

Note that we are not talking here about the distinction between "reference variables" vs. "primitive data variables" or whatever. Though that's an important distinction, what we are concerned about is merely the fact that variables are names of things. What is thought of as "reference variables" (or pointers) in various languages has to do with how they are copied during an assignment operation or as a function argument, whether the value is mutable or not etc.

Aliases and Macros

Aliases are relatively uncommon as a separate construct in modern languages. When pointers are assigned to each other, this is called aliasing because we have two names for the same memory location, i.e. two references with the same referent. For example, the assignment of any variable of Object type in Java is considered aliasing. While we do have another level of indirection here since we could change one reference without disturbing the other, this type of aliasing is not particularly interesting. But consider another type of aliasing, through macros (think C #define) where a name is explicitly declared to be a replacement of another name. The indirection here involves an extra translation step and the meaning of the alias in this case is not that it has the same referent, but that its referent is the original name. As a consequence, mentioning the alias at a place in the program where the same symbol is used for an entirely different thing will make the alias refer that that thing. Another benefit of this kind of aliasing is that it can be used to refer to other things besides variables, for example types, especially when a type description is a complex expression that we don't want to write over and over again. Macros are also in the same spirit - languages that recognize the value of compile-time decision making will offer a macro facility.  And it is a shame they are not more popular. They just have a bad aura because they are associated with purely textual manipulation, a completely separate language on its own. However, if one sees them as a way to do structured code generation, or compile-time evaluation, they are much more appealing. One has to realize that macros and aliases are about name & conquer just as much as variables and functions are, and that they are in fact a great level of indirection mechanism. A mechanism that occupies another spot in the compile-run time dimension. Speaking of which, the fact that there is nothing in between that strict compile vs. run-time separation is strong limitation in language expressiveness. Partial evaluation techniques could be what macros at run time look like, but those are still confined to academic research mainly.

To sum up so far: the key difference between variables and aliases is the timing of the reference resolution process. With variables, the referent is obtained when the program is running while with aliases it is obtained at compile time. In one case we have the context of a running program, in the other the context of a running compiler.


Overloading is a programming mechanism that allows one to define different implementations of the same function depending on the arguments used during invocation. In other words, overloading lets you associate a different meaning with a given name depending on the syntactical context of usage of that name. It's a context-dependent reference resolution process that happens generally at compile time, hence within a static context. A rough natural language analogue would be homonyms that have different meanings only because used as a different part of speech. For example, in "all rivers flow" and "the flow is smooth" the semantic import is the same, but the strict meaning is different because in one case we have  a verb while in the other we have a subject. A variation on the theme is Common Lisp and its generic functions where the dispatching can be defined on an actual object, via an equals predicate. In that case the context for the resolution is a dynamic one, hence it has to do more with semantics in a sense. That's more like homonymy where the semantics of a word depend on the semantics of the surrounding words.


Overriding is about changing the meaning of a name in a nested lexical scope. I deliberately use the word meaning here to talk about whatever a name refers to, understanding that often the meaning of a symbol in a programming language is simply the referent of that symbol. A nested lexical scope could be a nested function, or a class declaration or some code block the delimits a lexical scope. In some cases, one looses the ability to access the referent from the enclosing scope. In others, the programming language provides a special syntax for that. (i.e. the super keyword in Java).  Again, we are talking about a reference resolution mechanism in a certain specialized context. The context is completely specified by the lexical scope and is therefore static. Note that people somewhat erroneously use the word overwrite instead of override. The correct term in Computer Science is override. In English, it means to cancel a previous decision whereas overwrite literally mean to write over something. More on the mechanics of overriding below.


A compound structure such as an object in JavaScript, or a class in Java/C# or a struct in C is, among other things, a context where certain names have meanings. Specifically, the fields and methods that belong to that structure are references that are resolved precisely in the context of the runtime object. Well, actually it depends. An interesting case are the static variables in Java classes. A static variable is a member of the class object rather than of its instances. One way teachers of the language describe static variables is that all objects of that class share the same variable. That's even how the official Java documentation talks about static variables: in terms of variables that all objects share (see http://docs.oracle.com/javase/tutorial/java/javaOO/classvars.html). But that is inaccurate because an object (i.e. an instance of the class) is not the proper context to refer to the static variable. If we have:

class A { public static int a = 0; }
class B extends A { public static int a  = 1;}
A x = new B();
System.out.println(x.a); // what are we supposed to see here?

What does the last statement print out? That sounds like a good entry level job interview question. The question comes down to what is the proper context for the reference resolution of the name a? If we see static variables as variable shared by all objects of the same class, the value should clearly be 1 since x's actual type is B and B.a is what all objects of type B share. But that is not what happens. We get 0 because Java only cares about the declared type which in this is A. The correct way to talk about static variables is as member variables of the object class (in Java, every class is an object at runtime, an object whose type is java.lang.Class). This is why some recent Java compilers issue a warning when you refer to a static variable from an object context (though not the JDK7 compiler!) To be fair to official documentation, the above mentioned tutorial does recommend use of the class rather than the object to refer to static variables. However, the reason given is that otherwise it does not make it clear that they are class variables. Now, if we had non-static member variables? We get the same result: the declared type of the object variable x is what matters, not the actual runtime type. If instead of accessing a public variable, we were accessing  a public function then the runtime type would have been the one used to resolve the reference.

So why is that the case? Because part of the reference resolution process happens at compile time rather than at run-time. Traditionally, in programming languages a name is translated to a memory location during compilation so then at runtime only the location matters and the referent is obtained the fastest possible way. With so called "virtual methods", like non-static methods in Java, we get to do an extra hop to get to the memory location at runtime. So for variables, both static and non-static, and for static methods, the reference resolution is entirely based on the static context (type annotations available at compile time) while for non-static function it becomes dynamic. Why is only one kind of name inside a class's lexical context resolved in this way. No deep reason, just how Java was designed. Of course I could have just said "virtual tables only apply to non-static functions", but that's not the point. The point is that in defining programming constructs, an important part of the semantics of a programming language is based on narrowing down what is the context and the process for reference resolution in all the various ways a symbol can be mentioned in a program. For most mainstream languages, this only applies to identifiers, a special lexical category, but it is in fact more general (e.g. operators in C++ or Prolog, any type of symbol in Scheme).  A common name for this kind of deferring of the reference resolution is late binding. And everybody likes it because "it's more flexible". Good.


To finish, let me mention closures, a term and a construct that has thankfully made it into mainstream programming recently. The power of closures stems from their ability to capture a computational context as a reference resolution frame. It is a way to transcend the rigid lexical scoping somewhat and have a piece of code rely on a set of variables that are hidden from the users of that code, yet whose lifetime is independent of its execution. So the variables captured in a closure behave like global variables in terms of their lifetime, but like local variables in terms of their scope. So they represent a very particular intersection between the design dimensions of visibility&lifetime. But what that does in effect, to put it in more general terms, is that the meaning of a name is carried over from one context to another without it interfering with meanings of that same name in other contexts.

Okay. So we started with a software engineering principle and we dug somewhat into a few programming language concepts from the perspective of reference resolution. Naming and reference are philosophical problems that are probably going to get resolved/dissolved soon by neuroscience. In the meantime, the cognitive phenomenon and whatever philosophy and linguistics has thought us so far about it could serve a bit more as an inspiration to how people communicate with machines through programming. So you can see where I'm heading with the question posed at the title of this post. It is the reference resolution that is fundamental an indirection is simply what we do to (re)define what that resolution process ultimately looks like. I have more to say about it, but this already got a bit long so I'll stop here.

Saturday, February 1, 2014

Application Components with AngularJS

This post is about self-contained business components with AngularJS that can be instantiated at arbitrary places in one's application. 

With AnuglarJS, one writes the HTML plainly and then implements behavior in JavaScript via controllers and directives. The controllers are about the model of which the HTML is the view, while directives are about extra tags and attributes that you can extend the HTML with. You are supposed to implement business logic in controllers and UI logic in directives. Good. But there are situations where the distinction is not so clear cut, in particular when you are building a UI by reusing business functionality in multiple places.

In a large applications, it often happens that the same piece of functionality has to be available in different contexts, in different flows of the user interaction. So it helps to be able to easily package that functionality and plug it whenever it's needed. For example, a certain domain object should be editable in place, or we need the ability to select among a list of dynamically generated domain objects. Those types of components are application components because they encapsulate reusable business logic and they are even tied to a specific backend, as opposed to, say, UI components which are to be (re)used in different applications and have much wider applicability. Because application components are not instantiated that often (as opposed to UI components which are much more universal), it is frequently tempting to copy&paste code rather than create a reusable software entity. Unless the componentization is really easy and requires almost no extra work.

If you are using AngularJS, here is a way to easily achieve this sort of encapsulation: 
  1. Put the HTML in a file as a self-contained template "partial" (i.e. without the top-level document HTML tags). 
  2. Have its controller JavaScript be somehow included in the main HTML page.
  3. Plug it in any other part of the application, like other HTML templates for example. 
This last part cannot be done with AngularJS's API. We have to write to gluing code. Since we will be plugging by referring to our component in an HTML template, we have to write a custom directive. Instead of writing a separate directive for each component as AngularJS documentation recommends, we will write one directive that will handle all our components. To be sure, there is generic directive to include HTML partials in AngularJS, the ng-view directive, but it's limited to swapping the main content of a page, too coarse grained that is. By contrast, our directive can be used anywhere, nested recursively etc. Here is an example of its usage:

<be-plug name="shippingAddressList">
  <be-model-link from-child-scope="currentSelection" 

This little snippet assumes we have an HTML template file called shippingAddressList.ht that lets the user select one among several addresses to ship the shopping cart contents to. We have a top-level tag called be-plug and nested tag called be-model-link. The be-model-link tag associates attributes of the component's model to attributes of the model (i.e. scope in AngularJS's terms) of the enclosing HTML. More on that below. Here is the implementation:
app.directive('bePlug', function($compile, $http) {
  return {
    scope : {},
    link:function(scope, element, attrs, ctrl) {
      var template = attrs.name + ".ht";
      $http.get(template).then(function(x) {
        $.each(scope.modelLinks, function(atParent, atChild) {
          // Find a parent scope that has 'atParent' property
          var parentScope = scope;
          while (parentScope != null && 
            parentScope = parentScope.$parent;
          if (parentScope == null) 
            throw "No scope with property " + atParent + 
                  ", be-plug can't link models";
          scope.$$childHead.$watch(atChild, function(newValue) {
            parentScope[atParent] = newValue;
          parentScope.$watch(atParent, function(newValue) {
            scope.$$childHead[atChild] = newValue;

Let's deconstruct the above code. First, make sure you are familiar with how to write directives in AngularJS and you understand what AngularJS scopes are. Next, note that we are creating a scope for our directive by declaring a scope:{} object. The purpose is twofold: (1) don't pollute the parent scope and (2) make sure we have a single child in our scope so we have a handle on the scope of the component we are including.

Good. Now, let's look at the gist of the directive, its link method. (I'm sure there is some valid reason that method is named "link". Perhaps because we are "linking" an HTML template to its containing element. Or to a model via the scope? Something like that.) In any case, that's were DOM manipulation is done. So here's what's happening in our implementation:
  • We fetch the HTML template from the server. By naming convention, we expect the file to have extension .ht. The rest of the relative path of the template file is given in the name attribute.
  • Once the template is loaded, we set it as the HTML content of the directive's element. So the resulting DOM will have a be-plug DOM node which the browser will happily ignore and inside that node there will be our component's HTML template.
  • Then we "compile" the HTML content using AngularJS's $compile service. This method call is essentially the whole point of the exercise. This is what allows AngularJS to bind model to view, to process any nested directives recursively etc. In short, this is what makes our textual content inclusion into a "runtime component instance". Well, this and also the following:
  • ...the binding of scope attributes between our enclosing element and the component we are including. This binding is achieved in the following for loop by observing variable changes in the scopes of interest.
That last point needs a bit more explaining. The HTML code that includes our component presumably has some associated model scope with attributes pertaining to business logic. On the other hand, the included component acquires its own scope with its own set of attributes as defined by its own controller. The two scopes end up in a parent-child relationship with the directive's scope (a third one) in between. From an application point of view, we probably have one or several chained parent scopes holding relevant model attributes and we'd want to somehow connect the data in our component model to the data in the enclosing scope. In the example above, we are connecting the shippingAddress attribute of our main application scope to the currentSelection attribute of the address selection component. In the context of the enclosing logic, we are dealing with a "shipping address", but in the context of the address selection component which simply displays a choice of addresses to pick from we are dealing with a "current selection". So we are binding the two otherwise independent concepts.

To implement this sort of binding of a given pair of model attributes, we need to know: the parent scope, the child scope, the name of the attribute in the parent scope and the name of the attribute in the child scope. To collect the pairs of attributes, we rely on a nested tag called be-model-link implemented as follows:
app.directive('beModelLink', function() {
  return {
    link:function(scope, element, attrs, ctrl) {
      if (!scope.modelLinks)
        scope.modelLinks = {};
      scope.modelLinks[attrs.fromParentScope] = attrs.fromChildScope;

Because we have not declared a private scope for the be-model-link directive, the scope we get is the one of the parent directive. This gives us the chance to put some data in it. And the data we put is the mapping from parent to child model attributes in the form of a modelLinks object. Note that we refer to this modelLink object in the setup of variable watching in the be-plug directive where we loop over all its properties and use AngularJS' $watch mechanism to monitor changes on either side and affect the same change to the linked attributes. To find the correct parent scope, we walk up the chain and get the first one which has the stated from-parent-scope attribute, throwing an error if we can't find it. The child scope is easy because there is only one child scope to our directive.

That's about it. We are essentially doing server-side includes like in the good (err..bad) old days, except because of the interactive nature of the "thing" with AJAX and all, and the whole runtime environment created by AngularJS, we have a fairly dynamic component. Hope you find this useful. 

Friday, September 6, 2013


I must confess that it's very difficult to bring myself to write code with a long life expectancy in an alternative JVM language. Scala is slowly taking over and that's great. But if I'm working on an API, an open general purpose one, or some "important" layer within a system, the thought of not using Java doesn't even cross my mind. Not that those other languages are not good. But anyway, in this post I just wanted to complain that I'm tired of seeing, let alone writing getters and setters for bean properties. Since first-class properties don't seem to be on the table for Java, can we at least come up with a better naming convention? What would it take to have a tool supported, alternative Java Beans spec? The addition of closures in Java 8 is perhaps an opportunity and an excuse to revise this (e.g. event listeners can be specified more concisely). Perhaps it's already being discussed in the depths of the Oracle.

But let's start with the 'getFoo'/'setFoo' pair. For a "real world" example:

public class Customer {
  private String email;
  public void setEmail(String email) { this.email = email; }
  public String getEmail() { return email; }

That convention is ugly and verbose. For one thing why the need for the 'set' and 'get' prefixes? It is clear that if you are passing an argument to a function then you'd be setting. If there's no argument you are not setting, so you are probably getting as evidenced by the return value. Moreover, the fact that a setter returns no value is pure waste since you could always return this. Why is that important? Method chaining! Yes, method chaining is not to be underestimated and it's not just a trick of the conciseness-conscious coder. Since we in the Western world read left to right, programming languages are written and evaluated left to right. As a consequence, symbols to the left of the current position (of reading, writing or evaluating) tend to establish a context that could just be relied upon rather than repeated. When you put a semi-colon in Java, that's like a reset, context goes away, next statement. But if you are setting up a bunch of properties, as is frequently the case, your next statement will start by re-establishing the context you already had. That's more verbiage there on the screen starring at you and definitely kind of ugly. But it's also more work for the thing doing the evaluation. Values are returned on the call stack, so when your setter returns this, the next set operation doesn't need to push the object on the stack since it's already there. So there's a performance improvement there. Probably quite negligible, but I'd like to point it out because it ties to the observation that, in the kind of programming that consists of moving data in one form from one place to another form somewhere else, which is the bulk of so called "business" applications, less code does mean better performance.

The difference between:

Customer customer = new Customer();

and this:

Customer customer = new Customer().firstName("Bill").lastName("Watterson").occupation("cartoonist");

is not just a matter of taste. Well, unless you really have bad taste that is.

All it would take is for a couple of IDEs to support the new convention and a small lib that could quickly become ubiquitous. Or perhaps a big lib, something called betterjava.jar that would fix bad API designs from the early, and why not later, days of Java. I'm sure people aren't short of ideas as the Java standard APIs aren't short of problems.

Monday, November 19, 2012

eValhalla User Management

[Previous in this series: eValhalla Setup]

In this installment of the eValhalla development chronicle, we will be tackling what's probably the one common feature of most web-based application - user management. We will implement:
  1. User login
  2. A "Remember me" checkbox for auto-login of users on the same computer
  3. Registration with email validation
The end result can be easily taken and plugged in your next web app! Possible future improvements would be adding captchas during registration and/or login and the ability to edit more extensive profile information.

I have put comments wherever I felt appropriate in the code, which should be fairly straightforward anyway. So I won't be walking you line by line. Rather, I will explain what it does and comment on design decisions or on less obvious parts.   First, let me say a few words about how user sessions are managed.

User Sessions

Since we are relying on a RESTful architecture, we can't have a server hold user sessions. We need to store user session information at the client and transfer it with each request. Well, the whole concept of a user session is kind of irrelevant here since the application logic is mostly residing at the client and the server is mainly consulted as a database. Still, the data needs protection and we need the notion of a user with certain access rights. So we have to solve the problem of authentication. Each request needs to carry enough information for the server to decide whether the request should be honored or not. We cannot just rely on the user id because this way anybody can send a request with anybody else's id. To authenticate the client, the server will first supply it with a special secret key, an authentication token,  that the client must send on each request along with the user id. To obtain that authentication token, the client must however identify themselves by using a password. And that's the purpose of the login operation: obtaining an authentication token for use in subsequent requests. The client will then keep that token together with the user id and submit them as HTTP headers on every request. The natural way to do that with JavaScript is storing the user id and authentication tokens as cookies.

This authentication mechanism is commonly used when working within a RESTful architecture. For more on the subtleties of that approach, just google "user authentication and REST applications". One question is why not just send the password with every request instead of a separate token. That's possible, but more risky - a token is generated randomly and it expires, hence it is harder to guess. The big issue however is XSS (cross-site scripting) attacks. In brief, with XSS an attacker insert HTML code into a field that gets displayed supposedly as just static text to other users (e.g. a blog title) and the code simply does an HTTP request to a malicious server submitting all the users' private cookies with it. To avoid them, we will have to pay special attention on HTML sanitation. That is, we have to properly escape every HTML tag displayed as static text. We can also put that authentication token in an HTTPOnly cookie for extra security.

Implementation - a User Module

Since user management is so common, I made a small effort to build a relatively self-contained module for it. There are no standard packaging mechanisms for the particular technology stack we're using, so you'd just have to copy&paste a few files:

  • /html/ahguser.ht - contains the HTML code for login and registration dialogs as well as top-level Login and Register links that show up right aligned. This depends on the whole Angular+AngularUI+Bootstrap+jQuery environment. 
  • /javascript/ahguser.js - contains the Angular module 'ahgUser' that you can use in your Angular application. This depends on the backend:
  • /scala/evalhalla/user/package.scala - the evalhalla mJson-HGDB backend REST user service. 

The backend can be easily packaged in a jar, mavenized and all, and this is something that I might do near the end of the project. 

The registration process validates the user's email by emailing them a randomly generated UUID (a HyperGraphDB handle) and requiring that they provide it back for validation before they can legitimately log into the site. Since the backend code is simple enough, repetitive even, let's look at one method only, the register REST endpoint:

def register(data:Json):Json = {
  return transact(Unit => {
    // Check if we already have that user registered
    var profile = db.retrieve(
        jobj("entity", "user", 
             "email", data.at("email")))
    if (profile != null)
      return ko().set("error", "duplicate")
    // Email validation token
    var token = graph.getHandleFactory()
    db.addTopLevel(data.set("entity", "user")
                       .set("validationToken", token)
      "Welcome to eValhalla - Registration Confirmation",
      "Please validate registration with " + token)
    return ok

So we see our friends db, transact, jobj etc. from last time. The whole thing is a Scala transaction closure with the much nicer than java.lang.Callable Scala syntax. As a reminder, note that most of the functions and objects referred here are globally declared in the evalhalla package. For example, db is an instance of the HyperNodeJson class. The call to normalize just ensures the email is lower-case, because email in general are case-insensitive. While the logic is fairly straightforward, let me make a few observations about the APIs.

User profile lookup is done with a Json pattern matching.  Recall that the database stores arbitrary Json structures (primitives, arrays, objects and nulls) as HyperGraphDB atoms. It doesn't have the notion of "document collections" like MongoDB for example. This is because each Json structure is just a portion of the whole graph. So to distinguish between different types of objects we define a special JSON property called entity that we reserve as a type name attached to all of our top-level Json entities. Here we are dealing with entities of type "user". Now, each atom in HyperGraphDB and consequently each user profile has a unique identifier - the HyperGraphDB handle (a UUID). There is no notion of a primary key enforced at the database level. We know that an email should uniquely identify a user profile so we perform the lookup with the Json object {entity:"user", email:<the email>} as a pattern. But this uniqueness is enforced at the application level because new profiles are added only precisely via this register method. I've explained the db.addTopLevel method on the  HGDB-mJson wiki page, but here is a summary for the impatient: while db.add will create a duplicate version of the whole Json structure recursively and while db.assertAtom will only create something if it doesn't exist yet, db.addTopLevel will create a new database atom only for the top-level JSON, but perform an assert operation for each of its components.

Finally, note that before adding the profile to the database, we delete the passwordrepeat property. This is because we are storing in the database whatever JSON object the client gives us. And the client is giving us all fields coming from the HTML registration form, pretty much as the form is defined. So we get rid of that unneeded field. Granted, it would be actually better design to remove that property at the client-side since the repeat password validation is done there. But I wanted to illustrate the fact that the JSON data is really flowing as is from the HTML form directly to the database, with no mappings or translations of any sort needed. 

So far, so good. Let's move on the UI side.

Client-Side Application

In addition to AngularJS, I've incorporated the Bootstrap front-end library by Twitter. It looks good and it has some handy components. I've also included the Angular-UI library which has some extras like form validation and it plays well with Bootstrap.

The whole application resides on one top-level HTML page /html/index.html. The main JavaScript program that drives it resides in the /javascript/app.js file. The idea is to dynamically include fragments of HTML code inside a reserved main area while some portions of the page such as the header with user profile functionality remain stable. By convention, I use the extension '.ht' for such HTML fragments instead of '.html' to indicate that the content is not a complete HTML page. Here's the main page in its essence:

<div ng-controller="MainController">
<div ng-include="ng-include" src="'/ahguser.ht'"></div>
<!-- some extra text here... -->
<a href="#m">main</a> |
<a href="#t">test</a>
<ng-view/>  <!-- this is where HTML fragments get included -->

The MainController function is in app.js and it will eventually establish some top-level scoped variables and functions, but it's currently empty. The user module described above is included as a UI component by the <div ng-include...> tag. The login and register links that you see at the top right of the screen with all their functionality come from that component. Finally, just for the fun of it, I made a small menu with two links to switch the main view from one page fragment to another: 'main' and 'test'. This is just to setup a blueprint for future content.

A Note on AngularJS

Let me explain a bit the Angular portion of that code since this framework, as pretentious as it is, lacks in documentation and it's far from being an intuitive API. In fact, I wouldn't recommend yet. My use of it is experimental and the first impressions are a bit ambivalent. Yes, eventually you get stuff to work. However, it's bloated and it fails the first test of any over-reaching framework: simple things are not easy. It seems built by a team of young people that use the word awesome a lot, so we'll see if it meets the second test, that complicated things should be possible. Over the years, I've managed to stay away from other horrific, bloated frameworks, aggressively marketed by a big companies (e.g. EJBs), but here I may be just a bit too pessimistic. Ok, enough.

Angular reads and interprets your HTML before it gives it to the browser. That opens many doors. In particular, it allows you to define custom attributes and tags to implement some dynamic behaviors. It has the notion of an application and the notion of modules. An application is associated with a top-level HTML tag, usually the 'html' tag itself by setting the custom ng-app='appname' attribute. Then appname is declared as a module in JavaScript:

var app = angular.module('appname', [array of module dependencies])

It's not clear what's special about an application vs. mere modules, presumably nothing. Then functionality is attached to markup (the "view") via Angular controllers. Those are JavaScript functions that you write and Angular calls to setup the model bound to the view. Controller functions take any number of parameters and Angular uses a nifty trick here. When you call the toString method of a JavaScript function, it returns its full text as it was originally parsed. That includes the formal arguments, exactly with the names you have listed in the function declaration (unless you've used some sort of minimization/obfuscation tool). So Angular parses that argument list and uses the names of the parameters to determine what you need in your function. For example, when you declare a controller like this:

function MainController($scope, $http) {
A call to MainController.toString() returns the string "function MainController($scope, $http) { }". Angular parses that string and determines that you want "$scope" and "$http". It recognizes those names and passes in the appropriate arguments for them. The name "$scope" is a predefined AngularJS name that refers to an object to be populated with the application model. Properties of that object can be bound to form elements, or displayed in a table or whatever. The name "$http" refers to an Angular service that allows you to make AJAX calls. As far as I understand it, any global name registered as a service with Angular can be used in a controller parameter list. There's a dependency injection mechanism that takes care of, among other things, hooking up services in controllers by matching parameter names with globally registered functions. I still haven't figured out what the practical benefit of that is, as opposed to having global JavaScript objects yourself....perhaps in really large applications where different clusters of the overall module dependency graph use the same names for different things.

Beginnings of a Data Service

One of the design goals in this project is to minimize the amount of code handling CRUD data operations. After all, CRUD is pretty standard and we are working with a schema-less flexible database. So we should be able to do CRUD on any kind of structured object we want without having to predefine its structure. In all honesty, I'm not sure this will be as easy as it sounds. As mentioned before, the main difficulty are security access rules. It's certainly doable, but we shall see what kind of complexities it leads to in the subsequent iterations. For now, I've created a small class called DataService that allows you to perform a simple JSON pattern lookup as well as all CRUD operations on any entity identified by its HyperGraphDB handle.

One can experiment with this interface by making $.ajax calls in the browser's REPL. The interface is certainly going to evolve and change, but here are a couple of calls you can try out. I've made the '$http' service available as a global variable:

$http.post("/rest/data/entity", {entity:'story', content:'My story starts with....'})
$http.get("/rest/data/list?pattern=" + JSON.stringify({entity:'story'})).success(function(A) { console.log(A); })
The above creates a new entity in the DB and then retrieves it via a query for all entities of that type (i.e. entity:'story'). You can also play around with $http.put and $http.delete.


All right, this concludes the 2nd iteration of eValhalla. We've implemented a big part of what we'd need for user management. We've explored AngularJS as a viable UI framework and we've laid the groundwork of a data-centered REST service. To get it, follow the same steps as before, but checkout the phase2 GIT tag instead of phase1:

  1. git clone https://github.com/publicvaluegroup/evalhalla.git
  2. cd evalhalla
  3. git checkout phase2
  4. sbt
  5. run

Coming Up...

On the next iteration, we will do some data modeling and define the main entities of our application domain. We will also implement a portion of the UI dealing with submission and listing of stories.