PKP 2011 Hackfest

Today was the kick-off of the hackfest at the PKP 2011 conference. Not many people turned up, but I had the chance to spend some quality (coding) time with PKP developers and to have a sort of personal code sprint on a side project: developing a plugin to integrate a Named Entity Recognition (NER) web service into an OJS installation (see here and there for a more theoretical background).

At the end of the day what I got done was:

  • set up a local instance of OJS (version 2.3.6) using MAMP;
  • give a quick try to the OJS Voyeur plugin, which unfortunately for me works only with versions <= 2.2.x;
  • create the bare bones of the plugin, whose code is up here (for my personal record rather than for others’ use, at least at this early stage);
  • write a PHP class to query a web service (that I’m developing) to extract citations of ancient works from (plain) texts;
  • come up with two possible scenarios for further implementation of the plugin, possibly to happen before next year’s PKP hackfest 😉

The idea of this post is to comment a little on these two possible scenarios.

1. Client-side centric

The first scenario is rather heavy on the client side. The code is packaged as an OJS plugin, and what it does is essentially the following:

  1. after an article is loaded for viewing, a JavaScript file (grab.js) gets all the <p> elements of the HTML article and sends them via Ajax to a PHP page (proxy.php);
  2. a PHP class acts as a proxy (or client) for a third-party NER web service;
  3. the data received via the Ajax call are passed on to the web service via XML-RPC;
  4. the response is returned by the web service in JSON or XML format…
  5. … and then processed again by the JS script, ideally using a compiled template based on jQuery’s templating capability. Finally, the extracted citations are displayed in a summary box alongside the article.
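
To make the browser side of these steps a bit more concrete, here is a minimal sketch of what grab.js could look like. It is only an illustration under a few assumptions of mine: that proxy.php accepts a JSON body of the form {paragraphs: [...]} and replies with {citations: [...]} (a list of strings), and that plain string concatenation stands in for a compiled jQuery template. The function names and the ner-summary class are hypothetical too.

```javascript
// Sketch of grab.js. Only the file names grab.js and proxy.php come from
// the post; the JSON shapes and all function names are assumptions.

// Step 1 (first half): collect the text of every non-empty <p> element.
function collectParagraphs(doc) {
  return Array.from(doc.querySelectorAll('p'))
    .map((p) => p.textContent.trim())
    .filter((text) => text.length > 0);
}

// Step 1 (second half) and step 4, as seen from the browser: POST the
// paragraphs to proxy.php and parse the JSON reply. proxy.php is assumed
// to relay the request to the NER service (steps 2 and 3).
// fetchImpl is injectable so the function can be tried without a browser.
async function extractCitations(paragraphs, fetchImpl = fetch, url = 'proxy.php') {
  const response = await fetchImpl(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ paragraphs }),
  });
  if (!response.ok) {
    throw new Error('proxy.php returned HTTP ' + response.status);
  }
  const data = await response.json();
  return data.citations || [];
}

// Escape text before putting it into markup.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Step 5: build the summary box markup from the extracted citations.
function renderSummary(citations) {
  const items = citations.map((c) => '<li>' + escapeHtml(c) + '</li>').join('');
  return '<div class="ner-summary"><ul>' + items + '</ul></div>';
}
```

In the page itself, the pieces would be chained roughly as extractCitations(collectParagraphs(document)) followed by inserting the string returned by renderSummary next to the article.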

2. Server-side centric

In the second scenario that I envisaged, instead, most of the processing happens on the server side.

  1. before being displayed, the article is processed to extract <p> elements;
  2. the main plugin class (plugin.php) takes care of sending the input to and receiving a response from the NER service;
  3. the response is then run through a template (template.tpl) by exploiting OJS’s templating functionalities;
  4. the formatted summary box is injected into the HTML which is now ready to be displayed to the user.

All in all, I think that I came up with (1) mainly because my PHP is rather rusty at the moment ;). Therefore, although I’m quite reluctant to admit it, I might decide to go for (2). However, a good argument for the former is the case where the user can decide, paper by paper, whether to enable this feature.
