Tag: pkp2011

PKP 2011 Hackfest

Today there was the kick-off of the hackfest at the PKP 2011 conference. Not many people turned up, but I had the chance to spend some quality (coding) time with PKP developers and to have a sort of personal code sprint  on a side project, that is developing a plugin to integrate a Named Entity Recognition (NER) web service into an OJS installation (see here and there for a more theoretical background).

At the end of the day what I got done was:

  • setup a local instance of OJS (version 2.3.6) using MAMB;
  • give a quick try to the OJS Voyeur plugin, which unfortunately for me is working only with version <=2.2.x;
  • create the bare-bone of the plugin, whose code is up here (for my personal record rather than for other’s use, at least at this early stage);
  • write a PHP class to query a web service (that I’m developing) to extract citations of ancient works from (plain) texts;
  • come up with two possible scenarios for further implementation of the plugin, to happen possibly earlier than next year’s PKP hackfest 😉
The idea of this post, indeed, is to comment a little on these two possible scenarios.

1. Client-side centric

The first scenario looks rather heavy on the client-side. The plugin is packaged as an OJS plugin and what it does is essentially as follows:

  1. after an article is loaded for view, a javascript (grab.js) gets all the <p> elements of the HTML article and send them over ajax to a php page (proxy.php);
  2. a php class act as a proxy (or client) for a 3rd party NER web service;
  3. the data that are received from via the ajax call are passed on to the web service via XML-RPC;
  4. the response is returned by the web service as JSON or XML format…
  5. … and then processed again by the JS script, ideally using a compiled template based on jquery’s template capability. Finally, the citations that were extracted are display as a summary box alongside the article.

2. Server-side centric

Instead, in the second scenario that I envisaged most of the processing happens on the server-side.

  1. before being displayed, the article is processed to extract <p> elements;
  2. the main plugin class (plugin.php) takes care of sending the input to and receiving a response from the NER service;
  3. the response is then ran through a template (template.tpl) by exploiting OJS’s templating functionalities;
  4. the formatted summary box is injected into the HTML which is now ready to be displayed to the user.

All in all, I think that I came up with (1) mainly because my PHP is rather rusty at the moment ;). Therefore, although I’m quite reluctant to admit so, I might decide to go for (2). However, a good point to opt for the former is the case where the user can decide for each paper whether to enable this feature or not.

Advertisements