PKP 2011 Hackfest

27 Sep

Today was the kick-off of the hackfest at the PKP 2011 conference. Not many people turned up, but I had the chance to spend some quality (coding) time with PKP developers and to run a sort of personal code sprint on a side project: developing a plugin that integrates a Named Entity Recognition (NER) web service into an OJS installation (see here and there for a more theoretical background).

At the end of the day what I got done was:

  • set up a local instance of OJS (version 2.3.6) using MAMP;
  • give the OJS Voyeur plugin a quick try; unfortunately for me, it only works with versions <=2.2.x;
  • create the bare bones of the plugin, whose code is up here (for my own records rather than for others’ use, at least at this early stage);
  • write a PHP class that queries a web service (which I’m developing) to extract citations of ancient works from plain text;
  • come up with two possible scenarios for further implementation of the plugin, hopefully to happen before next year’s PKP hackfest ;)

The rest of this post comments briefly on these two scenarios.

1. Client-side centric

The first scenario is rather heavy on the client side. The code is packaged as an OJS plugin and essentially works as follows:

  1. after an article is loaded for viewing, a JavaScript file (grab.js) collects all the <p> elements of the HTML article and sends them via Ajax to a PHP page (proxy.php);
  2. a PHP class acts as a proxy (or client) for a third-party NER web service;
  3. the data received via the Ajax call are passed on to the web service via XML-RPC;
  4. the web service returns its response in JSON or XML format…
  5. … which is then processed by the JS script, ideally using a compiled template based on jQuery’s templating capability. Finally, the extracted citations are displayed in a summary box alongside the article.
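
To make the client-side flow a bit more concrete, here is a minimal sketch of what grab.js might look like. Everything beyond the file names grab.js and proxy.php is an assumption made for illustration: the function names, the JSON payload, the shape of the response (a list of objects with a text field) and the `post` helper standing in for the Ajax call. A plain template string also stands in for the jQuery template.

```javascript
// Hypothetical sketch of grab.js; names and data shapes are assumptions.

// Step 1: collect the text of every <p> element in the article.
function collectParagraphs(doc) {
  return Array.from(doc.querySelectorAll('p'))
    .map((p) => p.textContent.trim())
    .filter((text) => text.length > 0);
}

// Build the request body sent to proxy.php (assumed to accept JSON).
function buildPayload(paragraphs) {
  return JSON.stringify({ paragraphs: paragraphs });
}

// Escape text before placing it into HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Step 5: turn the extracted citations into the summary box markup.
// Assumes each citation looks like { text: "..." }.
function renderSummary(citations) {
  const items = citations
    .map((c) => '<li>' + escapeHtml(c.text) + '</li>')
    .join('');
  return '<div class="ner-summary"><ul>' + items + '</ul></div>';
}

// Glue: `post` is a caller-supplied function (url, body) => Promise of
// citations, e.g. a thin wrapper around an Ajax request to proxy.php.
async function annotateArticle(doc, post) {
  const paragraphs = collectParagraphs(doc);
  const citations = await post('proxy.php', buildPayload(paragraphs));
  return renderSummary(citations);
}
```

In the browser, `annotateArticle(document, post)` would resolve to the markup for the summary box, which can then be appended next to the article. Keeping the Ajax call behind `post` means the rest of the script can be tried out without a running proxy.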

2. Server-side centric

In the second scenario, by contrast, most of the processing happens on the server side.

  1. before being displayed, the article is processed to extract <p> elements;
  2. the main plugin class (plugin.php) takes care of sending the input to and receiving a response from the NER service;
  3. the response is then run through a template (template.tpl) using OJS’s templating functionality;
  4. the formatted summary box is injected into the HTML which is now ready to be displayed to the user.
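
The four steps above can be sketched as a small pipeline. The sketch is written in JavaScript only to stay in one language with the client-side script; in the actual plugin this logic would live in plugin.php and the markup in template.tpl. All function names are hypothetical, `recognize` is a stand-in for the call to the NER service, and the regex-based extraction is a shortcut that a real implementation would replace with a proper HTML parser.

```javascript
// Hypothetical sketch of the server-side flow; in OJS this would be PHP.

// Step 1: pull the text of each <p> element out of the article HTML.
// A regex is enough for a sketch, but it is not a robust HTML parser.
function extractParagraphs(html) {
  const matches = html.matchAll(/<p\b[^>]*>([\s\S]*?)<\/p>/gi);
  return Array.from(matches, (m) => m[1].replace(/<[^>]+>/g, '').trim())
    .filter((text) => text.length > 0);
}

// Step 3: stand-in for template.tpl. Citations are assumed to be plain
// strings that are already safe to place into HTML.
function renderSummaryBox(citations) {
  const items = citations.map((c) => '<li>' + c + '</li>').join('');
  return '<aside class="ner-summary"><ul>' + items + '</ul></aside>';
}

// Step 4: inject the box just before </body>, or append it if there is none.
function injectSummary(html, box) {
  if (!html.includes('</body>')) {
    return html + box;
  }
  // A replacer function avoids special handling of "$" in the box markup.
  return html.replace('</body>', () => box + '</body>');
}

// Steps 1 to 4 together; `recognize` (step 2) takes the paragraphs and
// returns the list of citations found by the NER service.
function processArticle(html, recognize) {
  const citations = recognize(extractParagraphs(html));
  return injectSummary(html, renderSummaryBox(citations));
}
```

Passing the NER call in as a function keeps the pipeline independent of how the service is reached, so the same steps hold whether the request goes out over XML-RPC or something else.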

All in all, I think I came up with (1) mainly because my PHP is rather rusty at the moment ;). Therefore, although I’m quite reluctant to admit it, I might decide to go for (2). However, one good argument for the former is that it suits the case where users can decide, paper by paper, whether to enable this feature.

One Response to “PKP 2011 Hackfest”

  1. Alec Smecher October 1, 2011 at 4:44 pm #

    Matteo, big thanks for being the kind of participant we hoped to attract to the Hackfest. I’m at your service if and when you want to push this further.
