Category: conferences

Linked Open Data for the Ancient World at CAA 2012

This year the Computer Applications and Quantitative Methods in Archaeology (CAA) conference will be held in Southampton (26-30 March 2012). Together with Dr. Felix Schäfer (Deutsches Archäologisches Institut, Berlin) and Prof. Dr. Reinhard Förtsch (CoDArchLab, University of Cologne), I will be chairing a session on Linked Open Data for the Ancient World.

This session aims to explore the opportunities, challenges and methodological consequences of the Linked Open Data approach for the study of the ancient world. We welcome multi-disciplinary submissions dealing with the following or related aspects of Linked Open Data: URIs for Cultural Heritage objects, methodological consequences of LOD, projects publishing data as LOD, relevant tools and live applications based on LOD, digital libraries and their content in relation to ancient world objects, and other approaches to making data interoperable and interlinked.

The deadline for submission has been extended to Dec 7 (11:59pm GMT). Here you can find more details about the conference and read the call for papers, and there you can submit your abstract.

Linked Open Data for the Ancient World (abstract)

[session code: Data1]

The study of the Ancient World is by nature fertile ground for the adoption and exploitation of the Linked Open Data (LOD) approach. Its long tradition, the diversity of its materials and resources, and its high level of disciplinary specialisation have led to a situation where silos of knowledge, even when available online and under open access licenses, are isolated from each other. This situation is also reflected in the segmentation that the study of the Ancient World has reached, with the inevitable tendency to favour one single perspective at the expense of others. By contrast, the LOD approach allows us to integrate heterogeneous sources of information by means of links and persistent identifiers while preserving the disciplinary specificity of the data.

The recent adoption of the LOD principles by projects such as Pelagios [1], SPQR [2] and the British Museum [3], in acceptance of the CIDOC-CRM’s Linked Open Data Recommendation for Museums [4], is an important step towards a future of interoperable data in archaeology and classics. There is a variety of ways in which different resources are related to each other: an inscribed stone, for instance, will be linked to the edition of the text, to the building and location it belonged to, to different photographs of the object, to a record in the museum catalog and to related literature. Having those different pieces of information interconnected would allow us to overcome, to some degree, the fragmented view of antiquity mentioned above by rendering a more holistic image of the past.
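To make the inscribed-stone example concrete, here is a minimal sketch of how such links could be written down as subject-predicate-object statements. Everything in it is an assumption made for illustration: the `example.org` URIs, the predicate names and the `links_for` helper are invented, and a real project would use an established vocabulary (such as the CIDOC-CRM) and a proper RDF library rather than plain tuples.

```python
from collections import defaultdict

# All URIs and predicate names below are invented placeholders,
# not identifiers from any real collection or ontology.
STONE = "http://example.org/object/inscribed-stone-1"

TRIPLES = [
    (STONE, "hasTextEdition", "http://example.org/edition/42"),
    (STONE, "belongedToBuilding", "http://example.org/building/stoa"),
    (STONE, "foundAt", "http://example.org/place/agora"),
    (STONE, "depictedBy", "http://example.org/photo/1001"),
    (STONE, "depictedBy", "http://example.org/photo/1002"),
    (STONE, "describedBy", "http://example.org/museum/record/7"),
    (STONE, "discussedIn", "http://example.org/literature/article-3"),
]


def links_for(subject, triples):
    """Group the resources linked to `subject` by the linking predicate."""
    grouped = defaultdict(list)
    for s, predicate, obj in triples:
        if s == subject:
            grouped[predicate].append(obj)
    return dict(grouped)


if __name__ == "__main__":
    for predicate, targets in links_for(STONE, TRIPLES).items():
        print(predicate, "->", ", ".join(targets))
```

Because every resource is named by a URI, the edition, the photographs and the museum record can each live in a different collection and still be followed from the stone.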

In this session we shall discuss the advantages and disadvantages of LOD for the study of the Ancient World, look at available data, existing tools and live applications (beyond the testbed stage), and ask which steps should be taken to overcome existing obstacles and increase the amount of LOD. Furthermore, we welcome reflections on the opportunities, challenges and methodological consequences for the disciplines involved. In continuity with past sessions of the conference on related topics, this session addresses issues including but not limited to:

* URIs for Cultural Heritage objects

* methodological reflections on consequences of LOD

* experiences of projects publishing their data as LOD

* discussion of relevant tools and live applications based on LOD

* digital libraries and their content in relation to Ancient World objects

* other approaches to making data interoperable and interlinked

References

[1] http://pelagios-project.blogspot.com/

[2] http://spqr.cerch.kcl.ac.uk/

[3] http://collection.britishmuseum.org/About

[4] http://www.cidoc-crm.org/URIs_and_Linked_Open_Data.html

“The World of Thucydides” at CAA 2011

I’m at Heathrow airport waiting to board a flight to Beijing (via Amsterdam), where I’ll be attending the CAA 2011 conference. To get into the conference mood I thought it might be a good idea to post the abstract of the paper that my colleague Agnes Thomas (CoDArchLab, University of Cologne) and I are going to give within a session entitled Digging with words: e-text and e-archaeology. [This version is slightly longer than the one that we submitted and that was accepted.]

The World of Thucydides: from Texts to Artifacts and back

The work presented in this paper is related to the Hellespont project, an NEH-DFG funded project aimed at joining together the digital collections of Perseus and Arachne [1]. In this paper we present ongoing work aimed at devising a Virtual Research Environment (VRE) that allows scholars to access both archaeological and textual information [2].

An environment integrating these two heterogeneous kinds of information will be highly valuable for both archaeologists and philologists. The former will have easier access to literary sources of the historical period an artifact belongs to, whereas the latter will have at hand iconographic or archaeological evidence related to a given text. We therefore explore the idea of a VRE combining archaeological and philological data with another kind of textual information, namely secondary sources and in particular journal articles. To develop new ways of opening up and combining those different kinds of sources, the project will focus on the so-called Pentecontaetia of the Greek historian Thucydides (Th. 1,89-1,118).

As of now, we do not yet have an automatic tool capable of capturing the passages of Thucydides’ Pentecontaetia that are important to our knowledge of Athens and Greece during the Classical period. To identify such “links” we rely entirely on the irreplaceable, accurate manual work of scholars. For this reason, A. Thomas has done some preliminary work to manually identify, within the whole text of Thucydides’ Pentecontaetia, entities representing categories in the archaeological and philological evidence (e.g. built spaces, topography, individual persons, populations). What can be done to some extent by an automatic tool, however, is extracting and parsing both canonical and modern bibliographic references, which express the citation network between ancient texts (i.e. primary sources) and modern publications about them (i.e. secondary sources).
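As a rough illustration of what extracting and parsing a canonical reference involves, the sketch below recognises citations of the form “Th. 1,89” or “Th. 1,89-1,118” with a regular expression. This is a deliberately simplified stand-in and not the method used in the project: the pattern, the abbreviation table and the `extract_canonical_refs` function are assumptions made for this example, and it handles neither Roman-numeral book numbers nor authors other than Thucydides.

```python
import re

# Hypothetical abbreviation table; a real one would cover many authors and works.
ABBREVIATIONS = {"Th.": "Thucydides", "Thuc.": "Thucydides"}

# Matches "Th. 1,89", "Thuc. 1.89" and ranges such as "Th. 1,89-1,118".
CANONICAL_REF = re.compile(
    r"\b(?P<abbr>Thuc\.|Th\.)\s*"
    r"(?P<book>\d+)[,.]\s?(?P<chapter>\d+)"
    r"(?:\s?-\s?(?P<end_book>\d+)[,.]\s?(?P<end_chapter>\d+))?"
)


def extract_canonical_refs(text):
    """Return each reference found in `text` as author plus (book, chapter) bounds."""
    refs = []
    for match in CANONICAL_REF.finditer(text):
        start = (int(match["book"]), int(match["chapter"]))
        end = start  # a single passage is treated as a range of length one
        if match["end_book"] is not None:
            end = (int(match["end_book"]), int(match["end_chapter"]))
        refs.append(
            {"author": ABBREVIATIONS[match["abbr"]], "start": start, "end": end}
        )
    return refs


if __name__ == "__main__":
    sample = "the so-called Pentecontaetia (Th. 1,89-1,118); see also Thuc. 1.93."
    for ref in extract_canonical_refs(sample):
        print(ref)
```

A hand-written pattern like this breaks down quickly on the variety of citation styles found in real articles, which is one reason to prefer a trained sequence model for the actual task.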

As a corpus of secondary sources we are using the journal articles held in JSTOR, recently made available to researchers via the Data for Research (DfR) API [3]. Although JSTOR classifies these articles into the separate categories of archaeology and philology, they are likely to contain references to common named entities, which makes them overlap to some extent. As an example of what we are aiming at, in Th. 1,89 the author refers to the rebuilding of the Athenian city walls, after the Persian War at the beginning of the 5th century BC, as a result of the policies of the Athenian Themistocles. Within our VRE, the corresponding archaeological and philological metadata [4,5] will be presented to the user along with JSTOR articles from both archaeological and philological journals related to the contents of this text passage.

From a technical point of view, we are applying Named Entity Recognition techniques to JSTOR data accessed via the DfR API. References to primary sources, which are usually called “canonical references”, and bibliographic references to other modern publications are to be extracted and parsed from JSTOR articles and will be used to reconstruct the above-mentioned citation networks [6,7]. In terms of semantics, the CIDOC-CRM will provide us with a suitable conceptual model to express the complex annotations about texts, archaeological findings, physical entities and abstract concepts that scholars might want to create using such a VRE.
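Once references have been extracted, reconstructing a citation network comes down to indexing which articles cite which passages. The sketch below assumes the extraction step has already produced a list of cited passages per article; the article identifiers, the discipline labels and both function names are invented for illustration and do not reflect the DfR API or the project’s actual data model.

```python
from collections import defaultdict

# Invented records: (article id, discipline label, cited passages).
ARTICLES = [
    ("article-A", "archaeology", ["Thuc. 1.89", "Thuc. 1.93"]),
    ("article-B", "philology", ["Thuc. 1.89"]),
    ("article-C", "philology", ["Thuc. 1.100"]),
]


def build_citation_index(articles):
    """Map each cited passage to the set of (article id, discipline) citing it."""
    index = defaultdict(set)
    for article_id, discipline, passages in articles:
        for passage in passages:
            index[passage].add((article_id, discipline))
    return index


def shared_across_disciplines(index):
    """Return, sorted, the passages cited by articles from more than one discipline."""
    return sorted(
        passage
        for passage, citers in index.items()
        if len({discipline for _, discipline in citers}) > 1
    )


if __name__ == "__main__":
    index = build_citation_index(ARTICLES)
    print(shared_across_disciplines(index))
```

Passages cited from both sides are the natural meeting points between archaeological and philological literature, which is the kind of overlap a VRE could surface for a given text passage.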

References

[1] The Hellespont Project, <http://www.dainst.org/index_04b6084e91a114c63430001c3253dc21_en.html>.

[2] Judith Wusteman, “Virtual Research Environments: What Is the Librarian’s Role?,” Journal of Librarianship and Information Science 40, no. 2 (n.d.): 67-70.

[3] John Burns et al., “JSTOR – Data for Research,” in Research and Advanced Technology for Digital Libraries, ed. Maristella Agosti et al., vol. 5714, Lecture Notes in Computer Science (Springer Berlin / Heidelberg, 2009), 416-419, http://dx.doi.org/10.1007/978-3-642-04346-8_48.

[4] Themistokleische Mauer, http://arachne.uni-koeln.de/item/topographie/8002430

[5] http://www.perseus.tufts.edu/hopper/text?doc=Thuc.+1.89&fromdoc=Perseus:text:1999.01.01999

[6] Matteo Romanello, Federico Boschetti, and Gregory Crane, “Citations in the Digital Library of Classics: Extracting Canonical References by Using Conditional Random Fields,” in Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries (Suntec City, Singapore: Association for Computational Linguistics, 2009), 80–87, http://portal.acm.org/ft_gateway.cfm?id=1699763&type=pdf.

[7] Isaac Councill, C. Lee Giles, and Min-Yen Kan, “ParsCit: an Open-source CRF Reference String Parsing Package,” in Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08) (Marrakech, Morocco: European Language Resources Association (ELRA), 2008), http://www.comp.nus.edu.sg/~kanmy/papers/lrec08b.pdf.

(Very Asynchronous) Highlights from the “III incontro di Filologia Digitale” (Verona, 3-5 March 2010)

The third edition of the “Incontro di Filologia Digitale” was held in Verona on 3-5 March 2010. It was a three-day meeting with more than 15 presentations in total, organized by Adele Cipolla, Paola Cotticelli and Roberto Rosselli del Turco.

The asynchronous highlights presented here were selected according to my personal interests. For a complete overview please refer to the program and the full list of presentations.

Several presentations were related to epigraphy: Anelli, Muscariello and Sarullo talked about “The Digital Edition of Epigraphic Texts as Research Tool: the ILA Project”; Farina presented an “Electronic Analysis and Organization of the Syro-Turkic Inscriptions of China and Central Asia” and finally …

Barbera (handout not available) and Tomatis presented the progress of the Corpus Taurinense project, a corpus of texts written in 13th-century Italian. After Barbera’s brilliant introduction to the corpus, Tomatis focussed on the problem of disambiguating POS tagging.