1 Introduction

Shape-based classification and content-based retrieval of 3D data are active research topics in domains such as computer vision, mechanical CAD, cultural heritage and archaeology, molecular biology, paleontology and medicine [1]. Classification and retrieval experiments are run on benchmark datasets that provide a reliable groundtruth of classified data (in our case, 3D objects) in terms of morphological features, shape, color, etc.

In this work, we present the Orion Pottery Repository (http://orion.ee.duth.gr), a publicly available database that currently holds 160 textured 3D digital replicas of ancient Greek pottery, classified according to their shape. The replicas in the current version of the database are primarily derived from the database of the Archaeological Museum of Chania, while others belong to the Archaeological Museum of Abdera. Orion lets users expand the contents of the repository by adding their own 3D objects together with metadata that comply with the CARARE schema. Ancient Greek pottery was selected as Orion’s theme because pottery in general is one of the few artefact categories that come with texture, which provides supplemental information in relation to 3D shape. Moreover, morphological features such as the symmetry of a vessel’s main body are primary features that 3D content-based retrieval algorithms, like the ones proposed by Koutsoudis et al. [2] and Osada et al. [3], can exploit to achieve better retrieval and classification performance.

Since the 3D objects hosted in Orion derive from the archaeological domain, their categorization required collaboration with archaeologists [4]. One of the main approaches to classifying ancient Greek pottery is based on use, but this alone would not serve the purpose of this repository. The morphological and technical properties of pottery, as well as its decoration, are determined to a certain extent by the intended content and conditions of use, and vice versa. Additionally, the attribution of pottery to certain tasks is determined by its construction features [5]. Thus, such a classification is considered too generic and not applicable to shape classification and content-based retrieval. Another classification approach could be based on illustration technique, but this was also rejected for the same reason. In addition, the presented database can be exploited in other research areas such as the classification, retrieval and reconstruction of ancient vases from fragments [6].

Apart from downloading the benchmark dataset, Orion’s users can view each 3D digital replica together with its archaeological information, organized according to the CARARE metadata schema. CARARE is a harvesting schema intended to deliver metadata to the CARARE service environment about an organization’s online collections, monument inventory database and digital objects [7].

In addition, Orion’s registered users can upload their 3D content which can be organized in their own personal collections. This is achieved through a wide range of data management capabilities. Accessing and using Orion’s features is performed through a Web-based user interface. Orion is developed using open source technologies such as PHP, AJAX, HTML5, JavaScript, XML, XQuery and eXist-db.

The remainder of the paper is organized as follows: Sect. 2 briefly discusses recent 3D object benchmarking databases provided to the research community. Section 3 describes in detail the construction workflow of the Orion Pottery Repository, which consists of three parts: (i) the acquisition of the 3D objects, (ii) the organization of the cultural heritage metadata and (iii) the development of the proposed thesauri. Section 4 further develops the classification of the 3D dataset and, finally, Sect. 5 draws conclusions and discusses possible future work.

2 Related Work

The benefits of using a database system (repository) to store and efficiently process digital assets have been demonstrated by many applications over the years. The same applies to archaeological cultural heritage assets. For instance, the archaeological museums that provided us with 3D digital replicas of their artifacts use a database system to manage such assets efficiently. Ideally, the entries of such a database would be categorized automatically rather than manually. Content-based retrieval and classification algorithms can be used to perform this important and complex task. Evaluating the performance of such algorithms requires benchmark databases on which both classification and retrieval experiments can be performed. Numerous benchmark datasets have been proposed by different research groups. Many of these have been used in annual competitions such as the SHREC contest [8].

The existence of different benchmark datasets is related to the wide range of applications of 3D content-based retrieval and classification algorithms. For example, in computer architecture, the SPEC benchmarks [9] have been used successfully for evaluating processor performance. For facial recognition there is the GavabDB 3D face benchmark [10]. Additionally, the SHREC’07 Protein Retrieval Challenge [11] provides a benchmark dataset of 3D objects that represent protein structures.

Even though numerous benchmark datasets are available, only a limited number of them are related to the cultural heritage and archaeological domains. Table 1 gathers a few general-purpose benchmark datasets, while Table 2 shows domain-specific benchmark datasets. For each dataset, the tables list the domain coverage, the number of objects (cardinality) and whether texture information is provided.

Table 1. General purpose benchmark databases for 3D object retrieval
Table 2. Domain-specific benchmark databases for 3D object retrieval

In most cases, the texture of 3D objects does not offer essential information for their retrieval or classification. For this reason, the majority of retrieval and classification algorithms are based on features derived only from 3D geometry, neglecting the information contained in texture. However, there are cases where texture offers significant information. 3D models of human faces and pottery are such examples. Figure 1 depicts three examples of vases where the texture clearly holds additional vital information. Even though the first and third vases belong to the same class, their texture distinguishes them from each other significantly. While there is a database for human faces (see Table 2), to the best of our knowledge there is no database of textured pottery suitable for testing retrieval and classification algorithms. The proposed work attempts to offer such a database. Some of the innovative features of the proposed database are:

Fig. 1. Examples of pottery with texture. From left to right, based on shape, the vases are classified as amphora, lekythos and amphora

  • It is domain specific (Ancient Greek Vases).

  • All of its models were processed and they are 2D manifolds [12].

  • All 3D models have bitmap based texture information.

  • There is a proposed groundtruth schema suitable for classifying classical antiquity pottery (Sect. 3.3). The repository’s GUI offers dropdown lists to assist the classification of new entries. Additional groundtruth schemas will be available for other types of vases.

Even though many benchmark datasets have more 3D objects than Orion, we propose a domain-specific database that deals with textured 3D digital replicas of actual ancient Greek pottery artifacts. Orion therefore enables content-based retrieval and classification algorithms for 3D objects to be evaluated on real-world scenarios, offering an objective evaluation of their performance while assisting the development of the recent idea of matching 3D objects at the texture map level.

3 Orion Pottery Repository Construction Workflow

The cultural heritage information and digitization information for each vase, as well as the corresponding classification information, are stored in an instance of the eXist-db database, while a Web interface built with HTML5, PHP and JavaScript gives users access to the benchmark database’s capabilities. Communication between the Web interface and the database is carried out in the XQuery language.

The classification is discussed in detail in Sect. 4. It is important to mention that it was performed by human experts, based on vase shape, in collaboration with a specialized archaeologist due to the nature of the database’s entries.

3.1 3D Objects Acquisition

Orion currently contains 160 3D objects, obtained under a Creative Commons license from the 3D database of the Archaeological Museum of Chania [32] and from the 3D Europeana entries of the Archaeological Museum of Abdera [33]. This is considered the first version of the dataset and we expect that in the near future users will add more textured 3D models.

3.2 Cultural Heritage Metadata Organization

All information derived from the cultural heritage domain is organized using the CARARE metadata schema (Version 2.0.4). The same schema also accommodates information about the digitization procedures followed to create the 3D digital replicas of the pottery artefacts.

The metadata schema was developed by CARARE, a project aiming at the collection of unique archaeological monuments, architecturally important buildings, historical city centers, industrial monuments and landscapes for users of the European digital library Europeana [7]. The CARARE schema was later modified by the European program 3D Icons in order to be more suitable for 3D digital replicas of cultural objects. A set of tools was also provided to allow users to map metadata that follow the CARARE schema to the Europeana Data Model (EDM), which is used to describe the content presented in Europeana.

As in the EDM, the CARARE schema contains entities that allow a vast range of information to be described. Examples are the top entities of the CARARE metadata schema, each of which may in turn contain a number of nested entities. The CARARE schema follows an object-centric model for approaching metadata information. Figure 2 presents a visualization of the entities mentioned.

Fig. 2. A basic visualization of the CARARE metadata schema [7]

The second edition of the CARARE metadata schema, as developed by the 3D Icons project, dictates the structure of the XML files that hold the metadata of the vases in our database. The top-level entities of the schema are presented below [34]:

  • CARARE WRAP - Contains CARARE. There may be more than one record that follows the CARARE model.

  • CARARE - The CARARE start entity. Includes Heritage Asset, Collection information, Digital Resource and Activity entities:

    • Collection information - contains the description of the collection.

    • Heritage Asset - holds the metadata for a monument, building or cultural object (including printed matter and digital objects), covering both descriptive and management metadata.

    • Digital resource - holds metadata about a digital resource, including its location on the Internet.

    • Activity - contains the metadata for an event or activity related to the heritage asset.
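
The nesting of these top entities can be pictured with a minimal Python sketch. It is only an illustration of the containment described above: the class and field names, the cardinalities and the placeholder values are assumptions made for this example and are not the element names of the actual CARARE XML schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HeritageAsset:
    name: str  # metadata for the monument, building or cultural object


@dataclass
class DigitalResource:
    link: str  # location of the digital resource on the Internet


@dataclass
class Activity:
    description: str  # event or activity related to the heritage asset


@dataclass
class CarareRecord:
    # One CARARE record groups the four entities listed above.
    # The one-to-many cardinalities below are assumptions.
    collection_information: str
    heritage_asset: HeritageAsset
    digital_resources: List[DigitalResource] = field(default_factory=list)
    activities: List[Activity] = field(default_factory=list)


@dataclass
class CarareWrap:
    # The wrapper may hold more than one CARARE record.
    records: List[CarareRecord] = field(default_factory=list)


# Placeholder values, not taken from the repository.
record = CarareRecord(
    collection_information="placeholder collection",
    heritage_asset=HeritageAsset(name="placeholder amphora"),
    digital_resources=[DigitalResource(link="https://example.org/model.obj")],
    activities=[Activity(description="3D digitization")],
)
wrap = CarareWrap(records=[record])
print(len(wrap.records), wrap.records[0].heritage_asset.name)
```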

Figure 3 shows the top entities of the CARARE metadata schema version 2.

Fig. 3. Top entities of the CARARE metadata schema version 2

3.3 Proposed Thesauri Development

As mentioned in Sect. 1, archaeologists contributed to the classification of the dataset.

The taxonomic system applied is based on shape, one of the main criteria for pottery typology. There is a vast number of pottery manuals, let alone bibliography on specific topics of pottery analysis, that are useful for endeavors like this. We relied mainly on the manuals of Boardman [35,36,37], Cook [38], Schreiber [39], Trendall [40] and Avramidou and Tsiafaki [41]. We also made extensive use of the Classical Art Research Centre website of the University of Oxford [42] and the Perseus Digital Library of Tufts University [43].

Variable levels of classification (depths) are used. The first level (depth) of classification is the basic distinction between open shapes (e.g. dish, pinakion/plate, phiale), closed shapes (e.g. amphora, pithos, oinochoe) and small objects. The second level includes all the basic shapes of classical antiquity, whose names are usually associated with their use (e.g. oinochoe from oino- = wine + choē = action of pouring out). A further grading, containing the variations within each shape, constitutes the third level (e.g. geometric oinochoe, lagynos, attic oinochoe etc.), whereas on quite a few occasions the taxonomy reaches a fourth level and an even more detailed typology (e.g. attic oinochoe type I, chous, etc.). We have also included a few examples of prehistoric pottery as well as lamps, whose contour makes them suitable for comparison, although technically they are considered minor objects, not pottery. Applying the archaeological taxonomy resulted in a total of 147 different categories to which a hosted item can belong. It must be noted that each item belongs to one and only one of those categories.
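
The variable-depth hierarchy can be illustrated with a small Python sketch. The tree below is a tiny excerpt built only from the example names given in the text; placing "lamp" under small objects and "attic oinochoe type I" under "attic oinochoe" is an assumption made for the illustration, and the helper names are hypothetical.

```python
# Tiny excerpt of the taxonomy as a nested dictionary: level 1 -> 2 -> 3 -> 4.
# Placement of the deeper entries is an assumption for illustration only.
TAXONOMY = {
    "open": {"dish": {}, "pinakion/plate": {}, "phiale": {}},
    "closed": {
        "amphora": {},
        "pithos": {},
        "oinochoe": {
            "geometric oinochoe": {},
            "lagynos": {},
            "attic oinochoe": {"attic oinochoe type I": {}},
        },
    },
    "small objects": {"lamp": {}},
}


def leaf_paths(tree, prefix=()):
    """Yield the full path of every category that has no sub-category."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


def is_valid(path, tree=TAXONOMY):
    """Return True if the path names an existing category at any depth."""
    node = tree
    for name in path:
        if name not in node:
            return False
        node = node[name]
    return True


paths = list(leaf_paths(TAXONOMY))
depths = sorted({len(p) for p in paths})
print(len(paths), depths)
```

Labelling each hosted item with exactly one such path is one simple way to respect the rule that an item belongs to a single category.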

The goal, therefore, was to create a thesaurus similar to those provided by the Getty Research Institute [44] and the European Heritage Network (HEREIN) [45]. Using a controlled vocabulary to describe cultural heritage assets greatly assists research communities, such as those developing content-based retrieval and classification algorithms, because free-text annotation can introduce ambiguities.

4 Classification

According to the classification described above, there are four levels (depths) of classification based on a vase’s shape. The first level of classification includes three types of ancient objects: (i) open vases, (ii) closed vases and (iii) small objects. These three main categories are subdivided into a set of sub-classes, which in turn have up to two additional levels of classification, again related to shape.

The second level of classification includes the following classes: dish, exaleiptron, kantharos, klepsydra/water clock, krater, kyathos, kylix, lakaina, lebes/cauldron, louterion, pinakion/plate, skyphos, stamnos, phiale, alabastron, amphora, aryballos, askos, feeding bottle, lekythos, lydion, oinochoe, pithos, pithamphora, pyriatiri, pyxis, rhyton, hydria, phormiskos, psykter, lamp.

In total, the second level of classification has 31 different categories, whereas the third level has 106 categories. Finally, the fourth level, which concerns a few selected shapes, has only 29 different categories. Table 3 shows all the different classes derived from the four-level classification that was applied.

Table 3. Orion vases database classification

Moreover, in order to utilize the shape-based classification, a set of XML files was created, each representing one category of the classification. Each file has a unique identifier (ID), as well as four other fields: the use of vases in this category, their dating, a paragraph with a short description, and the location of an image showing an example vase of the category. There are 147 such files in total, all complying with the XSD file built to validate them.

The root element of each XML file is <vase>. It contains the id entity, which is the vessel category identifier, and general_type, which holds the top-level class (closed vases, open vases, small items). The vase_name entity describes the shape, i.e. the second level of classification, while the type and type_1 entities hold the third and fourth levels, respectively. The description field describes a vase in this category and thumbnail is the URL of the photo; bibliography can optionally be attached to the description entity. Finally, all entities except id can be repeated within an XML file to record the information in different languages. The lang attribute records the language in which the content of a field is written.
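
As a rough illustration of how such a category file could be read, the following Python sketch parses a hypothetical example with the standard library. The element names follow the description above, but the identifier, the values, the language codes and the flat nesting are assumptions; the fields for use and dating are omitted because their element names are not given here, and the real files are governed by the repository’s XSD.

```python
import xml.etree.ElementTree as ET

# Hypothetical category file; values and "lang" codes are placeholders.
SAMPLE = """\
<vase>
  <id>example-001</id>
  <general_type lang="en">closed vases</general_type>
  <vase_name lang="en">oinochoe</vase_name>
  <type lang="en">attic oinochoe</type>
  <type_1 lang="en">attic oinochoe type I</type_1>
  <description lang="en">Placeholder description.</description>
  <description lang="el">Placeholder description in Greek.</description>
  <thumbnail>images/placeholder.jpg</thumbnail>
</vase>
"""

# Elements that hold classification levels 1 to 4, in order.
LEVEL_TAGS = ("general_type", "vase_name", "type", "type_1")


def read_category(xml_text, lang="en"):
    """Return the id, classification path and description for one language."""
    root = ET.fromstring(xml_text)

    def text_for(tag):
        # Entities may be repeated once per language; pick the requested one.
        for element in root.findall(tag):
            if element.get("lang") == lang:
                return element.text
        return None

    levels = [text_for(tag) for tag in LEVEL_TAGS]
    return {
        "id": root.findtext("id"),
        "path": [level for level in levels if level is not None],
        "description": text_for("description"),
        "thumbnail": root.findtext("thumbnail"),
    }


category = read_category(SAMPLE)
print(category["id"], " > ".join(category["path"]))
```

Categories that stop at the second or third level would simply leave out the type or type_1 elements, in which case the returned path is shorter.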

Figure 4 presents a visualization of the previously described schema.

Fig. 4. Classification schema visualization

5 Conclusions

In summary, this paper describes the Orion Pottery Repository, a publicly available benchmark dataset that hosts textured 3D objects of real ancient Greek vases. Registered users can download the objects, along with a groundtruth file, to evaluate the performance of content-based retrieval and classification algorithms. All data and source code are freely available on the Web (http://orion.ee.duth.gr/).

The main contribution of this work is that we provide textured 3D objects that are also classified based on their shape. We hope that retrieval researchers will use the benchmark in future experiments, further advancing content-based retrieval and classification algorithms.

As future work, we intend to upload and classify more 3D objects of vases across the different classes that we have suggested, in order to make any performance evaluation based on our data more challenging and objective.