Introduction

The eSciDoc Infrastructure is middleware for eScience applications. It encapsulates a repository (Fedora Commons) and implements a broad range of services. Its service-oriented architecture fosters the creation of autonomous services, which can be reused independently of the rest of the infrastructure. The multi-disciplinary nature of the Max Planck Society (MPS) ensures coverage of a broad range of generic and discipline-specific requirements.

The eSciDoc Infrastructure is an "enabling technology": scholars and scientists can focus on domain-specific application logic when building new applications. It provides them with an existing, proven implementation of common functionality, which helps ensure interoperability and compliance with important standards. Additionally, it allows a dedicated unit, e.g., a fully-fledged data center, to operate the production environment. The institutes do not have to manage the production services themselves and can concentrate on their scientific and scholarly work.

Key Features of the eSciDoc Infrastructure

  • Flexible content models
  • Arbitrary metadata profiles
  • Application-independent design
  • Support for object relations and multiple ontologies
  • Search (OpenSearch, SRW/SRU)
  • Distributed Authentication/Authorization (Shibboleth, CAS)

Services of the eSciDoc Infrastructure

Currently, the eSciDoc Infrastructure includes the following services:

Additional software

The Core Infrastructure is built mainly from existing open-source software packages. The main components are PostgreSQL, the JBoss Application Server, and the Tomcat Servlet Container. The eSciDoc Content Repository is based on Fedora (Flexible Extensible Digital Object Repository Architecture). Fedora comes with a Semantic Store (Kowari Triplestore or MPTStore), which allows for the efficient management of statements about objects and their relations, expressed in RDF (Resource Description Framework). Related objects form a graph, which can then be queried or used to infer new facts from the existing RDF statements.
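The idea of querying a graph of statements and inferring new facts can be illustrated with a minimal in-memory sketch in Python. It does not use Fedora, Kowari, or MPTStore; the TripleStore class, the identifiers ("item:1", "container:A") and the predicate names ("isPartOf", "hasTitle") are hypothetical, and treating "isPartOf" as transitive is an assumption made only for this example.

```python
from typing import List, Optional, Set, Tuple

# A statement is a (subject, predicate, object) triple, as in RDF.
Triple = Tuple[str, str, str]


class TripleStore:
    """Hypothetical in-memory stand-in for a semantic store."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return all triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self._triples
            if all(want is None or want == got for want, got in zip(pattern, t))
        )


def infer_transitive(store: TripleStore, predicate: str) -> Set[Triple]:
    """Return the triples implied (but not stated) if `predicate` is transitive."""
    edges = {(s, o) for s, _, o in store.match(predicate=predicate)}
    closure = set(edges)
    changed = True
    while changed:  # repeat until no new pair can be derived
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return {(a, predicate, d) for a, d in closure - edges}


store = TripleStore()
store.add("item:1", "isPartOf", "container:A")
store.add("container:A", "isPartOf", "container:B")
store.add("item:1", "hasTitle", "Annual report")

# Query: which containers is item:1 directly part of?
direct = store.match(subject="item:1", predicate="isPartOf")

# Inference: item:1 is also (indirectly) part of container:B.
inferred = infer_transitive(store, "isPartOf")
```

A production triplestore would answer such queries through a query language and indexes rather than by scanning a set, but the pattern-matching and rule-application steps are the same in principle.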

Components of the Basic Services Layer

The eSciDoc Infrastructure is implemented as a Java Enterprise Application (J2EE). It can be roughly divided into the Enterprise Context and a Persistence Layer. The Enterprise Context is deployed to the JBoss Application Server and the Tomcat Servlet Container. The Spring Framework provides centralized, automated configuration and wiring of the application objects through Dependency Injection and Inversion of Control. The service layer offers web services with REST and SOAP interfaces. The Persistence Layer encompasses specialized solutions for the different types of data: an RDBMS for structured data, Fedora for unstructured data, and MPTStore for semantic data.
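The following Python sketch illustrates the principle of Dependency Injection combined with a persistence layer that is split by type of data. It is not Spring and not the eSciDoc code: all class and method names are hypothetical, the three in-memory stores merely stand in for an RDBMS, Fedora, and MPTStore, and which piece of data goes to which store is an assumption made for illustration.

```python
from typing import Dict, List, Tuple


class InMemoryRecordStore:
    """Stand-in for the RDBMS (structured data)."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, str]] = {}

    def save(self, object_id: str, properties: Dict[str, str]) -> None:
        self.rows[object_id] = dict(properties)


class InMemoryObjectStore:
    """Stand-in for the content repository (unstructured data)."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def put(self, object_id: str, content: bytes) -> None:
        self.blobs[object_id] = content


class InMemoryTripleStore:
    """Stand-in for the semantic store (statements about relations)."""

    def __init__(self) -> None:
        self.triples: List[Tuple[str, str, str]] = []

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.append((subject, predicate, obj))


class ItemService:
    """Receives its collaborators from outside instead of creating them."""

    def __init__(self, records, objects, triples) -> None:
        self._records = records
        self._objects = objects
        self._triples = triples

    def create_item(self, item_id: str, owner: str,
                    content: bytes, container_id: str) -> None:
        self._records.save(item_id, {"owner": owner})           # structured
        self._objects.put(item_id, content)                     # unstructured
        self._triples.add(item_id, "isMemberOf", container_id)  # semantic


def wire() -> ItemService:
    """Composition root: the only place that names concrete classes.

    A container such as Spring performs this step from configuration.
    """
    return ItemService(
        InMemoryRecordStore(), InMemoryObjectStore(), InMemoryTripleStore()
    )


service = wire()
service.create_item("item:1", "alice", b"%PDF-1.4", "container:A")
```

Because ItemService only depends on the methods it calls, each store can be replaced (for example by a test double) without changing the service itself, which is the practical benefit of Inversion of Control.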

The Basic Layer implements a set of resource handlers. Each resource handler is responsible for handling a specific type of resource. The most important resources are Items, Containers, and Contexts. Items are basic objects that represent content entities within the repository, e.g. articles, images, or videos. Containers are aggregation objects that allow for arbitrary grouping of Items and other Containers. Whereas the general layout of Item and Container resources remains the same, they can be further specialized by content types. Content types impose constraints on objects (e.g. allowed metadata schemas, required metadata, and allowed file types and MIME types for the binary content) and specify a set of content-type-specific properties. Contexts represent units of administration for a set of Items and Containers. They are associated with an institutional body responsible for the management of the content.
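The relationship between Items, Containers, Contexts, and content types can be sketched as a small data model. This is a simplified Python illustration, not the eSciDoc object model: all class names, field names, and the validate and iter_items helpers are hypothetical, and the constraints shown cover only metadata schemas and MIME types.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterator, List, Union


@dataclass(frozen=True)
class ContentType:
    """Constraints that a content type imposes on objects."""
    name: str
    allowed_metadata_schemas: FrozenSet[str]
    required_metadata: FrozenSet[str]
    allowed_mime_types: FrozenSet[str]


@dataclass
class Item:
    """A basic content entity, e.g. an article, image, or video."""
    item_id: str
    content_type: ContentType
    metadata: Dict[str, dict]  # metadata schema name -> metadata record
    mime_types: List[str]      # MIME types of the attached binary content


@dataclass
class Container:
    """Aggregation object grouping Items and other Containers."""
    container_id: str
    members: List[Union["Item", "Container"]] = field(default_factory=list)


@dataclass
class Context:
    """Unit of administration, tied to a responsible institutional body."""
    name: str
    responsible_body: str
    members: List[Union[Item, Container]] = field(default_factory=list)


def validate(item: Item) -> List[str]:
    """Return the constraint violations of an Item (empty list if valid)."""
    ct = item.content_type
    problems = []
    for schema in sorted(item.metadata):
        if schema not in ct.allowed_metadata_schemas:
            problems.append(f"metadata schema not allowed: {schema}")
    for schema in sorted(ct.required_metadata - set(item.metadata)):
        problems.append(f"required metadata missing: {schema}")
    for mime in item.mime_types:
        if mime not in ct.allowed_mime_types:
            problems.append(f"MIME type not allowed: {mime}")
    return problems


def iter_items(container: Container) -> Iterator[Item]:
    """Yield all Items in a Container, descending into nested Containers."""
    for member in container.members:
        if isinstance(member, Container):
            yield from iter_items(member)
        else:
            yield member


article = ContentType(
    name="article",
    allowed_metadata_schemas=frozenset({"dc", "mods"}),
    required_metadata=frozenset({"dc"}),
    allowed_mime_types=frozenset({"application/pdf"}),
)
paper = Item("item:1", article, {"dc": {"title": "A paper"}}, ["application/pdf"])
scan = Item("item:2", article, {"mods": {}}, ["image/png"])

issue = Container("container:issue", [paper])
journal = Container("container:journal", [issue, scan])
library = Context("Library", "Example Institute", [journal])
```

Here validate(paper) returns an empty list, while validate(scan) reports the missing required "dc" record and the disallowed "image/png" content. Keeping the constraints in a separate ContentType object mirrors the point made above: the layout of Items and Containers stays the same, and only the attached content type changes what is permitted.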

Related Information: