Description
Within the framework of ARIADNEplus task NA4 and EOSC-Pillar work packages 6.5 and 6.9, a cloud system called Tools for HEritage Science Processing, Integration, and ANalysis (THESPIAN) was developed. It offers multiple microservices to the researchers of the Cultural Heritage Network (CHNet) of INFN, covering the whole path from storing raw data to reusing them, and follows the FAIR principles to establish integration and interoperability among shared information.
The CHNet cloud currently offers three web services: THESPIAN-Mask, a service for assisted metadata generation and data storage, based on a purpose-built ontology called CRMhs; THESPIAN-NER, a tool based on a deep neural network for Named Entity Recognition, which interprets archaeological documents written in Italian and annotates them by extracting named entities, which are then used to build custom CHNet database queries; and XRF analyser, a tool for online, real-time processing of raw data from X-Ray Fluorescence imaging analyses performed on Cultural Heritage.
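As an illustration of how extracted named entities could drive a database query, the following is a minimal sketch, not the THESPIAN-NER implementation. It assumes a hypothetical document schema in which each stored record carries an `entities` array of `{label, text}` sub-documents; the function name `entities_to_filter` and the labels used are likewise illustrative. Only the MongoDB query operators `$and`, `$elemMatch` and `$in` are real.

```python
from collections import defaultdict


def entities_to_filter(entities):
    """Build a MongoDB-style filter dict from (label, text) pairs.

    The 'entities' field and its 'label'/'text' keys are a hypothetical
    schema, assumed here only for illustration.
    """
    by_label = defaultdict(list)
    for label, text in entities:
        norm = text.strip().lower()  # normalise so duplicates collapse
        if norm and norm not in by_label[label]:
            by_label[label].append(norm)

    # One clause per label: the record must contain at least one entity
    # with that label whose text is among the extracted values.
    clauses = [
        {"entities": {"$elemMatch": {"label": label, "text": {"$in": texts}}}}
        for label, texts in sorted(by_label.items())
    ]
    if not clauses:
        return {}  # an empty filter matches every document
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}


# Hypothetical entities, as a NER step might return them
query = entities_to_filter([("LOC", "Pompei"), ("MATERIAL", "bronzo")])
```

The resulting dictionary could be passed as the filter argument of a MongoDB find operation; how the real service maps entities to queries is not specified in this description.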
The platforms have a modular architecture based on containers: the first one hosts a web server offering a graphical web application to input and search data and institutions; the second one is a server that processes requests from the web application and interacts with the database deployed within a third container. In front of these, a fourth container hosts a reverse proxy that acts as an SSL/TLS terminator and enforces access control policies through a token-based authorisation mechanism relying on the IAM service developed by the INDIGO project. The container architecture allows easy deployment of the whole cloud service. The front end was developed using Angular with TypeScript, while the RESTful APIs are written in either Node.js or Django; the latter was chosen to easily embed the deep neural network, which was developed in Python. For persistent data storage, the NoSQL database MongoDB is employed.
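The token-based access check performed in front of the services can be sketched as follows. This is a simplified model under stated assumptions, not the actual proxy configuration: it assumes OAuth-style bearer tokens, and `introspect` is a hypothetical callable standing in for a request to the IAM service. The claim names `active`, `exp` and `scope` follow the OAuth 2.0 token introspection convention; the scope name `chnet.read` is invented for the example.

```python
import time


def authorise(headers, introspect, required_scope, now=None):
    """Return (status_code, reason) for a request, as a proxy might.

    'introspect' is a hypothetical stand-in for a call to the IAM
    service; it maps a token string to a dict of claims, or None.
    """
    now = time.time() if now is None else now

    # Expect a header of the form "Authorization: Bearer <token>"
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return 401, "missing bearer token"

    claims = introspect(token.strip())
    if not claims or not claims.get("active", False):
        return 401, "token not active"
    if claims.get("exp", 0) <= now:
        return 401, "token expired"

    # 'scope' is assumed to be a space-separated string of granted scopes
    if required_scope not in claims.get("scope", "").split():
        return 403, "insufficient scope"
    return 200, "ok"


def fake_introspect(token):
    # Test double replacing the real IAM lookup
    if token == "good":
        return {"active": True, "exp": 2000, "scope": "chnet.read"}
    return None


result = authorise({"Authorization": "Bearer good"}, fake_introspect,
                   "chnet.read", now=1000)
```

Placing this check in the proxy container means the web server, the API server and the database never see unauthenticated traffic, which is consistent with the layered layout described above.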
The digital infrastructure, with all its services, is currently in the pre-production stage, in which only a few researchers can access the system, mainly to test the infrastructure and web services, ingest preliminary data and check whether the design of the system fits their needs. The system will then be opened to all the researchers of the network for further general tests and to ingest generic scientific data on cultural heritage. In terms of orders of magnitude, the system is expected to handle about O(100) data reports per network node inserted during the beta testing phase, with O(10) nodes in the network, each consisting of O(10) researchers.
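Taking the orders of magnitude quoted above at face value, a back-of-the-envelope estimate of the beta-phase load looks like this. The exact values 100, 10 and 10 are illustrative stand-ins for O(100), O(10) and O(10), not measured figures.

```python
# Illustrative stand-ins for the quoted orders of magnitude
REPORTS_PER_NODE = 100      # O(100) data reports per node
NODES = 10                  # O(10) nodes in the network
RESEARCHERS_PER_NODE = 10   # O(10) researchers per node

total_reports = REPORTS_PER_NODE * NODES
total_researchers = RESEARCHERS_PER_NODE * NODES

print(f"reports: ~{total_reports}, researchers: ~{total_researchers}")
# prints "reports: ~1000, researchers: ~100"
```

That is, roughly O(1000) data reports and O(100) users in total during beta testing.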