Infrastructure of AI4PublicPolicy services pt.2

This second and final part of the blog series on the Infrastructure of AI4PublicPolicy services covers the Compute and Data Federation layer and the Platforms layer.


Compute and Data Federation

The Compute and Data Federation layer provides advanced solutions to manage the resources available in the Federated Resource Providers layer, such as the DataHub for the federation and management of data and the cloud orchestrators for the management of compute resources.

The following figure shows the Federated Resource Providers layer.

DataHub: EGI DataHub is a federated service integrated with EGI Check-in. It allows users to access and share their data from anywhere, either through restricted access based on access tokens or as publicly shared data sets. EGI DataHub is provisioned on top of the Onedata distributed data access and management system. It is a globally distributed storage solution that integrates storage services from various providers, which may use heterogeneous underlying technologies.

DataHub enables seamless data sharing between users, with strict access control. Users can share access to individual files as well as spaces by sending automatically generated access tokens.

All DataHub components have APIs defined using OpenAPI specification version 2.0, enabling easy integration and automatic generation of client libraries for most existing programming languages and frameworks.
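As a rough illustration of token-based access through such an API, the sketch below prepares (but does not send) an HTTP request that would list the spaces visible to a given token. The host name is a placeholder, and the `X-Auth-Token` header and the `/api/v3/oneprovider/spaces` path are assumptions based on Onedata's REST conventions; they should be checked against the OpenAPI definition of the deployment in use.

```python
from urllib.request import Request

# Hypothetical provider host; replace with a real Oneprovider endpoint.
EXAMPLE_HOST = "datahub.example.org"


def build_list_spaces_request(host: str, access_token: str) -> Request:
    """Prepare a GET request that would list the spaces a token can see.

    Assumption: the endpoint path and the X-Auth-Token header follow
    Onedata's REST conventions; verify them against the service's
    OpenAPI specification before relying on this.
    """
    url = f"https://{host}/api/v3/oneprovider/spaces"
    request = Request(url, method="GET")
    request.add_header("X-Auth-Token", access_token)
    request.add_header("Accept", "application/json")
    return request


if __name__ == "__main__":
    req = build_list_spaces_request(EXAMPLE_HOST, "REPLACE_WITH_TOKEN")
    # Nothing is sent here; pass `req` to urllib.request.urlopen to run it.
    print(req.get_method(), req.full_url)
```

Because the token travels in a header, the same request shape works whether the token was generated by the data owner or received from another user who shared a space.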

Infrastructure Manager: The Infrastructure Manager (IM) Dashboard is designed to enable non-advanced users to launch complex virtual infrastructures on top of a wide range of cloud providers (AWS, Google Cloud, Microsoft Azure, EGI Cloud Computing, OpenNebula, OpenStack, and more). With only a few clicks, the user can deploy any of the available topologies, which are expressed through templates written in TOSCA (Simple Profile in YAML version 1.0). The IM service then orchestrates the whole process: deployment of cloud resources, configuration, software installation, monitoring, and update of the virtual infrastructures.
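To give an idea of what such a template looks like, the sketch below renders a minimal TOSCA Simple Profile in YAML 1.0 topology with a single compute node. The node name `simple_node`, the default sizes, and the `render_template` helper are hypothetical; templates used with IM in practice would likely carry more detail (operating system image, network endpoints, software configuration).

```python
from string import Template

# Minimal TOSCA Simple Profile in YAML 1.0 topology with one compute node.
# The node name "simple_node" and the default sizes are hypothetical.
TOSCA_TEMPLATE = Template("""\
tosca_definitions_version: tosca_simple_yaml_1_0

description: Single virtual machine (illustrative sketch)

topology_template:
  node_templates:
    simple_node:
      type: tosca.nodes.Compute
      capabilities:
        host:
          properties:
            num_cpus: $num_cpus
            mem_size: $mem_size_gb GB
""")


def render_template(num_cpus: int = 2, mem_size_gb: int = 4) -> str:
    """Fill in the VM size and return the template as YAML text."""
    if num_cpus < 1 or mem_size_gb < 1:
        raise ValueError("num_cpus and mem_size_gb must be positive")
    return TOSCA_TEMPLATE.substitute(num_cpus=num_cpus, mem_size_gb=mem_size_gb)


if __name__ == "__main__":
    print(render_template(num_cpus=4, mem_size_gb=8))
```

Keeping the size as a parameter mirrors the way a dashboard can expose a few inputs to the user while the rest of the topology stays fixed in the template.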

The Infrastructure Manager offers several functionalities to its users. These include OIDC authentication, which ensures secure access to the platform. Users can view their infrastructures and access details, templates, and logs for each infrastructure. Additionally, they can create new infrastructures and add nodes to them, as well as resize virtual machines. The platform also provides access to cloud resources, making it a comprehensive solution for managing infrastructure. Lastly, users have the option to delete infrastructures when they are no longer needed.

The entry web page for the tool is shown in the following screenshot:

Elastic Cloud Computing Cluster: Elastic Cloud Computing Cluster (EC3) is a tool to create elastic virtual clusters on top of Infrastructure as a Service (IaaS) providers. Because it is based on the Infrastructure Manager (IM), EC3 supports the same wide choice of back-ends. It offers recipes to deploy SLURM, Kubernetes, Apache Mesos and others. EC3 creates elastic cluster-like infrastructures that can automatically scale up or down, depending on demand and the configured policies. This creates the illusion of a real cluster without requiring investment beyond the actual usage, delivering a cost-effective elastic Cluster-as-a-Service on top of an IaaS cloud.
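The idea of scaling with demand under a configured policy can be sketched as follows. This is not EC3's actual algorithm: the function name, its parameters, and the "one node per fixed number of jobs" rule are assumptions chosen only to show how a policy with lower and upper bounds turns workload into a target cluster size.

```python
import math


def desired_worker_count(pending_jobs: int, running_jobs: int,
                         jobs_per_node: int, min_nodes: int,
                         max_nodes: int) -> int:
    """Return how many worker nodes a simple policy would ask for.

    Hypothetical policy: provision enough nodes to hold all pending and
    running jobs, but never fewer than min_nodes or more than max_nodes.
    """
    if jobs_per_node < 1:
        raise ValueError("jobs_per_node must be at least 1")
    if not 0 <= min_nodes <= max_nodes:
        raise ValueError("expected 0 <= min_nodes <= max_nodes")
    if pending_jobs < 0 or running_jobs < 0:
        raise ValueError("job counts cannot be negative")

    # Nodes needed to fit the whole workload, rounded up.
    needed = math.ceil((pending_jobs + running_jobs) / jobs_per_node)
    # Clamp to the configured bounds of the cluster.
    return max(min_nodes, min(max_nodes, needed))


if __name__ == "__main__":
    # An idle cluster shrinks to its minimum; a burst grows it up to the cap.
    for pending in (0, 10, 100):
        print(pending, desired_worker_count(pending, 0, 4, 0, 5))
```

With a minimum of zero workers, an idle cluster costs nothing beyond its front-end, which is the cost argument made above.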


Platforms layer

The Platforms layer is built on top of the federation, either by using IaaS APIs or Federated IaaS provisioning, and can provide community-specific data, tools and applications. Two relevant services are Notebooks and the DEEP training facility.

Notebook: Jupyter Notebooks complemented with Binder allow users to create data- and code-driven narratives made of interactively executable code, equations, descriptive text, interactive dashboards, and other rich media. Their integration with the other services of the computing platform, from Check-in to DataHub, makes it easy to build a community interactive notebook environment with fine-grained control over user access and the ability to access and share stored data.