From AI Outputs to Searchable Knowledge
The role of SKIM in DaFab system architecture
Introduction: Turning a Copernicus-scale archive into something you can search
Copernicus has grown into an archive where the limiting factor is rarely the availability of pixels, but the ability to find the right ones. Today, users still have to start from Copernicus product descriptors (e.g. Sentinel-2 tile, date, processing level) and only later test whether the data contains the signal of interest. The DaFab system addresses this gap by generating secondary, AI-derived metadata at scale and exposing it as a discovery surface, so that users can begin with a thematic question instead of file selection (e.g. “How many agricultural parcels are there?” for the smart-agriculture theme, or “Where can I find water anomalies?” for water analysis).
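To make the idea concrete, here is a minimal, self-contained sketch of thematic discovery over AI-derived metadata. All names and fields (`thematic_search`, `ai_metadata`, `parcel_count`, the product IDs) are hypothetical illustrations, not the actual SKIM or DaFab API; the point is only that the query starts from a thematic question rather than from tile/date descriptors.

```python
# Hypothetical catalogue: each product carries its usual descriptors
# plus secondary, AI-derived metadata used as the discovery surface.
products = [
    {"id": "S2A_T31TCJ_20240501", "level": "L2A",
     "ai_metadata": {"theme": "smart-agriculture", "parcel_count": 812}},
    {"id": "S2B_T30UXC_20240503", "level": "L2A",
     "ai_metadata": {"theme": "water-analysis", "water_anomaly": True}},
    {"id": "S2A_T31TCJ_20240506", "level": "L1C",
     "ai_metadata": {"theme": "smart-agriculture", "parcel_count": 0}},
]

def thematic_search(catalogue, **criteria):
    """Return products whose AI-derived metadata matches all criteria."""
    return [p for p in catalogue
            if all(p["ai_metadata"].get(k) == v for k, v in criteria.items())]

# Begin with the thematic question, not with file selection:
hits = thematic_search(products, theme="water-analysis", water_anomaly=True)
print([p["id"] for p in hits])  # → ['S2B_T30UXC_20240503']
```

A real deployment would evaluate such queries against an indexed metadata store rather than an in-memory list, but the user-facing contract is the same: thematic criteria in, matching products out.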
Multi-Cluster Workflow Execution with Karmada and Argo Workflows
In our previous DaFab post, we introduced the overall multi-site orchestration vision. This entry focuses on a specific architectural building block: integrating Karmada with Argo Workflows to enable multi-cluster, multi-site workflow execution driven by rule-based placement. The key outcome is that workflow steps can be dispatched to different Kubernetes clusters (sites) based on explicit rules that reflect data-locality and resource-availability goals, without changing how users author workflows.
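The effect of rule-based placement can be sketched in a few lines. This is an illustrative model only, not Karmada's actual policy engine: the rule predicates, step labels, and cluster names below are invented, but the mechanism shown (each step is matched against an ordered rule list and dispatched to the first cluster whose rule it satisfies) is the behaviour the integration aims for.

```python
# Hypothetical placement rules: (predicate over step labels, target cluster).
# Rules are evaluated in order; the last rule is a catch-all default.
PLACEMENT_RULES = [
    (lambda labels: labels.get("dataset-site") == "site-a", "cluster-a"),
    (lambda labels: labels.get("needs-gpu") == "true",      "cluster-gpu"),
    (lambda labels: True,                                    "cluster-default"),
]

def place(step_labels):
    """Return the target cluster for a workflow step via first-match rules."""
    for predicate, cluster in PLACEMENT_RULES:
        if predicate(step_labels):
            return cluster

print(place({"dataset-site": "site-a"}))  # → cluster-a
print(place({"needs-gpu": "true"}))       # → cluster-gpu
print(place({}))                          # → cluster-default
```

In the real system, the equivalent of `PLACEMENT_RULES` lives in Karmada propagation policies, so users keep authoring plain Argo Workflows while placement is decided declaratively outside the workflow definition.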
Rucio’s New Metadata Intelligence
Usability, Impact, and a New Horizon for DaFab and the Global Rucio Community
Over the past year, the DaFab project has become a catalyst for the evolution of the Rucio data management system. While initially designed to support the ATLAS experiment at CERN, today Rucio serves a far wider community of scientific collaborations with complex data needs. The DaFab initiative, centered on extracting value from massive Copernicus Earth Observation archives, has pushed Rucio into new territory, beyond file cataloguing and distributed data placement, and into the realm of rich semantic metadata and powerful filtering.
DaFab’s Data Management with DASI
Workflows processing Earth Observation (EO) data face a fundamental problem: the body of available EO data is vast and growing rapidly. Within the DaFab EU project, AI-driven workflows must process massive quantities of EO data, made available by the Copernicus programme, in an efficient and reliable manner. This raises a range of challenges: locating the relevant data, decoupling relatively fast and scalable compute tasks from slower data transfers, storing the data in a form the workflows can use, and managing the lifetime of any temporary copies required. This is where DASI (the Data Access and Storage Interface) plays a critical role: it provides the smart bridge between storage systems and compute environments. DASI’s semantically driven data management design helps build intelligent, scalable, and optimized AI workflows in the DaFab project.
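The core idea behind semantically driven data management can be sketched as follows. This is not the real DASI API; the class and key names are invented for illustration. It shows only the underlying principle: workflows archive and retrieve data by descriptive key-value pairs rather than by file paths, which decouples compute tasks from where and how the data is actually stored.

```python
class SemanticStore:
    """Toy key-value store indexed by semantic metadata, not paths."""

    def __init__(self):
        self._objects = {}

    def archive(self, data, **keys):
        # A frozenset of items makes the semantic key order-independent.
        self._objects[frozenset(keys.items())] = data

    def retrieve(self, **query):
        # Return every object whose semantic key contains all queried pairs.
        wanted = set(query.items())
        return [d for k, d in self._objects.items() if wanted <= set(k)]

store = SemanticStore()
store.archive(b"...tile bytes...", mission="sentinel-2",
              date="2024-05-01", tile="T31TCJ", product="ndvi")
store.archive(b"...tile bytes...", mission="sentinel-2",
              date="2024-05-01", tile="T30UXC", product="ndvi")

# Compute tasks ask for what the data *means*, not where it lives:
print(len(store.retrieve(mission="sentinel-2", product="ndvi")))  # → 2
```

Because requests are expressed semantically, the backing storage (and any temporary copies) can be moved, cached, or expired without touching workflow code.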
High Performance Kubernetes (HPK)
Bridging Cloud-Native Workflows and HPC for DaFab
One of the main goals of the DaFab project is to enable seamless, multi-site scientific workflows without the need to physically move data between sites. This is crucial in environments where data transfer is restricted by bandwidth, administrative domains, or security policies: DaFab orchestrates distributed workflows across multiple HPC and cloud sites, letting data stay in place while computation moves to where it is needed.