Basic Data Governance Terminology

Data Governance

(a.k.a. DG) a consortium based on a Hyperledger Fabric network, set up between multiple organisations (currently only VPF and ThPA) for the explicit purpose of:

  1. exchanging information on datasets provided by the various members, and
  2. granting permissions on these datasets to other members of the consortium.

Internal organisation

an organisation participating in the Data Governance ecosystem that is a fully participating member of the Fabric network, i.e. it runs its own peer nodes.

System organisation

an internal organisation, not corresponding to any real-world entity, which is automatically created at system start-up to support specific system functionality.

External organisation

an organisation that wants to participate in the Data Governance ecosystem, but does not want to (or cannot) be an internal organisation, i.e. it does not want to (or cannot) run its own peer nodes. Therefore, it is not a Fabric peer organisation, but a logical entity.

User types

In Data Governance, each organisation can have two types of user:

  • organisation admins

    Each organisation starts with a principal organisation admin, who can then add other administrators or simple users. Each subsequent administrator can also add and manage users in their organisation. Administrators can act as data providers (i.e. add dataset metadata) and/or data consumers (i.e. request access to datasets).

  • simple users

    The simple user’s function is to act as a data provider and/or a data consumer.

EXT org (ee-ex-tee)

system organisation created for the explicit purpose of registering external organisations under it. Two administrators are present in the EXT organisation, one for VPF (adminVPFext) and one for ThPA (adminThPAext). These administrators are responsible for creating affiliated organisations, each for its own port. Each external organisation has an administrator and can have multiple simple users. Through their users, external organisations can share their datasets and ask for permissions on the datasets of others, just like internal organisations.

Prosumer organisation

an organisation that participates in Data Governance either to make its own datasets available, or to gain access to datasets that other prosumer organisations have made available. All non-system internal organisations and all external organisations are prosumer organisations.

Data Governance starts with three principal internal organisations: two prosumer organisations representing the ports of Valencia and Thessaloniki (VPF and ThPA respectively) and one system organisation (EXT).
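The organisation types defined above can be summarised in a short Python sketch. This is an illustration only, not part of Data Governance itself: the names `OrgKind`, `Organisation`, `is_system`, `is_prosumer` and `INITIAL_ORGS` are hypothetical and chosen just to mirror the definitions on this page.

```python
from dataclasses import dataclass
from enum import Enum


class OrgKind(Enum):
    INTERNAL = "internal"  # full member of the Fabric network, runs its own peer nodes
    EXTERNAL = "external"  # logical entity registered under the EXT organisation


@dataclass(frozen=True)
class Organisation:
    """Hypothetical model of a DG organisation (illustration only)."""
    name: str
    kind: OrgKind
    is_system: bool = False

    def __post_init__(self):
        # A system organisation is, by definition, an internal organisation.
        if self.is_system and self.kind is not OrgKind.INTERNAL:
            raise ValueError("a system organisation must be internal")

    @property
    def is_prosumer(self) -> bool:
        # All non-system internal organisations and all external
        # organisations are prosumer organisations.
        return not self.is_system


# The three principal internal organisations that DG starts with.
INITIAL_ORGS = [
    Organisation("VPF", OrgKind.INTERNAL),
    Organisation("ThPA", OrgKind.INTERNAL),
    Organisation("EXT", OrgKind.INTERNAL, is_system=True),
]
```

Under this model, VPF and ThPA are prosumer organisations, EXT is not, and any external organisation registered under EXT is a prosumer organisation.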

Dataset provider organisation

the organisation from which consumers retrieve the dataset; this is the organisation of the user that added the dataset’s metadata to Data Governance.

Dataset owner organisation

the organisation that owns the dataset.

A dataset’s owner organisation can also be the provider organisation for that dataset on DG, but not necessarily. It could be that the owner is a separate organisation that is not even a member of DG.

Internal dataset

a dataset is considered internal with respect to the currently logged-in user when that user is its provider, i.e. the user that added the relevant dataset metadata to Data Governance.

External dataset

a dataset is considered external with respect to the currently logged-in user when that user is not its provider but is one of its consumers, i.e. the user has already asked for permissions on the given dataset and was granted those permissions by its provider.

In other words, a dataset is internal to its provider, but external to its consumers.
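The internal/external distinction depends only on who is looking at the dataset. The following Python sketch illustrates this under simplifying assumptions: the `Dataset` class, its `provider` and `consumers` fields, and the `dataset_view` function are hypothetical names used for illustration and do not correspond to any actual DG API.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Dataset:
    """Hypothetical dataset record (illustration only)."""
    name: str
    provider: str  # user that added the dataset metadata to DG
    consumers: Set[str] = field(default_factory=set)  # users granted permissions


def dataset_view(dataset: Dataset, current_user: str) -> Optional[str]:
    """Classify a dataset relative to the currently logged-in user."""
    if current_user == dataset.provider:
        return "internal"
    if current_user in dataset.consumers:
        return "external"
    # The user neither provides the dataset nor has been granted permissions on it.
    return None
```

For a dataset provided by `alice` with `bob` as a consumer, `dataset_view` returns `"internal"` for `alice`, `"external"` for `bob`, and `None` for any other user.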

Data governance metadata

a set of metadata describing a given dataset, which is entered in and kept solely by the Data Governance component.

Datasource metadata

a set of metadata describing a given dataset, which is generated and kept by the Semantic Interoperability component. Crucially, this metadata is also requested by Data Governance and stored in the blockchain, together with the Data Governance metadata.

Dataset permissions

the access rights that a user can have on a dataset. There are three standard permissions (read, modify and persist), but providers can also define custom access rights on their datasets.

As is evident from the above definition, we use the terms permissions and access rights interchangeably.

Organisational dataset

a dataset is called organisational with respect to the users of the dataset’s owner organisation. If an organisation owns a dataset, that dataset is an organisational dataset to all users of the organisation.

Personal dataset access

this is the basic access level where the provider of the dataset grants access to a specific DG user.

Organisational dataset access

with this access level the provider grants access to every user of an organisation.

Public dataset access

using the public access level the provider grants access to every DG user, irrespective of organisation.

It follows that a user might have access to a dataset through all three access levels, i.e. the provider could have granted the user personal access to the dataset, while simultaneously having granted organisational access to the user’s organisation and also having made the dataset publicly available. Losing one of the access levels does not affect the others, i.e. the user will continue to have the permissions granted by the other two levels.

As an example, let us consider a user with personal read permission on a dataset, modify permission due to organisational access, and also public read permission. The aggregate permission for this user on the particular dataset is modify. Now, if the provider revokes the organisational access for the user’s organisation, access to the dataset is not altogether lost, but the aggregate permission drops to read. If personal access is also revoked, the user still has public read access.
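The example above can be traced with a small Python sketch. It assumes that each access level grants a single permission and that modify outranks read, as the example implies; the ranking table `RANK` and the function `aggregate_permission` are hypothetical and only illustrate the idea. Persist and custom access rights are left out because this page does not define how they rank against the others.

```python
from typing import Dict, Optional

# Assumed ordering, taken only from the example: modify outranks read.
RANK = {"read": 1, "modify": 2}


def aggregate_permission(grants: Dict[str, str]) -> Optional[str]:
    """Return the strongest permission across all access levels.

    `grants` maps an access level ("personal", "organisational", "public")
    to the permission granted through that level.
    """
    if not grants:
        return None  # no access through any level
    return max(grants.values(), key=RANK.__getitem__)


grants = {"personal": "read", "organisational": "modify", "public": "read"}
print(aggregate_permission(grants))  # modify

del grants["organisational"]         # provider revokes organisational access
print(aggregate_permission(grants))  # read

del grants["personal"]               # provider revokes personal access
print(aggregate_permission(grants))  # read (public access remains)
```

Keeping the grants per access level, rather than storing a single merged permission, is what lets one level be revoked without affecting the others.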

Last modified May 2, 2023: Updated: "How-to" page for DG (0a2107f)