Diagnosis of the existing data management situation
Because data producers and data collection processes are numerous, and because the information system must build on the existing situation without duplicating activities and while respecting each partner's role, a diagnostic phase can help to establish a clear vision of:
- The roles and activities of the actors, including:
  - Who is supposed to do what in the field of water data administration?
  - Who produces what, specifically?
  - Who needs what data and what information?
- The characteristics of the existing datasets and information systems managed by the main actors;
- The existing regular data exchange flows between actors;
- The needs of the various actors.
The potential outputs of the diagnostic phase (metadata catalogue, dataflow diagrams, data dictionaries of existing information systems, etc.) generally help to:
- Identify the data producers likely to be involved in the process;
- Select the datasets to be collected in order to produce the expected information;
- Identify the data comparability issues that will have to be solved when combining datasets from different producers;
- Specify the global architecture of the system, and the procedures for data exchange and dissemination that organize interoperability between the various systems;
- Define the main tools for data processing and information production/dissemination;
- Agree on the rules of the system’s governance between the partners involved.
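As an illustration of how a diagnostic inventory can feed a dataflow diagram, the sketch below records who produces what and who needs what, then derives the exchange flows this implies and the needs that no producer covers. It is a minimal sketch only: the `Actor` structure, the function names, and the three example organisations and datasets are hypothetical and are not taken from any real inventory.

```python
from dataclasses import dataclass, field


@dataclass
class Actor:
    """One organisation covered by the diagnosis (hypothetical structure)."""
    name: str
    produces: set = field(default_factory=set)  # datasets the actor produces
    needs: set = field(default_factory=set)     # datasets the actor needs


def required_flows(actors):
    """Return sorted (producer, dataset, consumer) triples implied by the inventory."""
    flows = []
    for producer in actors:
        for consumer in actors:
            if producer is consumer:
                continue  # an actor does not exchange data with itself
            for dataset in producer.produces & consumer.needs:
                flows.append((producer.name, dataset, consumer.name))
    return sorted(flows)


def unmet_needs(actors):
    """Return sorted (consumer, dataset) pairs for needs that no actor produces."""
    produced = set().union(*(a.produces for a in actors))
    return sorted((a.name, d) for a in actors for d in a.needs - produced)


# Hypothetical example inventory (invented names, for illustration only).
actors = [
    Actor("Hydrological service", produces={"river discharge"}, needs={"rainfall"}),
    Actor("Meteorological service", produces={"rainfall"}),
    Actor("Basin authority",
          needs={"river discharge", "rainfall", "water abstractions"}),
]

for producer, dataset, consumer in required_flows(actors):
    print(f"{producer} -> {consumer}: {dataset}")

for consumer, dataset in unmet_needs(actors):
    print(f"No producer identified for '{dataset}' (needed by {consumer})")
```

In this invented example, the unmet need points to a dataset for which a producer still has to be identified, which is the kind of finding the diagnostic phase is meant to surface.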
Potential outputs of the diagnostic phase
Specific attention should be given to the production of metadata, which are “data about data”.
To facilitate traceability and ensure that data are not misused, the assumptions and limitations affecting the creation of data must be fully documented. Metadata allow a producer to describe a dataset fully, so that users can understand those assumptions and limitations and evaluate the dataset's suitability for their intended use.
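A metadata record of this kind could be sketched as follows. This is an assumption-laden illustration, not a prescribed schema: the field names, the `missing_fields` helper, and the example values are hypothetical, and a real catalogue would normally follow an agreed metadata standard chosen by the partners.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetMetadata:
    """Minimal 'data about data' record (hypothetical field set)."""
    title: str
    producer: str
    description: str
    assumptions: str   # assumptions made when the data were created
    limitations: str   # known limits on how the data may be used


def missing_fields(record):
    """Return the names of fields that were left empty or blank."""
    return [
        f.name
        for f in fields(record)
        if not str(getattr(record, f.name)).strip()
    ]


# Hypothetical example: the producer has not yet documented the limitations.
record = DatasetMetadata(
    title="Daily river discharge",
    producer="Hydrological service",
    description="Daily mean discharge at gauging stations",
    assumptions="Discharge derived from water level using a rating curve",
    limitations="",
)

print(missing_fields(record))
```

A completeness check of this sort can be run before a dataset is entered in the metadata catalogue, so that records lacking documented assumptions or limitations are returned to the producer.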