Geologica Belgica - Volume 23 (2020) number 3-4 - The Neogene stratigraphy of northern Belgium

A reference dataset for the Neogene lithostratigraphy in Flanders, Belgium

Katrien DE NIL
VPO, Flemish Planning Bureau for the Environment and Spatial Development, Department of Environment & Spatial Development, Koning Albert II-laan 20, 1000 Brussels, Belgium; corresponding author.
RBINS, Royal Belgian Institute of Natural Sciences, Scientific Heritage Service, collection geology, Vautierstraat 29, 1000 Brussels, Belgium;
VPO, Flemish Planning Bureau for the Environment and Spatial Development, Department of Environment & Spatial Development, Koning Albert II-laan 20, 1000 Brussels, Belgium;


Subsurface research often makes use of information from locations where subsurface investigations took place or where temporary outcrops existed. Over time, data and knowledge about these locations increase, but the earlier information is seldom compiled and the locations are seldom uniquely identified in subsequent publications. Data quality control and documentation are therefore required, including tracing the data sources to their unique reference in governmental databases. This paper describes a five-step approach for uniquely combining all relevant data into a reference dataset for the Neogene of Flanders. The dataset is made available in the online web portal for soil and subsoil in Flanders. The individual data points, as well as the reference dataset, can be consulted and re-used in an accessible format by scientists, professionals and citizens with an interest in the subsurface, and also by machines. The reference dataset approach can be extended to other subsurface data collections and is proposed as a standard practice for open subsurface data in Flanders. It increases the visibility and the quality of both the data and the research. Inclusion of a reference dataset URL in research or other portals further contributes to data and knowledge integration. Such an open data approach is pivotal for (governmental) data management institutes providing geological services that facilitate a more sustainable use and management of the subsurface.

Keywords: subsurface, archive, open data management, research portal, research data, quality control

1. Introduction

The Neogene state-of-the-art compilation by the Subcommission for Neogene and Paleogene Lithostratigraphy of Belgium formed an excellent opportunity to collect the data points to which the individual publications refer and to identify these data in the federal and regional databases. Two main data holders provide structured information concerning boreholes and outcrops of the geological subsurface of the northern region of Belgium, where Neogene deposits underlie the Quaternary sediments: (1) the regional Flanders Soil and Subsoil Database (Databank Ondergrond Vlaanderen or DOV), and (2) the federal Geological Survey of Belgium of the Royal Belgian Institute of Natural Sciences (GSB-RBINS).

DOV (2020) is the public web portal for open data concerning geology, natural resources, soil, hydrogeology, geotechnical characteristics and groundwater licences of the (sub)soil of Flanders. The Flemish Planning Bureau for the Environment and Spatial Development (VPO) of the Department of Environment & Spatial Development is responsible for the sustainable management and use of the subsurface, including the natural resources and the deep subsurface (-500 m TAW) (Departement Omgeving, 2020). VPO is also responsible for the geological data and information available on the DOV platform. In addition, VPO manages the soil data in cooperation with all relevant stakeholders in that domain and coordinates the development of DOV.

The GSB-RBINS manages an archive of, amongst others, temporary outcrops and boreholes in the Belgian subsoil (RBINS, 2020).

Geological reports and scientific publications rely on and refer to boreholes, cone penetration tests (CPTs) and outcrops, amongst other data types, often without a unique reference to one of these data collections. Often the points are referred to only by the name of the village, on the assumption that readers have a priori knowledge of the data or have no need for more detailed information. Correct re-use of the data in future research, correlations and projects then becomes a challenge. How can we do better in managing the datasets on which subsoil research in Flanders is based and in making them easily available to the public at large? What is in it for the scientists, the government, and data and science portals? Here, a five-step approach is proposed for identifying, collecting and referring to borehole and outcrop datasets for the Flanders region.

2. A federal and a regional geological data source for Flanders

The Geological Survey of Belgium (GSB) was established by a Royal Decree of 16 December 1896. Commissioned with the “study of extraction materials and water theory”, the GSB focused on subsurface data collection and management, including physical collections. In a Royal Decree of 1919, the geological mapping of Belgium was officially assigned to the GSB. To this day, the data archives of the Belgian territory can be consulted in physical form, and they are now also available in digital format.

This federal archive consists mostly of borehole data. These were collected at the end of the 19th and the beginning of the 20th century during geological mapping, and more generally after 1939, when the law made it compulsory to declare each borehole deeper than 30 m to the GSB. In most cases, lithological descriptions were delivered as well. Subsequently, a GSB geologist added a geological interpretation. Data points collected in geological projects in which the survey is involved are also added. Originally only paper copies were available, accompanied by a map showing the locations of the data points. At the end of the 20th century, all descriptions and coordinates were digitized and transferred to a text file and a database. The ASCII files are available on the collection website of the RBINS. This federal archive forms the first data source of many geological data projects, including the regional DOV.

In 1980, following the Belgian state reforms, the Flanders region became the competent authority for the environment, including groundwater, natural resources and environmental protection. The competences for the acquisition, collection, management and publication of subsoil data for the Flanders region are coherently combined and officially managed within the DOV partnership.

DOV was founded in 1996 as a collaboration within the government of Flanders. Today, it is a network organization with many subsurface-oriented stakeholders (De Keyzer et al., 2019). The three main partners, all related to the (sub)soil policy of Flanders, signed the DOV cooperation protocol in 2006, namely:

  • Department of Environment & Spatial Development, Flemish Planning Bureau for the Environment and Spatial Development (VPO), responsible for soil and subsoil policy, research, monitoring and evaluation;

  • VMM, Flanders Environment Agency, responsible for groundwater policy;

  • Department of Mobility and Public Works, Geotechnical Division, responsible for geotechnical research and measurements.

Subsurface policy and advice in Flanders are built on this vast amount of DOV data and on the expertise available through this cross-thematic network. The DOV portal has a central role in geological projects directed by the Flemish regional government, but also serves as the information platform for subsurface data of Flanders for external users (De Nil et al., 2016). During the regional geological mapping program, started in the 1990s, archives of several federal, regional and research institutes were digitized and collected in the DOV database (Vandenberghe et al., 2015). The GSB archive was the major data source. A structured database and the spatial context were key aspects of the new geological map data system (Jacobs et al., 1993). This program was the basis for the structured geological information in DOV (De Baets, 2004). To date, historical data and interpretations, as well as newly reported subsoil investigations and interpretations, are re-used in new mapping and (3D) modelling initiatives. The DOV database serves as a prime source for the 3D geological mapping program of VPO, carried out by VITO (Flemish Institute for Technological Research) (Matthijs et al., 2013; Deckers et al., 2019), and for the H3O cross-border models with the Netherlands (Vernes et al., 2018; Deckers et al., 2014), carried out by the GSB and VITO. Consequently, the collection of old and new data is the backbone of new tools, such as the Virtual Borehole (De Nil et al., 2018). Quality control of both archived and newly delivered data is a major challenge in keeping the database at a high level of quality. Since 2018, VPO has also managed a subsurface sample repository in Vilvoorde, with data management supported by DOV.

DOV is in charge of implementing the requirements of the INSPIRE Directive (Directive 2007/2/EC of the European Parliament and of the Council of 14 March 2007 establishing an Infrastructure for Spatial Information in the European Community) and will provide harmonised data from Flanders to Europe. Standardisation of the DOV database will ensure that Flanders is connected to the wider European geological context (Bartha & Kocsis, 2011) and will facilitate querying and visualisation of the data. Recently, pyDOV, a community-based open source package, has been developed to support machine-based extraction and conversion of the data. This enables the integration of DOV data in larger data processing pipelines, and supports the reproducibility and/or repeatability of research studies (Haest et al., 2018).

The DOV map viewer (Fig. 1) allows geographical and non-geographical searches by means of a location, a name or other attribute information. The catalogue of webservices supports the use of the DOV data in the users’ own GIS software. Detailed instruction guides for the different modes of access to the data can be found on the DOV website.


Figure 1. View of the activated search engine in the DOV map viewer, which is publicly available online.

Both the DOV database and the GSB archive are important data sources for subsurface projects in Flanders and Belgium. However, the two data systems are not interlinked. Combining the information of the federal and the regional data infrastructure creates added value. Coupling the information implies a quality check of the identification of the individual data points as well as of the metadata and interpretations available in the two data systems. Figure 2 illustrates the different steps of historical borehole documentation, from the original written paper in the GSB archive to a standardised object in DOV in the context of the mapping program, with added interpretations resulting from geological projects carried out after this transformation.


Figure 2. Pathway of information of an archived borehole from the GSB archive to the DOV database, with (1) the scanned written field description from the GSB archive, (2) the typed description in the GSB archive, (3) txt-format of the GSB archive, and (4) standardized DOV report of the same borehole with added interpretations as a result of geological mapping or modelling projects.

2.1. Declaration of the boreholes

The shift in competent authority for subsurface data from the federal state to the regions is reflected in different legislation concerning the declaration of boreholes. On the one hand, Royal Order no. 84 of 28 November 1939 determines that investigations of the subsoil deeper than 30 metres must be declared to the Geological Survey of Belgium. On the other hand, since 1984 the groundwater management decree has been the basis for the current groundwater policy in Flanders. Since 1999, VLAREM (Order of the Flemish Government of 1 June 1995 concerning general and sectoral provisions relating to Environmental Safety) has regulated the drilling activities linked to groundwater licences in Flanders. In 2013, VLAREL (Order of the Flemish Government of 19 November 2010, establishing the Flemish regulation on recognitions relating to the environment) was extended: all drilling companies performing mechanical drillings in the subsoil of Flanders are subject to compulsory accreditation. Since 2017, this accreditation has come with an obligation to communicate all drilling reports to DOV at least every two months. The drilling companies are responsible for the quality of the drilling reports they supply. The VMM is the competent authority.

In addition, boreholes in Flanders drilled in the context of soil remediation or archaeology have their own specific legislation.

2.2. Identifiers for geological information points

Geological observations and measurements, including descriptions and interpretations, are, amongst other data types, the foundation of geological research.

Identifying this information and using the identifiers in the database in an unambiguous way is critical. The use of Uniform Resource Identifiers (URIs) is a way to uniquely name a data source. The use of stable data identifiers across datasets and publications creates opportunities for further data and knowledge integration. When those identifiers are HTTP URIs, so-called URLs, users can discover the data easily (W3C, 2017).

In order to avoid mistakes when re-using interpretations of borehole and field data from previous work, it is important to use unique identifiers for data points.

The GSB and DOV each have their own, independent system of identifying geological information points.

2.2.1. Identifiers in the DOV regional database

DOV supports different types of names for boreholes and observations. Firstly, each object has a DOV name. This name can be freely chosen, but must be unique. It mostly stems from the project from which the borehole originates. Recent boreholes are named by the drilling companies that manage their data in the database. Secondly, an observation can also have alternative names, which refer to names given to the object in other archives or reports. The (partly) freely chosen and unique object name is associated with a unique identifier. This unique, standardized 10-digit reference has the form: ‘XXXX-XXXXXX’.
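The reference format described above lends itself to a simple automated check, for instance when cleaning an author-supplied table. The following is a minimal sketch, assuming that every ‘X’ stands for one decimal digit (as the phrase "10-digit reference" suggests). The function name and the sample value in the usage note are hypothetical and are not taken from DOV.

```python
import re

# Assumption: each "X" in 'XXXX-XXXXXX' is one decimal digit,
# i.e. four digits, a hyphen, then six digits.
DOV_REFERENCE = re.compile(r"\d{4}-\d{6}")


def is_dov_reference(value: str) -> bool:
    """Return True when value has the shape 'XXXX-XXXXXX'."""
    return DOV_REFERENCE.fullmatch(value.strip()) is not None
```

For example, `is_dov_reference("1234-567890")` accepts a made-up value of the right shape, whereas a DOV name such as ‘B/1-1101a’ is rejected, because a name is not a reference.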

Since 2016, these persistent identifiers of DOV boreholes can also be used in URLs that give access to the available information on the objects, compliant with the URI strategy in Flanders (Informatie Vlaanderen, 2017). Such a URL, including the persistent identifier, exists for borehole ‘B/1-1101a’, for example. It refers to the current state of the digital object as found online and in the database, independent of the application. By default, the URL provides the information on the object in HTML format (Fig. 3). However, other common formats, such as XML and JSON, are also supported for particular types of use and applications.


Figure 3. The URL of a database object, here borehole ‘Arendonk’ (DOV kb9d18w-B81, BGD018w0197), allows the user to consult the drilling data, descriptions, interpretations, attachments and links, amongst which the link to the digital collection file as managed by the GSB.

All historical as well as current objects, in particular boreholes, interpretations, samples and cone penetration tests, are characterised by this same type of unique identifier in the DOV database, namely “https://{}/data/{object type}/{reference}”.
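The URL pattern quoted above can be filled in programmatically once the host is known. The sketch below is illustrative only. The host part of the pattern is not given in this text, so it is a required argument here, and the function name, the object type and the values in the usage note are all hypothetical placeholders.

```python
from urllib.parse import quote


def build_object_url(base: str, object_type: str, reference: str) -> str:
    """Fill the pattern '{base}/data/{object type}/{reference}'.

    `base` is the scheme and host of the portal; it is deliberately not
    hard-coded, because the text does not state it.
    """
    if not base or not object_type or not reference:
        raise ValueError("base, object_type and reference are all required")
    return "/".join(
        [
            base.rstrip("/"),
            "data",
            quote(object_type, safe=""),  # percent-encode unsafe characters
            quote(reference, safe=""),
        ]
    )
```

A call such as `build_object_url("https://example.org", "boring", "1234-567890")` returns `https://example.org/data/boring/1234-567890`. Keeping the host as a parameter means the same helper works unchanged if the portal address changes.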

Database history of these objects is available and necessary to keep track of changes over time. During quality control or further study, or as a result of mapping or modelling projects, changes or additions can be made to the data objects or to the linked objects of the data points, for example by adding interpretations or additional sample information.

Related data points can be gathered into a collection by linking them to an object of the data type assignment (‘opdracht’). The relation between these data points can be based on the originating project, geological context, geographical distribution or any other relevant information. This assignment object also has a unique DOV-URL, including a persistent identifier. This allows easy reference to a collection of data by using only one URL.

2.2.2. Identifiers in the GSB federal archive

In the GSB, each geological information point (e.g. borehole, outcrop) or data cluster (e.g. several boreholes on the same site) has been identified since the 19th century by the number of the geological map and a serial number (e.g. 071E0250). The number of the geological map consists of a number and a letter (W or E). The number of the map refers to the same area as the former topographic maps of the National Geographical Institute at scale 1/25 000. Map 071, for example, stands for the old topographic map 22/7-8. Within this map, the east side is indicated with “E”, which gives 071E, and the west side gets the letter “W”. This map number is nowadays always four characters long. The serial number after the map number is also four digits long and is assigned at the time of data entry in the GSB archive. Standardisation of the names for the data points is important in digital data management. In exceptional cases, when several data points have the same name, the eight-character number is followed by a lower-case letter; this is never done for new data. The ASCII files in the GSB archive are available online, ranked by map sheet. This file listing also has a search function in the top right corner of the page, which allows the user to search for individual archive numbers and their corresponding files. The individual archived files can also be accessed directly through a link; always use lower case in such links. When the ASCII file is not yet available, a PDF file can be accessed instead.
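The structure of a GSB archive number can be expressed as a small parser. The following is a minimal sketch based only on the description above: three digits for the map, “E” or “W” for the side, a four-digit serial number and an optional lower-case letter. The class, the function names and the `link_key` helper are hypothetical. The helper only produces the lower-case form of the number and does not reproduce the actual link layout of the archive, which is not given in this text.

```python
import re
from typing import NamedTuple


class GsbNumber(NamedTuple):
    map_number: str  # e.g. "071"
    side: str        # "E" or "W"
    serial: str      # e.g. "0250"
    suffix: str      # optional lower-case letter, "" when absent

    @property
    def map_sheet(self) -> str:
        """Map number plus side, e.g. '071E'."""
        return f"{self.map_number}{self.side}"

    def link_key(self) -> str:
        """Lower-case form of the number, as the text advises for links."""
        return f"{self.map_sheet}{self.serial}{self.suffix}".lower()


# Case-insensitive so that numbers quoted in lower case are also accepted.
_GSB_PATTERN = re.compile(r"(\d{3})([EW])(\d{4})([a-z]?)", re.IGNORECASE)


def parse_gsb_number(text: str) -> GsbNumber:
    """Split a GSB archive number such as '071E0250' into its parts."""
    match = _GSB_PATTERN.fullmatch(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a GSB archive number: {text!r}")
    map_number, side, serial, suffix = match.groups()
    return GsbNumber(map_number, side.upper(), serial, suffix.lower())
```

With the example from the text, `parse_gsb_number("071E0250")` yields map sheet `071E`, serial number `0250` and the lower-case key `071e0250`.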

3. Identification, documentation and quality control of reference datasets

Since this Neogene-2020 update volume (Vandenberghe & Louwye, 2020) relies on a combined dataset of legacy and more recent borehole, CPT and outcrop data, it is an ideal case study for testing the reference dataset approach. As is typical for regional geological correlation studies, heritage data were used together with recently gathered data from boreholes or outcrops. This also required “upgrading” the older geological data so that they can be used in a present-day environment (Vearncombe et al., 2017). Re-use of legacy data in a present-day digital and geographical context can be seen as a data rescue effort as well (Griffin, 2015). We aimed for the same degree of completeness in the digital context for the old and the new data. Tracing back these points is necessary to avoid data mistakes, especially when referring to less recent observations.

In order to identify and document the reference objects of each paper in this Neogene-2020 update volume, we have applied a five-step approach:

  1. Authors were asked to deliver a table with the base data for their manuscript, if possible based on the GSB number or the DOV name, and otherwise on the name of the data point in another archive, or the informal popular name of a data point, of which the significance is often well known amongst geologists.

  2. Verification of the GSB archive number in the DOV database, together with the content of the data objects in the GSB archive and the DOV database.

  3. Addition of the not yet referenced objects in the federal and/or the regional data system to the DOV database, and output of a standardised table to the authors.

  4. Re-use of the standardised table in the manuscript of the authors. The data of these individual compilations are collected in the general Neogene 2020 volume assignment (‘opdracht’) and also in the assignment of the individual paper of the corresponding authors.

  5. When all data are compiled, a DOI (Digital Object Identifier) is created for the dataset.

Step 1, the collection phase by the authors, involved compiling the identification, location and documentation of the data used from publications and data sources, and proved a major challenge. As a result of Step 1, an individual list of data points, with or without GSB archive numbers and DOV names, was received from each leading author. In Step 2, the connection between the DOV data point and the corresponding number of the GSB archive was verified, based on the metadata, the location, the descriptions and the interpretation of the observations in both databases. In this exercise, we encountered several problems. The object-based reference between the two data systems is not always correct. We also encountered ‘double objects’ in the databases, meaning that the same borehole activity was reported twice in a slightly different format, which resulted in two separate data points. Poorly georeferenced boreholes and even missing data objects in the archive or the database were noticed. In scientific publications or reports, boreholes and outcrops are often referred to by means of their location, and not necessarily by a unique code or name. As more and more boreholes are drilled, the historical borehole names, often based on their location, become less unique. The expertise of people involved in the drilling process or in earlier scientific publications and reports was required in order to identify or localise some of the original boreholes and outcrops. Incorrect coordinates in earlier publications also proved problematic during this exercise.
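Part of the Step 2 verification could be supported by simple automated screening before expert review. The sketch below is not the procedure used for this dataset, which relied on metadata, descriptions, interpretations and expert knowledge. It only illustrates two of the checks mentioned above: flagging linked DOV and GSB points whose locations disagree, and flagging candidate ‘double objects’ within one system. The record fields and the distance and depth thresholds are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class Point:
    name: str     # identifier in its own data system
    x: float      # easting in metres (projected coordinates assumed)
    y: float      # northing in metres
    depth: float  # total depth in metres


def distance(a: Point, b: Point) -> float:
    """Horizontal distance between two points, in metres."""
    return hypot(a.x - b.x, a.y - b.y)


def suspicious_links(links, max_offset=50.0):
    """Return linked (dov, gsb) pairs that lie more than max_offset metres apart."""
    return [(dov, gsb) for dov, gsb in links if distance(dov, gsb) > max_offset]


def candidate_doubles(points, max_offset=5.0, max_depth_diff=0.5):
    """Return pairs in one system that may describe the same borehole twice."""
    return [
        (a, b)
        for a, b in combinations(points, 2)
        if distance(a, b) <= max_offset and abs(a.depth - b.depth) <= max_depth_diff
    ]
```

Both functions return candidates only. Deciding whether two records really describe the same borehole, or whether a link is wrong, remains a matter for the comparison of descriptions and interpretations described in the text.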

Some data points in the author lists had not yet been digitized in DOV or the GSB archive. In Step 3, all missing objects were entered into the DOV database and linked to the publication in which they are referred to, and the available descriptions and interpretations were added.

Missing objects with a specific link to the collections of the GSB (e.g. samples, archives, pictures) were added to the GSB database.

In Step 4, a standardized table with the unique references of the data points in both data systems, where applicable, was handed back to the leading authors of the individual papers during the writing process. As the writing process was then at a more advanced stage, the data system managers asked the leading authors for a quality and completeness check of the standardized table. In some cases, this again resulted in a (minor) change of the lists. In that case, Steps 1 to 4 were repeated. All data listed were linked to a DOV assignment (‘opdracht’) grouping the data of the entire Neogene 2020 volume, in addition to an assignment grouping the base data of each individual paper of this volume.

Finally, after completing the individual datasets, a DOI will be created for the complete dataset. The DOI includes the URLs to the DOV assignments and a table of the reference data points.

In addition to this five-step approach, guidelines were provided to the leading authors on how to refer to borehole data in the different manuscripts of this volume (Table 1). Authors with a compiled and quality-checked list of the base data points in their papers can re-use the DOV assignment URL to refer to these compiled base data, or they can re-use the URLs of the individual data points in the database.

Table 1. Guidelines for the authors to reference the data in their manuscripts.

Guidelines for the use of data references in the text or in the caption of figures

  • First use of a data point in the text: include the DOV and/or GSB name(s) of the objects, including the unique DOV-URL, for example: Boring Meerhout (DOV B/1-1117a, BGD 046W0389).

  • Further along in the text, the alternative name can be used.

  • Figures should at least contain or mention in the caption the reference to the DOV (including link) or GSB data.

  • The manuscript will have a compilation table in which a reader can find the data based on location or reference to DOV and GSB.

4. Neogene reference data

4.1. Reference boreholes, (temporary) outcrops and CPTs of the Neogene deposits in Flanders

As a result of this case study, the borehole, CPT and (temporary) outcrop reference data points for the Neogene deposits in Flanders (Adriaens & Vandenberghe, 2020; De Schutter & Everaert, 2020; Deckers & Louwye, 2020; Deckers et al., 2020; Dusar & Vandenberghe, 2020; Everaert et al., 2020; Goolaerts et al., 2020; Houthuys & Matthijs, 2020; Houthuys et al., 2020; Louwye & Vandenberghe, 2020; Louwye et al., 2020a; Louwye et al., 2020b; Munsterman & Deckers, 2020; Schiltz, 2020; Vandenberghe & Louwye, 2020; Vandenberghe et al., 2020; Verhaegen, 2020; Verhaegen et al., 2020; Wesselingh et al., 2020) have been identified. They are compiled in a DOV reference subset for each paper in this volume. Access to these subsets is given by the DOV URLs in Table 2. The individual data points in these subsets are also listed in Table 3 (boreholes and outcrops) and Table 4 (CPTs). Since the GSB archive numbers, and their corresponding URLs, are linked in this database, these numbers can be used in DOV to search for the original GSB archive documents for the objects in Table 3. All individual points from Table 3 and Table 4 are also coupled in the DOV collection object ‘NCS Neogene reference set’ (see Table 2 and Figure 4).

Table 2. The general and different (sub)reference datasets resulting from the individual Neogene 2020 papers, with their DOV URL.

Name of the DOV reference datasets of the Neogene

URL to the reference datasets

General dataset of this volume

NCS_Neogene reference set

(Sub)datasets of this volume

NCS_Neogene 2020_Adriaens and Vandenberghe, 2020.

NCS_Neogene 2020_De Schutter and Everaert, 2020.

NCS_Neogene 2020_Deckers and Louwye, 2020.

NCS_Neogene 2020_Deckers et al., 2020.

NCS_Neogene 2020_Dusar and Vandenberghe, 2020.

NCS_Neogene 2020_Everaert et al., 2020.

NCS_Neogene 2020_Goolaerts et al., 2020.

NCS_Neogene 2020_Houthuys and Matthijs, 2020.

NCS_Neogene 2020_Houthuys et al., 2020.

NCS_Neogene 2020_Louwye and Vandenberghe., 2020.

NCS_Neogene 2020_Louwye et al., 2020a.

NCS_Neogene 2020_Louwye et al., 2020b.

NCS_Neogene 2020_Munsterman and Deckers, 2020.

NCS_Neogene 2020_Schiltz, M., 2020.

NCS_Neogene 2020_Vandenberghe and Louwye, 2020.

NCS_Neogene 2020_Vandenberghe et al., 2020.

NCS_Neogene 2020_Verhaegen et al., 2020.

NCS_Neogene 2020_Verhaegen, J., 2020.

NCS_Neogene 2020_Wesselingh et al., 2020.

Figure 4. Distribution of the Neogene reference boreholes and outcrops (green) and CPTs (orange), all part of ‘NCS Neogene reference set’ in Flanders. The coloured zone is the Neogene occurrence map, resulting from the Geological 3D Model, version 3 (Deckers et al., 2019). All data and maps are available in DOV.

Table 3. List of the individual boreholes and (temporary) outcrops of the Neogene reference set, with reference to the different individual papers of this collection, sorted by location as mentioned in the papers. *Complete ‘’ with ‘the mapsheet/this unique code’ as explained in Section 2.2.2. **Complete ‘’ with this unique code. 1. Adriaens & Vandenberghe, 2020, 2. Deckers & Louwye, 2020, 3. Deckers et al., 2020, 4. De Schutter & Everaert, 2020, 5. Dusar & Vandenberghe, 2020, 6. Everaert et al., 2020, 7. Goolaerts et al., 2020, 8. Houthuys & Matthijs, 2020, 9. Houthuys et al., 2020, 10. Louwye & Vandenberghe, 2020, 11. Louwye et al., 2020a, 12. Louwye et al., 2020b, 13. Munsterman & Deckers, 2020, 14. Schiltz, 2020, 15. Vandenberghe & Louwye, 2020, 16. Vandenberghe et al., 2020, 17. Verhaegen, 2020, 18. Verhaegen et al., 2020, 19. Wesselingh et al., 2020. X, Y and Z coordinates are available via the DOI of the dataset.


Table 4. List of the individual CPTs with reference to the different individual papers of this collection, sorted by DOV name with 1. Vandenberghe et al., 2020, 2. Verhaegen et al., 2020, 3. Deckers & Louwye, 2020, 4. Deckers et al., 2020, 5. Schiltz, 2020.


Table 4 columns: DOV CPT name; DOV permkey.

This entire collection is now a referable subset of the complete DOV database, linking to the individual data points as well as to the scientific report(s) in which they are referred to. Table 2, Table 3 and Table 4 are also accessible via the DOI URL, or by scanning the QR code in Figure 5. This approach leads to several conclusions:

  1. All data points referred to in this Neogene volume are collected;

  2. As the volume is meant as the state-of-the-art review of the Neogene stratigraphy of Belgium, all essential data points have been reported in the volume;

  3. Of course, other Neogene data points exist, and can be searched for in DOV or the GSB archive (see Section 2).

More valuable data on the Neogene strata will become available in the future. DOV is the platform for updating this reference set. Consequently, the dataset will keep its value as the reference set for the Neogene of Belgium.


Figure 5. QR code linking to the DOI dataset, including Tables 2, 3 and 4 from this paper.

4.2. Easily accessible reference dataset

Researchers now have at their disposal a dataset of the relevant Neogene data points in Belgian geology, with well-controlled identifiers and a correspondence between the DOV and GSB data systems. All these data have already been used in the individual papers of this 2020 state-of-the-art Neogene volume. The data points are a collection of boreholes, exposures and quarries, and CPTs, and are considered to contain the most relevant information on the Neogene. Each data point can be consulted and shows all geological and stratigraphical information available for that data point. The newly created ‘NCS Neogene reference set’ offers the possibility to quickly explore these collected datasets. The data in the GSB archive can be explored through the collection website, as explained in Section 2. The data in the DOV database can be explored in several ways: through the data portal, in the user’s own GIS software using the DOV webservices, or by scripts using pyDOV (see Section 2). The papers in this multi-author volume, which are the subject of this case study, all rely on this dataset and each have their own referable subset of reference data points (Table 2). The subsets are not directly linked to each other, but are connected through the individual data points. Each data point is part of one or more individual research publications, and also of the complete Neogene reference set. Every data point can also be part of an unlimited number of other database collections, referring to different projects, publications or reports.
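The way the subsets are connected through shared data points can be made explicit with an inverted index. The following is a minimal sketch, assuming that each subset has been exported as a list of data point identifiers. The function names are hypothetical, and the subset and point names in the usage note are placeholders rather than entries from Tables 2 to 4.

```python
from collections import defaultdict


def papers_per_point(subsets):
    """Invert {subset name: data point ids} into {data point id: sorted subset names}."""
    index = defaultdict(set)
    for subset_name, point_ids in subsets.items():
        for point_id in point_ids:
            index[point_id].add(subset_name)
    return {point_id: sorted(names) for point_id, names in index.items()}


def shared_points(subsets, first, second):
    """Return the data points that two subsets have in common, sorted."""
    return sorted(set(subsets[first]) & set(subsets[second]))
```

Given `{"paper-1": ["p1", "p2"], "paper-2": ["p2", "p3"]}`, the index maps `p2` to both papers, and `shared_points` returns `["p2"]`, the single point through which the two subsets are connected.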

A DOI for this specific dataset is created to archive it in a citable and findable way. It guides users to the complete reference dataset and to the different subsets.

5. Connecting subsurface data and science

Creating reference datasets reiterates the importance of the quality of the data supporting any research project or other initiative. For large-scale databases, back-tracing non-uniquely identified data from previously published articles or reports is mostly done through large-scale, intensive inventory and digitization efforts, if financially possible at all. More often, however, this effort is made by the individual scientist during the research phase and is not documented with the aim of re-use by others. During this exercise, we had the opportunity to create the dataset simultaneously with the authors of the different manuscripts. This interaction increases the authors’ responsibility to collaborate in the quality control of the data used during the data collection and research process, and facilitates updating the documentation of these data in the large-scale databases.

40Documenting the Neogene reference observation dataset in DOV in its scientific context creates a dataset of higher quality that stands out amongst the many other (non-reference) borehole data in the database. For the government, a higher-quality database implies better policy support.

41In the Table-set (see Section 4) a researcher can easily see in which paper(s) a particular reference data point has been used. Consulting the interpretation of a particular data point in the papers will help to identify remaining research issues, as the same data point has not necessarily been given the same interpretation or led to the same conclusions in all papers. The papers in this volume have improved our understanding of the Neogene stratigraphy but have also identified important gaps in it (Vandenberghe & Louwye, 2020). Future research on these gaps will benefit from the Table-set, as the reference papers dealing with aspects of an issue are listed together with the main data used in each paper. Indeed, the Table-set facilitates the composition of a new collection of data points across the present papers to start a new detailed study of a particular issue. In addition to providing the identifiers of this newly composed research collection, it also allows the selection of key areas in which to search for additional data, at present not considered as reference data, by using the appropriate search methods (see Section 2).
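The use of the Table-set described above can be sketched in a few lines of code: inverting the paper-to-data-point listing shows which papers share a data point, and a union over selected papers yields a new research collection. The paper labels and data point identifiers below are hypothetical placeholders, not real DOV or GSB keys, and the layout is an assumed simplification of the actual Table-set.

```python
from collections import defaultdict

# Hypothetical excerpt of a table-set: paper label -> data point identifiers.
PAPER_SUBSETS = {
    "paper_A": ["DP-001", "DP-002", "DP-003"],
    "paper_B": ["DP-002", "DP-004"],
    "paper_C": ["DP-002", "DP-003", "DP-005"],
}


def index_by_data_point(paper_subsets):
    """Invert the listing: data point identifier -> set of papers using it."""
    index = defaultdict(set)
    for paper, points in paper_subsets.items():
        for point in points:
            index[point].add(paper)
    return dict(index)


def shared_data_points(index, min_papers=2):
    """Data points used in at least `min_papers` papers.

    These are candidates for comparing interpretations across papers.
    """
    return {
        point: sorted(papers)
        for point, papers in index.items()
        if len(papers) >= min_papers
    }


def compose_collection(paper_subsets, papers):
    """Union of the data points of the selected papers, as a sorted list."""
    selected = set()
    for paper in papers:
        selected.update(paper_subsets[paper])
    return sorted(selected)


if __name__ == "__main__":
    index = index_by_data_point(PAPER_SUBSETS)
    print(shared_data_points(index))
    print(compose_collection(PAPER_SUBSETS, ["paper_B", "paper_C"]))
```

Keeping the subsets connected only through the data point identifiers, as in this sketch, mirrors the way the subsets in the reference dataset are linked through the individual data points rather than to each other.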

42This methodology of creating a documented, open and referable dataset linked to a(n) (open) scientific paper is in line with the Flanders Open Data Charter (Government of Flanders, 2018) and will be the basis for creating datasets according to the FAIR (Findability, Accessibility, Interoperability and Reusability) principles (Wilkinson et al., 2016). The publication of a subsoil reference dataset as a findable and re-usable dataset is not yet common practice in Flanders, especially in a research context. Web portals are already in place to support this combined publication of datasets and papers. Increasing the interaction between the geoscientific publications on research portals, such as the FRIS portal (FRIS, 2020), and their related datasets on data portals, such as the DOV portal for subsurface data, increases the impact of both types of portals in Flanders and highlights the importance of data. This exercise should be developed into a standard practice for the publication of subsoil datasets for geoscientific publications in Flanders, combining all relevant data types and maps of the subsurface.

43The use of URLs and unique keys to refer to database objects or collections connects the scientific articles to the wider database. Conversely, integrating in the database the link to the scientific article in which the data were used connects the wide range of database users to the scientific background of these data. As a consequence, this exercise also connects users, indirectly and by technical means: the scientific community gets connected to the wide variety of users of the database. If integrating these unique keys and URLs in papers becomes a standard practice for subsurface data, readers of other papers, even those unrelated to this volume, will be able to explore other subsurface data and reference sets, leading them to quality-checked data. As such, the original data may serve multiple purposes.
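How a URL and a unique key can refer to the same database object is illustrated by the minimal sketch below, which converts between a permanent URL and its (object type, key) pair. The URL pattern `https://www.dov.vlaanderen.be/data/<type>/<key>` is an assumption made for illustration, and the key `0000-000000` is a placeholder rather than an existing data point.

```python
from typing import NamedTuple
from urllib.parse import urlsplit

# Assumed host of the permanent URLs (illustrative, not taken from this paper).
ASSUMED_HOST = "www.dov.vlaanderen.be"


class ObjectRef(NamedTuple):
    """Reference to a database object: its type and its unique key."""
    object_type: str
    key: str


def parse_permalink(url):
    """Split a URL of the assumed form /data/<type>/<key> into an ObjectRef."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    if parts.netloc != ASSUMED_HOST or len(segments) != 3 or segments[0] != "data":
        raise ValueError(f"not a recognised permanent URL: {url}")
    return ObjectRef(object_type=segments[1], key=segments[2])


def build_permalink(ref):
    """Rebuild the permanent URL from an ObjectRef."""
    return f"https://{ASSUMED_HOST}/data/{ref.object_type}/{ref.key}"


if __name__ == "__main__":
    # Placeholder key, used only to show the round trip.
    ref = parse_permalink("https://www.dov.vlaanderen.be/data/boring/0000-000000")
    print(ref, build_permalink(ref))
```

Because the key alone is enough to rebuild the URL, a paper can cite either form and a reader can still reach the same object in the database.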

44All of these efforts bring the importance of the created reference sets and their publications to a higher, European or even worldwide level and will further stimulate the Open Science movement.

6. Conclusion

45High-quality research and data are the foundation of a more sustainable and innovative subsurface management. When referring to subsurface observations and measurements, it is important to be able to refer to unique and persistent identifiers. This stimulates the correct re-use of data and increases the interaction between data publishers and data consumers. It is not yet a standard practice in Flanders for subsurface data, and it is complicated by the fact that two data management institutes may be involved. In this paper, a workable approach is illustrated that involves linking to the regional database for Flanders, with a connection to the federal archive if applicable. Identification in open governmental archives and databases is important for a consistent (re-)use of the correct subsoil observations and measurements in research projects and resulting articles, and also for the correct geological interpretation based on the location of the geological description of the sedimentary environment. The methodology was developed and tested, together with the authors, to create a reference dataset for the Neogene boreholes, CPTs and (temporary) outcrops in Flanders. Such an interaction proved highly valuable and necessary for creating reference datasets that inherently adhere to high quality standards.

46The approach of creating this thematic dataset for the Neogene in Flanders increased the completeness and the quality of the Neogene reference data in the DOV database. The quality of this reference set should be maintained in the future by keeping it up to date with state-of-the-art descriptions, interpretations, additional research data and new relevant data points. The Neogene dataset is a prime example of a connected data collection, and it sets a standard for how to include subsurface identifiers in scientific publications concerning the subsurface of Flanders. In the future, this methodology, preferably fully aligned with the FAIR principles, should be continued by adding more reference data points to the Neogene collection and more reference collections to the DOV database. Including the link to these reference sets in both research and open data portals raises awareness of the importance of an unambiguous, high-quality dataset. It further stimulates re-use and connects stakeholders from (geo)scientific research and other initiatives.

7. Acknowledgements

47We would like to thank all the authors of the papers in this volume for working through the data identification exercise with us. Without them, we could not have traced back so many of the reference data points. We also thank them for the newly added data and their documentation. Prof. Noël Vandenberghe is thanked for the discussions during the development of the methodology and this paper. Thanks as well to Dr Vera Van Lancker and Gineke van Putten for their constructive reviews and questions. We thank Martine Van de Voorde of VPO for her assistance during the editing of the data in DOV. We are grateful to the IT teams of the RBINS-GSB and DOV, who keep the data systems up and running.

8. References

48Adriaens, R. & Vandenberghe, N., 2020. Quantitative clay mineralogy as a tool for lithostratigraphy of Neogene Formations in Belgium: a reconnaissance study. Geologica Belgica, 23/3-4, this volume.

49Bartha, G. & Kocsis, S., 2011. Standardization of geographic data: the European INSPIRE Directive. European Journal of Geography, 2/2, 79–89.

50Deckers, J. & Louwye, S., 2020. The architecture of the Kattendijk Formation and the implications on the early Pliocene depositional evolution of the southern margin of the North Sea Basin. Geologica Belgica, 23/3-4, this volume.

51Deckers, J., Vernes, R.W., Dabekaussen, W., Den Dulk, M., Doornenbal, J.C., Dusar, M., Hummelman, H.J., Matthijs, J., Menkovic, A., Reindersma, R.N., Walstra, J., Westerhoff, W.E. & Witmans, N., 2014. Geologisch en hydrogeologisch 3D model van het Cenozoïcum van de Roerdalslenk in Zuidoost-Nederland en Vlaanderen (H3O – Roerdalslenk). Studie uitgevoerd in opdracht van de Afdeling Land en Bodembescherming, Ondergrond, Natuurlijke Rijkdommen van de Vlaamse Overheid, de Afdeling Operationeel Waterbeheer van de Vlaamse Milieumaatschappij, de Nederlandse Provincie Limburg, de Nederlandse Provincie Noord-Brabant, TNO-Geologische Dienst Nederland, VITO/Energyvile, in samenwerking met de Belgische Geologische Dienst. VITO-rapport 2014/ETE/R/1, 205 p., accessed 01/09/2020.

52Deckers, J., Louwye, S. & Goolaerts, S., 2020. The internal division of the Pliocene Lillo Formation: correlation between Cone Penetration Tests and lithostratigraphic type sections. Geologica Belgica, 23/3-4, this volume.

53Deckers, J., De Koninck, R., Bos, S., Broothaers, M., Dirix, K., Hambsch, L., Lagrou, D., Lanckacker, T., Matthijs, J., Rombaut, B., Van Baelen, K. & Van Haren, T., 2019. Geologisch (G3Dv3) en hydrogeologisch (H3D) 3D-lagenmodel van Vlaanderen. Studie uitgevoerd in opdracht van het Vlaams Planbureau voor Omgeving, departement Omgeving en de Vlaamse Milieumaatschappij. VITO, Mol, VITO-rapport 2018/RMA/R/1569, 286 p., accessed 15/07/2020.

54De Baets, 2004. DOV Analyse: Structuur XML importbestand. Version 02.02. AMINAL – Afdeling Water, Ministerie van de Vlaamse Gemeenschap, Brussels, 18 p.

55De Keyzer, M., Dewyngaert, N. & Schepers, D., 2019. Visie DOV 2030. Departement Omgeving, Vlaams Planbureau voor Omgeving, Brussels, 45 p., accessed 15/07/2020.

56De Nil, K., Van Damme, M. & Verhaert, G., 2016. Flanders Soil and Subsoil Database (DOV) – The web portal to the geological information of Flanders. Proceedings of the 5th International Geologica Belgica Congress, GB2016, 26–29 January 2016, University of Mons, Mons, Belgium, 280–281, accessed 15/07/2020.

57De Nil, K., De Koninck, R., Corluy, J., De Rouck, T. & Van Damme, M., 2018. Explore the subsurface of Flanders with the Virtual Borehole. Abstracts of the 6th International Geologica Belgica Meeting, 12–14 September 2018, Leuven, Belgium, accessed 15/07/2020.

58Departement Omgeving, 2020. Ondergrond en geologie, accessed 15/07/2020.

59De Schutter, P.J. & Everaert, S., 2020. A megamouth shark (Lamniformes: Megachasmidae) in the Burdigalian of Belgium. Geologica Belgica, 23/3-4, this volume.

60DOV, 2020. Flanders Soil and Subsoil Database (Databank Ondergrond Vlaanderen), accessed 15/07/2020.

61Dusar, M. & Vandenberghe, N., 2020. Upper Oligocene lithostratigraphic units and the transition to the Miocene in North Belgium. Geologica Belgica, 23/3-4, this volume.

62Everaert, S., Munsterman, D.K., De Schutter, P.J., Bosselaers, M., Van Boeckel, J., Cleemput, G. & Bor, T.J., 2020. Stratigraphy and palaeontology of the lower Miocene Kiel Sand Member (Berchem Formation) in temporary exposures in Antwerp (northern Belgium). Geologica Belgica, 23/3-4, this volume.

63FRIS, 2020. Flanders Research Information Space, accessed 15/07/2020.

64Goolaerts, S., De Ceuster, J., Mollen, F.H., Gijsen, B., Bosselaers, M., Lambert, O., Uchman, A., Van Herck, M., Adriaens, R., Houthuys, R., Louwye, S., Bruneel, Y., Elsen, J. & Hoedemakers, K., 2020. The upper Miocene Deurne Member of the Diest Formation revisited: unexpected results from the study of a large temporary outcrop near Antwerp International Airport, Belgium. Geologica Belgica, 23/3-4, this volume.

65Government of Flanders, 2018. Open Data Charter v1.0* - 20 Principes. Brussel, 4 p., accessed 15/07/2020.

66Griffin, R.E., 2015. When are old data new data? GeoResJ, 6, 92–97.

67Haest, P.J., Huybrechts, R., Van Hoey, S., Van De Wauw, J., Huysmans, M., Van Baelen, H. & Van Damme, M., 2018. PyDOV brings the data back to the future. Abstracts of the 6th Geologica Belgica Meeting, 12–14 September 2018, Leuven, Belgium, accessed 15/07/2020.

68Houthuys, R. & Matthijs, M., 2020. Reinterpretation of the Neogene sediments of the Bree Uplift, NE Belgium. Geologica Belgica, 23/3-4, this volume.

69Houthuys, R., Adriaens, R., Goolaerts, S., Laga, P., Louwye, S., Matthijs, J., Vandenberghe, N. & Verhaegen, J., 2020. The Diest Formation: a review of insights from the last decades. Geologica Belgica, 23/3-4, this volume.

70Informatie Vlaanderen, 2017. Vlaamse URI-standaard voor data. Informatie Vlaanderen, Brussels, 17 p., accessed 15/07/2020.

71Jacobs, P., De Ceukelaire, M., Sevens, E. & Verschuren, M., 1993. Philosophy and methodology of the new geological map of the Tertiary formations, Northwest Flanders, Belgium. Bulletin de la Société belge de Géologie, 102/1-2, 231–241.

72Louwye, S. & Vandenberghe, N., 2020. A reappraisal of the stratigraphy of the upper Miocene unit X in the Maaseik core, eastern Campine area (northern Belgium). Geologica Belgica, 23/3-4, this volume.

73Louwye, S., Deckers, J., Verhaegen, J., Adriaens, R. & Vandenberghe, N., 2020a. A review of the lower and middle Miocene in northern Belgium. Geologica Belgica, 23/3-4, this volume.

74Louwye, S., Deckers, J. & Vandenberghe, N., 2020b. The Pliocene Lillo, Poederlee, Merksplas and Mol Formations in northern Belgium: a synthesis. Geologica Belgica, 23/3-4, this volume.

75Matthijs, J., Lanckacker, T., De Koninck, R., Deckers, J., Lagrou, D. & Broothaers, M., 2013. Geologisch 3D lagenmodel van Vlaanderen en het Brussels Hoofdstedelijk Gewest – versie 2: G3Dv2. Studie uitgevoerd in opdracht van de Vlaamse overheid, Departement Leefmilieu, Natuur en Energie, Afdeling Land en Bodembescherming, Ondergrond, Natuurlijke Rijkdommen. VITO, Mol, VITO-rapport 2013/R/ETE/43, 21p., accessed 01/09/2020.

76Munsterman, D.K. & Deckers, J., 2020. The Oligocene/Miocene boundary in the ON-Mol-1 and Weelde boreholes along the southern margin of the North Sea Basin, Belgium. Geologica Belgica, 23/3-4, this volume.

77RBINS, 2020. Drill core archive. Royal Belgian Institute of Natural Sciences, accessed 15/07/2020.

78Schiltz, M., 2020. On the use of CPTs in stratigraphy: recent observations and some illustrative cases. Geologica Belgica, 23/3-4, this volume.

79Vandenberghe, N. & Louwye, S. (eds), 2020. The Neogene stratigraphy of northern Belgium. Geologica Belgica, 23/3-4, this volume.

80Vandenberghe, N., De Ceukelaire, M. & Welkenhuysen, K., 2015. Geologische kaarten 1:50.000 en de Databank Ondergrond Vlaanderen. In Borremans, M. (ed.), Geologie van Vlaanderen. Academia Press, Gent, 176–187.

81Vandenberghe, N., Wouters, L., Schiltz, M., Beerten, K., Berwouts, I., Vos, K., Houthuys, R., Deckers, J., Louwye, S., Laga, P., Verhaegen, J., Adriaens, R. & Dusar, M., 2020. The Kasterlee Formation and its relation with the Diest and Mol Formations in the Belgian Campine. Geologica Belgica, 23/3-4, this volume.

82Vearncombe, J., Riganti, A., Isles, D. & Bright, S., 2017. Data upcycling. Ore Geology Reviews, 89, 887–893.

83Verhaegen, J., 2020. Stratigraphic discriminatory potential of heavy mineral analysis for the Neogene sediments of Belgium. Geologica Belgica, 23/3-4, this volume.

84Verhaegen, J., Frederickx, L. & Schiltz, M., 2020. New insights into the lithostratigraphy and paleogeography of the Messinian Kasterlee Formation from the analysis of a temporary outcrop. Geologica Belgica, 23/3-4, this volume.

85Vernes, R.W., Deckers, J., Bakker, M.A.J., Bogemans, F., De Ceukelaire, M., Doornenbal, J.C., den Dulk, M., Dusar, M., Van Haren, T.F.M., Heyvaert, V.M.A., Kiden, P., Kruisselbrink, A.F., Lanckacker, T., Menkovic, A., Meyvis, B., Munsterman, D.K., Reindersma, R., ten Veen, J.H., van de Ven, T.J.M., Walstra, J. & Witmans, N., 2018. Geologisch en hydrogeologisch 3D model van het Cenozoïcum van de Belgisch-Nederlandse grensstreek van Midden-Brabant / De Kempen (H3O – De Kempen). Studie uitgevoerd door VITO, TNO-Geologische Dienst Nederland en de Belgische Geologische Dienst in opdracht van Vlaams Planbureau voor Omgeving, Vlaamse Milieumaatschappij, TNO, Geologische Dienst Nederland, Nederlandse Provincie Noord-Brabant, Brabant Water, Programmabureau KRW/DHZ Maasregio. VITO rapport 2017/RMA/R/1348, 285 p., accessed 01/09/2020.

86W3C, 2017. Data on the Web Best Practices: W3C Recommendation 31 January 2017, accessed 01/07/2018.

87Wesselingh, F.P., Busschers, F.S. & Goolaerts, S., 2020. Observations on the Pliocene sediments exposed at Antwerp International Airport (northern Belgium) constrain the stratigraphic position of the Broechem fauna. Geologica Belgica, 23/3-4, this volume.

88Wilkinson, M.D., Dumontier, M., Aalbersberg, I.J., Appleton, G., Axton, M., Baak, A. et al., 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018.

89Manuscript received 27.02.2020, accepted in revised form 17.07.2020, available online 02.12.2020.

To cite this article

Katrien DE NIL, Marleen DE CEUKELAIRE & Marleen VAN DAMME, «A reference dataset for the Neogene lithostratigraphy in Flanders, Belgium», Geologica Belgica [Online], Volume 23 (2020), number 3-4 - The Neogene stratigraphy of northern Belgium, 413-427 URL :