The Data Seal of Approval board hereby confirms that the Trusted Digital repository 4TU.Datacentrum complies with the guidelines version 2010 set by the Data Seal of Approval Board.
The afore-mentioned repository has therefore acquired the Data Seal of Approval of 2010 on January 28, 2013.
The Trusted Digital repository is allowed to place an image of the Data Seal of Approval logo corresponding to the guidelines version date on their website. This image must link to this file which is hosted on the Data Seal of Approval website.
The Data Seal of Approval Board
|Guidelines Version:||2010 | June 1, 2010|
|Guidelines Information Booklet:||DSA-booklet_2010.pdf|
|All Guidelines Documentation:||Documentation|
|Seal Acquiry Date:||Jan. 28, 2013|
|For the latest version of the awarded DSA for this repository please visit our website:||
|Previously Acquired Seals:||
|This repository is owned by:||
Research disciplines of the technical universities store the research data in a variety of ways. Accordingly, 3TU.Datacentrum develops separate approaches for so-called 'simple sets' and for large or 'complex collections' of data sets.
The data model, metadata and ingest procedures are developed in cooperation with the parties involved. 3TU.Datacentrum provides professional assistance in this process, involving the information specialists at the university library.
The data sets are supplied with all the information that is essential for sustainable data management and future use.
Before the data can be archived, a minimum set of metadata (exceeding the DataCite guideline) must either be provided by the data producers or be extracted by 3TU.Datacentrum from the data and documents that the data producers supply.
Data producers are encouraged to supply additional data description documents or links to publications (using persistent identifiers) about the data. The publications are stored in the institutional (document) repository (given that permissions are granted), while the data description documents are archived in the data repository.
3TU.Datacentrum secures the long-term storage of and permanent access to (and, with that, the reusability of) the data sets, the metadata and the links to external content.
When defining metadata fields, 3TU.Datacentrum follows, as far as possible, the qualified Dublin Core specifications (see http://dublincore.org/documents/dcmi-terms/) and the DataCite metadata schema (see http://schema.datacite.org/).
Mandatory fields are: Title, Creator, Date published, Publisher, Description.
Optional fields are: Contributor, Date created, Subject, Coverage temporal, Coverage spatial, Identifier, Language, URL to publication.
The number of mandatory fields is kept as low as possible in order to make this service user-friendly for data depositors.
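As an illustration only, the field lists above could be checked with a short Python sketch like the one below. The function names and the dictionary-based record layout are assumptions made for this example; they do not describe the upload form's actual implementation.

```python
# Field names are taken from the lists above; everything else is hypothetical.
MANDATORY_FIELDS = ("Title", "Creator", "Date published", "Publisher", "Description")
OPTIONAL_FIELDS = (
    "Contributor", "Date created", "Subject", "Coverage temporal",
    "Coverage spatial", "Identifier", "Language", "URL to publication",
)


def missing_mandatory(record):
    """Return the mandatory fields that are absent or blank in the record."""
    return [f for f in MANDATORY_FIELDS if not str(record.get(f, "")).strip()]


def unknown_fields(record):
    """Return field names that are neither mandatory nor optional."""
    known = set(MANDATORY_FIELDS) | set(OPTIONAL_FIELDS)
    return sorted(name for name in record if name not in known)
```

A record would be considered ready for archiving, in this sketch, when `missing_mandatory(record)` returns an empty list.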
In some cases 3TU.Datacentrum also stores the research method and/or research tool related to the data set. In such cases, adequate metadata belonging to the method or tool has to be provided.
All simple data sets and relevant objects in special collections are assigned digital object identifiers (DOIs). Metadata for these objects are published in the DataCite Metadata Store and are harvestable through OAI-PMH (amongst others by the national portal for research information, NARCIS).
For all objects in 3TU.Datacentrum, resource maps compliant with linked data standards are available to users (humans and machines).
Also see guideline 3.
To guarantee that the data can also be used in the future, it is important that the data are archived in preferred formats. 3TU.Datacentrum has compiled a list of preferred or accepted formats in which it has the highest confidence with regard to durability. For that reason, the large majority of datasets are formatted as NetCDF.
Brief instructions for depositing data are published on the website: http://datacentrum.3tu.nl/en/store-data/upload-form/
Depositors are requested to deliver their data in the formats agreed with 3TU.Datacentrum.
A list of all accepted data formats can be found at: http://data.3tu.nl/repository/resource:repository/object/search?q= (view 'data format' in the list of facets)
In some cases 3TU.Datacentrum is willing to accept research data in other formats, provided they are convertible to open and available file formats.
During the ingest process we follow a standard (manual) routine in order to check the validity and quality of these formats.
Having the list of accepted data formats included in the instructions for depositing would be better.
When depositing data, depositors are required to provide adequate metadata by using the upload form. Prior to depositing data the depositor has to log in.
If the data set is larger than 2 GB or when there are multiple related sets or sets in NetCDF, the data and metadata are ingested by the data librarian according to predefined procedures for every special collection (e.g. metadata in xml files or inside NetCDF data files are automatically extracted and copied to the index and relevant metadata fields).
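The routing rule described above can be sketched in Python as follows. This is a hypothetical illustration: the `Deposit` class, the route labels and the interpretation of 2 GB as 2 × 10^9 bytes are assumptions, not part of the documented procedure.

```python
from dataclasses import dataclass

# Assumption: "2 GB" is read as 2 * 10**9 bytes; the source does not say.
SIZE_LIMIT_BYTES = 2 * 10**9


@dataclass
class Deposit:
    """Hypothetical description of an incoming deposit."""
    size_bytes: int
    related_set_count: int = 1
    is_netcdf: bool = False


def ingest_route(deposit):
    """Return which ingest path applies: the upload form or the data librarian."""
    needs_librarian = (
        deposit.size_bytes > SIZE_LIMIT_BYTES
        or deposit.related_set_count > 1
        or deposit.is_netcdf
    )
    return "data_librarian" if needs_librarian else "upload_form"
```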
The data model of a special collection determines the granularity of the metadata.
Metadata are expressed in RDF using OAI-ORE (http://www.openarchives.org/ore/).
Namespaces used, apart from the obvious ones for rdf, rdfs, owl and oai-ore:
• http://xmlns.com/foaf/0.1/ for personal names.
• http://www.w3.org/2003/01/geo/wgs84_pos# for point geolocation. Where possible, locations are provided with a sameAs link to their geonames ( http://www.geonames.org/ ) counterpart.
• info:eu-repo/semantics/ for resource types
• http://www.library.tudelft.nl/ns/rdf for specifics that are not covered in the well-known general vocabularies. This namespace is described in an OWL ontology http://www.library.tudelft.nl/ns/repo.owl. Where appropriate, properties and classes are described as specializations of more general properties/classes from well-known vocabularies like the ones represented by the namespaces above and http://openprovenance.org/ (the Open Provenance Model).
Adherence to these standards is 100%: all RDF is validated at ingest.
All metadata provided are checked by the data librarian to ensure that no errors are made. When there are questions left or additional metadata have to be provided, the data librarian always asks the data producer for input.
Also see guideline 1.
3TU.Datacentrum is a collaboration of the libraries of Delft University of Technology, Eindhoven University of Technology and University of Twente, the three universities of technology that together form the 3TU.Federation.
3TU.Datacentrum is to become the foremost facility for the permanent storage and access of technical-scientific research data for the Netherlands. The focus is on (national and international) programmes and projects in which Dutch research groups are involved.
3TU.Datacentrum promotes this mission by attending (inter)national conferences, working groups, training courses etc. We regularly give presentations at symposia such as IATUL and EGU. 3TU.Datacentrum organises its own symposia to raise awareness of good data management and works closely with researchers at all three Dutch universities of technology. The promotion of the data repository, the building of name recognition and the education about data management naturally spread beyond the boundaries of the 3TU.Federation. For this purpose regular newsletters are published on the website of the data repository and distributed through social media, among other channels. 3TU.Datacentrum cooperates with publishers in common efforts to promote data archiving and reuse. 3TU.Datacentrum has developed 'Data Intelligence 4 Librarians', a training course that prepares librarians to become 'data librarians'.
Our mission statement is published at:
As of January 2012, 3TU.Datacentrum is a service/product of Delft University of Technology, Eindhoven University of Technology and the University of Twente.
Being part of three technical universities, 3TU.Datacentrum is not a legal entity in its own right.
3TU.Datacentrum distinguishes two types of contracts, both published on the website:
1. License agreement, a standard contract: http://datacentrum.3tu.nl/fileadmin/editor_upload/pdf/License_agreement.pdf
In this License agreement, 3TU.Datacentrum shall ensure, to the best of its ability and resources, that the deposited dataset is archived in a sustainable manner and remains legible and accessible.
2. User agreement - Conditions of use: http://datacentrum.3tu.nl/fileadmin/editor_upload/pdf/ConditionsofUse.pdf
In the Conditions of use a specific section on personal data protection is included:
Datasets that contain personal information as referred to in the Personal Data Protection Act (Wet Bescherming Persoonsgegevens) may be used only for historical, statistical or scientific research. Persons who use datasets containing personal data are required to comply with the Code of Practice for the use of personal data in scientific and scholarly research (Gedragscode voor gebruik van persoonsgegevens in wetenschappelijk onderzoek) published by the VSNU (Association of Universities in the Netherlands). The user undertakes to maintain confidentiality of all personal data that he/she processes.
The three universities, as partners in this product, provide an adequate pool of expertise on issues such as copyright and intellectual property.
Data storage of the 3TU.Datacentrum is managed by the IT department of the Delft University of Technology according to their procedures.
The stored research data are backed up (and stored) on hard disks (RAID6) and synchronized (one way) daily. Once a month a backup is made on disks at another location and retained for two months.
To support restore procedures, the root filesystems are backed up incrementally on a daily basis and full backups are made once a week. These backups are saved on tapes and kept for three months at another location. A restore can be carried out upon request.
Security updates and patches are installed on a regular basis once they have been approved.
These preservation procedures are outsourced to the ICT department of Delft University of Technology and laid down in a service level agreement. Development of additional consistency checks is ongoing.
Most of the datasets in 3TU.Datacentrum are formatted in NetCDF, which minimizes the probability of having to convert large volumes of data in the near future. Open file formats are used as much as possible. 3TU.Datacentrum has the capability to convert, when needed, from other formats to NetCDF or to convert NetCDF data to XML using libraries in use by the communities, and it has the expertise required to perform these actions.
Small-volume conversions made necessary by the obsolescence of file formats will be handled.
Currently, a small number of the objects archived in the repository are in the following formats:
- text: txt, xml, pdf
- images: png, jpeg, gif
- video: mpeg
For textual resources, XML formats are used whenever possible, to make future interpretation of the files possible even if the tool that was used to create them no longer exists.
For converting images and video we call on archives that specialise in these kinds of resources.
It would be good if a concrete agreement with an archive specialising in audiovisual material could be made regarding accepted formats and future conversions.
The entire archival process from ingest through to access is predefined. Some ad hoc procedures are needed to deal with specific special cases, including non-standard file formats.
The ingest procedures in use by data producers and staff are well documented. Other processes related to long-term preservation (data conversions, version control, etc.) still need to be described in detail and are currently in draft form. More information on the ingest procedure: http://datacentrum.3tu.nl/en/store-data/upload-form/
The ingest procedure and workflow for library staff is available as a document for internal use.
3TU.Datacentrum encourages skills development for all staff and has developed a training course that supports this: http://dataintelligence.3tu.nl. TU Delft Library includes staff with adequate IT skills for all technical issues.
Also see guideline 7.
The data producer, i.e. the depositor, will always remain the proprietor. 3TU.Datacentrum receives a copy of the data, of which it must take good care according to the terms of the license contract and the terms and conditions for use.
3TU.Datacentrum also makes copies, for example for backup purposes, and looks after them well.
In case of an emergency we are able to build up an entirely new database from all the files we have backed up and stored safely at other locations.
With the aid of an online archiving system, 3TU.Datacentrum gives researchers across the whole world the possibility to search for and access the deposited files. In this way 3TU.Datacentrum assumes responsibility for access to research data.
Research data are currently available in formats suggested, required and frequently used by the data consumers or in formats in which 3TU.Datacentrum has the highest confidence with regard to durability.
In case of bulk ingest, 3TU.Datacentrum is willing to convert the deposited data into a preferred format (e.g. CSV to NetCDF). This occurs occasionally.
The Data browser (http://data.3tu.nl/repository/) allows access to all data collections available.
All visible metadata and some metadata extracted from data files are indexed and searchable. Queries can contain special operators, field names, wildcards etc., and users can refine the results using facets.
All objects maintained by the 3TU.Datacentrum are assigned a locally unique identifier. As TU Delft Library is a registering agency of DataCite, a DOI will be assigned to each dataset.
All metadata can be harvested via the OAI-PMH protocol.
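As a rough sketch, a harvester could build OAI-PMH `ListRecords` request URLs as shown below. The base URL in the usage note is a placeholder, since the repository's actual OAI-PMH endpoint is not given here, and the function name is hypothetical. The `oai_dc` metadata prefix is the Dublin Core format that the OAI-PMH specification requires every repository to support.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    Per the OAI-PMH specification, resumptionToken is an exclusive argument:
    when it is present, no other argument besides the verb may be sent.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"
```

For example, `list_records_url("https://repository.example.org/oai")` builds the first request of a harvest; subsequent requests pass the `resumptionToken` returned in the previous response.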
See also guideline 1.
To ensure the integrity of the data sets, an MD5 checksum is computed for every deposited file, which allows us to check the files for defects in later years.
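A minimal Python sketch of such a fixity check is given below. It illustrates the general technique using the standard `hashlib` module; the function names are hypothetical and it is not the repository's actual tooling.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so that large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, recorded_md5):
    """Compare the current checksum with the one recorded at deposit time."""
    return md5_of_file(path) == recorded_md5.strip().lower()
```

Reading in chunks matters here because deposited files can be large (the text mentions sets above 2 GB).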
Once deposited, files in data sets are never changed and only minor changes to the metadata are allowed, for example: correction of spelling, minor changes in documentation, additional documentation added, etc. Changes to the data themselves will be issued as a new version of the dataset and will obtain a new persistent identifier. These changes are only made in close collaboration with the producer of the dataset.
All changes are logged in the Fedora Audit trail.
3TU.Datacentrum does not allow depositors to implement changes themselves once their data is deposited.
- We do not change data, except to add metadata if required. Data producers sign over the material to us and so far trust that we are looking after their material properly. See also Guideline 11.
- If applicable, we create collection-level objects that provide a context for the data within the collection, e.g. http://data.3tu.nl/repository/collection:cabauw
Audit trails for each digital object are kept and maintained automatically within the Fedora system.
- The repository maintains links to other relevant materials (e.g. article, thesis, documentation, data elsewhere) and to metadata of measuring instruments etc. whenever applicable.
We use RDF to create semantic links between digital objects or to external resources.
- The repository doesn't compare the essential properties of different versions of the same file.
- The unique identity of a depositor is ensured by the required login using NetID or any OpenID for identification.
3TU.Datacentrum explicitly follows the broad guidance given in the OAIS reference model across the whole of the archival process. Moreover, the archival system of 3TU.Datacentrum, Fedora, has incorporated the OAIS model.
The only exception to the compliance with the OAIS model refers to preservation planning. More detailed preservation plans are currently being developed.
The TU Delft IT center is currently considering taking up (Big) Data Archiving as a strategic topic. Their decision will determine the plans for infrastructure development.
All data stored in 3TU.Datacentrum are freely accessible, thus following the principles of Open Access.
At this moment we are working on access regulations for datasets deposited with restricted or other access.
Data consumers are required to register and log in before downloading data, except for special collections where direct access by tools is offered.
At registration users have to accept the conditions of use:
The repository currently does not have the tools to monitor how the data are being re-used. If misuse of data is detected, the repository will examine it on a case-by-case basis and take the necessary measures.
All metadata are provided under CC0 License.
The general terms and conditions incorporate the principles of “Open Access” and The Netherlands Code of Conduct for Scientific Practice, see: http://www.vsnu.nl/Media-item-1/Code-of-conduct-for-scientific-practice-2004.htm.
A user has to agree with the 'Code of use' in order to download the data; however, 3TU.Datacentrum does not carry out any structured control of compliance. For users of data sets with personal data, 3TU.Datacentrum has an obligatory procedure according to the national Code of Conduct for Use of Personal Data in Scientific Research (VSNU).
The general terms and conditions incorporate the terms and conditions of the license agreement.
In the metadata of individual datasets additional licenses can be specified (e.g. GNU, Creative Commons, etc.).
The general terms and conditions of use refer to the following:
See also Guideline 14.