Data Seal of Approval

Guidelines version 2017-2019
November 10, 2016

Introduction

Background & General Guidance

The Core Trustworthy Data Repositories Requirements are intended to reflect the characteristics of trustworthy repositories. As such, all Requirements are mandatory and are equally weighted, standalone items. Although some overlap is unavoidable, duplication of evidence sought among Requirements has been kept to a minimum where possible. The choices contained in checklists (e.g., repository type and curation level) are not considered to be comprehensive, and additional space is provided in all cases for the applicant to add ‘other’ (missing) options. These additions and any comments given may then be used to refine such lists in the future.

Each Requirement in the Catalogue is accompanied by guidance text to assist applicants in providing sufficient evidence that their repositories meet the Requirement, outlining the types of information that a reviewer will expect in order to perform an objective assessment. Furthermore, the applicant must indicate a compliance level for each of the Requirements:

0 – Not applicable
1 – The repository has not considered this yet
2 – The repository has a theoretical concept
3 – The repository is in the implementation phase
4 – The guideline has been fully implemented in the repository

Compliance levels are a useful part of the self-assessment process, but all applicants will be judged against statements supported by appropriate evidence, not against self-assessed compliance levels. In this regard, if the applicant believes a Requirement is not applicable, the reason for this must be documented in detail. Note also that compliance levels 1 and 2 can be valid for internal self-assessments, while certification may be granted if some guidelines are considered to be at level 3 (in the implementation phase), since the Requirements include an assumption of a repository’s continuous improvement.

Responses must be in English. Although attempts will be made to match reviewers to applicants in terms of language and discipline, this is not always possible. If evidence is in another language, an English summary must be provided in the self-assessment.

Because core certification does not involve a site visit, the Requirements should be supported by links to public evidence. Nevertheless, it is understood that for reasons such as security, it may not always be possible to include all information on an organization’s website, and provisions are made within the certification process for repositories that want sensitive parts of their evidence to remain confidential.

Repositories are required to be reassessed every three years. It is recognized that while basic systems and capabilities evolve continuously according to technology and user needs, they may not undergo major changes in this timeframe. However, the Trusted Digital Repository ISO standard (ISO 16363) has a five-year review cycle, and a shorter period is considered necessary for a core trust standard to allow for possible modifications and corrections. Hence, an organization with well-managed records and business processes should reasonably expect to be able to submit an application with only minimal revisions after three years, unless the Requirements themselves have been updated within the intervening period.

____________________________________________________________________

Completing the Self-Assessment

Topics for discussion and inclusion are suggested in supporting text below each requirement, but they are neither exhaustive nor prescriptive.

Abbreviations should be expanded the first time they are used.

The following fields appear below each requirement:

____________________________________________________________________

Self-Assessment Statement:

Your self-assessment statement should directly address the requirement and its supporting text. Your statement should indicate which documentation (linked or draft) provides support. Please mention why it supports your evidence statement and mention which section is applicable, especially for long documents or those with a broad scope.

If part of a requirement is ‘Outsourced’ to a third party (where applicable), identify the partner and provide evidence both for the parts of the process you are responsible for and for those the Outsource Partner is responsible for.

____________________________________________________________________

Linked Documentation and/or documentation deadline (English summary if applicable):

The self-assessment statement for each requirement must be supported by a link to relevant publicly available documentation (preferably in English). If the documentation is not in English, a short summary in English should be provided. If documentation is in draft and/or not yet publicly available, a deadline should be provided, which would be monitored in subsequent applications for the DSA. Self-assessed compliance levels would be expected to be lower if supporting evidence is only available in draft. These public documents are considered critical to the transparency of the DSA process.

Please add hyperlinks to public documentation used to support your self-assessment.

Each document here should have been mentioned in the self-assessment statement above.

____________________________________________________________________

Self-Assessed Compliance Level:

For each DSA requirement there is a minimum level of compliance reflecting how mature repository practices should be to receive the Seal. As best practices emerge, the DSA Board will re-evaluate the minimum compliance level.

____________________________________________________________________

Peer Reviewers’ Comments:

This space will be used by the reviewers to add comments or ask for further clarification.

____________________________________________________________________

Reference for Peer Reviewers

____________________________________________________________________

Liability

By starting the application process for the DSA you agree to the following terms:

All parties concerned, including the DSA Board and designated reviewer, will treat self-assessment evidence and compliance levels as sensitive until the DSA is awarded; after this time, the final version of evidence, self-assessment levels, and reviewer comments will be made publicly available via the DSA Tool.

The DSA Board will handle all data provided with the utmost care but accepts no liability for any damage or losses resulting from the use of these data.

 

Guidelines

Guideline                               Minimum requirement
0 Context                               N/A
1 Mission/Scope                         0
2 Licenses                              0
3 Continuity of access                  0
4 Confidentiality/Ethics                0
5 Organizational infrastructure         0
6 Expert guidance                       0
7 Data integrity and authenticity       0
8 Appraisal                             0
9 Documented storage procedures         0
10 Preservation plan                    0
11 Data quality                         0
12 Workflows                            0
13 Data discovery and identification    0
14 Data reuse                           0
15 Technical infrastructure             0
16 Security                             0
17 Comments/feedback                    0

 

Assessment

Statement of Compliance   Means                                                                       Comments and/or URLs
0 N/A                     Not Applicable.                                                             Provide an explanation.
1 No                      We have not considered this yet.                                            Provide an explanation.
2 Theoretical             We have a theoretical concept.                                              Provide a URL for the initiation document.
3 In progress             We are in the implementation phase.                                         Provide a URL for the supporting document.
4 Implemented             This guideline has been fully implemented for the needs of our repository.  Provide a URL for the supporting document.
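
The table above pairs each statement of compliance with the kind of comment or URL expected. A minimal Python sketch of that pairing follows. The names `ComplianceLevel`, `Response`, and `missing_evidence` are hypothetical and are not part of the DSA Tool, and the check only mirrors the table; it does not represent how reviewers actually judge an application.

```python
from dataclasses import dataclass
from enum import IntEnum


class ComplianceLevel(IntEnum):
    """The five statements of compliance listed in the Assessment table."""
    NOT_APPLICABLE = 0
    NOT_CONSIDERED = 1
    THEORETICAL = 2
    IN_PROGRESS = 3
    IMPLEMENTED = 4


# Per the table: levels 0 and 1 call for an explanation,
# levels 2 to 4 call for a URL to a document.
NEEDS_URL = {
    ComplianceLevel.THEORETICAL,
    ComplianceLevel.IN_PROGRESS,
    ComplianceLevel.IMPLEMENTED,
}


@dataclass
class Response:
    """One applicant response (hypothetical structure for illustration)."""
    guideline: int
    level: ComplianceLevel
    comment: str = ""
    urls: tuple = ()


def missing_evidence(response):
    """Return a list of problems with a response; empty if none are found."""
    problems = []
    if response.level in NEEDS_URL:
        if not response.urls:
            problems.append("a URL for the supporting document is expected")
    elif not response.comment.strip():
        problems.append("an explanation is expected")
    return problems
```

For instance, `missing_evidence(Response(guideline=3, level=ComplianceLevel.IN_PROGRESS))` reports that a URL is expected, because no link was supplied for a guideline marked as being in the implementation phase.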

0. Context

Applicant manual

Please provide context for your repository.

– Repository Type. Select all relevant types from:

Comments

– Brief Description of the Repository’s Designated Community

– Level of Curation Performed. Select all relevant types from:

A.    Content distributed as deposited
B.    Basic curation – e.g., brief checking, addition of basic metadata or documentation
C.    Enhanced curation – e.g., conversion to new formats, enhancement of documentation
D.    Data-level curation – as in C above, but with additional editing of deposited data for accuracy

Comments

– Outsource Partners. If applicable, please list them.

Other Relevant Information

Guidance:

To assess a repository, reviewers need some information about the repository to set it in context. Please select from among the options and provide details for the items that appear in the Context requirement.

(1) Repository Type. This item will help reviewers understand what function your repository performs. Choose the best match for your repository type (select all that apply). If none of the categories is appropriate, feel free to provide another descriptive type. You may also provide further details to help the reviewer understand your repository type.

(2) Repository's Designated Community. This item will be useful in assessing how the repository interacts and communicates with its target community. Please make sure that the response is specific—for example, ‘quantitative social science researchers and instructors’.

(3) Level of Curation. This item is intended to elicit whether the repository distributes its content to data consumers without any changes, or whether the repository adds value by enhancing the content in some way. All levels of curation assume initial deposits are retained unchanged and that edits are only made on copies of those originals. Annotations/edits must fall within the terms of the licence agreed with the data producer and be clearly within the skillset of those undertaking the curation. Thus, the repository will be expected to demonstrate that any such annotations/edits are undertaken and documented by appropriate experts and that the integrity of all original copies is maintained. Knowing this will help reviewers in assessing other certification requirements. Further details can be added that would help to understand the levels of curation you undertake.

(4) Outsource Partners. Please provide a list of Outsource Partners that your organization works with, describing the nature of the relationship (organizational, contractual, etc.), and whether the Partner has undertaken any Trusted Digital Repository assessment. Such relationships may include, but are not limited to: any services provided by an institution you are part of, storage provided by others as part of multicopy redundancy, or membership in organizations that may undertake stewardship of your data collection when a business continuity issue arises. Moreover, please list the certification requirements for which the Partner provides all, or part of, the relevant functionality/service, including any contracts or Service Level Agreements in place. Because outsourcing will almost always be partial, you will still need to provide appropriate evidence for certification requirements that are not outsourced and for the parts of the data lifecycle that you control. Qualifications/certifications—including, but not limited to, the DSA or WDS certifications—are preferred for outsource partners. However, it is not a necessity for them to be certified. We understand that this can be a complex area to define and describe, but such details are essential to ensure a comprehensive review process.

(5) Other Relevant Information. The repository may wish to add extra contextual information that is not covered in the Requirements but that may be helpful to the reviewers in making their assessment. For example, you might describe:

 

Reviewer manual

NA

1. Mission/Scope

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository has an explicit mission to provide access to and preserve data in its domain.

Guidance:

Repositories take responsibility for the stewardship of digital objects and for ensuring that materials are held in the appropriate environment for appropriate periods of time. Depositors and users must be clear that preservation of, and continued access to, the data is an explicit role of the repository.

For this Requirement, please describe:

Reviewer manual

NA

2. Licenses

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository maintains all applicable licenses covering data access and use and monitors compliance.

Guidance:

Repositories must maintain all applicable licenses covering data access and use, communicate about them with users, and monitor compliance. This Requirement relates to the access regulations and applicable licenses set by the data repository itself, as well as any codes of conduct that are generally accepted in the relevant sector for the exchange and proper use of knowledge and information. Reviewers will be seeking evidence that the repository has sufficient controls in place according to the access criteria of their data holdings, as well as evidence that any relevant licences or processes are well managed.

For this Requirement, please describe:

Note that if all data holdings are completely public and without conditions imposed on users (such as attribution requirements or agreement to make secondary analysis openly available), then this can simply be stated.

This Requirement must be read in conjunction with R4 (Confidentiality/Ethics) to the extent that ethical and privacy provisions impact on the licenses. Assurance that deposit licences provide sufficient rights for the repository to maintain, preserve, and offer access to data is covered under R10 (Preservation Plan).

Reviewer manual

NA

3. Continuity of access

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository has a continuity plan to ensure ongoing access to and preservation of its holdings.

Guidance:

This Requirement covers the measures in place to ensure access to, and availability of, data holdings, both currently and in the future. Reviewers are seeking evidence that preparations are in place to address the risks inherent in changing circumstances.

For this Requirement, please describe:

Evidence for this Requirement should relate more to governance than to the technical information that is needed in R10 (Preservation plan) and R14 (Data reuse), and should cover the situation in which R1 (Mission/Scope) changes. This Requirement contrasts with R15 (Technical infrastructure) and R16 (Security) in that it covers full business continuity of the preservation and access functions.

Reviewer manual

NA

4. Confidentiality/Ethics

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository ensures, to the extent possible, that data are created, curated, accessed, and used in compliance with disciplinary and ethical norms.

Guidance:

Adherence to ethical norms is critical to responsible science. Disclosure risk—for example, the risk that an individual who participated in a survey can be identified or that the precise location of an endangered species can be pinpointed—is a concern that many repositories must address. Evidence sought is concerned with not only having good practices for data with disclosure risks, but also the necessity to maintain the trust of those agreeing to have personal/sensitive data stored in the repository.

For this Requirement, responses should include evidence related to the following questions:

Evidence for this Requirement should be in alignment with provisions for the procedures stated in R12 (Workflows) and for any licenses in R2 (Licenses).

Reviewer manual

NA

5. Organizational infrastructure

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository has adequate funding and sufficient numbers of qualified staff managed through a clear system of governance to effectively carry out the mission.

Guidance:

Repositories need funding to carry out their responsibilities, along with a competent staff who have expertise in data archiving. However, it is also understood that continuity of funding is seldom guaranteed, and this must be balanced with the need for stability.

For this Requirement, responses should include evidence related to the following:

Full descriptions of the tasks performed by the repository—and the skills necessary to perform them—may be provided, if available. Such descriptions are not mandatory, however, as this level of detail is beyond the scope of core certification.

Reviewer manual

NA

6. Expert guidance

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository adopts mechanism(s) to secure ongoing expert guidance and feedback (either in-house, or external, including scientific guidance, if relevant).

Guidance:

An effective repository strives to accommodate evolutions in data types, data volumes, and data rates, as well as to adopt the most effective new technologies in order to remain valuable to its Designated Community. Given the rapid pace of change in the research data environment, it is therefore advisable for a repository to secure the advice and feedback of expert users on a regular basis to ensure its continued relevance and improvement.

For this Requirement, responses should include evidence related to the following questions:

This Requirement seeks to confirm that the repository has access to objective expert advice beyond that provided by skilled staff mentioned in R5 (Organizational infrastructure).

Reviewer manual

NA

7. Data integrity and authenticity

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository guarantees the integrity and authenticity of the data.

Guidance:

The repository should provide evidence to show that it operates a data and metadata management system suitable for ensuring integrity and authenticity during the processes of ingest, archival storage, and data access.

Integrity ensures that changes to data and metadata are documented and can be traced to the rationale and originator of the change.

Authenticity covers the degree of reliability of the original deposited data and its provenance, including the relationship between the original data and that disseminated, and whether or not existing relationships between datasets and/or metadata are maintained.

For this Requirement, responses on data integrity should include evidence related to the following:

Evidence of authenticity management should relate to the following questions:

This Requirement covers the entire data lifecycle within the repository, and thus has relationships with workflow steps included in other requirements—for example, R8 (Appraisal) for ingest, R9 (Documented storage procedures) and R10 (Preservation plan) for archival storage, and R12–R14 (Workflows, Data discovery and identification, and Data reuse) for dissemination. However, maintaining data integrity and authenticity can also be considered a mindset, and the responsibility of everyone within the repository.
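
One common way to support the kind of integrity evidence discussed in this Guidance is to record a checksum for each file at ingest and recompute it later to confirm the file has not changed. The Python sketch below assumes SHA-256 and a locally readable file; the function names are illustrative only, and the Requirement itself does not prescribe any particular algorithm or tool.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum):
    """Compare the current checksum with the one recorded at ingest."""
    return sha256_of(path) == recorded_checksum.strip().lower()
```

A repository using such a check would still need to document when checks run, where the recorded checksums are kept, and what happens when a mismatch is found.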

Reviewer manual

NA

8. Appraisal

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository accepts data and metadata based on defined criteria to ensure relevance and understandability for data users.

Guidance:

The appraisal function is critical in determining whether data meet all criteria for inclusion in the collection and in establishing appropriate management for their preservation. Care must be taken to ensure that the data are relevant and understandable to the Designated Community served by the repository.

For this Requirement, responses should include evidence related to the following questions:

This Requirement addresses quality assurance from the viewpoint of the interaction between the depositor of the data and metadata and the repository. It contrasts with R11 (Data quality), which addresses metadata and data quality from the viewpoint of the Designated Community.

Reviewer manual

NA

9. Documented storage procedures

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository applies documented processes and procedures in managing archival storage of the data.

Guidance:

Repositories need to store data and metadata from the point of deposit, through the ingest process, to the point of access. Repositories with a preservation remit must also offer ‘archival storage’ in OAIS terms.

For this Requirement, responses should include evidence related to the following questions:

This Requirement deals with high-level arrangements in respect of continuity. Please refer also to R15 (Technical infrastructure) and R16 (Security) for details on specific arrangements for backup, physical and logical security, failover, and business continuity.

 

Reviewer manual

NA

10. Preservation plan

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository assumes responsibility for long-term preservation and manages this function in a planned and documented way.

Guidance:

The repository, data depositors, and Designated Community need to understand the level of responsibility undertaken for each deposited item in the repository. The repository must have the legal rights to undertake these responsibilities. Procedures must be documented and their completion assured.

For this Requirement, responses should include evidence related to the following questions:

Reviewer manual

NA

11. Data quality

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository has appropriate expertise to address technical data and metadata quality and ensures that sufficient information is available for end users to make quality-related evaluations.

Guidance:

Repositories must work in concert with depositors to ensure that there is enough available information about the data such that the Designated Community can assess the substantive quality of the data. Such quality assessment becomes increasingly relevant when the Designated Community is multidisciplinary, where researchers may not have the personal experience to make an evaluation of quality from the data alone. Repositories must also be able to evaluate the technical quality of data deposits in terms of the completeness and quality of the materials provided, and the quality of the metadata.

Data, or associated metadata, may have quality issues relevant to their research value, but this does not preclude their use in science if a user can make a well-informed decision on their suitability through provided documentation.

For this Requirement, please describe:

Provisions for data quality are also ensured by other Requirements. Specifically, please refer to R8 (Appraisal), R12 (Workflows), and R7 (Data integrity and authenticity).

 

Reviewer manual

NA

12. Workflows

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

Archiving takes place according to defined workflows from ingest to dissemination.

Guidance:

To ensure the consistency of practices across datasets and services and to avoid ad hoc and reactive activities, archival workflows should be documented, and provisions for managed change should be in place. The procedure should be adapted to the repository mission and activities, and procedural documentation for archiving data should be clear.

For this Requirement, responses should include evidence related to the following:

This Requirement confirms that all workflows are documented. Evidence of such workflows may have been provided as part of other task-specific Requirements, such as for ingest in R8 (Appraisal), storage procedures in R9 (Documented storage procedures), security arrangements in R16 (Security), and confidentiality in R4 (Confidentiality/Ethics).

Reviewer manual

NA

13. Data discovery and identification

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository enables users to discover the data and refer to them in a persistent way through proper citation.

Guidance:

Effective data discovery is key to data sharing, and most repositories provide searchable catalogues describing their holdings such that potential users can evaluate data to see if they meet their needs. Once discovered, datasets should be referenceable through full citations to the data, including persistent identifiers to ensure that data can be accessed into the future. Citations also provide credit and attribution to individuals who contributed to the creation of the dataset.

For this Requirement, responses should include evidence related to the following questions:

Reviewer manual

NA

14. Data reuse

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository enables reuse of the data over time, ensuring that appropriate metadata are available to support the understanding and use of the data.

Guidance:

Repositories must ensure that data can be understood and used effectively into the future despite changes in technology. This Requirement evaluates the measures taken to ensure that data are reusable.

For this Requirement, responses should include evidence related to the following questions:

The concept of ‘reuse’ is critical in environments in which secondary analysis outputs are redeposited into a repository alongside primary data, since the provenance chain and associated rights issues may then become increasingly complicated.

Reuse is dependent on the applicable licenses covered in R2 (Licenses).

Reviewer manual

NA

15. Technical infrastructure

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The repository functions on well-supported operating systems and other core infrastructural software and is using hardware and software technologies appropriate to the services it provides to its Designated Community.

Guidance:

Repositories need to operate on reliable and stable core infrastructures that maximize service availability. Furthermore, hardware and software used must be relevant and appropriate to the Designated Community and to the functions that a repository fulfils. Standards such as the OAIS reference model specify the functions of a repository in meeting user needs.

For this Requirement, responses should include evidence related to the following questions:

Reviewer manual

NA

16. Security

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

The technical infrastructure of the repository provides for protection of the facility and its data, products, services, and users.

Guidance:

The repository should analyze potential threats, assess risks, and create a consistent security system. It should describe damage scenarios based on malicious actions, human error, or technical failure that pose a threat to the repository and its data, products, services, and users. It should measure the likelihood and impact of such scenarios, decide which risk levels are acceptable, and determine which measures should be taken to counter the threats to the repository and its Designated Community. This should be an ongoing process.

For this Requirement, please describe:

This Requirement describes some of the aspects generally covered by others, for example R12 (Workflows), and is supplementary to R9 (Documented storage procedures).

Reviewer manual

NA

17. Comments/feedback

Required statement of compliance:

0. N/A: Not Applicable.

Applicant manual

These requirements are not seen as final, and we value your input to improve the core certification procedure. To this end, please leave any comments you wish to make on both the quality of the Catalogue and its relevance to your organization, as well as any other related thoughts.

 

Reviewer manual

NA

Data Seal of Approval Board
W www.datasealofapproval.org
E info@datasealofapproval.org