QDR curates, archives, durably preserves, and provides access to digital data used in qualitative and multi-method social inquiry. This digital preservation policy describes the technological and institutional steps that QDR takes to ensure long-term preservation of the data it holds.
This policy draws on the report Trusted Digital Repositories: Attributes and Responsibilities, created through a collaboration between the Research Libraries Group (RLG) and the Online Computer Library Center (OCLC). Specifically, it discusses the seven “Attributes of a Trusted Digital Repository” outlined in Section 2 of the report:
- OAIS compliance
- Administrative responsibility
- Organizational viability
- Financial sustainability
- Technological and procedural suitability
- System security
- Procedural accountability
1. OAIS Compliance
QDR manages and preserves digital data based on the principles of the Open Archival Information System (OAIS) Reference Model (2012). In following OAIS and related initiatives (such as offering Findable, Accessible, Interoperable, and Reusable [FAIR] data), QDR seeks to ensure that its processes and technology conform to community best practices.
QDR’s OAIS Implementation
On receipt, QDR checks data files and metadata for completeness and integrity and, as needed, communicates with depositors to ask questions, confirm information, and/or solicit updated or additional files. The complete initial deposit (i.e., the Submission Information Package, SIP) is then committed to archival storage. The SIP is deposited with QDR’s long-term storage partner (the Digital Preservation Network, DPN), where file integrity is periodically monitored. QDR is committed to preserving data for at least 20 years from the point of deposit.
QDR recommends particular file formats and engages in file migration to protect against file format obsolescence. Following receipt of a data deposit, files are converted to recommended storage formats and ingested into the Dataverse repository system. The file formats follow recommendations from the Library of Congress as well as other data repositories with significant holdings of qualitative data such as UK Data and DANS. All changes that QDR makes to any deposited files are recorded in a readme file accompanying the data. In the future, QDR plans to record such preservation actions in PREMIS metadata. All file formats are monitored for obsolescence using the Library of Congress’s Sustainability of File Formats pages as well as the UK National Archives’ PRONOM service. Files in formats threatened by obsolescence are converted to suitable replacement formats. Converted files are stored in the repository as well as committed to long-term storage as an Archival Information Package (AIP). QDR does not prepare separate dissemination copies of files, so the Dissemination Information Package (DIP) is identical to the AIP.
QDR performs file integrity checks to protect against bitrot. On ingest, QDR’s repository software (Dataverse) automatically creates an MD5 checksum for every ingested file. The checksum is stored so that file integrity can be checked manually, including by users and third parties. Files are stored on Amazon Web Services Simple Storage Service (Amazon S3), where redundant copies of each file are stored on distributed servers and integrity checks at rest are performed using Content-MD5 checksums and cyclic redundancy checks (CRCs). AWS also performs integrity checks during data transfer.
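A manual check of the kind described here can be sketched in a few lines of Python. This is a minimal illustration, not QDR's or Dataverse's own tooling: the function names `md5_of_file` and `matches_published_checksum` are hypothetical, and the sketch assumes the user has already downloaded a file and copied the MD5 value displayed for it in the repository.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large files need not fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str, published_md5: str) -> bool:
    """Compare a local file against a checksum copied from the repository.

    Surrounding whitespace and letter case in the published value are ignored.
    """
    return md5_of_file(path) == published_md5.strip().lower()
```

If `matches_published_checksum` returns `False`, the local copy differs from the file that was ingested, for example because a download was interrupted or the file was modified afterwards.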
Data files are stored with descriptive metadata enhanced during curation. QDR’s metadata application profile is based on the Data Documentation Initiative (DDI), the de facto standard for social science metadata.
Following FAIR requirements, QDR metadata are available without restrictions in human- and machine-readable formats and through an API. Access to data files is restricted to registered users and, where necessary, additional access controls for files are set.
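As an illustration of machine-readable access, the sketch below builds a request URL for the metadata export endpoint of the Dataverse native API, asking for DDI output. Several details are assumptions rather than statements from this policy: the base URL is assumed to be QDR's Dataverse installation, the helper name `ddi_export_url` is hypothetical, and the persistent identifier is a made-up placeholder. Only the URL is constructed; no request is sent.

```python
from urllib.parse import urlencode

# Assumed base URL of the repository's Dataverse installation
# (not stated in this policy).
BASE_URL = "https://data.qdr.syr.edu"


def ddi_export_url(persistent_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for Dataverse's dataset metadata export, requesting DDI."""
    query = urlencode({"exporter": "ddi", "persistentId": persistent_id})
    return f"{base_url}/api/datasets/export?{query}"


# Hypothetical identifier, used only for illustration.
print(ddi_export_url("doi:10.0000/EXAMPLE"))
```

Because metadata are available without restrictions, a request of this kind should not need an API token, whereas downloading the data files themselves requires a registered account.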
2. Administrative Responsibility
QDR launched in 2014 with a mission to “curate, store, preserve, publish, and enable the download of digital data generated through qualitative and multi-method research in the social sciences.” QDR seeks to fulfill this mission by offering training in managing and sharing data generated through and analyzed in qualitative and multi-method social science research, and by building infrastructure dedicated to curating, archiving, preserving, and making available those data. The repository is led by social scientists at Syracuse University and Georgetown University as well as information scientists at the University of Washington at Seattle.
QDR strives to follow best practices for digital repositories as specified in ISO Standard 16363. The repository is currently seeking certification for compliance with international standards for digital repositories under the CoreTrustSeal. A Technical Advisory Board, whose members are specialists from libraries and data repositories, guides QDR on technical questions including development, curation, and digital preservation.
QDR works closely with social scientists to ensure that it is fulfilling its functions within its designated community. Its Research Advisory Board, whose members are leading social scientists, assures that the repository serves the interests of its main constituency, practicing social scientists.
Each of these boards meets twice a year via videoconference. QDR also reports to both boards at the end of each quarter and solicits advice and feedback.
QDR’s relationship with depositors is governed by its General Terms and Conditions of Use, and by its Standard and Special Deposit Agreements, which clearly delineate the rights and responsibilities of the repository and grant it the necessary authority to ensure long-term preservation of materials, while depositors retain ownership of the data.
3. Organizational Viability
QDR is an operating unit of Syracuse University. The scope of QDR’s preservation activity is specified in the collections development and appraisal policy. To do justice to the qualitative data in its custody, QDR curators apply subject-area expertise to data curation, and all curation is supervised by senior staff with graduate degrees in social science. QDR personnel regularly attend and present at national and international workshops, ensuring that the organization is aware of and follows current best practices in repository management. QDR is also in constant dialog with users in its designated community, as well as its Research Advisory Board, and periodically updates its policies and practices to meet their needs. Succession plans are in place to ensure continued access to data should QDR cease operations, through QDR’s membership in both DPN and the Data Preservation Alliance for the Social Sciences (Data-PASS).
4. Financial Sustainability
At present QDR is mainly grant-funded. Syracuse University demonstrates its ongoing commitment to QDR through the provision of administrative and logistical support and by serving as its host organization.
4.1 Institutional Commitment
Starting in July 2018, QDR will diversify its funding stream by beginning to accept institutional memberships. Between July 2018 and June 2019, affiliates of institutional members (e.g., faculty, students, and research staff in the case of universities) will not be charged for curation and preservation services for data projects that they deposit with QDR (with the qualifications noted on QDR’s Institutional Membership page, and in its Institutional Membership Agreement). Beginning in July 2019, affiliates of member institutions will not be charged for curation and preservation services (with the same qualifications) performed on a limited number of deposits commensurate with the institutional membership tier they have elected. In addition, representatives from institutional members will shape QDR’s direction through election of the two advisory boards.
4.2 Cooperation and Collaboration
QDR collaborates with other data repositories and university libraries through developing joint grant applications, events, and other initiatives. QDR’s most important partnerships are with the other repositories that form part of the Data-PASS consortium. Data-PASS member repositories have entered into an agreement to cooperate to achieve common objectives to archive social science data, provide access to these data in a shared catalog, participate in a preservation network, and engage in digital preservation best practices. The Data-PASS partnership is integral to QDR’s succession plan as described above.
5. Technological and Procedural Suitability
QDR follows a normalization and migration strategy for preservation, as described above and in its curation policy. The processes are documented both internally and publicly, and new staff members are trained in them within a few weeks of joining QDR.
QDR curates and stores metadata based on the Data Documentation Initiative (DDI) standard and participates in efforts to improve DDI’s applicability to qualitative data and its implementation in the Dataverse repository software.
QDR relies on widely used and reliable infrastructure. Its repository functions are provided by the Dataverse software, widely used open source software developed at the Institute for Quantitative Social Science (IQSS) at Harvard University. Our servers run on Amazon Web Services (AWS) S3, which is designed for 99.99% availability and 99.999999999% durability of objects.
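To give a sense of scale, the two percentages quoted above can be turned into more tangible numbers. The short calculation below is a sketch under stated assumptions: it reads the 99.99% figure as availability (uptime) over a non-leap year, reads the durability figure as an annual per-object probability, and uses a hypothetical collection of 10,000,000 objects. Both figures are design targets rather than measured results, and the function names are illustrative.

```python
from fractions import Fraction

MINUTES_PER_YEAR = 365 * 24 * 60  # non-leap year


def expected_downtime_minutes(availability_percent: str) -> Fraction:
    """Minutes per year a service may be unavailable at the given availability."""
    unavailable_share = 1 - Fraction(availability_percent) / 100
    return unavailable_share * MINUTES_PER_YEAR


def years_per_lost_object(durability_percent: str, object_count: int) -> Fraction:
    """Average number of years between single-object losses (assumed annual model)."""
    annual_loss_probability = 1 - Fraction(durability_percent) / 100
    expected_losses_per_year = annual_loss_probability * object_count
    return 1 / expected_losses_per_year


print(float(expected_downtime_minutes("99.99")))
print(float(years_per_lost_object("99.999999999", 10_000_000)))
```

Under these assumptions, 99.99% availability allows roughly 52.6 minutes of downtime per year, and a collection of 10,000,000 objects would on average lose a single object once every 10,000 years. Exact fractions are used instead of floating-point numbers because subtracting a value this close to 1 would otherwise introduce rounding error.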
6. System Security
The security of its holdings is of paramount importance to QDR. Data are stored on our main AWS servers. AWS stores redundant copies of all data and automatically checks MD5 checksums both at rest and on transfer. Additional “hot” system back-ups are kept on AWS. Regular file back-ups are also kept on local servers at Syracuse University. DPN, of which QDR is a member, provides long-term, decentralized storage for our holdings.
For each server that QDR provisions, we create a virtual private cloud (VPC) through AWS. The VPC is implemented through private IP subnets, as well as a virtual private network (VPN) that secures access to the VPC.
Sensitive data are managed by curation staff according to dedicated protocols and stored with additional AES-256 encryption where warranted.
7. Procedural Accountability
QDR is transparent in its processes and structures, documenting its main policies and workflows. QDR’s policies and workflows are periodically reviewed and the repository’s two advisory boards are consulted for major changes. The two advisory boards also receive quarterly roadmaps from QDR’s technical team and the content/curation team. Each team evaluates its own performance and that of the other team on a monthly basis.
QDR’s web services are monitored using industry-standard monitoring and notification tools such as Pingdom, Nagios, and PagerDuty to ensure swift responses to outages, and QDR has escalation protocols in place should such outages occur.