Handling Sensitive Data

This policy describes how QDR handles sensitive data. Some data collected from human participants are sensitive. We classify data as sensitive based on the degree to which they contain personally identifiable information (identifiability) and the degree of harm to which research participants could be exposed if the data were matched to them (risk). Based on these two dimensions, we specify three sensitivity levels for data (low, medium, high), each of which has a different technical workflow (see table below). Data that contain no personally identifiable information, and data with no risk or very minimal risk (e.g., on-the-record interviews or documents collected at a public archive), do not fall under this policy.

QDR strongly believes that data usage agreements (Special Deposit/Download Agreements in QDR’s case) should be tailored to the data at hand; as such, different conditions may be placed on access to data within the same sensitivity category.

Data are classified from low to high sensitivity depending on two factors: identifiability and risk.

The subsequent sections of the policy outline the process for signing a Special Deposit Agreement with QDR, and discuss how sensitive data are securely transferred to QDR, curated, stored, and made accessible to QDR’s registered users.

Special Deposit Agreement

Depositors who wish to limit access to some or all of the data they deposit with QDR choose the conditions they wish to apply to the data, and sign a Special Deposit Agreement with QDR that indicates those access conditions. Where QDR and the depositor disagree on what access conditions to impose, the repository reserves the right not to publish the data.

Data Transfer

Depositors are instructed to encrypt files containing de-identified data using AES-256 encryption with an open-source tool such as 7-Zip. The password is transmitted from the depositor to QDR staff in a separate channel, typically by phone. Under most circumstances, QDR does not see or handle re-identification keys.
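As an illustration of the encryption step, the sketch below builds a 7-Zip command line for a password-protected archive. It assumes the `7z` command-line executable is installed and on the PATH; the function names and file names are hypothetical and are not part of QDR's instructions to depositors. The 7z archive format uses AES-256 for password protection, `-mhe=on` additionally encrypts the file names stored in the archive, and a bare `-p` makes 7-Zip prompt for the password so that it is not recorded in the shell history.

```python
import subprocess
from pathlib import Path


def build_7z_command(archive: str, files: list[str]) -> list[str]:
    """Return a 7-Zip command that creates an AES-256 encrypted .7z archive."""
    if not files:
        raise ValueError("at least one input file is required")
    if Path(archive).suffix != ".7z":
        raise ValueError("archive name should end in .7z")
    return [
        "7z", "a",   # "a" adds files to an archive
        "-t7z",      # 7z container, which encrypts with AES-256
        "-mhe=on",   # also encrypt archive headers (file names)
        "-p",        # no value given: 7-Zip prompts for the password
        archive,
        *files,
    ]


def encrypt_for_deposit(archive: str, files: list[str]) -> None:
    """Run 7-Zip interactively; requires the 7z executable to be installed."""
    subprocess.run(build_7z_command(archive, files), check=True)
```

A depositor might call `encrypt_for_deposit("deposit.7z", ["interviews.docx"])` and then share the password with QDR staff by phone, as described above.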

Files are transferred using QDR’s Business Dropbox account, and are placed in a dedicated folder. Dropbox encrypts files at rest using AES-256 encryption and secures file transfer using SSL/TLS (Dropbox’s own security documentation describes its security architecture in more detail). All QDR staff with access to the files (in most cases, no more than three people per project) have multi-factor authentication enabled for Dropbox to reduce the likelihood of unauthorized access.

Data Curation

Curation is performed by QDR curation staff located at Syracuse University. As part of curation, QDR performs a review of potential de-identification issues. While QDR advises depositors on best practices, the responsibility for decisions concerning how data are de-identified remains with the depositor. Moreover, QDR can only review de-identification concerns for material in select languages.

Low Sensitivity: During curation, files are stored unencrypted on a dedicated drive provided by Syracuse University’s Information and Communication Technology Services, accessible only to QDR staff and Syracuse University system administrators. Access is only possible from within the Syracuse Internet Protocol (IP) address range.

Medium or High Sensitivity: During curation, files are stored encrypted using VeraCrypt (or similar encryption software) on a dedicated drive provided by Syracuse University’s Information and Communication Technology Services, accessible only to QDR staff and Syracuse University system administrators. Access is only possible from within the Syracuse Internet Protocol (IP) address range. All staff with access to the data receive additional security training.
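The IP-range restriction that applies during curation can be illustrated with a short check. This is only a sketch of the general technique, not a description of how Syracuse University enforces the restriction. The network below is a placeholder taken from the reserved documentation range, because the policy does not state the actual Syracuse address range, and the names `ALLOWED_NETWORKS` and `is_within_allowed_range` are hypothetical.

```python
import ipaddress

# Placeholder network from the RFC 5737 documentation range; the real
# Syracuse University address range is not given in this policy.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_within_allowed_range(client_ip: str) -> bool:
    """Return True if client_ip belongs to one of the allowed networks."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # malformed addresses are never allowed
    return any(address in network for network in ALLOWED_NETWORKS)
```

A request from an address outside the configured networks, or from a string that is not a valid IP address, is rejected.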

Permanent Data Storage

Low Sensitivity: The primary data storage is QDR’s instance of the Dataverse software, run on Amazon Web Services (AWS). Restricted files are stored unencrypted but accessible only to users specifically authorized to access the files. Access to restricted files can only be granted by QDR administrators. The repository software logs all file downloads to allow for security audits.

Real-time and daily back-ups are made to the AWS and Syracuse servers. Access to back-ups is limited to QDR administrators and restricted via an AWS Virtual Private Cloud (VPC).

Medium or High Sensitivity: As the Dataverse software used by QDR does not currently support encryption at rest, files are stored encrypted on QDR’s AWS servers outside the Dataverse software. The Dataverse catalog holds placeholder files with file-level metadata. Access to encrypted files and their back-ups is restricted using an AWS Virtual Private Cloud (VPC).

All Sensitivity Levels: QDR provides long-term preservation through the Digital Preservation Network (DPN). In accordance with DPN recommendations, restricted files are encrypted using AES-256 before deposit to DPN.
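The storage rules in this section can be summarized as a small lookup. The sketch below only restates the policy text in code form; the class, field, and function names are hypothetical and do not correspond to anything in the Dataverse software or in QDR's systems.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class StoragePlan:
    primary_location: str
    encrypted_at_rest: bool             # encryption of the primary copy
    dataverse_holds_placeholder: bool   # placeholder with file-level metadata
    preservation_copy_encrypted: bool   # AES-256 before deposit to DPN


def storage_plan(level: Sensitivity) -> StoragePlan:
    """Return the storage arrangement the policy describes for a level."""
    if level is Sensitivity.LOW:
        # Restricted files live in Dataverse, unencrypted but access-controlled.
        return StoragePlan("Dataverse on AWS", False, False, True)
    # Medium and high: encrypted on AWS outside Dataverse, placeholder in catalog.
    return StoragePlan("AWS servers outside Dataverse", True, True, True)
```

Medium and high sensitivity share one storage arrangement; they differ only in how access is granted.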

Access to Restricted Data

QDR registered users who wish to access restricted data request to do so through QDR’s Dataverse repository. Those users must complete and sign a Special Download Agreement. The agreement requires, among other things, that users describe the research project for which they wish to use the data, and that they destroy all local copies of all downloaded data upon completion of the specified project.

Low Sensitivity: QDR authenticates the identity of the user by a) confirming their use of an institutional email account associated with them and b) speaking with the user via video-conferencing software. Access to restricted data is not granted to users who are not affiliated with a research institution without express permission from the depositor. Additional requirements or restrictions may be specified in the Special Download Agreement for the data. Once access is granted, authenticated users can download the restricted files from QDR’s Dataverse installation.

Medium Sensitivity: In addition to the authentication required for low-sensitivity data, QDR requires that users submit a data security plan describing how the data will be stored securely on the user’s local systems, how access will be regulated, and how the data will be disposed of once analysis is complete. Further, before the restricted data can be downloaded, QDR requires an agreement signed by both the user and an Authorized Institutional Officer from the user’s home institution, as well as approval from that institution’s Institutional Review Board. Additional requirements or restrictions may be specified in the Special Download Agreement for the data. If access is granted, encrypted data are made available to the requesting party.

High Sensitivity: Users must fulfill the authentication and access requirements for medium-sensitivity data and may access the data only in person at QDR at Syracuse University, in a supervised room on a dedicated computer without internet access. Notes can be taken only on that computer and are subject to review by QDR staff, as is any mention of the data in publications.
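Because each sensitivity level adds requirements to those of the level below it, the access rules above are cumulative. The following sketch shows one way to express that; the requirement labels paraphrase this policy, and the names `Sensitivity`, `REQUIREMENTS_ADDED_AT`, and `access_requirements` are hypothetical. It leaves out dataset-specific conditions set in a Special Download Agreement and the depositor permission needed for users without a research affiliation.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Requirements introduced at each level, paraphrased from the policy text.
REQUIREMENTS_ADDED_AT = {
    Sensitivity.LOW: [
        "signed Special Download Agreement",
        "institutional email account confirmed",
        "identity confirmed by video conference",
    ],
    Sensitivity.MEDIUM: [
        "data security plan",
        "agreement signed by user and Authorized Institutional Officer",
        "Institutional Review Board approval",
    ],
    Sensitivity.HIGH: [
        "on-site access at QDR in a supervised room",
        "dedicated computer without internet access",
        "notes and publication mentions reviewed by QDR staff",
    ],
}


def access_requirements(level: Sensitivity) -> list[str]:
    """List every requirement for a level, including those of lower levels."""
    required = []
    for tier in Sensitivity:  # iterates LOW, MEDIUM, HIGH in definition order
        if tier <= level:
            required.extend(REQUIREMENTS_ADDED_AT[tier])
    return required
```

Calling `access_requirements(Sensitivity.MEDIUM)` returns the three low-sensitivity requirements followed by the three added at the medium level.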