Annotation for Transparent Inquiry (ATI) at a Glance

QDR and Hypothes.is (https://hypothes.is/) have partnered to develop a new approach to transparency in qualitative and multi-method research: Annotation for Transparent Inquiry (ATI). ATI is based on the concept of “open annotation,” which allows for the generation, sharing, and discovery of digital annotations across the web (Sanderson et al. 2017). Using ATI empowers social scientists to develop “data supplements” that can be linked directly to articles published on multiple platforms.

Data supplements consist of a set of digital annotations that elucidate the data and analysis on which research is based. Each annotation is anchored to a segment of article text published on the web, and contains one or more of the following elements:

  • A full citation to the underlying data source;
  • An analytic note: discussion that illustrates how the data were generated and how they support the empirical claim or conclusion being annotated in the text;
  • A source excerpt: typically 100 to 150 words from a textual source; for handwritten material, audiovisual material, or material generated through interviews or focus groups, an excerpt from the transcription;
  • A source excerpt translation: if the excerpt is not in English, a translation of the key passage(s).
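
As a rough illustration of the structure just described, a single annotation could be modeled as a small record with one field per element. This is only a sketch: the class name `ATIAnnotation`, the field names, and the validation rules are hypothetical and are not taken from QDR's or Hypothes.is's actual data model.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ATIAnnotation:
    """One annotation in a data supplement (hypothetical field names)."""

    anchor_text: str                            # article passage the annotation is attached to
    full_citation: Optional[str] = None         # citation to the underlying data source
    analytic_note: Optional[str] = None         # how the data were generated and support the claim
    source_excerpt: Optional[str] = None        # excerpt from the source or its transcription
    excerpt_translation: Optional[str] = None   # translation if the excerpt is not in English

    def elements(self) -> Dict[str, str]:
        """Return only the elements that are actually filled in."""
        candidates = {
            "full_citation": self.full_citation,
            "analytic_note": self.analytic_note,
            "source_excerpt": self.source_excerpt,
            "excerpt_translation": self.excerpt_translation,
        }
        return {name: value for name, value in candidates.items() if value}

    def __post_init__(self) -> None:
        # Each annotation is anchored to a segment of article text ...
        if not self.anchor_text.strip():
            raise ValueError("an annotation must be anchored to a segment of article text")
        # ... and contains one or more of the four elements.
        if not self.elements():
            raise ValueError("an annotation needs at least one element")
```

An annotation that carries only an analytic note is valid under this sketch, while one with an anchor but no elements is rejected at construction time.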

Data supplements are housed in a data repository, which allows them and their component parts to be “FAIR” (findable, accessible, interoperable, and reusable; Wilkinson et al. 2016), and to be protected if sharing needs to be restricted.

We offer an example of ATI here – an annotation served by Hypothes.is on the PDF of a working paper hosted by the Kellogg Institute for International Studies at the University of Notre Dame.

When mapped across the span of a book or article and linked to the discrete passages in the text where each data source is deployed, the annotations serve as a digital exoskeleton: they strengthen the publication, augment the main text and notes, and can obviate the need for block quotations and discursive footnotes, leaving more space for substantive content.

Why Do We Need a “New Approach to Transparency”?

Rigorous social science requires open data and materials. Accordingly, where possible, scholars should describe how their data were generated and analyzed, explain how their analysis substantiates their claims and conclusions, and share the data themselves when it is ethical and legal to do so (Miguel et al. 2014).

While openness is relevant for all types of social science, the diversity of data and analytic methods employed in different scholarly traditions requires that different mechanisms be used to achieve that goal. Whichever mechanism is used, however, it must be able to meet three challenges. It must make relevant data and analytic information immediately available in tandem with the particular knowledge claim they were used to generate (proximity); it must make those materials FAIR; and it must be capable of addressing the ethical and legal complications that constrain openness (protection). ATI facilitates openness in publications based on the analysis of qualitative data and meets all three of these challenges.

Quantitative and computational social science analyze numeric data arranged in a matrix and approached as an aggregate body of information. In published work, the analysis is typically summarized in tabular form in the text or appendix (see Figure 1), and the study dataset (and relevant information about its creation) and the do-file used for analysis are customarily provided as supplemental materials.


Figure 1: Quantitative Research – Matrix Data

Qualitative data and analysis work differently. Qualitative data are often richer and more abstract, and are typically analyzed, and used to support claims, individually or in small groups. In this type of analysis, the content of each cited source (e.g., a book, archival document, interview transcript, newspaper article, or video clip) serves as a distinct input. Moreover, data, analysis, and conclusions are densely interwoven across the span of a book or article (see Figure 2).

Figure 2: Qualitative Research – Individual Pieces of Data

For qualitative research, optimizing proximity entails making digital data sources (e.g., archival documents, audio-recordings, interview transcripts, ethnographic field notes) and annotations containing relevant analytic information immediately available where the data sources were invoked in the narrative. Proximity requires that relevant data and materials be available across the span of an article, and accessible from the article as it is displayed on a journal’s web page (i.e., on the publisher’s platform).

The second challenge is rendering data and analytic information FAIR. Accomplishing this goal involves a set of related technologies and policies, and requires substantial expertise and investment. For instance, the data need to be described with the proper metadata, and digital preservation strategies that ensure the long-run usability of the data need to be applied to them. Domain repositories like QDR are the stakeholders best equipped to comply with these standards (and to continue to do so as the standards evolve over time).

Finally, the mechanism employed to achieve openness must address the ethical and legal complications that attend much social science data, in particular protecting human participants and respecting copyright law. In the words of one well-known epigram, the goal should be to make data “as open as possible [but] as closed as necessary” (ERAC 2016, 15). The most promising way to maximize transparency in the face of such constraints is through differential access to the evidentiary base of published articles.

Where Are My Data and Materials?

QDR hosts the annotations and the underlying data sources. As a trusted digital repository, QDR is committed to long-term preservation and to maintaining the links between web-based publications and their annotations and data sources (e.g., providing updated versions of data source files when the software designed to access them changes). The same open science best practices that trusted digital repositories apply to underlying data sources should be applied to annotations (both their data elements and their analytic notes): they should be curated (e.g., metadata that allow effective querying and discovery should be attached) and they should be prepared for long-term preservation. In addition, we make the compilation of annotations associated with a particular article openly accessible at QDR as a single, searchable data supplement; it is also assigned a DOI and carefully preserved.

Open annotations are served from a third-party annotation server and mapped to the text of digital articles presented on a publisher’s website. Displaying annotations requires client software, and the most widely used such software is provided by Hypothes.is. The annotation service is easily enabled on a journal web page via a Hypothes.is bookmarklet, a Hypothes.is Chrome add-on (which requires no action or participation by journal publishers), or an enabled button on the page. When annotation is enabled, annotated passages in an article are highlighted and annotations are immediately available to anyone viewing the page.
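
To make the idea of mapping an annotation to article text more concrete, the sketch below builds a record loosely following the W3C Web Annotation Data Model (the open annotation standard cited above as Sanderson et al. 2017) and locates its target passage with a text-quote selector. The function names `make_annotation` and `locate`, the example URLs, and the exact-match strategy are assumptions made for illustration; annotation clients such as the one provided by Hypothes.is use more tolerant matching, and this is not their implementation.

```python
from typing import Dict, Optional, Tuple


def make_annotation(page_url: str, exact: str, prefix: str, suffix: str, note: str) -> Dict:
    """Build a simplified Web Annotation record (illustrative values only)."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": "https://example.org/annotations/1",   # hypothetical identifier
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": note, "format": "text/plain"},
        "target": {
            "source": page_url,                      # the article page being annotated
            "selector": {
                "type": "TextQuoteSelector",
                "exact": exact,                      # the annotated passage itself
                "prefix": prefix,                    # text just before it, for disambiguation
                "suffix": suffix,                    # text just after it
            },
        },
    }


def locate(selector: Dict[str, str], page_text: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of the quoted passage, or None if it is absent.

    Assumption: plain exact matching on prefix + exact + suffix.
    """
    needle = selector["prefix"] + selector["exact"] + selector["suffix"]
    position = page_text.find(needle)
    if position == -1:
        return None
    start = position + len(selector["prefix"])
    return start, start + len(selector["exact"])
```

Because the selector stores the quoted passage together with its immediate context rather than a position in a particular file, the same annotation can in principle be matched against any rendering of the article that contains that text, which is what lets a separately hosted annotation be displayed on a publisher's page.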

Open annotation thus effectively mobilizes the comparative advantages of publishers, which publish articles; technology firms, which build annotation platforms; and domain repositories, which curate, preserve, protect, and distribute FAIR data and materials in the form of open annotations.

ATI has the potential to greatly enhance the clarity with which descriptive and causal claims are made in qualitative research; to transform how transparency is achieved; and, by empowering replication and reproduction, to profoundly change the way qualitative social science scholarship is evaluated. This new approach to transparency should thus have a direct and significant impact on the credibility and legitimacy of qualitative scholarship and its utility for evidence-based policy.