Annotation for Transparent Inquiry (ATI) at a Glance

QDR and Hypothesis (https://hypothes.is/) have partnered to develop a new approach to transparency in qualitative and multi-method research: Annotation for Transparent Inquiry (ATI). ATI builds on “active citation,” an earlier approach to achieving transparency in qualitative research pioneered by Moravcsik (e.g., 2010, 2012a, 2012b, 2014a, 2014b, 2014c, 2016). ATI employs “open annotation,” which allows for the generation, sharing, and discovery of digital annotations across the web (Sanderson et al. 2017). Using ATI empowers social scientists to develop “data supplements” that can be linked directly to articles published on multiple platforms.

An ATI Data Supplement includes two sections: an “ATI Data Overview” and a set of digital annotations that elucidate the data and analysis on which research is based.

The ATI Data Overview, of approximately 1,000 words in length, introduces the ATI Data Supplement. The Overview describes the associated publication’s empirical base and offers a general discussion of how the data were produced and analyzed. The Overview should not repeat any information already included in the main text of the publication. If the manuscript has a bibliography or list of references, it should be included in the Data Overview.

In most cases, the bulk of the ATI Data Supplement will consist of a set of digital annotations that elucidate the data and analysis on which research is based. Each annotation is anchored to a segment of article text published on the web, and contains one or more of the following elements:

  • A full citation to the underlying data source(s) mentioned in the annotated portion of the text (when a full citation was not included in the bibliography), and, where useful, additional location information;

  • A source excerpt: typically 100 to 150 words from a textual source (e.g., an excerpt from the transcription for handwritten material, audiovisual material, or material generated through interviews or focus groups);

  • A source excerpt translation: if the excerpt is not in English, a translation and indication of its source;

  • An analytic note: discussion that illustrates how the data were generated and/or analyzed and how they support the empirical claim or conclusion being annotated in the text;

  • Data source: the file name(s) of the corresponding data source(s) linked to the source(s) themselves when these are digital and can be shared ethically and legally.
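
The elements listed above can be thought of as the fields of a simple record. The sketch below is only an illustration, in Python, of one way such a record might be represented and wrapped in a structure modeled on the W3C Web Annotation Data Model (Sanderson et al. 2017). The class name `ATIAnnotation`, its field names, and the plain-text layout of the body are assumptions made for this example; they are not the format actually used by QDR or Hypothesis.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ATIAnnotation:
    """Hypothetical record for one ATI annotation (field names are illustrative)."""

    article_url: str    # web address of the published article
    anchored_text: str  # segment of article text the annotation is anchored to
    full_citation: Optional[str] = None
    source_excerpt: Optional[str] = None
    excerpt_translation: Optional[str] = None
    analytic_note: Optional[str] = None
    # Maps a data source file name to its link (only when it can be shared).
    data_sources: Dict[str, str] = field(default_factory=dict)

    def elements(self) -> Dict[str, str]:
        """Return only the elements that are present, keyed by a display label."""
        candidates = {
            "Full citation": self.full_citation,
            "Source excerpt": self.source_excerpt,
            "Source excerpt translation": self.excerpt_translation,
            "Analytic note": self.analytic_note,
            "Data source": "; ".join(
                f"{name} ({link})" for name, link in self.data_sources.items()
            ),
        }
        return {label: text for label, text in candidates.items() if text}

    def to_web_annotation(self) -> dict:
        """Wrap the elements in a dict shaped like a W3C Web Annotation."""
        present = self.elements()
        if not present:
            # The text requires one or more elements per annotation.
            raise ValueError("an annotation needs at least one element")
        body = "\n\n".join(f"{label}: {text}" for label, text in present.items())
        return {
            "@context": "http://www.w3.org/ns/anno.jsonld",
            "type": "Annotation",
            "body": {"type": "TextualBody", "value": body, "format": "text/plain"},
            "target": {
                "source": self.article_url,
                "selector": {"type": "TextQuoteSelector", "exact": self.anchored_text},
            },
        }
```

In this sketch every element is optional, but the conversion refuses an annotation that carries none of them, mirroring the "one or more of the following elements" requirement.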

QDR hosts the ATI Data Overview, the underlying data sources, and a preservation copy of the annotations. Hosting ATI Data Supplements in a data repository allows them and their component parts to be “FAIR” (findable, accessible, interoperable, and reusable; Wilkinson et al. 2016), and to be protected if sharing needs to be restricted.

We offer an example of ATI here – an annotation served by Hypothesis on the PDF of a working paper hosted by the Kellogg Institute for International Studies at the University of Notre Dame.

When mapped across the span of a book or article and linked to the discrete passages in the text where each data source is deployed, the annotations serve as a digital exoskeleton that dramatically strengthens publications. They augment the main text and notes, and can potentially obviate the need for block quotations and discursive footnotes, leaving more space for substantive content.

Why Do We Need a “New Approach to Transparency”?

Rigorous social science requires open data and materials. Accordingly, where possible, scholars should describe how their data were generated and analyzed, explain how their analysis substantiates their claims and conclusions, and share the data themselves when it is ethical and legal to do so (Miguel et al. 2014).

While openness is relevant for all types of social science, the diversity of data and analytic methods employed in different scholarly traditions requires that different mechanisms be used to achieve it. Whichever mechanism is used, however, it must be able to meet three challenges. It must make relevant data and analytic information immediately available in tandem with the particular knowledge claim they were used to generate (proximity); it must render those materials FAIR; and it must be capable of addressing concerns about the ethical and legal complications that constrain openness (protection). ATI facilitates the achievement of openness in publications based on the analysis of qualitative data and meets all three of these challenges.

Quantitative and computational social science analyze numeric data arranged in a matrix and approached as an aggregate body of information. In published work, the analysis is typically summarized in tabular form in the text or appendix (see Figure 1), and the study dataset (and relevant information about its creation) and the do-file used for analysis are customarily provided as supplemental materials.


Figure 1: Quantitative Research – Matrix Data

Qualitative data and analysis work differently. Qualitative data are often richer and more abstract, and are typically analyzed, and used to support claims, individually or in small groups. In this type of analysis, the content of each cited source (e.g., book, archival document, interview transcript, newspaper article, video clip, etc.) serves as a distinct input. Moreover, data, analysis, and conclusions are densely interwoven across the span of a book or article (see Figure 2).

Figure 2: Qualitative Research – Individual Pieces of Data

For qualitative research, optimizing proximity entails making digital data sources (e.g., archival documents, audio-recordings, interview transcripts, ethnographic field notes) and annotations containing relevant analytic information immediately available where the data sources were invoked in the narrative. Proximity requires that relevant data and materials be available across the span of an article, and accessible from the article as it is displayed on a journal’s web page (i.e., on the publisher’s platform).

The second challenge is rendering data and analytic information as FAIR. Accomplishing this goal involves a set of related technologies and policies, and requires substantial expertise and investment. For instance, the data need to be described with the proper metadata, and digital preservation strategies that assure the usability of data in the long run need to be applied to them. Domain repositories like QDR are the stakeholders best equipped to comply with these standards (and to continue to do so as the standards evolve over time).

Finally, the mechanism employed to achieve openness must address the ethical and legal complications that attend much social science data, in particular protecting human participants and respecting copyright law. In the words of one well-known epigram, the goal should be to make data “as open as possible [but] as closed as necessary” (ERAC 2016, 15). The most promising way to maximize transparency in the face of such constraints is through differential access to the evidentiary base of published articles.

Where Are My Data and Materials?

QDR hosts the ATI Data Overview, the underlying data sources, and a preservation copy of the annotations. As a trusted digital repository, QDR is committed to long-term preservation and to maintaining the links between web-based publications and their annotations and data sources (e.g., providing updated versions of data source files when the software designed to access them changes). The same best practices of open science that trusted digital repositories apply to underlying data sources should be applied to annotations (both their data elements and their analytic notes): they should be curated (e.g., metadata that allow effective querying and discovery should be attached) and they should be prepared for long-term preservation. In the near future, QDR will make ATI Data Supplements openly accessible at QDR as single, searchable data supplements, each with its own DOI.

Open annotations are served from a third-party annotation server and mapped to the text of digital articles presented on a publisher’s website. Displaying annotations requires client software, and the most widely used such software is provided by Hypothesis. The annotation service is easily enabled on a journal web page via a Hypothesis bookmarklet, a Hypothesis Chrome add-on (which requires no action or participation by journal publishers), or an enabled button on the page. When annotation is enabled, annotated passages in an article are highlighted and annotations are immediately available to anyone viewing the page.
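
Because annotations live on an annotation server rather than in the article itself, a client retrieves them by asking the server for everything anchored to a given page address. As a rough sketch, the Python below builds such a request against the public Hypothesis search endpoint (`https://api.hypothes.is/api/search`, with `uri` and `limit` query parameters, as we understand that API). The function name and the example article address are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Public Hypothesis search endpoint (an assumption based on the Hypothesis API docs).
HYPOTHESIS_SEARCH = "https://api.hypothes.is/api/search"


def annotation_search_url(article_url: str, limit: int = 20) -> str:
    """Build the URL that asks the server for annotations anchored to one page."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # urlencode percent-encodes the article address so it is safe in a query string.
    query = urlencode({"uri": article_url, "limit": limit})
    return f"{HYPOTHESIS_SEARCH}?{query}"


# Hypothetical article address, used only for illustration.
print(annotation_search_url("https://example.org/journal/article-1"))
```

Keeping the lookup keyed on the article's address is what lets the same annotations appear wherever that page is viewed, without any change to the publisher's copy of the article.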

Open annotation thus effectively mobilizes the comparative advantages of three groups: publishers, who publish articles; technology firms, which build annotation platforms; and domain repositories, which curate, preserve, protect, and distribute FAIR data and materials in the form of open annotations.

ATI has the potential to greatly enhance the clarity with which descriptive and causal claims are made in qualitative research; to transform how transparency is achieved; and, by empowering replication and reproduction, to profoundly affect the way qualitative social science scholarship is evaluated. This new approach to transparency should thus have a direct and significant impact on the credibility and legitimacy of qualitative scholarship and its utility for evidence-based policy.