Why Annotation for Transparent Inquiry (ATI)?

QDR and Hypothesis have partnered to develop a new approach to achieving transparency in qualitative and multi-method research: Annotation for Transparent Inquiry (ATI). ATI builds on “active citation,” an earlier approach pioneered by Andrew Moravcsik.

Using ATI empowers social scientists to develop data supplements that can be linked directly to digital publications on multiple platforms. An ATI Project includes two sections: a “Data Overview” and a set of digital annotations (potentially linked to underlying data sources). The annotations elucidate the data and analysis on which the publication is based. ATI employs “open annotation,” which allows for the generation, sharing, and discovery of digital annotations across the web (Sanderson et al. 2017). The ATI Instructions offer more information on the ATI Project.

When linked to discrete passages throughout a digitally published book or article, annotations serve as a digital exoskeleton that strengthens the publication. They augment the main text and notes and can obviate the need for block quotations and discursive footnotes, leaving more space for substantive content.

We offer a wide range of examples of ATI here.

Why Do We Need a “New Approach to Transparency”?

Rigorous social science requires open data and materials. Accordingly, where possible, scholars should describe how their data were generated and analyzed, explain how their analysis substantiates their claims and conclusions, and share the data themselves when it is ethical and legal to do so (Miguel et al. 2014).

While openness is relevant for all types of social science, the diversity of data and analytic methods employed in different scholarly traditions requires that different mechanisms be used to achieve that goal. Whichever mechanism is used, however, it must be able to meet three challenges. It must make relevant data and analytic information immediately available in tandem with the particular knowledge claim they were used to generate (proximity); those materials must be FAIR (findable, accessible, interoperable, and reusable; Wilkinson et al. 2016); and the mechanism must be capable of addressing concerns about the ethical and legal complications that constrain openness (protection). ATI facilitates the achievement of openness in publications based on the analysis of qualitative data and meets all three of these challenges.

Quantitative and computational social science analyze numeric data arranged in a matrix and approached as an aggregate body of information. In published work, the analysis is typically summarized in tabular form in the text or appendix (see Figure 1), and the study dataset (and relevant information about its creation) and the do-file used for analysis are customarily provided as supplemental materials.

Figure 1: Quantitative Research – Matrix Data

Qualitative data and analysis work differently. Qualitative data are often richer and more abstract, and are typically analyzed, and used to support claims, individually or in small groups. In this type of analysis, the content of each cited source (e.g., a book, archival document, interview transcript, newspaper article, or video clip) serves as a distinct input. Moreover, data, analysis, and conclusions are densely interwoven across the span of a book or article (see Figure 2).

Figure 2: Qualitative Research – Individual Pieces of Data

For qualitative research, optimizing proximity entails making digital data sources (e.g., archival documents, audio-recordings, interview transcripts, ethnographic field notes) and annotations containing relevant analytic information immediately available. That is, the data and materials should be directly linked to the relevant passage in a digital publication, and accessible from a journal’s web page (i.e., on the publisher’s platform).

The second challenge is rendering data and analytic information as FAIR. Accomplishing this goal involves a set of related technologies and policies, and requires substantial expertise and investment. For instance, the data need to be described with the proper metadata, and digital preservation strategies that assure the usability of data in the long run need to be applied to them. Domain repositories like QDR are the stakeholders best equipped to comply with these standards (and to continue to do so as the standards evolve over time).

Finally, the mechanism employed to achieve openness must address the ethical and legal complications that attend much social science data, in particular protecting human participants and respecting copyright law. In the words of one well-known epigram, the goal should be to make data “as open as possible [but] as closed as necessary” (ERAC 2016, 15). The most promising way to maximize transparency in the face of such constraints is through differential access to the evidentiary base of published articles.

Where Are My Data and Materials?

QDR hosts the ATI Data Overview, the underlying data sources, and a preservation copy of the annotations. As a trusted digital repository, QDR is committed to long-term preservation; to applying appropriate access controls if sharing needs to be restricted; and to maintaining the links between web-based publications and their annotations and data sources (e.g., providing updated versions of data source files when the software designed to access them changes). The same best practices of open science that trusted digital repositories apply to underlying data sources should be applied to annotations (both their data elements and their analytic notes): they should be curated (e.g., metadata that allow effective querying and discovery should be attached) and they should be prepared for long-term preservation. ATI Projects are accessible at QDR as single, searchable data supplements, each with its own DOI.

Open annotations are served from a third-party annotation server and mapped to the text of digital articles on a publisher’s website. Displaying annotations requires client software, and the most widely used such software is provided by Hypothesis. The annotation service is easily enabled on a journal web page via a Hypothesis bookmarklet, a Hypothesis Chrome add-on (which requires no action or participation by journal publishers), or an enabled button on the page. When annotation is enabled, annotated passages in an article are highlighted and annotations are immediately available to anyone viewing the page.
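To make the mapping between an annotation and a passage concrete, the sketch below builds a record using the field names of the W3C Web Annotation Data Model and then locates the annotated passage in a page's text. It is an illustration only: that ATI annotations take exactly this shape is an assumption, and the URL, the sample sentence, the note text, and the helper names make_annotation and locate are all hypothetical. Real annotation clients use more robust (for example, fuzzy) anchoring than the exact string search shown here.

```python
import json


def make_annotation(source_url, exact, prefix, suffix, note):
    """Build a dict shaped like a W3C Web Annotation with a text-quote selector."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        # The body carries the analytic note (hypothetical content here).
        "body": {"type": "TextualBody", "value": note, "format": "text/plain"},
        # The target names the page and the quoted passage within it.
        "target": {
            "source": source_url,
            "selector": {
                "type": "TextQuoteSelector",
                "exact": exact,
                "prefix": prefix,
                "suffix": suffix,
            },
        },
    }


def locate(annotation, page_text):
    """Return (start, end) of the annotated passage in page_text, or None.

    Simplified anchoring: requires prefix + exact + suffix to appear verbatim.
    """
    sel = annotation["target"]["selector"]
    needle = sel["prefix"] + sel["exact"] + sel["suffix"]
    pos = page_text.find(needle)
    if pos == -1:
        return None
    start = pos + len(sel["prefix"])
    return start, start + len(sel["exact"])


# Hypothetical article text and annotation.
page = "The minister resigned in 1994. Officials later disputed the timing."
ann = make_annotation(
    "https://example.org/article",
    exact="resigned in 1994",
    prefix="The minister ",
    suffix=". Officials",
    note="Hypothetical analytic note explaining the source for this claim.",
)
span = locate(ann, page)
print(json.dumps(ann, indent=2))
print(span, page[span[0]:span[1]])
```

Because the selector stores the quoted text together with its surrounding context rather than a position in the page markup, the same record can in principle be resolved wherever the article text appears, which is what lets annotations stored on one server be displayed over articles hosted on another.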

Open annotation thus effectively mobilizes the comparative advantages of article publishers, technology firms that build annotation platforms, and domain repositories that curate, preserve, protect, and distribute FAIR data and materials in the form of open annotations.

ATI has the potential to greatly enhance the clarity with which descriptive and causal claims are made in qualitative research; to transform how transparency is achieved; and, by empowering replication and reproduction, to profoundly change the way qualitative social science scholarship is evaluated. This new approach to transparency should thus have a direct and significant impact on the credibility and legitimacy of qualitative scholarship and its utility for evidence-based policy.