Complexity and uncertainty in Digital Humanities projects: a co-design approach around data visualisation

Text originally published with Eveline Wandl-Vogt and Roberto Theron for ProvideDH on 13/07/2019

In the context of the PROVIDEDH project (PROgressive VIsual DEcision-Making in Digital Humanities), and amid the important and still emerging study of uncertainty in DH research (Therón et al., 2018), we organised a day-long, hands-on workshop during the DH2019 conference in Utrecht. As members of the PROVIDEDH team, we ran it together with the Exploration Space team of the ACDH (Austrian Academy of Sciences) and the VisUsal research group of the University of Salamanca.

The session consisted of a series of short presentations on topics such as data visualization in DH, Open Innovation for transdisciplinary research, and uncertainty in the DH research process (slides here), followed by group discussions and iterations to produce a series of paper prototypes and informational canvases reflecting the input of the 11 participants in the workshop.

Photo by Eveline Wandl-Vogt (CC BY 4.0)

To this end, the sequence followed the logic of co-creation and participatory design, enabling shared tasks and discussions around these topics. It also took an open and transparent approach, demonstrating the current status and features of the ProvideDH annotation tool and other designs of the project as a work in progress.

1. Accreditation based on roles

On arriving at the exploration space, participants had to choose three backgrounds that defined them individually, from several options related to transdisciplinary research, in order to generate a personalized badge for the rest of the day. This prompted a first round of introductions, as well as a preliminary identification of their areas of expertise and interest. With a significant majority of roles around communication, experimentation, qualitative and ICT skills, participants were familiar not only with data visualization in DH but also, in some cases, with projects under development in the area.

Photo by Eveline Wandl-Vogt (CC BY 4.0)

2. Identification of data visualization tools for DH and evaluation criteria

The second part of the workshop consisted of a short presentation on data visualization principles and their connection with DH, moving from the basic question of how data visualization can convey information and knowledge through visual encodings to how its purpose can vary: to communicate, investigate or better understand a given phenomenon. Based on the work by VisUsal, it was also important to stress that data visualization needs to be tied to the user and to specific tasks, in this case related to the research process, especially when addressing issues of interaction among users.

This part was followed by a shared discussion of important indicators for evaluating data visualization tools. It followed a co-design sequence in which participants, in small groups, first selected an example of a data visualization tool and afterwards presented it and discussed some of its characteristics. The evaluation criteria derived from a previous survey in which 49 respondents answered diverse questions about complexity and uncertainty in DH, and also from the group discussions at this stage of the workshop. The criteria selected and applied to the three cases were “usability”, “innovation” and “reproducibility”.

Photo by Eveline Wandl-Vogt (CC BY 4.0)

The method in this case was to use an “evaluation” thermometer for discussing each example against these criteria, where participants could explain the rationale behind giving more or less value to each data visualization tool. The tools discussed were of very different types, which made it very challenging to compare them but at the same time raised several ontological considerations about visualization and the humanities. They ranged from the Breve tool from Stanford, for visualizing incomplete and messy data, and Topotime, for developing digital models that merge space and time, to a tool such as a standard digital camera, which captures data in qualitative ways but can also be used for DH-oriented research purposes.

Photo by Eveline Wandl-Vogt (CC BY 4.0)

The comparisons led to a discussion of how the concept of usability varies depending on people’s background and familiarity with coding and programming, while innovation as a concept is open to several interpretations of novelty, value and appropriation. Reproducibility was addressed not only in relation to openness and transparency, but also in relation to science and the need to open up methods and results.

3. Open innovation and mapping data uncertainties with (un)knowledge matrix

The third part of the session started with an introduction to Open Innovation (OI) from the Exploration Space team of the ACDH and how it can relate to concepts of co-creation, and especially to collaborative research practices beyond the academic domain. On the one hand, strategically opening up processes during research and exploration-led innovation can lead to better inclusiveness, access and transparency of results. On the other hand, OI also represents an emerging set of practices for accessing widely distributed knowledge, an approach the workshop itself tried to demonstrate by sharing the development of the ProvideDH platform and testing co-creation materials.

Building on this, the next co-creation iteration used an adaptation of the Johari Window (Luft & Ingham, 1961) to identify the types of uncertainty and knowledge that participants experience when dealing with DH-related data. A large-scale canvas covered four areas of “knowns” and “unknowns”: (1) Assumptions (when users know the data they are looking for, and also have good access to the answer); (2) Gaps (when the answer lies somewhere in the data, but users don’t know how to access or analyse it); (3) Tacit knowledge (when users know the question, and just need to collect or visualize the right data to answer it clearly); and finally (4) Discoveries (an explorative and sometimes playful opportunity to access new data, generating new questions and insights). The answers focused especially on DH data for research; some of them are summarised below:

Photo by Eveline Wandl-Vogt (CC BY 4.0)

Assumptions (“known knowledge”):
“When I work with my clear data about my research on a spreadsheet”
“When I know where to look for specific sources of information based on the context of subject I am working on”
“When the underlying dataset is coherent and detailed enough, and the query results are meaningful”

Gaps (“known unknowns”):
“When I am working with different sources of data and some sources are incomplete, missing or destroyed”
“When I want to answer research questions with metadata produced by others but I don’t understand the framework of the metadata given (e.g. digital heritage) or disagree with it”
“When I treat data visualization as computational and quantitative medium but I’m not sure about the story behind it”
“When I feel that despite gaps in the data, a valid answer can still come out”
“When the answer is not in the data because it is ‘lost in history’ and no one will ever know”
“When I got an idea but is hard to get physical data. How, where…”
“When I want to study a research question but don’t know what data could answer it”
“When I start a project I always feel the need to identify all the gaps in my dataset because I am afraid (uncertain) about how that will affect the way I understand my data”

Tacit knowledge (“unknown knowns”):
“When I search in someone’s data visuals and I detect useful patterns for my own work”
“When I try to map a pre-modern confederacy like the Holy Roman Empire to things like latitude/longitude coordinates or state (political) boundaries”
“When I know a pattern for individual case studies (in the industry) and I look for scaling in quantitative data”
“When I don’t know what possible answers within my data gathered so far”
“When I evaluate results of experiments / surveys to find answers to my expectations”

Discoveries (“unknown unknowns”):
“When I can freely and playfully explore new data with an easy to use tool, comparing stuff”
“When I make (positive) serendipitous encounters”
“When I have got new questions emerging from data but I don’t know where I can answer them (and with what data)”
“When I’m surrounded by external and extensive missing of insights (tragic)”
“When I wonder about perspectives from people different from me that I can’t even imagine in advance”
“When I use disparate sources of data to create a combined visualization”
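As a rough illustration of how the four areas of the canvas relate to one another, the matrix can be sketched as a lookup on two yes/no flags. This is only a minimal sketch: the names `Quadrant` and `quadrant` are hypothetical, the labels are normalised to the usual “known/unknown” pairs (an assumption where the text above words them differently), and what exactly each flag stands for is left open, as it was in the workshop discussion.

```python
from enum import Enum


class Quadrant(Enum):
    """The four areas of the canvas, keyed by their 'known/unknown' label."""
    ASSUMPTIONS = "known knowns"
    GAPS = "known unknowns"
    TACIT_KNOWLEDGE = "unknown knowns"
    DISCOVERIES = "unknown unknowns"


def quadrant(first_known: bool, second_known: bool) -> Quadrant:
    """Return the area whose two-word label matches the two flags.

    The first flag selects 'known'/'unknown' (first word of the label),
    the second selects 'knowns'/'unknowns' (second word of the label).
    """
    first = "known" if first_known else "unknown"
    second = "knowns" if second_known else "unknowns"
    return Quadrant(f"{first} {second}")
```

For example, `quadrant(True, False)` returns `Quadrant.GAPS`, the area where most of the statements listed above were placed.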

4. Uncertainty in DH, ProvideDH annotation tool demo and personas

The last part of the workshop moved from a presentation of key concepts of uncertainty and complexity in DH, based on a new taxonomy resulting from the research of VisUsal and the rest of the ProvideDH partners, to a collective formulation of potential users beyond the academic research domain who could be interested in data visualization in DH, as well as a round of presentations by participants about their own research processes.

Regarding the key concept of uncertainty, and drawing on the real context of the ProvideDH project and a demo processing data from the 1641 Irish Depositions, the discussion centered on the key distinction between types of uncertainty, namely (1) “Ignorance” (lack of context); (2) “Non credibility” (error or bias seen to be entering the system); (3) “Variation” (differing values within a contested category); and (4) “Gaps” (outlier records where information is missing). The discussion, revolving around the ProvideDH approach to uncertainty, raised questions about plausibility and/or credibility as a potential additional category (based on the propagation of uncertainty in a given corpus), as well as about uncertainty in relation to context, metadata, etc.
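To make the four categories more concrete, the following minimal sketch shows one way an annotation could carry an uncertainty type. It is not the data model of the ProvideDH annotation tool: the names `UncertaintyType`, `Annotation` and `count_by_type` are hypothetical, and the sample passages are placeholders invented only to exercise the code.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class UncertaintyType(Enum):
    """The four types of uncertainty discussed, with their short descriptions."""
    IGNORANCE = "lack of context"
    NON_CREDIBILITY = "error or bias seen to be entering the system"
    VARIATION = "differing values within a contested category"
    GAPS = "outlier record where information is missing"


@dataclass
class Annotation:
    """A hypothetical annotation: a passage tagged with one uncertainty type."""
    passage: str
    uncertainty: UncertaintyType
    note: str = ""


def count_by_type(annotations):
    """Count how many annotations fall under each uncertainty type."""
    return Counter(a.uncertainty for a in annotations)


# Placeholder annotations, not taken from any real corpus.
sample = [
    Annotation("placeholder passage A", UncertaintyType.GAPS),
    Annotation("placeholder passage B", UncertaintyType.VARIATION, "two spellings"),
    Annotation("placeholder passage C", UncertaintyType.GAPS),
]
counts = count_by_type(sample)
```

A tally like `counts` is the kind of summary a visualization could then encode, for instance to show which type of uncertainty dominates a given corpus.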

Photo by Eveline Wandl-Vogt (CC BY 4.0)

After the ProvideDH platform demo and discussion, participants worked on a series of “personas”, or potential end-users of data visualization in DH, and their level of uncertainty in relation to it, following the OI perspective of opening up the DH research process to wider audiences (from people interested in culture to artists or policymakers). The results centered on the following personas:

Photo by Eveline Wandl-Vogt (CC BY 4.0)

Geography teacher: works in the field of geographical data and DH and needs to know which existing tools can be used for teaching, for example tools that allow user-generated content. A particular challenge is that uncertainty is not usually well represented in maps, although this is needed for transmitting knowledge to students (for whom the tools should be easy to use).
“Casual” user: here the audience is potentially very large, including any person who visits a space or comes across a source of visual information by serendipity (museum visitors, tourists, pedestrians, generalist social media users, etc.). The experience has to be very immediate, since this type of persona does not have specific tasks to perform. Diverse concepts and approaches have been developed for this type of user in the last 10 years, based on a lot of assumptions, but not specifically on how they could deal with uncertainty regarding data.
Art curator: a type of user who can relate to different types of materials: non-digital, digitized and born-digital. This type of user needs to think about art in a way that could also make uncertainty in DH interesting to them, especially the type of discoveries that can arise when significant gaps or variations are identified.
Policymaker: imagined as a public servant working in the cultural department of a given city council, who would probably have a good background or experience in data visualization for engaging in culture through DH. This persona also needs to focus on diversity issues in the arts, with a preference for local data and representativeness, but could also be a type of user with a low level of certainty regarding visualizations.

In addition to the final presentations by participants of projects such as GeoBlob and WWI or Stereoscope, the workshop was for the organisers an OI experience in itself and a work in progress: apart from the discussions and visualizations, it made it possible to advance the session methodology and detect future improvements, thanks to the involvement and knowledge of all the participants.