Interference 2020: Data viz for democracy

October 9, 2020 | Tara Todras-Whitehill

If you asked 100 people what “democracy” meant, I’d be willing to bet 98 of them would say something like “the ability to vote in fair elections”.

The integrity of elections anywhere and everywhere is of crucial importance, but when we consider the US, the global ramifications of its elections are unparalleled.

But there are some scary boogey monsters out there trying to keep us hiding under the sheets.

Our new generation needs new ways to reinforce democratic ideals.

So, step forward, guys and girls of the Atlantic Council’s DFRLab.

The latest weapon in the armory of those fighting disinformation is the 2020 Interference Tracker.

It’s an interactive data visualization that lets you query allegations of foreign meddling in the 2020 US election and cross-reference them against a timeline.

FOR EXAMPLE: How many allegations of Chinese interference were made before Covid-19 became a thing? This tracker shows us the answer is none. From that, we can start to draw conclusions about the integrity of the allegations.

Other queries show us overwhelming trends of interference. I encourage you to poke around with the tool.
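
To make the China example concrete, here is a minimal sketch of that kind of query in Python. Everything in it is an assumption made for illustration: the `Allegation` record, the `count_before` helper, the placeholder rows, and the cutoff date (the WHO's pandemic declaration on 11 March 2020) are hypothetical and do not come from the tracker or its data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Allegation:
    """One row of a hypothetical allegations table (not the tracker's real schema)."""
    attributed_to: str  # country the allegation names
    made_on: date       # date the allegation was made


def count_before(allegations, country, cutoff):
    """Count allegations naming `country` that were made strictly before `cutoff`."""
    return sum(
        1 for a in allegations
        if a.attributed_to == country and a.made_on < cutoff
    )


# Placeholder rows for illustration only; these are NOT records from the tracker.
SAMPLE = [
    Allegation("Russia", date(2019, 10, 1)),
    Allegation("Iran", date(2019, 11, 15)),
    Allegation("China", date(2020, 4, 20)),
    Allegation("China", date(2020, 8, 7)),
]

# Assumed cutoff: the WHO declared Covid-19 a pandemic on 11 March 2020.
COVID_CUTOFF = date(2020, 3, 11)

china_before = count_before(SAMPLE, "China", COVID_CUTOFF)
print(f"Allegations naming China before the cutoff: {china_before}")
```

Changing the country or the cutoff date is the code equivalent of switching filters in the tool.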

I had the chance this week to sit down with Matthias Stahl, a data visualization designer and bioinformatician, and Emerson Brooking, a resident fellow at the Digital Forensic Research Lab at the Atlantic Council, two members of the team that built Interference 2020.

We talked about their motivations for making the dataviz, what they wished they’d done differently, and what we as visual storytellers can do to help protect democracy.

And here’s how the chat went down…

TTW – To start, if you want to tell me, like what was the impetus for this, why did you decide to do it and what did you hope to achieve with it?

EB – Sure, I can start off – the intent of this project is really to contextualize this foreign interference issue. The fact is that it increasingly dominates U.S. political discourse, which is a big contrast with 2016, but if you get keyed into one of these stories, there wasn’t an easy way to see previous instances of this taking place, or to sort by particular countries.

And also something we really wanted to communicate with this project is that, as the number of these claims increases, there’s sometimes less evidence or credibility behind these accusations. And not only that, but some of these claims receive much more media attention than others. So this is an accounting of foreign interference claims, but it is specifically a study of the American media environment, and how these claims permeate the system – [and] which ones get attention, which ones don’t.

And then, to set the groundwork, something we really wanted to do was, as we took this dataset we were collecting, to ensure that we didn’t just do a world map where Russia was bright red, and maybe China and Iran.

It was very important that whatever design we came up with, users weren’t just landing on a map that would help [or] essentially reinforce all of the perceptions that we wanted to move past.

TTW – Interesting. So how did you decide on this kind of model for it?

MS – So you mean overall, how we did this? Yeah, so the DFRLab [the Digital Forensic Research Lab at the Atlantic Council] contacted me and they already had some [ideas] about how this should look. So for example, they wanted to have a timeline where all these cases were put on, they wanted to have some kind of world map to show the sources. And yeah, I initially got the data and then I played around with it a little bit.

And there was one thing which was, in the beginning, quite a challenge – and that was that a lot of cases are located very [closely] on the timeline. So like you have, I don’t know, a time range of three days, but you have like five or six or seven cases that you need to visualize there. So the first question was how to visualize that when you have a lot of cases clustering together so that it’s still easy and accessible for a user.

And another thing was having the timeline and the map. So that was initially thought to be on two different views or on two different pages of this website. And so I had the idea to just combine that to deliver the full picture because I thought it might be important to see everything on the same page, everything at first sight. And so, yeah, this is how I came to put, like, the timeline on top and then on the bottom the map, and then just connect everything by these links or tails that cross the timeline and then indicate when this event has taken place or when it was attributed, actually.

And then on top of the timeline, I applied this like force-directed graph where all these cases are pulled apart from each other. And initially I didn’t have the circles, these red circles like there are now. I initially had like a flame shape, actually, because I thought that’s quite a nice analogy because it’s like going like a fire around in social media. But it turned out in the discussion with the follow-up team that this was maybe a little bit hard to grasp, a little bit hard to understand.

And I think it was your idea in the end, Emerson, that because I also showed circles and showed different shapes and then Emerson said, well, this looks like the balloons that you see everywhere in an election year. And so we went with that.

So it was, at the beginning, there was like the decision putting the timeline and the map on the same page, linking everything together. And then it was a very iterative process. So I think we had like two or three meetings per week where we showed the new developments and then I got the new input, new ideas, and I revised it and then I presented the new version.
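
The clustering problem Matthias describes, several cases landing within a few days of each other, can be illustrated with a short Python sketch. It is not the tracker's implementation: instead of a force simulation it uses a simpler greedy placement that keeps each circle on its own day and lifts it just far enough to avoid overlapping circles already placed. The function name `stack_cases`, the `radius` parameter, and the sample days are all hypothetical.

```python
import math


def stack_cases(days, radius=0.5):
    """Place one circle per case at x = day, lifting it just enough to avoid overlap.

    Returns a list of (x, y) centres in the same order as `days`.
    """
    min_dist = 2 * radius  # two centres closer than this would overlap
    eps = 1e-9             # tolerance for floating-point comparisons
    placed = []            # (x, y) centres placed so far
    result = [None] * len(days)

    # Work through the cases in chronological order.
    for i in sorted(range(len(days)), key=lambda k: days[k]):
        x = float(days[i])
        # Only circles within two radii horizontally can collide with this one.
        near = [(px, py) for px, py in placed if abs(px - x) < min_dist]
        # Candidate heights: the baseline, or resting exactly on top of a neighbour.
        candidates = [0.0] + [
            py + math.sqrt(min_dist ** 2 - (px - x) ** 2) for px, py in near
        ]

        def is_free(y):
            return all(
                math.hypot(px - x, py - y) >= min_dist - eps for px, py in near
            )

        # The highest candidate is always free, so this never comes up empty.
        y = min(c for c in candidates if is_free(c))
        placed.append((x, y))
        result[i] = (x, y)
    return result


# Hypothetical input: three cases on day 0, then one each on days 1, 2 and 10.
centres = stack_cases([0, 0, 0, 1, 2, 10])
for day, (x, y) in zip([0, 0, 0, 1, 2, 10], centres):
    print(f"day {day}: circle centred at ({x}, {y})")
```

A force simulation trades this exactness for a more organic look, because circles may drift slightly off their true date. The greedy version keeps every case pinned to its day, at the cost of taller stacks when many cases share a date.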

TTW – How long did it take you, from start to finish to do this?

MS – Well, I think in total, for me, it was like four weeks, correct me if I’m wrong, but yeah, that was around four weeks. So for me it was quite tough. So it was really fast, given the fact that I’m coming from academia where everything is quite slow, and you take like two years for a project or something like that – it was a really cool experience, from start to finish in such a short time.

TTW – So what do you think is the narrative that you’re hoping to illustrate with this visualization?

EB – So… As time goes on, especially, I think narratives are illustrated by different filter views.

We have a few presets – I hope that we’ll have more as time goes on – but something that stood out to me, for instance, is this design enables us to juxtapose claims of foreign interference against China with rising Covid-19 cases in the United States.

And there may be – like when I think about having a view, for instance, that’s specifically foreign policy events regarding Iran – [comparing] with these accusations of foreign interference.

We want to communicate that these sorts of claims and discussions don’t take place in a vacuum, that there are other events going on in the US, and that very much dictates the tempo of these sorts of allegations – then how much traffic they receive as well.

But I really like the China view, which is one of the preset filters accessible on the main page, because it shows you that a high volume – that actually no foreign interference allegations were made against China in the US context prior to Covid-19. And that based on the pretty rigorous criteria we put together for what warrants a credible allegation on that scale, these allegations against China are quite low quality. They typically do not have evidence associated with them.

So that’s the sort of story we want to tell.

TTW – I like that a lot. I love that you can see the difference in transparency and the objectivity and credibility scores for each of the different claims.

I think that that’s incredibly smart to help other people assess a credible claim when they’re looking at events.

EB – For our construction of that system and of other systems, regarding the impact of a case like that, that goes all the way back to April, when we first started to conceive of this project. And I felt bad when we brought Matthias in, and he did everything so quickly and so on time – he was by far the easiest part of this process to work with, even with all of our processes in place, [we’d end up] getting data to him much later than we’d intended, because quite frequently, you know, we’d find a bug, say, in the way we were assessing media impact, then we’d have to redo our analysis of like 30 cases.

This happened multiple times, which I think is just part of this process. But it was very difficult to align everything.

TTW – I can only imagine. So do you think that you’re hoping for this tool to provide a data set for academic researchers or as an evidence base for political activists to effect change, or for other journalists? What is the audience that you’re really looking for?

EB – I think this is intended to have the most impact for journalists and policymakers as a reference tool – for understanding and basically providing a comprehensive accounting of these accusations which can so quickly disappear into the sands of time.

For instance, I had forgotten how many of the allegations of foreign interference were made in 2019 right around President Trump’s impeachment hearings, because that feels like nine years ago. This is a good way to put it all on one timeline. And I hope for the academic community, there will be the most interest in the system we developed for assessing these claims and then a debate among others, whether we’ve come up with an appropriate system.

Also figuring out whether our media impact scoring is appropriate and how to add to it, because something else that, for me, was the most interesting research lead, which I think deserves a lot more follow-on assessment, is that the case that received the greatest media impact was a story that Senator Sanders’ Democratic primary campaign had been briefed on Russian efforts to help them.

This story was dropped on a Friday night by the Post, one day before the Nevada caucuses, which Sanders would win quite resoundingly – but that received, just using our methodology, we found 2.3 million social media shares of stories that were linked to that.

We are trying now to add additional dimensions of that impact assessment, like broadcast news coverage and cable news coverage. But we see that it’s probably a similar pattern there. So the big takeaway is that because there was no additional information and nothing else came out of that story – there was really nothing the Sanders campaign could do, for instance, other than say “we don’t support these efforts helping us, but we don’t know what these efforts are” – but it nonetheless seems like it became a significant political story, that crucial week in the Democratic primary contest, which helps lend credence to this idea that the foreign interference allegations disproportionately harm left-leaning causes.

And there’s been a lot of discussions that the Russia debate has intensified in the U.S. – that it’s a new McCarthyism, that’s even sometimes promoted by individuals in the liberal party of the United States. And some of the information here would support that, those concerns that have been voiced by activists in the U.S.

TTW – Do you think that in terms of the stuff that you created, is there anything that you wish that you had done differently or plan to add to? Do you see something right now you’d like to change?

EB – So I’ll go first. But there’s the research and then there’s the design itself. In terms of the research, the biggest thing is finding additional dimensions for media impact, because I’m really interested in charting how the conversation in the U.S. has evolved around this issue.

We have additional ideas because there are so many things you can do with this layout. And if you’ve seen, you know, it’s a contextual data set, you can have those Covid-19 cases. You can do other things with that second Y-axis. Like something I really want to do is come up with – we actually have a tool available where we’ll be able to figure out the usage of “Russian bot” on Twitter, which just generally is a shorthand for the sort of paranoia that sometimes accompanies foreign interference stuff, and you can see the frequency of the use of that kind of pejorative over time, and how it aligns with the frequency of cases.

Something else we’d probably do would be to have a dimension of voter fraud, and search interest using Google Trends to just see, again, whether there’s any correlation between some of these allegations.

TTW – Interesting.

MS – From the design point of view, I think one thing we are adding to this in the future is to share these different narratives that are in the data, so that, for example, you have set some filters and you found some very interesting stuff. So like the one Emerson told us before, we have the Chinese allegations that are only present during the Covid-19 time, that you are easily able to share this across your social networks.

And the second thing that I wanted to do is to add a mobile view so that you have like a vertical timeline for mobile screens, so that it also can be used on mobile screens, that would have been very nice.

TTW – So what impact do you think this disinformation has on politics?

EB – What seems clear to me – we still have a month left before the elections, who knows what happens? What seems clear to me is that since 2018, the biggest political impact of foreign interference has been in the way US media coverage conceives of it, not in the interference itself.

Another dimension that I hope we include more formally is in each of these cases, whenever we know the number of assets and the rough social media reach of these campaigns, we included that information.

***But the bottom line is that many of these pieces that receive hundreds of thousands of social media shares about them generally concern pages that maybe were liked by 200 or 300 Americans. Now, with the benefit of more time and an understanding of the social media landscape in the US, even the Russian efforts in 2016 to run these sockpuppet accounts, it is unclear to me that they had much impact at all.***

What was decisive in 2016 was the different hack and release operations that were conducted by Russian intelligence services. But this sockpuppet/troll farm stuff has always been interesting, but I think it was secondary even then. And that’s much more manifestly clear now. And since this foreign interference issue isn’t going to go away, regardless of who wins the election, we do need to start taking pains to right-size this debate.

So, you know, a huge amount of American political discourse isn’t consumed by talking about, you know, say, Iranian state bureaucrats who [ran] one Facebook page that gets liked by some Americans, because that’s simply too much outlay of attention and resources. And we also see where such focus on this issue can then lead to bad outcomes in U.S. political discourse, like with the George Floyd protests, this would seem to be a perennial temptation to say that the Floyd protests were due in part to the interference of foreign actors.

***But it wasn’t a foreign actor who killed George Floyd, right? So you can cast about for adversaries overseas, and you can pay less attention to your own problems. We want to move beyond that.***

TTW – I mean, I’ve lived abroad now for 15 years, and I can’t tell you the number of times I’ve heard “foreign hands, foreign hands” about things that have happened that I’m covering. It’s an easy way out for sure, even if that may be true in some points. So it’s really helpful to use these kinds of tools.

I have one last question – as visual storytellers and people that are doing this work, what can we do to strengthen this kind of work and to renew the faith and the truth for, as I say, big data versus big anecdote?

MS – Well, I think one very important component of this work is transparency. So how was the methodology to assess all these cases? How were these scores calculated? And we also tried to not put it only in a method section on the web page. It is there in the method section, obviously, but it’s also directly on the visualization where we have this little pop-up, this question mark, where you get all the explanations how they came up with these scores. I think that’s a really important point, that you put the transparency of your methods directly into the visualization so that it’s easily accessible.

TTW – I agree completely. I thought that was such a great way to see that in real time. When I first clicked on it, I was like, oh, that’s very cool. I like that a lot where, you know, those are kind of your points and your data points. I thought that was very, very well done.

EB – Yeah, same thing. In this design, providing as many avenues as possible for readers to get to our methods and explanation as swiftly as possible.

To make nothing opaque, to make everything transparent.

Because I’m reminded of, you know, there have been a few interesting tools in this space or adjacent to this space. But they always like to use machine learning stuff and they can’t explain their methods, basically. I won’t name it, but there is a popular platform for searching Russian and Iranian state media and it is unclear how it works; for a long time, it was also unclear, even in their methods, what they were capturing.

So it then led a number of times to journalists using this tool; they would improperly interpret their findings, so the tool inadvertently was spreading misinformation itself.

And we very much took the opposite approach in trying to develop this and make our stuff as clear as possible, that as a result, it might, we might, have given up some easy press coverage because, you know, there aren’t that many salacious findings in this thing – and we’re not leading with, you know, the Russians are hijacking US politics, which would be a great way to get coverage.

Instead, this is a more sober-minded, I think, a slower burn, but should be a more [valuable] contribution to this field.

MS – And maybe to add on the storytelling side that I think it’s quite important that you don’t only show this visualization to just say “hey, here, this is it, have a look”. But it’s very important to let the users explore this visualization and to find their own stories in there. And it’s much better when you have someone saying, “hey, I have discovered this or that in the data set” than to show actually this or that in the data set.

So this exploration step by users themselves is really important.

EB – Yeah, and this was where I think there was the most sort of clash of cultures because I wanted to give Matthias an opportunity to basically have free rein in this aspect of it, but DC think-tanks have, I think, a more conservative culture when it comes to, well, first off, having visualizations at all – and then how they conceive of telling stories.

So there is an inclination from some to basically have all the methods written up top, like paragraphs and paragraphs of description before you get to the tool. And it was important to push back against that internally for me to ensure that, you know, stuff as simple as all our methods starting out minimized. So when you navigate to the tool, chances are you can see the tips of the balloons under the brief description. Anything we could do to have users navigate to it on their own and experiment on their own.

But that does run counter to a lot of inclinations of older, more conservative institutions, which would have you explain everything before you showed it.

TTW – That’s great. I love that. Thanks so much!
