|Word frequency cloud (architects only, responses to all questions) with substantial control panel for filtering at right|
|Word frequency cloud with correlations to 'wanted' highlighted and all occurrences of 'wanted' listed on right|
All of these filtering options end up in a large control panel, which took a bit of juggling to fit on screen. It might have been neater to hide it in drop-down or pop-up menus. However, I think it was important to highlight the current view position within the entire data set.
Mousing over a word highlights the other words that occur in proximity to it and brings up a scrollable list of every occurrence of the highlighted word, each shown in fragmentary context with the five words before and after it.
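For anyone curious how the mouse-over behaviour might work, here is a minimal sketch in Python. It is not the code behind the visualisation shown above: the function names, the simple regular-expression tokeniser and the sample sentence are all assumptions made for illustration. It builds the five-words-either-side fragments for a chosen word and counts which words turn up inside those windows.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into words (a deliberately crude tokeniser)."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_in_context(tokens, keyword, window=5):
    """Return one fragment per occurrence of `keyword`, with up to
    `window` words before and after it."""
    fragments = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            start = max(0, i - window)
            fragments.append(" ".join(tokens[start:i + window + 1]))
    return fragments


def neighbours(tokens, keyword, window=5):
    """Count the words that fall within `window` words of any occurrence
    of `keyword`. Overlapping windows count a word once per window."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            start = max(0, i - window)
            counts.update(tokens[start:i] + tokens[i + 1:i + window + 1])
    return counts


# Hypothetical response text, not taken from the survey data.
sample = "We wanted a house with more light. The clients wanted light and space."
tokens = tokenize(sample)

for fragment in keyword_in_context(tokens, "wanted"):
    print(fragment)

print(neighbours(tokens, "wanted").most_common(3))
```

A real tool would probably also drop stop words such as "a" and "the" before counting neighbours, otherwise they dominate the highlighted correlations.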
An appropriate way to understand and navigate data?
So this is another example of a show-everything-and-zoom-in visualisation. However, the reason I posted it is primarily to make a brief observation about the appropriateness of visualisation techniques for understanding and navigating data. A distinction between understanding and navigation is perhaps important.
In the case of Mitchell Whitelaw's A1 Explorer, the word cloud visualises item titles in the National Archives A1 Series. Titles are generally specific, succinct and considered. The A1 Explorer is a visualisation that reveals some of the topics and relationships in the series, but it is also an interface to the digitised items themselves.
Similarly, a word cloud of a carefully crafted speech, such as Obama's inauguration speech, succinctly reveals some of its themes. It is probable that some speeches are written with word cloud analysis in mind. Political rhetoric noticeably employs frequently repeated, memorable mantras. Of course, as Jodi Dean writes, a word cloud is in many ways a very superficial analysis that ignores sentences, stories and narratives.
A different example, designed specifically for visualisation as a word cloud, was curated by the ABC. To mark Julia Gillard's first year as Prime Minister, it called for the public to submit three words that characterised their perceptions of Gillard, and also of opposition leader Tony Abbott. Not surprisingly, the most frequently submitted words aligned closely with the rhetoric that had been most prominent in the media.
Even if visualising words by themselves is appropriate, a critical challenge for word clouds and similar visualisation techniques is locating the small, hidden items, because they are perhaps the most interesting or important. It might be that quantitative data analysis can only ever take us so far, and that curation is necessary to go beyond it. However, when it comes to big data, quantitative analysis might be our only way in - a starting point for exploration.
Andrew MacKenzie has said that the word clouds were very helpful as a research tool, and that what they revealed supports both his observations during the interviews and his other analysis after them. My feeling is that there was substantial noise because of the nature of the raw survey data. The responses were not carefully crafted like an Obama speech, or even considered like a title or a three-word perception of Gillard. They were spontaneous, and people thought as they spoke. The word cloud doesn't distinguish an initial response from a more considered closing summary remark. It doesn't take account of rambles, tangents or the emphasis placed on particular ideas. That said, the quantitative analysis also ignores any bias the researcher might have had in looking for particular ideas.