Monday, September 5, 2011

The analytic map as interface

Proposal for this semester's Master of Digital Design project, which can be followed by the unit tag 8199.

I propose to build a simple analytic map that gives context to the National Museum of Australia’s digital catalogue and makes it easier to browse. Beginning with an overview and allowing zooming in to detailed tiles, maps help users locate and navigate data by succinctly visualising complex relationships and structures. Simple analytic charts can provide additional context by further revealing relationships within data sets.

The current online interface to the vast catalogue makes it difficult to know where to begin browsing, makes it impossible to comprehend the whole collection (its scale, structure and so on), and gives little context for an individual object.

My principles will be to start by viewing everything in a way that reveals structures and relationships, which can suggest themes for narrowing the viewing focus and filtering the data set. Then, once the user is viewing subsets or individual objects, the interface should provide context to locate them within the data set and suggest other related items to browse.

I don’t propose to build an interface such as this because I think it is particularly original, but because I am genuinely interested in exploring the NMA collection myself, and because I am curious to study how visualisation techniques scale.

A vast collection

The NMA collection is vast, both in total items (more than 200,000 objects) and in variety of content. On its website the NMA describes the themes of its collection as Aboriginal and Torres Strait Islander cultures and histories, Australian history and society since 1788, and people's interaction with the Australian environment, which are sufficiently broad to cover just about anything.

NMA's current online catalogue home page
NMA's object record view - often there is little information about the object or the collection it is a part of 
I previously observed that the online catalogue is not curated, and that most objects and collections are not given a contextual description that explains their significance. However, the NMA does have a separate section of the website where recent acquisitions and the highlights of the collection, listed under the three broad themes above, are given significant contextual narrative documentation. Identifying and visualising this subset would be a great mashed-up addition to an interface because it is, in the Museum’s opinion, the most interesting content and, more critically, it is the most completely catalogued. It might therefore also make a useful home/landing page, particularly if the fully zoomed out view of the entire set is not legible.

Mitchell Whitelaw has been developing visualisations of similarly large and diverse data sets: the National Archives and Flickr Commons. Here ranking helps us find the top and bottom items, but unless already zoomed in to a small subset, it can be difficult to locate middle items. Word clouds that visualise the most frequently used words in object titles are useful in narrowing focus on content themes. Mitchell says that coverage can be between 75% and 95%, but there are outliers that are invisible. How do you locate these hidden objects?

Questions of organisation

I intend to organise browsing and zooming in around questions that I am personally interested in, such as:
  • Which are the biggest/smallest objects? 
  • Which are the oldest objects? 
  • Which objects are there the most of? 
  • Which are the largest collections? 
Some questions that I would like to ask, but which I doubt the public data set will have answers for, include:
  • Which objects are on exhibition? 
  • Which objects have never been on exhibition? 
  • Which objects are the most fragile? 
  • Which objects are currently the subjects of restoration work? 
  • Which records are newly added to the catalogue or have been recently updated? 
Finer-grained filtering can be achieved at the intersection of these questions, for example ‘show me old small objects’. I hope that using multiple filters in conjunction will help to find hidden objects.

Two data types that I suspect can provide interesting browsing links between collections are object materials and associated locations. Both are linked from the current online catalogue records, but would be much more useful if they were visual and gave an indication of quantity, for example ‘other objects associated with this location: 5’.
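A quantity indicator like the one described above could be derived by counting records per location. The following Python sketch assumes each record carries a list of associated locations; the sample titles and place names are invented, and the helper names `count_by_location` and `related_count` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Invented sample: (object title, associated locations).
records: List[Tuple[str, List[str]]] = [
    ("Stencil", ["Goulburn"]),
    ("Saddle", ["Goulburn", "Canberra"]),
    ("Photograph", ["Canberra"]),
    ("Letter", ["Goulburn"]),
]


def count_by_location(items: Iterable[Tuple[str, List[str]]]) -> Counter:
    """Count how many objects are associated with each location."""
    counts: Counter = Counter()
    for _title, locations in items:
        # set() so an object listing the same place twice is counted once
        counts.update(set(locations))
    return counts


def related_count(counts: Counter, location: str) -> int:
    """Number of *other* objects at a location, excluding the one being viewed."""
    return max(counts[location] - 1, 0)


counts = count_by_location(records)
print(f"other objects associated with this location: {related_count(counts, 'Goulburn')}")
```

The same counting would apply to materials by swapping the location list for a materials list.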

Ultimately I would love to end up with a unique visualisation. However, I don’t have anything particular in mind at the moment and am not going to try to think of something arbitrarily. I would like to let visualisations emerge from exploring the data. My plan is to start very simply, with what I have outlined above, and then let the data prompt subsequent questions.

A native of the web

After encouragement from Mitchell, I have decided not to work for most of the semester in Processing, where I am confident I could achieve a well resolved visual interface. It would be better to migrate early to native web formats that I have not previously worked with. I risk a less resolved result, but benefit from the significant challenge of learning and plugging together back end technical systems.

So I will need to translate from Processing to HTML5, CSS and JavaScript. Then I will need to ensure the large data set does not crash the browser, which can only work with limited memory. I suspect that I will have to set it up to load data dynamically, which will probably require a database such as MySQL, queried from server-side code written in PHP or with Django. I am leaning toward using Django because it is built on Python, which I think I am likely to learn anyway in the future for Rhino 5 or other applications.
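Loading dynamically usually means asking the server for one slice of the catalogue at a time rather than sending everything to the browser at once. A minimal sketch of that idea in Python, using a plain in-memory list in place of a database; the function names `get_page` and `iter_pages` and the page size are illustrative assumptions, not part of Django or MySQL.

```python
from typing import Any, Dict, Iterator, List, Sequence


def get_page(records: Sequence[Any], offset: int, limit: int = 100) -> Dict[str, Any]:
    """Return one slice of the records plus what a client needs to ask for the next."""
    if offset < 0 or limit <= 0:
        raise ValueError("offset must be >= 0 and limit must be > 0")
    items = list(records[offset:offset + limit])
    next_offset = offset + len(items)
    return {
        "items": items,
        "total": len(records),
        # None signals that there is nothing left to load
        "next_offset": next_offset if next_offset < len(records) else None,
    }


def iter_pages(records: Sequence[Any], limit: int = 100) -> Iterator[List[Any]]:
    """Walk the whole data set one page at a time, as a client would."""
    offset = 0
    while offset is not None:
        page = get_page(records, offset, limit)
        yield page["items"]
        offset = page["next_offset"]


# Stand-in for catalogue records: 250 integers instead of real objects.
catalogue = list(range(250))
print([len(p) for p in iter_pages(catalogue, limit=100)])  # [100, 100, 50]
```

With a real database the slice would become a query with a limit and an offset, so only the requested rows are ever held in memory.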

Ben Ennis Butler has suggested some clever potential workarounds for interactive web implementations of static visualisations (i.e. visualisations that don’t require access to a database and are not redrawn dynamically), which I can fall back on if I get stuck. He did this for the histogram he designed to show the Australian prints collection at the National Gallery of Australia.

Ben Ennis Butler, histogram of Australian prints collection at NGA

This visualisation is exceptionally browsable and well suited to the scale of the collection. I am tempted to do a similar visualisation first as a test of how well it can work for a dataset the scale of the NMA collection.

Show everything

The 'show everything' approach has been advocated by Stamen, as well as Mitchell. The approach is to start with a view of everything and then zoom in and filter to subsets and individual items. This facilitates a better comprehension of the scale of the entire data set and the position of an individual item within it, and encourages browsing by showing related items.

Stamen's SFMOMA Artscape does this very well, but only for a collection of 3,500 items.

SFMOMA Artscape by Stamen - zoomed out
SFMOMA Artscape by Stamen - zoomed in
Because the visualisation is constructed like a map with pre-generated tiles, the interface is slick. However, this setup appears to limit dynamic rearrangement of tiles, leaving the user stuck with the preset ordering by acquisition date and unable to filter to a subset. Searching or following keywords, artists etc allows you to zoom to items one at a time, but you cannot see all the items in a subset next to each other or skip ahead to particular items.
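One way to see why pre-generated tiles resist rearrangement is to sketch how an item might be assigned to a tile. The Python below assumes items are laid out row by row on a square grid in a fixed order and that each tile covers a square block of items. This layout is an assumption made for illustration, not a description of how Artscape is actually built, and the function names are hypothetical.

```python
import math
from typing import Tuple


def grid_side(item_count: int) -> int:
    """Side length of the smallest square grid that holds every item."""
    if item_count < 1:
        raise ValueError("item_count must be at least 1")
    return math.isqrt(item_count - 1) + 1


def tile_for_item(index: int, item_count: int, items_per_tile_side: int) -> Tuple[int, int]:
    """Return the (tile_x, tile_y) containing the item at this position in the ordering."""
    side = grid_side(item_count)
    row, col = divmod(index, side)   # row-major layout
    return (col // items_per_tile_side, row // items_per_tile_side)


# 3,500 items fit on a 60 x 60 grid; with 8 x 8 items per tile:
print(grid_side(3500))               # 60
print(tile_for_item(0, 3500, 8))     # (0, 0)
print(tile_for_item(3499, 3500, 8))  # (2, 7)
```

Under these assumptions an item's tile depends entirely on its index in the ordering, so sorting by a different field or filtering to a subset would change the contents of most tiles and require them to be generated again.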

An interface for users

Finally, at the end of this project, if I have a working interface, I would like to do some user testing. Documenting how users explore the data would be a significant outcome that would assist in developing design approaches to future visualisations, both in general terms and specific to the NMA collections.
