I propose to build a simple analytic map that gives context to the National Museum of Australia’s digital catalogue and makes it browsable. Beginning with an overview and allowing zooming in to detailed tiles, maps help users locate and navigate data by succinctly visualising complex relationships and structures. Simple analytic charts can provide additional context by further revealing relationships within data sets.
With the current online interface to the vast catalogue it is difficult to know where to begin browsing, impossible to comprehend the whole collection (its scale, structure and so on), and hard to find context for an individual object.
My principle will be to start by viewing everything in a way that reveals structures and relationships, which in turn suggest themes for narrowing the viewing focus and filtering the data set. Once the user is viewing subsets or individual objects, the interface should provide context that locates them within the data set and suggest other related items to browse.
I don’t propose to build an interface such as this because I think it is particularly original, but because I am genuinely interested in exploring the NMA collection myself, and because I am curious to study how visualisation techniques scale.
A vast collection
The NMA collection is vast – both in total items (more than 200,000 objects) and in variety of content. On their website the NMA describes the themes of their collection as Aboriginal and Torres Strait Islander cultures and histories, Australian history and society since 1788 and people's interaction with the Australian environment, which are sufficiently broad to cover just about anything.
NMA's current online catalogue home page
NMA's object record view - often there is little information about the object or the collection it is a part of
Mitchell Whitelaw has been developing visualisations of similarly large and diverse data sets: the National Archives and Flickr Commons. Here ranking helps us find top and bottom items, but unless we have already zoomed into a small subset, it can be difficult to locate middle items. Word clouds that visualise the most frequently used words in object titles are useful for narrowing focus to content themes. Mitchell says that coverage can be between 75% and 95%, but there are outliers that remain invisible. How do you locate these hidden objects?
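To make the coverage question concrete, here is a minimal sketch in Python of how one might measure what share of titles a word cloud's top words reach, and list the titles it misses. The sample titles and the function names are my own placeholders, not NMA data or Mitchell's method.

```python
import re
from collections import Counter

# Hypothetical sample titles; real ones would come from a catalogue export.
titles = [
    "Wooden boomerang",
    "Wooden club",
    "Bark painting",
    "Glass plate negative",
    "Convict love token",
]

def tokenize(title):
    """Return the set of lowercase words in a title."""
    return set(re.findall(r"[a-z]+", title.lower()))

def top_words(titles, n):
    """The n words that appear in the most titles (each title counts once per word)."""
    counts = Counter(word for t in titles for word in tokenize(t))
    return [word for word, _ in counts.most_common(n)]

def coverage(titles, words):
    """Fraction of titles containing at least one of the words, plus the missed titles."""
    vocab = set(words)
    hidden = [t for t in titles if not (tokenize(t) & vocab)]
    if not titles:
        return 0.0, hidden
    return (len(titles) - len(hidden)) / len(titles), hidden

cloud = top_words(titles, 1)
share, hidden = coverage(titles, cloud)
print(cloud, share, hidden)
```

The list of hidden titles is the interesting part: it is exactly the set of outliers that a word cloud alone would never lead a browser to.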
Questions of organisation
I intend to organise browsing and zooming in around questions that I am personally interested in such as:
- Which are the biggest/smallest objects?
- Which are the oldest objects?
- Which objects are there the most of?
- Which are the largest collections?
- Which objects are on exhibition?
- Which objects have never been on exhibition?
- Which objects are the most fragile?
- Which objects are currently the subjects of restoration work?
- Which records are newly added to the catalogue or have been recently updated?
Two data types that I suspect can provide interesting browsing links between collections are object material/s and associated location/s – both are linked from the current online catalogue records, but would be much more useful if they were visual and had an indication of quantity - for example ‘other objects associated with this location: 5’.
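As a rough sketch of how those quantities could be computed, assuming each record carries lists of locations and materials (the field names and sample records below are hypothetical, not the NMA schema):

```python
from collections import defaultdict

# Hypothetical records; the field names are assumptions for illustration.
records = [
    {"id": "A1", "title": "Boomerang", "locations": ["Alice Springs"], "materials": ["wood"]},
    {"id": "A2", "title": "Shield", "locations": ["Alice Springs", "Darwin"], "materials": ["wood", "ochre"]},
    {"id": "A3", "title": "Basket", "locations": ["Darwin"], "materials": ["fibre"]},
]

def build_index(records, field):
    """Map each value of a list-valued field to the ids of the objects that have it."""
    index = defaultdict(set)
    for record in records:
        for value in record.get(field, []):
            index[value].add(record["id"])
    return index

def related_counts(record, index, field):
    """For each value on this record, count the OTHER objects sharing that value."""
    own_id = {record["id"]}
    return {
        value: len(index.get(value, set()) - own_id)
        for value in record.get(field, [])
    }

location_index = build_index(records, "locations")
material_index = build_index(records, "materials")

shield = records[1]
print(related_counts(shield, location_index, "locations"))
print(related_counts(shield, material_index, "materials"))
```

Building the index once and reusing it for every record view keeps each lookup cheap, which matters for a collection of more than 200,000 objects.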
Ultimately I would love to end up with a unique visualisation. However, I don't have anything particular in mind at the moment and am not going to try to think of something arbitrarily. I would like to let visualisations emerge from exploring the data. My plan is to start very simply, with what I have outlined above, and then let the data prompt subsequent questions.
A native of the web
After encouragement from Mitchell, I have decided not to work for most of the semester in Processing, where I am confident I could achieve a well resolved visual interface. Instead I will migrate early to native web formats that I have not previously worked with. This risks a less resolved result, but I will benefit from the significant challenge of learning and plugging together back end technical systems.
So I will need to translate from Processing to HTML5, CSS and JavaScript. Then I will need to ensure the large data set does not crash the browser, which can only work with limited memory. I suspect that I will have to set it up to load dynamically, which will require a MySQL database queried with PHP or Django. I am leaning toward using Django because it is built on Python, which I think I am likely to learn anyway in the future for Rhino 5 or other applications.
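A minimal sketch of the dynamic loading idea, in Python since that is the direction I am leaning. It uses the standard library's SQLite as a stand-in for MySQL so that it runs without a server, and the table name, column names and page size are all assumptions for illustration. The point is only that the browser asks for one page of records at a time rather than the whole catalogue.

```python
import sqlite3

def make_demo_db(n_objects=25):
    """Create an in-memory database with a hypothetical 'objects' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO objects (id, title) VALUES (?, ?)",
        [(i, f"Object {i}") for i in range(1, n_objects + 1)],
    )
    conn.commit()
    return conn

def fetch_page(conn, page, page_size=10):
    """Return one page of records (page numbers start at 0), ordered by id."""
    offset = page * page_size
    rows = conn.execute(
        "SELECT id, title FROM objects ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return [{"id": row[0], "title": row[1]} for row in rows]

conn = make_demo_db()
first_page = fetch_page(conn, 0)
last_page = fetch_page(conn, 2)
print(len(first_page), len(last_page))
```

In a real setup a web framework view would wrap something like fetch_page and return the page as JSON, and the JavaScript front end would request further pages as the user zooms or scrolls.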
Ben Ennis Butler has suggested some clever potential workarounds for interactive web implementations of static visualisations (i.e. visualisations that don't require access to a database and are not redrawn dynamically), which I can fall back on if I get stuck. He did this for the histogram he designed to show the Australian prints collection at the National Gallery of Australia.
Ben Ennis Butler, histogram of Australian prints collection at NGA
This visualisation is exceptionally browsable and well suited to the scale of the collection. I am tempted to do a similar visualisation first as a test of how well it can work for a dataset the scale of the NMA collection.
Show everything
The 'show everything' approach has been advocated by Stamen, as well as Mitchell. The approach is to start with a view of everything and then zoom in and filter to subsets and individual items, facilitating a better comprehension of the scale of the entire data set and the position of an individual item within it and encouraging browsing by showing related items.
Stamen's SFMOMA Artscape does this very well, but only for a collection of 3,500 items.
SFMOMA Artscape by Stamen - zoomed out
SFMOMA Artscape by Stamen - zoomed in
An interface for users
Finally, at the end of this project, if I have a working interface, I would like to do some user testing. Documenting how users explore the data would be a significant outcome that would assist in developing design approaches for future visualisations, both in general terms and specific to the NMA collections.