Dr. Timothy Cribbin
BSc (Hons), MSc, PhD, PGCert

Lecturer
Department of Computer Science

Brunel University
UK

Office: +44(0)1895 266046  
Email: timothydotcribbinatbruneldotacdotuk (decode if you're human)

 

Quick links for searchers...

ScienceDirect | ACM Digital Library | IEEExplore | ISI WOK | Arnetminer

Google Scholar

Last changes published on 25th June 2014

Brief CV

2001-present : Lecturer, Department of Computer Science, Brunel University
2007 : PGCert (Brunel) Learning and Teaching in Higher Education
2005 : PhD (Brunel) "Classifying complex topics using spatial-semantic document visualization: an evaluation of an interaction model to support open-ended search tasks" BURA
2000-2001 : LIC funded research assistant at DISC, Brunel University (PI: Dr Chaomei Chen)
1998-2000 : EPSRC funded research assistant at Psychology Institute, Aston University (PI: Dr Stephen Westerman)
1996 : MSc (Hull) Industrial Psychology, awarded Tom Hoyes memorial prize for dissertation (DOI)
1994 : BSc (Portsmouth) Psychology
 

Research Interests

Visual text analytics, social media analysis, bibliometric analysis, search user interfaces, information visualisation, human-computer interaction

My main interest is in the field of visual text analytics (VTA). VTA is a sub-class of data mining that aims to create visualizations of the semantic structure that lies latent within unstructured or semi-structured collections of textual data. Most of my work has focused on the spatial-semantic or spatialization approach. Spatialization invokes a spatial-semantic (distance-similarity) metaphor to produce point maps or node-link graphs that summarise the general semantic relationships between documents in a corpus. Spatializations provide the searcher or analyst with both a thematic overview of the corpus and an intuitive context in which to search and explore.

Recent projects have investigated ways to optimise the cognitive plausibility (validity) of spatial-semantic structures as well as to improve the efficiency/scalability of content-based similarity computation. My earlier work explored the usability of spatializations for a variety of information seeking tasks (e.g. see Westerman and Cribbin, 2000; Cribbin and Chen, 2001; Chen et al., 2002).

I use a 'bag of words' vector space model approach as the basis for document similarity modelling. There are many ways to improve the structural properties of both the (dis)similarity matrix and the resulting spatial layout. A particularly effective method for improving the similarity matrix is second-order similarity analysis (SOSA). SOSA transforms the first-order (i.e. term overlap) document similarity matrix to one of mutual neighbourhood (second-order) similarities. This transformation uncovers latent relationships that are not detected by first-order similarity metrics. The more near neighbours two documents share, the more likely they are to be about the same topic, which is reflected in the SOS coefficient. Recent experimental results (Cribbin, 2011) showed that SOSA can produce significantly better topic clustering than latent semantic analysis (LSA) whilst being simpler to apply because it is parameter free. A drawback of SOSA is that the run-time grows cubically with N, the number of documents. However, my experiments also showed that it is possible to reduce run-times significantly, without harming similarity measures, by truncating the similarity vectors prior to matrix multiplication.
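The idea behind SOSA can be illustrated with a small Python sketch. This is not the implementation used in Cribbin (2011): it assumes cosine similarity at both levels, keeps self-similarity on the diagonal, and uses a simple top-k cut as the truncation step. The function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def first_order(docs):
    """Term-overlap similarity: one sparse row of similarities per document."""
    bags = [Counter(d.lower().split()) for d in docs]
    n = len(bags)
    return [{j: cosine(bags[i], bags[j]) for j in range(n)} for i in range(n)]


def truncate(row, k):
    """Keep only the k largest non-zero entries of a similarity row (assumed scheme)."""
    top = sorted(row.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return {j: s for j, s in top if s > 0}


def second_order(sim_rows, k=None):
    """Compare documents by their similarity profiles rather than their terms."""
    rows = [truncate(r, k) if k else dict(r) for r in sim_rows]
    n = len(rows)
    return [[cosine(rows[i], rows[j]) for j in range(n)] for i in range(n)]


# Documents 0 and 2 share no terms, but both are similar to document 1.
docs = ["graph layout", "layout algorithm", "algorithm complexity"]
fo = first_order(docs)
so = second_order(fo)        # full similarity vectors
so_fast = second_order(fo, k=2)  # truncated similarity vectors
```

In this toy corpus the first and third documents have a first-order similarity of zero, yet their second-order similarity is positive because they share a near neighbour. Truncating each row before the comparison reduces the work per pair, which is the kind of saving the paragraph above refers to.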

Regardless of the quality of the similarity matrix, projecting these complex, high-dimensional spaces onto a 2D plane is a difficult problem. My work has focused on graph-theoretic approaches, which help to deal with the non-linearity and violation of metric assumptions associated with these spaces. I have observed good results using the Isomap method, in which the original dissimilarities are transformed to geodesic distances. These distances are estimated by computing shortest paths within a neighbourhood graph of some kind. Traditionally this might be a k-nn graph, although this presents the problem of selecting the k-parameter - too large and the graph will contain disruptive "short-circuits"; too small and the graph becomes disconnected. My work has shown that minimum-spanning trees (MST) or minimal (q=N-1) pathfinder networks tend to produce equally good spatial-semantic results without the need to tune any parameters (Cribbin, 2006, 2010).  
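The geodesic transformation over a minimum spanning tree can be sketched as follows. This is a minimal illustration under stated assumptions, not the code behind the cited papers: it builds the tree with Prim's algorithm on a dense dissimilarity matrix and then sums edge weights along the unique tree path between each pair of nodes. The function names and the example matrix are hypothetical, and the final layout step (e.g. applying multidimensional scaling to the geodesic matrix) is not shown.

```python
from collections import deque


def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense, symmetric dissimilarity matrix.

    Returns an adjacency list {node: [(neighbour, weight), ...]}.
    """
    n = len(dist)
    tree = {i: [] for i in range(n)}
    # For every node outside the tree: (cheapest link weight, tree endpoint).
    best = {j: (dist[0][j], 0) for j in range(1, n)}
    while best:
        j = min(best, key=lambda node: best[node][0])
        weight, parent = best.pop(j)
        tree[parent].append((j, weight))
        tree[j].append((parent, weight))
        for m in best:
            if dist[j][m] < best[m][0]:
                best[m] = (dist[j][m], j)
    return tree


def geodesic_distances(tree):
    """Path lengths between all node pairs; paths in a tree are unique."""
    n = len(tree)
    geo = [[0.0] * n for _ in range(n)]
    for source in range(n):
        seen = {source}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour, weight in tree[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    geo[source][neighbour] = geo[source][node] + weight
                    queue.append(neighbour)
    return geo


# Three documents: the direct 0-2 dissimilarity (1.5) is replaced by the
# path through document 1 (1.0 + 1.0).
dist = [
    [0.0, 1.0, 1.5],
    [1.0, 0.0, 1.0],
    [1.5, 1.0, 0.0],
]
geo = geodesic_distances(minimum_spanning_tree(dist))
```

Because the tree is built directly from the dissimilarities, there is no neighbourhood size to choose, and the resulting graph is connected by construction.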

In my PhD thesis (Cribbin, 2006), I also proposed two novel interactive techniques (see image) to support visual navigation and exploration. Concept signposts are contextually relevant key words that are used to dynamically label neighbours of a selected document. The idea is that while the spatial-semantic structure tells the user which documents are neighbours, signposts explain why they are related. Concept pulses, on the other hand, allow the user to see quickly how locally salient words and phrases are distributed more globally across the document map by dynamically inflating then deflating document nodes according to their degree of match.

MST spatialization, incorporating semantic signposts and concept pulses

 

A second, current line of my research is Citation Chain Aggregation (CCA: Cribbin, 2011). CCA is an interaction model I developed to support search and analysis tasks within citation networks. Traditionally, citation chaining activity (footnote chasing and citation searching) is conducted within page-based hypertext interfaces. This results in the focus and context problem, whereby the user is attempting to gain an overview of a complex network of citations (i.e. to find relevant items and determine their relations and relative importance) but can only see a small part of that network at any one time. CCA attempts to solve this problem by means of a three-list view, which displays the aggregation of first-order citation chains (cited<-article<-citing) surrounding a set or 'pearl' of known relevant articles (see below). As more items are added to the pearl, differences in the incidence of overlap between their cited and citing articles provide a form of relevance feedback, drastically reducing the size of the search space and avoiding the need to 'navigate', node by node, through the network. See the paper and poster for a more detailed explanation of the concept. You can try CCA for yourself by downloading Oyster search here.
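The aggregation step can be sketched in a few lines of Python. This is an illustration only and does not describe how Oyster search is implemented: it assumes the citation network is available as a mapping from each article to the set of articles it references, and it ranks the cited and citing lists by how many pearl articles each item is linked to. The function name and article IDs are hypothetical.

```python
from collections import Counter


def aggregate_chains(pearl, cites):
    """Build the cited and citing lists around a pearl of known articles.

    `cites` maps each article ID to the set of article IDs it references.
    Each list is ranked by the number of pearl articles the item is linked to.
    """
    pearl = set(pearl)
    cited = Counter()
    citing = Counter()
    # Left-hand list: everything referenced by at least one pearl article.
    for article in pearl:
        cited.update(cites.get(article, set()) - pearl)
    # Right-hand list: everything that references at least one pearl article.
    for article, references in cites.items():
        if article not in pearl:
            overlap = len(references & pearl)
            if overlap:
                citing[article] = overlap
    return cited.most_common(), citing.most_common()


# Toy network: A and B form the pearl.
cites = {
    "A": {"X", "Y"},
    "B": {"Y", "Z"},
    "C": {"A", "B"},
    "D": {"A"},
}
cited, citing = aggregate_chains(["A", "B"], cites)
```

Here Y is referenced by both pearl articles and C cites both of them, so each rises to the top of its list. Adding further articles to the pearl changes these overlap counts, which is the relevance feedback effect described above.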

Oyster CCA tool

I have recently begun to explore the application potential of VTA, including sentiment analysis, as a means of supporting the analysis of social media data (e.g. Tweets, blogs, forums etc.). Our Chorus tool suite comprises Tweetcatcher, for managing data collection, and Tweetvis, for analysing the retrieved datasets. So far, the focus has been within the domain of health informatics, looking for example at public perceptions and responses to health crises such as the 2011 E. coli outbreak (see the foodRisC project) and recent/current H1N1 flu pandemics. We are also interested in the potential of social media as a source of information to support the evaluation of medical devices (see the MATCH project). A key long-term goal here is to develop tools for social (and mainstream) media analysis that enable social scientists to readily leverage the powerful methods developed by the text mining and information visualization communities within the last two decades. Progress can be followed on the Chorus site and a video introduction can be found here.

Tweetvis Timeline Explorer | Tweetvis Cluster Explorer

I currently belong to the Centre for Intelligent Data Analysis (CIDA) within the Department of Computer Science. I was previously a co-investigator on the MATCH project. If you are interested in reading for a PhD in any of the topic areas mentioned above, please email me using the address at the top of the page.


Publication history

Brooker, P, Barnett, J, Cribbin, T, Lang, A & Martin, J (2013). Locating and Analysing Twitter Conversation About Cystic Fibrosis Without Keywords. In: SAGE Research Methods Cases. London: Sage Publications Ltd. DOI
de Folter, J & Cribbin, T (2012). Facilitating insight into a simulation model using visualization and dynamic model previews. Journal of Visual Languages and Computing, 23(6), 344-353. DOI BURA
Cribbin, T (2011). Citation Chain Aggregation: an interaction model to support citation cycling. In the proceedings of the 20th ACM International Conference on Information and Knowledge Management (CIKM '11), Glasgow (October 24-28, 2011). DOI BURA

Cribbin, T (2011). Discovering latent topical structure by second-order similarity analysis. Journal of the American Society for Information Science and Technology, 62(6), 1188-1207. DOI BURA

Westerman, S.J., Cribbin, T. & Collins, J. (2010). Human assessments of document similarity. Journal of the American Society for Information Science and Technology, 61(8), 1535-1542. DOI BURA (post-print)

Cribbin, T. (2010). Visualising the structure of document search results: a comparison of graph theoretic approaches. Information Visualization, 9(2), 83-97. DOI

Cribbin, T. (2006). Classifying complex topics using spatial-semantic document visualization: an evaluation of an interaction model to support open-ended search tasks. Doctoral dissertation, Brunel University, Uxbridge, UK. BURA

Westerman, S. J., Collins, J., & Cribbin, T. (2005). Browsing a document collection represented in two- and three-dimensional virtual information spaces. International Journal of Human-Computer Studies, 62(6), 713-736. DOI

Morar, S. S., Macredie, R., & Cribbin, T. (2002). An investigation of visual cues used to create and support frames of reference and visual search tasks in desktop virtual environments. Virtual Reality, 6(3), 140-150. BURA DOI 

Chen, C., Cribbin, T., Kuljis, J., & Macredie, R. (2002). Footprints of Information Foragers: Behaviour Semantics of Visual Exploration. International Journal of Human-Computer Studies, 57(2), 139-163. DOI BURA (post-print)

Chen, C., Cribbin, T., Morar, S. S., & Macredie, R. (2002). Visualizing and Tracking the Growth of Competing Paradigms: Two Case Studies. Journal of the American Society for Information Science and Technology, 53(8), 678-689. DOI

Westerman, S. J., Cribbin, T., & Wilson, R. (2001). Virtual information space navigation: Evaluating the use of head tracking. Behaviour and Information Technology, 20(6), 419-426. DOI

Morar, S. S., Macredie, R., & Cribbin, T. (2001). Perceiving depth in desktop virtual environments: Effects of motion parallax and object placement. Paper presented at  INTERACT 2001, Tokyo, Japan.

Morar, S. S., Macredie, R. D., & Cribbin, T. (2001). A Study of the Relative Importance of Visual Cues in Desktop Virtual Environments. Paper presented at HCI International 2001, New Orleans, USA.  

Cribbin, T., & Chen, C. (2001, 5-10 August). A study of navigation strategies in spatial-semantic visualisations. Paper presented at HCI International 2001, New Orleans, USA. BURA (post-print)

Cribbin, T., & Chen, C. (2001, 9-13 July). Exploring Cognitive Issues in Visual Information Retrieval. Paper presented at the Eighth IFIP TC.13 Conference on Human-Computer Interaction, INTERACT 2001, Tokyo, Japan. BURA (pre-print)

Cribbin, T., & Chen, C. (2001, January 21-26). Visual-Spatial Exploration of Thematic Spaces: A Comparative Study of Three Visualisation Models. Paper presented at Electronic Imaging 2001: Visual Data Exploration and Analysis VIII, San Jose, CA. PDF (post-print)

Chen, C., & Cribbin, T. (2001). Visualising and animating visual information foraging in context. Paper presented at HCI International 2001, New Orleans. BURA (post-print)

Westerman, S. J., & Cribbin, T. (2000). Cognitive ability and information retrieval: When less is more. Virtual Reality, 5(1), 1-7. DOI

Westerman, S. J., & Cribbin, T. (2000). Mapping semantic information in virtual space: Dimensions, variance, and individual differences. International Journal of Human-Computer Studies, 53(5), 765-788. DOI

Westerman, S. J., & Cribbin, T. (1999). Navigating Virtual Information Spaces: Individual Differences in Cognitive Maps. Paper presented at UK Virtual Reality Special Interest Group Conference, Salford, England.

Cribbin, T., & Westerman, S. J. (1999, August 30-September 3). Spatial Data Management Systems: Mapping Semantic Distance. Paper presented at INTERACT 99, IFIP TC.13 International Conference on Human-Computer Interaction, Edinburgh, Scotland.

Cribbin, T. (1999, August 30-September 3). Spatial Data Management Systems: Human Factors Perspectives. Paper presented at INTERACT 99, IFIP TC.13 International Conference on Human-Computer Interaction, Edinburgh, Scotland.

Westerman, S. J., & Cribbin, T. (1998). Individual differences in the use of depth cues: Implications for computer- and video-based tasks. Acta Psychologica, 99(3), 293-310. DOI

 
Useful Infovis sites...

Infovis Wiki - a new shared space or "community platform", designed to bring together views, news and other information from the length and breadth of the IV community. Regular updates make this a resource worth bookmarking.

Infovis.net - an online magazine which publishes regular articles and tutorials on key topics in the field and a "Who's Who" directory of key individuals.

An Atlas of Cyberspace - a comprehensive classification of a wide range of solutions to visualizing the content and structure of information spaces. No longer updated but still a compelling read.

University of Maryland HCI laboratory - an impressive archive of past and present projects that have explored and proposed IV solutions to popular problems.

My Infovis Links - part of my old web-site. Contains many useful links, but some may be broken now. Please let me know if you find any.

Useful IR and Text Mining sites...

BCS IRSG group - IR special interest group of the British Computer Society. Links to upcoming events and the Informer newsletter.

Information Retrieval Facility - a non-profit research organisation providing services to information retrieval, including reference corpora and a super-computing infrastructure.

Lucene - Apache Lucene(TM) is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

Terrier - Terrier is a highly flexible, efficient, and effective open source search engine, readily deployable on large-scale collections of documents. Research can easily be carried out on standard TREC and CLEF test collections. Terrier is written in Java, and is developed at the School of Computing Science, University of Glasgow.

NaCTeM - The National Centre for Text Mining (NaCTeM) is the first publicly-funded text mining centre in the world. We provide text mining services in response to the requirements of the UK academic community.

GATE - General Architecture for Text Engineering. University of Sheffield. Provides open-source software and a research network.

Top 10 IV Journals*

1. IEEE Transactions on Visualization and Computer Graphics (165 cites, IF = 2.45)
2= IEEE Computer Graphics and Applications (68 cites, IF = 1.89)
2= Information Visualization (68 cites, IF = n/a)
4. Communications of the ACM (66 cites, IF = 2.65)
5. ACM Transactions on Graphics (62 cites, IF = 3.38)
6. Journal of the American Society for Information Science and Technology (44 cites, IF = 1.95)
7. International Journal of Human-Computer Studies (35 cites, IF = 1.77)
8= IEEE Transactions on Software Engineering (28 cites, IF = 3.57)
8= ACM Transactions on Computer Human Interaction (28 cites, IF = n/a)
8= Journal of Visual Languages and Computing (28 cites, IF = 0.86)

Click here to see lists of the top 10 most cited books, journal and conference papers in IV.

*Based on citations made by papers retrieved from ISI WOK that were published in the period 2006-9 and contained the phrases "information visualization" or "information visualisation". Impact factor (IF) based on 2008 ISI data.

Useful Software

My Datasets and Programs

Chorus - a tool suite for harvesting and exploring Twitter datasets. Free to download and use.

Oyster search - a simple tool implementing the CCA concept. Free to download and use.

I have now made datasets used in Cribbin (2010) available for other researchers to use. In due course, I will make other datasets and software available as well. Click here to go to this micro-site.
 

Third Party Tools

Citespace - a freely available Java application for analyzing and visualizing scientific literature. Written and maintained by Chaomei Chen.
Graphviz - open-source software for visualizing graphs and networks
Infovis Toolkit - "An Interactive Graphics Toolkit written in Java to ease the development of Information Visualization applications and components"
Prefuse - "A Java-based toolkit for building interactive information visualization applications"
Piccolo - "a toolkit that supports the development of 2D structured graphics programs, in general, and Zoomable User Interfaces (ZUIs), in particular"
Protovis - an open source toolkit based on JS and SVG: "Protovis is a graphical toolkit, designed for visualization. It retains some of the conceptual simplicity and low-level control of graphical systems by dealing directly with graphical elements (shapes, lines, i.e., marks), but specifies marks declaratively as encodings of data"
KDNuggets - long list of links to commercial and free visualisation software

 

New Links