Mastering Data-Intensive Collaboration through the Synergy of Human and Machine Reasoning (dicoSyn 2012)

A workshop at CSCW 2012

February 12, 2012, Seattle, WA

In Brief


Contemporary collaboration settings are often associated with huge, ever-increasing amounts of multiple types of data, which may vary in terms of relevance, subjectivity and importance, ranging from individual opinions to broadly accepted practices. In such settings, collective sense making is crucial for well-informed decision making. This sense making process may both utilize and provide input to intelligent information analysis tools. Through position papers and interactive discussions, this workshop aims to bring together researchers and practitioners from different scientific fields and research communities to further explore (i) the synergy between human and machine intelligence, and (ii) larger issues surrounding analytical practices and data sharing practices in the above settings.


Workshop themes

This workshop aims to bring together researchers and practitioners from different scientific fields and research communities to exchange experiences and discuss how data-intensive and cognitively complex sense making and decision making can be facilitated and augmented within diverse types of teams. The workshop will offer a venue for targeted discussion on the development and evaluation of innovative services that shift the focus from the mere collection and representation of large-scale information to its meaningful assessment, aggregation and utilization. Of particular interest are approaches that bring together the reasoning capabilities of machines and humans in contemporary collaborative settings. In parallel, the workshop is also interested in the larger issues surrounding analytical practices and data sharing practices in these settings.
The workshop will be structured around a number of main themes (research issues), including: 

  • Innovative approaches to the exploration, delivery and visualization of the pertinent information: Particular challenges relate to (i) the intelligent semantic annotation, structuring and aggregation of voluminous and complex data, (ii) the meaningful analysis and exploitation of data patterns and interrelations, (iii) the capturing of stakeholders’ tacit knowledge about information analysis and problem solving through a social web approach, and (iv) the exploitation of particular user and group characteristics to properly direct or adapt data. Generally speaking, the semantics to be deployed should emerge from a joint consideration of the stakeholders, their actions in the settings under consideration, and the data considered each time.
  • Novel collaboration tools and platforms for handling ill-defined domains: In the settings under consideration, we need appropriate solutions that enable stakeholders to easily create and maintain private or public workspaces, where the most pertinent information about the problem at hand can be gathered, linked, synthesized and assessed. Through such workspaces, stakeholders need to carry out synchronous or asynchronous collaboration to accommodate and elaborate relevant data, get recommendations, identify inconsistencies, spot and repair information gaps, reason about actions, etc.
  • Collaborative sense making of real-world multi-faceted data: The information explosion has led to a need for humans to make judgments on the value and relevance of information at the point of use (very often in a problem solving situation). It has been recognized that sense making activities extend far beyond individuals, as people have to work together to make sense of data that come from heterogeneous information sources. Although individual sense making has been studied since the 1990s, the challenges in collaborative sense making remain, especially given the increasing data intensiveness of the current digital landscape.
  • Novel mechanisms for understanding collaborative patterns and intelligent probing: In the settings under consideration, data sources are associated with various types of information, each of them covering distinct aspects. A systematic way is needed to generate different points of view on this kind of data. We need to help users utilize complex multi-source data in a reasonable way by supporting them in finding relevant information and by providing personalized recommendations. However, the development of effective recommender systems faces challenging issues, particularly in the domain of complex data, such as complex object representations and a lack of information about user preferences.
  • Advances in cloud computing and scalable high-performance data mining for data-intensive collaboration: Recent research in data mining is geared towards the extraction of more semantic information. At the same time, there is strong interest in exploiting cloud infrastructure to adapt and refine computationally expensive algorithms for semantic data mining to new paradigms for distributed computing, such as the MapReduce paradigm (as implemented in frameworks like Mahout). For instance, Mahout may significantly help with grouping similar items, identifying hot topics, assigning items to predefined categories, and recommending important data to diverse stakeholders.


Submission types

The workshop welcomes both short (2-4 pages) and long (8-10 pages) papers describing innovative work in the form of case studies, theoretical work, position statements, novel practices, and work in progress. Submissions should be formatted according to the HCI Archive Format. Contributions will be thoroughly reviewed by the co-organizers and members of the workshop’s PC. Papers to be presented will be selected based on the quality of the work described, their relevance to the workshop’s main themes, and their potential to foster fruitful discussions during the workshop. Depending on the number of submissions, we will consider the possibility of accepting some submissions as posters.
All submissions should be sent to


Important dates

  • Submissions due: December 2, 2011 (extended from November 25, 2011)
  • Notification of Acceptance: December 12, 2011
  • Workshop: February 12, 2012


Workshop format

The workshop will be highly interactive and will engage participants in fruitful discussions on its topics. To this end, accepted papers will first be clustered according to the topics addressed. Each cluster will contain no more than 4 papers, each to be briefly presented by one of its authors (workshop participants will have the full versions of the papers). Authors will be asked to follow a structured template for their presentation, focusing on the problem addressed in their approach, the methodology followed, and the results of their work.
Each workshop session will be coordinated by a facilitator (session chair), who will be an expert on the topics discussed in that session. After the short paper presentations, the facilitator will initiate and coordinate a discussion between the presenters and the audience on issues raised during the presentations, thus forming an “open” panel (round table) on the session’s topics. The facilitator will also have prepared in advance a set of open issues to be discussed. Each workshop session is expected to last about 90 minutes.


Accepted papers


Post-Doctoral Researchers’ Use of Preexisting Data in Cancer Epidemiology Research  
Betsy Rolland & Charlotte P. Lee

CoPExplorer: A Novel Collaboration Platform for Quantitative Proteomics Research in China  
Tun Lu, Xianghua Ding, Xing Huang, Guangmeng Zhai, Zhaocan Chen & Ning Gu

Two Socio-Technical Gaps of Cyberinfrastructure Development and Implementation for Data-Intensive Collaboration and Computational Simulation in Early e-Science Projects in the U.S.  
Kerk F. Kee & Larry D. Browning

Emergent Properties of Data-Driven Technological Development in Science
Stephanie Gokhman

Striving toward ad hocracy in dataland   
Ben Li

On the identification and integration of countermeasures to cope with data intensiveness in collaboration settings  
Manolis Tzagarakis, Spyros Christodoulou & Nikos Karacapilidis

Distributed Data Mining for User Sensemaking in Online Collaborative Spaces  
Ahmad Ammari, Lydia Lau & Vania Dimitrova

Distance Metric Learning for Recommender Systems in Complex Domains  
Natalja Friesen & Stefan Rüping

Enhancing a collaboration space with social media features to support data-intensive collaboration
Wolfgang Prinz



Charlotte Lee comments on the outcomes of the Group Discussion session
Joining efforts with Workshop 2
The poster of the workshop

Workshop co-chairs

Nikos Karacapilidis, University of Patras & CTI, Greece
Lydia Lau, University of Leeds, UK
Charlotte Lee, University of Washington, USA
Stefan Rüping, Fraunhofer IAIS, Germany


Related CSCW 2012 Workshop

There is a complementary workshop at CSCW 2012: W2, Data-Intensive Collaboration in Science and Engineering. Both workshops are driven by the same concern with the "data deluge" and address both social and technological issues. The workshops will be held on successive days, so participants can attend either or both.


To help distinguish the two workshops:

  • The other workshop (W2) will be held on the first workshop day. It has a broad focus on issues of data-intensive collaboration, and leans more toward social and organizational issues.
  • This workshop (W12) will be held on the second workshop day. It is more focused on a particular class of technologies and approaches that involve (i) the synergy between human and machine intelligence, and (ii) larger issues surrounding analytical practices and data sharing practices.


The workshops are being organized independently, but share a common theme. The organizers have coordinated their planning so that participants can attend just one workshop or the other, although participants should find it valuable to attend both. We believe our field and community will benefit from the discussion in both workshops, and we encourage participants to attend both events.


To that end, participants who wish to attend both may submit either separate position papers or a single long (8-10 pages) paper to both workshops. A long paper should address themes from both workshops; participants should submit it separately to each workshop and indicate that it has also been submitted to the other. The paper will be reviewed independently by each workshop committee. Accepted participants will need to register separately for each workshop.


This workshop is organized in the context of the EU Collaborative Project “DICODE - Mastering Data-Intensive Collaboration and Decision” which is co-funded by the European Commission under the contract FP7-ICT-257184.