ECML PKDD 2006 Workshop
on
Knowledge Discovery from Data Streams

http://www.machine-learning.eu/iwkdds-2006/

The International Workshop on Knowledge Discovery from Data Streams (IWKDDS-2006) will be held on Monday, September 18th, 2006 in conjunction with the 17th European Conference on Machine Learning (ECML) and the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD) in Berlin, Germany. This workshop is a follow-up to the International Workshops on Knowledge Discovery from Data Streams (IWKDDS) at ECML/PKDD 2004 in Pisa, Italy, and at ECML/PKDD 2005 in Porto, Portugal.

The goal of this workshop is to promote an interdisciplinary forum for researchers who work on sequential, anytime, real-time, and online learning from data streams and on related themes.

Learning from data streams is a growing research area with challenging applications and contributions from fields like databases, machine learning, and visualization. This year the workshop has a special emphasis on learning from sensor networks in ubiquitous environments.

This event will be supported by the European Project KDUbiq-WG3.

Motivation and Goals

The goal of this workshop is to convene researchers who work on decision rules, decision trees, association rules, clustering, filtering, preprocessing, post-processing, feature selection, and visualization techniques for data streams, and on related themes. A special emphasis is on constrained algorithms designed to handle limited bandwidth, limited computing and storage capabilities, limited battery power, and specific network-communication protocols.

Many sources produce data continuously. Examples include customer click streams, telephone records, large sets of web pages, multimedia data, and sets of retail chain transactions. These sources are called data streams. If the process is not strictly stationary (as in most real-world applications), the target concept may gradually change over time. Hence data stream mining is an incremental task that requires incremental learning algorithms that take this drift into account. Data streams are increasingly important in the research community, as new algorithms are needed to process streaming data in reasonable time.

Many researchers from different areas (data mining, machine learning, OLAP, databases, etc.) are designing new approaches or adapting traditional algorithms to data streams. The number of researchers in this field is also growing considerably, and data streams are becoming an established topic at many conferences (ICML, KDD, IJCAI, ICDM, SAC, ECML, etc.).

Structure of the Workshop

The event will be a one-day workshop on Monday, September 18th, 2006 with

  • Invited Speaker:
    Hillol Kargupta, University of Maryland - Baltimore County, Baltimore, MD, USA:
    Peer-to-Peer Distributed Data Stream Mining and Monitoring
    (abstract of the invited talk: see link in the workshop program below)
  • 12 Research Paper Presentations and 1 Poster Presentation
    (see workshop program below)
  • Panel Discussion on Knowledge Discovery in Ubiquitous Environments
    (invited panelists: Hillol Kargupta, Michael May, and more panelists to be announced later)

Topics

A data stream is an ordered sequence of instances that can be read only once or a small number of times using limited computing and storage capabilities. Topics of interest for the workshop include but are not restricted to:

  • Data Stream Models
  • Learning in Ubiquitous Environments
  • Clustering from Data Streams
  • Decision Trees from Data Streams
  • Association Rules from Data Streams
  • Decision Rules from Data Streams
  • Feature Selection from Data Streams
  • Visualization Techniques for Data Streams
  • Incremental Online Learning Algorithms
  • Single-Pass Algorithms
  • Scalable Algorithms
  • Real-Time and Real-World Applications Using Stream Data
  • Constrained Algorithms
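As an illustration of the definition above (each instance read once, with limited storage), the following is a minimal sketch of a single-pass algorithm. It uses Welford's online update for mean and variance, chosen here only as a familiar example and not taken from any workshop contribution. The function name `running_mean_variance` and the sample readings are hypothetical.

```python
from typing import Iterable, Tuple


def running_mean_variance(stream: Iterable[float]) -> Tuple[int, float, float]:
    """Single-pass mean and population variance (Welford's update).

    Each instance is read exactly once, and only three numbers are kept
    in memory regardless of how long the stream is.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / count if count else 0.0
    return count, mean, variance


# A generator stands in for a data stream: it can be consumed only once.
# The values are made-up sample readings.
readings = (float(v) for v in [2, 4, 4, 4, 5, 5, 7, 9])
print(running_mean_variance(readings))
```

Because the state is constant-size, the same update could in principle run on a device with limited memory, which is the setting the constrained-algorithms topic refers to.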

Paper Submission

All submissions will be reviewed by at least two members of the program committee.

Papers must be in English and should be formatted according to the Springer-Verlag Lecture Notes in Artificial Intelligence (LNAI) guidelines:
http://www.springer.de/comp/lncs/authors.html

The maximum length of papers is 10 pages. Papers should be submitted electronically in PDF format by email to all the workshop chairs:

  • jgama@fep.up.pt
  • aguilar@lsi.us.es
  • ralf.klinkenberg@cs.uni-dortmund.de

Important Dates

Date:                   Event:
June 28th, 2006         Original paper submission deadline
July 5th, 2006          Extended paper submission deadline
August 1st, 2006        Notification of acceptance/rejection
August 15th, 2006       Camera-ready copies of accepted papers due
September 18th, 2006    Workshop



Documentation of workshop results beyond ECML/PKDD's publication: The organizers are in contact with an international journal in order to publish extended versions of the best papers in a special issue.

Workshop Program

The workshop will take place on Monday, September 18th, 2006 in room 1070 with the following schedule:

Time:           Talk or Event:

09:00-10:00h    Invited Talk: Hillol Kargupta, University of Maryland - Baltimore County, Baltimore, MD, USA: Peer-to-Peer Distributed Data Stream Mining and Monitoring [abstract]
10:00-10:20h    Jaroszewicz & Ivantysynova & Scheffer: Schema Matching on Streams [PDF]

10:20-11:00h    Coffee Break including Poster Presentation: Patnaik & Sanyal: Structural Analysis of the Web [PDF]

11:00-11:20h    Campo-Ávila & Ramos-Jiménez & Gama & Morales-Bueno: Improving Prediction Accuracy of an Incremental Algorithm Driven by Error Margins [PDF]
11:20-11:40h    Pereira Rodrigues & Gama: Online Prediction of Clustered Streams [PDF]
11:40-12:00h    Calders & Dexters & Goethals: Mining Frequent Items with a Flexible Window in a Stream [PDF]
12:00-12:10h    Katakis & Tsoumakas & Vlahavas: Dynamic Feature Space and Incremental Feature Selection for the Classification of Textual Data Streams [PDF]
12:10-12:20h    Rasmus Pedersen: Hard Real-time Analysis of Two Java-based Kernels for Stream Mining [PDF]

12:20-14:00h    Lunch Break

14:00-14:20h    Hinneburg & Habich & Karnstedt: Analyzing Data Streams by Online DFT [PDF]
14:20-14:40h    Phuong & Washio: High-Order Substate Chain Prediction Based on Massive Sensor Outputs [PDF]
14:40-15:00h    Kakoliris & Blekas: Incremental training of Markov mixture models [PDF]
15:00-15:10h    Ruiz Moreno & Spiliopoulou & Menasalvas: User constraints over data streams [PDF]

15:10-15:30h    Break

15:30-15:50h    Baena-García & Campo-Ávila & Fidalgo-Merino & Bifet & Gavaldà & Morales-Bueno: Early Drift Detection Method [PDF]
15:50-16:00h    Csernel & Clerot & Hébrail: StreamSamp -- DataStream Clustering Over Tilted Windows Through Sampling [PDF]
16:00-17:00h    Panel Discussion: Knowledge Discovery in Ubiquitous Environments; invited panelists: Hillol Kargupta, Michael May, and more panelists to be announced later.

Workshop Proceedings

The electronic workshop proceedings in PDF format can be downloaded from here [4.5 MB].

Workshop Organization

Workshop Chairs:
  • João Gama, LIACC, University of Porto, Portugal;
    jgama@fep.up.pt
  • Jesus S. Aguilar-Ruiz, University of Seville, Spain / University of Pablo de Olavide, Spain;
    aguilar@lsi.us.es
  • Ralf Klinkenberg, University of Dortmund, Germany;
    ralf.klinkenberg@cs.uni-dortmund.de

Workshop Program Committee

  • Michaela Black, University of Ulster, Coleraine, Northern Ireland, UK
  • Andre Carvalho, University of Sao Paulo, Brazil
  • Pedro Domingos, University of Washington, Seattle, WA, USA
  • Francisco Ferrer, University of Seville, Spain
  • Mohamed Gaber, Monash University, Victoria, Australia
  • João Gama, University of Porto, Portugal
  • Ray Hickey, University of Ulster, Coleraine, Northern Ireland, UK
  • Hillol Kargupta, University of Maryland - Baltimore County, Baltimore, MD, USA
  • Ralf Klinkenberg, University of Dortmund, Germany
  • Jeremy Z. Kolter, Georgetown University, Washington, DC, USA / Stanford University, CA, USA
  • Miroslav Kubat, University of Miami, FL, USA
  • Mark Last, Ben-Gurion University, Israel
  • Mark Maloof, Georgetown University, Washington, DC, USA
  • S. Muthu Muthukrishnan, Rutgers University and AT&T Research, USA
  • Masayuki Numao, Osaka University, Japan
  • Pedro Rodrigues, University of Porto, Portugal
  • Josep Roure, Technical University of Catalunya, Spain / Carnegie Mellon University, Pittsburgh, PA, USA
  • Jesus S. Aguilar-Ruiz, University of Seville, Spain / University of Pablo de Olavide, Spain
  • Bernhard Seeger, University of Marburg, Germany
  • Elaine Parros Machado de Sousa, University of Sao Paulo, Brazil
  • Min Wang, IBM Watson Research Center, Hawthorne, NY, USA
  • Wei Wang, University of North Carolina, Chapel Hill, NC, USA
  • Xiaoyang Sean Wang, University of Vermont, Burlington, VT, USA
  • Gerhard Widmer, University of Linz, Austria
  • Philip S. Yu, IBM Watson Research Center, Yorktown Heights, NY, USA

Sponsors

This workshop is supported by the European Project KDUbiq-WG3.

Please report errors on this page to ralf@ralf-klinkenberg.de.