Held in conjunction with ICSE 2005
Abstract Due: 11 February 2005
Papers Due: 21 February 2005
Acceptance Notification: 21 March 2005
Publication Ready: 31 March 2005
Jelber Sayyad Shirabad
School of Information Technology and Engineering,
University of Ottawa
Ottawa, Ontario K1N 6N5
jsayyad AT site DOT uottawa DOT ca
Tim Menzies
Department of Computer Science,
Portland State University
Portland, Oregon 97207-0751
timm AT cs DOT pdx DOT edu
promise AT site DOT uottawa DOT ca
Workshop Program is online
If we understand something well, we can make proper decisions about
it. Software engineering is a decision-intensive discipline. Do we
really understand software engineering? Can we help software engineers
by building models that make explicit the knowledge hidden in various
software resources? Are those models "predictive"; i.e., can they be
used directly to make a prediction regarding some aspect of the
software, without requiring further expert interpretation? Can those
models let us make better decisions? Can we
assess those models? Are those models correct; e.g. do different
researchers working on the same data arrive at similar models? Are
there better, faster, cheaper ways to build those models? These are
the questions addressed by this workshop.
The main theme of this workshop is "issues and challenges surrounding
building predictive software models". Some progress has already been
made in this field. Predictor models already exist for software
development effort and fault injection, as well as co-update or
change predictors, software quality estimators, and software
escalation predictors ("escalation" predictors try to guess which bug
reports will require the attention of senior experts). However, in most
cases these models have been presented in venues that cover a diverse
set of topics.
The goals of the workshop are:

- To bring together researchers and practitioners from various
backgrounds with an interest in building predictive models, with the
aim of sharing experience and expertise.
- To steer discussion and debate on various aspects and issues
related to building predictive software models.
- To initiate the generation of a publicly available repository
of software engineering data sets. We believe such a repository
is essential to the maturity of the field of predictive software
models and of software engineering in general.
- To put together a list of open research questions that are deemed
essential by researchers in the field.

Public Data Policy

PROMISE 2005 GIVES THE HIGHEST PRIORITY TO CASE STUDIES, EXPERIENCE
REPORTS, AND PRESENTED RESULTS THAT ARE BASED ON PUBLICLY AVAILABLE
DATASETS. TO INCREASE THE CHANCE OF ACCEPTANCE, AUTHORS ARE URGED
TO SUBMIT PAPERS THAT USE SUCH DATASETS. DATA CAN COME FROM ANYWHERE,
INCLUDING THE WORKSHOP WEB SITE. SUCH PAPER SUBMISSIONS SHOULD INCLUDE
THE URL ADDRESS OF THE DATASET(S) USED.

A COPY OF THE PUBLIC DATASETS USED IN THE ACCEPTED PAPERS WILL BE POSTED ON
THE PROMISE SOFTWARE ENGINEERING REPOSITORY. THEREFORE,
IF APPLICABLE, THE AUTHORS SHOULD OBTAIN THE NECESSARY PERMISSION TO
DONATE THE DATA BEFORE SUBMITTING THEIR PAPER. ALL DONORS WILL BE
ACKNOWLEDGED ON THE PROMISE REPOSITORY WEB SITE.
The use of publicly available datasets will facilitate the generation of
repeatable, verifiable, refutable, and improvable results, and will give
researchers an opportunity to test and develop their hypotheses, algorithms,
and ideas on a diverse set of software systems. Examples of such datasets
can be found on the workshop web site or the
PROMISE SOFTWARE ENGINEERING REPOSITORY.
We ask all researchers in the field to assist us in
expanding the PROMISE repository by donating their data sets.
For inquiries regarding data donation please send an email to
promise AT site DOT uottawa DOT ca.
Topics of Interest
In line with the above-mentioned goals, the main topics of interest include:
- Applications of predictive models to software engineering data.
What predictive models can be learned from software engineering data?
- Strengths and limitations of predictive models
- Empirical model evaluation techniques.
What are the best baseline models for different classes of predictive software models?
Are existing measures and techniques to evaluate and compare model goodness,
such as precision, recall, error rate, or ROC analysis, adequate for evaluating
software models, or are more specific measures geared toward the software
engineering domain needed?
Are certain measures better suited to certain classes of models?
What are the appropriate techniques to test the generated models, e.g.
hold-out, cross-validation, or chronological splitting?
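As an illustration of the measures named above, the following minimal Python sketch computes precision, recall (probability of detection), probability of false alarm, and error rate for a binary defect predictor. The function names and example data are ours, invented for illustration; they are not part of the call.

```python
# Illustrative sketch: common "goodness" measures for a binary defect
# predictor, computed from a confusion matrix. Labels: 1 = faulty, 0 = clean.
def confusion(actual, predicted):
    tp = sum(a and p for a, p in zip(actual, predicted))          # faulty, flagged
    fp = sum((not a) and p for a, p in zip(actual, predicted))    # clean, flagged
    fn = sum(a and (not p) for a, p in zip(actual, predicted))    # faulty, missed
    tn = sum((not a) and (not p) for a, p in zip(actual, predicted))
    return tp, fp, fn, tn

def measures(actual, predicted):
    tp, fp, fn, tn = confusion(actual, predicted)
    return {
        "precision": tp / (tp + fp),           # flagged modules that are truly faulty
        "recall_pd": tp / (tp + fn),           # probability of detection
        "pf": fp / (fp + tn),                  # probability of false alarm
        "error_rate": (fp + fn) / len(actual)  # overall misclassification rate
    }

# Invented example: 3 faulty and 5 clean modules.
actual    = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
print(measures(actual, predicted))  # pd = 2/3, pf = 0.2
```

A good defect detector scores a high pd with a low pf; reporting only one of the two can hide a trivial model (e.g. "flag everything" has pd = 1 but also pf = 1).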
- Field evaluation challenges and techniques.
What are the best practices for evaluating the generated software models in the real world?
What are the obstacles in the way of field testing a model?
- Model shifting (concept drift).
When does a model need to be replaced?
What are the best approaches to keeping a model in sync with changes in the software?
Which predictive models are more prone to model shift?
- Building models using machine learning, statistical, and other methods.
How do these techniques lend themselves to building predictive software models?
Are some methods better suited to certain classes of models?
How do these algorithms scale up when handling very large amounts of data?
What are the challenges posed by the nature of data stored in software
repositories that make certain techniques less effective than others?
- Cost-benefit analysis of predictive software models.
Is cost-benefit analysis a necessary step in evaluating all predictive models?
What are the requirements for performing a cost-benefit analysis?
What particular costs and benefits should be considered for these models?
- Case studies on building predictive software models

Benchmark Dataset Papers

To encourage data sharing and/or to publicize new and challenging
research directions, a special category of papers will be considered
for inclusion in the workshop. Papers submitted under this category
should include at least the following information:
- The public URL to a new dataset
- Background notes on the domain
- What problem does the data represent?
- What would be gained if the problem were solved?
- A proposed measure of goodness to judge the results; for
instance, a good defect detector has a high probability of detection
and a low probability of false alarm
- A review of current work in the field (e.g. what is wrong with
current solutions, or why has no one solved this problem before?)
- Description of the data format.
The recommended format is the Attribute-Relation File Format (ARFF).
For an example of such a dataset, see Cocomo NASA/Software cost estimation in the PROMISE Software Engineering Repository.
However, if ARFF is not the appropriate format for your data, please
provide a detailed description of your data format in the paper.
A guideline for documenting datasets is available from the UCI Machine Learning Repository.
In ARFF, this descriptive information is placed before the actual data.
If you are using an alternative format that does not support
comments in the dataset, provide this information in a separate file with
extension .desc, and submit the URL of this file.
- Preferably some baseline results
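For readers unfamiliar with ARFF, a minimal dataset looks like this: a comment header, a relation name, attribute declarations, then comma-separated data rows. The relation, attribute names, and values below are hypothetical, chosen only to illustrate the format.

```
% Hypothetical defect dataset, for illustration only.
% Descriptive notes about the domain go in comments like these.
@relation defects
@attribute loc numeric
@attribute cyclomatic_complexity numeric
@attribute defective {yes,no}
@data
120,4,no
873,31,yes
```

Numeric attributes are declared `numeric`; nominal attributes list their possible values in braces.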
Participation in the Workshop

The workshop will accept position papers in the areas of
interest mentioned above. Papers are limited to a maximum of
5 pages, and must be original and previously unpublished. SUBMISSIONS
WHICH INCLUDE EMPIRICAL RESULTS BASED ON PUBLICLY ACCESSIBLE DATASETS
WILL BE GIVEN THE HIGHEST PRIORITY. Authors should
submit a paper abstract prior to paper submission. Each paper will be
reviewed by the program committee in terms of its technical
content, its relevance to the scope of the workshop, and
its ability to stimulate discussion. Authors of accepted position
papers are expected to attend and participate in the workshop.
Papers should conform to the ICSE 2005 Paper Format Instructions.
Accepted file formats are PostScript and PDF.
To submit the abstract of your paper, please follow the
Abstract Submission Instructions
To submit your paper, please follow the
Paper Submission Instructions
Prior to the workshop, the accepted papers will be posted on the
workshop web page, to facilitate a more fruitful discussion during the workshop.
Program Committee

Victor Basili, University of Maryland, US
Gary Boetticher, University of Houston-Clear Lake, US
Lionel Briand, Carleton University, Canada
Bojan Cukic, CSEE, WVU, US
Martin S. Feather, NASA JPL, US
Tim Menzies, Portland State University, US
Allen P. Nikora, NASA JPL, US
Charles Pecheur, Universite Catholique de Louvain, Belgium
Alessandra Russo, Imperial College London, UK
Jelber Sayyad-Shirabad, University of Ottawa, Canada
Eleni Stroulia, University of Alberta, Canada
Marvin Zelkowitz, University of Maryland, US