Statistical Analysis of The Most Popular Software Service Effort Estimation Datasets

Amid Khatibi Bardsiri, Seyyed Mohsen Hashemi, Mohammadreza Razzazi

Abstract


Given the complex nature of software projects, historical data and past experience are needed to execute them well. In previous years, a large number of software engineering datasets have been introduced for different purposes. An important group among these is the software effort estimation repositories, which serve as a framework for analyzing diverse estimation methods and models. In recent decades, researchers have worked on different types of these datasets for various purposes and have tried to identify the features of each one. DPS, ISBSG, Desharnais, Maxwell, and CF are among the most popular of these datasets. Insufficient or unstructured documentation makes it difficult for researchers to recognize and work with the datasets that suit their purposes. This article performs a thorough statistical analysis of the five most popular software effort estimation datasets, with sufficient explanation to provide researchers with useful information and help them select the appropriate repositories for their particular purposes. It is suggested that the software engineering community should be aware of and account for issues related to software effort datasets when evaluating the validity of research outcomes.

Keywords


software effort estimation; repository; software dataset; statistical analysis


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.

ISSN: 2180-1843

eISSN: 2289-8131