Journal of Information Technology & Software Engineering

Open Access

ISSN: 2165-7866

Abstract

Use of the Multiple Imputation Strategy to Deal with Missing Data in the ISBSG Repository

Abdalla Bala and Alain Abran

Multi-organizational repositories, in particular those based on voluntary data contributions such as the repository of the International Software Benchmarking Standards Group (ISBSG), may be missing a large number of values in many of their data fields and may also contain outliers. This paper identifies a number of data quality issues in the ISBSG repository that can compromise the results obtained by users who rely on it for benchmarking or for building estimation models. We propose a number of criteria and techniques for preprocessing the data to improve the quality of the samples selected for detailed statistical analysis, and we present a multiple imputation (MI) strategy for dealing with datasets that have missing values.
