ESSeRE Lab Machine Learning for Code Smell Detection

We experimented with several machine learning techniques for detecting code smells in Java code.


Datasets

The following datasets relate to publication [3] (password: AF-MZ-mlcsd-severity-2016):


Datasets

We provide several datasets produced during our experiments:

  • The metrics extracted from the classes and methods of 74 systems of the Qualitas Corpus: metrics.zip
  • The manual evaluation we performed and used as a training set:

We highly recommend reading at least one of the papers listed at the end of the page to understand how these datasets were created.

A tool that supports the creation of these datasets is now also available: WekaNose

Metric definitions

We applied machine learning algorithms to datasets that represent source code artifacts (classes and methods) through a large set of metrics. The list and definitions of the metrics used are reported in a separate document.

Download Metric Definitions
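
To make the idea of representing a code artifact as a vector of metrics more concrete, here is a minimal Python sketch. It is not the setup used in the papers: the metric names (LOC, WMC, number of methods), the values, the labels, and the choice of a k-nearest-neighbours classifier are all assumptions made for illustration only.

```python
from math import sqrt

# Hypothetical training set: each class is a vector (LOC, WMC, number of methods)
# paired with a manual label (True = smelly). All values are invented for
# illustration and are NOT taken from the published datasets.
TRAINING = [
    ((1200.0, 95.0, 48.0), True),
    ((950.0, 80.0, 40.0), True),
    ((150.0, 12.0, 8.0), False),
    ((90.0, 7.0, 5.0), False),
    ((300.0, 20.0, 12.0), False),
]


def column_ranges(rows):
    """Return (min, max) for each metric so that vectors can be scaled."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(row, ranges):
    """Min-max scale one metric vector; a constant column maps to 0."""
    return tuple(
        (value - lo) / (hi - lo) if hi > lo else 0.0
        for value, (lo, hi) in zip(row, ranges)
    )


def distance(a, b):
    """Euclidean distance between two scaled metric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict(sample, training, k=3):
    """Label a metric vector by majority vote of its k nearest neighbours."""
    ranges = column_ranges([features for features, _ in training])
    scaled_sample = scale(sample, ranges)
    nearest = sorted(
        training,
        key=lambda item: distance(scale(item[0], ranges), scaled_sample),
    )[:k]
    votes = [label for _, label in nearest]
    return votes.count(True) > k / 2


if __name__ == "__main__":
    print(predict((1100.0, 90.0, 45.0), TRAINING))
    print(predict((120.0, 10.0, 6.0), TRAINING))
```

The scaling step matters because metrics such as LOC take much larger values than the others and would otherwise dominate the distance. Any other learner could be substituted for `predict`; the point is only that the input is a table of metric values per class or method, plus a label from manual evaluation.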

  1. Arcelli Fontana, Francesca, Marco Zanoni, Alessandro Marino, and Mika V. Mäntylä. 2013. “Code Smell Detection: Towards a Machine Learning-Based Approach.” In Proceedings of the 29th IEEE International Conference on Software Maintenance (ICSM 2013), 396–99. Eindhoven, The Netherlands: IEEE Computer Society. doi:10.1109/ICSM.2013.56.
  2. Arcelli Fontana, Francesca, Mika V. Mäntylä, and Marco Zanoni. 2015. “Comparing and Experimenting Machine Learning Techniques for Code Smell Detection.” Empirical Software Engineering, June, 1–49. doi:10.1007/s10664-015-9378-4.
  3. Arcelli Fontana, Francesca, and Marco Zanoni. 2017. “Code Smell Severity Classification Using Machine Learning Techniques.” Knowledge-Based Systems 128. doi:10.1016/j.knosys.2017.04.014.
  4. Azadi, Umberto, Francesca Arcelli Fontana, and Marco Zanoni. 2018. “Machine Learning Based Code Smell Detection through WekaNose.” In Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings (ICSE ’18), 288–89. New York, NY, USA: ACM. doi:10.1145/3183440.3194974.