We experimented with several machine learning techniques for detecting code smells in Java code.
Code smell severity classification
The following datasets are related to the publication on code smell severity classification (password: AF-MZ-mlcsd-severity-2016):
Code smell detection (binary classification)
We provide several datasets from the work done during the experiments:
The results of the detection of different Advisors: advisor_detection.zip
The metrics extracted from the classes and methods of 74 systems of the Qualitas Corpus: metrics.zip
The manual evaluation we performed and used as a training set:
We highly recommend reading at least one of the papers listed at the end of the page in order to understand how these datasets were created.
Furthermore, a tool that supports the creation of these datasets is now available: WekaNose
We applied machine learning algorithms to datasets that represent source code artifacts (classes and methods) through a large set of metrics. The list and definitions of the metrics used are reported in a separate document.
Download Metric Definitions
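As an illustration of the representation described above, the following minimal sketch treats each class as a row of metric values with a manually assigned smell label, and learns a single-threshold rule (a decision stump) from those rows. It is not the method used in the papers, which rely on Weka algorithms; the stump is swapped in here only because it fits in a few lines. The metric names (`loc`, `wmc`, `atfd`), the `Stump` and `train_stump` helpers, and all data values are assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical metric names, chosen only for illustration; the actual
# metric set is defined in the separate metric definitions document.
METRICS = ("loc", "wmc", "atfd")


@dataclass
class Stump:
    metric: int       # index into METRICS
    threshold: float  # predict "smelly" when the metric value exceeds this

    def predict(self, row):
        return row[self.metric] > self.threshold


def train_stump(rows, labels):
    """Pick the (metric, threshold) pair with the fewest training errors."""
    best, best_errors = None, len(rows) + 1
    for m in range(len(rows[0])):
        values = sorted({row[m] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            candidate = Stump(m, (lo + hi) / 2)
            errors = sum(candidate.predict(r) != y for r, y in zip(rows, labels))
            if errors < best_errors:
                best, best_errors = candidate, errors
    return best


# Toy, made-up data: one row per class, one column per metric.
rows = [(120, 10, 1), (950, 80, 14), (200, 15, 0), (1400, 95, 22), (310, 22, 2)]
labels = [False, True, False, True, False]  # stand-in for the manual evaluation

stump = train_stump(rows, labels)
print(METRICS[stump.metric], stump.threshold, stump.predict((1000, 70, 9)))
```

The real datasets contain many more metrics and instances than this toy table, so a single threshold is unlikely to be sufficient there; the sketch only shows the shape of the input (metric vectors plus labels) and of the output (a predicate over metric values).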
- Arcelli Fontana, Francesca, Marco Zanoni, Alessandro Marino, and Mika V. Mäntylä. 2013. “Code Smell Detection: Towards a Machine Learning-Based Approach.” In Proceedings of the 29th IEEE International Conference on Software Maintenance (ICSM 2013), 396–99. Eindhoven, The Netherlands: IEEE Computer Society. doi:10.1109/ICSM.2013.56.
- Arcelli Fontana, Francesca, Mika V. Mäntylä, and Marco Zanoni. 2015. “Comparing and Experimenting Machine Learning Techniques for Code Smell Detection.” Empirical Software Engineering, June, 1–49. doi:10.1007/s10664-015-9378-4.
- Arcelli Fontana, Francesca, and Marco Zanoni. 2017. “Code Smell Severity Classification Using Machine Learning Techniques.” Knowledge-Based Systems 128. doi:10.1016/j.knosys.2017.04.014.
- Azadi, Umberto, Francesca Arcelli Fontana, and Marco Zanoni. 2018. “Machine Learning Based Code Smell Detection through WekaNose.” In Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings (ICSE ’18), 288–89. New York, NY, USA: ACM. doi:10.1145/3183440.3194974.