Abstract:
Water pumping stations play a vital role in citizens' lives: a failure in the pumping schedule
or in the quality of pumping can directly affect the population. Data collected from a water
pumping station can expose weak points in the station's systems, which can then be addressed
using machine learning approaches. In this paper, six decision tree algorithms are
examined to find the one best suited to classifying water pumping station data. The main
goal is to detect sensor faults so that the pumping process can be controlled and future
failures avoided. The six algorithms (J48, REP Tree, Random Forest, Decision Stump,
Hoeffding Tree, and Random Tree) are evaluated before and after applying feature
selection (FS). FS identifies the most correlated sensors and removes the less correlated
ones, which improves the classification accuracy of the algorithms. After FS is applied and
the data from the less correlated sensors are removed, the Random Forest and Random Tree
algorithms reach a classification accuracy of 100%. The model can be used as an assistive tool
for classifying and predicting failures in water pumping stations.
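
The correlation-based feature selection step described above can be illustrated with a minimal sketch. It assumes that "correlation" means the Pearson correlation between each sensor's readings and a binary fault label, and that sensors below a cutoff are dropped; the abstract does not state either detail. The function names, the 0.5 threshold, and the toy readings are hypothetical and are not taken from the paper's data or tooling.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def select_sensors(readings, labels, threshold=0.5):
    """Keep sensors whose absolute correlation with the label meets the threshold.

    The threshold value is an assumption made for illustration only.
    """
    scores = {name: abs(pearson(values, labels)) for name, values in readings.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return kept, scores


# Toy, made-up readings (not data from the paper); label 1 = fault, 0 = normal.
readings = {
    "sensor_a": [1.0, 1.1, 0.9, 3.0, 3.2, 2.9],  # shifts when a fault occurs
    "sensor_b": [5.0, 4.0, 5.0, 4.0, 5.0, 4.0],  # oscillates regardless of label
}
labels = [0, 0, 0, 1, 1, 1]

kept, scores = select_sensors(readings, labels)
print(kept)
```

In this toy case only the sensor that tracks the fault label survives the cutoff. A decision tree classifier would then be trained on the retained columns alone, which is the before/after comparison the abstract reports.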