Abstract:
Coastal areas are vulnerable to disasters such as tsunamis, floods, large waves, and hurricanes. Most studies of disasters in
coastal areas rely on surveys of specific areas; few cover an entire country. Applying data analytics to
disaster management is critical to reducing the impact of disasters. This study aims to classify provinces based on disaster events and
disaster preparedness and response capacity in coastal villages through cluster analysis, principal component analysis, and a
combination of principal component analysis and cluster analysis. The study applies data mining techniques to secondary data from Indonesian
official statistics, using the Python Scikit-learn library and Tableau. The units of analysis are the provinces of
Indonesia, an archipelagic country. The cluster analysis yielded an optimal solution of two clusters, with 6 (18%) and 27 (82%) provinces. The
small cluster, named the high-intensity cluster, has a higher intensity of disaster events, preparedness, and response than the large one,
named the low-intensity cluster. The large cluster has a higher percentage of coastal villages (25%) than the small cluster (10%). The results of
the principal component analysis were used to classify regions through geographic heat maps and scatter plots. Combining multiple
principal component analysis and cluster analysis provides an alternative to cluster analysis alone. The combined analysis produced three
clusters with 6 (18%), 10 (30%), and 17 (52%) provinces. However, the cluster model from cluster analysis alone is better than the
model from the combination of principal component analysis and cluster analysis. Therefore, cluster analysis and principal component
analysis may be used independently, and the two methods are complementary for exploring regional classification. The results of this
study suggest that disaster preparedness and response in coastal villages should be improved, especially in provinces with a high
percentage of coastal villages.
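
As an illustration of how an "optimal" number of clusters can be chosen, the sketch below runs k-means for several values of k and keeps the one with the highest mean silhouette score. It is only a minimal sketch under stated assumptions: the abstract does not name the clustering algorithm or the validity criterion, so k-means and the silhouette score are assumptions, and the `provinces` data are hypothetical toy indicator scores, not the study's statistics. The study itself used Scikit-learn; this version is dependency-free so that it runs anywhere.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members (keep it if empty).
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centers[c])
        if updated == centers:  # converged
            break
        centers = updated
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; higher means better-separated clusters."""
    scores = []
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if j != i:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.pop(labels[i], [])
        if not own or not dists:  # singleton cluster or only one cluster
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                            # cohesion
        b = min(sum(d) / len(d) for d in dists.values())   # separation
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: two indicator scores per "province".
provinces = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
             (10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]

scores = {k: silhouette(provinces, kmeans(provinces, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

With real data, the indicators would normally be standardized first, and the same selection loop could be repeated on principal component scores to compare the combined approach with cluster analysis alone.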