Class Homework (1)
GEOP – 555
Term – 212
Due date: 21 March 2022
The given Excel sheet is based on the paper by Eltom et al. (2016). The sheet lists the laboratory measurements of 156 samples collected from two locations near Riyadh, Saudi Arabia. A total of 87 measurements and relations are given for each sample.
The aim of this study is to predict depositional environments. According to Eltom et al. (2016), “the interpretation of depositional environments provides important information to understand facies distribution and geometry. The classical approach to interpret depositional environments principally relies on the analysis of lithofacies, biofacies and stratigraphic data, among others. An alternative method, based on geochemical data (chemical element data), is advantageous because it can simply, reproducibly and efficiently interpret and refine the interpretation of the depositional environment of carbonate strata. Here we geochemically analyze and statistically model carbonate samples (n=156) from seven sections of the Arab-D reservoir outcrop analog of central Saudi Arabia, to determine whether the elemental signatures (major, trace and rare earth elements [REEs]) can be effectively used to predict depositional environments.”
In this homework, we shall apply an unsupervised technique (K-means and/or DBSCAN) to cluster the given data into groups, where each group (cluster) represents a lithofacies associated with a different depositional environment.
• Set 62 is the rare earth elements (REE)
• Set 63 is the light rare earth elements (LREE)
• Set 64 is the medium rare earth elements (MREE)
• Set 65 is the heavy rare earth elements (HREE)
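Because sets 62 to 65 are all REE groupings, they may carry overlapping information, which matters when deciding which features to keep. The following is a minimal sketch of one possible screening step: standardize each feature and flag strongly correlated pairs. It is written in plain Python (standard library only) rather than MATLAB, purely for illustration. The function names, the 0.95 threshold, and the toy values are all assumptions and are not taken from the homework sheet.

```python
import math

def zscore(column):
    """Standardize one feature to zero mean and unit (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    if std == 0:
        return [0.0] * n  # a constant feature carries no information
    return [(v - mean) / std for v in column]

def pearson(a, b):
    """Pearson correlation coefficient of two equally long columns."""
    za, zb = zscore(a), zscore(b)
    return sum(x * y for x, y in zip(za, zb)) / len(a)

def redundant_pairs(features, threshold=0.95):
    """List feature pairs whose absolute correlation reaches the (assumed) threshold."""
    names = list(features)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(features[first], features[second])
            if abs(r) >= threshold:
                flagged.append((first, second, round(r, 3)))
    return flagged

# Made-up toy values, NOT taken from the homework sheet.
toy = {
    "LREE": [1.0, 2.0, 3.0, 4.0, 5.0],
    "HREE": [0.5, 0.1, 0.4, 0.2, 0.3],
    "REE": [2.1, 4.0, 6.2, 7.9, 10.1],
}
pairs = redundant_pairs(toy)
print(pairs)
```

A flagged pair is only a hint that one of the two features might be excluded; the final choice, and the reason given for it in the report, still depends on what the features mean geologically.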
Questions and Requirements
1. Study the given data before you start, and decide which features should be excluded from the clustering process.
2. Select a number of features for the clustering process.
3. Apply K-means to cluster the given data.
4. Use an evaluation approach to identify the optimum number of clusters (the elbow method is one example, but you can use other approaches, such as va = evalclusters(data,idx,'CalinskiHarabasz'); in MATLAB).
5. Apply DBSCAN to the same data set using the same features (MATLAB code is provided).
6. Compare the K-means and DBSCAN results.
7. Comment on the results.
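Steps 3 and 4 above can be sketched as follows. This is a minimal illustration in plain Python (standard library only), not the MATLAB workflow the homework expects. It runs Lloyd's K-means for several values of k, records the within-cluster sum of squares (WCSS) used in the elbow plot, and computes the Calinski-Harabasz index from the same quantities. The farthest-first initialization is an assumption made only to keep the sketch deterministic, and the toy points are made up.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_first(points, k):
    """Deterministic start: each new centroid is the point farthest from those chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm; returns (labels, centroids, wcss)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    wcss = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, wcss

# Made-up toy data: three small grids of 20 points each (not the homework data).
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
offsets = [(0.2 * dx, 0.2 * dy) for dx in range(-2, 3) for dy in range(-2, 2)]
points = [[cx + ox, cy + oy] for cx, cy in centers for ox, oy in offsets]
n = len(points)

# Elbow data: WCSS for k = 1..6.
wcss_by_k = {k: kmeans(points, k)[2] for k in range(1, 7)}

# Calinski-Harabasz index: (between-cluster SS / (k - 1)) / (within-cluster SS / (n - k)).
total_ss = wcss_by_k[1]  # with k = 1 the centroid is the global mean
ch_by_k = {k: ((total_ss - w) / (k - 1)) / (w / (n - k))
           for k, w in wcss_by_k.items() if k > 1}
best_k = max(ch_by_k, key=ch_by_k.get)

for k in sorted(wcss_by_k):
    print(k, round(wcss_by_k[k], 2), round(ch_by_k[k], 1) if k in ch_by_k else "n/a")
print("k with the highest Calinski-Harabasz index:", best_k)
```

On real data the features would normally be standardized first, and plotting WCSS against k gives the elbow figure requested in the report. The two criteria do not have to agree, so a disagreement between them is worth discussing rather than hiding.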
The short report should contain at least the following:
1. A list of the excluded features, with the reason(s) for each exclusion
2. A table and/or figure showing the K-means result
3. A description of your final clusters, explaining the main characteristics of each cluster
4. The result of the elbow test (figure and explanation)
5. A table and/or figure showing the DBSCAN result
6. The similarities and differences between the K-means and DBSCAN results
7. Short conclusions
• The text should be free of typos, spelling errors, and grammatical errors.
• Figures should be well formatted and of good resolution.
• Figures should have axis scales, labels, titles, and captions.
• The report should be divided into at least the following sections: “Summary”, “Testing and selecting features”, “K-means result”, “DBSCAN results”, “K-means vs. DBSCAN”, and “Conclusions”.
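For the DBSCAN step and the K-means vs. DBSCAN comparison, the following minimal sketch may help in reading the results. It is plain Python (standard library only) and is not the MATLAB code provided with the homework. The values of eps and min_pts, the toy points, and the nearest-center labelling used as a stand-in for a K-means result are all assumptions chosen for illustration.

```python
import math
from collections import Counter

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

# Made-up toy data: three grids of 20 points plus two isolated points.
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
offsets = [(0.2 * dx, 0.2 * dy) for dx in range(-2, 3) for dy in range(-2, 2)]
points = [(cx + ox, cy + oy) for cx, cy in centers for ox, oy in offsets]
points += [(2.5, 2.0), (10.0, -3.0)]

dbscan_labels = dbscan(points, eps=0.3, min_pts=4)

# Stand-in for a K-means result: every point goes to its nearest center,
# so, unlike DBSCAN, no point can be left unassigned.
kmeans_like_labels = [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
                      for p in points]

# Cross-tabulation: (K-means-like label, DBSCAN label) -> number of samples.
crosstab = Counter(zip(kmeans_like_labels, dbscan_labels))
for pair in sorted(crosstab):
    print(pair, crosstab[pair])
```

A cross-tabulation like this makes one typical difference visible: DBSCAN can label isolated samples as noise, while K-means assigns every sample to some cluster. Cluster numbers themselves are arbitrary in both methods, so only the grouping of samples should be compared.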