Assignment Task: K-Means Clustering Exercise
In this exercise, you will use the RStudio interface to run the k-means clustering method. Unlike classification methods, the k-means clustering method groups data instances based on common characteristics; the instances do not have a predefined label or class.
Exercise Instructions:
Part 1: Complete the exercise in the Word document KMeans Clustering Using R.docx using the seeds.csv data. Save seeds.csv to your hard drive, and follow the instructions in the document to load the file into R.
Get familiar with the k-means clustering method, including its input parameters. Note the differences between clustering and classification.
Your output might differ slightly depending on your R and RStudio versions. Overlapping labels on the cluster plot are acceptable, since they are hard to fix. You do not need to write a report for this part.
Part 2: Run an exercise on a vehicle dataset and write a report on your findings, interpreting the results in your own words. The report needs to cover the key points of the exercise below, in order.
Download the vehicle.csv file to your hard drive.
1. Introduction - What do you expect the k-means clustering method to accomplish for the vehicle data?
2. Data pre-processing
Run the set.seed command. Include the command in the report and explain the reason for running it.
Load the data from the vehicle.csv file into R. Create a copy of the vehicle dataset called myvehicle. Include the commands in the report.
Remove the variable class from myvehicle. Include the command in the report, and explain why we remove the class variable.
Run the scale command to scale myvehicle. Include the command in the report, and explain why we scale the data.
Discuss any additional data pre-processing that you run. Include the commands and explain what each command does in the report.
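The pre-processing steps above can be sketched as follows. This is only an illustration under stated assumptions: the built-in iris data stands in for vehicle.csv, its Species column stands in for the class variable, and the seed value 123 is arbitrary. With the real file, the data would be read with read.csv instead.

```r
# Assumption: iris is a stand-in for vehicle.csv, Species for the class column.
set.seed(123)                  # arbitrary seed; makes later random steps repeatable

vehicle   <- iris              # with the real file: vehicle <- read.csv("vehicle.csv")
myvehicle <- vehicle           # work on a copy so the original labels stay available

myvehicle$Species <- NULL      # drop the label column before clustering
myvehicle <- scale(myvehicle)  # centre each column to mean 0, rescale to sd 1

round(colMeans(myvehicle), 10) # column means after scaling
apply(myvehicle, 2, sd)        # column standard deviations after scaling
```

Scaling matters here because k-means relies on Euclidean distance, so a variable measured on a large scale would otherwise dominate the distance calculation.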
3. Run the kmeans method with k=4 and store the output in the variable kc.
Include the command in the report and discuss the input parameters you used.
Enter kc at the command prompt and press Enter. Include the command output in the report and answer the following questions.
How many instances are in each cluster?
What information does the cluster means section of the output provide, and how were the numbers obtained?
What is the clustering vector?
What is the sum of squares by cluster, and what does it mean?
Run the kc$iter command, and explain what the output shows. Include the command, the output, and explanation in the report.
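As a rough guide to where each piece of the printed output lives, the sketch below runs kmeans with k=4 and pulls out the individual components of the result. It again assumes the scaled iris measurements as a stand-in for the scaled myvehicle data, with an arbitrary seed.

```r
# Assumption: scaled iris measurements stand in for the scaled vehicle data.
set.seed(123)
x  <- scale(iris[, 1:4])
kc <- kmeans(x, centers = 4)   # centers = k; other arguments left at their defaults

kc$size      # number of instances in each cluster
kc$centers   # cluster means: average of each scaled variable within a cluster
kc$cluster   # clustering vector: cluster number assigned to each row
kc$withinss  # within-cluster sum of squares, one value per cluster
kc$iter      # number of iterations the algorithm performed

# The cluster means can be reproduced by averaging the rows of one cluster.
colMeans(x[kc$cluster == 1, , drop = FALSE])
```

The last line shows how the numbers in the cluster means section are obtained: each row of kc$centers is the column-wise average of the rows assigned to that cluster.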
4. Clustering evaluation
Build a cross-tabulation to compare the clusters produced by the method with the actual vehicle class. Include the command and the output in the report. Answer the following questions.
What is the dominant vehicle class in each cluster?
What additional information does the table show?
What percentage of vehicles were clustered in agreement with the actual class?
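One possible way to build the cross-tabulation and read the answers off it is sketched below. The iris data and its Species column are stand-ins for the vehicle data and its class variable, and "agreement" is taken here to mean the share of rows that belong to the dominant class of their cluster, which is only one reasonable definition.

```r
# Assumption: iris stands in for the vehicle data, Species for the class column.
set.seed(123)
x  <- scale(iris[, 1:4])
kc <- kmeans(x, centers = 4)

tab <- table(iris$Species, kc$cluster)  # rows: actual class, columns: cluster
tab

# Most frequent actual class within each cluster
dominant <- rownames(tab)[apply(tab, 2, which.max)]
dominant

# Share of rows that fall in the dominant class of their cluster
agreement <- sum(apply(tab, 2, max)) / sum(tab)
round(100 * agreement, 1)
```

The off-diagonal style counts in the table (rows of a class that land outside its dominant cluster) show which classes the method tends to confuse.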
5. Build the cluster plot. Include the command, the plot, and the plot interpretation in the report.
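A minimal sketch of a cluster plot using only base R is given below: the data are projected onto their first two principal components and coloured by cluster. This is an assumption about how the plot could be drawn, not necessarily the function the exercise document uses (the clusplot function from the cluster package draws a similar plot). The iris data again stands in for the vehicle data.

```r
# Assumption: scaled iris measurements stand in for the scaled vehicle data.
set.seed(123)
x  <- scale(iris[, 1:4])
kc <- kmeans(x, centers = 4)

pc <- prcomp(x)                       # principal components of the scaled data
plot(pc$x[, 1:2], col = kc$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2", main = "k-means clusters (k = 4)")

# Project the cluster centres into the same two components and mark them
ctr <- predict(pc, kc$centers)
points(ctr[, 1:2], pch = 8, cex = 2)
```

When interpreting such a plot, well-separated groups of points suggest distinct clusters, while overlapping colours suggest clusters that are hard to tell apart in the first two components.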
6. Experiment with 3 different k values, and summarize the findings in a tabular format.
Explain the effect of the k value on the method's results.
What is an ideal value of k for the vehicle data? (This is an open-ended question)
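The comparison across k values could be collected into a table along these lines. The k values 2, 3, and 5, the nstart setting, and the summary columns are illustrative choices, and the iris data is once more a stand-in for the vehicle data.

```r
# Assumption: scaled iris measurements stand in for the scaled vehicle data.
set.seed(123)
x  <- scale(iris[, 1:4])
ks <- c(2, 3, 5)                         # three illustrative k values

results <- do.call(rbind, lapply(ks, function(k) {
  fit <- kmeans(x, centers = k, nstart = 10)  # 10 random starts, best one kept
  data.frame(k                  = k,
             tot_withinss       = fit$tot.withinss,
             between_over_total = fit$betweenss / fit$totss,
             iterations         = fit$iter)
}))
results
```

Total within-cluster sum of squares generally falls as k grows, so the table is usually read by looking for the point where the improvement starts to level off rather than for the smallest value.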
7. Summary
What differences between k-means clustering and classification methods did you observe?
Which part of this exercise did you find the most challenging, and what approach did you take to resolve the challenge?
Submit the following: the report addressing the key points above, and an R script containing the commands you ran with brief comments on the purpose of each command.
Attachment: Exercise-K-Means Clustering.rar