K-Means is a simple clustering algorithm for unsupervised learning. Because AIMA Python does not include an implementation of this algorithm, nor does the book provide a description of it, you will use a more mainstream implementation supported by the SciPy library.
Do the following exercises with K-Means clustering.
Download the following sample code, which implements a one-dimensional K-means clustering problem: lab1a.py. Make sure that you can answer the following questions:
Download this two-dimensional clustering problem: lab1b.py and, again, make sure that you can answer the following questions:
You can find a more detailed explanation of K-Means elsewhere: k-means clustering; Thrun’s lecture on Unsupervised Learning.
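As a rough orientation before reading the lab files, here is a minimal sketch of two-dimensional K-Means using SciPy's scipy.cluster.vq module. It assumes that module is the SciPy implementation in question; the lab files may be organized differently. The cluster centers, spreads, point counts and variable names are made up for illustration.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Two well-separated 2-D clusters (illustrative values, not the lab's data).
cluster_a = 0.5 * rng.standard_normal((100, 2)) + np.array([0.0, 0.0])
cluster_b = 0.5 * rng.standard_normal((100, 2)) + np.array([5.0, 5.0])
data = np.concatenate((cluster_a, cluster_b))

# Passing an array as the second argument uses its rows as the initial
# centroids, which keeps this example deterministic.
initial_guess = np.array([data[0], data[-1]])
centroids, distortion = kmeans(data, initial_guess)

# vq assigns every point to its nearest centroid.
labels, distances = vq(data, centroids)

print(centroids)            # one row per cluster center
print(np.bincount(labels))  # number of points assigned to each center
```

Passing an integer instead of an array asks kmeans to choose the starting centroids itself, in which case the result can vary from run to run.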
The EM algorithm generalizes K-Means.
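One way to see the relationship: K-Means assigns each point to exactly one cluster, while EM for a Gaussian mixture gives each point a fractional "responsibility" for every component and also learns a variance and weight per component. The following is a minimal one-dimensional sketch of that idea in plain NumPy. It is not the code in the lab files, and the function name em_gmm_1d, the starting guesses and the data are all hypothetical.

```python
import numpy as np

def em_gmm_1d(x, mu, sigma, weights, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM, starting from the given guesses."""
    mu = np.asarray(mu, dtype=float).copy()
    sigma = np.asarray(sigma, dtype=float).copy()
    weights = np.asarray(weights, dtype=float).copy()
    for _ in range(n_iter):
        # E-step: soft assignment of each point to each component.
        dens = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2)
        dens /= sigma * np.sqrt(2.0 * np.pi)
        resp = weights * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate the parameters from the weighted points.
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        weights = nk / len(x)
    return mu, sigma, weights

rng = np.random.default_rng(0)
# 300 points around 0 and 700 points around 6 (illustrative values).
x = np.concatenate((rng.normal(0.0, 1.0, 300), rng.normal(6.0, 1.0, 700)))

mu, sigma, weights = em_gmm_1d(x, mu=[1.0, 4.0], sigma=[1.0, 1.0],
                               weights=[0.5, 0.5])
print(mu, sigma, weights)
```

Replacing the soft responsibilities with a hard 0/1 assignment to the nearest mean, and keeping the variances fixed and equal, turns the loop into the K-Means update.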
Do the following exercises with EM clustering.
Download the following sample code, which implements a simple Gaussian mixture model: lab2a.py. Make sure that you can answer the following questions:
Modify the code from the previous exercise to match the example output shown in Figure 20.11 of the text. You can generate data for Gaussian mixture models with more than one component by concatenating the points sampled for the individual components:
    numpy.concatenate((sigma1 * numpy.random.randn(n1, 2) + mu1,
                       sigma2 * numpy.random.randn(n2, 2) + mu2))
The mu and sigma values for each cluster can differ, and the number of points specified for each cluster determines the weights.
You won't be able to make the model exact, but produce something close to what is shown in the figure.
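To make the relationship between point counts and weights concrete, here is a small sketch that builds a three-component data set with the same numpy.concatenate pattern. The means, standard deviations and counts below are placeholders chosen for illustration; they are not the parameters of Figure 20.11.

```python
import numpy

numpy.random.seed(0)  # fixed seed so the run is repeatable

# Placeholder parameters for three 2-D components.
n1, n2, n3 = 100, 300, 600
mu1 = numpy.array([0.2, 0.3])
mu2 = numpy.array([0.5, 0.7])
mu3 = numpy.array([0.8, 0.4])
sigma1, sigma2, sigma3 = 0.05, 0.10, 0.08

# Each term scales standard-normal samples by sigma and shifts them by mu.
data = numpy.concatenate((sigma1 * numpy.random.randn(n1, 2) + mu1,
                          sigma2 * numpy.random.randn(n2, 2) + mu2,
                          sigma3 * numpy.random.randn(n3, 2) + mu3))

# The mixing weight of a component is its share of the points.
counts = numpy.array([n1, n2, n3])
weights = counts / float(counts.sum())

print(data.shape)
print(weights)
```

A single scalar sigma per component gives round clusters; elongated or tilted clusters would need a different way of generating the points.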
Now try EM on a real dataset, the well-known iris dataset. The AIMA data distribution contains this dataset as well.
Download the following sample code, lab3.py, which loads the iris dataset from a pickled file, iris.txt. Modify the code to use a Gaussian mixture model to classify the basic flower types represented in the iris data.
Do your learned weights, means and variances match the actual structure of the data? How would you check this?
This exercise uses pickle, a Python tool for dumping and loading data. You’ll use this in the homework as well.
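If you have not used pickle before, the following sketch shows a dump-and-load round trip with the standard library. The records and the file name are invented for illustration; the actual structure of iris.txt is whatever lab3.py expects and is not assumed here.

```python
import os
import pickle
import tempfile

# Hypothetical records: (measurements, label) pairs.
records = [([5.1, 3.5, 1.4, 0.2], "setosa"),
           ([6.4, 3.2, 4.5, 1.5], "versicolor")]

path = os.path.join(tempfile.mkdtemp(), "example.pkl")

# Pickle files are binary, so open them in "wb" / "rb" mode.
with open(path, "wb") as f:
    pickle.dump(records, f)

with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded == records)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.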
Submit your source code as specified above in Moodle under lab 9.