A Study of Fuzzy and Non-fuzzy Clustering Algorithms on Wine Data

Data clustering is one of the most important exploratory data analysis methods for extracting unknown, valuable information from large volumes of data in many real-time data mining applications. Clustering techniques have proved their efficiency in many fields such as decision-making systems, medical sciences and earth sciences. Partition-based clustering is one of the main approaches to clustering. This work reports the classification performance of four widely used partitional algorithms: K-means (KM, also called hard c-means), Fuzzy c-Means (FCM), Fuzzy Possibilistic c-Means (FPCM) and Possibilistic Fuzzy c-Means (PFCM). A well-known data set from the UCI Machine Learning Repository is used to test the algorithms, and the clustering output is compared with the class labels recorded in the repository. The experimental results demonstrate that FCM, FPCM and PFCM give similar percentages of correctness and classification performance, and that all three outperform K-means. The results thus indicate that the fuzzy clustering algorithms are better than the non-fuzzy algorithm on this data.


Introduction
Clustering is one of the important techniques in soft computing: it groups the most similar objects in a data set into clusters. A cluster of objects can be treated collectively as one group and so may be considered a form of data classification; clustering is therefore sometimes described as unsupervised classification. Clustering data streams has attracted many researchers, since the applications that generate data streams have become more popular. Clustering is an important tool in data analysis, image processing, data mining, pattern recognition, medical diagnosis and related fields [1].
Wine is an alcoholic beverage made from fermented grapes. Different varieties of grapes and strains of yeast produce different styles of wine. These variations result from the complex interactions between the biochemical development of the grape and the reactions involved in fermentation. Wine has been consumed for its intoxicating effects, which are evident after the normal serving size of five ounces. Despite the rapid growth in technology, conventional classification methods find it difficult to deliver an accurate diagnosis without ambiguity. Since conditions in medicine are vague, fuzzy methods are more supportive than crisp ones. Fuzzy cluster analysis is an iterative method in which memberships between 0 and 1 are assigned to the objects by a membership function. Membership thus becomes relative: the same object can belong to more than one class or cluster simultaneously, but with different degrees. These algorithms find the cluster prototypes by optimizing an objective function, a function of the distances between the prototypes and the objects. In this paper, the authors present the clustering results on the wine data obtained with one non-fuzzy method, K-means, and three fuzzy methods, Fuzzy c-Means (FCM), Fuzzy Possibilistic c-Means (FPCM) and Possibilistic Fuzzy c-Means (PFCM), and show that the fuzzy algorithms perform better than the non-fuzzy one.

The Dataset
To evaluate the K-means (KM), Fuzzy c-Means (FCM), Fuzzy Possibilistic c-Means (FPCM) and Possibilistic Fuzzy c-Means (PFCM) algorithms, the real-world Wine data set was obtained from the UCI Machine Learning Repository, donated by Forina [2]. The Wine data set contains 178 samples, each with 13 attributes from the chemical analysis of wines derived from three different cultivars grown in the same region of Italy. The samples are grouped into three classes according to cultivar: cultivar 1 contains 59 samples, cultivar 2 contains 71 samples and cultivar 3 contains 48 samples. The attributes are the values of the chemical analysis of alcohol, malic acid, ash, alkalinity of ash, magnesium, total phenols, flavonoids, nonflavonoid phenols, proanthocyanins, color intensity, hue, OD280/OD315 of diluted wines and proline.
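The same Wine data is bundled with scikit-learn, which makes it easy to verify the sample and class counts quoted above (a convenience for the reader, not the tool the authors used):

```python
import numpy as np
from sklearn.datasets import load_wine

# The same 178-sample, 13-attribute Wine data donated by Forina to the
# UCI repository also ships with scikit-learn.
X, y = load_wine(return_X_y=True)
print(X.shape)          # (178, 13): 178 samples, 13 chemical attributes
print(np.bincount(y))   # [59 71 48]: the cultivar split reported above
```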

K-means algorithm
K-Means [3] is one of the best-known hard clustering algorithms. It takes an input parameter k, the number of clusters, and partitions a set of n objects into k clusters so that the resulting intra-cluster similarity is high while the inter-cluster similarity is low. The main idea is to define k centroids, one for each cluster. These centroids should be placed carefully, because different locations lead to different results; the better choice is to place them as far away from each other as possible. The next step is to take each point in the data set and associate it with the nearest centroid. When no point is pending, the first step is complete and an early grouping is done. At this point the k centroids are recalculated, and a new binding is made between the same data set points and the nearest new centroid. This generates a loop, in which the k centroids change their location step by step until no more changes occur, i.e., the centroids no longer move. The algorithm thus minimizes an objective function, in this case a squared-error function.

The objective function is

J = \sum_{j=1}^{k} \sum_{i=1}^{n} \| x_i^{(j)} - c_j \|^2

where \| x_i^{(j)} - c_j \|^2 is a chosen distance measure between a data point x_i^{(j)} and the cluster center c_j, and J indicates the distance of the n data points from their respective cluster centers. The algorithm is:
Step 1: Select k points as the initial centroids.
Step 2: Repeat steps 3 and 4:
Step 3: Form k clusters by assigning every point to its closest centroid.
Step 4: Re-compute the centroid of each cluster.
Step 5: Until the centroids no longer change.
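The steps above can be sketched in NumPy as follows (a minimal illustration under the stated steps, not the authors' MATLAB code; the empty-cluster guard is our addition):

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Plain K-means following steps 1-5 above."""
    rng = np.random.default_rng(seed)
    # Step 1: select k data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 3: assign every point to its closest centroid.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Step 4: re-compute each centroid as the mean of its cluster
        # (keeping the old centroid if a cluster ends up empty).
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        # Step 5: stop when the centroids no longer change (Step 2's loop).
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels
```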
The K-means algorithm is significantly sensitive to the initial, randomly selected cluster centers; the algorithm can be run multiple times to reduce this effect. K-means is a simple algorithm that has been adapted to many problem domains and is a good candidate for randomly generated data.

Fuzzy c-Mean Clustering
In fuzzy c-Means, m ∈ [1, ∞) is a parameter that determines the degree of fuzziness and V = [v_1, v_2, ..., v_c], with each v_i ∈ ℜ^n, is the vector of (unknown) cluster prototypes (centers). The prototypes, the membership functions and the Euclidean distance metric are calculated by equations (2.3), (2.4) and (2.5) respectively, and the prototype and membership updates are repeated until the cluster centers no longer change. The iteration terminates when the objective function converges to a local minimum. A detailed algorithm was proposed in [5]; it consists of the following basic steps.
Step 1: Fix the number of clusters c, the weighting parameter m and the termination tolerance ε > 0, and randomly initialize the partition matrix U.
Step 2: Determine the fuzzy cluster prototypes using equation (2.3).
Step 3: Update the membership matrix using equation (2.4).
Step 4: Compare the membership matrices before and after the iteration, and repeat from step 2 until the convergence criterion is met.
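A minimal NumPy sketch of these four steps (the prototype and membership updates are the standard FCM formulas that equations 2.3 and 2.4 refer to; the variable names are ours, not the paper's):

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Fuzzy c-Means sketch following steps 1-4 above."""
    rng = np.random.default_rng(seed)
    # Step 1: random partition matrix U, rows summing to 1 over the c clusters.
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Step 2: cluster prototypes as u^m-weighted means (the paper's eq. 2.3).
        Um = U ** m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Step 3: membership update from the distances to the prototypes (eq. 2.4).
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        # Step 4: stop when the partition matrix stabilizes.
        if np.abs(U_new - U).max() < eps:
            U = U_new
            break
        U = U_new
    return V, U
```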
FCM is a popular fuzzy clustering method, but it has some drawbacks. For example, if the method is used to partition data into two clusters and there is an object equidistant from the two centers, then the constraint on the membership values forces it to be assigned equal membership in both clusters, regardless of its actual belonging. Such points are called noise points.

Fuzzy Possibilistic c-mean algorithm
Traditional clustering approaches partition the data so that each object can belong to only one cluster at a time. Fuzzy clustering extends this notion so that each object can belong to more than one cluster at a time, with different membership values between 0 and 1 given by a membership function. FPCM was developed on the basis of fuzzy theory by Pal and Bezdek [5]. The FPCM model introduces typicality values alongside membership values to overcome the drawbacks of the FCM model proposed by Bezdek et al. [4]. The partition of the data set Z into c clusters is again represented by a fuzzy partition matrix, V = [v_1, v_2, ..., v_c] with each v_i ∈ ℜ^n denotes the vector of (unknown) cluster prototypes (centers), and the degree of fuzziness is determined by a weighting parameter m. The prototype, membership and typicality updates are given by equations (2.8), (2.9) and (2.10). The algorithm consists of the following basic steps.
Step 1: Initialization: Fix the number of clusters c, the weighting parameter m and the termination tolerance ε > 0, and randomly initialize the partition matrix U.
Step 2: Centroid calculation: Determine the fuzzy cluster prototypes using equation (2.8).
Step 3: Classification: Update the membership matrix using equation (2.9) and the typicality matrix using equation (2.10).
Step 4: Convergence criterion: Compare the membership matrices before and after the iteration. If the difference is less than the termination tolerance, stop; otherwise repeat from step 2.
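A NumPy sketch of one common reading of FPCM (in Pal and Bezdek's formulation, memberships sum to 1 over the clusters and typicalities sum to 1 over the data points of each cluster). The equation numbers in the comments are the paper's; the code itself is ours:

```python
import numpy as np

def fpcm(X, c, m=2.0, eta=2.0, eps=1e-5, max_iter=100, seed=0):
    """FPCM sketch following steps 1-4 above."""
    rng = np.random.default_rng(seed)
    # Step 1: random memberships (rows sum to 1) and uniform typicalities.
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    T = np.full_like(U, 1.0 / len(X))
    for _ in range(max_iter):
        # Step 2 (eq. 2.8): prototypes weighted by u^m + t^eta.
        W = U ** m + T ** eta
        V = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
        # Step 3 (eq. 2.9): membership update, normalized over the clusters.
        inv = d ** (-2.0 / (m - 1))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        # Step 3 (eq. 2.10): typicality update, normalized over the data points.
        invt = d ** (-2.0 / (eta - 1))
        T = invt / invt.sum(axis=0, keepdims=True)
        # Step 4: convergence test on the membership matrix.
        if np.abs(U_new - U).max() < eps:
            U = U_new
            break
        U = U_new
    return V, U, T
```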

Possibilistic Fuzzy c-Mean Clustering
To achieve good clustering results, both the memberships and the typicalities are important. Pal et al. [6] proposed the Possibilistic Fuzzy c-Means (PFCM) model. This model relaxes the FPCM constraint that the typicalities of all data points in a cluster sum to 1, while retaining the constraint on the memberships. The basic steps of the PFCM algorithm are as follows.
Step 1: Initialization: Fix the number of clusters c, the parameters m, a and b, and the termination tolerance ε > 0, and randomly initialize the partition matrix U and the typicality matrix T.
Step 2: Centroid calculation: Calculate the fuzzy cluster prototypes using equation (2.12).
Step 3: Classification: Update the membership matrix using equation (2.10) and the typicality matrix using equation (2.11).
Step 4: Convergence criterion: Compare the membership matrices before and after the iteration. If the difference is less than the termination tolerance, stop; otherwise repeat from step 2.
This model can let either the memberships (when a > b) or the typicalities (when b > a) have the greater influence on the prototypes. If a and b are restricted to a = 1 and b = 0, the PFCM model reduces to the FCM model. The effect of outliers can be reduced by choosing a higher value of b (respectively m) than of a (respectively η).
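A NumPy sketch of these steps. The possibilistic penalty weights γ_i are not specified in the text above; setting them from the current fuzzy partition is a common choice, so that part is an assumption, as are the variable names:

```python
import numpy as np

def pfcm(X, c, m=2.0, eta=2.0, a=1.0, b=1.0, eps=1e-5, max_iter=100, seed=0):
    """PFCM sketch following steps 1-4 above."""
    rng = np.random.default_rng(seed)
    # Step 1: random memberships (rows sum to 1); prototypes from them.
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    V = (U.T @ X) / U.sum(axis=0)[:, None]
    for _ in range(max_iter):
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
        # Step 3: FCM-style membership update (the paper's eq. 2.10).
        inv = d2 ** (-1.0 / (m - 1))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        # Penalty weights gamma_i from the current partition (an assumption).
        Um = U_new ** m
        gamma = (Um * d2).sum(axis=0) / Um.sum(axis=0)
        # Step 3: typicality update (eq. 2.11); no sum-to-one constraint on T.
        T = 1.0 / (1.0 + (b * d2 / gamma) ** (1.0 / (eta - 1)))
        # Step 2 (eq. 2.12): prototypes weighted by a*u^m + b*t^eta.
        W = a * U_new ** m + b * T ** eta
        V = (W.T @ X) / W.sum(axis=0)[:, None]
        # Step 4: convergence test on the membership matrix.
        if np.abs(U_new - U).max() < eps:
            U = U_new
            break
        U = U_new
    return V, U, T
```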

Results and Discussion
The algorithms were implemented in MATLAB version R2012a. To achieve good clustering results, a maximum of 100 iterations was allowed and each algorithm was run 15 times independently.
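Under the assumption that "percentage of correctness" means the best agreement between the cluster labels and the true cultivars over all label permutations, the evaluation could be sketched with scikit-learn's K-means as follows (the paper used MATLAB; this is only an illustration, with the 100-iteration cap and 15 restarts mirroring the settings above):

```python
import numpy as np
from itertools import permutations
from sklearn.datasets import load_wine
from sklearn.cluster import KMeans

# Cluster labels are arbitrary, so score the best match to the true
# cultivars over all permutations of the three cluster labels.
def correctness(true, pred, k=3):
    return max(np.mean(true == np.array(p)[pred]) for p in permutations(range(k)))

X, y = load_wine(return_X_y=True)
labels = KMeans(n_clusters=3, max_iter=100, n_init=15, random_state=0).fit_predict(X)
print(f"K-means correctness: {100 * correctness(y, labels):.1f}%")
```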

Fuzzy c-Means [4] is one of the most popular fuzzy clustering methods. Consider a data set Z with N observations, each an n-dimensional row vector z_k = [z_{k1}, z_{k2}, ..., z_{kn}] ∈ ℜ^n, so that Z is an N × n matrix. In medical diagnosis, for example, the rows of Z represent patients and the columns are symptoms or laboratory measurements for those patients. The partition of Z into c clusters (1 < c < N) is represented by the fuzzy partition matrix U = [\mu_{ik}]. Fuzzy c-Means achieves its partitioning by the iterative optimization of its objective function

\min_{U,V} J(Z; U, V) = \sum_{i=1}^{c} \sum_{k=1}^{N} (\mu_{ik})^m \| z_k - v_i \|^2, \quad U = [\mu_{ik}]  (2.2)

Table 1: Clustering results obtained by the K-means, FCM, FPCM and PFCM algorithms on the Wine data set (3 clusters)

Figure 1: K-means result for wine data

Conclusion

Table 2: Comparison of classification performance and percentage of correctness