CATEGORIZATION OF DATA USING HIERARCHICAL CLUSTERING
CHAPTER ONE
INTRODUCTION
Given a data set of n points in a high-dimensional space, it is often beneficial to project it onto a lower-dimensional space without significant distortion.
This procedure is known as dimensionality reduction. In essence, it reduces the number of variables under consideration so that the relevant information is retained while the volume of data shrinks.
Dimensionality reduction lowers the runtime of algorithms whose cost depends on the dimension of the working space. It also broadens the range of methods that can be applied to the data, and it provides a form of complexity control that helps prevent overfitting to the training data.
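To make the idea of projecting without significant distortion concrete, the following is a minimal sketch using a Gaussian random projection. This technique is chosen here purely for illustration; the text does not name a specific method. The function names (`random_projection`, `euclidean`), the synthetic data, and the dimensions (200 reduced to 50) are all assumptions.

```python
import math
import random


def random_projection(points, k, seed=0):
    """Project d-dimensional points onto k dimensions with a Gaussian random matrix."""
    rng = random.Random(seed)
    d = len(points[0])
    # Entries are drawn from N(0, 1/k), so squared distances are preserved in expectation.
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(row[j] * p[j] for j in range(d)) for row in matrix] for p in points]


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical data: 5 points in a 200-dimensional space.
data_rng = random.Random(1)
data = [[data_rng.uniform(-1, 1) for _ in range(200)] for _ in range(5)]

reduced = random_projection(data, k=50)

# Ratio of a pairwise distance after and before projection; values near 1 mean low distortion.
ratio = euclidean(reduced[0], reduced[1]) / euclidean(data[0], data[1])
print(len(reduced[0]), round(ratio, 2))
```

The printed ratio shows how much one pairwise distance changed. A larger target dimension `k` generally gives less distortion at the cost of keeping more variables.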
Dimensionality reduction can be applied in a variety of domains, including text data, image data, nearest-neighbour search, clustering, and classification. Clustering is the division of a set of data into subsets (called clusters) so that observations within the same cluster are similar in some way.
Clustering is an unsupervised learning technique. In contrast, classification is a supervised learning technique: the supervised learner’s objective is to predict the value of the target function for every valid input after seeing a number of training instances (i.e., pairs of inputs and target outputs). The primary goal of this research is to categorise data using hierarchical clustering.
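As an illustration of hierarchical clustering, the sketch below implements a simple agglomerative (bottom-up) variant with single linkage. It is not presented as the method used in this research: the linkage criterion, the function names, and the six sample points are assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(c1, c2, points):
    """Cluster distance = smallest distance between any two members (assumed criterion)."""
    return min(euclidean(points[i], points[j]) for i in c1 for j in c2)


def agglomerative(points, num_clusters):
    """Merge the two closest clusters repeatedly until num_clusters remain."""
    # Start with every point in its own cluster, stored as a list of indices.
    clusters = [[i] for i in range(len(points))]
    merges = []  # history of (cluster_a, cluster_b, distance), i.e. the hierarchy
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dist = single_linkage(clusters[a], clusters[b], points)
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        merges.append((clusters[a], clusters[b], dist))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return clusters, merges


# Hypothetical data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 1.2), (0.8, 1.1), (8.0, 8.0), (8.3, 7.9), (7.9, 8.4)]

clusters, merges = agglomerative(points, num_clusters=2)
result = sorted(sorted(c) for c in clusters)
print(result)  # [[0, 1, 2], [3, 4, 5]]
```

No labels are supplied at any point, which is what makes the procedure unsupervised. The `merges` list records the order in which clusters were joined, so stopping at a different `num_clusters` corresponds to cutting the same hierarchy at a different level.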