
Clustering approaches address segmentation problems. They assign records that have a large number of attributes to a relatively small set of groups, or "segments." The assignment is performed automatically by clustering algorithms, which identify the distinguishing characteristics of the dataset and then partition the n-dimensional space defined by the dataset's attributes along its natural boundaries. There is no need to specify in advance the groupings desired or the attributes that should be used to segment the dataset.



Clustering is often one of the first steps in data mining analysis. It identifies groups of related records that can be used as a starting point for exploring further relationships. This technique supports the development of population segmentation models, such as demographic-based customer segmentation. Additional analyses using standard analytical and other data mining techniques can determine the characteristics of these segments with respect to some desired outcome. For example, the buying habits of multiple population segments might be compared to determine which segments to target for a new sales campaign.
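
As a rough illustration of that last comparison, the sketch below averages an outcome per segment and picks the segment with the highest average. The segment labels, the purchase counts, and the function name `average_outcome_by_segment` are all made up for the example; they are not taken from any real dataset or tool.

```python
from collections import defaultdict


def average_outcome_by_segment(segment_labels, outcomes):
    """Return the mean outcome for each segment label."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for segment, value in zip(segment_labels, outcomes):
        totals[segment] += value
        counts[segment] += 1
    return {segment: totals[segment] / counts[segment] for segment in totals}


# Hypothetical data: one segment label and one monthly purchase count per customer.
segment_labels = ["A", "A", "B", "B", "B", "C"]
monthly_purchases = [2, 4, 9, 11, 10, 1]

averages = average_outcome_by_segment(segment_labels, monthly_purchases)

# The segment with the highest average purchases is one candidate to target.
target = max(averages, key=averages.get)
print(averages, target)
```

In practice the segment labels would come from a clustering step rather than being typed in, and a real comparison would likely look at more than a single average.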

Clustering divides a database into groups. The goal is to find groups that are very different from each other and whose members are very similar to one another. Unlike classification, you don’t know at the outset what the clusters will be or which attributes will define them. Consequently, someone who is knowledgeable in the business must interpret the clusters. It is often necessary to rerun the clustering after excluding variables that were used to group instances but that, on examination, turn out to be irrelevant or not meaningful. Once you have found clusters that reasonably segment your database, they can be used to classify new data. Common clustering algorithms include Kohonen feature maps and K-means.
Don’t confuse clustering with segmentation. Segmentation refers to the general problem of identifying groups that have common characteristics. Clustering is a way to segment data into groups that are not previously defined, whereas classification is a way to segment data by assigning it to groups that are already defined.  
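
To make the idea concrete, here is a minimal sketch of K-means, one of the algorithms named above, written in plain Python. It is an illustration under stated assumptions, not a production implementation: the customer records (age, monthly spend) are invented, the function names are hypothetical, the attributes are not rescaled, and the starting centroids are simply sampled from the data. The last lines show a found cluster being used to place a new record, in the spirit of using clusters to classify new data.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(record, centroids):
    """Index of the centroid closest to the record."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(record, centroids[i]))


def k_means(records, k, max_iterations=100, seed=0):
    """Group records into k clusters; return (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: start from k records picked at random from the data.
    centroids = [list(r) for r in rng.sample(records, k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each record joins its nearest centroid.
        new_labels = [nearest(r, centroids) for r in records]
        if new_labels == labels:
            break  # assignments stopped changing, so we are done
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [r for r, lab in zip(records, labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical records: (age, monthly spend) for six customers.
records = [(22, 30), (25, 35), (24, 28), (58, 210), (61, 190), (63, 205)]
centroids, labels = k_means(records, k=2)
print(labels)

# A new record is placed in the cluster whose centroid is closest.
new_customer = (23, 33)
new_label = nearest(new_customer, centroids)
print(new_label)
```

Note that nobody told the algorithm which attribute separates the groups or what the groups mean; deciding that one cluster is, say, "younger, low-spend customers" is still the interpretive step left to someone who knows the business.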