Data mining means seeking patterns in a collection of data. For example, a business might “mine” sales data to discover purchasing trends. The information revealed by data mining can then be used to improve processes or plan future strategies.


How does data mining work?

Data mining uses statistical analysis, sometimes coupled with machine-learning techniques, to discover relationships between sets of data. This typically means identifying trends and correlations, along with anomalous items that don’t fit the patterns.
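To make this concrete, here is a minimal sketch in Python using only the standard library. It measures how strongly two series move together (a Pearson correlation) and flags values that sit unusually far from the average. The figures, the variable names and the two-standard-deviation threshold are all assumptions chosen for illustration, not a recommended configuration.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def anomalies(values, threshold=2.0):
    """Indices of values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) > threshold * sigma]


# Hypothetical figures: weekly ad spend against units sold
ad_spend = [10, 12, 14, 16, 18, 20]
units_sold = [102, 118, 141, 160, 178, 205]

# Hypothetical figures: orders received per day
daily_orders = [52, 48, 50, 51, 49, 120, 50, 47, 53, 50]

print(f"correlation: {pearson(ad_spend, units_sold):.3f}")
print(f"anomalous days (by index): {anomalies(daily_orders)}")
```

A correlation close to 1 suggests the two series rise together, while the flagged index points to a day that merits a closer look. Neither result explains why the pattern exists; that interpretation is left to the analyst.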

Although data mining is commonly associated with sales and marketing, it can be applied to any area of operations where records or log files are kept, such as manufacturing processes or IT support.
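As an illustration outside sales, the sketch below mines a handful of IT support tickets for issue tags that tend to appear together. The ticket data, the tag names and the `min_support` cut-off are hypothetical; the approach is a simple pair count, not a full association-rule algorithm.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(records, min_support=2):
    """Return pairs of items that appear together in at least `min_support` records."""
    counts = Counter()
    for items in records:
        # Sort and de-duplicate so each pair is counted once per record,
        # always in the same order
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical support tickets, each tagged with the issues it involved
tickets = [
    ["vpn", "password-reset"],
    ["vpn", "laptop", "password-reset"],
    ["printer"],
    ["vpn", "password-reset", "printer"],
    ["laptop", "printer"],
]

print(frequent_pairs(tickets))
```

In this made-up sample, VPN problems and password resets turn up together in three tickets, which might prompt a support team to check whether one is causing the other.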

How does data mining differ from big data analysis?

Data mining is often practised on a comparatively small scale, focusing on defined sets of structured data: it has sometimes been referred to as “database mining”. By contrast, big data analysis ideally draws on the widest possible range of internal and external sources.

Data mining also has a narrower goal than big data analysis, seeking only to discover patterns in the supplied data. Data mining tools don’t attempt to forecast future outcomes, although analysts may make projections based on their findings. This means data mining can be a simpler and less resource-intensive process than big data analysis.

How can companies get started with data mining?

Several commercial data-mining tools are available on a subscription basis, such as Google Cloud Platform, IBM SPSS Modeler, Microsoft Azure Analysis Services and Oracle Data Mining. These services enable managers and analysts to perform data-mining tasks without needing expert developers or data scientists.

For scenarios where it makes sense to build a bespoke data-mining solution, developers can use frameworks such as Wolfram Mathematica, which has built-in support for statistical analysis and machine learning. Another option is the open-source R programming language, while those who prefer to program in Python can use the free scikit-learn library to add machine-learning capabilities to data-analysis projects.
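For a flavour of the Python route, here is a minimal sketch that uses scikit-learn's `KMeans` class to split customers into two groups based on their buying behaviour. The customer figures and the choice of two clusters are assumptions made for illustration, and the example requires scikit-learn to be installed.

```python
from sklearn.cluster import KMeans

# Hypothetical customer records: [orders per year, average order value]
customers = [
    [2, 15.0], [3, 18.0], [1, 12.0],        # occasional, low-spend
    [24, 95.0], [30, 110.0], [27, 102.0],   # frequent, high-spend
]

# Ask for two clusters; random_state makes the run repeatable
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(customers)

for record, label in zip(customers, labels):
    print(f"{record} -> group {label}")
```

The group numbers themselves are arbitrary: what matters is which customers end up sharing a label. On real data, the features would usually be scaled first and the number of clusters chosen with more care.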

What are the downsides of data mining?

While focused data mining may be easier and cheaper than a big data approach, its smaller scope means it’s less likely to unearth new, unexpected patterns. At the same time, extrapolating from limited data can create misleading impressions: decision-makers must remember that data mining produces statistics, not facts.

Since data mining is often used in the context of customer research, companies must also be mindful of data-protection regulations. If the data to be mined includes any personal information, it’s advisable to define the purpose, method and intended outcome of the exercise ahead of time, to ensure customer privacy is protected.


Summary

  • Data mining means processing a set of data to uncover patterns. 
  • Connections exposed by data mining can be used to identify problems and opportunities in many areas of a business. 
  • Compared to big data analysis, data mining typically focuses on smaller data sets, and doesn’t generate projections. 
  • Where data mining is used for customer research, businesses need to take care that data-protection regulations are respected. 
Darien Graham-Smith

Darien is one of the UK's most knowledgeable technical journalists. You will find him in PC Pro magazine, writing reviews for a variety of sites and on guitar with his band The Red Queens. His explainer articles help TechFinitive's audience understand how technology works.
