
What is big data?
The term “big data” means large quantities of raw information – typically terabytes or more – collated from multiple sources both within and outside of an organisation. The term can also loosely refer to the process of analysing such large sets of data to discover insights.
What sort of information is included?
Big data can include everything that could conceivably be relevant to a company’s operations, such as sales data, website activity, CRM records, network logs and sensor information. External information may include details of social media activity or currency exchange rates. Big data sets often focus on “high-velocity” sources, which grow rapidly as new data is continually added.
Another hallmark of big data is the inclusion of unstructured or semi-structured content, such as social media posts or web pages. This contrasts with traditional database-driven approaches to business intelligence, although big-data sources generally still need to be translated into a consistent format for analysis.
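As a rough illustration of that translation step, the sketch below maps three differently shaped inputs onto one common record layout. The field names (`sold_at`, `created`, `body`), the sample values and the target layout are all hypothetical, chosen only for this example; a real pipeline would define its own.

```python
import json
from datetime import datetime, timezone


def from_sales_row(row):
    # Structured source: a row as a sales database might export it (hypothetical fields).
    return {
        "source": "sales",
        "timestamp": datetime.fromisoformat(row["sold_at"]).isoformat(),
        "text": f"Sale of {row['product']} for {row['amount']}",
    }


def from_social_post(raw_json):
    # Semi-structured source: a JSON document whose "body" field may be missing.
    post = json.loads(raw_json)
    created = datetime.fromtimestamp(post["created"], tz=timezone.utc)
    return {
        "source": "social",
        "timestamp": created.isoformat(),
        "text": post.get("body", "").strip(),
    }


def from_log_line(line):
    # Loosely structured source: "<ISO timestamp> <free-text message>".
    stamp, _, message = line.partition(" ")
    return {
        "source": "network_log",
        "timestamp": datetime.fromisoformat(stamp).isoformat(),
        "text": message,
    }


# Made-up sample inputs, one per source.
records = [
    from_sales_row({"sold_at": "2024-03-01T09:00:00+00:00", "product": "widget", "amount": 19.99}),
    from_social_post('{"created": 1709288100, "body": "  Loving the new widget!  "}'),
    from_log_line("2024-03-01T10:20:00+00:00 connection reset by peer"),
]

for record in records:
    print(record)
```

Once every record shares the same keys, the downstream analysis no longer needs to know which source a record came from.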
Why is big data useful?
Big data analysis uses powerful computing resources to process data sets that would be too large and diverse for a human to work with. Subtle trends and correlations can be spotted, and actionable insights can be generated – perhaps relating to customer behaviour, or to inefficiencies in the company’s workflow – that would be missed by traditional approaches, or uncovered much more slowly.
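To make "spotting a correlation" concrete, here is a minimal sketch that computes the Pearson correlation coefficient between two series. The figures are invented for illustration (page load time against daily conversions), and a real big data job would run this kind of calculation over far more data and many more variables.

```python
from math import sqrt


def pearson(xs, ys):
    # Pearson correlation coefficient: covariance divided by the product
    # of the two standard deviations (the 1/n factors cancel out).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical daily figures, made up for this example.
load_time_seconds = [1.2, 1.8, 2.5, 3.1, 4.0]
conversions = [52, 47, 40, 33, 25]

r = pearson(load_time_seconds, conversions)
print(f"correlation: {r:.3f}")
```

A value close to -1 would suggest that slower pages go hand in hand with fewer conversions, though a correlation on its own does not establish the cause.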
Does big data use AI?
Big data analysis doesn’t necessarily involve artificial intelligence. However, the task of finding patterns and connections in very large, unorganised data sets is a natural fit for machine learning. AI logic can be used at multiple stages of a big data process, such as standardising the data and making predictions from incomplete information.
How is the data processed?
There is no single off-the-shelf tool for big data analysis: the process usually needs at least some custom coding to suit the available data sources and business parameters. Many solutions use the open-source Apache Hadoop framework, which has built-in capabilities for handling the ingestion, storage and processing of large data stores.
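Hadoop's processing model is MapReduce: a map step turns each input record into key/value pairs, the pairs are grouped by key, and a reduce step combines each group. The sketch below imitates those three steps in plain Python on a handful of made-up log lines. It does not use Hadoop's own API, and the function names are hypothetical; a real job would spread the same steps across many machines.

```python
from collections import defaultdict


def map_phase(line):
    # Emit one (path, 1) pair per well-formed "<METHOD> <path>" log line.
    parts = line.split()
    if len(parts) >= 2:
        yield parts[1], 1


def shuffle(pairs):
    # Group all emitted values by key, as happens between map and reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Combine the values for one key; here, a simple count.
    return key, sum(values)


def run_job(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Made-up input; the malformed line is skipped by the map step.
log_lines = [
    "GET /home",
    "GET /pricing",
    "GET /home",
    "malformed",
    "POST /signup",
]

hits = run_job(log_lines)
print(hits)
```

Because each line is mapped independently and each key is reduced independently, both steps can be run in parallel, which is what makes the model suit very large data stores.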
What sorts of organisation can make use of big data?
Big data is of particular interest to enterprise-scale businesses: these are the companies most likely to generate the huge quantities of data required for big data analysis. Large companies are also most likely to have the resources to invest in the necessary computing power, and can afford to hire professional developers and analysts to realise big data projects.
However, big data techniques are open to businesses of all sizes. Hosted services such as Google Cloud BigQuery, IBM Cloud Pak for Data and Microsoft Azure Databricks let even small organisations assemble their own data analysis processes, using a variety of languages and frameworks, on a pay-as-you-go basis.
Summary
- Big data refers to very large collections of data, much of it unstructured or semi-structured, and the analysis that can be performed on them.
- Applying AI techniques to big data stores can unearth insights that a human analyst would miss, or would uncover much more slowly.
- Big data processing normally entails some degree of custom coding, using a suitable framework such as Hadoop.
- Small businesses can take advantage of numerous cloud-based big data services.