What is data science?
Data science is a term that gets thrown around a lot these days, so it’s worth pausing for a moment to define what it is. The aim is to produce insights from large data sets: insights that will aid better decision making.
By insights we mean the hard-to-detect patterns that exist in those data sets.
There would be no point running a data science project where simple querying would do the job. The insights it reveals are the patterns in big data sets that a simple query would miss.
Start with a question
Any statistical analysis starts with a question. It might be a business question, such as: which of our products do our customers buy together?
Or in social care you might ask: how can we tell when an elderly person’s health is showing signs of deterioration?
Data science methodologies reveal patterns in large data sets.
Those patterns lead to actionable insights, which help you make otherwise difficult decisions or solve the problem identified in the original question.
You might ask: what is valuable in my data?
For instance, you may want to understand the purchasing patterns of your customers.
Data science algorithms can find those patterns and support better decisions, such as what to keep in stock and at what level, or which price points are the most successful.
When you change a price point, the sales pattern that follows shows the impact of the change. That in turn gives you insight into further price point alterations. Tracking variations in the fluid intake of elderly people is another example.
Over time, machine-learning algorithms learn what is normal, so abnormal patterns get flagged. That flag is an actionable insight, even if the action is just a simple health check for the person involved.
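To make the idea of "learning what is normal" concrete, here is a minimal sketch in Python. It is not a machine-learning model: it swaps in a simple statistical rule (flag any reading more than two standard deviations from a baseline average) as a stand-in. The function name, the threshold and the fluid intake figures are all made up for illustration.

```python
from statistics import mean, stdev

def flag_abnormal_days(baseline_ml, new_readings_ml, threshold=2.0):
    """Return (day_index, reading) pairs that deviate from the baseline.

    A reading is flagged when it lies more than `threshold` standard
    deviations away from the mean of the baseline period.
    """
    mu = mean(baseline_ml)
    sigma = stdev(baseline_ml)
    if sigma == 0:
        # Every baseline reading is identical, so any difference is abnormal.
        return [(d, r) for d, r in enumerate(new_readings_ml) if r != mu]
    flagged = []
    for day, reading in enumerate(new_readings_ml):
        z_score = (reading - mu) / sigma
        if abs(z_score) > threshold:
            flagged.append((day, reading))
    return flagged

# Hypothetical daily fluid intake in millilitres (illustrative values only).
baseline = [1500, 1600, 1550, 1450, 1500, 1650, 1550, 1500, 1600, 1500]
this_week = [1550, 1500, 900, 1600, 850]

abnormal = flag_abnormal_days(baseline, this_week)
print(abnormal)  # [(2, 900), (4, 850)]
```

The two low readings stand out against the baseline and would be the prompt for a health check. A production system would likely use a richer model and a per-person baseline, but the principle is the same.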
Is data science the same as data mining?
In essence, no. Data mining deals with the analysis of structured data. It has an emphasis on commercial applications. Data science goes further by capturing, cleaning and transforming unstructured data.
It’s all about those patterns
Using data science we can extract patterns based on different questions. We’ve already mentioned a couple of examples. But starting with a different question is also possible, and it will help you identify different patterns in the same big data. One such question is which customers resemble each other. Grouping similar records together in this way is called clustering, while in business speak it’s referred to as customer segmentation.
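As a rough illustration of clustering, the sketch below runs a plain k-means algorithm over a handful of customers. K-means is only one of many clustering techniques, chosen here because it is short enough to show in full. The customer figures (orders per year, average basket value) are invented, and the initial centroids are simply the first k points to keep the example repeatable.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100):
    """Plain k-means. Returns (labels, centroids).

    Initial centroids are the first k points, which keeps this sketch
    deterministic; real libraries use smarter initialisation.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # Assignments stopped changing, so we have converged.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical customers: (orders per year, average basket value).
customers = [(2, 15.0), (3, 20.0), (1, 18.0), (20, 22.0), (24, 19.0), (22, 25.0)]

segments, centres = kmeans(customers, k=2)
print(segments)  # [0, 0, 0, 1, 1, 1]
```

The first three customers land in one segment (occasional buyers) and the last three in another (frequent buyers). With real data you would normally rescale the features first, so that one measure does not dominate the distance calculation.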
Online retailers ask questions about products that people buy at the same time. That’s called association-rule mining. Machine learning can discover relations between variables in big data sets. And yes, machine learning is part of the data science ecosystem.
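Here is a small sketch of what association-rule mining looks like in practice, limited to pairs of products. It counts how often two items appear in the same basket (support) and how often the second item appears given the first (confidence). The function name, the thresholds and the shopping baskets are assumptions made for illustration; full algorithms such as Apriori extend the same idea to larger item sets.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.3, min_confidence=0.6):
    """Return rules as (antecedent, consequent, support, confidence) tuples."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Of the baskets containing lhs, what share also contain rhs?
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first, then alphabetical for a stable order.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))

# Hypothetical shopping baskets (illustrative only).
baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "jam"],
    ["bread", "butter", "milk"],
]

rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data, everyone who buys jam also buys butter, so "jam -> butter" comes out as the strongest rule. That is the kind of relation a retailer might use for product placement or recommendations.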
So what is data science, then?
Data science includes a set of principles, problem definitions, algorithms and processes for extracting non-obvious patterns from large data sets.
Phew! That is as tight a definition as we can come up with. Remember, data science is not a single technique for solving data problems.
We’ll continue our insight series with a look at Analytics As A Service, which will be here on the 3rd August. In the meantime, if you have a question, feel free to get in touch.