Strategic consultants are increasingly using novel data science methods to drive business outcomes for their clients.
By helping clients get more out of the data that they already have or by finding new sources of data, consultants can turn data insights into quantifiable value that impacts the bottom line.
Hunting for value within data often requires hours of manual data science labour. Optimising that process to save time matters, but even more important is the ability to discover hidden insights, because it is these insights that give a consultant's clients a competitive advantage.
We talked to Tony Aug, co-founder at Nimble Gravity, to understand how SparkBeyond helps the consultancy rapidly navigate the data science process, and discover the novel features hidden within pools of data.
Nimble Gravity came across SparkBeyond at a conference where representatives from SparkBeyond were demonstrating how the SparkBeyond Discovery platform works – including the ability to use the Discovery platform to perform automated feature engineering.
Feature engineering is one of the more time-consuming aspects of data science, and also one of the most important because it involves identifying the most valuable signals within a large pool of data. In some ways, one could argue that feature engineering is the cornerstone of data modelling.
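To illustrate why this step is so labour-intensive, here is a minimal sketch of manual feature engineering in pandas. The data and column names (`customer_id`, `order_date`, `amount`) are hypothetical, not from any client project; the point is that each feature is a hand-written hypothesis about what might carry signal.

```python
import pandas as pd

# Hypothetical raw transaction data.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "order_date": pd.to_datetime(
        ["2023-01-05", "2023-02-10", "2023-01-20", "2023-03-01", "2023-03-15"]),
    "amount": [120.0, 80.0, 200.0, 50.0, 75.0],
})

# Each hand-crafted feature takes its own code to express, and a data
# scientist must think to try it in the first place.
features = orders.groupby("customer_id").agg(
    total_spend=("amount", "sum"),
    order_count=("amount", "count"),
    avg_order_value=("amount", "mean"),
    days_active=("order_date", lambda s: (s.max() - s.min()).days),
)
print(features)
```

Multiply this by hundreds or thousands of candidate features and the appeal of automating the search becomes clear.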
Nimble Gravity immediately saw the potential for using SparkBeyond as a type of data engineering front-end – providing an interface to accelerate the progress of data science projects. It also quickly became clear that SparkBeyond offered a way to follow a more structured approach to insight discovery that relies less on intuition.
In other words, SparkBeyond offered Nimble Gravity a tool to step through the insight discovery process in a faster, more structured manner.
With consultancy projects, there is always a finite amount of time to devote to a client. So, while Nimble Gravity strives to leave no stone unturned in extracting value from client data, there is significant value in any tool that optimises the data discovery process.
This more systematic approach saves the Nimble Gravity team time, while ensuring better outcomes for its clients because it reduces the risk that important features will be missed.
According to Tony, “Trying to pin down just a single feature can require countless lines of Python code to extract. In contrast, the Discovery platform can rapidly process millions of permutations without the need for manual data science work. It enables us to look into features that we might never have considered.”
The Discovery platform also helps Nimble Gravity through a scoring approach, whereby millions of potentially significant features are ranked in terms of value – helping its consultants to examine the best 1,000 features through an iterative process.
By stepping through feature discovery, this still large number can be narrowed down to the 100 or so really interesting features. Therein lies the magic of automated feature engineering. Without SparkBeyond, considering even 100 features would have been impossible. Yet the Discovery platform enables Nimble Gravity to, in an almost literal sense, turn over every stone looking for value in data.
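The score-and-rank workflow described above can be sketched in a few lines. This toy example uses a simple absolute-correlation score over synthetic data; SparkBeyond's actual scoring is proprietary and far more sophisticated, so treat this only as an illustration of ranking a large candidate pool down to a short list.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_features = 500, 1000  # 1,000 candidates stands in for millions

# Synthetic candidate features; the target depends on only three of them.
X = rng.normal(size=(n_rows, n_features))
y = X[:, 0] * 2.0 + X[:, 1] - X[:, 2] + rng.normal(scale=0.1, size=n_rows)

# Score every candidate feature, then rank and keep the best for review.
scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_features)])
top_100 = np.argsort(scores)[::-1][:100]
print(top_100[:5])
```

The iterative narrowing the article describes amounts to repeating this loop: score, rank, inspect the survivors, and refine.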
As a typical example of how SparkBeyond helps data scientists navigate complex, multi-faceted data sets, Tony pointed to the value of geolocation data such as OpenStreetMap. Here, SparkBeyond helped Nimble Gravity draw insightful links between census data, such as median income for an area, and internal data sets, such as sales revenue for that same area.
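The kind of join described here is straightforward to sketch with pandas. The area codes, income figures, and revenue figures below are purely illustrative, not from Nimble Gravity's actual data; the sketch shows linking public census data to internal sales by a shared geography and then testing a candidate feature.

```python
import pandas as pd

# Illustrative external (census) data, keyed by area.
census = pd.DataFrame({
    "zip_code": ["80202", "80203", "80204"],
    "median_income": [68000, 54000, 47000],
})

# Illustrative internal sales data for the same areas.
sales = pd.DataFrame({
    "zip_code": ["80202", "80203", "80204"],
    "revenue": [120500, 88000, 61000],
})

# Merge on the shared geography, then test a candidate feature:
# does local median income track local sales revenue?
merged = census.merge(sales, on="zip_code")
corr = merged["median_income"].corr(merged["revenue"])
print(round(corr, 3))
```

Doing this once is easy; the hard part is evaluating thousands of such candidate links across overlapping external data sets, which is where automation pays off.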
Evaluating complex overlapping data sets can be a mammoth task. Intuition can help experienced data scientists to explore promising areas but, as Tony says, “There are instances in which intuition is less beneficial.”
The only way forward with such vast data sets is a step-by-step, comprehensive process for testing features. With geolocation data, that can mean millions of permutations, which simply wouldn't be practical to test manually. According to Tony, “It would take all the data science capacity on the planet, for a week, to go through millions of different permutations to try and narrow it down to the hundred or so that are of interest.”
These points of interest can now be found much more readily because the SparkBeyond Discovery platform makes it so much easier and faster to test and process vast data sets – when compared to a far more manual method of trying to find the value in data.
Time to value is a key priority for the team at Nimble Gravity. Where data drives a business decision, the faster you get the right data to make that decision, the better. SparkBeyond significantly speeds up this process and enables Nimble Gravity to deliver decision-making advice to its clients faster.
But SparkBeyond also enriches the data science process because it allows data scientists at Nimble Gravity to explore so much more than would otherwise have been practical. It means, for example, that any resulting machine learning model is significantly richer, with far higher predictive accuracy.
It does so at least in part because SparkBeyond encourages experimentation. Data scientists would normally explore along their lines of intuition, and they had to be reasonably certain of the direction of travel because testing alternatives required an enormous time investment. According to Tony, “SparkBeyond Discovery enables us to simply bring in external data just to see what we might find.”
SparkBeyond gives data scientists at strategic consultancies the ability to freely explore and to discover important features that they simply didn't know were there – driving outsize data science outcomes for their clients.
Apply key dataset transformations through no/low-code workflows to clean, prep, and scope your datasets as needed for analysis