This month, we’re unveiling the new SparkBeyond Discovery: the data professional’s triple threat. Its intuitive, user-friendly interface makes speed (condensing hours of menial data prep and analysis into minutes), rigor (using the world’s largest library of functions to test millions of hypotheses at scale), and performance (automatically producing composite features for stronger results) a completely new experience.
Previously available only to top enterprises such as PepsiCo and ABInBev, the all-new Discovery platform equips cross-functional teams with no-code data science capabilities. Data professionals at organizations of all sizes can now connect complex data sets, apply automatic feature engineering with unprecedented depth and breadth, and build their own machine learning models, all through a drag-and-drop interface.
As companies rebuild their foundations to compete in the era of data and advanced analytics, we have rebuilt SparkBeyond Discovery from the ground up, for the long haul. Let’s dive in to explore the invisible part of the iceberg and see how the platform’s new look and feel helps drive the cycle of learning and iteration for all data professionals.
The cornerstone of SparkBeyond Discovery is its ability to automate and exhaust the search for precise, actionable insights.
While traditional analytics and data science rely almost entirely on testing hypotheses generated by a human mind, SparkBeyond Discovery integrates millions of built-in functions and code to programmatically write its own hypotheses. The machine then tests them at ultra-high velocity to find meaningful patterns and ‘signal’ in the data.
This means the Discovery platform can test millions of hypotheses and ideas in minutes, independently of a human; the same work would otherwise take a data professional days, weeks, or even months.
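To make the idea concrete, here is a deliberately tiny sketch of what programmatic hypothesis search looks like in principle. This is not SparkBeyond’s actual engine: the four-entry "library", the scoring metric, and the synthetic data are all invented for illustration, standing in for a library of millions of functions.

```python
import math
import random

random.seed(0)

# A tiny stand-in "function library" of candidate transformations.
# (SparkBeyond's spans millions of functions; this sketch uses four.)
library = {
    "identity": lambda x: x,
    "log1p": lambda x: math.log1p(abs(x)),
    "square": lambda x: x * x,
    "sqrt": lambda x: math.sqrt(abs(x)),
}

def correlation(xs, ys):
    """Pearson correlation, used here as a simple 'signal' score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0

# Synthetic data whose true signal is quadratic in the raw column.
raw = [random.uniform(-3, 3) for _ in range(200)]
target = [x * x + random.gauss(0, 0.1) for x in raw]

# Exhaustively score every hypothesis in the library against the target.
scores = {
    name: abs(correlation([f(x) for x in raw], target))
    for name, f in library.items()
}
best = max(scores, key=scores.get)
print(best)  # the quadratic transform should win on this data
```

The point of the sketch is the shape of the loop: generate candidate features mechanically, score them all, and surface the strongest, rather than relying on a person to propose each hypothesis by hand.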
The benefits of this proprietary AI technology are numerous.
By exhaustively searching the data for patterns, a data professional can be confident in the resulting features generated by the platform. And while better features radically improve the accuracy and stability of predictive models, they also enable faster and better business decision-making underpinned by better insights.
The new Discovery platform’s visual and intuitive interface helps data professionals understand the relationships between each feature. Users can now drill down on a particular feature, understand how feature strength changes over time, and deep dive into the feature’s influential factors. These capabilities facilitate quicker and clearer identification of actionable insights, and help sort features by relevance and significance.
“A feature (for insights and models) is like that quality ingredient you’re about to put into a Michelin dish. The skill of a chef is crucial, but no chef can make magic without good ingredients.”
The platform’s drag-and-drop UI allows data professionals to build their own feature search space across text, geo-spatial and time-series data. This groundbreaking interface makes it easy for any user to combine the hardest data types, and discover novel insights in one place. All without the usual scripting, coding, and IT hand-holding.
This is the natural progression for SparkBeyond Discovery, which is built atop the world’s largest library of functions. The founding SparkBeyonders began their journey by creating a machine that could crawl the web for code, in the same way Google crawls the web for text, and constructed a robust library of millions of functions that now powers these complex data joins.
Combining these functions helps build complex features. For example, if one function computes the distance between two geocoordinates, another explores a building’s immediate surroundings, and a third calculates the angle of the sun at a certain time of day in a specific location, the three can be put to work together to discover solutions for energy-efficient buildings.
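A rough sketch of that composition might look like the following. The two building-block functions are standard textbook formulas (great-circle distance and an approximate solar elevation), but the composite `shading_risk` feature, its name, and its weighting are invented here purely to illustrate how simple functions combine into a richer one.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two geocoordinates, in kilometers."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def solar_elevation_deg(lat, day_of_year, solar_hour):
    """Approximate solar elevation angle (degrees) at a latitude, day, and solar hour."""
    decl = -23.44 * math.cos(math.radians(360 / 365 * (day_of_year + 10)))
    hour_angle = 15 * (solar_hour - 12)  # degrees from solar noon
    lat_r, decl_r, ha_r = map(math.radians, (lat, decl, hour_angle))
    sin_elev = (math.sin(lat_r) * math.sin(decl_r)
                + math.cos(lat_r) * math.cos(decl_r) * math.cos(ha_r))
    return math.degrees(math.asin(sin_elev))

def shading_risk(building, neighbor, day_of_year, solar_hour):
    """Illustrative composite feature: a close neighbor plus a low sun angle
    suggests higher shading risk for the building (lat/lon tuples)."""
    dist = haversine_km(*building, *neighbor)
    elev = solar_elevation_deg(building[0], day_of_year, solar_hour)
    if elev <= 0:
        return 0.0  # sun below the horizon: no shading
    return max(0.0, 1.0 - dist) / max(elev, 1.0)
```

Each building block is useful on its own; the value of a large function library is that compositions like this can be generated and tested automatically rather than coded one at a time.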
Now that this step is automated, users can join any file or data type (even that cumbersome geo-spatial or time-series data set) with just a few clicks using the platform’s built-in connectors. This extends to external inputs, including common ones such as maps, weather and Wikipedia.
This not only expedites ETL, it also enriches the output by empowering data professionals to explore more data types in less time. By automating this part of the analytics workflow, users can achieve the scalability, performance and operational best practices of the most successful analytics teams.
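For a sense of the scripting this automation replaces, here is a hand-written "as-of" join attaching the most recent weather observation to each sensor reading, the kind of time-series join a connector performs behind a few clicks. The column names and timestamps are invented for the sketch.

```python
from bisect import bisect_right
from datetime import datetime

# Weather observations (timestamp, temperature in °C), sorted by time.
weather = [
    (datetime(2024, 1, 1, 9, 0), 4.0),
    (datetime(2024, 1, 1, 10, 0), 6.5),
]
# Sensor readings (timestamp, energy in kWh) to be enriched.
readings = [
    (datetime(2024, 1, 1, 9, 5), 12.1),
    (datetime(2024, 1, 1, 9, 40), 13.4),
    (datetime(2024, 1, 1, 10, 10), 11.8),
]

weather_times = [ts for ts, _ in weather]

def latest_temp(ts):
    """Temperature from the most recent weather row at or before ts."""
    i = bisect_right(weather_times, ts) - 1
    return weather[i][1] if i >= 0 else None

# As-of join: each reading gets the weather in effect at its timestamp.
enriched = [(ts, kwh, latest_temp(ts)) for ts, kwh in readings]
print([temp for _, _, temp in enriched])  # [4.0, 4.0, 6.5]
```

Multiply this by every data source, time zone quirk, and schema mismatch in a real project, and the appeal of doing the join through a built-in connector is clear.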
One of our main goals in building SparkBeyond Discovery is ensuring subject matter experts can provide crucial input whenever needed. Business involvement can too often be a bottleneck, leaving data science projects stuck in pilot purgatory instead of in production and making an impact.
The new platform’s built-in help guides ensure a smoother and faster onboarding process, while plain-English explanations, intuitive microsegment management, and ‘glass box’ transparency encourage multidisciplinary teamwork, so business analysts, data professionals, subject matter experts and other decision-makers can pinpoint business actions based on the insights shown.
SparkBeyond Discovery doesn’t behave like a black box. By bringing business and operations into the fold, these hybrid teams can start fast, iterate, and quickly test new ideas better than ever before. Humans are at the center of every platform interaction to create a trust-based, transparent, ‘glass box’ machine.
Many organizations have found that 60-80% of a data scientist’s time is spent preparing the data for modeling. Once the initial model is built, only a fraction of their time—4%, according to some analyses—is spent on testing and tuning code. In essence, tuning model parameters has become a commodity, and performance is driven by data selection and preparation.
Yet by broadening and enriching the options for data selection, and automating the data preparation, the all-new SparkBeyond Discovery ensures enriched, insightful, and robust features for modeling. This bottom-up methodology is embedded into the platform workflow itself. Powered by millions of functions, the platform can test hypotheses on the data, prove or refute them, connect the dots across rows of complex datasets, and then articulate its ideas and insights in plain English.
Apply key dataset transformations through no/low-code workflows to clean, prep, and scope your datasets as needed for analysis