There is no doubt that companies from nearly all sectors of the economy are becoming aware of the benefits of data-driven decision making. These companies see the value of moving from situations where major decisions are made based on the personal experiences of a small set of executives to a more data-driven, analytical approach to decision making. A recent study completed by IBM shows that “one in three business leaders are forced to frequently make critical decisions without the information they need” [1]. As data volumes grow and data is generated at previously unheard-of velocities, simply analyzing the data to identify important trends can take many days, or in some cases weeks, of computer processing using traditional data analysis methods.
This is where the Big Data Revolution starts: the infrastructure needed to deal with high volumes of high velocity data coming from real-time systems needs to be set up so that the data can be processed and eventually understood.
What makes this a challenging task is that the data isn’t simply coming from transactional systems: it can include tweets, Facebook updates, sensor data, music, video, webpages, and of course, numeric data. Finally, the definition of today’s data might be different from that of tomorrow’s data. Think about how a blog works: we change and evolve the tags on a blog as we learn more about a subject area.
Big-data infrastructure companies, such as Cloudera, Hortonworks, MapR, 10gen, and Basho, offer software and services to help corporations create the right environments for the storage, management, and analysis of their big data. This infrastructure is essential for deriving information from the vast data stores that are being collected today. Setting up the infrastructure used to be a difficult task, but these and related companies are providing the software and expertise to get things running relatively quickly.
So let’s say that you have the infrastructure up and running. What can you expect? Most infrastructure companies offer an analytical solution that provides reporting and, in some cases, visualization capabilities. This will allow your analysts to extract actionable information from data sets.
Suppose you are the CMO of a retail organization with an online and brick-and-mortar business and you’ve developed a big-data infrastructure. Once the infrastructure is up and running, you may be able to see sales as a function of time and geography, categorized by online and offline sales. That’s interesting, but it doesn’t take the other aspects of your data into account. It is even more interesting to see how the tweets regarding your latest campaign on Facebook affected a promotion you were running yesterday on the west coast using your new location-based sales app. To answer this question, your big-data infrastructure needs to support data analysis far beyond what is usually found in a relational database management system.
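To make the first of those views concrete, here is a minimal Python sketch of sales totals broken down by day, region, and channel. The records, field layout, and function name are all hypothetical and invented for illustration; a real deployment would run this kind of aggregation inside the big-data infrastructure rather than over an in-memory list.

```python
from collections import defaultdict

# Hypothetical sales records: (date, region, channel, amount in dollars).
# These values are made up for illustration only.
SALES = [
    ("2012-04-23", "west", "online", 120.0),
    ("2012-04-23", "west", "online", 30.0),
    ("2012-04-23", "west", "store", 80.0),
    ("2012-04-23", "east", "online", 50.0),
    ("2012-04-24", "west", "online", 200.0),
    ("2012-04-24", "east", "store", 70.0),
]


def sales_by_day_region_channel(records):
    """Sum sale amounts for each (date, region, channel) combination."""
    totals = defaultdict(float)
    for date, region, channel, amount in records:
        totals[(date, region, channel)] += amount
    return dict(totals)


totals = sales_by_day_region_channel(SALES)

# Print one line per (date, region, channel) group, in sorted order.
for key in sorted(totals):
    print(key, totals[key])
```

A report like this answers the "time and geography" question, but it says nothing about tweets or the location-based app; joining those sources in is where the analysis goes beyond a simple grouped sum.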
As Trident Capital’s Big-Data Venture Advisor, I’ve considered several companies that demonstrate sophisticated software to support the big-data infrastructure. The infrastructure is the necessary first step in the Revolution. The next step will be in the area of applications that are specifically targeted to a particular vertical. Such applications will utilize the big-data infrastructure to solve particular business problems.
Going back to our retail example, the goal of the CMO of the retail organization isn’t only to figure out how a particular sales campaign worked out. The larger goal is to drive more sales across all channels. The CMO would really be interested in buying software and related services that can help drive sales across more channels by predicting user demand in real time using historical data.
At Trident, we are very interested in companies that develop such applications on top of big-data infrastructures to address industry problems.
I’ll be moderating a panel session at the Cloud Analytics Summit, hosted by THINKstrategies, the Cloud Computing Showplace, and Rising Tide Media, on Wednesday, April 25, at the Computer History Museum in Mountain View, CA. The session is called “Industry-Specific Cloud Analytics Customer Success Stories” and will take place from 1:35 to 2:15 pm.
The purpose of the Summit is to provide a unique meeting place for corporate decision-makers from mid-size and large-scale enterprises, as well as various solution providers, to learn about the latest Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), and Infrastructure-as-a-Service (IaaS) solutions aimed at addressing their BI needs. The goal is to help them harness their ‘Big Data’ sources and integrate their systems and applications into a more productive enterprise-wide resource that satisfies their corporate requirements.
We are fortunate to have several companies representing the entire spectrum from infrastructure to analytics on our panel:
- Jobvite: Applications of big-data analytics to social recruiting.
- Host Analytics: Corporate performance management applications for finance and accounting in the cloud.
- PivotLink: Applications to help retailers be more competitive and profitable.
- Ayasdi: Applications for interactive analysis and visualization of complex data.
- Skytree: Big-data analytics infrastructure for performing machine learning on massive datasets.
The panel will be an excellent opportunity for us to discuss trends in big data and the Big-Data Revolution.
[1] S. LaValle, “Business Analytics and Optimization for the Intelligent Enterprise,” IBM Global Business Services, 2009. http://www-05.ibm.com/ch/asc/pdfs/bao-intelligent-enterprise.pdf