Could you talk about the data landscape and how you see it being put to use for competitive advantage?
Everything today is information-centric. You bank, book tickets, and make purchases on the internet, and technology is now at a point where a great deal of information is being captured and can also be mined to varying degrees.
In the current generation we have information that is not just text-based, such as video, images, and so on. So key questions, like how you can really make sense of all this information and mine it to your advantage, are very important.
Today we are using a lot of sensors to capture data, which results in a lot of data being aggregated. All of this surveillance video, satellite imagery and sensor data captures a lot of information, and therefore the amount of information being created is growing exponentially. So how do you make sense of this data? How do you store it efficiently?
There are some big issues that this data is creating. People are creating, capturing and storing this data under the assumption that they will extract something from it, acquire some intelligence from it. But what has emerged as a challenge is the sheer volume of the data. Traditional computing methods are grossly incapable of handling this data and take a long time to process it. This results in data being unavailable when you need to mine it for information. Let's say that one wants to do pattern analysis for fraud tracking at a bank. If you want to track a fraudster, you want the information immediately at your fingertips, not even two minutes later.
Businesses had a huge amount of information that they realized was very useful, but they did not have any tools or techniques to make sense of that information and put it together in interesting ways.
A few things have changed. We have some technologies which have created a lot of interest in the industry. These are pretty path-breaking technologies, and as adoption increases they will change the way people currently look at IT and business systems. There is a broad umbrella term being used to describe this whole context, which is Big Data. In short, that is the problem: today we are looking for feasible solutions that can harness the power of Big Data and power analytics.
What are some of the evolving methods of working with Big Data?
So there is an architecture that has evolved called Hadoop. It began as an open-source project, inspired by papers Google published on its MapReduce and Google File System technologies. What the architecture does is allow organizations to perform distributed processing, which means they can use cheap hardware to run powerful systems. What the Hadoop architecture has made possible is to process this huge amount of information by breaking it into a number of smaller pieces and using different servers to process those pieces in parallel.
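The split-process-combine idea described above can be illustrated with a toy sketch. This is not Hadoop itself, just a minimal Python analogy: the dataset is divided into chunks, each chunk is counted independently (as separate servers would do in the map phase), and the partial results are merged (the reduce phase). All names here are illustrative, not part of any Hadoop API.

```python
from collections import Counter
from functools import reduce

def map_chunk(chunk):
    """Count words in one chunk -- the work a single node would do."""
    return Counter(chunk.split())

def merge(left, right):
    """Combine the partial counts from two nodes."""
    return left + right

# Two chunks standing in for data spread across two servers.
chunks = ["big data big systems", "data systems data"]

partials = [map_chunk(c) for c in chunks]   # "map" phase, parallelizable
total = reduce(merge, partials)             # "reduce" phase

print(total["data"])  # 3
```

In a real cluster the map calls would run on different machines close to where the data lives, which is what lets cheap commodity hardware handle volumes a single server cannot.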
The thing with big data is that, with the amount of information enterprises are gathering today, the limits of data-driven intelligence are being pushed. Today enterprises have many sources of data: supply chain data, customer data, social data, data generated within the organization, and so on.
The compute power required to mine this information was previously unavailable, but now you have technologies like Hadoop and in-memory computing which are transforming the way analytics are performed. What used to take months to crunch now takes only a few minutes.
So in this case, the possibilities are virtually endless. Enterprises will learn to serve their customers better and have more insight into user behavior. This will in turn help them make business decisions that have maximum impact. Big Data can change the way governments operate and how supply chains are built, and it can have a real positive impact on enterprise decision-making.
Could you talk to us about NetApp's role in the Big Data equation?
We have a good number of customers in this space and, like many others, we have identified this as a big opportunity for us. Being in the data storage management space, this opportunity is very exciting for us. We recently acquired a company called Engenio. In terms of footprint or install base, this company had the largest number of systems across the world, because it wasn't selling products under its own brand name; its products were being rebadged as OEM products by IBM, Dell and most other vendors.
So we acquired this company for close to $400 million, and they had roughly a $1 billion business. The reason we acquired this company was that we primarily designed storage systems for the enterprise. We had a lot of enterprise applications that were very feature-rich and were generating more and more data. In big data, the intelligence is really in the architecture: you don't necessarily need very intelligent storage, but it needs to be reliable and fast.
So when we look at a Big Data strategy, our interests are threefold:
Analytics: There are people who are innovating on that front. Hadoop is one of the prime examples, and there are others emerging as well. Big data analytics is one aspect of this whole opportunity.
Bandwidth: Speed is of the essence. You not only want to process data, you want it processed fast, so these workloads require a high amount of bandwidth. Engenio's systems were no-frills, rugged, industrial-strength boxes that delivered a great amount of bandwidth, ideal for big data offerings.
Content: Unlike traditional data, big data is not just in one datacenter; in all likelihood it is spread across the world. So you need a storage architecture that can keep the data ecosystem coherent.
We acquired a company called Bycast, which allows us to manage data that is spread across the globe. With our technology we can offer boundary-less containers to store this data. So this is what we have to offer.