Insight, Observations, and Updates on the Big Data Landscape

Jeffrey Abbott

Live From Strata + Hadoop World: Dry Lakes, Salt Lakes, Data Lakes

Jeffrey Abbott

Water, water everywhere and nothing to drink. Today I traveled from Boston to San Jose, CA. With stunningly clear weather and a window seat, I observed the transition from a frozen blanket of white covering the entire Northeast and Great Lakes, to the dry and rugged Rockies that are oddly snow-free, to the nearly empty reservoirs of California with their bleached sidewalls that reveal our failure to control our supply and demand for natural resources. The picture here is the Utah Wasatch range that’s home to Snowbird and Alta, which usually get among the most snow of any U.S. ski areas (it looks more like May than February right now). This year, you’ll find far more snow in New England. This trip brings me to the biggest gathering of big data practitioners of the year, and although I see empty reservoirs, I see lots of data lakes.


In fact, from looking at the top big data vendors, it seems that the notion of a data lake has moved past the skepticism, rejection, and second-guessing that plague all new tech concepts. Vendors, customers, and industry experts have found common ground around the idea that the data lake can relieve the challenges of the data warehouse. The big question is where the data lake fits with the data warehouse. Is it a teammate, a leader, a follower, or a full-on replacement?

The data lake, although it suffers from a bad name, leverages new technologies and approaches to accommodate both structured and unstructured data from a range of sources without the need to categorize, classify, or label that data when it’s captured. In other words, because technologies such as Hadoop enable data to be ingested with high efficiency, we can now store it without already knowing how we’ll use it.
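
To make the "store now, decide later" idea concrete, here is a minimal sketch in Python. It is not how Hadoop or any vendor's product works; the in-memory `lake` list and the `ingest` and `read_with_schema` helpers are hypothetical names invented for illustration. It only shows the general pattern: records are kept as they arrive, and structure is applied when someone reads them.

```python
import json

# Hypothetical in-memory "lake": raw records are kept exactly as they arrive.
lake = []


def ingest(raw: str) -> None:
    """Store a record without parsing, classifying, or labeling it."""
    lake.append(raw)


def read_with_schema(fields):
    """Apply a schema at read time.

    Returns only the JSON records that contain every requested field,
    reduced to those fields. Everything else stays in the lake untouched.
    """
    rows = []
    for raw in lake:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unstructured text is skipped for this particular query
        if isinstance(record, dict) and all(f in record for f in fields):
            rows.append({f: record[f] for f in fields})
    return rows


# Structured and unstructured records are ingested side by side.
ingest('{"customer": "A17", "amount": 42.5, "channel": "web"}')
ingest('free-text support note: customer asked about pricing')
ingest('{"customer": "B02", "amount": 10.0}')

# The "schema" is chosen only when a question is asked.
print(read_with_schema(["customer", "amount"]))
```

The same stored records can later answer a different question, such as `read_with_schema(["channel"])`, without re-ingesting anything, which is the flexibility the paragraph above describes.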

Although so many vendors are rushing to position their capabilities to build you a data lake, many of them are missing the primary reason why their customers are slow to adopt. The challenge is that the promised value of a data lake has two distinct categories. The first is easy. It’s the cost savings side. It’s the efficiency derived from a better way to store massive amounts of both structured and unstructured data. And although that matters, it’s… well… boring. What makes business leaders interested? New products, services, markets, customers, business models, partnerships, revenue streams, etc. And those are exactly the right types of use cases for big data analytics and data lakes.

But in order for business leaders to sign off on major investments, they need numbers, metrics, KPIs, ROI, time-to-value, opportunity cost, economies of scale, etc. And for big data, they need to understand the analytics use cases that will result in insight that advances their strategic initiatives. They need this before committing to making a major shift in how they “afford” IT, in hopes of turning it from a cost center into a revenue center.

From Day 1 at the 2015 Strata Conference in San Jose, it’s apparent that the data lake has moved from an experiment that runs alongside a data warehouse to a better approach for ingesting and storing data that has untapped value. The critical first step is to determine where and how to apply the analytics capabilities. Many studies show that identifying use cases for big data is the biggest obstacle to big data adoption. EMC has addressed this with a Big Data Vision Workshop. This infographic explains the process.

More Stories By Jeffrey Abbott

Jeff is part of Tata Consultancy Services Digital Software and Solutions group, as a lead evangelist for its IoT analytics platform solutions for smart cities, smart retail, smart banking, smart comms, and other areas.

Prior to TCS, Jeff was part of EMC’s Global Services division, helping customers understand how to identify and take advantage of opportunities in Big Data, IoT, and digital transformation. Before that, Jeff helped build and promote a cloud-based ecosystem for CA Technologies that combined an online community, cloud development platform, and e-commerce site for cloud services. He also spent several years within CA’s Thought Leadership group, developing and promoting content and programs around disruptive trends in IT. Prior to this, Jeff spent 3 years in product marketing at EMC, as well as a tenure at Citrix and at numerous hi-tech marketing firms, one of which he founded with 2 former colleagues in 1999. Jeff lives in Sudbury, MA, with his wife, 2 boys, and dog. Jeff enjoys skiing, backpacking, photography, and classic cars.