Zaloni and its Data Lake Management Approach: Interview with Scott Gidley

The accelerated evolution of the big data and database markets, together with the increasing need to manage larger volumes of diverse types of data, has created demand for so-called data lake management providers.

One of the key new players in this market is Zaloni.
Zaloni provides enterprise data lake management, governance and self-service data solutions aimed at ensuring that organizations have a clean, actionable data lake.

Co-founded and led by Ben Sharma (CEO) & Bijoy Bora (COO), this North Carolina-based software company relies on the experience of its team to provide a solution for managing reliable and efficient data lake services.

Recently we had the opportunity to speak with Scott Gidley, Zaloni’s VP of Product Management, about the company, its solutions and, more generally, the big data and data lake markets.

Scott leads the strategy and roadmap of the Zaloni platform portfolio. He is a 20-year veteran of the data management software and services market. Prior to Zaloni, Scott was a senior director with SAS and was the CTO & Co-founder of DataFlux Corporation.

Here is our interview with Scott.

Hi Scott. So, what is Zaloni, and what is the story behind the company?

Zaloni is an enterprise software company that helps its customers modernize their data platforms and ecosystem in order to leverage data more effectively, for example through more agile, more advanced analytics. Zaloni provides Bedrock, a data lake management and governance offering, and Mica, a catalog and self-service offering.

Zaloni has a number of customers across a variety of industries, including Du and Verizon in Telecommunications, UnitedHealthcare Group and SCL Health in Healthcare, as well as Avant and MetLife in Financial Services.

Based on your experience with Zaloni, how would you describe the state of the Big Data and Data Warehousing markets in terms of their maturity, their integration within a coherent data management infrastructure, and their challenges and opportunities?

From our experience, organizations continue to explore new capabilities and insights that are promised by exploiting the volume, variety and veracity of big data in their environments. While big data management and analytics technologies continue to mature, they are far from a finished product and are less mature than the existing EDW and data management platforms.

Zaloni customers often look for guidance on how to blend their existing data platform with their new big data initiatives. This includes leveraging enterprise data governance policies in their big data projects, offloading or augmenting their data management and transformation processing from the EDW into a data lake, implementing data quality, security, and tokenization routines on data native to the data lake, and transforming streaming data to deliver real-time business value.

What are the common mistakes companies tend to make when trying to launch a data lake initiative?

While data lakes are becoming the preferred platform for modernizing enterprise data environments, unmanaged, poorly thought-out data lakes are simply data swamps, and their usefulness decays over time. We find the following common mistakes:

  • The data dump into Hadoop. Too often organizations base the initial success of their data lake on the amount of data they can load into it. While this is a great initial step for cost savings, if you dump the data into Hadoop without appropriately tagging the data and indexing the metadata, you will not have the visibility and quality you need from the data lake.
  • No clear use case up front. If a company does not work to think through a solid set of use cases for the data lake prior to implementation, they will not see the benefits as easily and the entire project can get scrapped due to lack of stakeholder support.
  • Start small, deliver value, then expand. Too often organizations try to implement the most difficult use case to prove value in their big data projects. Starting with a smaller project and completing it end-to-end will provide a roadmap for enterprise-level success.
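
As an illustration of the first point in the list above, here is a minimal Python sketch of what "tagging the data and indexing the metadata" at ingestion time could look like. It is not Zaloni Bedrock's API: the `CatalogEntry` record, the `ingest_csv` and `find_by_tag` functions, the `raw/` directory layout and the `catalog.jsonl` file are all hypothetical names chosen for this example, and a local folder stands in for Hadoop storage.

```python
import csv
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class CatalogEntry:
    """Hypothetical metadata record written for every ingested file."""
    dataset: str
    path: str
    source_system: str
    tags: list
    columns: list
    row_count: int
    sha256: str
    ingested_at: str


def ingest_csv(src: Path, lake_root: Path, dataset: str,
               source_system: str, tags: list) -> CatalogEntry:
    """Copy a CSV into the raw zone and record its metadata in a catalog."""
    raw_dir = lake_root / "raw" / dataset
    raw_dir.mkdir(parents=True, exist_ok=True)
    dest = raw_dir / src.name

    data = src.read_bytes()
    dest.write_bytes(data)  # the data itself lands unchanged, in raw form

    # Capture basic technical metadata while the file is being ingested.
    with src.open(newline="") as fh:
        reader = csv.reader(fh)
        columns = next(reader, [])          # header row
        row_count = sum(1 for _ in reader)  # remaining data rows

    entry = CatalogEntry(
        dataset=dataset,
        path=str(dest),
        source_system=source_system,
        tags=list(tags),
        columns=columns,
        row_count=row_count,
        sha256=hashlib.sha256(data).hexdigest(),
        ingested_at=datetime.now(timezone.utc).isoformat(),
    )

    # Append-only JSON Lines file acting as a very small metadata index.
    with (lake_root / "catalog.jsonl").open("a") as fh:
        fh.write(json.dumps(asdict(entry)) + "\n")
    return entry


def find_by_tag(lake_root: Path, tag: str) -> list:
    """Return the catalog records that carry the given tag."""
    catalog = lake_root / "catalog.jsonl"
    if not catalog.exists():
        return []
    with catalog.open() as fh:
        return [rec for rec in map(json.loads, fh) if tag in rec["tags"]]
```

With a record like this written for every file, a later question such as "which datasets contain personal data?" becomes a catalog lookup, for example `find_by_tag(lake_root, "pii")`, rather than a search through the raw files.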

A lot has been said about data lakes. Could you describe, in your view, what a data lake and a data lake management platform are?

Data lakes are becoming the preferred foundation for the modern data platform.

Data lakes are most often described as a single repository for storing data across an enterprise in its raw format. They often utilize Hadoop, Spark, and other scale-out architectures to store and transform this data for analytic and BI use cases.

Data lakes differ from a traditional EDW in that data may be stored in any format and processed natively, rather than having to be extracted from the data lake for processing.

Existing data lake management platforms focus primarily on ingestion of data at scale, metadata management, data quality, security, and self-service data preparation. This has served the initial data lake implementations well, but as data lakes mature the focus will shift to automation and optimization of these processes.

The future success of big data environments will be based on the ability to automate many of today’s manual processes and to help end users and subject matter experts focus on business value rather than on the acquisition and management of data.

What are the main challenges your potential customers face that get them to look for a solution such as Zaloni?


  • First and foremost, organizations struggle with the skills gap created by big data technologies such as Hadoop, Spark, MapReduce, etc. Technologies such as Zaloni Bedrock alleviate the resource crunch by providing a business user-friendly interface that reduces the technical complexity.
  • The big data ecosystem comprises many tools and technologies (as mentioned above). In addition to the skills gap, organizations struggle with the requirement to stitch together these technologies into a coherent data platform. Zaloni Bedrock provides a single interface for an end-to-end data platform. This simplifies deployment and development and speeds time to value.
  • Leveraging existing data management principles (data governance, data quality, lineage, privacy and security) in the big data world. As organizations look to adopt data lakes or big data projects, they want to ensure they are compliant with their corporate data standards. Having a seamless way to leverage existing data management principles in the data lake is critical for enterprise adoption.

Zaloni's Data Lake Services (Image courtesy of Zaloni)

About Zaloni’s Data Lake 360 Platform, could you describe its key components (Bedrock & Mica)?

Bedrock is a data lake management and governance platform that enables managed ingestion of data at scale, inventories all metadata, builds workflows and supports governance requirements such as data lineage, data quality, data security and privacy, and data lifecycle management.

Mica is Zaloni’s self-service data platform. It provides an on-ramp into the data lake for the business user or data scientist. It is a catalog and self-service data preparation tool.

Together, these products are particularly powerful, as they can help IT and the business collaborate in a way that reduces time to insight.

How does Zaloni approach the issue of data warehouse modernization (augmentation)?

Data warehousing modernization or augmentation is a common use case for Zaloni customers. These use cases often focus on enabling processing of new data types and formats not easily processed by their existing data platform.

In these cases, organizations use the data lake to process data in new formats and then make it available to their existing EDW in a format that it can consume.

Zaloni also helps our customers with ETL offload, where the data lake is used for much of the core data management processing. This reduces the overhead on the EDW, allowing it to focus its compute resources on analytics, business intelligence, and other high-value use cases.

And what is the company’s approach to data governance, especially with those companies overwhelmed with traditional data warehouses and emerging big data initiatives?

Data governance is a core tenet of the Zaloni technology and our delivery methodology.

We drive governance through the entire data management pipeline, from ingestion through metadata management, data quality, and data privacy and security, to overall lifecycle management of the data.

Metadata is core to our governance practices. In addition, the Zaloni Metadata Exchange Framework allows organizations to leverage their existing data governance tooling via an exchange of data governance policies to and from the data lake.

One interesting offering is Mica, Zaloni’s data catalog and preparation offering. Could you tell us a bit more about it?

Zaloni Mica is a self-service data platform that provides data cataloging, data preparation, and provisioning of data to and from the data lake environment.

Mica provides a business-user-driven interface that engages more types of users in accessing the data that is relevant to them.

Mica also provides a governed “self-service” approach in that any preparation or data transformation may be scheduled or managed so that permanent changes to the data lake are managed as required by the use case.

Finally, on a personal note: who was/is your favorite superhero?

Marvel’s Wolverine – He is a bit of an underdog, always taking on the more powerful mutants. I always root for the underdog.

