
Zaloni and its Data Lake Management Approach: Interview with Scott Gidley

The accelerated evolution of the big data and database markets, along with the growing need to manage larger volumes of increasingly diverse data, has created the conditions for the emergence of so-called data lake management providers.

One of these key new players in this market is Zaloni.
Zaloni provides enterprise data lake management, governance, and self-service data solutions aimed at ensuring organizations have a clean, actionable data lake.

Co-founded and led by Ben Sharma (CEO) and Bijoy Bora (COO), this North Carolina-based software company relies on the experience of its team to provide a solution for managing reliable and efficient data lake services.

Recently we had the opportunity to speak with Scott Gidley ―Zaloni’s VP of Product Management― about the company, its solutions, and the big data and data lake markets in general.

Scott leads the strategy and roadmap of the Zaloni platform portfolio. He is a 20-year veteran of the data management software and services market. Prior to Zaloni, Scott was a senior director with SAS and was the CTO & Co-founder of DataFlux Corporation.

Here, an insightful interview with Scott.

Hi Scott, so, what is Zaloni, what is the story behind the company?

Zaloni is an enterprise software company that helps its customers modernize their data platforms and ecosystems in order to leverage data more effectively, such as through more agile, more advanced analytics. Zaloni provides Bedrock, its data lake management and governance offering, and Mica, its catalog and self-service offering.

Zaloni has a number of customers across a variety of industries including Du and Verizon in Telecommunications, UnitedHealthcare Group and SCL Health in Healthcare, as well as Avant and Metlife in Financial Services.

How would you describe, based on your experience with Zaloni, the state of the big data and data warehousing markets in terms of maturity, their integration within a coherent data management infrastructure, and perhaps their challenges and opportunities?

From our experience, organizations continue to explore new capabilities and insights that are promised by exploiting the volume, variety and veracity of big data in their environments. While the maturity of the big data management and analytics technologies continues to evolve, it is far from a finished product and less mature than the existing EDW and data management platforms.

Zaloni customers often look for guidance on how to blend their existing data platform with their new big data initiatives. This includes leveraging enterprise data governance policies in their big data projects,  off-loading or augmenting their data management and transformation processing from the EDW into a data lake, implementing data quality, security, and tokenization routines on data native to the data lake, and transforming streaming data to deliver real-time business value.

What are the common mistakes companies tend to make when trying to launch a data lake initiative?

While data lakes are becoming the preferred platform for modernizing enterprise data environments, un-managed, poorly thought through data lakes are simply data swamps and their usefulness decays over time. We find the following common mistakes:

  • The data dump into Hadoop. Too often organizations base the initial success of their data lake on the amount of data they can load into it. While this is a great initial step for cost savings, if you dump the data into Hadoop without appropriately tagging it and indexing the metadata, you will not have the visibility and quality you need from the data lake.
  • No clear use case up front. If a company does not think through a solid set of use cases for the data lake prior to implementation, it will not see the benefits as easily, and the entire project can get scrapped for lack of stakeholder support.
  • Start small, deliver value, then expand. Too often organizations try to implement the most difficult use case to prove value in their big data projects. Starting with a smaller project and completing it end-to-end will provide a roadmap for enterprise-level success.
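The tagging-and-indexing point can be illustrated with a minimal sketch. This is not Zaloni's implementation, and the paths and field names are hypothetical; it simply shows how recording metadata as files land keeps the lake searchable instead of letting it become a swamp:

```python
from datetime import datetime, timezone

class LakeCatalog:
    """Minimal in-memory metadata catalog: index each file as it lands."""

    def __init__(self):
        self.entries = []

    def register(self, path, source, tags):
        # Record where the file came from, how it is tagged, and when it arrived.
        entry = {
            "path": path,
            "source": source,
            "tags": set(tags),
            "ingested_at": datetime.now(timezone.utc).isoformat(),
        }
        self.entries.append(entry)
        return entry

    def search(self, tag):
        # Without this index, finding (say) all PII-bearing files means a full scan.
        return [e["path"] for e in self.entries if tag in e["tags"]]

catalog = LakeCatalog()
catalog.register("/lake/raw/orders/2017-05-01.json", "crm", ["orders", "pii"])
catalog.register("/lake/raw/clicks/2017-05-01.log", "web", ["clickstream"])
print(catalog.search("pii"))  # files tagged as containing PII stay discoverable
```

Real platforms persist this index (e.g. in HCatalog or a metadata store) rather than in memory, but the principle is the same: no file enters the lake without an entry.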


A lot has been said about Data Lakes, could you describe in your vision what a data lake and a data lake management platform are?

Data lakes are becoming the foundation of the modern data platform.

Data lakes are most often described as a single repository for storing data across an enterprise in its raw format. Data lakes often utilize Hadoop, Spark, and other scale-out architectures to store and transform this data for analytic and BI use cases.

Data Lakes differ from traditional EDW in that data may be stored in any format and processed natively, rather than having to be extracted from the data lake to process.
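This "store raw, process natively" distinction is often called schema-on-read. A small illustrative sketch (not tied to any specific product): raw records are stored exactly as the source emitted them, and a schema is applied only when the data is read for analysis, so ingestion never rejects or reshapes data up front:

```python
import json

# Raw events land in the lake untouched, in whatever shape the source emits.
raw_events = [
    '{"user": "a1", "amount": "19.99", "ts": "2017-05-01"}',
    '{"user": "b2", "amount": "5.00"}',  # missing field: still stored as-is
]

def read_with_schema(lines):
    """Apply a schema at read time: parse, coerce types, default missing fields."""
    for line in lines:
        rec = json.loads(line)
        yield {
            "user": rec["user"],
            "amount": float(rec.get("amount", 0.0)),  # string coerced to float
            "ts": rec.get("ts"),  # field absent in the raw record stays None
        }

rows = list(read_with_schema(raw_events))
print(rows[1])  # {'user': 'b2', 'amount': 5.0, 'ts': None}
```

A traditional EDW would instead enforce this schema at load time (schema-on-write), rejecting or transforming the second record before it could be stored.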

Existing data lake management platforms focus primarily on ingestion of data at scale, metadata management, data quality, security, and self-service data preparation. This has served the initial data lake implementations well, but as data lakes mature the focus will shift to automation and optimization of these processes.

The future success of big data environments will be based on the ability to automate many of the manual processes today and to help end users and subject matter experts to focus on business value and not acquisition and management of data.

What are the main challenges your potential customers face that get them to look for a solution such as Zaloni?


  • First and foremost, organizations struggle with the skills gap created by big data technologies such as Hadoop, Spark, MapReduce, etc. Technologies such as Zaloni Bedrock alleviate the resource crunch by providing a business-user-friendly interface that reduces the technical complexity.
  • The big data ecosystem is comprised of many tools and technologies (as mentioned above). In addition to the skills gap, organizations struggle with the requirement to stitch together these technologies into a coherent data platform. Zaloni Bedrock provides a single interface for an end-to-end data platform. This simplifies deployment and development and speeds time to value.
  • Leveraging existing data management principles (data governance, data quality, lineage, privacy and security) in the big data world. As organizations look to adopt data lakes or big data projects, they want to ensure they are compliant with their corporate data standards. Having a seamless way to leverage existing data management principles in the data lake is critical for enterprise adoption.

Zaloni's Data Lake Services (Image courtesy of Zaloni)

About Zaloni’s Data Lake 360 Platform, could you describe its key components (Bedrock & Mica)?

Bedrock is a data lake management and governance platform that enables managed ingestion of data at scale, inventories all metadata, builds workflows, and provides governance capabilities such as data lineage, data quality, data security and privacy, and data lifecycle management.

Mica is Zaloni’s self-service data platform. It provides an on-ramp into the data lake for the business user or data scientist. It is a catalog and self-service data preparation tool.

Together, these products are particularly powerful, as they can help IT and the business collaborate in a way that reduces time to insight.

How does Zaloni approach the issue of data warehouse modernization (augmentation)?

Data warehousing modernization or augmentation is a common use case for Zaloni customers. These use cases often focus on enabling processing of new data types and formats not easily processed by their existing data platform.

In these cases, organizations use the data lake to process data in new formats and then make it available to their existing EDW in a format it can consume.

Zaloni also helps our customers with ETL offload, where the data lake handles much of the core data management processing. This reduces overhead on the EDW, allowing it to focus its compute resources on analytics, business intelligence, and other high-value use cases.
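The offload pattern can be sketched in a few lines (illustrative only; the record shapes and column names are made up): the heavy transformation of raw, semi-structured data runs in the lake, and only the conformed, flat result is handed to the warehouse for bulk loading:

```python
import csv
import io

# Raw, semi-structured order events as they might sit in the lake.
raw = [
    {"order_id": 1, "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]},
    {"order_id": 2, "items": [{"sku": "A", "qty": 5}]},
]

# Transformation done in the lake: flatten nested items into EDW-friendly rows.
rows = [
    {"order_id": o["order_id"], "sku": it["sku"], "qty": it["qty"]}
    for o in raw
    for it in o["items"]
]

# Export in a flat format the warehouse can bulk-load (CSV here).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["order_id", "sku", "qty"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

At scale the flattening step would run on Spark or MapReduce rather than in-process Python, but the division of labor is the point: the EDW never sees the nested raw data or pays the transformation cost.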

And what is the company’s approach to data governance, especially with those companies overwhelmed with traditional data warehouses and emerging big data initiatives?

Data governance is a core tenet of the Zaloni technology and our delivery methodology.

We drive governance through the entire data management pipeline from ingestion, metadata management, data quality, data privacy and security, and overall lifecycle management of the data.

Metadata is core to our governance practices. In addition, the Zaloni Metadata Exchange Framework allows organizations to leverage their existing data governance tooling via an exchange of data governance policies to and from the data lake.

One interesting offering is Mica, Zaloni’s data catalog and preparation offering. Could you tell us a bit more about it?

Zaloni Mica is a self-service data platform that provides data cataloging, data preparation, and provisioning of data to and from the data lake environment.

Mica provides a business-user-driven interface that enables more types of users to access the data that is relevant to them.

Mica also provides a governed “self-service” approach in that any preparation or data transformation may be scheduled or managed so that permanent changes to the data lake are managed as required by the use case.
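A governed self-service approach like this can be sketched as follows (an illustrative pattern, not Mica's actual implementation; the step names are invented): each preparation step is recorded in a "recipe" and applied to produce a new dataset, so the raw data is never modified in place and the sequence of steps forms an auditable trail:

```python
# A prep "recipe" recorded as named steps; raw data is never modified in place.
recipe = []

def step(name):
    """Decorator that registers a transformation under a human-readable name."""
    def register(fn):
        recipe.append((name, fn))
        return fn
    return register

@step("strip whitespace")
def clean(rec):
    return {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}

@step("drop rows without email")
def require_email(rec):
    return rec if rec.get("email") else None  # None means: filter this row out

def apply_recipe(records):
    """Run the recorded steps in order, producing a new dataset."""
    out = records
    for _, fn in recipe:
        out = [r for r in (fn(rec) for rec in out) if r is not None]
    return out

raw = [{"email": " a@x.com "}, {"email": ""}]
prepared = apply_recipe(raw)
print([name for name, _ in recipe])  # the audit trail of governed steps
print(prepared)                      # [{'email': 'a@x.com'}]
```

Because the recipe is data rather than ad hoc edits, a governance process can review, schedule, or replay it before any change becomes permanent in the lake.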

Finally, on a personal note. Who was/is your favorite superhero?   

Marvel’s Wolverine – He is a bit of an underdog, always taking on the more powerful mutants. I always root for the underdog.


