Open Source Master Data Management – a Q&A with Jim Walker and Yves de Montcheuil of Talend by Ron Powell

b-eye-network.com

BeyeNETWORK Spotlights focus on news, events and products in the business intelligence ecosystem that are poised to have a significant impact on the industry as a whole; on the enterprises that rely on business intelligence, analytics, performance management, data warehousing and/or data governance products to understand and act on the vital information that can be gleaned from their data; or on the providers of these mission-critical products.

Presented as a Q&A-style article, these interviews with leading voices in the industry, including software vendors, end users and independent consultants, are conducted by the BeyeNETWORK and present the behind-the-scenes view that you won’t read in press releases.

This BeyeNETWORK spotlight features Ron Powell’s interview with Talend’s Jim Walker, Director of Product Marketing, and Yves de Montcheuil, Vice President of Marketing. Ron, Jim and Yves discuss open source in general, and talk about open source master data management and Talend’s goal to democratize enterprise technologies and make them available to the masses.

Jim, Talend has been in open-source integration since its inception. Are you seeing an acceleration in open source adoption?

Jim Walker: It has been interesting to see the effect of open source technology in the marketplace and how quickly open source technologies do get adopted. It has also been interesting to see the organic growth that a product goes through as it gets out into the marketplace. From day one until now, if we look back at numbers, I think we’ve had over 15 million downloads of our open source solution from our website. We’re counting now about 750,000 users of Talend software worldwide. This shows acceleration and momentum in the marketplace.

We’ve really been doubling our adoption rates year over year for all of our products, across the entire solution set of data integration, data quality, master data management and application integration. We’ve seen a significant increase, and it’s been a steady doubling every year. As we first ventured out from data integration into data quality, we have seen acceleration of our open source profiling tool. Everything we go to market with has a Talend free or open source version and a commercial edition. From the data quality point of view, the open source product is called Talend Open Profiler, so we have a tool that’s out there that many other vendors are charging about $20,000 for, and we give it away at no charge. We’re seeing an incredible number of downloads of that tool. It’s a fairly powerful tool that’s out there for the masses today, and it allows people to identify problems they have from a data quality point of view.

Talend MDM is our most recent addition as we expanded into the master data management market. In about 18 months, we’ve had about 45,000 downloads, and it’s interesting to look at the community to see where the momentum has been. We see people that are still modeling and mastering customer and product as traditional domains, but I think the pure accessibility and kind of democratization of the marketplace has really opened up master data concepts into other areas. As the market matures, we’re seeing a huge effect in the marketplace in terms of the way people think about master data management.

You mentioned data profiling and data quality, and most people in our audience are very familiar with how that works with open source, but open source MDM is a new area. Could you talk a little bit about how open source MDM is changing the marketplace?

Jim Walker: Ron, it’s really about accessibility, and it’s democratizing the market just as we saw in data integration over the past four or five years. We’re starting to see big changes. We put surveys out to our Community Edition users, those who have downloaded our open-source MDM. We asked them how they use the tool. The main focus is customer and product domains, but we’re seeing people master other domains like suppliers, assets, and organizations of websites, and there’s a lot of reference data usage. Quite honestly, MDM and MDM tools just weren’t accessible for these types of domains before. I think they were either way too complex or way too expensive. Now with an open source alternative, people are taking the master data principles and really addressing other domains, which is a unique concept and that is really what’s changing the marketplace.

Now in a year and a half, we’ve seen wide-scale adoption of Talend’s master data management products. Where we had seen no projects before, we’re starting to see a kind of organic growth in organizations. In the survey I mentioned, we asked our open-source adopters of Master Data Management how they’re using the software. We asked if they were working toward a project that will be live and in production. We wanted to determine if it was just curiosity, because a lot of people are just curious about master data management. We found that some of them were just evaluating, but we also found that people were implementing upwards of 60 different projects with open-source MDM.

When you think about 60 master data management projects that have been created in just 18 months because of our open-source tool, that’s a big change. We’ve seen maybe just into double-digit growth for many of the different solutions that are out there in terms of how many projects they’re getting going year-over-year, but upwards of 60 is just really a massive amount of MDM projects. Like I said, I think it’s because of cost and because people now understand the principles. They want to be able to apply it to really a wide array of different domains that have never really been thought about before. So, in answer to your question, yes, there is an acceleration, there’s a lot of change in the marketplace, and, quite honestly, I think open-source MDM will affect the way that people think about master data management.

You mentioned 60 different projects in MDM, and at Talend you also talk about a unified platform for all data and application integration approaches. Can you elaborate more on what you mean by unified?

Jim Walker: A unified platform to Talend is truly a unified platform. What Talend goes to market with is one set of technologies across data integration, data quality, master data management and application integration (ESB and SOA). Everything is part of a common platform. What that means is we’re talking about a single development environment across all these different functions. This is all Java-based in an Eclipse environment. As a result, there’s one set of developers who can really move from one area into the next. It provides an ease of transition between different projects. My heart and soul are, quite honestly, in the MDM space, but when I look at an MDM project, what’s the point of mastering data if you aren’t integrating data? What’s the point of mastering data if it’s not of high quality? What if you aren’t making it accessible by publishing and subscribing via an ESB for Master Data Services? All these things become extremely important. In the traditional models, or at least the legacy models of master data management, you had to go out and buy a license for each one of these things. They’re separate technologies. When you start to do these things together, you start to see economies of scale across all the different functions by being able to share developers across data integration, data quality and MDM.

The one other area I think is really important from a unified platform point of view is the concept of a common repository. With one set of technologies, I can make a configuration that’s going to connect my MDM or connect my Enterprise Service Bus to some backend database or application. If I have a set of credentials and a configuration for that, I just place that into the repository and whoever owns that system owns that system. Whenever that is updated, all the various different areas are updated as well – all the different jobs and projects going on. It’s a common repository for artifacts and all the metadata around data management projects, which I think is the real key as well. There are some nuances too about execution, monitoring, and understanding what’s going on across an entire implementation, but when you start to do all these things together, you really start to look at the problems of data management, application integration and integration in general from a very different point of view.
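The common repository idea described above can be illustrated with a minimal sketch. This is not Talend's API: the class names (`ConnectionConfig`, `Repository`, `Job`), the connection name `crm_db`, and the example hosts are all hypothetical, chosen only to show how jobs that look up a shared configuration by name pick up a change made once by the system's owner.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionConfig:
    """Connection details for one backend system (all fields hypothetical)."""
    host: str
    port: int
    user: str


@dataclass
class Repository:
    """A shared store of named connection configurations."""
    _configs: dict = field(default_factory=dict)

    def put(self, name: str, config: ConnectionConfig) -> None:
        # Adding or replacing an entry is the single place an update happens.
        self._configs[name] = config

    def get(self, name: str) -> ConnectionConfig:
        return self._configs[name]


@dataclass
class Job:
    """A job that resolves its connection by name each time it runs."""
    title: str
    connection_name: str
    repository: Repository

    def target(self) -> str:
        cfg = self.repository.get(self.connection_name)
        return f"{cfg.user}@{cfg.host}:{cfg.port}"


repo = Repository()
repo.put("crm_db", ConnectionConfig("crm-old.example.com", 5432, "etl"))

# Two jobs from different areas (MDM and ESB) share the same named entry.
mdm_job = Job("master customers", "crm_db", repo)
esb_job = Job("publish customers", "crm_db", repo)

# The system owner updates the shared entry once.
repo.put("crm_db", ConnectionConfig("crm-new.example.com", 5432, "etl"))

# Both jobs resolve the new details the next time they look up the connection.
targets = [mdm_job.target(), esb_job.target()]
```

The design choice in this sketch is that jobs hold a name rather than a copy of the credentials, so nothing in the jobs themselves has to change when the backend moves.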

If I look at the market today, one of the biggest trends is big data – the unstructured big data trend with Hadoop and NoSQL technologies. How do these trends and technologies affect Talend and your product offerings? There’s really a common thread here with Talend because this big data movement is also an open-source movement. Could you elaborate on that?

Yves de Montcheuil: You’re absolutely right, Ron. Big data started in open source and that’s where we’re seeing the most innovation today. At Talend, we recognized that quite early, and we have integrated big data into our strategy for over three years now, working very closely with the providers of Hadoop technology primarily but also our own MapReduce technology, before Hadoop got so prevalent.

Where we are seeing big data is not only as a source and a target – of course we need to get data into big data structures like Hadoop and we need to extract data from those structures – but also as a way to use the power of Hadoop to process the data for data integration, for data quality, and even for master data management. Hadoop is a highly powerful transformation engine, and what we’re doing is making it extremely easy to deploy and to run integration and quality processes inside Hadoop through our traditional user interface-driven and code-generation approaches. So big data is growing out there. The goal for organizations is to be able to leverage all their data, not just the data that is stored in a data warehouse, and to provide access to all the data regardless of its shape or its origin.

Another big trend – and it seems you’re participating in all of them – is the cloud. Most enterprises are considering participating in either a public or private cloud. How does Talend support cloud initiatives?

Yves de Montcheuil: Well, what we’re seeing in the cloud is that lots of companies, as you said, are starting to deploy mission-critical applications in the cloud. However, they’re not ready to deploy everything in the cloud. They’re keeping lots of stuff inside the firewall – the on-premises systems. One of the biggest challenges they are facing is to get those applications deployed in the cloud integrated with the rest of the information systems on premises. They end up with a very hybrid environment coming all the way from on premises, private cloud, public cloud, and then software-as-a-service – that, even though it’s not technically cloud, actually runs in the cloud and the issues are quite similar. And what you need to do is get all those applications to work together, to integrate together, not only at the data level but also at the process and application level, to be able to exchange data and to run the integration processes as close to the application and data as possible. So Talend Cloud is actually designed to run in the cloud, to run in a hybrid environment, and to connect and integrate hybrid applications.

Yves, do you see more private clouds over public or is there a balance? What are you seeing with your customer base?

Yves de Montcheuil: There is definitely a balance, Ron, depending on the size of the organization and depending on the criticality and confidentiality of the data and applications. Frankly, I think there is something for everybody in the cloud offerings out there. With a private cloud, you can very easily extend your IT stack and give it the elasticity and the flexibility that you need. With the public cloud, you get essentially unlimited scalability and elasticity but, of course, you are losing in terms of privacy and control of the data. You have to accept certain security compromises. Now as you combine both, I think you can very easily deploy an infrastructure that really meets your needs.

Well, we’ve talked about MDM and open source, we’ve talked about big data, and we’ve talked about the cloud. What do you see for the future? If you look in your crystal ball, where are we going?

Yves de Montcheuil: Well, I’d love to be able to look in the crystal ball and predict the future. But where we are going is that we are going to continue to democratize the integration market as a whole. If you look at the path that we have followed for the past five years, we started small. We started as a pure-play data integration player. We are now a global company deployed in 8 countries with 400 employees and with a very comprehensive product stack. We will continue to expand this product stack with always one purpose in mind – to democratize enterprise technologies and make them available to the masses and provide those technologies to companies that could not afford the proprietary environments of the past.

Thank you so much Jim and Yves for agreeing to this interview so our readers can learn more about Talend and open source. It’s been a pleasure talking to you.

Ron is an independent analyst, consultant and editorial expert with extensive knowledge and experience in business intelligence, big data, analytics and data warehousing. Currently president of Powell Interactive Media, which specializes in consulting and podcast services, he is also Executive Producer of The World Transformed Fast Forward Show. In 2004, Ron founded the BeyeNETWORK, which was acquired by TechTarget in 2010. Prior to the founding of the BeyeNETWORK, Ron was cofounder, publisher and editorial director of DM Review (now Information Management). He maintains an expert channel and blog on the BeyeNETWORK and may be contacted by email at rpowell@powellinteractivemedia.com.
