Dave Kellogg is an advisor, director, consultant, angel investor, and blogger focused on enterprise software startups. He has 10 years’ experience at the CEO, CMO, and independent director levels across 10+ companies ranging from zero to over $1B in revenues. He is currently an executive-in-residence (EIR) at Balderton Capital and principal of his own consulting business.
As the Co-founder and CEO of Alation, Satyen lives his passion of empowering a curious and rational world by fundamentally improving the way data consumers, creators, and stewards find, understand, and trust data. Industry insiders call him a visionary entrepreneur. Those who meet him call him warm and down-to-earth. His kids call him “Dad.”
Producer: (00:01) Hello and welcome to Data Radicals. In today's episode, Satyen sits down with Dave Kellogg. Dave Kellogg is one of the leading enterprise executives in software today. His blog, Kellblog, is among the most highly regarded content hubs for software leaders today, drawing on his extensive experience as an angel investor, board member, advisor, and thought leader. In this episode, Dave discusses why he emphasizes simple messaging, how he uses frameworks to problem-solve, and how to map your business in a complicated world.
Producer: (00:32) This podcast is brought to you by Alation. Successful companies make data-driven decisions at the right time, quickly, by combining the brilliance of their people with the power of their data. See why thousands of business and data leaders embrace Alation at Alation.com.
[music]
Satyen Sangani: (01:04) Today on Data Radicals, we have Dave Kellogg. Dave Kellogg joined Balderton as the company's first executive in residence. He brings more than a decade of experience at each of the CEO, CMO, and independent director levels across some of the world's leading SaaS and enterprise software firms. He was previously CEO of cloud financial planning company Host Analytics, now Planful, as well as SVP/GM of Salesforce's Service Cloud business and CEO of MarkLogic, and he led marketing for 9 years at business intelligence leader BusinessObjects. He started his career in technical and product marketing positions at Ingres and Versant, and previously, Dave has served on the boards of Nuxeo, Granular, Aster Data, and of course, Alation. Dave, welcome to Data Radicals.
Dave Kellogg: (01:47) Thanks for having me, Satyen. It's great to be here. Good to see you again.
Satyen Sangani: (01:50) I have wanted to have you on the podcast for a long time because you've seen, I think, almost every generation of data company that exists. And you started almost 4 decades ago at some of the very early database companies. Tell us about how you've seen the data space evolving and what changes you've seen over the last 40 years, and what are the major shifts that you, perhaps, see going forward?
Dave Kellogg: (02:13) So when I started, I viewed myself as a kind of foot soldier in the relational database (RDBMS) revolution at the time. So, 1983 was the first time I used a relational database at Lawrence Berkeley [National] Lab in Berkeley. I started working at Ingres in 1985, and I would estimate the entire RDBMS market was less than $100 million at that time, so it was very early days. I think I've seen 4 big things. If you try and look back across that long a period of time, it's hard to net it out, but I'd say first, the invention of the relational database, which went far further. By 1990, I thought that market was over. [laughter] Oops. And I was like, "Oh, there'll be something new." And so the RDBMS was a big one.
Dave Kellogg: (02:52) I think the data warehouse was huge. So the invention of the data warehouse in the '90s by Bill Inmon, I think the invention of what... I don't have a good name for it. Maybe it's NoSQL, but the hybrid search engine database, which MarkLogic was one, but I would include MongoDB, Hadoop, broadly speaking. But the database was kind of built like a search engine. That was a major change, because it allowed the introduction of unstructured data, and I would say the modern data stack today. So to me, those are the 4 big kinds of infrastructure-level changes that I've seen.
Satyen Sangani: (03:19) Yeah. And the modern data stack, I guess, would be ushered in by the likes of the cloud data warehouses like Redshift and Snowflake and Databricks. Is that fair to say?
Dave Kellogg: (03:27) Yeah. I would say that they certainly brought it. I have maybe a unique view that I think we'll get to later. But I feel like the modern data stack was the first time that the output of analytics was going into a model. And normally, the output was into a report or into a dashboard, or maybe later into a visualization, but there was always kind of a human behind it. And now, we're getting the output of analytics and making it the input to a model: first, to build it and train it, and then second, to deploy it in production. So it had massive implications because in my mind, it changed the consumer.
Satyen Sangani: (03:58) How is the consumer today different from 10, 20 years ago?
Dave Kellogg: (04:03) In the old days, BI was about building reports or dashboards or charts, visualizations, where you were basically trying to make an argument to somebody. And you would send some analysts off to study a problem and you'd come back and make an argument on slides and then you'd show it. Or maybe it'd be an ops review. But invariably, a lot of BI got output into PowerPoint, ultimately, [chuckle] and people talked about it, and that was it. The purpose was to kind of serve as information to inform a discussion to make a decision. And to me, in a modern data stack, you're not informing a group of people. You're kind of informing a model. And that means, first, that model can be used to say, as we did at Alation at one point, what's our ICP? So let's go build a model and get lots of data in there, and then we'll have a discussion. The old deployment model used the new technology to build a model, and then discuss the model and say, "Okay, we're gonna pick our ICP to be this."
Dave Kellogg: (04:55) But it can also now inform... That model could be plugged into a production application. I'm running a call center and I wanna know which leads to call back first, so it's real time, it's running in production. The consumer is effectively an application, not a human, and therefore you need observability to check for model drift. All these things you didn't have to worry about before, you now worry about, observability being one. Is it up? Is it running? Has it drifted? These are all concerns you never had when the user was a person. Ultimately to me, that was the light that went off in my head. The BI from my entire career was about building reports, charts, graphs, whatever, for people to look at and make decisions, and the modern stack was about building models. And yes, sometimes people would consume those models, but a lot of times, they wouldn't. Those models would just get plugged into a production app.
Satyen Sangani: (05:41) We talk a lot about this idea of complexity because when you have this idea of a person being the end point, on some level, that's the end point. There's no further computation that goes from the person to another computer. But when you have a model, a model can feed other models, which can then feed other models, and then if you want, you can build a ton of other models. And now you've got models and models, and at some level, you have that FiveThirtyEight problem where the model itself is brittle because it's only just a function of a whole bunch of other models. How do you see that evolving?
Dave Kellogg: (06:12) I don't have a pithy answer for you. I think it's going to keep happening. Because for years, we talked about real-time analytics, and I think the only real-time analytics app was credit card scoring or credit transactions. That's been real-time for a long time. You swipe your card, it goes off, and somebody says, "Does this one look sufficiently like the other ones to approve it? Or is it sufficiently different that we reject it?" So we're comfortable with that as a real-time app. I remember at some point people talked about real-time analytics companies and I never understood what it meant, because to me, what we actually wanted was smart applications, in my mind. I don't want to have an app on the side that makes a list of the hot sales leads to call. I want a CRM app that in the lead management module does that automatically. I want a smart app, not a BI product. And I think BI vendors for a long time, back in the day, would talk about analytic apps. And to me, the analytics needed to be baked into the production app, not the other way around.
Dave Kellogg: (07:04) So I think the trend you cited is gonna continue, and then what you've cited was just more... The problem is even harder than I described. It's not as simple as the data is feeding a report or a dashboard like the quaint old days. Now it's feeding a model, so we have to worry about, "Is it up? Is it running? Is the model still valid, or have we drifted away from the training data? And how are the downstream models affected by this?" But I think it's still the same problem. In some ways, the traditional BI stack grew up to serve the needs of “make charts, reports, and graphs for people.” And to me, the modern data stack grew up in support of models, if I had to just net it out.
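The questions Dave raises here ("Is the model still valid, or have we drifted away from the training data?") can be made concrete with a small check. The following is a minimal sketch, not a method described in the conversation: it assumes drift can be approximated by how far the mean of one live feature has moved from its training mean, measured in training standard deviations. The function names `drift_score` and `has_drifted`, the 0.5 threshold, and the sample numbers are all hypothetical.

```python
from statistics import mean, stdev


def drift_score(training_values, live_values):
    """Return how many training standard deviations the live mean has moved."""
    base_mean = mean(training_values)
    base_std = stdev(training_values)  # sample standard deviation of the baseline
    if base_std == 0:
        raise ValueError("training values have no spread; score is undefined")
    return abs(mean(live_values) - base_mean) / base_std


def has_drifted(training_values, live_values, threshold=0.5):
    """Flag drift when the score exceeds a (hypothetical) alerting threshold."""
    return drift_score(training_values, live_values) > threshold


# Hypothetical feature values: one baseline from training, two live samples.
training = [10, 12, 11, 13, 9, 10, 12, 11]
stable_live = [11, 10, 12, 11]
shifted_live = [18, 20, 19, 21]

print(has_drifted(training, stable_live))   # live data still looks like training
print(has_drifted(training, shifted_live))  # live data has moved away
```

A production check would likely use richer statistics and cover many features, but the shape stays the same: keep a baseline from training, compare live data against it, and alert when the gap crosses a threshold.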
Satyen Sangani: (07:43) Yeah, which I think is a really profound point. And I guess the implication is, if you think about things like reverse ETL where you're taking output from the analytical engines and putting them back into OLTP apps, now in the limit, every operational app is gonna become in essence, an analytical app because all of the points of transactions will have recommendations or decisions associated with them. Because if they're deterministic, they'll be done automatically, and if they're stochastic or require judgment, then there's gonna be some recommendation that comes with it, presumably.
Dave Kellogg: (08:13) Yeah, I agree. I think the future is smart apps. I don't predict the BI vendors end up on top on this. I think people who sell modern data stack tools, which are all effectively enabling tools, they're not applications to help you market leads or find customers or process service calls faster. These are infrastructure-level tools that help you build smart apps. I still think there'll be analytics on the side for the post-game show. At the quarterly business review, we need to review the monthly reports and charts and graphs and dashboards. All that stuff's gonna continue to exist where people are primarily the consumers. But there's this entire world growing up next to us where models are the consumers, and those models are enabling a generation of smart applications.
Dave Kellogg: (08:53) It's super exciting, but I think it's super different from how we grew up. And for whatever it's worth, I think it's 2 stacks. One of the things I think people are afraid to talk about today, everyone kind of pretends it's 1 stack. Oh yeah, like the MDS incorporates the traditional stack or the traditional stack incorporates the MDS. And I think we've got 2 stacks built for 2 different purposes used by 2 different teams of people. And I think one of the interesting challenges will be, to what extent and how do we unify them going forward?
Satyen Sangani: (09:18) But do you see a need for unification or does it have to be that they're unified?
Dave Kellogg: (09:21) I think it does. I'm a believer in Alation, so my immediate gut feel is to go, "Where does a model belong? In a Git repository or an Alation catalog? What's code, and what's data?" I mean, this is another interesting thing because you're like, "Hey, what's a model when models are feeding models?" Well, when models are trained by data, what's the difference between data and models? So it's kind of this existential, is it like light where it's both a particle and a wave? [laughter] Is a model both data and code? Because it kinda is. And where should it live? And where should the metadata about it live? I think right now, it's more likely to live in a code repository. The verdict being that it is code, not data, but why not in a data catalog?
Satyen Sangani: (10:02) Yeah. I think arguably one could have it live in both and those things do related things, but not the same. In the same way that you wanna discover data, you wanna discover a model because, to your point, the output of a model is just more data. So what's the difference? Not different from, I guess, the difference between sort of a materialized view and actual physicalized data. They're kind of the same thing.
Dave Kellogg: (10:22) Yeah, in some ways. And I would further agree that there are things you wanna do to a model that maybe you don't wanna do in a data catalog, but I think it's a really interesting question. If nothing else, this brick wall between code and data is getting eroded quickly.
Satyen Sangani: (10:34) Yeah. I would agree. So in this world of the modern data stack, one of the things that has happened of late is that in all of the excitement around the new stack, there's been all of these companies that have gotten funded over the last 3 to 5 years. And certainly in the last year, that number has gone down because funding in general has gone down. How do you see that trend moving forward? Do you think we're gonna have more companies, less companies?
Dave Kellogg: (11:01) So I think there might be more. It's going to be harder to get funding. I think there'll probably be a similar number of companies, but each with about half the cash [chuckle] or maybe a quarter of the cash. And so I think the thing that's gonna change is that money will not be flowing so freely. First, there's still an enormous amount of work to do, right? One of my tests on the kind of TAM for the business intelligence analytics market is, one day in the future, imagine you go to a cocktail party and you ask a bunch of business executives, "Does everybody have access to all the data they think they need in order to make good business decisions?" And the day that everyone raises their hand and says, "Yes," is the day we're done. [chuckle] And we are a long, long way from done.
Dave Kellogg: (11:39) I've felt this since 1983, basically. And we keep chipping away at the problem. The problem keeps getting harder as we chip away at it. It's like the data volumes have exploded, which makes everything exponentially harder. When I worked at Ingres (this will blow your mind), the maximum row length was 512 bytes. Okay? That was the biggest size a row could be in a table. [laughter] And we used to fight about page versus row-level locking. And it's like we have a 500-byte row and a 2K page. Who cares? [laughter] So I think we're chasing the problem as the problem gets harder and expands in scope right before our eyes. So I think there's an enormous amount of work yet to be done, so I think there'll be a lot of companies out there.
Dave Kellogg: (12:21) I do think there was a kind of Cambrian explosion of tools in the modern data stack days. I think that's gonna be followed by some consolidation, just because cash is harder to come by. I'm a big believer in the modern data stack. It's just different. I just think it's been built to do a different thing for a different user, and I think there's some desire to gloss over that by some of the people involved.
Satyen Sangani: (12:41) I would tend to agree. Despite all these companies, we've seen very few of them get to some material scale. Now we just passed $100 million; I'll call $100 million material scale because it's in some sense a self-serving definition, but I think it's actually real. And one of the things that I think or at least I've observed is that you have a lot of data companies because in some sense, data companies try to systematize how people think, and people think in a lot of different ways and business processes work and business works in a lot of different ways, and data is materialized in a lot of different ways. And so you see a lot of these companies where you can get to some small level of scale, but then those use cases don't get to a large level of scale. What are the trends that you're seeing as we now move into this more capital-constrained environment? So you see a lot of companies, you talk to a lot of them. What is this constraint-based environment now causing people to think about and do?
Dave Kellogg: (13:31) I always feel like data is sometimes an abstraction contest [laughter] because all we're selling is abstractions. "Well, I got a model. Or I got a meta model. Or I got a meta meta model, and my meta meta model is better than your model." And I don't actually think that helps sell software, for what it's worth, but you'll see it happen in data circles. Literally, I'm in meetings where I have no idea what's being discussed because there's just so many different kinds of competing abstractions. And today to make that concrete, you have the data fabric versus data mesh. In my mind, they're competing abstractions. You can't buy one of either, right? I'll give a different example. BusinessObjects, we sold a semantic layer. It was an abstraction. The reason I think it sold so well was the abstraction came with a tool. [chuckle]
Dave Kellogg: (14:11) So imagine if we had just tried to sell a semantic layer standalone, I don't think it would have worked. But we said, "Hey, we've got this tool, and what makes this tool great is the semantic layer that comes with it." All of a sudden, now the abstraction is an enabler to a more powerful tool. So in many ways, I think that's why Alation has done so well compared to, say, metadata management vendors, which is you didn't just say, "I'm a metadata repository," which you could have said. "I've got lots of metadata about lots of data assets across the organization." You said, "I've got a data catalog." You've turned into software of practical value, something that uses the abstraction of this metadata layer. So in general, that is the tip. I think the answer to the question is, companies should focus on selling things, and if the thing is better because it has an abstraction, great. [chuckle]
Dave Kellogg: (14:57) You're better because you have the metadata inside the Alation catalog. BusinessObjects is better because it has a semantic layer inside its BI tool. So if you produce a more useful tool that does something, then great. If some abstraction in there helps you do that, that's awesome. But the thing I've seen, where you're just kind of selling abstractions, I don't think it works.
Satyen Sangani: (15:16) Yeah. I would tend to agree with that. I mean, for us, I think what you observed is purposeful because I think on some level, everybody that is in or around the analytics space has always said, "Well, the metadata is the core, or most valuable asset." And so you hear long-time analytics people say, "Well, if we just figure out the metadata." And yet the reality of it is, the metadata is not a thing. It's useful in the context of something else. And so in our case, we basically focused on search and discovery as being the killer app.
Dave Kellogg: (15:41) Totally. And then you discovered 5 other apps that all rely on metadata, and that to me was the key, which is you didn't go around saying, "Hi, we've got a better metadata management tool. You can do with it whatever you might wanna do with metadata." Which is what a lot of data companies do in effect because that's a more pure world. It's easy to be puristic. As soon as you say, "I wanna help you solve a problem," now you're building an app — in your case, the search and discovery app or the core use case of the catalog to go do something. And it kind of turns it from an abstraction into an application. This is the same thing I was trying to say with BusinessObjects.
Satyen Sangani: (16:13) I wanna switch gears a little bit because one of the things that you are a master of is marketing in the data space. That's obviously where you grew up. Even in describing the data space more broadly, you're able to simplify messages that can often be pretty complex. And so one of the things that I find in the data space that's really confusing is that everybody kind of says the same [stuff]. Like, "We're reducing time to insights. We're helping create business value and helping you be a smarter business and..."
Dave Kellogg: (16:40) “Single source of truth!”
Satyen Sangani: (16:42) Yeah, single source of... Yeah, exactly. Like nobody's ever heard that before. And so you talk to lots of data companies. What's your advice on how they market themselves? And how do they think about doing that? What are the questions you ask them? And how does that inform maybe the people who are buyers who think about hearing all these messages for the first time?
Dave Kellogg: (16:58) I think for a data company, it's very important to say who you are and where you fit. You know this, but one of my favorite quotes about the whole data space is a quote from Edward R. Murrow who (I don't know what he was speaking about) said, "Anyone who's not confused really doesn't understand the situation." [laughter] And that's the way I feel about data. If you're not confused, you don't actually understand it because there's so many different competing ideas and different directions. We must understand first that people are confused. I've been in the space for 30 years. One of the reasons I wanted to work with Alation is I was tired of not understanding metadata. My joke used to be that you could only make “meta money” doing metadata. [chuckle]
Dave Kellogg: (17:38) Now you and some others have proved me wrong on that. But I never understood what metadata was. And I also joined the board of a company called Profisee, which was master data management. And the data people can't even sort out their acronyms, because even in that little world, MDM can mean either master data management or metadata management, and we wonder why people are confused. So whenever I advise a data company, I just say, "You need to draw a map of the world that looks familiar to the market and your customer, and then say, we are here and we solve this problem. And if that resonates with you, great, and if it doesn't resonate with you, we can use the map to find the person who it will resonate with." Say you solve a problem for a data steward, then you draw the map and you say, "We help data stewards right here." And if the person says, "Well, I don't know much about data stewards," you say, "Great, who's your head data steward or who's running your data stewardship program?"
Dave Kellogg: (18:27) So that to me is it. But I actually think the person who shows up with the best map wins. It's kind of a framing competition because once you're drawing the map... And this is what the modern data stack people do so well, by the way, because they had built a map that is simplified in my mind, really according to their terms. If you're looking, for example, at Andreessen Horowitz's map, it kind of maps the traditional stack into a little layer below all the new stuff. Some nice sleight-of-hand there to take an existing $10 billion industry and put it in one box in the bottom left-hand corner. [laughter] Right? Then put all of the new stuff on top. So I'll just call it the power of the map, that we need to simplify things because people are confused. And how do we do that? We draw the map, we say where we are. The other thing we need to do is stop using synonyms. We need to use the same word repeatedly to describe the same thing. If I use a name and a nickname, it confuses people because they may think they're 2 different things.
Dave Kellogg: (19:21) Well, in Alation, you might say, "I stored the metadata for something versus I catalog something." And I would just prefer that you use one or the other always, because then eventually someone's gonna say, "What's the difference between storing the metadata and cataloging it?" And I'd say... Well, personally for Alation, I would recommend saying, "Well, we put something in the data catalog, we catalog it, and when we catalog it, we automatically track who's using it, where it came from, collaboration threads around it, any privacy rules associated with it." And now I've got a rigorous language where I can say that cataloging it means something and it means all these things. So ultimately that to me is kind of “definition of terms”: Using the same terms over and over again to keep things simple, and then trying to simplify the map. Basically having empathy for the listener and saying, this is really, really confusing, and we want them to understand, so we're gonna simplify it, we're gonna make a map and show them where we are, and we're gonna talk about the same concepts using the same words, always.
Satyen Sangani: (20:18) Yeah, you almost need like a Datapedia, which ironically is of course what everybody's trying to do internally within their organizations, because they themselves need their own business glossary that they're trying to develop, and then they're buying tools that are confusingly named.
Dave Kellogg: (20:32) Yeah. It's pretty funny. You actually need a data landscape glossary, right? In the same way that they need a data glossary for their own data. Yeah.
Satyen Sangani: (20:36) Yeah. Except it changes every year. It becomes like hundreds and hundreds of companies more every single year.
Dave Kellogg: (20:42) Certain people like to stir the pot to keep it confusing because the confusion sells conference tickets and it sells analyst reports, and if you just take the same line and put it in a new bottle every couple of years, that keeps people busy at conferences and stuff. So I would say the industry in general, it's probably one of the contributing factors for why there is such confusion. I do think it's inherently complicated, for what it's worth, and I do think a lot of it is just abstract, but I do think some people benefit from confusion, for sure.
Satyen Sangani: (21:08) Okay. I like your definition of the modern data stack, which is maybe the fundamental output of it is no longer just a report, but possibly a model. I think that's actually a fairly grokable, thoughtful thing. But I also have seen so many different generations, over the last decade that I have been building Alation, of different versions of what might be the modern data stack. It seems like there's new tools literally every quarter, much less every year. And so in that sense, I do feel like there's a little bit of a sleight-of-hand in calling any one of these versions “modern.”
Dave Kellogg: (21:38) Yeah. And the “modern metadata stack.” And I mean, there's definitely kind of a linguistic sleight-of-hand going on. “Modern” is inherently good. It's a good name. Let's be clear, the modern data stack people have good marketing with the help of not only their own talent, but also the help of some talented VCs like Andreessen. That paper by Martin Casado et al. on the modern data stack is very good. It's a very good piece of a kind of architecture-level marketing. My only beef with it, as a traditional data stack person, is that it tends to be a little dismissive of the fact that these are actually 2 different stacks that grew up in different times with different users. And, yes, ELT might be useful in my world; I think it is, actually. So you're gonna get stuff that's useful in both that grew up in one, but that doesn't change the fact that it grew up primarily serving world B versus world A.
Satyen Sangani: (22:25) So you write a blog. When we first met, one of the things that I always admired about you is that you have this ability to sort of create a framework around everything. Every question that I had, you had a framework that was already created in thinking that through or that particular issue through, whether it was the business issue or a market issue or a people issue or whatever it may be. Was that a cultivated skill? Did you always have this capability, or how did you come to this ability to simplify things in such an easy way — or what seems like an easy way?
Dave Kellogg: (22:55) So I think there are 2 things driving it. One, as a marketing person, you get dedicated very quickly to simplification. So as you know, fairly early in my career, I switched from being kind of a math computer guy to a marketing guy. And look, the math guy in me loves rigorous foundations, definitions, axioms, theorems, building up a logical argument, so that's always been helpful. I mean, that's why I say with pipeline today, if you don't have good definitions for your pipeline stages, you're analyzing smoke in a butterfly on that or something if there's no foundation. I got that from my math training. I got simplification from marketing, because marketing is really all about simplification. People don't have a lot of time. The world is super complicated. People need the world simplified for them. And if you don't do it, somebody else will. A confused buyer is just gonna buy from the market leader. So if you're running a startup, you're definitionally not that.
[laughter]
Dave Kellogg: (23:45) So the burden of simplicity is on you, right? If you wanna be successful, you need to have a very simple explanation of why someone should buy your stuff. So part of it comes from my marketing background and a desire for simplification. And some of it comes from my executive background, which is just, I like rigorous decision-making, and rather than solve the issue at hand, sometimes to a fault, I would try to build a framework for solving it more broadly. That's useful because when you get to be my age, you end up with a lot of frameworks that you've built. It might actually slow you down a little bit, it might be irritating to your team: Like, “Just answer the question!” It's like, "No, no, I built this really cool framework for how to think about the question."
[chuckle]
Dave Kellogg: (24:20) But I think it's a combination of those 2 things. Because by the way, one of the ways we simplify marketing is we build frameworks. We build positioning charts. So we're trying to say, "Hey, the world is really confusing. There's 2 types of databases." I'll go back to 1987 in Sybase's original positioning. "There are 2 types of databases, those that are good at doing decision-making, DSS, and those that are good at OLTP, and then there's those that use older technology, hierarchical and network, and those that use relational, and there's only one database that can do OLTP that's relational, and that's Sybase." That's a classic simple marketing framework that explains the world in a familiar way. The person's head is nodding the whole time. At the end of that framework, I understand your message, which is, "Oh, you're like IDMS, but you're relational, and Oracle and Informix and the other people are all really optimized for DSS and can't handle transaction volume. Is that correct?"
Dave Kellogg: (25:08) And the answer is yes. And that's just how marketing works. So I don't know if I'm drawn to frameworks which drew me to marketing or if I'm drawn to marketing which drew me to frameworks, but that's why I have this framework thing. The other three that I'd say, Satyen — just because it's fun — is it's really hard to figure out when you're simplifying, when you're throwing out essential ingredients and when you're not. Have I simplified to the point where my simplification is really meaningless, because I threw out something important? And that leads you to the level of layered messaging, which is actually the way you learn math. They'll say, "Okay, assume the world is flat, and tell me how to calculate the area of a square." Now assume the world is a sphere, then tell me how to calculate the area. So now we're non-Euclidean geometry, right? And it's like, "Oh." So to me, it's the same idea. It's kind of like if you make a simplifying assumption, you solve the problem, and if you want to get to the next layer, then you say, "Well, let's go challenge that assumption, and now do it again." And that's why really good marketing messaging has multiple layers. So that's where you build credibility with a technical crowd because they're gonna be like, "You sound like some idiot who doesn't know what they're talking about."
Dave Kellogg: (26:07) And it's like, "No, no, no! We could change those three simplifying assumptions and get down deep and talk about why we're different" — and that layering ability is so important.
Satyen Sangani: (26:16) And how do you do that? I mean, it's not just important in marketing messages. It's important in building models for analytics, frankly. I mean, to be able to pick out the salient characteristics of a phenomenon is kind of the essence of thinking. What are the key things that you do to get to that essence? Do you have any tips or tricks or techniques that you might recommend when you approach a problem?
Dave Kellogg: (26:35) So first, before answering, I'll just say, I'm reading [Richard] Rumelt’s latest book called The Crux. You know I'm a fan of Good Strategy, Bad Strategy, his first strategy book, and The Crux is really all about basically identifying the crux of the problem, and it's all focused on this: That in a business, there might be 6 or 8 big problems the company is worried about, and his whole thing is, get to the crux, get to the crux, get to the crux, get to the single one that is the biggest one. Solve that. And then when you're done solving that, look at the next one. So to me, as I read that book, to me, it's the same conversation, whether it's company strategy, marketing messaging. So much of success in life is sorting signal from noise, identifying the most highly leveraged problem and then focusing on that. I guess one trick is you say, "Well, what if this problem was solved, what will my next problem be?" And I think that could help you sort them. Let's just say you have 8 big problems and imagine each one is solved, what your next problem is, that might help you figure out which is the most important one, or certainly if solving 1 solves 3 others, you're gonna get that relationship.
Dave Kellogg: (27:33) I think to me, it's largely about discipline. You know one of my tricks is to force things into 3’s. No matter what the question is, you have to have a 3-point answer, and that forces you into a layered message. It just does, because you can't say everything the engineers wanna say about what makes it different, because that'll be 15 points, and that answer is too long and too complicated. So you only get 3, which forces you to abstraction, and then you get 3, about 3. This is what I call a ternary message tree, because I can remember that. I still remember message trees I built 20 years ago, because you can remember 3 things and you can remember 3 things about 3 things. So those are some of the tricks. I use message trees, and I just... To me, I really enjoy that work. For what it's worth, it's really trying to identify that hardest problem, because I think it's so... I guess it might even be that to such an... Just trying, just actually saying — because I find it pleasurable to say — “I wanna find the single hardest problem and be reasonably sure I had it right.”
Dave Kellogg: (28:23) Because I think a lot of people just give up. They just say, "Well, we'll just have 6 goals." And to me, it's kind of a copout. I'll feel bad and then say, "I should have tried to find the 1 or 2. Could I have reduced it further?" I'll give you another example of the reductionism: my little reductive mission statements like "Marketing exists to make sales easier" or "HR exists to help managers manage." And those are actually pretty robust. They aren't just little pithy sound bites I came up with. I can actually derive almost all of marketing from "make sales easier," and I can derive almost all of HR [from] "help managers manage." So I guess I just like reductionism, and I don't know what's the chicken or what's the egg in terms of marketing, math, and reductionism, but to me, it's what I like doing, and I think it's really useful for messaging, and it's really useful for strategy.
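[Editor's sketch: the "ternary message tree" Dave describes (3 points, then 3 points about each of those 3) can be pictured as a small tree structure that refuses a fourth point at any node. This is only an illustration of the idea, not a tool Dave mentions using. The class name `Message`, the constant `MAX_POINTS`, the helper `build_example`, and the placeholder texts are all hypothetical.]

```python
from dataclasses import dataclass, field

MAX_POINTS = 3  # the "force things into 3's" rule from the conversation


@dataclass
class Message:
    """One node in a message tree: a claim plus at most three supporting points."""
    text: str
    children: list["Message"] = field(default_factory=list)

    def add(self, text: str) -> "Message":
        # Refuse a fourth point: the constraint is what forces abstraction.
        if len(self.children) >= MAX_POINTS:
            raise ValueError(
                f"'{self.text}' already has {MAX_POINTS} points; "
                "abstract instead of adding another"
            )
        child = Message(text)
        self.children.append(child)
        return child

    def layer(self, depth: int) -> list[str]:
        """Return every message at the given depth (0 is the headline)."""
        if depth == 0:
            return [self.text]
        return [t for child in self.children for t in child.layer(depth - 1)]


def build_example() -> Message:
    # Placeholder content only; a real tree would hold actual messaging.
    root = Message("Headline claim")
    for i in range(1, 4):
        point = root.add(f"Point {i}")
        for j in range(1, 4):
            point.add(f"Point {i}.{j}")
    return root
```

[Calling `build_example().layer(1)` gives the 3 top-level points and `layer(2)` gives the 9 supporting ones, which mirrors the layered messaging discussed earlier: each deeper layer is available when a technical audience challenges the simplification.]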
Satyen Sangani: (29:04) Yeah, for sure. The thing that I love about this idea of sort of getting to the essence of a problem or reductionism is that it forces you to prioritize because you can't execute on everything. And so every single time you're having to sort of pick a set of problems, what's the right team member to go hire next, where do I make my next investment, what feature do I build next, where do I invest my next sales dollar? All of those questions are effectively ones of priority. No matter what scale you're at, no matter what you're trying to do, you always have that fundamental problem. Who are the people that have influenced you? One of the questions I've never really asked you is, who have been the mentors in your career that have influenced how you've thought? Because you've certainly been one of mine over this journey, and I'd be curious to know who the people are that have affected your career and how you think and how you act.
Dave Kellogg: (29:54) Well first, thanks, and it's been a pleasure being one. Second, mine would be... I'm just gonna pick people who they're more like flashpoints, who really influenced me. Like at Ingres, one of our CMOs was a guy named Chris Greendale, and he was the guy who taught me the "marketing exists to make sales easier" thing. And I was like, "I like that." I heard that in an all-hands meeting. I was like, "I like that, and I wanna grab onto that." And I became a CMO on the back of that little three-word slogan. I like a lot of marketing thinkers. Theodore Levitt is probably one of my favorite marketing authors. He's a fairly theoretical marketing author. He was a Harvard Business School professor. He wrote the “Marketing Myopia” essay about railroad trains and are they in the railroad business, the transportation business. So he was kind of a deep thinker in marketing. I liked him. Mike Moritz at Sequoia, I didn't know him terribly well, because he wasn't my partner, but we had one or 2 crossovers and he was like, "Make a plan that you could beat." And that was another one of those — I tend to absorb those phrases. I was like, "Oh, I like that. I know what my job is now."
Dave Kellogg: (30:50) My job is to make a plan that I can beat and I did. So that was super useful. Certain salespeople, I'm having trouble remembering whom, like name names, but I've learned a lot from salespeople because I'm not a salesperson by nature, and I was always interested in learning a lot about sales. Certainly Tim [O’Neil, CRO] at Alation, I learned a lot from him. He is one of the best prioritizers I've ever worked with, and the thing that's unique about Tim in my mind is — because a lot of salespeople can think about three things, but on Monday, they're different than they are on Tuesday, and they're different than they are on Wednesday, and that's not helpful. The real moral is if you could think of three things and say, "These are the three things that matter, this is what sales needs from the organization," and those don't change over months...
Satyen Sangani: (31:31) For sure.
Dave Kellogg: (31:32) That's powerful, and that's what Tim does, in my mind, uniquely well, because that obviously, it makes it easy for the organization to fall in line, because there's a consistent direction. And the more there's a weather vane, the sales side says, "We want this, now we want that and then we want IBM, and then we want PLG," if they're a weather vane of priorities, it just kind of paralyzes the organization behind them.
Satyen Sangani: (31:51) Which is the caricature in sales, because in essence, a lot of salespeople are, "Well, what's the most important three things? It's the last three things I heard in my last deal."
Dave Kellogg: (31:58) Absolutely. The Qualified Sales Leader was a great book in that regard. I would not be doing justice to Bernard Liautaud as well. He's currently the boss at Balderton where I work a day a week, so I don't wanna sound like a suck-up, but I'm a part-time employee there. But at BusinessObjects for nine years, I learned a ton from Bernard on how to run a business. I mean, most of the stuff I write on metrics are derivatives from sheets that BusinessObjects built in the '90s. I could show you the BusinessObjects weekly sheet and you would recognize it in a lot of my work, and it's not just that I've learned from him, but he's very good at discipline, very good at staying rational, very good at putting the company first, very good at treating people well, I think. So I learned a ton from him, but I think at that point, it was probably members of my team. I learned a lot from our CFO, Ian. He was really solid.
Satyen Sangani: (32:41) Super helpful to have that list. And I'm sure anybody who can make that list for themselves, well, obviously gives you not only sort of the people, but also what you've learned from those people, which is obviously a helpful inventory to understand what you yourself believe. Switching gears a little bit, you and I have spent a lot of time talking about this concept of data intelligence, and early on when you were sort of in an interim gig here as our CMO, we had a lot of conversations where, are we a data catalog or are we a data intelligence platform, or are we something else, or are we a governance tool. And you sort of grounded us in this idea of intelligence, which was obviously something that Stewart Bond from IDC had come up with. How do you think about the space of data intelligence and how does that fit with the category that you know even better, which is business intelligence?
Dave Kellogg: (33:24) I'll go back to that historical time there, and to me, the data catalog somehow in my opinion got synonymized with search and discovery, and therefore it was gonna hold you back. If you just said, “We're a data catalog,” not everyone, but a lot of people heard, “Oh, a catalog. What do I do with a catalog? I find things just like a card catalog in a library or any other kind of catalog.” So I think while search and discovery was Alation's kind of beachhead use case — and by the way, choosing the word “data catalog” back in the day, it was an excellent decision, because I think if you called yourself a data search and discovery app, it would not have gone as well. We could debate that, but I think naming it a data catalog was an excellent idea, and I think you rode the data catalog kind of as far as it could go. And one of my theories on these markets is they're a little bit like boxing matches, and I learned this at Ingres, because at Ingres the first round of the boxing match was Ingres versus Informix versus Sybase versus Oracle, and then Oracle won that, Sybase came in second, Informix third, Ingres fourth, probably.
Dave Kellogg: (34:23) And you learn that when you win a round, it's not just over. There's another round, and now you get to go up against IBM and Microsoft, right? So now Oracle got to go on to round 2. Like, you beat the other little guys. Now you're fighting the big guys like Microsoft and IBM, and that's the way software works, in my mind, is that if you don't evolve, if you just stay focused on what you started as, it doesn't end well, that you're kinda like a shark. You always need to be moving, always need to be swimming forward. So I felt that Alation is an excellent data catalog and search and discovery is a critically important use case, but that we needed to keep moving forward. And there was this notion that Stewart at IDC was pushing called data intelligence, and I really felt Alation needed a new category to kinda grab onto. And Stewart had provided us with one because he was pretty aligned in terms of where we wanted to go, and most important to me at the time, it was a unification of search and discovery and governance. There are a lot of other ancillary categories, but the 2 big ones were search and discovery and governance, and how do we unite them into one thing, and again, Stewart had helped with that by saying data intelligence.
Dave Kellogg: (35:28) I view data intelligence as a software category, much like BI by the way. I'm gonna give you an analogy. In BI, round 1 was query and reporting tools. BusinessObjects won that, Cognos won in OLAP and Crystal won in reporting. And there were specialized competitors. Actuate was a reporting competitor, OLAP@Work was an OLAP competitor, TM1 was an OLAP competitor, BusinessObjects had other query competitors like Cognos Impromptu. But you had round 1. What we now think of as subcategories each got won, and then round 2 was to consolidate the categories into BI. That's what BI was. BI became a category of software that included query and reporting tools like BusinessObjects, OLAP tools like Cognos PowerPlay, and enterprise reporting tools like Crystal Reports. And to me, if you go back, and this is back to the simplification thing, but that was it.
Dave Kellogg: (36:41) All we did at BusinessObjects was step 1, win query and reporting, step 2, win BI suite, and to win BI suite, you needed to figure out which pieces mattered, because there were 5 other pieces out here that we could have bet on, but we didn't. We said, "We think the core of this is Q&R, OLAP, and enterprise reporting." So yeah, we did a data mining launch, we kind of did kiss the ring. If an analyst was excited about something, we’d OEM a technology and launch it, but we understood what the war was. And eventually, by acquiring Crystal, we really won that, because we had the leading query and reporting tool, and now we had the leading enterprise reporting tool, and our query and reporting tool was adequate at OLAP. And that's what I saw happen with data intelligence, that the 2 anchor tenants were gonna be data search and discovery and data governance. I wasn't sure what the other ones were gonna be, and I felt like, wow, this is history all over again, because you have a lot of candidates to pick from.
Dave Kellogg: (37:08) Is it privacy, is it lineage, is it security? Is it... I'm not sure, but you know more than I do about that. So I see data intelligence as a kind of second-generation software category round 2 of industry consolidation, where the big players are duking it out for who's gonna be a billion dollar revenue company and the winners in that round are gonna do very well. Notably, I do not think data intelligence is a business goal. That's what I call data culture. I think corporations aspire to having a data culture. I know it's not new, and I know people are maybe tired of talking about it, but now I'm back to my cocktail party. Does everybody have a data culture they want? No. Okay, so you've wanted it for a really long time and you don't have it. That to me says we keep talking about it. It doesn't say people are bored about it. It says, no, no, people want this. They wanted it 10 years ago. They still don't have it. Let's go figure out a way to get them what they want. So I think data culture is actually the business goal that drives a lot of BI and DI and I think DI, just like BI, is a means to an end. BI is giving you tools to help you analyze and interpret data. DI is basically giving you data infrastructure to help you kinda manage it all.
Satyen Sangani: (38:15) Yeah. Or in the same way that the other analogy I could use would be like customer-centricity and CRM. Everybody wants to be customer-centric, but CRM doesn't necessarily make you customer-centric.
Dave Kellogg: (38:24) Yeah, it's a good point. It's a means to an end, and it's an excellent point that just because you have the means doesn't mean [chuckle] you're gonna have the end. Although I would argue it's probably a necessary, but not sufficient condition. It's very hard to be customer-centric without a CRM.
Satyen Sangani: (38:38) Yeah, one of the things that I find interesting in the BI landscape is that companies ultimately ended up evolving to have multiple BI tools. And certainly in the data landscape and the data platforms landscape, you see lots of companies with lots of different databases. I don't necessarily believe that that is what's going to happen in the data intelligence space, because it feels to me like there's almost a singular thing in governance and search and discovery. Those have to be one thing, not 10 things. How have you seen that evolve? You've obviously been on the board of a company like Profisee. You talked to a lot of these players. How do you see this playing out?
Dave Kellogg: (39:09) I just think we need to look at the drivers of entropy, right? So a degree of disorder and CIOs, their job is to reduce entropy, right? Here we go again, more reductionism. But if I had to just take what CIOs do for a living, it's reduce entropy. And literally, I met the CIO of GE once back in the day, and this is a quote, and this is where I got this from. He's like, "Dave, let me tell you what my job is." He was from New York. He's like, "I look at every software category, I make three buckets. Bucket A, you gotta use it; bucket B, you can use it; bucket C, don't try to use it." [laughter] I'm listening to the CIO of a $50 billion corporation.
Dave Kellogg: (39:44) And it was all about entropy. It wasn't about information for competitive advantage. It was about, I need to get control of the corporate-wide infrastructure. He presumed, I believe correctly, that if you buy BusinessObjects or Cognos, probably either one can do the job. You go buy some runner-up 2 or 3 tool, it might not, and then we're stuck with a tool that nobody knows how to use and isn't doing the job. So that was basically the driving philosophy, I reduce entropy and reduce risk, and in some categories, there is a need for entropy. Let me give you an example. If you wanna manage time series data in this world — and I'm an advisor to a time series company, Influx — you should buy a time series database, in my opinion. The general purpose solution is not good enough. It's not even close to good enough. So there's gonna be entropy in databases. If you want to manage documents, you should probably use MongoDB or... I don't know, MarkLogic, MongoDB, a document-oriented database. To me, you have to look at the technology and say, well, that means you need entropy because there's no one general purpose database that can really hold everything.
Dave Kellogg: (40:43) Arguably, this is one of the reasons Spark was there, was that they kind of built their own in-memory database for doing analytics. So I think that's where you have to start. Where are the forces that drive us towards chaos, and where are the forces that drive us the other way? And my belief is at the low layer in databases, Michael Stonebraker wrote a paper on this decades ago called "One Size Fits All," and his answer was: basically, it doesn't. And by the way, it's the same thing in AI, where this is called the No Free Lunch problem, that there's no one generalized solution that can solve every problem. Stonebraker is basically saying there's no one database that can solve every problem. If you wanna do star schema data warehousing, columnar databases are fantastic. If you don't, they're not good. If you wanna do time series, you should use a product like Timescale or Influx because they're built for time series. Same for text indexing. Long answer to the question, but I actually believe that data intelligence layer is a counterpoint, that you do want one of them, given that at the layer below, I'm gonna have a billion different databases and tools and assets. Visualization: Tableau is amazing at visualization. I'm gonna have Tableau on my stack. Looker is amazing at dashboards. There's going to be some Looker out there.
Dave Kellogg: (41:48) If I’m a CIO, I just know it, and I'm not gonna be able to look somebody in the eye and say, "Get off Looker, because this thing is better." Because I don't know if there is something better. I don't think there is anything better than Tableau at doing general purpose visualization. So I have to look at this environment and say, "Where must I accept entropy even though I don't like it, and where must I not?" And I'm gonna say, "Look, given that I have all these different types of information assets and all these different types of software managing them, at the search and discovery and the governance layer, I should have one." Because that's where I put Humpty Dumpty back together again. So I've always been a big believer. Back in the early days, we used to call this enterprise data modeling, which I think didn't work, because you're trying to build a single data model for the whole organization. That's too ambitious, but a single repository of metadata, governance and privacy rules, I think you can do, and I think you should do. So I actually think the entropy out there drives to Alation's advantage because the more complicated that world gets, the more you need one place to go, be the kind of single source of reference as you sometimes call it, to go find it all.
Satyen Sangani: (42:47) Yeah, I would tend to agree. Obviously, that's in line with how we think about the world, and I think you can't really have three collaboration tools. Like if three people are going to three different places to collaborate with each other, like if the data engineers are going one place and the analysts are going to another place and the business people are going to another place, it's like, well, how are they all collaborating with each other?
Dave Kellogg: (43:07) For governance and data search, it's an excellent point. I love that argument. You can't have 3 search engines, collaboration locations, governance repositories. You gotta have 1.
Satyen Sangani: (43:16) You've struggled and I think in the best of ways with learning the space. You've gone to boards of different companies, and executives at different companies. You have the advantage of having a son who — PhD, brilliant kid — and came out and obviously started working at some of these companies and evaluating them. What's the advice you'd give to somebody early in their careers or even mid-stage in their careers as they think about making a career in analytics and databases, and what would you tell them to go do or think about?
Dave Kellogg: (43:46) So 2 things. One, I'll tell you what I learned from my son, and it was really interesting. I learned about modeling from him because in his worldview, the only thing you would do with data is make a model. It's almost incomprehensible to him to do something else. [chuckle] And that itself was really interesting, because I was like, "Well, what do you do with data?" He was like, "Make models. Is there something else you can do with it?" And it's like, "Yeah, actually. [laughter] There are a lot of other things you can do with it." But his upbringing and his training just taught him like, "I make models. That's what I do." And that's what he does. And that was part of what helped me understand the modern data stack, right?
Dave Kellogg: (44:21) Like you ask him, “What do you think about SQL?” My son’s thoughts on SQL were fascinating to me, which was “SQL is something you need to read a book on before a job interview, because they're gonna ask you a few questions that you need to answer and then you throw your book on the shelf, and then if you eventually need SQL later to get some data to go make models, then you'll go grab the SQL book,” but that was it. It was not cool, because for me it was cool, interesting, new language, revolutionary, Codd, set algebra, blah, blah, blah. For him, it was just like this kind of very boring means to an end that yet invariably, there'll be a couple of questions on it in a job interview, so you have to know it. Once again, that's how a modeler thinks of SQL. This is why I like Compose so much, by the way, Satyen.
Dave Kellogg: (45:00) When I saw Compose, I was like, that's the tool that my son wants, because he's not interested in SQL. If there's some SQL left over from the last guy, and I can go customize that, that's fantastic because I have no interest in SQL whatsoever. SQL is a means to an end, and the end is building models, and if you could help me find good data to build the model, even better, because garbage in, garbage out is much worse today than it was 20 years ago, because then you'd just have a bad report and maybe make a bad decision. Now you're feeding a bad production model, which is scaling the bad decision-making at real-time speed. So the answer to your question — look, 10 years ago, I used to tell people “data science.” I was like the guy in The Graduate who told Dustin Hoffman “Plastics,” and I always felt like I'd go up to younger people and say “Data science, plastics, it's the future.” And now I say “causal inference.” It's what my son actually does at Google, but I think the hook is that we're all drilled to hear that correlation is not causation.
Dave Kellogg: (45:54) We are trained to say that and it's right, but then you go to a business meeting and we don't act like we know that. We go, "Oh my God, every customer who had more than 4 priority 1 calls last quarter churned, so we've got to reduce priority 1 call volume." Right? Now maybe you don't do that at Alation, but a lot of companies do where they're implicitly deriving a causal relationship where there's none there. Maybe the problem is the software’s not working, and that's why they're calling support a lot. The old ice cream sales and drowning correlation where temperatures have gone up, so there's more ice cream sold and more drownings. But we run our businesses that way. So I believe, in the future, the really interesting problem is gonna be causality because when you have enough data, you can from purely observational data derive causality. So I just think causal inference, The Book of Why is a great book on it. The guy's name is Judea Pearl who wrote it, and I continue to evangelize causal inference today the way I did data science 10 years ago, and I think I'm gonna be right again.
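[Editor's sketch: the ice cream and drowning example can be reproduced with a few lines of simulated data. Everything here is an assumption made for illustration: the data are synthetic, temperature is modeled as the only common cause, and the function names `pearson`, `residuals`, and `simulate` are hypothetical. It shows only the simplest case, where the confounder is known and measured, and is not a rendering of the methods in The Book of Why.]

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a least-squares linear fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n=5000, seed=0):
    """Synthetic world: temperature drives both series; neither drives the other."""
    rng = random.Random(seed)
    temperature = [rng.gauss(0, 1) for _ in range(n)]
    ice_cream = [t + rng.gauss(0, 1) for t in temperature]
    drownings = [t + rng.gauss(0, 1) for t in temperature]

    # Naive view: the two series look clearly related.
    raw = pearson(ice_cream, drownings)
    # Adjusted view: compare only what temperature does not explain.
    adjusted = pearson(
        residuals(ice_cream, temperature), residuals(drownings, temperature)
    )
    return raw, adjusted
```

[Under these assumptions, `simulate()` returns a raw correlation near 0.5 and an adjusted correlation near zero: once the shared cause is accounted for, the apparent link between ice cream and drownings largely disappears. The support-call example has the same shape if a broken product drives both call volume and churn.]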
Satyen Sangani: (46:51) But I think that's the essence of analytics. The idea of a why, especially in a world of generative AI where you don't even know, is the machine writing the text that's being read, that's being written as the model running off a model? We're just running off a model. It's like just being able to sort of unpack the layers and layers and layers of crap that the machines are creating for us and for ourselves all day long. That's gonna be the essence of how we create value for professionals and for people. Yeah, right there with you. Dave, this has been awesome as always to have you on. Thank you for taking the time. Every time I talk to you, I learn something, and today is no different.
Dave Kellogg: (47:29) It’s a pleasure to see you. It's always great to see you too. I had a lot of fun and keep doing well over there at Alation. We're rooting for you.
Satyen Sangani: (47:35) Thanks. Have a good one.
Dave Kellogg: (47:37) Take care.
Satyen Sangani: (47:44) I've had the privilege of working with Dave for years. What I love about him is that no matter what problem you bring to him, he always responds with a framework for how to think about it. You can give a woman a fish today or you could provide her with a framework so she can survive in any waters. As a data radical, you're trying to deliver insights and enable the organization around you to know more, but the best way to help sometimes isn't with an answer. It's by thinking about how we ought to get to the answer and testing those underlying assumptions with the various stakeholders at hand. Frameworks breed simplicity and clarity. They also provide far more value than an answer alone and enable you to communicate your ideas in a much more consumable way. Think about how people market software. A confused prospect will typically buy from the market leader, but if you're a start-up, that's not ideal. So as Dave points out, you should develop a framework that not only describes the buyer's complex world in a simple way, but also confirms your solution’s role within that world.
Satyen Sangani: (48:44) So take a hint from marketing and economics. Simplify with a framework. Thank you for listening to this episode. And thank you, Dave, for joining. I'm your host, Satyen Sangani, CEO of Alation. And Data Radicals, stay the course. Keep learning and sharing until next time.
Producer: (49:02) This podcast is brought to you by Alation. Your entire business community uses data — not just your data experts. Learn how to build a “village” of stakeholders throughout your organization to launch a data governance program, and find out how a data catalog can accelerate adoption. Watch the on-demand webinar titled "Data Governance Takes a Village" at Alation.com/village.
[music]