You wouldn’t think a data scientist would tout vulnerability and storytelling as requirements for success, but that is exactly what Jacey Heuer has learned across multiple industries and projects that have failed and succeeded. In the first of this three-part series, Heuer shares that “what you think you know today should change tomorrow because you’re always discovering something more.”
Success in data science means:
- Acknowledging that 80% of projects never make it into production, not because of a failure of science but because of a failure to communicate and be vulnerable.
- Putting yourself out there by connecting with different people.
- Acquiring and honing new skills and behaviors that support a deeper understanding of systems thinking and the dynamic variables within those systems.
- Always iterating and reinventing. The work is never done, and it’s never easy.
Three distinctions for roles and responsibilities:
- Data Analysts work with stakeholders in-depth to understand the problems, goals, and outcomes needed.
- Data Scientists focus on prototyping and exploring and twisting and turning data – looking for the algorithm.
- Machine Learning Engineers productionalize the output.
Read the Transcript
0:00:57.9 Matthew: On this episode of The Long Way Around The Barn, we kick off a three-part series with JC Heuer, a data scientist with a passion for learning, a passion for teaching, and an unquenchable passion for helping leaders understand the profound impacts of data-based decisions. I absolutely loved my conversations with JC, and was surprised and highly interested when he told me how vulnerability and storytelling were two of the greatest attributes of a useful data scientist. In these podcasts, JC shares with us a little about his personal and professional journey as a data scientist.
0:01:37.6 JC Heuer: And what I feel today might change tomorrow, and so on. What’s sort of the core component of that is the scientific thought process. I’m not going to get too far ahead, but that’s something that connects with me deeply. Part of the reason I’m a data scientist is this: Your vision, what you think you know today should change tomorrow, because you’re always discovering something more. That’s the scientific process.
0:02:00.8 Matthew: His views on the development of data science as a body of knowledge and professional practice, how companies can realize the value of data decisions, and what people need to explore, learn and pursue in order to become a credible data scientist. JC, thank you for taking the time to meet with us, talk with us, teach us and just include us. Tell us a little bit about… We know currently that you’re working in the data space on purpose. You love it, it’s a passion, it’s your journey, it’s your current chapter or multiple chapters, but tell us a little bit about your journey, Where have you been? Where have you come from? How did you end up here? And then tell us about where you are and where you’d like to be heading. Teach us about you.
0:02:50.3 JC Heuer: Thanks for having me, Matthew, I appreciate it. And I liked the emphasis on purpose there. So my journey started… I’ll go way back to start with maybe, right? So I started off as an athlete, very focused on athletics. Coming through high school into my undergrad, I was gonna play professional basketball. So I’m a pretty tall guy, relatively athletic, depending who you talk to. And so that was really my initial journey. Various reasons it didn’t pan out. I ended up graduating and getting my undergrad, and finance is kinda where I started. And so there’s a lot of connection into data with finance, accounting, stuff like that. It’s not a stretch by any means, to get to the data side of that discipline. I started off in financial analytics, and then decided to go back and get my MBA. And so I was getting my MBA at Iowa State around the time that data science was really becoming more of a mainstream term. It was noted as being the sexiest job of the decade and all that kinda stuff. Around this time is when it was first getting popular. And so that was kind of my initial motivation, to be like, “Yes, I like finance.” I’m getting this sort of data bug as I step out into the professional world.
0:04:14.1 JC Heuer: Going through my MBA course at Iowa State, I was introduced to some text analytics classes and courses, which is really sort of my first real step into what I would call real data science, kinda that movement beyond traditional business intelligence, financial analytics, stuff like that. So, got some exposure there out of that. I started to really focus on “What is this career path that I want, where do I want to go, and how do I do this within this data science space?” So I started networking, as sort of cliche as that can be, just getting my name out there, meeting people, stepping out, being vulnerable, putting myself out there, connecting with different people, and I was able to take a role in data analytics with commercial real estate, which is… There’s some traditional applications of that. There’s also some… From when I was looking for a data science sort of transformative application. That was a new thing in commercial real estate at the time, and it’s still a relatively new thing. That industry is relatively data-tight; data is held close to the chest, it’s not publicly available all the time. And there’s ways to go around that and all that kinda stuff, but that was sort of my first big opportunity and big step into this journey of data science.
0:05:30.6 JC Heuer: And so I was able to finish my MBA, start this role with this commercial real estate company, leading their international commercial real estate research publication. So we’re doing analytics on Europe, on Australia, on the US, similar countries around the world, understanding different forecasts around interest rates, around metro markets, all this kinda stuff, drivers of hotness in the commercial real estate industry across these metros and things like that. That was sort of my real first taste of a data science professional setting. I’m really diving into this knee-deep. From there, this was kind of in tune with when more universities were now starting to catch up and launch their graduate programs around data science, so I decided to go back, earn my graduate degree in data science. Out of that, it was just kind of a launch pad to keep moving forward then. And I’ve always had this kind of notion in my mind, as I’ve gone down this journey is, there’s currently this double-edged sword of, how often should you change? Should you take an opportunity? And how long should you stay in that current role before you feel like you’ve learned? And… What’s that balance of, “Am I going too fast? Am I going fast enough?”
0:06:46.7 JC Heuer: And to me, I’ve landed on that side of trying to… As mystical as this can sound, listen to the universe; not give too much thought to it and just kinda let it flow. So when an opportunity comes along, it’s an assessment of, “Does this really feel right to me? If it does, let’s take it.” That’s given me the ability to practice and step into data science and work in the data space across a few different industries. So as I’ve gone forward, I’ve worked in… I mentioned commercial real estate, financial services, e-commerce, now manufacturing, the energy industry as well, and been able to experience, really, different company dynamics, different sizes of companies, and how they approach data, data science, data management. What the nuances of changing a culture to be more open, to being data-driven, what does that mean? What are the challenges of that? And that’s really been what’s led me to this state, and I think what’s kinda guiding me forward as well. It’s listening to the universe, listening to the flow, accepting kinda what comes next, and then just kinda moving forward with that. If that makes sense, hopefully, but…
0:07:58.8 Matthew: No, that’s outstanding. One of the things that struck me, and you may already be aware of this pattern, and I’m just catching up to you. In order to be an athlete on purpose, you have to be aware of a universe level or a system level, whole system level, set of variables, and all of these variables in the system are dynamic. Some of them might be static, some of them are variable. And all of these things are learning new skills, honing existing skills, deciding to try and make some things, some behaviors, some quirks, some types of behaviors go away. But your goal was to take all of these system variables, understand these variables in the mix, and move forward in some way, shape, or form. Whether you tacitly recognized it at the time or not, it seems like, as a purposed, goal-oriented athlete, you were already a systems thinker. What’s interesting then is how you translated that systems thinking into another, more… Well, defined for undergraduate school degree, finance, which was also systems thinking, also structure. Did you do that on purpose? Did you discover it along the way? That’s an interesting map from my perspective, right off the bat.
0:09:22.2 JC Heuer: I would say that wasn’t on purpose by any means. It was more of a, “This is my personality, this is sort of this… ” Again, I… Not to sound mystical, but it’s sort of that sense of, “This just seems to fit as the next step, and let’s take this, and put myself out there and see what happens.” I think you hit the nail on the head, Matthew, when you talk about that systems thinking from an athlete’s perspective. It’s having that sort of top to bottom, bottom to top, thorough understanding of: How does the team work? How do the pieces come together? What’s that more macro vision, that strategy that we’re going after, and how do we deliver that strategy within these sort of subcomponents? And something I’ve noticed, as I’ve gone further in my career with data science… There’s… And I think this is… It’s common across many disciplines, many practices, there’s sort of the balance of… Those with the ability to really… To be the… To have that real depth of technical skill set, and can knock out, “This is my task, I can do that task,” and those with the ability to really see what’s the relationship with that task into the bigger whole and connect these pieces together. And I’ll say, from a data science perspective, the skill set to really understand, “How does this algorithm, this thing I’m working on, tie in to that business impact, tie in to the bigger whole?” That’s a valuable skill set to have.
0:10:56.3 JC Heuer: And I’ll say, for me, having both an MBA, data science master’s degrees, and putting those two together has given that sort of benefit where I can understand how, if I’m building this algorithm, writing this code, what’s the impact to the business? And how do you speak to that impact to build those relationships with those that are ultimately going to adopt this output? That’s the feedback that we want, that we’re seeking, and why a common statistic for data science is that it’s something like 80% of models and algorithms never make it to production. That’s a huge failure rate. And a lot of that is, you’ll do all the legwork, the foundational work, getting it up to that state, and then go to that last mile to get adoption, you don’t get that buy-in from the business; that relationship isn’t there, that trust isn’t there. And that’s something where, on the athlete’s side, as a basketball player, you know if that’s gonna happen, more immediate. You know if I’m taking the shot or I’m passing the ball to this person, they’re either gonna take it and shoot it and score or not. You know that they’re accepting your pass. You know it’s gonna happen. Data science side, it may not be evident or obvious right away. You may go through all this work, three months down the line, just to find out that what you were building doesn’t get adopted, and it falls into this abyss of what could have been data science.
0:12:26.9 Matthew: That map, from your bachelor’s degree in finance to then doing an MBA to get a broader perspective, it almost looks like a funnel, as I’m visualizing some of your journey, where the athlete himself was starting out as a systems thinker, so that’s already a wide funnel, if you will. And then finance was starting to apply structure and discipline, and honing some of that stuff, but just raw talent’s not enough to be a pro ball player. Just raw talent gets you down the road, but it doesn’t help you last. So somewhere along the way, you said, “I must focus, I must have structure, I must have purpose.” Somewhere, you chose that. To your point, listen to what you’re hearing and make decisions contextually, but you became aware of the need for doing something on purpose, and thinking about all of the variables, you moved into the MBA conversation with a data focus. The interesting thing about the MBA, from my perspective, is it’s not designed to give you the answer to all possible questions, but it is designed to make you aware of how very many different bodies of knowledge exist to just even make an operation operational and then healthy and useful.
0:13:44.9 Matthew: So you have this interesting blend between you want to be a competitor, a high-performing competitor, who is disciplined, to someone who’s now focused it to, “I understand math, I understand models, I understand the value proposition of an idea,” to then moving into, “Hey, there’s all of these things it takes to run a business, not just data stuff. But data helps drive, equip, enable, educate people to make decisions, but there’s all these other things as well. They all require data, but they’re all different types of behaviors.” You’ve walked into this data role, being aware of the need for systems thinking, of discipline, knowing that you’re not the only person in the company with a brain doing thinking, but then also realizing that the things that you’re creating need to be relevant to all of the other people in the business, or else it inadvertently supports that 80% of all models never make it to production. 80% of all shots taken never making it into the basket; that would be a fairly brutal statistic as a pro ball player. So in the data industry, that seems like some people are getting a lot of forgiveness, if you don’t mind my… What I’m saying there fairly directly is, 80% as an industry number? That’s pretty tough, dude. What are your thoughts on that?
0:15:09.4 JC Heuer: Yes, you hit the nail on the head, Matthew. And I think the mindset with data science, with AI… On one side, there’s a lot of buzzword, a lot of media coverage of it that drives a lot of it, and while the media coverage can be hyperbole sometimes, the foundations of it are real. And the reality is that I think a lot of organizations, a lotta industries want to jump to, “Let’s just throw an algorithm at it, let’s just throw machine learning at it, and it’ll work,” without really realizing that the foundations, the data foundations underneath of that, the quality of that data, the governance of that data, the culture around managing that data, that is what drives the success of those 20% of models that get into production. It’s coming from having robust foundations in your data.
0:16:06.9 JC Heuer: And that’s probably the biggest distinction there, is that… Any model, any analytics that you’re doing, really, is a small set that, once that data foundation is in place, it’s much easier to iterate, experiment, prove value to your business partners, your stakeholders, and have a shorter putt to get to that adoption, and push through the end zone with that, and that, I think, is what gets lost in that 80% that doesn’t make it to production. As much as part of that’s maybe because of the relationship with the business, well, that relationship struggles because of the complexities that you’re trying to go through on the data side, and any of the confusion around “Why is it taking so long? Why can’t you just push the easy button?”, all that kinda stuff comes with that sort of messiness in the underlying data. Does that make sense?
0:17:05.5 Matthew: It’s the sausage-making conversation, right? Have you ever been to a product demo? Many people have. Have you ever been to a product demo where all of the technical people said all of the technical things, but the people that were paying for the product development didn’t understand a single word that was spoken, like, “I know you said things. You seemed very excited about them. You seem confident. That makes me confident. I still have no idea what I just bought.” That seems like an easy gap that could exist in the data science world, to the executive leadership world inside a company, for example. For all of the executive leaders out there who are making decisions based on a single pane of glass, or a dashboard, or they’ve got a lovely, lovely, dynamic Excel spreadsheet with wonderful graphics on one of the pages in the workbook. For people that are trying to distill a whole business down to a single pane of glass, they may or may not be interested in the sausage-making. So how have you found, given all of your background and your awareness of these situations, how do you bridge this gap between, “I’ve got this data science stuff,” and “These guys are just looking for pie graphs”? How do you become relevant when they’re only using a single pane of glass?
0:18:26.3 JC Heuer: Yes. And that is, in many ways, the core of the challenge, that’s the art. And really, it comes from… It’s the relationship building, it’s the conversations, it’s the honesty around the vulnerability of letting these stakeholders know, “If we want to step forward into becoming truly more data-driven, changing the way we think about our decision making, our leveraging, and turn data as an asset, data as a resource and so on, what does that mean?” The reality of it is, you need to find that balance between that single pane of glass and the guts of making that sausage, and you have to pull back the covers a little bit on that, and the term I use, it’s the art of the possible. The being able to set the stage of, “This is the art of the possible, this is what we can do, if we have the strong foundation underneath of it.” And starting at that, “Here’s the shiny object, and now let’s peel it back and dig further into this and make that journey known, of what’s needed to get to that vision and art of the possible, and now let’s go and resource and attack these sort of sub-components that let us get that far.” And that takes clear communication and vulnerability.
0:19:46.4 JC Heuer: Again, I use that term a lot, because there’s no easy button for data science, for AI, for ML. As much as companies and vendors will push, “This is auto-ML, you can point and click,” all that kind of stuff. There’s a lot of work that goes in underneath of that, to make that work and work well for changing a business, changing the way they operate. Again, it’s giving that kind of clear vision of, “What can we bring, from a data science and advanced analytics perspective, to the organization?” and then laying out in honest terms, “These are the steps that we need to take, where the gaps are and how we can start tackling that.” Because it’s that vision that can hook someone and then going on that journey on, “How do you fill in those gaps, to get to that?” that’s the key, and making the partnership known.
0:20:43.7 Matthew: So set expectations, manage expectations, and in all cases, communicate and over-communicate.
0:20:51.1 JC Heuer: Correct. Iterate and iterate.
0:20:51.5 Matthew: And iterate.
0:20:56.0 JC Heuer: One of the key things I like to do when I enter into an organization, it’s go around and have these data science road shows. So meeting with different groups, different departments, and just educating them upfront, on, “This is the data science thought process, the data science project process. And what does that mean? And how is that different from maybe traditional software development or traditional engineering and things like that?” The data science means experimentation, means iteration, means going down a path, learning something, and then having to go back three steps and do it again. And so, it’s not a linear process all the time, but it’s very circular and it’s very iterative. And even when we get to the end of that path, we produce something. That thing we produced, may need to be re-invented a couple of months later, or you launch an algorithm and a pandemic hits, and what was driving that algorithm no longer has as much meaning because of the new environment. So you have to go and re-build that algorithm again and re-launch it again, because there’s new information being fed into it.
0:22:04.0 Matthew: There’s an interesting parallel inside organizations, which I imagine you’ve already seen and noticed because of your bachelor’s and your master’s. The idea of financial modeling, modeling itself and forecasting, whether it’s a go get a brand new vertical market, whether it’s segment a market, it’s create a new product and create demand for the product. The idea of finance has been around for a long time, and it’s understood by most, it’s discussed in undergrad and grad school, and even if people don’t go to university of any kind, everybody is familiar with, “You need to make more money than you spend, or else you’re upside down, you have a problem, you won’t last long.” But if I want to live for a very long time, I need to forecast. In other words, I need to say, “Based on the things I know today and the things I think I know about tomorrow, what will it take for me to get from where I am to where I need to go?”
0:22:53.0 Matthew: That forecasting idea, that’s an old idea, and it’s in companies already, today. And I’ve seen it done wonderfully and I’ve seen it done horribly, and the difference was communication, where somebody took the time to say, “Look, man, based on these 15 assumptions and these 17 system variables, which I don’t control any of them, and based on the things you think you want to be when you grow up, 19 months from now, here is version A, B and C of my forecast,” and people tend to accept that as, “Okay, given all of the knowns and the unknowns, this makes a lot of sense. You made me feel good. Okay, goodbye.” In the data space, it seems to be similar, but I wonder if that’s just a new enough idea that people don’t understand what they’re buying yet or how to use it yet, and so when you mentioned that, “Let’s just grab some MLs, let’s just grab some AI, let’s just grab that little algorithm and put it into my Excel spreadsheet,” I wonder if people don’t fully understand exactly what it is, what to do with it and how to make best use of it right now.
0:24:02.9 JC Heuer: I think you’re correct in every aspect of that. It’s sort of the shining light on a hill, shining object that’s sort of lingering out there, that I want to grab on to, and it sounds great, it sounds cool. And again, and not to discount it, it is ML, AI is real, the expected benefits of it are real, the readiness for some organizations to really adopt it, may not be as real. And I think that’s a key concept to keep in mind. Depending on the organization, there can be a lot of ingrained processes, ingrained mindsets. I’m going to look at the data, to justify a position I already have. The confirmation bias. I already know what I want to find out, I’m gonna go find it in the data.
0:24:53.2 JC Heuer: So if I apply an ML model to that and it tells me something different, I’m not gonna trust that, because I have… I know what I already think, and that’s what I want. That’s one of the walls that, as we build data science into an organization, how do we tear that wall down and change that mindset to overcome that confirmation bias, the selection bias that may be present? And it may be built on years of experience, “This has worked for me for 30 years. Why would I change now?” Well, there’s more data becoming available, the industry may be changing, the environment’s changing, we’re in a pandemic, we’re in whatever it is, that’s the promise of data science, is, it’s quicker, more consistent, in many ways, more accurate decision-making that can come out of those models, those efforts.
0:25:48.4 Matthew: It seems like, to me, based on my own journey, based on the increasing numbers or classes of data that we continue to collect, that we didn’t use to collect, when we collect so much more data today than we ever did, and it’s only increasing, that at some point, the idea of a super smart financial controller or CFO being able to take in all of this multi-dimensional data and make sense of it in order to create a credible forecast, it seems like the role of the manual forecast will become less and less and less reliable, as the multiple dimensions of data that we collect continues to increase and not even at the same rates of speed. My guess is, is that we’ll just be in denial about the reliability in our ability to forecast multiple dimensions in Excel, as opposed to recognize that, “Hey, I want to do the same thing, but now with all of this data, maybe I need to go figure out what this ML thing is, or what is this AI thing, or… ” It just seems like the magic of the forecaster needs to change.
0:27:00.8 JC Heuer: What I think of, when you mentioned that, Matthew… I don’t know if I’d call it the magic of the forecaster, the mindset needs to change, maybe. It’s the base skill sets that go into this, go into forecasting, go into modeling, it’s the understanding of, “As I obtain more data and try to translate that into an action, translate that into conversation that a leader can take an action on, what are the skill sets I need, to be able to make that translation happen?” Because the data, the ML, the algorithm, as companies become more refined, more robust in their ability to build that foundation of data, that will continue to improve and become, I think, easier to get to, “This is my forecast, and it’s a more robust forecast because I’m taking in so many more variables, many more features into this forecast, and I can account for having an expectation of different anomalies and things like that to occur.” But my role as a forecaster now, has to be, “How do I translate that into meaningful action for the business and tell that story and convince the leaders of that action?”
0:28:17.0 JC Heuer: And I think that’s something where, academically… And there’s many boot camps and things out there, that build the technical skill set for data science, but what’s still catching up, is that communication, it’s that relationship building, it’s, “How do I tell the story in a way that’s actionable and that drives trust in my forecast, in what I’m doing?”
0:28:41.9 Matthew: In my mind, at least, it is similar to the technical people who demo a technical, they say technical things during the product demo, but somehow, they’re completely irrelevant to the people that are supposed to be benefiting from that whole journey, ’cause I didn’t say anything that mapped. Let me tell you about your five-year goals say this, your current books say this, your forecast says this, we’ve aggregated this data. After we take that data and look at it multi-dimensionally and we forecast it out differently, you have to take all of this giant universe of stuff and not talk about it and distill it down to something that’s just plain relevant. In other words, what I think I’ve heard you say so far is, you could be the smartest data scientist on earth, and if you don’t have the ability to communicate, you’re in that 80%.
0:29:36.4 JC Heuer: Yes, you hit the nail on the head, Matthew. That’s the key right now, it’s that communication, I think, that drives a lot of that adoption. There’s pockets of, I think, industry spaces where that may not be as necessary. I think of, if a company is founded around data and data is at the core of their organization, I think of a start-up, think of any… Put your tech company in here. Generally speaking, I think they have a stronger data culture, because their product is data. But when you’re talking about many other industries that are out there, manufacturing, energy, in many ways, things like that, where it’s… You’re stepping into a legacy company, a company that may be 100 years old, and it’s going through this transition to become data-driven, that’s where a lot of that challenge, and even more so, the emphasis on that communication becomes pertinent to the success, to changing that 80% failure rate to 50%, to majority of these are getting implemented. That’s where, at least in my experience, having worked in those industries that have some of these legacy old companies, that’s a key to success, is that communication, that relationship building.
0:30:57.8 Matthew: So, that 80%, really, may more accurately reflect just an inability or a lack of success in setting and managing expectations and communicating. It’s not a failure of science, it’s just a failure of us being people. Being a person is hard and communicating is hard, it’s the science where we can find peace.
0:31:21.0 JC Heuer: Yes. Right. To put it another way, the art is what’s hard, the science is straightforward. I know the math, I know the linear algebra, all that kind of stuff, and that’s the way it is right now, as far as we know. But it’s the art of, now, translating that into something meaningful. That’s a big component of it.
0:31:47.5 Matthew: So I’m… JC, I haven’t done the things that you do, and I’m not even intending to assert that I know all of the things that you do. If I’m able to start in a greenfield project, that I’m able to do all of the things the way I think they should be done and anything that doesn’t happen as it should, is on me. Often times though, to your point, we end up in legacy situations, where the company is 100 years old, 140 years old, or it’s been under the leadership of a particular C-suite for the last 45 years, whatever, in all of those situations, that does represent, probably, growth, it represents constancy or continuity, it represents a good strong company, all of the things. But it also represents the way things are done, and it might also then, be an additional challenge. So for me, if I need to take all of the data in an enterprise and take that all together and meld it together and do a single pane of glass for a C-suite for them to say, “Aah, I can now make a decision.”
0:32:42.3 Matthew: The journey to get to that lovely single pane of glass, like Star Trek, just walk around, hold it in my hand and I can see the entire stinking ship on that one screen, it’s ridiculous. ‘Cause I can have 105 different repos, data repos out there in various states of hellacious dirty data, to, “Oh my gosh, just flush this stuff,” to, “That is gorgeous. Where did that come from?” to stuff that’s in data prisons, the stuff that’s outside the walls. In the worlds that I’ve walked, to get to that single pane of glass, that journey is not peace, it’s just a lot of stinking work. But what’s it like, for you?
0:33:22.6 JC Heuer: I chuckled a little bit at that, because it’s chaos in many ways. That’s the reality of it. Because especially these old mature companies, generally, I don’t want to put a blanket statement out there, but just given what I’ve worked in, and there’s nothing… It’s just the reality of it. It’s the way they’ve gotten here, they’ve been around… The company may have been around for 100 years. They found success somehow, to be here for 100 years. But the result of it can be, from a data perspective, that you have many different systems, applications generating data, data that’s… It’s not built for data science, it’s maybe built for reporting, it’s… Term I give it is, data exhaust. It’s just not really in a usable format, and there’s knowledge gaps. There may not be… The person that built the database may not be with the company anymore, or still using the database, but no one has any real knowledge of what’s in there. There’s data flowing into it, but how do we map it and get it out? Things like that.
0:34:25.0 JC Heuer: And the path that has been useful, in trying to work through that, drive a transformation into something more modern, more updated, more usable for data science, it’s finding those champions within the owners of that data. So where that data is owned, going out, and again, it’s back to communication, it’s back to the art, but it’s finding those champions and not to get too granular on this, but something that’s worked for us is, it’s working to establish a true data council, data stewardship, where you have this representation, where you have, instead of data being this by-product, this… It doesn’t have a forefront, a key role in the business, it now takes a step in the forefront. The ownership is established, and the connection to the goals of the organization is built out. So now I have this council of individuals representing the different parts of the business that are generating the data, and they have a voice in, “How is this being used?” and have transparency and clarity into, “This is how we would like to use it.” Well, the conversation started, “Then, well, this is what we can do. I didn’t know that. That’s interesting.”
0:35:44.8 JC Heuer: You start that communication through that council, through that stewardship program, that is the first step to getting to that foundation of a robust data layer. Now you can build that data science on top of… Build that AI and ML on top of… And start that transformation. What can be, I think, challenging in that, depending on the goals of the organization, it’s the time and resources needed to really do that, and that’s a mountain to climb in itself, is, “How do you convince of that story, that this is what we need to get to that next step with data science, AI, ML, all that kind of stuff?” That’s a journey in itself.
0:36:29.7 Matthew: Do you find, in your profession, that you’re asked or expected to, or you find the need to differentiate or define what is data science, what is machine learning, what is artificial… Do you have to differentiate these things, and how would you define that for us today, knowing full well that you may have broader and deeper things to say, than we’re all prepared to receive?
0:36:53.3 JC Heuer: I think of it this way. It’s not uncommon. Anything that’s new, there’s a fair number of examples out there, where three different people, you ask them to define something, they can have three different definitions of it. What does this mean to you? And it’s the same thing with the data science space. The way I break it down is in a couple of ways. On one level, in terms of data science and data analytics, it really falls into three categories. There’s sort of the diagnostic, descriptive sort of category pillar, which is, many companies will have some version of this, where maybe we have a SQL server, we can do some reporting, maybe we visualize it in Power BI or Tableau, we can see what happened. That’s really that sort of descriptive diagnostic.
0:37:43.7 JC Heuer: The predictive element, that’s where we’re taking that sort of understanding of the past and now, giving some expectation of what’s to come, we’re guiding your decision on what we think is going to happen. Putting some balance on that, confidence interval, things like that. And then the third element or pillar, is the prescriptive pillar. This is where we’re taking those predictions, now giving that recommendation. What’s the action that we think will happen, because of our understanding of the data of the environment? If we tweak this lever or turn this knob, we can drive some outcome, and that’s our prescriptive recommendations. We’re gonna decrease price 10%, we’re gonna increase quantities sold 30%, elasticity.
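As a rough illustration of the pricing example JC gives here (decrease price 10%, increase quantity sold 30%), the sketch below computes the implied price elasticity of demand and uses it for a simple prescriptive projection. The function names, the baseline quantity of 1,000 units, and the constant-elasticity assumption are hypothetical and not from the conversation.

```python
def price_elasticity(pct_change_quantity: float, pct_change_price: float) -> float:
    """Elasticity = % change in quantity sold / % change in price."""
    if pct_change_price == 0:
        raise ValueError("Price change must be non-zero to compute elasticity.")
    return pct_change_quantity / pct_change_price


def projected_quantity(base_quantity: float, elasticity: float,
                       pct_change_price: float) -> float:
    """Project quantity after a price change, assuming elasticity stays constant."""
    pct_change_quantity = elasticity * pct_change_price
    return base_quantity * (1 + pct_change_quantity / 100)


# JC's example: price drops 10%, quantity sold rises 30%.
elasticity = price_elasticity(pct_change_quantity=30.0, pct_change_price=-10.0)

# Hypothetical baseline of 1,000 units: what would a 5% price cut do?
forecast = projected_quantity(base_quantity=1000.0, elasticity=elasticity,
                              pct_change_price=-5.0)

print(f"elasticity: {elasticity}")
print(f"projected quantity after a 5% price cut: {forecast:.0f}")
```

An elasticity with magnitude greater than 1 means demand is price-sensitive, which is the kind of "lever" a prescriptive recommendation would act on. A real model would estimate elasticity from historical data rather than a single pair of percentages.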
0:38:29.3 JC Heuer: That’s kind of at a high level, how I start to define that, is those three pillars. And when you step into specific roles, you think data scientist, data analyst, machine learning engineer, data engineer, decision scientist is out there now, there’s all these different roles and variants that are beginning to evolve, in many ways. You think back 20, 30 years ago, with software development and sort of that path of defining more niche roles and areas of that discipline, data science and the data space is going through that. The key difference goes to, I think about defining data analysts, data scientists and machine learning engineer. I think those are three important roles to understand in the space. And data analyst is very much on the side of, “I’m working with the business stakeholders to understand a particular problem in-depth and sort of lay the ground, the landscape of, this is what we have in the data and how maybe we can help answer some of that.” A lot of it’s in that descriptive side of those three pillars I mentioned.
0:39:41.2 JC Heuer: Data science, that’s really that algorithm building. It’s the prototyping, it’s the experimentation, it’s going out and we’re taking this chunk of data, adding more data to it, doing clustering on it, doing segmentation, exploring this in great depth from many perspectives and twisting and turning it. And we’re trying to find that algorithm, that mathematical equation, where you can input data and get an output that gives us a prediction or some prescriptive action. That’s data science. And the machine learning engineer, that’s who’s productionalizing that data science output. So now you have data analysts that are defining and understanding. Data science, building off that understanding, that, “Let’s put this into an algorithm.” The machine learning, taking an algorithm and putting it into production. Those are three distinctions that I think, get misunderstood, but are important to understand, from a leadership standpoint, from the design of, “What do I need to do data science?” Those are skill sets that are essential for success with this.
0:40:44.4 Matthew: What’s interesting to me though, is how you’re differentiating the data scientist from the machine learning person or ML ops, and that it sounds like when you were talking about the data scientist, this sounded like a software developer to some extent, to me, or a developer, which is, I’m taking this idea and I’m building it into a real thing. Then there’s these other folks that they take it out and move it into the wild, and that’s an interesting thing to me, because often times in the software development space, the people, there’s the business analyst who may have contributed to the definition of done or the direction, then there’s the folks that are building the thing. But often times those folks that build the thing are the same folks that have to move it out into the ether and then live with it and support it and evolve it. So are you suggesting that is not the same thing in the data space?
0:41:36.1 JC Heuer: I think you’re tracking with me, Matthew. I think you got it right. With the data side of it, a lot of it is because of that iteration, and sort of the, I don’t want to say burden, but the role of having to integrate this back into the software development process and manage that integration and maintain model performance. So you think of… I think of… If I’m building an application that… I’m gonna build a web app, for example. In many ways, I can build it, put it out there and it lives. There’s quality testing, things like that, but the application I built is pretty well-defined, serves its purpose. If I’m building a machine learning algorithm and putting it into production, once that’s put out into production, it’s not the last version of that, that will exist.
0:42:28.3 JC Heuer: And so, the infrastructure, to be able to monitor that, maintain that, score that model, understand drift in that model. So what I mean by that is, monitor it for, “This used to be 90% accurate, now it’s 50% accurate. Well, what happened?” So, that’s the importance of this machine learning engineering and ML ops side of this, it’s taking that off the plate of the data scientist who’s focused on, “Let’s prototype this, let’s go and explore this world of data that’s out there and keep iterating on this,” and let the ML ops, ML engineering, tie this into software development, into the applications that exist in the organization, into the rest of the IT space, within the organization. That’s probably the key distinction there, and why it’s slightly different, I think, from the data side than what it might be in the software development side, if that makes sense.
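To make the drift-monitoring idea concrete ("this used to be 90% accurate, now it’s 50% accurate"), here is a minimal sketch of a rolling accuracy check of the kind an ML ops team might run against a deployed model. The class name `AccuracyDriftMonitor`, the window size, and the 10-point tolerance are all hypothetical choices for illustration, not anything described in the conversation.

```python
from collections import deque
from typing import Optional


class AccuracyDriftMonitor:
    """Tracks accuracy over the most recent predictions and flags large drops."""

    def __init__(self, baseline_accuracy: float, max_drop: float = 0.10,
                 window: int = 100) -> None:
        self.baseline_accuracy = baseline_accuracy  # accuracy measured at deployment
        self.max_drop = max_drop                    # tolerated drop before alerting
        self.outcomes = deque(maxlen=window)        # True/False per recent prediction

    def record(self, predicted, actual) -> None:
        """Store whether the latest prediction matched the observed outcome."""
        self.outcomes.append(predicted == actual)

    def rolling_accuracy(self) -> Optional[float]:
        """Accuracy over the current window, or None if nothing is recorded yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def has_drifted(self) -> bool:
        """True when rolling accuracy falls more than max_drop below baseline."""
        accuracy = self.rolling_accuracy()
        if accuracy is None:
            return False
        return (self.baseline_accuracy - accuracy) > self.max_drop


# Hypothetical scenario: a model deployed at 90% accuracy now gets half wrong.
monitor = AccuracyDriftMonitor(baseline_accuracy=0.90, window=10)
for predicted, actual in [(1, 1), (0, 0), (1, 0), (0, 1), (1, 1),
                          (0, 1), (1, 0), (0, 0), (1, 1), (0, 1)]:
    monitor.record(predicted, actual)

print(monitor.rolling_accuracy(), monitor.has_drifted())
```

This only detects that performance has dropped. Answering "what happened?" usually means also comparing the incoming data against what the model was trained on, which is outside the scope of this sketch.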
0:43:22.6 Matthew: These things sound actually very amazing, JC. Basically, I’m gonna have to cycle on this a little bit, because at first, I thought you were saying, the data scientist is like a developer, but then that developer typically has to go and live with the things and iterate on those things. Whereas, it seems like you’re suggesting these guys are going to invent, create, evolve, but then someone else is gonna move it into the ether. So that makes it almost sound like one version of the word architect in the software world, which has its own loaded… English is hard. Quality, what does that mean to 10 different people? Cloud, what does that mean to 10 different people? Same thing.
0:44:02.2 Matthew: Here’s what I’d like to do, because our time is coming to a close for today. I don’t think we’re anywhere close to talking about a lot of the even more interesting things. For example, you being a practitioner. How would you advise, coach, encourage, teach or lead other people to introduce data, the whole point of data, data science, data management, into their organization? What are those steps? What does it look like? What is good communication? I’d like to talk to you some more and I’d like to do that in our next session together. So, we’ll save some of it for the next time, but first and foremost, I wanted to thank you for taking this time to teach us.
0:44:41.1 JC Heuer: Thank you for having me here today, and we’re just scratching the surface on this, and I’m excited to continue the conversation and go from there.
0:44:54.0 Speaker 2: The Long Way Around the Barn is brought to you by Trility Consulting, where Matthew serves as the CEO and President. If you need to find a more simple, reliable path to achieve your desired outcomes, visit trility.io.
0:45:10.3 Matthew: To my listeners, thank you for staying with us. I hope you’re able to take what you’ve heard today, and apply it in your context, so that you’re able to realize the predictable repeatable outcomes you desire for you, your teams, company, and clients. Thank you.