In the second episode of this three-part series, Jacey Heuer helps us dive into the evolving roles and responsibilities of data science. We explore how individuals and organizations can nurture how data is purposefully used and valued within the company.
Missed the first part? Listen to Part I.
- Adopt a scientific mindset: The more you learn, the more you learn how much more there is to know.
- Hone storytelling capabilities to engage and build relationships that ensure the lifespan and value of data is woven into the culture.
- Set one-, five-, and 10-year goals and aim to achieve them in six months to fail fast and advance the work faster than expected.
- Create buy-in using the minimum viable product (MVP) or proof of concept approaches.
- Prepare to expand your capabilities based on the maturity and size of the team focused on data science work. As projects develop, you’ll move from experimenting and developing prototypes to developing refined production code.
- When your company begins to use data analytics, roles and responsibilities must expand and evolve. Ensure your people have opportunities to grow their capabilities.
- Data must be treated as an “asset” and viewed as a tool for innovation. It can’t be tacked on at the end. Ideally, it plays a role in both new and legacy systems when aggregating data and capturing digital exhaust.
- Engage and find common ground with all areas of business by helping them comprehend how data science “expands the size of the pie” rather than taking a bigger slice.
Read the Transcript
0:00:05.5 Matthew Edwards: Welcome to the Long Way Around the Barn, where we discuss many of today’s technology adoption and transformation challenges, and explore varied ways to get to your desired outcomes. There’s usually more than one way to achieve your goals. Sometimes the path is simple, sometimes the path is long, expensive, complicated, and/or painful. In this podcast, we explore options and recommended courses of action to get you to where you’re going, now.
0:00:37.3 The Long Way Around the Barn is brought to you by Trility Consulting. For those wanting to defend or extend their market share, Trility simplifies, automates, and secures your world, your way. Learn how you can experience reliable delivery results at trility.io.
0:00:57.0 ME: In this episode of the Long Way Around the Barn, I picked up where Jacey Heuer and I left off in our first conversation on data science, which has now become a three-part series. Today’s conversation focuses on how both individuals and organizations can leverage data analytics and machine learning, to evolve and mature in their purposeful use of data science.
0:01:22.0 Jacey Heuer: It takes a diligent effort from the data team, the advanced analytics team, to engage with the architects, the developers, those groups, to get your foot in the door, your seat at the table. I think getting to that state means that data is seen as a valuable asset to the organization, and is understood as a tool to drive this evolution into a next stage of growth for many organizations, to achieve those dreams of AI, machine learning and so on, that lie out there.
0:01:57.7 ME: We start by diving into how the various roles fit into today’s data science ecosystem.
0:02:04.8 JH: To the primary roles that I define in a mature team, as it relates to the actual analytics, the data analyst, the data scientist, machine learning engineer, and their MLOps, and what’s becoming a newer term though, taking this further, it’s the notion of a decision scientist.
0:02:24.1 JH: There’s a lot of roots in, you could say, traditional software development in terms of defining, and what is becoming defined for data science, and I’ll say the space of advanced analytics. Generally speaking, not every organization, every team will be structured this way, but I think it’s a good aspirational structure to build into, and it’s the idea of that you have your data scientists and they shouldn’t… The real focus is on prototyping, developing the predictive, prescriptive algorithm, and taking that first shot at that. Then you have this data analyst role, which is really more of the traditional analytics role, where it’s closely tied into the organization, they’re doing a lot of the ad hoc work on, “I want to know why so and so happened. What’s the driver of X?” things like that.
0:03:20.2 JH: So there’s a little bit of predictiveness to it, but it’s a lot of that sort of, “Tell me what happened and help me understand what happened,” in that role. And then you could start extending this out, and you start thinking about the machine learning engineer. That’s really taking the step now to go from the data scientist who’s made that prototype, to handing it off to the machine learning engineer, and their role is to now bring that to production, put it into the pipeline. Oftentimes, that may be also handling the productionalizing of the data engineering pipeline or the data pipeline as well.
0:03:52.7 JH: So being able to go, in a production sense, from the data source, maybe it’s through your data lake, through transformations, and into this model that often is written in Python. R and Python are those two languages that dominate the space. Python is often the better language because it’s a general programming language, it integrates well with more applications, things like that, but R still has its space or its place. I’m partial to R. Nothing wrong with either one.
0:04:23.1 JH: But that machine learning engineer, they’re really tasked with bringing this into production. And then the sort of next step in this is the MLOps. And machine learning engineer falls into that, MLOps, kind of a bigger category, but it’s that role of once that algorithm is in production, it’s up on the mobile phone, it’s up on the progressive web app, it’s being used, now it’s an ongoing process of monitoring that and being able to understand, “Is there drift occurring? Is your accuracy changing? Is performance in that model changing?” This gets into, if you’ve heard of the ROC curve, AUC, and things like that, that monitor performance of that model. And that in itself, can be… Depending on the number of models that have been deployed, can be a task. If you have a few hundred models out there and a changing data environment, there’ll be a need to update, to change, it may be that individual’s task to go in and re-train the model or work with the data scientist again to reprototype a new model.
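The monitoring idea above can be sketched in a few lines. This is a minimal illustration, not the tooling discussed in the episode: it computes AUC directly from its definition (the chance that a randomly chosen positive case is scored above a randomly chosen negative one) and flags a model when AUC falls too far below a baseline. The function names, the sample data, and the 0.05 tolerance are all assumptions made for the example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def needs_retraining(baseline_auc: float, current_auc: float,
                     max_drop: float = 0.05) -> bool:
    """Flag the model when AUC has dropped more than `max_drop` (assumed tolerance)."""
    return (baseline_auc - current_auc) > max_drop


# Hypothetical scores from a recent window of labelled production traffic.
recent_labels = [0, 0, 1, 1]
recent_scores = [0.1, 0.4, 0.35, 0.8]
current = roc_auc(recent_labels, recent_scores)
retrain = needs_retraining(baseline_auc=0.90, current_auc=current)
```

The pairwise loop is quadratic, which is fine for a sketch; with a few hundred deployed models and large windows, a rank-based computation or a library routine would be the more practical choice.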
0:05:31.4 JH: So that’s the general, I’d say the primary roles that I define in a mature team, as it relates to the actual analytics; the data analyst, the data scientist, machine learning engineer and their MLOps. And what’s becoming a newer term though, taking this further, it’s the notion of a decision scientist. This is really the person that is crossing the gap or bridging the gap from, “We’ve implemented or discovered an algorithm, discovered a model that can predict so and so with a high accuracy,” whatever it is. Now their role is to be able to take that and drive the implementation, the buy-in from the business partners to help them make better decisions. So they’re much more of a… Have a foot in both camps of, “I understand the models, I understand the technical side, but I can sell the impact of this and influence the decision that the business partner is making.”
0:06:31.2 ME: What is the name of this role, again?
0:06:34.5 JH: The term that I see for this and I like to give it, it’s decision scientist, is what it is. So it’s much more on the side of really focused on changing, improving the decision and having a tighter role on that side of it, as opposed to what can be more technical, which is the data scientist or machine learning engineer. They’re much more focused on the data, on the programming and so on.
0:07:00.1 JH: And the reality of this is, many organizations won’t be at a maturity level to have those distinct functions and roles. And there’s going to be a blend, and it’ll be maybe one or two people that have to span the breadth of that and be able to balance traditional analytics with discovering new algorithms, to productionalizing it, to doing some data engineering, to MLOps, to speaking with the business partners and selling the decision, the new decision, the decision process to them, and so on. And that’s good and bad, obviously. You can overwhelm a small team with that, but you can also find great success in that. There’s a mindset involved in this. I don’t know who to quote this to, but it’s a good mindset that I like. It’s essentially, establish what your one, five, 10-year goals are and try to do it in six months. So you’re probably going to fail, but you’re going to be a lot further along than that person who is trying to walk to those longer-term goals.
0:08:02.0 ME: You’re saying that the larger the organization, the more likely these ideas or behavior classes will be shared across different roles but that then suggests, then small organizations or smaller organizations, one or more people may be wearing more than one hat.
0:08:19.6 JH: I think the better term is more mature data organizations. You could be a small or large organization, but what’s the maturity level of your usage of data, the support of the data needs, data strategy, data management, things like that. Often, it is… Kind of follows a sequence, where it may start with this data analyst role making the initial engagement. A business partner comes to the data team and says, “Hey, we have a desire to understand X better.” The data analyst can go and work on that, develop some initial insights. And out of those insights, that’s where the data scientist can step in and now take those insights and let’s build an algorithm for that. We understand that we reduce price, we drive up quantity, typical price elasticity.
0:09:03.8 JH: We see that in our data, our industry, our market reflects that. Well, let’s go and build an algorithm that can optimize pricing across our 80,000 SKUs. So we build this algorithm and we bring in environmental variables, variables for weather, regional variables, all this kind of stuff and really make this robust. Well, now we need to put it into production. So I hand it off to ML Engineering, they go and build this pipeline, write it in Python, maybe the data scientist worked in R, we do a conversion into Python, they tie it into a mobile application, so sales reps can have pricing information at their fingertips while they’re having conversations.
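As a rough illustration of the "typical price elasticity" insight that starts this sequence, the sketch below estimates a single product's elasticity as the slope of log(quantity) against log(price). It assumes a constant-elasticity demand curve and uses made-up numbers; the weather and regional variables mentioned above, and the pricing optimizer itself, are deliberately left out because the conversation does not describe them.

```python
import math
from typing import Sequence


def price_elasticity(prices: Sequence[float], quantities: Sequence[float]) -> float:
    """Least-squares slope of log(quantity) on log(price).

    Under a constant-elasticity model q = a * p**e, this slope is e.
    """
    if len(prices) != len(quantities) or len(prices) < 2:
        raise ValueError("need two or more matching price/quantity observations")
    log_p = [math.log(p) for p in prices]
    log_q = [math.log(q) for q in quantities]
    mean_p = sum(log_p) / len(log_p)
    mean_q = sum(log_q) / len(log_q)
    cov = sum((p - mean_p) * (q - mean_q) for p, q in zip(log_p, log_q))
    var = sum((p - mean_p) ** 2 for p in log_p)
    if var == 0:
        raise ValueError("prices must vary to estimate elasticity")
    return cov / var


# Made-up history for one SKU, generated from q = 1000 * p**-2.
prices = [1.0, 2.0, 4.0]
quantities = [1000.0, 250.0, 62.5]
elasticity = price_elasticity(prices, quantities)  # approximately -2
```

A value near -2 says a 1% price cut goes with roughly a 2% rise in quantity, which is the kind of initial insight an analyst might hand to a data scientist before any optimization is attempted.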
0:09:45.6 JH: So now you have the sequence playing out, where again, often in a less mature data group in an organization, that’s going to be one or two people wearing those multiple hats. And if that’s the state, you’re a less mature organization, I think the best approach to it, and it kind of follows the notion of Agile methodology and things like that, but it’s really this MVP notion. The best way to eat an elephant is one bite at a time, is a real concept when you’re trying to grow your maturity of your data team. And let them focus on really developing the different pieces of it and getting it in place before expanding them to have to take on something more. Identify that project that you can get buy-in on, that… Expect to have some value for the organization and go and build that out, to really develop that POC and that first win.
0:10:36.6 ME: That’s interesting. That’s a fun evolution. One of the things we’ve watched change through the years is the idea of information security, regulatory compliance stuff. In days gone by in the software world, there were requirements which turned into designs, which turned into software, which turned into testing, which turned into production stuff, and that’s largely sequential. The serial dependency is going into production, so waterfall-y. And then as we’ve evolved and rethought the role of testing as everybody’s role and information security is everybody’s role and all of these things, and we introduced continuous integration, continuous delivery, it’s really thrown a lot of things on their head.
0:11:15.9 ME: Nowadays, we’re able to actually attach tools, and granted, sometimes they’re just literally hanging ornaments off trees, but we’re able to attach tools like vulnerability assessment tools, we can write penetration test suites or smoke suites, we can attach them to a pipeline that says, “For every new payload that comes down the line, apply these attributes, characteristics and ideas to it, and make sure that it’s heading in the direction that we all choose.” You can fail the build right there or you can flag it and send a love note to somebody and then you remediate it in a meeting later with coffee.

0:11:54.1 ME: And now, we’re all able to be together in one cross-pollinated team, bring in Infosec on purpose, so design with Infosec in mind, on purpose, from the beginning. And so, acceptance criteria and user stories and epics and all of these things have attributes that says, “For this, these things must exist and these other things can’t exist.” And now information security can be tested during the design, as well as the development continuously, instead of surprising people later like an afterthought, like salting after you’ve grilled the meat, as opposed to before, that type of thing. And even that’s its own religious conversation.
0:12:35.2 ME: With the data stuff, I’m curious. Do you feel like data is being included in… You mentioned Agile, so I’ll talk about scrum teams, delivery teams, strike teams, that type of thing. These cross-pollinated teams composed of developers, designers, human factors folks, data folks, all of the different types of folks, one team, one priority, one deliverable, one win, one party, that type of thing. Do you feel like the idea of data is being proactively included in the design and development of ideas, or it’s an afterthought, or you’re getting Frankenstein on a regular basis and somehow you have to make magic out of a pile of garbage? How are you seeing things evolve and where do you hope it’s going?
0:13:18.1 JH: The Frankenstein is a good illustration of that. I think, often, data as it is for analytics needs is an afterthought when it comes to application design and development and everything that goes along with that. And a lot of that, I think it’s primarily due to the relative youth of advanced analytics, data science, machine learning, and so on. In reality, the moniker data scientist is maybe a decade old or so, there’s been statisticians and so on before that, and data science is really kind of just the next step down what was that path.
0:14:00.9 JH: So for example, for me, having practiced data science in a number of mature organizations, mature being they’re 90-plus years old or been around for a while and built systems to meet certain requirements, transactional requirements, things like that, and they perform their purpose well, but that purpose wasn’t necessarily with a mindset for, “How can we maybe improve this or leverage the knowledge that can come out of those systems to be applied elsewhere in the business, the data that can come out of that?”
0:14:33.5 JH: And the term I’d give that, it’s these applications are creating data exhaust, to give it a term, where it’s a byproduct, maybe it’s getting stored in a SQL server some place or some database, and maybe there’s some loose reporting being built on it, but it’s probably not easy to go and query, maybe it’s a production database by itself, so if you try to query a lot of it, you’re running into concerns of impacts on performance for the production database and production systems, and so on. And so one of the practices that I’ve been really focused on with this experience is injecting the presence of data science, advanced analytics into that application design process, into the design of those new systems, to give a lens into, “What does the algorithm need to be performant? What kind of data do we need? And let’s ensure there’s a thought process behind how that data is being generated, the flexibility to test potentially within that system, how data is being generated and where it’s going, how it’s flowing out, how could it be accessed, how can it be queried?”
0:15:53.2 JH: There’s a good example, this is going to be a bit of a technical example, so forgive me for this, but one of the systems in a prior organization I worked with, would move everything in very embedded, complex XMLs was how the ETL process happened. And so from a data science perspective, that’s not an easy thing to shred apart and dig into, to get to all these layers and hierarchies within a super complex XML, but the system performs to its purpose within the organization, and it does what it’s supposed to do. So from that side of it, it’s a great system that works.
0:16:36.0 JH: It’s an old system, but it works. But from the data side, it’s a mess. It causes us to have to Frankenstein things together to try to work with it, was what the outcome was. The idea is evolving, but I think it takes a diligent effort from the data team, the advanced analytics team, to engage with the architects, the developers, those groups to get your foot in the door, your seat at the table, to ensure now, as we go forward and new applications are being built and designed, there’s a mindset for, “What does data science need to be able to leverage this and take us from data exhaust into data gold or data as an asset?”
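To make the "shredding" problem concrete, here is a minimal sketch of flattening a nested XML document into path/value rows that an analyst could load into a table. The sample document and its element names are invented for illustration; the actual system's XML is not described in the conversation, and real payloads would also need handling for attributes, namespaces, and repeated groups.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, Tuple


def flatten(element: ET.Element, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (path, text) pairs for every leaf element under `element`."""
    path = f"{prefix}/{element.tag}" if prefix else element.tag
    children = list(element)
    if not children:
        # Leaf node: emit its slash-separated path and trimmed text.
        yield path, (element.text or "").strip()
    for child in children:
        yield from flatten(child, path)


# Invented sample; stands in for one record of a deeply nested payload.
SAMPLE = (
    "<order>"
    "<customer><name>Acme</name><region>Midwest</region></customer>"
    "<lines>"
    "<line><sku>A-1</sku><qty>3</qty></line>"
    "<line><sku>B-2</sku><qty>1</qty></line>"
    "</lines>"
    "</order>"
)

rows = list(flatten(ET.fromstring(SAMPLE)))
for path, value in rows:
    print(path, value)
```

Even in this toy case the two `line` entries flatten to identical paths, so row identity has to be reconstructed separately, which hints at why deeper hierarchies become painful to work with.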
0:17:19.6 ME: This is a wonderful, wonderful, awesome mess that you’re talking about. We’ve watched the same thing through the years with testing, where it was always test in the arrears, but then people wanted to understand, “Why is the cost of acquisition and cost of ownership so darn high? Why does it hurt so badly to debug software when it’s in production?” Well, test in arrears is the answer, guys. So test-driven, moving testing or quality behaviors as far upstream as possible means consider quality while building, not later. And we’ve watched the same evolution in security, whereby we design with security in mind, as opposed to try and bolt that stuff on later.
0:18:04.6 ME: And that digital exhaust conversation that you’re talking about is a standard problem, even for old school production support people, whereby somebody built some software, they dropped a tarball, threw it over the wall, somebody pumped it on to some old rack and stack hardware in a brick and mortar, and now the developers went home and the infrastructure people have to figure out, “How are we going to make this sucker run?” And then after that, “Why is it broken? Oh gosh, we don’t have log files.” So we have all kinds of challenges through history of no logging, some logging, way too much logging, you’re killing me.
0:18:42.0 ME: And the Infosec people in particular have been on the wrong end of the stick for that and testers were too, where they had to go figure out why, not what, why. Well, hello logging. And Infosec people, they have inconsistent logging, so they trap everything, like they’re the Costco of data, just trying to find any action, so that they can then attach tools and do sifting on it. So we’ve watched software, in particular, change from, “I do my job, now you do your job,” to, “We are doing this job together,” and it sounds like you are smack in the middle of that outstanding, awesome, messy, sometimes painful evolution, which is, “This is a thing, but not enough people understand the value of the thing, so they’ve got us sitting in this room without windows.”
0:19:33.7 JH: Yes, you hit the nail on the head, Matthew. And that ties back into the conversation of roles and so on. If you go back to the development of a software engineering team or Infosec team, cyber security, things like that, we’re getting established, finding how we fit into the organization, depending on… There’s a lot of opinions on this too, right now, in terms of where should advanced analytics data sit within your organization? Do you report up through IT? Do you report up through marketing? Where do you touch? That’s another sort of big question that’s out there.
0:20:12.2 JH: My preference and what I’m coming to understand really works best is to really establish its own pillar in the organization. So the same way that you have marketing, same way you have IT, finance, you have data, having a chief data officer that has a C… And reports up to the CEO and everything underneath of that, that is really when I think getting to that state means that data is seen as a valuable asset to the organization and is understood as a tool to drive this evolution into a next stage of growth for many organizations trying to achieve those dreams of AI, machine learning and so on, that lie out there.
0:20:53.9 ME: A lot of these paradigms might be continually challenged, if not destroyed and re-factored. So the idea of these verticals have, how do I separate data from marketing, from IT, from ops. A lot of those things are HR, Human Resources derived frameworks, but they aren’t delivery frameworks. And so we’ve continued to have this interesting challenge in companies, of, “I have all of these vertically organized people, but they have to deliver horizontally.” So how that gets addressed on the CDO side or embedded or whatever, companies are going to figure that out on their own, they usually do. Although across whole careers, not necessarily on Saturday. An interesting thing you’ve said to me though, although you didn’t really say it like this, it makes me think that the idea of data by design is actually a thing, and that when we’re building systems, when we’re building out epics and user stories and acceptance criteria, the people that are there, the developers, the designers, the data folks, sometimes that gets messy where people think it’s an old kind of a database perspective as opposed to, “What do I actually want to know? What am I actually going to do?” And let that influence the design and the implementation thereafter. Without asking those questions, this is a Frankenstein conversation all day, every day. Data by design needs to become a thing and data needs to be included in strike teams or delivery teams on purpose on a regular basis.
0:22:30.6 JH: The importance of the presence of that knowledge on what’s needed to bring that data to value, to become an asset. So you mentioned asking the question of what do we need and what do we want to know, that really has to come from the data scientist, the advanced analytics team, having a voice in that conversation, to be able to say, “If we’re building an application that is going to provide recommendations for a product to an end user, well, in that application, I need to know potentially what algorithm is going to be applied, how it’s going to be applied, and what does that algorithm need to perform from a data perspective. How easy… Is it going to be an online versus offline learning environment, which is essentially the difference between streaming versus batch in terms of how we model and build predictions. What does that mean? What is that going to take? Do I need certain REST APIs built in to access data in some way, or is it going to be a batch dump overnight, into the data lake for us to build something on?”
0:23:34.7 JH: All those questions really need to be designed and have a perspective from a data scientist or an engineer that has knowledge of the data science requirements, the process, and preferably it would be the joining of those two together to allow them to work and bridge that gap. But it’s in… The success that I’ve had in driving those conversations, it’s been, “How do you get creative in trying to convince people that doing so expands the size of the pie and doesn’t just take a bigger slice of the pie for me or for you?” So finding that benefit, that software developer, that systems architect, whoever you’re working with, engaging them in a conversation in a way that lets them see the benefit to them, from a data science perspective, so that you get that buy-in because I know now, with their support, my life’s going to be easier because I’m going to get the data, the access that I need to build a stronger model, a more robust model.
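The online-versus-offline distinction raised above can be shown with the simplest possible "model", an average. In this sketch the batch version needs the whole data set at once (the overnight dump), while the online version updates its estimate one record at a time (the stream) and never stores the history. The class and function names are made up for the example, and a real recommender would update far more than a mean.

```python
from typing import Iterable


def batch_mean(values: Iterable[float]) -> float:
    """Offline/batch: requires the full data set before producing an estimate."""
    data = list(values)
    if not data:
        raise ValueError("no data")
    return sum(data) / len(data)


class RunningMean:
    """Online/streaming: updates the estimate per record, keeping no history."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> float:
        self.count += 1
        # Incremental update: shift the mean toward the new value by 1/count.
        self.mean += (value - self.mean) / self.count
        return self.mean


events = [2.0, 4.0, 6.0]

online = RunningMean()
for event in events:  # arrives one at a time, e.g. from an API call
    online.update(event)

offline = batch_mean(events)  # computed after the nightly dump lands
```

Both paths arrive at the same number here; the design question for the application is which data-access pattern it must support, per-record access for the online path or bulk export for the batch path.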
0:24:36.0 ME: One of the other interesting things that you said I’d like to amplify is, you talked about how in some environments where the idea of analytics wasn’t taken into consideration in advance, you end up having to go find out if data exists at all and if it does exist, in what state is it captured, if at all and is it fragmented, dirty, is it sporadic, what do you have available to you, and what state is it in? You have to do that before you can even decide, “Okay, here’s the problem we want to solve, here are the things we need to know, here are the desired outcomes, or the things we want to decide along the journey. So I need this data. What’s in the system already?”
0:25:19.4 ME: So that impacts people’s perceptions of the adoption velocity of data people too, I would think. In other words, somebody says, “Dude, all I want to know is… What I want to know is what’s taken you so darn long.” And your answer is, “But you never looked at this before, so we don’t collect all of the data. Some of the data we do collect is in 700 repos spread out across… Who knows? Time and space, and most of it’s dirty. So before I can even get to my job, I have to find the data, clean the data, get the data, and then get people to re-factor stuff.” That makes it look like you guys are slow. So how do you handle that? What kind of experience are you having there?
0:26:04.9 JH: Yeah, so that is… Directly ties into the power of storytelling. The power of storytelling of the journey, not waiting until we have, “Here’s a shiny object, we built it and let’s show that,” but showing the journey that we’re on to get to that object, that output and so on. Because you’re right, the reality is that often, the mindset from those requesting the insight is, “There’s got to be an easy button. You’re a data scientist, we have data, just click your button, hit your mouse and tell me my answer.” In many ways, those questions that are being asked of us are all in themselves mini-innovations, because they’re not standard run-of-the-mill questions. It’s…
0:26:53.8 JH: You captured it well, Matthew, in terms of, “We’ve got to go and find this data, clean the data, experiment, iterate on those experiments, potentially bring it to production, whatever, build an interface for it to be consumed” and so on. And so it’s important to be vulnerable and honest with that journey and educate those stakeholders on, “This is the reality of the current state, what we’re working with. We’ve dug into… You came to us with your question, we’ve gone out and did our initial assessment exploration, this is the current landscape that we have and because of that, this is going to be the roadmap, the timeline to achieve what we need, and we’ll engage with you as we go forward.”
0:27:39.0 JH: “We have a weekly, biweekly, whatever that time frame is, dialogue with you to update on progression, pivot and iterate and so on.” But it’s that storytelling that is essential. Going on a bit of a tangent here, that’s… I think, in terms of resources to go and educate and become a data scientist that are out there, those programs do great at learning the technical side of data science, but it’s that relationship, the storytelling side, again, that is as critical as any ability to write an algorithm, to program in Python and so on. How do you inform of what it takes, give transparency to that process, to build that relationship with your business partners, is essential.
0:28:30.5 ME: That makes sense. So the storytelling and the relationships. And it sounds like really, leadership needs to have an understanding of the value and need for analytics to start with, but then they need to have an additional understanding of, it needs to be data by design. And so, you could be walking into a legacy house and you need to figure out how to retrofit. Well, that’s going to have a slower adoption velocity than if I was starting with a brand new system, zero code on a blank screen and I can do data by design. And so the relationship, the communication, the story, that’s probably a pinnacle part of your entire existence, which is communicate.
0:29:11.2 JH: It is, and a good framework for it, that I think can help that story, it’s one, it’s positioning as… Often, it’s a capability, you’re developing a new capability for the organization, which is advanced analytics, assuming you’re not mature, it’s a different state. But that capability building, there’s really four pillars to that. It’s people, process, technology, and governance is kind of what I put into that. And so how do you, within those four pillars again, of people, process, technology and governance, what do you need to accomplish within those pillars? What gaps do you have? And tell the story around that. How do I go and resource this properly? Is it a data issue? Is it an application design issue? Is it a… We don’t have the right question coming from the business? We can’t answer that. This is a better question. Within that building of the capability, put the story together and I think that becomes useful to that dialogue, that relationship building with the business partners.
0:30:14.3 ME: As the idea of data, data science, data analytics is evolving, as its own body of knowledge, its own set of practices, you’re actually doing software development in Python and R. That being said, even though your output includes mathing, lots of it, the reality is, you’re delivering software in some way, shape or form that needs to be integrated into a larger ecosystem of some sort. So different question for you. Based on your experiences and the things that you’ve seen and just the general industry, given that it’s actually a software engineering craft, in addition to all of the wonderful analytical math and algorithm, all the things that you’re doing, do you feel like the data science industry itself recognizes that they are software developers, and therefore they also need to be pursuing software craftsmanship?
0:31:11.7 JH: Yes, mostly. That’s…
0:31:15.4 ME: I realize that was meaty. But anytime somebody says, “I build software,” we need to build reliable software, and that requires lots of good engineering practices.
0:31:26.4 JH: It does, right? So it’s a great question, and the reason I say yes, mostly, is because this relates back to the notion of the different roles and disciplines, data scientists, machine learning engineering and so on, but I follow this as well. I’ll say, I came into this discipline from the statistics side, and not from the software engineering development side. And being vulnerable here, being candid, it shows in the way I write code. So it’s very much I write code for experiment and iteration and prototyping in that data science mindset. And you’re right, what’s needed though when you take that into production, you need quality code that meets the Python style guide, stuff like that, commented well, if you believe in commenting, all that kind of stuff.
0:32:16.8 JH: That’s where that software development really comes into play. And I think the reality is, there’s probably a bit of a mismatch in skills there, if you can… But I think it’s evolving and becoming more refined as we go forward. There is a skill set difference between those two, even from the standpoint of… As we develop and leverage things like GitHub and code repositories and stuff like that, and everything that goes along with software, software engineering, software development, that’s a growing… Has a growing presence on the data science side as well, the collaboration of algorithm, coding and building a notebook, all that kind of stuff. So it’s a great question, but I would say it’s still predominantly kind of an experiment, prototyping side, and then… How do you refine that into well-written production code, on the other side of that.
0:33:16.6 ME: It’s an evolution for everybody. Even historical hardware-based, the rack and stack, brick and mortar, data center type folks, the infrastructure type folks, the people that were historically doing those types… Those focused operational behaviors, that world has changed out from under them as well, where we’ve moved into cloud engineering, and if I can have 100% software-defined everything, then that means all of a sudden, software developers can actually define all of their own infrastructure and networking and failover and all of the rubbish. But at the same time, now, the infrastructure folks actually need to become software developers. So we’re watching lots of amazing and awesome things change, and the data world is just another lovely facet of how we’re evolving, building things that are useful to us. Really, ultimately, you just have to figure out like we all are, is, “What problem are we trying to solve? What are the desired outcomes and what are the things that are necessary to get from there to there?” and then design it and do it in such a way, and especially attitudinally, be willing to change.
0:34:30.2 ME: “I am going to break something. I’m not as smart as I think that I am, and I have to be reminded daily,” and I do get reminded. It’s just an evolutionary thing. I think this journey that you’re on is phenomenal, and it’s not because you have all the answers, it’s because you don’t. That’s what makes it phenomenal. And I think people miss that, when they consider iterative development or iterative change, is, “It’s okay, tomorrow, I’m going to be plus one.” Is that where you think your industry is, is absolutely, plus one? Are you thinking you’re 10X daily like, “Dude, we have a long way to go”?
0:35:12.2 JH: No, I like the way you kind of illustrate that, Matthew. And what’s in that, I think, is most valuable there, it’s the realization that we don’t know everything, and the participants in the room don’t know everything. I think when you’re pursuing, whether it’s a data science objective, whatever it is, having that understanding that we’re all learning, is as valuable as anything, and allows for… I’ve used this term a few times, vulnerability to be present and to be comfortable with that, where I don’t know everything there is to be known about topic X, you may know more than me, but let’s be open about that and build our knowledge collectively, again, expand the size of our pie, as opposed to one of us taking a bigger slice, is I think, an important mindset to have, not only in building and maturing data science, advanced analytics, but in whatever you’re taking on is essential, the scientific mindset. Really, the understanding that once you realize that, you know enough to know that I don’t know, that is a good state to be in.
0:36:28.1 ME: There’s the interesting pure science of this whole conversation, the creation of and evolution of an idea, and then there’s the operational science of this idea, which is, “This business has allocated a million dollars to this project, and it has some amazing set of features that need to exist, that serve these users and these industries, and there’s a definition of done, desired outcomes and all that,” there’s a box. And so somehow, you have this amazing challenge of telling a story that makes the idea of data, where it is in its life span, and the value of data, as it relates to this business and project come to life for somebody to say, “Yep, we should be doing this for sure.” But then you have to figure out how to get inside this existing, moving organism as well, which is, “We build stuff, we move it into production, we generate revenue, serve clients, make them all smile.” You’re building a plane and flying it at the same time, and even though this isn’t a Zoom video for people that don’t know, we do Zoom so that we can interact with each other in video. Jacey, you’re still smiling this whole time like, “Yeah, this is a bunch of crazy, and I love every second of it.”
0:37:43.6 JH: Yes, it’s enjoying the journey, enjoying the grind, whatever term you want to give to it, is essential for, again, not just the path I’m on or you’re on, Matthew, whatever it is, falling in love with that journey and the chaos of that, and the opportunity to learn within that space. My personality, I’m driven by learning. If I see this as an opportunity to learn, that’s what motivates me to go and pursue it and take that on, and data science, advanced analytics, this whole discipline space is rich with that. It’s learning every day. For me, it’s learning a new algorithm, a new mathematical concept, a new development idea. How to integrate, move into a cloud environment. That’s a whole other beast in itself, as all the services of cloud and transforming from on-prem to cloud and everything that goes along with that. So the space for learning is vast, that’s exciting, and it should be.
0:38:50.8 ME: So as we start to wrap up, I wondered if we could get your viewpoint on the idea of data and all of the roles, just example, the roles that you’ve talked about, they may or may not exist in all of the different companies or all of the different HR frameworks or whatever it is we want to talk about, and the value of data and when data and how data and where to include them, and when should it… The front… Did you do it in arrears? Am I good with Frankenstein? Why is… What’s my adoption velocity? Why did it cost so much money just to get this data? What is going… That crazy, crazy mess. If someone is going to say, “Hey, I want to figure out what data analytics is, and I want to figure out how I can leverage these things to evolve my company,” how do people figure out where to start? Is there a clean answer or is it context-driven? Is it just always messy?
0:39:44.8 JH: My perspective on it, it starts with understanding, “What are the desires of the organization?” Obviously, “Are we developing a new product? What’s our strategy look like?” All that kind of stuff, in terms of that vision going forward. And from that, it’s understanding, “What’s the current data landscape look like?” And that’s a beast in itself, in defining that. But it’s really getting your mind around that as a starting point, can often inform, “What are we capable of? What can we do now? And who or what resources do we need to level up and move forward?”
0:40:25.0 JH: As poor as this can sound, I think oftentimes, companies like to just jump to, “Let’s get a data scientist, they’ll solve it.” Well, the data scientist comes in, if they don’t have the data to work on, they’re just kind of floating out there, trying to figure that out or missing that piece. And so, data as a foundation and working on that, I don’t think it’s ever solved, but focusing on that, building it so it becomes a true resource and not just exhaust, that is… That’s, I think, the initial, essential key focus to launch off of. And in that, it may be a combination of data science and data engineering coming together, whatever that is, but I think, in my perspective, that foundation of building a strong, robust data environment is essential to any success that can come out of that, come out of the venture and the path into advanced analytics, machine learning, AI, and so on.
0:41:25.2 ME: If you don’t know what you want to know, or you don’t know where you want to be after this effort has happened, adding a data scientist isn’t going to change anything other than your budget, your run rate, but it’s not going to change your outcomes. So, it’s kind of like, you shouldn’t ever go to the grocery store on an empty stomach and you should know why you’re going there before you walk in, or don’t send me. That’s the net. You really need to know where you want to be, or else don’t just hire somebody.
0:41:55.7 JH: From a data science perspective, hearing the terms, “Go and discover something for me in the data” is often a little cringe-worthy, because then it’s a… You need that objective, I need to know, “Am I trying to make lasagna? So this is the ingredients I have to go get from the grocery store to make lasagna.” Sending us on that, just a wild goose chase, to say, “Go and find X millions of dollars in the data.” It’s possible, but it may not be super probable. But having an objective, “We’re trying to solve this question, this business problem,” then now we have something concrete to anchor around, to go look for in the data and build this for a purpose and objective and so on.
0:42:38.9 ME: Well, I think we ought to go explore some more of these subjects together. So for today, what I want to say is thank you, and I look forward to talking with you again real soon.
0:42:49.8 JH: Thank you, Matthew, I appreciate it.
0:42:55.0 Speaker 2: The Long Way Around the Barn is brought to you by Trility Consulting, where Matthew serves as the CEO and President. If you need to find a more simple, reliable path to achieve your desired outcomes, visit trility.io.
0:43:11.4 ME: To my listeners, thank you for staying with us. I hope you’re able to take what you heard today and apply it in your context, so that you’re able to realize the predictable, repeatable outcomes you desire for you, your teams, company and clients. Thank you.