A View of DevOps from the World of Data Science

Operations management as a discipline has taken many shapes and forms in different industries over the years, but there is perhaps something unique that is discussed in software development operations, commonly referred to as DevOps. Many of these considerations around DevOps also apply to the related and increasingly interesting subset of problems that is MLOps, which is the field of Machine Learning Operations. So, what is unique about DevOps and the discussions of software development operations in this context?

Perceptions of DevOps Today and Contrasts with Traditional Industry Operations

A recent tweet by a manager hiring for DevOps roles sparked an outpouring of ideas from me. The entire thread is below, with the original tweet for context.

Popular understanding of DevOps seems to revolve around tools: tools for managing code and workflows, and applications for helping with one thing or another encountered in software development workflows. Strangely enough, operations management in more established industries, such as manufacturing, oil and gas or energy, or telecommunications, tends to have the following sets of considerations:

  1. People considerations: From the hiring and onboarding of talent for the organization, to the development of these as productive employees, to employee exit. Operational challenges here may be the development of role definitions, establishing the right hierarchy or interactions for smooth operations, and ensuring that the right talent is attracted and retained in the organization.
  2. Process considerations: All considerations spanning the actual process of value delivery, whereby the resources available to the organization are put to use to efficiently solve day-to-day problems and meet customer requirements on an ongoing basis. Some elements of innovation and continual improvement would also fall into the ambit of the process management that’s part of Operations.
  3. Technology considerations: All considerations spanning the application of various kinds of technology ranging from the established and mundane, to the innovative and novel – all of these could be considered a part of technology management within Operations in traditional organizations.

Anyone familiar with typical product-centric or services-oriented software development organizations will observe that the above three considerations are spread out among other supporting functions of these organizations. Perhaps technically centred organizations with very specific engineering and development functions evolve this way, and perhaps there is research to support this hypothesis. However, the fact remains that what is considered development operations doesn’t normally involve the hiring and development of talent for product/solution engineering or development, or the considerations around the specific technologies used and managed by the software developers. These elements seem to be subsumed by human resources and architects respectively.

Indeed, the diversification of roles in software development teams is so prolific that delivery managers (of the so-called Scrum teams) are rarely in charge of the development operations process. They’re usually owners of specific solution deliverables. The DevOps function has come to be seen as a combination of software development and tooling roles, with an emphasis on continuous delivery and code management. This isn’t necessarily a bad thing, and there is a need for such capabilities – arguably there is a need for specialists in these areas as well. But it does pose a challenge for many managers hiring mid-senior professionals to manage DevOps.

Cross Functional DevOps and Lean in Manufacturing Operations

When we have DevOps engineers and managers only interested in setting up pipelines for writing and managing code, rather than thinking holistically about how value is being delivered, and whether it is, we miss crucial opportunities for continuous improvement.

As someone who has worked in both manufacturing product development and software product development teams, I find that there needs to be a greater emphasis in software development organizations on cross-functional thinking and cross-functional problem solving. While a lot of issues faced by developers and engineers in the context of product or solution development are solved by technical know-how and technical excellence, there are broader organizational considerations that fit into the people, process and technology focus areas, and without these, wise decisions cannot be taken. A lot of these decisions have to do with managing waste in processes – whether that is wasted effort, time or creativity, or technical debt we build up over time, or redundancy for that matter. The Lean toolbox, which originated in the manufacturing industry, provides us a ready reckoner for this, titled the “eight wastes in processes”: inventory, unused creativity, waiting, excess motion, transportation, overproduction, defects and overprocessing. Short of seeing all development activities through these “waste lenses”, we can use them as general guidelines for keenly observing the interactions between developers, their tools, other developers, and code. Studying these interactions could yield numerous benefits, and perhaps such serious studies are common in some large enterprise DevOps contexts, but at least in the contexts I’ve seen, there are rarely discussions of this nature with nuance and deep observation of processes.

In fact, manufacturing organizations see Lean in a fundamentally different way from how software development teams see it.

Manufacturing organizations heavily emphasize process mapping, process observations and process walks. I shouldn’t paint all manufacturing organizations with the same brush, because the good and the bad ones in this respect are like chalk and cheese – they’re poles apart in how well they understand and deploy efficient operational processes through Lean thinking. Many may claim to be doing Six Sigma and structured innovation, and in many cases, such claims don’t hold water because they’re using tools to do their thinking.

Which brings me to one of the main problems with DevOps as it is done in the software development world today – the tools have become substitutes for thinking, for many, many teams. A lot of teams don’t evaluate the process of development critically – after all, software development may be a team sport, but in a weird way, software developers can be sensitive to replay and criticism of their development approaches. This is reminiscent of artisans in the days before mass production, and how they developed and practised an art in their day to day trade. It is less similar to what’s happening in large scale car or even bottle manufacturing plants around the world. Perhaps there are good reasons for this too, like the development of complexity and the need for specialization for building complex systems such as software applications, which are built but once, but shipped innumerable times. All this still doesn’t imply, however, that tools can become substitutes for thinking about processes and code – there are many conversations in that ambit that could be valuable, eye-opening elements of any analysis of software development practices.

MLOps: What it Ought to Include

Now I’ll address machine learning operations (MLOps), which is a modern cousin of DevOps, relevant in the context of machine learning models being developed and deployed (generally as some kind of software service). MLOps has evolved in much the same way we saw DevOps evolve, but there is a set of issues here that go beyond the software-level technicalities, to the statistical and mathematical technicalities of building and deploying machine learning systems.

MLOps workflows and lifecycles appear similar to software development workflows as executed in DevOps contexts. However, there ought to be (and are) crucial differences between the workflows of these two disciplines (software engineering and machine learning engineering).

Some of the unique technicalities for MLOps include:

  1. Model’s absolute performance, measured by metrics such as RMSE or F1 score
  2. Model deployment performance against SLAs such as latency, load and scalability
  3. Model training and retraining performance, and scalability in that context
  4. Model explainability and interpretability
  5. Security elements – data and otherwise, of the model, which is a highly domain-dependent conversation
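To make the first item on this list concrete, here is a minimal Python sketch of the two metrics it names, written from their standard definitions using only the standard library. The function names and the sample labels are my own assumptions for illustration, and a real project would more likely rely on a tested library implementation.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error for a regression model's predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def f1_score(y_true, y_pred, positive=1):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: one true positive, one false positive, one false negative.
print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
print(rmse([0.0, 0.0], [3.0, 4.0]))
```

Numbers like these only become operationally useful once the team agrees on what thresholds they must meet before and after deployment, which is a process question as much as a technical one.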

In addition to these purely technical elements of MLOps, there are, to my mind, elements of the discipline that should cover people and processes:

  1. Do we have engineers with the right skills to build and deploy these models?
  2. Have we got statisticians who can evaluate the underlying assumptions of these ML models and their formulation?
  3. Do we have communication processes in the team that ensure timely implementation of specific ML model features?
  4. How do we address model drift and retraining?
  5. If new training data comes from a different region, can it be subject to the same security, operational and other considerations?
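As one illustration of the drift question above, here is a minimal sketch of a drift check using the population stability index (PSI), a commonly used measure of how far a feature’s distribution has moved from its training baseline. The choice of PSI, the bin count, the smoothing constant and the sample data are all my own assumptions for illustration; they are not prescribed by any particular MLOps tool.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a baseline sample and a new sample of one feature.

    Bins are equal-width over the baseline's range; values outside that
    range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant baseline

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm below is defined.
        return [max(c / len(values), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: a uniform baseline, and new data shifted to the upper half.
baseline = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

print(population_stability_index(baseline, baseline))  # identical samples give 0.0
print(population_stability_index(baseline, shifted))   # a large positive value
```

A frequently quoted rule of thumb treats a PSI above roughly 0.2 as a signal to investigate or retrain, but that cut-off is a convention rather than a guarantee, and the right response still depends on the people and process questions listed above.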

There may be more, and those of you who have faced production-scale ML model development and deployment challenges may have more to add. MLOps should therefore see significant discussions around these elements, and these and other related discussions should happen early and often, in the context of ML model deployment and maintenance.

Statistical Competence and Its Importance for Good Data Science Careers

In 2019, enterprises routinely begin initiatives related to analytics, data science and machine learning that invoke specific technologies from a very early stage in their initiatives. This tendency to put technology ahead of value sometimes extends to analytics champions and managers who take up or lead data-intensive initiatives. While this may seem pragmatic at one level, at another level, it may lead to significant problems when ensuring successful outcomes from such analytics initiatives and programs. In this post, I’ll address the three-pronged conundrum of statistical competence in the data science world, specifically in the context of data science consulting and services, and specifically what it means for the careers of data science candidates now and in the future.

Hiring Statisticians: An Expert’s View

Kevin Gray is one of my connections on LinkedIn who posts insightful content on statistical analysis and related topics on a regular basis, including very good recommendations for books on various statistical and analytical techniques and methods. One of his recent posts was an article he’d authored titled “What to Look For in a Statistician” (the article, and my comment), which definitely resonated with my own experiences in hiring statistically competent engineers in different settings, such as data science and machine learning, between 2015 and today. In years past, I have had similar experiences when hiring competent product engineers and manufacturing engineers in data-intensive problem solving roles.

The importance of statistical thinking and statistical analysis in business problem solving cannot be overstated. However, even good advice that is canon, and that is well-acknowledged, often falls on deaf ears in the hyper-competitive data science job market. Both hiring managers and recruiters tend to emphasize keywords comprising the latest framework or approach, over the ability to think critically about problem statements, carefully architect systems, and rigorously apply statistical analysis and machine learning to real world problems while keeping considerations of explainability in mind.

The Three-Pronged Conundrum of Data Science Talent

Now you might ask why I say this, and what I really mean by this. The devil, as they say, is in the details, and one essential problem with the broad and wide proliferation of tools, frameworks and applications of high capability, that can perform and automate statistical analysis of different kinds, is the following three-pronged conundrum:

  1. Lack of core statistical knowledge despite having a working knowledge of the practicum of advanced techniques: Most candidates in the data science job market who are deeply interested in building data science and ML applications have unfortunately not developed skills in the core statistical sciences and statistical reasoning. Since statistics is the foundation for machine learning and data science, this degrades the quality of projects and programs which have to rely on hiring such talent. When they prefer to use software to do most of or all the thinking for them, their own reasoning about the problem is rarely good enough to critically evaluate different statistical formulations for problems, because they think in very set and specific ways about problems thanks only to their familiarity with the tools.
  2. Tools as an unfortunate substitute for statistical thinking: Solutions, services and consulting professionals in the data science and advanced analytics space, who have to bring their best statistical thinking to client-facing interactions, are often unable to differentiate between competence in statistical thinking, and competence in a specific software tool or approach.
  3. Model bloat and inexplicability: The use of heavy, general purpose approaches that rely on complex, less explainable models, rather than reliance on simpler models that are constructed upon a fuller understanding of the true dynamics of the problem.

These three sub-problems can derail even the best envisioned data science and machine learning initiatives in product / solution delivery firms, and in enterprises.

Some “Unsexy” Characteristics of Good Data Scientists

These are also not “sexy” problems – they’re earthy, multi-dimensional, real world problems that have many contributing factors, from business and how it is done, to the culture of education and the culture of software and solution development teams. Kevin Gray in his post touches upon attitudinal qualities for good statisticians, which could also be extended to data science leaders, data scientists and data engineers:

  1. Integrity and honesty are important in data science – this is true especially in a world where personal data is being handled carelessly and sometimes gratuitously by many applications without heed to data protection and privacy, and when user data is taken for granted by many technology companies. This is not an easy expectation or evaluation point for hiring managers, since it is only long association with anyone which allows us to build a model of their integrity, and rarely does one effectively determine such an attribute in short interviews. What’s dismal about data science hiring sometimes, is the proliferation of candidate resumes which are full of fluff, and the tendency of candidates to not stand up to scrutiny on skills they identify as “key” or “core” skills.
  2. Curiosity and a broad spectrum of interests – this cannot be overstated in the context of a consulting data science or machine learning expert. The more we’re aware of different mental models and theoretical frameworks of the world and the data we see in it, the better we’re able to reason starting from hypotheses about the data. By extension, we’re better able to identify the right statistical approaches for a problem when we start from and explore different such mental models. The book I’ve linked to here by Scott E. Page is a fantastic evaluation of different mental models. But with models come biases; to restate George E. P. Box’s famous quote, “All models are wrong, some models are useful”.
  3. Checking for logical fallacies is key for data science reasoning – I would add to the critical thinking element mentioned in Kevin’s post, by saying that it behooves any thought leader such as a data science consultant to critically evaluate their own thinking by checking for logical fallacies. When overlooked, a benign piece of flawed reasoning can turn into a face-melting disaster. The best way to ensure this does not happen is to critically evaluate our ideas, notions and mental models.
  4. Don’t develop one hammer, develop a tool box – Like experienced plumbers, carpenters or mechanics, the tools landscape of a data scientist today should not be one of quasi-religious fervor in promoting one technique at the cost of others, such as how deep learning has come to be promoted in some circles as a data science panacea. Instead, the effective data scientist is usually pragmatic in their approach. Like a tailor or carpenter who has to cut or join different materials with different instruments, data scientists today do not have the luxury of getting behind one comfortable model of thinking about their tool set and profession – and any attempt to do this can be construed as laziness (especially for the consulting data scientist) at best. The customer may be said to be always right, but there are times when the client is wrong, and it is at these times that they need the advice of a qualified statistician or data scientist. If there is one time when data scientists should not abandon their statistical thinking, it is this kind of situation.

Concluding Remarks

To conclude, data scientists ought not to be seen as resources that take data, analyze it using pre-built tools, and write code to explain the data using pre-built libraries of various kinds. They’re not software jockeys who happen to know some statistics and have a handle on machine learning workflows. Data scientists’ work scope and emphases as industry professionals and consultants go way beyond these limited definitions. Data scientists are expected to be dynamic, statistically sound professionals who critically evaluate real world problems based on theories, data and evidence drawn from many sources and contexts, and who progressively build a deeper understanding of these problems, one that leads to tangible value for their customers, be they businesses or the consumers of products. The sooner data scientists realize this, the better off they will be while charting out a truly successful and fulfilling data science career.