Some Ideas on Combining Design Thinking and Data Science

Recently, I had the opportunity to finish Stanford SCPD’s XINE 217 “Empathize and Prototype” course, as part of the Stanford Innovation and Entrepreneurship Certificate, which emphasizes the use of design thinking to develop product and solution ideas. It was during this course that I wrote down a few ideas around the use of data in improving design decisions. Design thinking is a modern approach to system and product design which puts customers and their interactions at the center of the design process. The design process has been characterized over decades by many scholars and practitioners in diverse ways, but a few aspects have perhaps remained unchanged. Three of these are as follows:

  1. The essential nature of design processes is to be iterative, and to constantly evolve over time.
  2. The design process always oversimplifies a problem – and introduces side effects into the customer-product or customer-process interactions.
  3. The design process is only as good as the diversity of ideas we use for “flaring” and “focusing” (which roughly translate to “exploring ideas” and “choosing few out of many ideas” respectively).

Overall, the essential idea conveyed in the design thinking process, as explained in XINE 217, is “Empathize and Prototype”, a phrase that conveys a sense of deep customer understanding and focus. The idea of integrating data into the design process is by no means new: engineers starting from Genichi Taguchi, and perhaps even engineers a generation before Taguchi, have been developing systems models of processes or products in their designs. At some level, these systems models are factor-response models, because they are converted to prototypes via parameter models and tolerance design processes.

Statistically speaking, these are analogues of designed experiments, where a range of parameter variables are treated as factors that influence a response, and the experimental runs over those factors are laid out using orthogonal arrays. There’s more detail here.
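To make the factor-response idea concrete, here is a minimal sketch in Python of a two-level designed experiment laid out on the standard L4 orthogonal array (four runs, three factors). The response values are made up purely for illustration, and the helper names `is_orthogonal` and `main_effects` are my own, not taken from any library. The sketch only estimates main effects, and is not meant as a full Taguchi analysis.

```python
from itertools import combinations

# L4 orthogonal array: 4 runs, 3 two-level factors coded as -1 (low) / +1 (high).
L4 = [
    (-1, -1, -1),
    (-1, +1, +1),
    (+1, -1, +1),
    (+1, +1, -1),
]


def is_orthogonal(array):
    """Return True if every pair of columns has a zero dot product."""
    n_factors = len(array[0])
    return all(
        sum(run[i] * run[j] for run in array) == 0
        for i, j in combinations(range(n_factors), 2)
    )


def main_effects(array, responses):
    """Per factor: mean response at the high level minus mean at the low level."""
    effects = []
    for k in range(len(array[0])):
        high = [y for run, y in zip(array, responses) if run[k] == +1]
        low = [y for run, y in zip(array, responses) if run[k] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


# Hypothetical measured responses, one per run (illustrative numbers only).
responses = [10.0, 14.0, 12.0, 20.0]

effects = main_effects(L4, responses)
print("orthogonal:", is_orthogonal(L4))
print("main effects per factor:", effects)
```

Because the columns are orthogonal, each factor’s main effect can be estimated from the same four runs without being confounded with the other two main effects, which is what makes such arrays economical for early prototypes.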

Although described above in a simplified way, data-driven design approaches, grouped under the broad gamut of “statistical engineering”, are used in one form or another to validate designs of mechanical and electrical systems in well-known manufacturing organizations. However, when you look closely at specific stages of the design thinking process, the benefits of data science techniques at those stages become apparent.

The design thinking process could perhaps be summarised as follows:

  1. Observe, empathise and understand the customer’s behaviour or interaction
  2. Develop theories about their behaviour, including those that account for motivations – spoken and unspoken aspects of their behaviour, explicit and implicit needs, and the like
  3. Based on these theories, develop a slew of potential solutions that could address the problem they face (“flare”)
  4. Qualify some of these solutions based on various kinds of criteria (feasibility, scope, technology, cost, to name some) (“focus”)
  5. Arrive at a prototype, which can then be developed into a product idea

While this summary of the design thinking approach may appear very generic and rudimentary, it is applicable to a wide range of situations, and is therefore worth considering. More involved versions of this same process could take on different levels of detail, whether domain-specific or richer in process detail. They could also add more fine-grained steps, to enable the designer to “flare” and “focus” better. As I’ve discussed in a post on using principles of agility in doing data science, it is also possible to iterate the “flare” and “focus” steps, to get better and better results.

Looking more closely at this five-step process, we can identify some ways in which data science tools or methods may be used in it:

  1. Observing consumer behaviour and interactions, and understanding them, has become a science unto itself, and with the advent of video instrumentation, accelerometers and behavioural analysis, a number of activities in this first step of the design thinking process can be improved, merely by better instrumentation and measurement. I’ve stressed the importance of measurement on this blog before – for one, fewer samples of useful data can be more valuable for building certain kinds of models. The capabilities of new sensors also make it possible to expand the kinds of data collected.
  2. Theories of behaviour (hypotheses) may be validated using various Bayesian (or even Frequentist) methods of data science. As more and more data gets collected, our understanding of the consumer’s behaviour can be updated, and Bayesian behavioural models could help us validate such hypotheses as a result.
  3. The “flaring and focusing” routine in steps 3 and 4 of the design thinking process I’ve outlined above is, at one level, the core experimental design practice described by statistical pioneers including Taguchi. Using some of the tools of data science, such as significance testing, effect size determination and factor-response modeling, we could come up with interesting designs and validate them based on relevant factors.
  4. Finally, the process of prototyping and development would involve a verification and validation step, which tends to be data-intensive. From reliability and durability models (based on Frequentist statistics and probability density and cumulative distribution functions), to key life testing and analysis of data in that context, there are numerous tools in the data science toolbox that could potentially be used to improve the prototyping process.
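As a small illustration of the Bayesian updating mentioned in the second point above, here is a minimal sketch in Python using a Beta-Binomial model. It assumes the behaviour of interest can be reduced to a yes/no outcome per observed customer (for example, whether they completed a task with the prototype). The batch counts are invented for illustration, and the function names `update_beta` and `beta_mean` are hypothetical helpers of my own.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate update: a Beta(alpha, beta) prior plus binomial counts
    gives a Beta(alpha + successes, beta + failures) posterior."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Uniform prior: no initial opinion about the success rate (an assumption).
prior = (1.0, 1.0)

# Hypothetical observation batches as (successes, failures); made-up numbers.
batches = [(3, 7), (6, 4), (8, 2)]

posterior = prior
history = []  # posterior mean after each batch
for successes, failures in batches:
    posterior = update_beta(*posterior, successes, failures)
    history.append(beta_mean(*posterior))

print("posterior parameters:", posterior)
print("posterior mean after each batch:", history)
```

The point of the sketch is only that each new round of observation shifts the estimate, so a theory about customer behaviour can be revisited as the data comes in rather than fixed once at the start.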

I realize that a short blog post such as this one is probably too short to explore this broad an intersection between the two domains of design thinking and data science – there’s the added matter of exploring work already done in the space, in research and industry. The intersection of these two spaces lends itself to much discussion, and I will cover related ideas in future posts.

Insights about Data Products

Data products are one inevitable result and culmination of the information age. With enough information to process, and with enough data to build massively validated mathematical models like never before, the natural urge is to take a shot at solving some of the world’s problems that depend on data.

Data Product Maturity

There are some fundamental problems all data products aim to address:

  1. Large scale mathematical model building was not possible before. In today’s world of Hadoop and R/Python/Scala, you can build a very specific kind of hypothesis and test it using data collected on a massive scale.
  2. Large scale validation of an idea was not possible before. Taking a step back from the hypothesis itself, the presence of big data technologies and the ability to test hypotheses of various kinds ultimately helps validate ideas.
  3. Data asymmetry problems can be addressed on a scale never seen before. Taking yet another step back from the ability to validate diverse ideas, the presence of such technologies and models allows us to put power in the hands of decision makers like never before, by arming them with data.

[Figure: data product maturity]

Being Data Driven: Enabling Higher Level Abstractions of Work

Cultivating a data-driven mindset is hard. I have blogged about this before. But when the standard process workflows (think Plan-Do-Check-Act and Deming) are augmented by analytics, it is amazing what happens to “regular work”. The need to collect, sort and analyze data in a tireless, diligently consistent and unbiased fashion gets delegated to a machine. The people in the organization are no longer tied up in the mundane activities of data collection and management. Instead, their higher reasoning faculties are put to use: doing the data analysis that results in insight, and interpreting and reviewing the strategic outcomes. The higher levels of abstraction of work that data products enable help organizations and teams mature.

And this is the primary value addition that a lot of data products seem to bring. The tasks that humans are either too creative for (or too easily bored by) get automated, and in the process, the advantages of massive data collection and machine learning are leveraged, to bring about a decision making experience that truly eclipses prior generations of managers in the ability to get through complex decisions quickly.

Data Product Opportunities

Data products will become a driving force for industrializing third world nations, and may become a key element of the business strategy of the largest of the large corporations. The levels of uncertainty in business today echo the quality of tools available, and the leverage that these tools bring. The open source movement has accelerated product development teams in areas such as web development and search technologies, and has made the internet the de facto medium of information for a lot of youngsters. Naturally, these youngsters will warm up to the data products available to them faster than previous generations did. Data products could improve the lives of millions, by enabling the access economy.

[Figure: data product opportunities, shown as quadrants]

While the action is generally in the upper right quadrant here, with companies fighting it out for more subscribers and catering to modern segments of industry that are more receptive to ideas, the silent analytics revolution may actually happen in brick and mortar companies that have fewer subscribers and either have a more traditional mindset or operate in a more traditional business. Wherever possible, companies are delivering value by digitization, but a number of services cannot be digitized in this way, and here is another enabling opportunity. The data products in this space need not attempt to replace the human, or replace the traditional value proposition. Instead, they can function in much the same way that IoT is disrupting enterprises. Embedded systems and technologies are definitely one aspect of the silent analytics revolution in the bottom left quadrant, which may have large market fragmentation and entrenched business models that haven’t moved on from ideas that are decades or centuries old.