Andrew Ng’s DeepLearning.AI (Coursera) Certification


One of the more interesting mental models of machine learning I’ve come to understand in the last month or so is the “five tribes of artificial intelligence” model popularized in “The Master Algorithm” by Pedro Domingos. To summarize in a phrase, the master algorithm is that approach which can uncover all possible insight from data – and Prof. Domingos hypothesises that there are five distinct such “master algorithms”, one for each of these tribes. One of these “tribes” is the connectionists, whose master algorithm is, in fact, backpropagation, which is central to the design and operation of neural networks.

A Connectionist Tour Guide

In a sense, the deep neural network has become synonymous with artificial intelligence today. There are numerous other algorithms which could lend a sense of intelligence to machines – whether by communicating in natural language as a conversationalist (starting from rudimentary bots like ELIZA through Pootwattle and Smedley (of U Chicago fame), to modern chatbots), or by learning to tell faces apart, or to identify specific emotions. The deep neural network has been applied successfully to numerous such real-world problems, and therefore stands out as promising on this account. For the other tribes, we don’t yet have algorithms such as “advanced induction inference machines”, or “higher dimensional kernel machines” – whatever these may indicate (really or apocryphally). So it behooves us to pay attention to stories such as this one, which discuss the “unreasonable effectiveness” of neural networks.

 

DeepLearning.AI’s Course

There’s definitely a skills gap in the advanced machine learning and artificial intelligence space. Businesses are as yet unable to see value beyond the hype. Unsurprisingly, the skills gap has to be addressed at the very root – the fundamentals, where the ability to model problems, computationally solve them, and build systems out of such solutions intersect. Andrew Ng has, also unsurprisingly, taken a stab at the deep learning space, if his “AI is the new electricity” talk is anything to go by.

 

 

Over the last few weeks, I’ve had the opportunity to spend some time on Andrew Ng’s Deep Learning course from DeepLearning.AI. For me, this is like a tour guide to the world of the connectionists. The reality is that neural networks don’t work like the human brain apart from superficial similarities – as Ng himself explains in the course – but the term has stuck, because early pioneers who also knew some neuroscience were motivated by the analogy and gave the field its moniker.

The Coursera certification is organized into five different courses, and the first of these lays the mathematical and programmatic foundation for implementing neural networks. This first course, titled Neural Networks and Deep Learning, has well-orchestrated exercises within Coursera’s integrated Jupyter notebook interface, and you can use the algorithm on your own data to evaluate its performance. I’m currently some way through the second course, having finished the first one – and I have to say that the videos, programming exercises and other course aspects create a true learning feedback loop, which is effective in teaching the basics really well. I’m very impressed with the way the course has been put together and made accessible to those with a little bit of machine learning knowledge who are starting out on neural networks and deep learning.

Course Experience

In the section below, I’ll outline my key learnings from the first course in the certification. I hope that you take the course if you are an ML and AI enthusiast or young professional (or even an experienced one) interested in working on deep learning.

  1. The course introduced the most fundamental ideas of neural networks at the very start, with extensive coverage of how to implement a logistic regression model for classifying data. This initial discussion was built up rather nicely into a discussion on deep learning.
  2. As an intermediate course, it assumes some amount of knowledge of linear algebra and differential calculus. As someone who works with machine learning models, I was able to grasp the intuitions with one repetition. If it has been a while since you worked through linear algebra and differential calculus (or thought through equations, at the very least), expect to take a while to find your feet.
  3. Some of the intuitions around gradient descent, the values of derivatives, and so on, were introduced very handily – and were reinforced through the exercises.
  4. The importance of vectorization and its central use in numpy (which is used extensively – nay, almost exclusively – throughout the course) was well brought out. Numpy is a powerful library and, surprisingly, received its first funding only in 2017, after being useful for the development of numerous algorithms and tools. Some of its quirks, such as rank-1 arrays of shape (n,), which behave as neither row vectors nor column vectors, were especially interesting and useful to learn about. Although this isn’t a numpy tutorial by any stretch, the library is used extensively.
  5. During weeks 2 and 3, the logistic regression algorithm is taught in a different context – it is likened to neurons in a deep net, and the details of activation functions are discussed. This, to me, was the meat of the course.
  6. In weeks 2 and 3, a consistent methodology and notation were followed for discussing and implementing forward and backward propagation, two of the key mechanisms in any neural network. This was done entirely within numpy, and these are great hands-on lessons. Gradient descent was also explained and implemented.
  7. Finally, in week 4, deep neural networks were handled, and parametrization of the neural network topology was introduced. Ideas related to this, such as hyperparameter optimization, were also discussed. Additionally, in both videos and assignments, Andrew Ng provided practical advice on how to get the dimensions right for weight matrices and bias vectors. Without this and the consistent notation, a lot of the programming implementations of DNNs could potentially get very hairy, so I personally felt that this was very well handled.
  8. A cat classifier deep neural network in Week 4 – because who doesn’t like cats?
  9. Right through the course, there are optional video lectures, and interviews with well known researchers. One of them is with Geoff Hinton, and it was definitely instructive.
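
The ideas in items 4 to 6 above (the shape (n,) quirk, vectorization, and forward and backward propagation for logistic regression) can be sketched in a few lines of numpy. This is a minimal illustration and not the course's assignment code: the function names `propagate` and `train`, the shape convention (one example per column), the toy data and the hyperparameter values are all assumptions made for this example.

```python
import numpy as np

# Rank-1 array quirk: shape (n,) is neither a row nor a column vector.
a = np.arange(3.0)            # shape (3,)
assert a.T.shape == (3,)      # transposing a rank-1 array changes nothing
col = a.reshape(-1, 1)        # explicit column vector, shape (3, 1)


def sigmoid(z):
    """Logistic activation, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def propagate(w, b, X, Y):
    """One vectorized forward and backward pass for logistic regression.

    Assumed shapes: w (n_x, 1), b scalar, X (n_x, m), Y (1, m).
    """
    m = X.shape[1]
    A = sigmoid(w.T @ X + b)                  # forward pass, shape (1, m)
    cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    dZ = A - Y                                # backward pass starts here
    dw = (X @ dZ.T) / m                       # shape (n_x, 1), same as w
    db = float(np.sum(dZ)) / m
    return dw, db, cost


def train(X, Y, learning_rate=0.1, num_iterations=1000):
    """Full-batch gradient descent on the single-neuron model."""
    w = np.zeros((X.shape[0], 1))
    b = 0.0
    costs = []
    for _ in range(num_iterations):
        dw, db, cost = propagate(w, b, X, Y)
        assert dw.shape == w.shape            # cheap dimension check
        w -= learning_rate * dw
        b -= learning_rate * db
        costs.append(cost)
    return w, b, costs


# Hypothetical toy data: the label is 1 when the two features sum to > 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 200))
Y = (X.sum(axis=0, keepdims=True) > 0).astype(float)

w, b, costs = train(X, Y)
predictions = (sigmoid(w.T @ X + b) > 0.5).astype(float)
accuracy = float(np.mean(predictions == Y))
```

Note that there is no Python loop over the training examples: the whole batch goes through `w.T @ X` in one matrix product, which is the point of vectorization. The shape assertion inside `train` is the kind of dimension check that keeps larger implementations from getting hairy.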

 

 

 

I’m about half-way through the second course, on Improving Deep Neural Networks, and my experience there has been similar to the first course. The content derives directly from the content of the first course, and therefore, going in sequence from the first to the second definitely has its advantages. If you were to start the second course of the specialization first, expect to spend some time to find your feet. So far, I only wish there had been better explanations of ideas like dropout and L2 regularization, especially given the tricky quizzes in Week 1. This is a 3-week course, and I wish an additional week, or a few more videos had been spent initially, explaining and firming up ideas around regularization. Additionally, the exploding/vanishing gradient problems could be better illustrated with videos and so on, although I felt the course generally does a good job of explaining the essentials of these ideas.
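
For readers who want something concrete for the two regularization ideas mentioned above, here is a small numpy sketch of L2 regularization and inverted dropout as they are commonly formulated. It is not taken from the course materials: the helper names `l2_penalty`, `l2_gradient` and `inverted_dropout`, and the values of `lambd` and `keep_prob`, are assumptions chosen for illustration.

```python
import numpy as np


def l2_penalty(weights, lambd, m):
    """L2 term added to the cost: (lambd / (2 * m)) * sum of squared weights."""
    return (lambd / (2 * m)) * sum(np.sum(np.square(W)) for W in weights)


def l2_gradient(W, lambd, m):
    """Extra gradient that the L2 term contributes for one weight matrix."""
    return (lambd / m) * W


def inverted_dropout(A, keep_prob, rng):
    """Keep each activation with probability keep_prob, then rescale.

    Dividing by keep_prob keeps the expected activation unchanged, so no
    rescaling is needed at test time (when dropout is switched off).
    """
    mask = rng.random(A.shape) < keep_prob
    return (A * mask) / keep_prob, mask


# Hypothetical values, chosen only to exercise the helpers.
rng = np.random.default_rng(1)
W1 = np.ones((3, 2))
W2 = np.ones((1, 3))
penalty = l2_penalty([W1, W2], lambd=0.5, m=10)
extra_dW1 = l2_gradient(W1, lambd=0.5, m=10)

A = np.ones((4, 1000))
A_drop, mask = inverted_dropout(A, keep_prob=0.8, rng=rng)
```

The L2 term penalizes large weights and shows up in backpropagation as an extra `(lambd / m) * W` added to each weight gradient. Dropout randomly silences units during training so that the network cannot rely on any single one of them, and the same `mask` would be reused in the backward pass for that layer.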

Concluding Remarks

To conclude, I’d recommend this certificate for those in the analytics, data science or machine learning space who are reasonably hands-on, can grasp linear algebra and calculus, and can work with Python. Since this is an “intermediate” specialization, neophytes will likely need multiple viewings of the videos to become conversant in the ideas and concepts. This still shouldn’t deter those who want to audit the course or learn the concepts therein for a deeper understanding to back up their direct experience in machine learning.

Related Content

  1. My Quora answer on Deeplearning.AI’s Coursera course

Crosspost: Some Hard Truths About Becoming a Data Scientist

I’ve spent a couple of years in a data and analytics startup that has a consulting focus. As I’ve said elsewhere on this blog, my background in engineering and quality data analysis led me (now it seems inexorably) to this interesting role as a consultant with a focus on data and analytics. While I’ve worked on several solution development activities, my primary mandate in the organization is as a business consultant and a data scientist with experience in specific industry areas, such as manufacturing. Over the last couple of years, I’ve spent a significant amount of time doing data science, working with other data scientists, and leading and mentoring data science professionals in projects. I’ve also had to conduct numerous interviews (I’ve lost count) of engineers and non-engineers who are interested in breaking into the data science world. And with good reason – after all, data scientist careers have received a lot of (sometimes undeserved) hype recently. Some of my insights below on becoming an effective data scientist were published as a Quora answer originally – but in this blog post, I hope to expand on that answer, and provide a bit of a guide for those charting out data science careers. So, here we go.

What are some hard truths about becoming a (good) data scientist?

  1. Your higher degree matters, but much less than you think. If you have a degree such as a PhD or a Masters in a specific area such as machine learning and computer science, you will do better at data science than many others who don’t have such credentials. If you have a PhD or Masters in a technical field that didn’t involve much data analysis, you’re likely to not be a great fit without acquiring new skills. However, the degree can only take you so far, as you have to be cognizant of the frameworks and technologies often used for doing data science, and constantly learn from them.
  2. Stay away from data analytics specific masters programs – or evaluate them very critically. In my experience, a number of these programs don’t teach what they claim to, and many are overpriced. The latter is in fact a big reason why I wouldn’t recommend a data science specific higher degree to anyone right now, especially given there is no dearth of MOOCs or such resources. If you come in with significant experience, you may actually be diminishing your profile’s worth by studying in such a program (much like what certain MBA programs do to successful functional experts’ careers).
  3. Business experience counts a great deal in data science. Domain knowledge does too. If you have neither, expect to spend a good deal of time learning about a specific domain or requiring a subject matter expert to work with you.
  4. Hypothesis generation and validation are more important than you might imagine. Frameworks, tools and software are only one aspect of a data scientist’s work. You have to be able to think of business-relevant hypotheses and ideas based on the data, and ask hard questions. If you’re unable to do this, regardless of your degree, or your knowledge of this tool or framework or that one, you’ll not be a successful data scientist.
  5. Ignore the basics at your own peril. Many data scientists are hired without truly testing their knowledge in numerical analysis, linear algebra, optimization and machine learning. Numerous data scientists are also guilty of not checking underlying assumptions of algorithms, or making assumptions about their data in other ways. Few data scientists really understand computational engineering and how optimization is used in machine learning, to the point that they’re able to build algorithms on their own. If you want to stand out, make sure your basics in these areas are solid. It may mean going back to the books often, but it is rewarding and worth it, ultimately.
  6. Communication and presentation skills matter a great deal. Being a good data scientist also means having great communication and presentation skills – without which you’ll be a fish out of water, building models and systems but not able to stand up for why they work, and without being able to explain their benefits.
  7. The “ideal data scientist” unicorns are truly mythical. Data scientists who have the required domain experience, and have sufficient mathematics/statistics, programming and communication skills – these are unicorns, and you’ll rarely find someone who checks all the boxes. So if you’re looking to become a unicorn, expect to put in significant effort, time and energy in keeping yourself up to date.
  8. Prototyping is central to data science work. When you’re building models, most of them will be throwaway models and prototypes, and only a few will perform well enough on both training data and new data. Oftentimes, the only way to make your model perform better is knowing the domain well.
  9. Data platform understanding is more important than you might imagine. Without an understanding of data platforms on which data science is done, there is little chance of being a successful data scientist. You need to have sufficient knowledge of databases, query languages, data storage, management and governance, distributed databases, and so on.
  10. There is still a talent crunch in the data science world, but perhaps not for long. More management teams have come to prime their expectations on how to do data science and what to expect from data scientists. Additionally, data science frameworks and tools are being democratized, and many people are learning and skilling up on the job. This means that management teams have solved the data scientist skill shortage problem we used to hear about a lot in 2016, by using in-sourcing to a large extent.
  11. AI and knowledge modeling are key and underrated areas of data science. Because a number of companies that use data in company systems are also looking for expert systems, knowledge based systems are making a comeback – many of these are newfangled versions of old rule based systems that learn from data, and use different kinds of knowledge representations. AI is more closely related to data science than many people care to reason, and I think they will merge as a discipline.
  12. A static skill set will get you nowhere in data science. Perhaps a repetition, but worth repeating, if you’re an aspiring or current data scientist. Unless data scientists continue to learn new skills, new methods of analysis and new frameworks, there’s a very low chance of continued success and satisfaction at work for data scientists.

Note: This was originally an answer to a Quora question on becoming a data scientist.

Crosspost: Data Driven Organizational Change

In my work as senior consultant for businesses seeking to benefit from data, I’ve come across many different hard management problems that impede the progress of change initiatives, and data analytics initiatives.

The first among these has to be just the lack of knowledge of capabilities and possibilities from data, which is alleviated somewhat by the primary function of a consultant – to provide relevant advice. While it sounds simple, the process of providing the right, relevant advice can be complicated, and needs to take in the business context of the organization rather well, and should encapsulate implicit solutions for the key problems as characterized in our study of the client organization. A second hard management problem is feasibility and cost effectiveness. This is where, in addition to the specific set of ideas that are presented as solutions, and what kind of value they can bring to the organization, we’re interested in how the specific client organization’s investment appetite suits the solution in question.

These first and second hard problems I’ve mentioned generally intertwine with a third, more all-encompassing, and more foundational exercise, change management.

My podcast below discusses many of the real issues around change management in organizations, and suggests how data can be a differentiating factor in change programs. In the podcast are what I think are a couple of interesting perspectives on data driven decision making.