My Education in Machine Learning via Coursera, A Review So Far

As of today I’ve completed my fifth course at Coursera, all but one being directly related to Machine Learning. The fact that you can now take classes, at no cost at all, given by many of the most well-known researchers in their fields who work at some of the most lauded institutions is a testament to the ever-growing impact that the internet has on our lives.  It’s quite a gift that these classes started to become available right about when demand for Machine Learning started to skyrocket (and right about when I entered the field professionally).

Note that all effort estimates include the time spent watching lectures, reading related materials, taking quizzes and completing programming assignments.  Classes are listed in the order they were taken.

Machine Learning (Fall 2011)
Estimated Effort: 10-20 Hours a Week
Taught by Andrew Ng of Stanford University, this class gives a whirlwind tour of the traditional machine learning landscape. In taking this class you’ll build basic models for Regression, Neural Networks, Support Vector Machines, Clustering, Recommendation Systems, and Anomaly Detection. While this class doesn’t cover any one of these topics in depth, this is a great class to take if you want to get your bearings and learn a few useful tricks along the way. I highly recommend this class for anyone interested in Machine Learning who is looking for a good place to start.

Probabilistic Graphical Models (Spring 2012)
Estimated Effort: 20-30 Hours a Week
Taught by Daphne Koller of Stanford University, who wrote the de facto book on Probabilistic Graphical Models (weighing in at 1280 small-print pages). This class was a huge time investment, but well worth the effort. Probabilistic Graphical Models are the relational databases of the Machine Learning world in that they provide a structured way to represent, understand and perform inference on statistical models. While Daphne couldn’t cover the entire book in a single class, she made an amazing effort of it. After you complete this course you will most definitely be able to leverage many different kinds of Probabilistic Graphical Models in the real world.
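As a rough illustration of what "represent and infer" means here, the sketch below encodes a tiny three-variable Bayesian network and answers a query by brute-force enumeration. It is not taken from the course; the network (Rain, Sprinkler, WetGrass), every probability value, and the function names are made-up assumptions chosen only to keep the example small.

```python
from itertools import product

# Hypothetical network: Rain -> Sprinkler, and (Rain, Sprinkler) -> WetGrass.
# All numbers below are invented for illustration.
P_RAIN = {True: 0.2, False: 0.8}

# P(sprinkler | rain), indexed as P_SPRINKLER[rain][sprinkler]
P_SPRINKLER = {
    True: {True: 0.01, False: 0.99},
    False: {True: 0.4, False: 0.6},
}

# P(wet = True | rain, sprinkler), indexed by (rain, sprinkler)
P_WET = {
    (True, True): 0.99,
    (True, False): 0.8,
    (False, True): 0.9,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability, factored along the edges of the graph."""
    p_wet = P_WET[(rain, sprinkler)]
    return (
        P_RAIN[rain]
        * P_SPRINKLER[rain][sprinkler]
        * (p_wet if wet else 1.0 - p_wet)
    )


def posterior_rain_given_wet():
    """P(rain = True | wet = True), summing out the sprinkler variable."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(
        joint(r, s, True) for r, s in product((True, False), repeat=2)
    )
    return numerator / evidence


print("P(rain | wet grass) =", round(posterior_rain_given_wet(), 4))
```

Enumeration like this grows exponentially with the number of variables, which is why more efficient inference algorithms matter for models of realistic size.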

Functional Programming Principles in Scala (Fall 2012)
Estimated Effort: 3-5 Hours a Week
Taught by Martin Odersky, who is the primary author of the Scala programming language. I entered this class with an existing strong knowledge of functional programming, so I’d expect this class to be a bigger time investment for someone who isn’t quite as comfortable with the topic. While not directly related to Machine Learning, knowledge of Scala lets you leverage distributed platforms such as Hadoop and Spark, which can be quite useful in large scale Entity Resolution and Machine Learning efforts. Also related, one of the most advanced frameworks for Probabilistic Graphical Models is written in and designed to be used from Scala.  In taking this class you’ll most certainly become proficient in the Scala language, but not quite familiar with the full breadth of its libraries. As far as functional programming goes, you can expect to learn quite a bit about the basics such as recursion and immutable data structures, but nothing so advanced as co-recursion, continuation passing or metaprogramming. Most interesting to me was the focus on beta reduction rules for the various language constructs.  Together these loosely form a primer for implementing functional languages.

Social Network Analysis (Fall 2012)
Estimated Effort: 5-10 Hours a Week
Taught by Lada Adamic of the University of Michigan. Social Network Analysis stands out from the others as I’ve never been exposed to anything quite like it before. In this class you learn to measure various properties of networks and several different methods for generating them. The purpose of this is to better understand the structure, growth and spread of information in real human social networks. The focus of the class was largely on intuition and I was a bit unhappy with the sparsity of the mathematics, but this certainly makes it a very accessible introduction to the topic. After completing this class I guarantee you’ll gain new insight into your corporate structure and will see your Twitter network in a whole new way. If I were to pick one class from this list to recommend to a friend regardless of background or interest, this would be it.
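To make "measuring properties of networks" and "generating them" a little more concrete, here is a minimal sketch using only the Python standard library. It is not from the course materials: the choice of an Erdős–Rényi random graph as the generator, the two measures shown (density and local clustering), and all function names are illustrative assumptions.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Generate an undirected random graph on n nodes in which each
    possible edge is included independently with probability p."""
    rng = random.Random(seed)
    adjacency = {node: set() for node in range(n)}
    for a, b in combinations(range(n), 2):
        if rng.random() < p:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def density(adjacency):
    """Fraction of all possible undirected edges that are present."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    # Each edge appears in two adjacency sets, so halve the total.
    edges = sum(len(neighbors) for neighbors in adjacency.values()) / 2
    return edges / (n * (n - 1) / 2)


def local_clustering(adjacency, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
    return links / (k * (k - 1) / 2)


graph = erdos_renyi(n=50, p=0.1, seed=42)
degrees = {node: len(neighbors) for node, neighbors in graph.items()}
print("density:", round(density(graph), 3))
print("max degree:", max(degrees.values()))
print("clustering of node 0:", round(local_clustering(graph, 0), 3))
```

Real social networks typically show much higher clustering than a random graph of the same density, and comparing a measured network against a generated one in this way is one route to the kind of intuition described above.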

Neural Networks for Machine Learning (Fall 2012)
Estimated Effort: 10-30 Hours a Week
Taught by Geoffrey Hinton of the University of Toronto, who is a pioneer and one of the most respected people in his field. Note that going in you’ll be expected to have a strong working knowledge of Calculus, which is not a prerequisite for any of the other classes listed here. Given its instructor, I had hoped that this class would be as worthwhile as the Probabilistic Graphical Models course, but sadly it was not. Regrettably, I can only say that this class was poorly put together. It has a meager four programming assignments, the first two of which follow a simple multiple choice coding formula, and the last two of which are unexpectedly much more difficult, requiring you to both derive your own equations and implement the result of that derivation. It was extremely hard to predict how much time I would need to spend on any given assignment.  Having already learned to use Perceptrons and simple Backpropagation in Andrew Ng’s class, the only new hands-on skill I gained was implementing Restricted Boltzmann Machines. To be fair, I did acquire quite a bit of knowledge about the Neural Networks landscape, and Restricted Boltzmann Machines are a core component of Deep Belief Networks.  However, looking back at the sheer quantity of skills and knowledge I gained in the other classes listed here, I can’t help but feel this class could have been much better.



  1. I think you’re too hard on NNs, at least regarding their usefulness. But, I have an application in mind. The TA-ing was spotty as you said, but I think there was only one TA whereas Ng had a team. I have never heard NNs explained so well, but I do agree it was of uneven difficulty.

    • It was really that huge spike in PA3 that made me frustrated. I wouldn’t have minded if they were all that difficult and the expectation of that was set ahead of time. By the time I started that assignment I was already behind after being displaced for a week by Hurricane Sandy, which certainly didn’t help matters.

      I was also very disappointed by how little we went, programming-wise, beyond what we learned in Andrew’s course. From the syllabus I was expecting to come out equipped to play with Deep Belief Nets (the new neural networks hotness). Sure, we were exposed to the ideas, but the devil is always in the details with these things.

  2. Thanks for the write-up, very interesting. I’ve taken two Coursera courses in the past (databases, and algorithms), and I really enjoyed them.

    I’ve been thinking of continuing with machine learning, so your summary is really useful.

  3. Helpful review, thanks. I’ve signed up, or considered signing up, for most of the courses you listed. I’m still working through the assignments for Andrew Ng’s class, but had discounted the Social Network Analysis course. I’ll sign up for it now if it’s likely to be worthwhile.

    Good luck!

  4. Hi Rick,

    I will be TAing for a Coursera course next semester. Any advice you could offer?

    • Man, I could go on for pages and pages. Maybe a blog post?

      I’ll keep it short for now though. Mostly, just keep in mind that most of the people taking your class will be full time employed, maybe even with kids :). That’s not to say we can’t sink in 30 hours a week, but we’re going to need to plan ahead with our time investment. It’s all about setting expectations and keeping a difficulty gradient that makes time investment approximately consistent between assignments while building on the knowledge already gained.

      My favorite format has one big topic per week, with several small related topics, which all come together in a programming assignment. This helps me out quite a lot as I, like many people, learn best hands on.

  5. Prof Andrew Ng put together a cohesive set of classes. The in-video quizzes all seemed on-topic and related to the most recently covered ideas. Prof Geoffrey Hinton’s assemblage of in-video quizzes felt more like surprise questions with little relevance to topics being addressed at that moment in the video, thus obfuscating a real narrative. Neural Networks was delivered in a dry manner, but eagerness to learn about the subject cuts straight through that.

  6. thanks for sharing. i will begin on the top of your list.

  7. Hi Rick, thanks for the helpful review. Every time I started a course, I gave up due to schedule conflicts with my real world stuff. Can you write more about how you balanced yours with the commitment to these courses? Also, do you think these courses are helpful in finding better software engineering/data-science jobs, or rather in improving your skills for the sake of launching some product in the future? Thanks in advance.

    • For getting it done, it does take some sacrifice. I’m lucky to have a job already where the skills I learn directly benefit my employer and so they’re pretty lenient with letting me get some of my school work done on work hours. It’s pretty easy to find 20 hours of outside work time a week if you’re willing to sacrifice your video games, television and social events.

      As for jobs? Start practicing what you learn publicly and earn a reputation. Then they’ll come looking for you. It always works.

  8. Richard, thanks for this post! I’ve lately been interested in Machine Learning and have thought about diving into some Coursera courses. This post is very helpful. Thanks for investing the time to share it.

  9. It’s funny how people get different things out of courses. I did all of these, except the scala one, and actually got the most out of Hinton’s. However it wasn’t so much about neural networks but his way of describing a number of random bits and pieces that really stuck with me. I didn’t do _any_ of the course work, I have nowhere near the time. Overall the quality of these courses is amazing, 5 years ago even we had nothing like this! Fun times!

    • Alastair Aitchison

      Actually, we *have* had something quite a lot like this for a while – 4 1/2 years ago, Andrew Ng uploaded a complete set of 20 lectures from his Stanford Machine Learning class to YouTube, with handouts and lecture notes available from the Stanford site. You can still access them now:

      The syllabus is very similar to that covered in the Coursera course.

      And there were others before that – there’s some great courses available on the Open Yale site, for example, which was set up in 2007 –

      Don’t get me wrong, I love Coursera (and Udacity/edX etc.) but, like many other success stories from history, they’re not really doing anything new – they just happen to be doing it when the time is right.

      • I think you’re missing the importance of graded assignments and the forum. Those additions make a huge difference. I had been using material from MIT OCW and iTunes U before these MOOC sites came along, and let me tell you, (at least for me) the difference has been night and day.

  10. Have you noticed you never spelt Coursera correctly?

  11. thanks for this

  12. Nice post, now I need to take the Scala and Probability courses.
    Thanks for the post.

  13. Between the Neural Networks class and the PGM class I have to say it’s the first that engaged me much more. It gave a lot more ‘Ohh!’ epiphany moments and I feel like I gained a lot more insight into machine learning as a whole.

  14. Richard, thanks for this survey. Most people in the comments defend Hinton’s course, while I tend to agree with you (probably because I had similar intentions, namely to learn deep learning deeply :).

    I took 3 of those 5 classes and would rank PGM first, since it was time- and math-intensive, yet wonderfully planned (though the topic is close to my interests, so I may be biased). Then comes Scala (as you already mentioned, they could take more time to cover more FP concepts, and also give more involved programming assignments), and NNs are last in the list (both lectures and quizzes were often vague, stricter derivations would help; I needed to use external resources). I wonder why you spent as much time on NNs as on PGM. I typically spent 5-7 hrs/week (it was similar to yours for the remaining two courses). Unlike your experience, the most time-consuming assignment for me was the 2nd, where they asked us to try about 10 different sets of parameters, and training took about an hour on my machine, so I needed to run it overnight, which was okay, but one can eventually miss the deadline like that. :)

  15. Great useful summary. I did some research and experimenting with neural nets and genetic algorithms in the 90s up until about 2001. I haven’t kept up since, but I suspect I would still conclude that these techniques are useful (at least in areas related to search and optimization) when for whatever reason you cannot find the correct dedicated algorithm for your problem, whether for lack of proper definition of the problem, inadequate literature search, no one knows the algorithm, or the algorithm will only ever exist in the Platonic World. I wasn’t doing machine vision, so in areas like this my hypothesis could well be entirely off-base.

    PGM, on the other hand, I think has a lot of understandable theory behind it and can be employed for many practical problems. I have Koller’s book, which I have barely cracked, but I believe the level of effort her class takes (I’ve viewed the first few lectures, very impressive). I think it would take me personally 10 weeks of full-time effort. Sure would like to be able to set aside that kind of time commitment.

  16. Very nicely summarized. I also joined some of the above classes but couldn’t keep up the pace because of other studies. Can you give me any tips? And I know this is not the place to ask this question, but do you have any idea when the Probabilistic Graphical Models class is going to be offered again?

  17. Thanks a lot for your insights into the Machine Learning class. I read your post while I was considering taking the class, and it was a very good decision to do so. Now I am sharing my experiences of the class on my blog:

    Feel free to comment.

  18. Hi Rick,

    I am taking Neural Networks for Machine Learning on Coursera and have completed 19 of the 20 quizzes, and am now struggling to complete the week 13 programming assignment on Restricted Boltzmann Machines. Please help, the deadline is very near.
