How to Start with Machine Learning | Hacker News


I hope this post helps someone looking to start with ML.

When I started my journey, I thought I should do what this article mentions.

>>First, you should learn the fundamentals:

>> Learn mathematics

My problem is that “learn mathematics” can take years. To me, it would be very frustrating.

So I started down that path… and got completely overwhelmed.

Everything took off, and I went from 2 mph to 60 mph, when I did the fast.ai course. I strongly recommend it. You’ll start seeing results and doing “actionable” work very quickly. If you want, you can then continue with the more fundamental material like mathematics. I also have to give a shout-out to their forums. The people are super nice.


IMO, to reach PhD eligibility level in ML, you just need 2-3 core math courses (as long as each is rigorous).

Linear Algebra, Probability and Statistics. (and some optimization)

The fast.ai course is great, but having the math background really helps bring the whole field together. Without the math, different ideas and models seem to stand completely independent of each other.

As a software engineer who wants to learn ML, the fast.ai course is great. But if you find yourself in a situation where you are scoping out a data science problem, the math background helps immensely in coming up with a solution.


Agree 100%, but for “rigorous” courses in probability and statistics, that should be a prerequisite.


Yeah, that’s ridiculous. Who actually does that and is capable of sticking to it?

To me the best way of learning stuff is diving head first and playing around with it, making projects with it, and then going back and learning more theory when I start to understand why it’s important.


Reading the “buy my books!” blurbs, the “No C++, blah, blah, No C++!, blah, No C++!!” had me laughing.


I have a similar feeling. To learn the basic math (single, multivariable calculus, linear algebra) will probably take you 300 hours of serious study if you don’t already know any of it. That’s one hour a day for a year (if you actually succeed at studying 8 days out of 10, which is pretty good) without implementing anything. If you want to get to something like a beginner nonlinear optimization course you probably need to study proof writing and then advanced calculus first. Add in a serious undergrad probability and stats course and you’re at 2-3 years of study before you’ve actually done anything…

If you’re someone who enjoys math you might do it for its own sake but it’s difficult for me to imagine someone making serious progress in this direction without either a passion for the subject, an already relatively strong background, or being in a university.


>To learn the basic math (single, multivariable calculus, linear algebra) will probably take you 300 hours of serious study if you don’t already know any of it.

Frankly, compared to many other disciplines, that’s pretty light. Single variable calculus, linear algebra and probability were all required in my undergrad CS curriculum. Multivariable calculus was the only extra course. And if you’re in most engineering disciplines, all of this is required.

I’m guessing the difference for CS folks is that they rarely use this stuff in the curriculum, whereas in most engineering programs, you’ll use calculus day and night.[0] So for someone like me (engineering background), it was easy to dive into even years out of school as I’d not forgotten a lot of math.

Now of course, if you want to get deep into ML, there’s a lot more math than that. However, most successful people using ML do not need to know that math.

And compared to other disciplines like control theory, communications theory, etc, the prerequisites for ML are a lot lighter.

[0] Only in school. Almost never in industry.


I agree with the overall sentiment as well. However, 300 hours of study to go from 0 to single & multivariable calculus plus linear algebra doesn’t actually sound that bad. 10 hours per week (2 hours per day post work, make up missed days on the weekend) for 30 weeks?


It depends on how many other responsibilities you have or other activities you engage in, and how much you care about the project, I suppose. To me 10 hours of solid study after work is a lot, but other people might throw out some other stuff I do and be willing to spend it studying. I can’t comment on whether it’s a good idea. (I believe there is very much such a thing as too much work ethic).


For us studying physics, that’s like 30 hours of actually attending class for each course, so 90 total, then some days of study, say 3 x 8h, so that’s more like 114h, not 300…
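For anyone who wants to compare the three estimates in this subthread side by side, here is a quick arithmetic sketch. The variable names are made up for illustration, and reading “3 x 8h” as three 8-hour study days across three courses is an assumption.

```python
# Study-time estimates quoted in the comments above (all in hours).
# Variable names are hypothetical; only the numbers come from the thread.

# 1 hour a day for a year, actually studying 8 days out of 10.
daily_plan = 1 * 365 * 8 / 10

# 10 hours per week for 30 weeks.
weekly_plan = 10 * 30

# 3 courses x 30 hours of class, plus 3 days x 8 hours of study (assumed reading).
physics_plan = 3 * 30 + 3 * 8

print(daily_plan, weekly_plan, physics_plan)
```

The first two plans land at roughly the same 300 hours; the physics estimate is lower mainly because it counts class time plus a few days of study rather than a year of daily self-study.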


The trick with high quality machine learning is that folks need to know programming, math, statistics and the industry they are in. Not everyone can do that.


Hahaha. You thought you would ascend to an engineering-type profession in less than “years”?

Maybe it’s easier to become a surgeon by reading blogposts? Better paid?


Another filler article. Learning machine learning, or at least becoming proficient in it, never works in the order the article describes if you have no background. I suggest following user knob’s suggestion in this thread. You can learn the mathematics while applying machine learning to solve a problem you care about. That is how things stick in memory.


Among dozens of guides on AI and ML (“how to start”), I found Siraj Raval’s YouTube channel really helpful and inspiring.


Siraj Raval’s channel is among the worst resources for someone serious about ML.

His channel pays lip service to ML and gives incorrect intuitions that make concepts seem easy but set you up for major mistakes down the line.

His channel is indeed disliked among most ML grad students, as it seems to be very much quantity over quality.

It is like watching a science channel and thinking you are ready to be an engineer.


I noticed that most criticism indeed concerns “shallow presentation” and a lack of deep (e.g. mathematical) analysis and explanations – and I could agree with that. But by “inspiring” I meant “getting people excited about machine learning” rather than “teaching people how to ML” – and I think he is great in that sense. Maybe “AI coach” would be a more fitting label. His choice of topics has some “clickbait” features, but that’s exactly what makes me subscribe to his channel. One of my favs: [1]

He also has strictly educational projects like “School of AI”, but I’m not familiar with them.

1: https://www.youtube.com/watch?v=mvwBgAqrheo


He is entertaining, but you can’t learn from someone who skips the fundamentals and goes directly into the action. Shallow explanations, as others mentioned.


Even as someone with half an electrical engineering degree (three advanced calc courses, linear systems, difeq and linear algebra) and a full CS degree I still find myself sort of overwhelmed or unsure of where to start with ML. Even with a lot of guides. Sometimes I find that guides are too application focused and don’t seem to be of much value outside of the specific thing the author built and then structured the tutorial around.

Do any other people here with college math / cs experience (granted I wasn’t top of my class) find this stuff overwhelming or daunting?


Webdev here that just started ML a couple of weeks ago. I tried reviewing all the math and concepts and it didn’t help much. What really helped was signing up for Kaggle and working through example problems.

Here’s what I think I understand so far, feel free to correct me if I’m wrong:

* Most ML solutions come from published papers, and even ones that are years old are still effective and relevant. You’re going to be tweaking other people’s designs and that’s fine.

* Effectiveness of ML seems to come from two things: Network design improvements (some combination of brute forcing, guessing, understanding math, and implementing papers) and from better data (if you have access to more data sets, you can do better things).

* Before you start with the ML part, you first need to be able to do “Exploratory Data Analysis”, which really just means learning matplotlib, which means you’ll need to know how to use pandas, which means you’ll need to learn how to use numpy. You want to be able to understand your data, find outliers, graph some examples, etc.

* There are a few different types of ML, but the ones that seem to solve problems are “reinforcement learning”, “supervised learning”, and “unsupervised learning”. Supervised learning is the most common one to get started with, and it seems to work well if you have a big set of “X should produce Y” type of data. Unsupervised learning is similar but you don’t have Y and you’re looking to find groupings of your data instead of matching to specific labels. Reinforcement learning seems most useful for things like game AI, where you have some state and you have various moves you can do to try and increase your score (OpenAI Gym is really fun for this).

* It is actually unreasonably effective. There are “pre-trained models” that offer a starting point for networks that have already seen huge data sets. You can “transfer” that learning and then train on top of it. It’s amazing how little effort is required to start classifying images.

* Keras backed by TensorFlow seems to have the best support all around, and the code is pretty easy to read. TensorFlow is a little like OpenGL where it has its own rules and state, and you should do some examples to see how it works, but it’s very low level. Keras is like any other high level python library and it does almost all of the TF interaction for you.

* If you’re doing it on your machine, use conda, because it manages all the Python stuff for you. If you want the fastest way to get started online without your own GPU, use Kaggle. It has a maximum runtime of about 9 hours but that’s plenty to get started.
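To make the supervised vs. unsupervised point from the list above concrete, here is a minimal sketch in plain Python with no ML library. Everything in it is an assumption for illustration: the toy points, the function names, and the choice of a nearest-centroid classifier and k-means as stand-ins. In practice you would likely reach for Keras or a similar library instead.

```python
import random

# Toy 2-D data, made up for illustration: each item is (X, Y).
labeled = [((1.0, 1.2), "a"), ((0.8, 1.0), "a"), ((1.1, 0.9), "a"),
           ((5.0, 5.1), "b"), ((5.2, 4.9), "b"), ((4.8, 5.0), "b")]

def mean(points):
    """Coordinate-wise mean of a non-empty list of 2-D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def fit_centroids(examples):
    """Supervised: uses the Y labels to compute one centroid per class."""
    by_label = {}
    for x, y in examples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: dist2(centroids[y], x))

def kmeans(points, k, steps=10, seed=0):
    """Unsupervised: never sees labels, only groups points around k centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    for _ in range(steps):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(centers[i], p))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
    assignments = [min(range(k), key=lambda i: dist2(centers[i], p)) for p in points]
    return centers, assignments

# Supervised: learn from (X, Y) pairs, then predict Y for a new X.
centroids = fit_centroids(labeled)
prediction = predict(centroids, (4.5, 4.7))

# Unsupervised: drop the labels and look for groupings only.
points = [x for x, _ in labeled]
centers, assignments = kmeans(points, k=2)

print(prediction, assignments)
```

The supervised half returns a label name because it was given labels to learn from; the unsupervised half only returns group indices, and which group gets index 0 or 1 depends on the random starting centers.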


Another suggestion: get familiar with AutoML tools, as those will likely handle most of the mathematics, tuning, training and deployment within the next 3 years.

The key to ML is:

1) The use case.
2) The data (collecting, preprocessing).
3) The integration with other software.


If all you need to do is deploy AI without training it or developing the algorithm/architecture, then yes, fast.ai is fine. The problem occurs when people conflate the knowledge needed for deployment with the knowledge needed for development.


Download Anaconda, get a dataset that can help you solve some problem and figure out with Google what you should do if error messages appear. ML is easy.


“get a dataset that can help you solve some problem”.

Oftentimes designing and creating the dataset is as difficult as coming up with an experiment in the natural sciences. If your problem is solved by a toy dataset on the internet, then it’s not really a problem, and it’s only machine learning in the same way programming bubble sort is software engineering.