ML system design interview

- January 02, 2021

This was originally posted on https://ideafair.bearblog.dev/ml-system-design/ on 1 Jan 2021

This post describes how a typical Machine Learning (ML) system design interview is conducted. It is quite similar to a software engineering system design interview, but there are several differences, which I will cover here. The candidate is given a problem and asked how they would design a solution for it. Some example problems are: a plagiarism detector for a class, a YouTube video recommender, a grammar correction service, etc. The interviewer then expects the candidate to lead most of the discussion, while occasionally bringing in what-ifs, asking clarifying questions, digging into the details, etc.

My approach to solving an ML design problem is to divide it into the following sections.

Sections of interview

Problem clarification

It is good to ask as many questions as possible and to state all of your assumptions clearly. Skipping either of these and jumping right into solving the wrong problem is a clear fail.

Data

Finding the right data for your ML problem is crucial. Understand how much data is needed, what the quality of the data is, how to obtain it, whether to go supervised or unsupervised, whether to label the data yourself or use existing private/public data, etc. It is worth spending time understanding what data already exists, analyzing it, and figuring out how to clean and process it.
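As a rough illustration of what "analyzing and cleaning" can mean for a small text dataset, here is a minimal sketch in Python. The `(text, label)` row format, the `audit` function and the normalization rule (lowercase, collapse whitespace) are my own assumptions for the example, not a standard recipe.

```python
from collections import Counter

def audit(rows):
    """Audit a list of (text, label) rows; label may be None if unlabeled.

    Returns a small report plus the de-duplicated rows. All names and
    rules here are illustrative assumptions.
    """
    seen = set()
    deduped = []
    for text, label in rows:
        # Normalize so trivially different copies count as duplicates.
        key = " ".join(text.lower().split())
        if not key or key in seen:
            continue  # drop empty rows and repeats (first label wins)
        seen.add(key)
        deduped.append((key, label))

    label_counts = Counter(label for _, label in deduped if label is not None)
    report = {
        "total": len(rows),
        "unique": len(deduped),
        "unlabeled": sum(1 for _, label in deduped if label is None),
        "label_counts": dict(label_counts),
    }
    return report, deduped

rows = [
    ("Free money!!", "spam"),
    ("free   money!!", "spam"),   # duplicate after normalization
    ("See you at noon", "ham"),
    ("Quarterly report", None),   # still needs a label
    ("", "ham"),                  # empty text
]
report, clean_rows = audit(rows)
print(report)
```

A report like this quickly tells you whether you have enough unique examples, how skewed the classes are, and how much labeling work is left.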

Metrics

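Good data scientists know how to arrive at a proper set of metrics given a problem scenario and the target quality. e.g. if you are designing a spam classifier and you want to ensure no important mails are lost, then favor precision over recall for the spam class: a false positive sends a legitimate mail to the spam folder, while a false negative only lets one spam mail through. Understand statistical significance and the lower and upper limits (e.g. confidence bounds) of your metrics, etc.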
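To make the precision/recall tradeoff and the "lower limit, upper limit" point concrete, here is a small sketch. It computes precision and recall for the spam class and puts a Wilson score interval around precision. The choice of the Wilson interval and the toy labels are my assumptions; other interval methods (e.g. bootstrapping) work too.

```python
import math

def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall, tp, fp, fn) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, tp, fp, fn

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if trials == 0:
        return 0.0, 1.0  # no evidence at all
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Toy labels: 1 = spam, 0 = not spam.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
precision, recall, tp, fp, fn = precision_recall(y_true, y_pred)
low, high = wilson_interval(tp, tp + fp)
print(f"precision={precision:.2f} (95% CI {low:.2f}-{high:.2f}), recall={recall:.2f}")
```

With only a handful of predictions the interval is very wide, which is a useful reminder that a single precision number on a small test set says little.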

Model

List the types of models that can be used to solve the problem, what their tradeoffs are, whether to use a pre-trained model or train from scratch, what features to use, etc. Designing the features requires understanding both the model and the data. Don't always reach for the latest fancy model; some older models might be better suited to specific scenarios. e.g. if we are designing spam filters for low-end mobiles, then logistic regression with word n-grams may be better than BERT. Thus model choice also depends on the deployment platform.
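To show how small the "logistic regression with word n-grams" option can be, here is a toy sketch in plain Python with no ML library. The training data, hyperparameters (`epochs`, `lr`) and function names are made up for illustration; a real system would use a proper library, regularization and far more data.

```python
import math
from collections import defaultdict

def word_ngrams(text, n_max=2):
    """Return word unigrams and bigrams (up to n_max) as strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(examples, epochs=50, lr=0.5):
    """Plain SGD on log loss with binary n-gram presence features."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for text, label in examples:
            feats = set(word_ngrams(text))
            err = sigmoid(bias + sum(weights[f] for f in feats)) - label
            bias -= lr * err
            for f in feats:
                weights[f] -= lr * err
    return dict(weights), bias

def predict_proba(text, weights, bias):
    """Probability that text is spam; unseen n-grams contribute nothing."""
    feats = set(word_ngrams(text))
    return sigmoid(bias + sum(weights.get(f, 0.0) for f in feats))

# Toy data: label 1 = spam, 0 = not spam.
examples = [
    ("win free money now", 1),
    ("free prize claim now", 1),
    ("cheap loans win big", 1),
    ("meeting at noon tomorrow", 0),
    ("lunch with the team", 0),
    ("project report attached", 0),
]
weights, bias = train(examples)
print(len(weights), "parameters")
print(predict_proba("claim your free prize", weights, bias))
print(predict_proba("team meeting tomorrow", weights, bias))
```

The whole model is a dictionary of a few dozen floats, which is the kind of footprint argument you can make for a low-end device.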

Deployment

Once training is done, you have to figure out how to use the model in production for inference. You have to estimate the input queries per second and understand whether compute, memory or network will be the bottleneck. In an ML system, the bottleneck for training can be different from the one for inference.
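A back-of-the-envelope version of this estimate can be sketched in a few lines. Everything here is an assumption for illustration: the numbers, the `InferenceLoad` and `Machine` names, and the simplified model in which each worker holds its own copy of the model and there is one worker per core.

```python
import math
from dataclasses import dataclass

@dataclass
class InferenceLoad:
    peak_qps: float              # expected peak queries per second
    compute_ms_per_query: float  # compute time per query on one worker
    payload_kb_per_query: float  # request + response size
    model_mb: float              # memory needed per model copy

@dataclass
class Machine:
    cores: int = 8
    ram_mb: float = 16_000
    nic_mbps: float = 1_000      # network card speed in megabits/s

def per_machine_limits(load, machine):
    """Max QPS one machine can sustain under each resource alone."""
    qps_per_worker = 1000.0 / load.compute_ms_per_query
    workers_fit_in_ram = int(machine.ram_mb // load.model_mb)
    kb_per_second = machine.nic_mbps * 125.0  # 1 Mbit/s = 125 kB/s
    return {
        "compute": machine.cores * qps_per_worker,
        "memory": workers_fit_in_ram * qps_per_worker,
        "network": kb_per_second / load.payload_kb_per_query,
    }

def plan(load, machine):
    """Return (bottleneck resource, machines needed at peak)."""
    limits = per_machine_limits(load, machine)
    bottleneck = min(limits, key=limits.get)
    if limits[bottleneck] <= 0:
        raise ValueError("model does not fit on this machine")
    return bottleneck, math.ceil(load.peak_qps / limits[bottleneck])

small_model = InferenceLoad(peak_qps=2000, compute_ms_per_query=20,
                            payload_kb_per_query=3, model_mb=500)
large_model = InferenceLoad(peak_qps=2000, compute_ms_per_query=20,
                            payload_kb_per_query=3, model_mb=6000)
print(plan(small_model, Machine()))
print(plan(large_model, Machine()))
```

With these made-up numbers the small model is compute bound, while the large model is memory bound because only two copies fit in RAM. In an interview, walking through this kind of arithmetic out loud matters more than the exact figures.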

Iterations, Safety & Security

Discuss the potential risks, biases and vulnerabilities the model may have and what the safeguards against them would be. e.g. if you are deploying a text generation model, you may need a sensitive text classifier in front of it to prevent the model from running on inappropriate inputs. Plan what can be improved in future versions of the model.
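The "classifier in front of the generator" idea can be sketched as a simple gate. The keyword scorer below is only a stand-in for a real sensitive text classifier, and `BLOCKLIST`, `REFUSAL`, `guarded_generate` and the echo generator are all hypothetical names invented for this example.

```python
from typing import Callable

# Hypothetical stand-ins: a real system would use a trained classifier.
BLOCKLIST = {"password", "ssn"}
REFUSAL = "Sorry, I can't help with that request."

def keyword_sensitivity_score(text: str) -> float:
    """Fraction of tokens that match the blocklist (0.0 if empty)."""
    tokens = [t.strip(".,!?") for t in text.lower().split()]
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in BLOCKLIST) / len(tokens)

def guarded_generate(prompt: str,
                     generate: Callable[[str], str],
                     score: Callable[[str], float] = keyword_sensitivity_score,
                     threshold: float = 0.0) -> str:
    """Run the generator only if the prompt passes the sensitivity check."""
    if score(prompt) > threshold:
        return REFUSAL  # the generator is never called on flagged input
    return generate(prompt)

def echo_model(prompt: str) -> str:
    # Placeholder for the actual text generation model.
    return "echo: " + prompt

print(guarded_generate("tell me a joke", echo_model))
print(guarded_generate("what is my password?", echo_model))
```

Keeping the gate separate from the generator means the classifier and its threshold can be updated in later iterations without retraining the generation model.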

In each section the interviewer may ask you to go deeper. Make sure to list the different options and their tradeoffs.

Preparation

The best way to prepare is to work on an ML project end to end, just as you would in a real company, e.g. create a website that has a translation service. If you are already in a workplace that uses ML, try to understand all of its components. For fun, spend some time thinking about how you would design popular ML systems like the Netflix recommender system, a self-driving car, etc. Being curious is the best way to prepare.

Abhishek Rao's blog