Difference between MLE, Data Scientist and Data Engineer

I’m new to the industry and I’m trying to understand the distinctions between MLE, Data Scientist, and Data Engineer roles, but I can’t seem to find a proper answer to this question.

From what I gather, a Data Scientist is expected to model, train models, monitor them post-production, fine-tune, and possibly retrain, though that seems to involve a lot of bureaucratic hoops. They might also handle some production tasks.

Data Engineers, on the other hand, seem to focus on preprocessing, ETL, building data warehouses, writing SQL queries, setting up CI/CD pipelines, and data scraping. While Data Scientists might do some of this, I’m not very comfortable with it. I’m not the best coder but can manage to write pseudocode and work my way out with tools like GPT.

Analysts, I understand, handle insights and exploratory data analysis (EDA).

So, where does that leave Machine Learning Engineers (MLEs)? There seems to be a lot of overlap, but what exactly are their primary responsibilities? I assume it involves MLOps and some aspects of Data Engineering—essentially a bit of everything?

In many companies, these roles aren’t distinctly separate and are often combined into one or two teams.

Shifting focus to Finance, I see roles like Quant Researchers and Quant Analysts, but there’s not much detailed information available. What do these roles entail? The requirements seem similar, but how does one choose their niche in such a complex field?

Because there is no proper answer. It varies from team to team.

I’m an MLE and one of the most frustrating things about it is that the role expectations are so different across companies and teams. For example, a lot of people here seem to expect MLEs to develop ML models. In many MLE positions (not all), they hardly do any model development. They just take what the data scientists hand off to them and scale it to deploy to production. In some teams like mine, MLE is pretty much synonymous with ML Infra engineering and MLOps. For these kinds of roles, you might be better off investing in learning Kubernetes than trying to read Ian Goodfellow’s Deep Learning book.

In other teams, they are expected to do all of that PLUS develop ML models and read ML papers. Personally, I think that’s a bit too much for one role.

Since every company’s definition is different, how might someone tell the specifics of a position’s role? A lot of job descriptions I see are vague and just throw buzzwords around. Is that something you’d ask in the interview process?

In the interview you just have to ask them what you’d be working on in the first 6 months. If it doesn’t sound like the speciality you’re going for, don’t take the job.

Sometimes people are in a data analyst job doing EDA with a data scientist title, and they want to switch to modelling and become an MLE. Sometimes an MLE is a software engineer responsible for MLOps, putting a model into production. Data Engineers are sometimes responsible for dashboards as well. I would avoid using “Data Scientist”. In bigger teams, I find it easier to navigate the roles as data engineer (ingestion, storage, queries, ETL), business analyst (business hypotheses, business metrics), data analyst (EDA, descriptive analysis), modeller (decides on model type and model metrics, trains, validates), and production engineer (takes the model to the environment where it runs, productionizing the model). In bigger organisations with many teams there may be data/model/production architects making infrastructure decisions for several teams.

Here’s a quick breakdown:

  • Data Scientist: Builds, trains, and tunes models; performs data analysis and interpretation.
  • Data Engineer: Focuses on data pipelines, ETL processes, and maintaining data infrastructure.
  • Machine Learning Engineer (MLE): Deploys and optimizes models in production, handles MLOps.

Finance Roles:

  • Quant Researchers: Develop complex models for market predictions.
  • Quant Analysts: Analyze financial data to support investment decisions.

Choosing Your Role:

  • Data Scientist: If you enjoy modeling and analysis.
  • Data Engineer: If you prefer building data systems.
  • MLE: If you’re interested in deploying and maintaining models.