Lesson 4: ML.NET Framework

ML.NET is Microsoft's open-source, cross-platform machine learning framework for .NET developers. This lesson covers the fundamentals of ML.NET: setting up a project, building data pipelines, training models, and making predictions, all from C# code.

What is ML.NET?

ML.NET is a free, production-ready machine learning framework that integrates seamlessly with C# and .NET applications. Unlike Python-heavy frameworks, ML.NET is built for .NET developers and allows you to:

Train models directly in C# (no Python required)
Use static typing for type safety and compile-time error checking
Deploy models within your .NET applications
Build end-to-end ML pipelines with a fluent API
Use popular algorithms from the core Microsoft.ML package (some trainers, such as FastTree and LightGBM, ship as separate NuGet packages)

Setting Up ML.NET

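To start using ML.NET, add the NuGet packages to your .NET project from the command line:

# Install ML.NET with the .NET CLI
dotnet add package Microsoft.ML
dotnet add package Microsoft.ML.Vision       # For image classification tasks
dotnet add package Microsoft.ML.TorchSharp   # For deep-learning NLP tasks such as text classification

Basic text featurization (for example FeaturizeText) is already part of the core Microsoft.ML package, so simple text pipelines need no extra package.

// Basic import (C#)
using Microsoft.ML;
using Microsoft.ML.Data;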

The MLContext is the central object in ML.NET. It orchestrates all ML operations and contains methods for:

Loading and transforming data
Training models with different algorithms
Evaluating model performance
Making predictions
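
As a rough illustration of the list above, the sketch below creates an MLContext and touches the catalog properties that group these operations. The local variable names are only for illustration, and the fixed seed is an optional choice that makes splits and training repeatable.

```csharp
using Microsoft.ML;

// A fixed seed makes data splits and training repeatable between runs.
var mlContext = new MLContext(seed: 0);

// Each group of operations hangs off a catalog property on the context.
var dataOps = mlContext.Data;                                   // loading, splitting, filtering data
var transforms = mlContext.Transforms;                          // normalization, encoding, text featurization
var binaryTrainers = mlContext.BinaryClassification.Trainers;   // trainers for one task type
var regression = mlContext.Regression;                          // regression trainers plus Evaluate(...)
var modelOps = mlContext.Model;                                 // Save, Load, CreatePredictionEngine
```

Because everything is reached through one context object, a single MLContext is normally created once and shared across the loading, training, and prediction steps.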

The ML.NET Pipeline Architecture

ML.NET uses a pipeline approach where you chain data transformations and model trainers together. A typical pipeline consists of:

ML.NET Pipeline Flow

1. Data Input
2. Data Transformations (Normalization, Encoding, etc.)
3. Feature Engineering
4. Model Trainer (Algorithm Selection)
5. Trained Model

// Complete ML.NET Classification Pipeline
using Microsoft.ML;

// 1. Create MLContext
var mlContext = new MLContext();

// 2. Load data (trainingDataList is an IEnumerable<InputData> prepared elsewhere)
var data = mlContext.Data.LoadFromEnumerable(trainingDataList);

// 3. Create pipeline
var pipeline = mlContext.Transforms.Text.FeaturizeText("Features", "Text")
    .Append(mlContext.Transforms.NormalizeMinMax("Features"))
    .Append(mlContext.BinaryClassification.Trainers
        .SdcaLogisticRegression("Label", "Features"));

// 4. Train model
var model = pipeline.Fit(data);

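// 5. Use model for predictions
// CreatePredictionEngine needs the input and output row types as generic arguments.
// InputData and SentimentPrediction are classes you define; the output class name is illustrative.
var engine = mlContext.Model.CreatePredictionEngine<InputData, SentimentPrediction>(model);
var result = engine.Predict(new InputData { Text = "Sample text" });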
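
The pipeline above reads a "Text" column and a "Label" column and predicts from an InputData object, but the lesson does not show those classes. A minimal sketch of what they might look like follows. The Label property and the output class name SentimentPrediction are assumptions made for illustration; the output column names PredictedLabel, Probability, and Score are the ones a binary logistic regression trainer produces.

```csharp
using Microsoft.ML.Data;

// Input row: property names become column names ("Text", "Label").
public class InputData
{
    public string Text { get; set; } = string.Empty;

    // Binary classification trainers expect a boolean label.
    public bool Label { get; set; }
}

// Output row (hypothetical name): maps the columns the trained model emits.
public class SentimentPrediction
{
    [ColumnName("PredictedLabel")]
    public bool IsPositive { get; set; }

    public float Probability { get; set; }

    public float Score { get; set; }
}
```

With classes like these, the prediction engine is created with both types as generic arguments, and the result of Predict exposes IsPositive and Probability as ordinary properties.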

Built-in ML.NET Algorithms

ML.NET provides ready-to-use trainers for common machine learning tasks:

Binary Classification

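Yes/No predictions. Trainers: Logistic Regression (SDCA, L-BFGS), Linear SVM, Averaged Perceptron, and the tree-based FastTree and FastForest.

Multiclass Classification

Multiple category predictions. Trainers: Maximum Entropy (SDCA, L-BFGS), Naive Bayes, LightGBM, and One-Versus-All wrappers around binary trainers.

Regression

Continuous value prediction. Trainers: Linear Regression (SDCA, OLS), Poisson Regression, and the tree-based FastTree and FastForest.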

Clustering

Group similar data. Trainer: K-Means clustering algorithm.
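
To make the task-to-trainer mapping concrete, the sketch below picks one trainer from each task catalog. It only uses trainers that ship in the core Microsoft.ML package; the cluster count of 3 is an arbitrary value chosen for illustration.

```csharp
using Microsoft.ML;

var mlContext = new MLContext();

// Each task type has its own catalog, and trainers live under its Trainers property.
var binary = mlContext.BinaryClassification.Trainers.SdcaLogisticRegression();
var multiclass = mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy();
var regression = mlContext.Regression.Trainers.Sdca();
var clustering = mlContext.Clustering.Trainers.KMeans(numberOfClusters: 3);
```

Each of these is an estimator, so any of them can be appended to a pipeline of transformations and trained with Fit. By default they look for columns named "Label" and "Features".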

Data Types and Schemas

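ML.NET maps the columns of your data to the properties of plain C# classes. Define your input and output as strongly-typed classes: LoadColumn gives the index of a property's column in the source file, and ColumnName maps a property to a pipeline column with a different name.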

// Define input data schema
public class HousingData
{
    [LoadColumn(0)]
    public float Price { get; set; }
    
    [LoadColumn(1)]
    public float Size { get; set; }
    
    [LoadColumn(2)]
    public float Bedrooms { get; set; }
}

// Define prediction output
public class HousingPrediction
{
    [ColumnName("Score")]
    public float PredictedPrice { get; set; }
}
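
One way to see how such a class turns into a schema is to load a few in-memory rows and print the resulting columns. This is a sketch with made-up values; the LoadColumn attributes only matter when reading from a text file, so they have no effect on in-memory data.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext();

// Made-up rows, only to have something to load.
var rows = new[]
{
    new HousingData { Price = 250000f, Size = 120f, Bedrooms = 3f },
    new HousingData { Price = 180000f, Size = 85f, Bedrooms = 2f },
};

// The schema is inferred from the public properties of HousingData.
IDataView data = mlContext.Data.LoadFromEnumerable(rows);

// List each column name with its ML.NET data type.
foreach (var column in data.Schema)
    Console.WriteLine($"{column.Name}: {column.Type}");

public class HousingData
{
    [LoadColumn(0)]
    public float Price { get; set; }

    [LoadColumn(1)]
    public float Size { get; set; }

    [LoadColumn(2)]
    public float Bedrooms { get; set; }
}
```

The printed column names are the strings that transforms and trainers refer to, which is why a typo in a column name surfaces when the pipeline is fitted rather than at compile time.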

Training, Evaluation, and Prediction

The typical ML.NET workflow:

Load Data: Read from CSV, database, or enumerable
Split Data: Divide into training and test sets
Build Pipeline: Chain transformations and trainers
Train Model: Fit the pipeline to training data
Evaluate: Measure performance on test data
Save/Load: Persist models for later use
Predict: Use the model on new data

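// Complete workflow example (uses the HousingData and HousingPrediction classes defined earlier)
using System;
using Microsoft.ML;

var mlContext = new MLContext();

// Load and split (LoadFromTextFile defaults to tab-separated input, so set the separator for CSV)
var data = mlContext.Data.LoadFromTextFile<HousingData>(
    "data.csv", hasHeader: true, separatorChar: ',');
var splitData = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

// Build and train: combine the numeric inputs into "Features", scale them,
// then fit a regressor that predicts the Price column
var pipeline = mlContext.Transforms.Concatenate("Features", "Size", "Bedrooms")
    .Append(mlContext.Transforms.NormalizeMinMax("Features"))
    .Append(mlContext.Regression.Trainers.Sdca(
        labelColumnName: "Price", featureColumnName: "Features"));
var model = pipeline.Fit(splitData.TrainSet);

// Evaluate (the label column is Price, not the default "Label")
var predictions = model.Transform(splitData.TestSet);
var metrics = mlContext.Regression.Evaluate(predictions, labelColumnName: "Price");
Console.WriteLine($"R-squared: {metrics.RSquared:F4}");

// Save model
mlContext.Model.Save(model, data.Schema, "model.zip");

// Load and predict
ITransformer loadedModel = mlContext.Model.Load("model.zip", out var inputSchema);

// The prediction engine needs the input and output types as generic arguments
var engine = mlContext.Model.CreatePredictionEngine<HousingData, HousingPrediction>(loadedModel);
var prediction = engine.Predict(new HousingData { Size = 200, Bedrooms = 3 });
Console.WriteLine($"Predicted price: {prediction.PredictedPrice:F0}");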
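
The prediction engine above scores one row at a time. When many rows need scoring, one option is to call Transform on a whole data set and read the results back as typed objects. The sketch below assumes a tiny made-up data set and trains on it only so that the example is self-contained; the numbers carry no meaning.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 0);

// Made-up rows, used for both training and scoring to keep the sketch short.
var rows = new[]
{
    new HousingData { Price = 150000f, Size = 70f, Bedrooms = 2f },
    new HousingData { Price = 200000f, Size = 95f, Bedrooms = 3f },
    new HousingData { Price = 260000f, Size = 130f, Bedrooms = 3f },
    new HousingData { Price = 320000f, Size = 160f, Bedrooms = 4f },
};
IDataView data = mlContext.Data.LoadFromEnumerable(rows);

var pipeline = mlContext.Transforms.Concatenate("Features", "Size", "Bedrooms")
    .Append(mlContext.Transforms.NormalizeMinMax("Features"))
    .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "Price"));
var model = pipeline.Fit(data);

// Transform scores every row of the IDataView and adds a "Score" column.
IDataView scored = model.Transform(data);

// Read the "Score" column back as HousingPrediction objects.
var results = mlContext.Data.CreateEnumerable<HousingPrediction>(scored, reuseRowObject: false);
foreach (var r in results)
    Console.WriteLine($"Predicted price: {r.PredictedPrice:F0}");

public class HousingData
{
    public float Price { get; set; }
    public float Size { get; set; }
    public float Bedrooms { get; set; }
}

public class HousingPrediction
{
    [ColumnName("Score")]
    public float PredictedPrice { get; set; }
}
```

Batch scoring with Transform avoids creating a prediction engine at all, which can be the simpler choice for offline jobs, while the prediction engine suits single requests.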

🧠 Quick Check — Lesson 4

What is the primary purpose of MLContext in ML.NET?

In the ML.NET pipeline, what order should steps follow?

Lesson Summary

✅ ML.NET is a free, .NET-native machine learning framework that integrates seamlessly with C# applications.

✅ MLContext is the central object that orchestrates all ML operations including data loading, training, and evaluation.

✅ Pipelines chain together data transformations and model trainers for a complete ML workflow.

✅ ML.NET provides built-in trainers for classification, regression, clustering, and anomaly detection.

✅ Use strongly-typed classes to define data schemas for type safety and compile-time error checking.

Up Next

Lesson 5: Data Preprocessing & Features
