Intermediate AI & ML Lesson 4 of 6

Lesson 4: ML.NET Framework

ML.NET is Microsoft's open-source, cross-platform machine learning framework designed specifically for .NET developers. This lesson covers the fundamentals of ML.NET, including how to set up projects, create data pipelines, train models, and make predictions, all with a clean, familiar API.


What is ML.NET?

ML.NET is a free, production-ready machine learning framework that integrates seamlessly with C# and .NET applications. Unlike Python-heavy frameworks, ML.NET is built for .NET developers and allows you to:

  • Train models directly in C# (no Python required)
  • Use static typing for type safety and compile-time error checking
  • Deploy models within your .NET applications
  • Build end-to-end ML pipelines with a fluent API
  • Use popular algorithms without external dependencies

Setting Up ML.NET

To start using ML.NET, install the NuGet package in your .NET project:

# Install ML.NET with the .NET CLI (run in the project folder)
dotnet add package Microsoft.ML
dotnet add package Microsoft.ML.Vision   # Optional: image classification tasks
# Basic text featurization (FeaturizeText) ships in Microsoft.ML itself,
# so no extra package is needed for the text example in this lesson.

// Basic imports (C#)
using Microsoft.ML;
using Microsoft.ML.Data;

The MLContext is the central object in ML.NET. It is the starting point for all ML operations and exposes catalogs (such as Data, Transforms, Model, BinaryClassification, and Regression) with methods for:

  • Loading and transforming data
  • Training models with different algorithms
  • Evaluating model performance
  • Making predictions
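
As a rough illustration of the list above, the sketch below creates an MLContext and touches the catalog that corresponds to each responsibility. The variable names are illustrative only, and the fixed seed is an assumption made so that randomized steps (such as data splits) repeat between runs.

```csharp
using System;
using Microsoft.ML;

// A fixed seed is optional; it makes randomized operations repeatable.
var mlContext = new MLContext(seed: 0);

// Each property is a catalog that groups related operations.
var dataOps = mlContext.Data;                     // loading, splitting, filtering data
var transforms = mlContext.Transforms;            // data transformations
var binaryTask = mlContext.BinaryClassification;  // trainers and Evaluate for yes/no tasks
var regressionTask = mlContext.Regression;        // trainers and Evaluate for numeric targets
var modelOps = mlContext.Model;                   // save, load, prediction engines

// Print the catalog type names to see how the API is organized.
Console.WriteLine(dataOps.GetType().Name);
Console.WriteLine(transforms.GetType().Name);
Console.WriteLine(binaryTask.GetType().Name);
Console.WriteLine(regressionTask.GetType().Name);
Console.WriteLine(modelOps.GetType().Name);
```

Because every operation hangs off one context object, a single MLContext is typically created once and reused throughout the application.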

The ML.NET Pipeline Architecture

ML.NET uses a pipeline approach where you chain data transformations and model trainers together. A typical pipeline consists of:

ML.NET Pipeline Flow
1. Data Input
↓
2. Data Transformations (Normalization, Encoding, etc.)
↓
3. Feature Engineering
↓
4. Model Trainer (Algorithm Selection)
↓
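5. Trained Model

// Complete ML.NET binary classification pipeline
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

// 1. Create MLContext
var mlContext = new MLContext();

// 2. Load data
// trainingDataList is assumed to be an IEnumerable<InputData> that you have already populated
var data = mlContext.Data.LoadFromEnumerable(trainingDataList);

// 3. Create pipeline: text -> numeric features -> normalization -> trainer
var pipeline = mlContext.Transforms.Text.FeaturizeText("Features", nameof(InputData.Text))
    .Append(mlContext.Transforms.NormalizeMinMax("Features"))
    .Append(mlContext.BinaryClassification.Trainers
        .SdcaLogisticRegression(labelColumnName: "Label", featureColumnName: "Features"));

// 4. Train model
var model = pipeline.Fit(data);

// 5. Use model for predictions (the type arguments are the input and output classes)
var engine = mlContext.Model.CreatePredictionEngine<InputData, OutputData>(model);
var result = engine.Predict(new InputData { Text = "Sample text" });

// Input schema: binary classification trainers expect a bool label column
public class InputData
{
    public string Text { get; set; }
    public bool Label { get; set; }
}

// Output schema (illustrative class name): columns produced by the trained model
public class OutputData
{
    [ColumnName("PredictedLabel")]
    public bool Prediction { get; set; }
    public float Probability { get; set; }
}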

Built-in ML.NET Algorithms

ML.NET provides ready-to-use trainers for common machine learning tasks:

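Binary Classification

Yes/No predictions. Trainers include logistic regression (SdcaLogisticRegression, LbfgsLogisticRegression), linear SVM (LinearSvm), averaged perceptron (AveragedPerceptron), and boosted decision trees (FastTree, from the Microsoft.ML.FastTree package).

Multiclass Classification

Multiple category predictions. Trainers include maximum entropy models (SdcaMaximumEntropy, LbfgsMaximumEntropy), NaiveBayes, gradient-boosted trees (LightGbm, from the Microsoft.ML.LightGbm package), and OneVersusAll, which wraps any binary trainer.

Regression

Continuous value prediction. Trainers include linear regression (Sdca, and Ols from the Microsoft.ML.Mkl.Components package), Poisson regression (LbfgsPoissonRegression), and decision tree ensembles (FastTree, FastForest).

Clustering

Group similar data without labels. Trainer: KMeans.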
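
To make the clustering task concrete, here is a minimal sketch using the KMeans trainer. The Point and ClusterPrediction classes and the six toy points are hypothetical and were made up for this illustration; only the Microsoft.ML package is assumed.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 0);

// Hypothetical toy data: two well-separated groups of 2-D points.
var points = new[]
{
    new Point { Features = new[] { 1.0f, 1.1f } },
    new Point { Features = new[] { 0.9f, 1.0f } },
    new Point { Features = new[] { 1.2f, 0.8f } },
    new Point { Features = new[] { 9.0f, 9.1f } },
    new Point { Features = new[] { 8.8f, 9.3f } },
    new Point { Features = new[] { 9.2f, 8.9f } },
};
var data = mlContext.Data.LoadFromEnumerable(points);

// K-Means needs no label column, only a fixed-size feature vector.
var trainer = mlContext.Clustering.Trainers.KMeans(
    featureColumnName: "Features", numberOfClusters: 2);
var model = trainer.Fit(data);

var engine = mlContext.Model.CreatePredictionEngine<Point, ClusterPrediction>(model);
var a = engine.Predict(new Point { Features = new[] { 1.0f, 1.0f } });
var b = engine.Predict(new Point { Features = new[] { 9.0f, 9.0f } });
Console.WriteLine($"Cluster for (1,1): {a.ClusterId}; cluster for (9,9): {b.ClusterId}");

// Input row: VectorType declares the fixed vector length the trainer requires.
public class Point
{
    [VectorType(2)]
    public float[] Features { get; set; }
}

// Output row: PredictedLabel is the assigned cluster, Score holds distances to each centroid.
public class ClusterPrediction
{
    [ColumnName("PredictedLabel")]
    public uint ClusterId { get; set; }

    [ColumnName("Score")]
    public float[] Distances { get; set; }
}
```

The cluster numbers themselves are arbitrary; what matters is that points from the two groups receive different IDs.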

Data Types and Schemas

ML.NET maps the properties of plain C# classes to the columns of its data view. Define your input and output structures as strongly-typed classes:

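// LoadColumn and ColumnName are defined in Microsoft.ML.Data
using Microsoft.ML.Data;

// Define input data schema (LoadColumn is the zero-based column index in the source file)
public class HousingData
{
    [LoadColumn(0)]
    public float Price { get; set; }   // The value to predict (the label)

    [LoadColumn(1)]
    public float Size { get; set; }

    [LoadColumn(2)]
    public float Bedrooms { get; set; }
}

// Define prediction output (regression trainers write their result to the "Score" column)
public class HousingPrediction
{
    [ColumnName("Score")]
    public float PredictedPrice { get; set; }
}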
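
One way to see how a class turns into a schema is to load a few objects and list the resulting columns. The sketch below repeats the HousingData class so that it is self-contained; the two rows are invented values used purely for illustration.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext();

// Hypothetical in-memory rows; the numbers are made up.
var rows = new[]
{
    new HousingData { Price = 250000f, Size = 120f, Bedrooms = 3f },
    new HousingData { Price = 410000f, Size = 200f, Bedrooms = 4f },
};

IDataView data = mlContext.Data.LoadFromEnumerable(rows);

// Each public property becomes a typed column in the IDataView schema.
foreach (var column in data.Schema)
    Console.WriteLine($"{column.Name}: {column.Type}");

// Rows can be read back as strongly-typed objects.
foreach (var row in mlContext.Data.CreateEnumerable<HousingData>(data, reuseRowObject: false))
    Console.WriteLine($"Size={row.Size}, Bedrooms={row.Bedrooms}, Price={row.Price}");

// Same shape as the lesson's input class, repeated for self-containment.
public class HousingData
{
    [LoadColumn(0)]
    public float Price { get; set; }

    [LoadColumn(1)]
    public float Size { get; set; }

    [LoadColumn(2)]
    public float Bedrooms { get; set; }
}
```

LoadColumn only matters when reading from a file; for in-memory data the column names come from the property names.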

Training, Evaluation, and Prediction

The typical ML.NET workflow:

  1. Load Data: Read from CSV, database, or enumerable
  2. Split Data: Divide into training and test sets
  3. Build Pipeline: Chain transformations and trainers
  4. Train Model: Fit the pipeline to training data
  5. Evaluate: Measure performance on test data
  6. Save/Load: Persist models for later use
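  7. Predict: Use the model on new data

// Complete workflow example (uses the HousingData and HousingPrediction classes defined above)
using System;
using Microsoft.ML;

var mlContext = new MLContext();

// Load and split (LoadFromTextFile defaults to tab-separated input, so set the separator for CSV)
var data = mlContext.Data.LoadFromTextFile<HousingData>(
    "data.csv", hasHeader: true, separatorChar: ',');
var splitData = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

// Build and train: combine the input columns into "Features", normalize, then fit a regressor
var pipeline = mlContext.Transforms.Concatenate("Features",
        nameof(HousingData.Size), nameof(HousingData.Bedrooms))
    .Append(mlContext.Transforms.NormalizeMinMax("Features"))
    .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: nameof(HousingData.Price)));
var model = pipeline.Fit(splitData.TrainSet);

// Evaluate (the label column must be named because it is not the default "Label")
var predictions = model.Transform(splitData.TestSet);
var metrics = mlContext.Regression.Evaluate(predictions, labelColumnName: nameof(HousingData.Price));
Console.WriteLine($"R-squared: {metrics.RSquared:F4}");

// Save model
mlContext.Model.Save(model, data.Schema, "model.zip");

// Load and predict
ITransformer loadedModel = mlContext.Model.Load("model.zip", out var schema);

var engine = mlContext.Model.CreatePredictionEngine<HousingData, HousingPrediction>(loadedModel);
var prediction = engine.Predict(new HousingData { Size = 200, Bedrooms = 3 });
Console.WriteLine($"Predicted price: {prediction.PredictedPrice:F2}");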
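
If no data.csv is at hand, the same steps can be tried on in-memory data. The sketch below is one possible variation, not the lesson's own code: the synthetic rows follow an assumed rule (price = 2 * size + 10 * bedrooms), the temp-file name is arbitrary, and the classes are repeated so the block stands alone. It also checks that a saved and reloaded model gives the same prediction as the original.

```csharp
using System;
using System.IO;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 0);

// Synthetic rows (assumption): price = 2 * size + 10 * bedrooms.
var rows = Enumerable.Range(0, 200).Select(i => new HousingData
{
    Size = 50f + i,
    Bedrooms = 1f + i % 4,
    Price = 2f * (50f + i) + 10f * (1f + i % 4),
}).ToList();

var data = mlContext.Data.LoadFromEnumerable(rows);
var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

var pipeline = mlContext.Transforms.Concatenate("Features",
        nameof(HousingData.Size), nameof(HousingData.Bedrooms))
    .Append(mlContext.Transforms.NormalizeMinMax("Features"))
    .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: nameof(HousingData.Price)));
var model = pipeline.Fit(split.TrainSet);

var metrics = mlContext.Regression.Evaluate(
    model.Transform(split.TestSet), labelColumnName: nameof(HousingData.Price));
Console.WriteLine($"R-squared on held-out rows: {metrics.RSquared:F4}");

// Save, reload, and compare predictions from both copies of the model.
var path = Path.Combine(Path.GetTempPath(), "housing-model.zip");
mlContext.Model.Save(model, data.Schema, path);
ITransformer loaded = mlContext.Model.Load(path, out DataViewSchema inputSchema);

var sample = new HousingData { Size = 200f, Bedrooms = 3f };
var before = mlContext.Model
    .CreatePredictionEngine<HousingData, HousingPrediction>(model).Predict(sample);
var after = mlContext.Model
    .CreatePredictionEngine<HousingData, HousingPrediction>(loaded).Predict(sample);
Console.WriteLine($"Original: {before.PredictedPrice:F2}, reloaded: {after.PredictedPrice:F2}");

public class HousingData
{
    public float Price { get; set; }
    public float Size { get; set; }
    public float Bedrooms { get; set; }
}

public class HousingPrediction
{
    [ColumnName("Score")]
    public float PredictedPrice { get; set; }
}
```

The R-squared value depends on the trainer's convergence, so treat it as something to inspect rather than a fixed number. Note also that PredictionEngine is not thread-safe; it suits single predictions like this one rather than shared use across threads.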

🧠 Quick Check: Lesson 4

What is the primary purpose of MLContext in ML.NET?

In the ML.NET pipeline, what order should steps follow?

Lesson Summary

✅ ML.NET is a free, .NET-native machine learning framework that integrates seamlessly with C# applications.

✅ MLContext is the central object that orchestrates all ML operations including data loading, training, and evaluation.

✅ Pipelines chain together data transformations and model trainers for a complete ML workflow.

✅ ML.NET provides built-in trainers for classification, regression, clustering, and anomaly detection.

✅ Use strongly-typed classes to define data schemas for type safety and compile-time error checking.

Up Next

Lesson 5: Data Preprocessing & Features
