Lesson 4: ML.NET Framework
ML.NET is Microsoft's open-source, cross-platform machine learning framework designed specifically for .NET developers. This lesson covers the fundamentals of ML.NET, including how to set up projects, create data pipelines, train models, and make predictions, all with a clean, familiar API.
What is ML.NET?
ML.NET is a free, production-ready machine learning framework that integrates seamlessly with C# and .NET applications. Unlike Python-heavy frameworks, ML.NET is built for .NET developers and allows you to:
- Train models directly in C# (no Python required)
- Use static typing for type safety and compile-time error checking
- Deploy models within your .NET applications
- Build end-to-end ML pipelines with a fluent API
- Use popular algorithms without external dependencies
Setting Up ML.NET
To start using ML.NET, install the NuGet package in your .NET project:
# Install ML.NET with the .NET CLI (run these in the project directory)
dotnet add package Microsoft.ML
dotnet add package Microsoft.ML.Vision   # Optional: image classification tasks
# Text featurization (FeaturizeText) ships in the core Microsoft.ML package,
# so basic NLP tasks need no additional package.
// Basic imports (C#)
using Microsoft.ML;
using Microsoft.ML.Data;
The MLContext is the central object in ML.NET. It is the entry point for all ML operations and exposes catalogs (such as Data, Transforms, Model, BinaryClassification, and Regression) with methods for:
- Loading and transforming data
- Training models with different algorithms
- Evaluating model performance
- Making predictions
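As a rough illustration of the list above, the sketch below creates an MLContext and touches the catalogs that group those operations. The variable names are illustrative, and the fixed seed is an optional choice made here only so that randomized steps such as data splitting are repeatable.

```csharp
using System;
using Microsoft.ML;

// A fixed seed makes randomized operations (for example, splitting) repeatable.
var mlContext = new MLContext(seed: 0);

// Each property below is a catalog that groups related operations.
var dataOps = mlContext.Data;                                  // loading, splitting, filtering
var transforms = mlContext.Transforms;                         // featurization, normalization
var binaryTrainers = mlContext.BinaryClassification.Trainers;  // training algorithms
var regression = mlContext.Regression;                         // regression trainers and Evaluate
var modelOps = mlContext.Model;                                // save, load, prediction engines

Console.WriteLine(dataOps.GetType().Name);
Console.WriteLine(modelOps.GetType().Name);
```

Creating one MLContext per application and passing it around is a common pattern, since every loader, transform, and trainer is reached through it.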
The ML.NET Pipeline Architecture
ML.NET uses a pipeline approach where you chain data transformations and model trainers together. A typical pipeline consists of:
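// Complete ML.NET classification pipeline
// Assumes an InputData class (string Text, bool Label) and a prediction class,
// called OutputData here as a placeholder name, are defined elsewhere, and that
// trainingDataList is an IEnumerable<InputData>.
using Microsoft.ML;
// 1. Create MLContext
var mlContext = new MLContext();
// 2. Load data
var data = mlContext.Data.LoadFromEnumerable(trainingDataList);
// 3. Create pipeline: text -> numeric features -> normalization -> trainer
var pipeline = mlContext.Transforms.Text.FeaturizeText("Features", "Text")
    .Append(mlContext.Transforms.NormalizeMinMax("Features"))
    .Append(mlContext.BinaryClassification.Trainers
        .SdcaLogisticRegression("Label", "Features"));
// 4. Train model
var model = pipeline.Fit(data);
// 5. Use model for predictions (the engine is typed on input and output classes)
var engine = mlContext.Model.CreatePredictionEngine<InputData, OutputData>(model);
var result = engine.Predict(new InputData { Text = "Sample text" });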
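The pipeline above relies on an input class and a prediction class that the lesson does not define. The sketch below fills them in so the example can run end to end. The shape of InputData, the OutputData name, and the four training sentences are assumptions made for illustration, not part of the original lesson. The output property names match the columns that ML.NET's calibrated binary trainers produce (PredictedLabel, Probability, Score).

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 0);

// Tiny made-up dataset, for illustration only.
var trainingDataList = new List<InputData>
{
    new InputData { Text = "Great product, works well", Label = true },
    new InputData { Text = "Terrible, broke after a day", Label = false },
    new InputData { Text = "Really happy with this purchase", Label = true },
    new InputData { Text = "Awful quality, do not buy", Label = false },
};

var data = mlContext.Data.LoadFromEnumerable(trainingDataList);

var pipeline = mlContext.Transforms.Text.FeaturizeText("Features", nameof(InputData.Text))
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression("Label", "Features"));

var model = pipeline.Fit(data);

// The prediction engine is typed on both the input and the output class.
var engine = mlContext.Model.CreatePredictionEngine<InputData, OutputData>(model);
OutputData result = engine.Predict(new InputData { Text = "Sample text" });
Console.WriteLine($"Predicted: {result.PredictedLabel}, probability: {result.Probability:F2}");

// Hypothetical input schema: one text column and a boolean label.
public class InputData
{
    public string Text { get; set; } = "";
    public bool Label { get; set; }
}

// Hypothetical output schema; properties bind to output columns by name.
public class OutputData
{
    [ColumnName("PredictedLabel")]
    public bool PredictedLabel { get; set; }
    public float Probability { get; set; }
    public float Score { get; set; }
}
```

With only four rows the prediction itself is meaningless. The point is the typing: the engine needs both type arguments, and the output class picks up columns by name.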
Built-in ML.NET Algorithms
ML.NET provides ready-to-use trainers for common machine learning tasks:
Binary Classification
Yes/No predictions. Trainers: Logistic Regression (SDCA, L-BFGS), Linear SVM, Averaged Perceptron, and decision-tree ensembles (FastTree, FastForest).
Multiclass Classification
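Multiple category predictions. Trainers: Maximum Entropy (SDCA, L-BFGS), Naive Bayes, LightGBM, and One-Versus-All, which wraps any binary trainer. Neural-network image classification is available through Microsoft.ML.Vision.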
Regression
Continuous value prediction. Trainers: Linear Regression (SDCA, OLS), Poisson Regression, and decision-tree ensembles (FastTree, FastForest).
Clustering
Group similar data. Trainer: K-Means clustering algorithm.
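Each task above has its own catalog on MLContext, and trainers are created from that catalog. The sketch below instantiates one trainer per task using only the core Microsoft.ML package. The choice of trainers and the cluster count are arbitrary picks for illustration. Tree-based trainers are not shown because they ship in separate NuGet packages (Microsoft.ML.FastTree and Microsoft.ML.LightGbm).

```csharp
using System;
using Microsoft.ML;

var mlContext = new MLContext();

// One trainer per task, each taken from the matching catalog.
// All of them default to a "Features" input column (and "Label" where relevant).
var binary = mlContext.BinaryClassification.Trainers.SdcaLogisticRegression();
var multiclass = mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy();
var regression = mlContext.Regression.Trainers.Sdca();
var clustering = mlContext.Clustering.Trainers.KMeans(numberOfClusters: 3);

Console.WriteLine(binary.GetType().Name);
Console.WriteLine(multiclass.GetType().Name);
Console.WriteLine(regression.GetType().Name);
Console.WriteLine(clustering.GetType().Name);
```

Creating a trainer does no work by itself; training happens only when the trainer, usually appended to a pipeline, has Fit called on it.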
Data Types and Schemas
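ML.NET maps data to and from plain C# classes. Define your input and output as strongly-typed classes: the LoadColumn attribute gives a property's column index in a text file, and ColumnName binds a property to a named pipeline column such as Score.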
// LoadColumn and ColumnName live in Microsoft.ML.Data
using Microsoft.ML.Data;
// Define input data schema (Price is the value to predict)
public class HousingData
{
    [LoadColumn(0)]
    public float Price { get; set; }
    [LoadColumn(1)]
    public float Size { get; set; }
    [LoadColumn(2)]
    public float Bedrooms { get; set; }
}
// Define prediction output (regression trainers write their result to "Score")
public class HousingPrediction
{
    [ColumnName("Score")]
    public float PredictedPrice { get; set; }
}
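To see how a class becomes a schema, the following sketch loads a few in-memory HousingData rows and lists the resulting columns. The two rows are made-up values used only for illustration. Note that LoadColumn matters only when reading text files; for in-memory data the property names alone define the columns.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext();

// Made-up rows, for illustration only.
var rows = new[]
{
    new HousingData { Price = 250000f, Size = 120f, Bedrooms = 3f },
    new HousingData { Price = 180000f, Size = 85f, Bedrooms = 2f },
};

// The IDataView schema is derived from the public properties of HousingData.
IDataView data = mlContext.Data.LoadFromEnumerable(rows);

foreach (var column in data.Schema)
    Console.WriteLine($"{column.Name}: {column.Type}");

public class HousingData
{
    [LoadColumn(0)]
    public float Price { get; set; }
    [LoadColumn(1)]
    public float Size { get; set; }
    [LoadColumn(2)]
    public float Bedrooms { get; set; }
}
```

Column names are plain strings inside pipelines, so a misspelled name is reported when the pipeline is fitted rather than at compile time. Using nameof(HousingData.Size) instead of a literal is one way to reduce that risk.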
Training, Evaluation, and Prediction
The typical ML.NET workflow:
- Load Data: Read from CSV, database, or enumerable
- Split Data: Divide into training and test sets
- Build Pipeline: Chain transformations and trainers
- Train Model: Fit the pipeline to training data
- Evaluate: Measure performance on test data
- Save/Load: Persist models for later use
- Predict: Use the model on new data
// Complete workflow example (uses HousingData and HousingPrediction defined above)
var mlContext = new MLContext();
// Load and split (separatorChar is required for CSV; the default separator is a tab)
var data = mlContext.Data.LoadFromTextFile<HousingData>("data.csv", hasHeader: true, separatorChar: ',');
var splitData = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);
// Build and train: combine the inputs into "Features" and use Price as the label
var pipeline = mlContext.Transforms.Concatenate("Features", "Size", "Bedrooms")
    .Append(mlContext.Transforms.NormalizeMinMax("Features"))
    .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "Price"));
var model = pipeline.Fit(splitData.TrainSet);
// Evaluate on the held-out test set
var predictions = model.Transform(splitData.TestSet);
var metrics = mlContext.Regression.Evaluate(predictions, labelColumnName: "Price");
Console.WriteLine($"R-squared: {metrics.RSquared:F4}");
// Save model together with the input schema
mlContext.Model.Save(model, data.Schema, "model.zip");
// Load and predict
ITransformer loadedModel = mlContext.Model.Load("model.zip", out var schema);
var engine = mlContext.Model.CreatePredictionEngine<HousingData, HousingPrediction>(loadedModel);
var prediction = engine.Predict(new HousingData { Size = 200, Bedrooms = 3 });
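A prediction engine scores one row at a time and is not thread-safe. For scoring many rows at once, one option is to call Transform on the trained model and read the results back as objects. The sketch below shows that variant. It trains on synthetic in-memory rows (prices in thousands, generated by a made-up formula) so that it runs without a data.csv file; the data and the variable names are assumptions for illustration.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 0);

// Synthetic rows: price (in thousands) is a made-up function of size and bedrooms.
var rows = Enumerable.Range(1, 50).Select(i => new HousingData
{
    Size = 50f + 3f * i,
    Bedrooms = 1f + i % 4,
    Price = (50f + 3f * i) + 5f * (1f + i % 4),
}).ToList();

var data = mlContext.Data.LoadFromEnumerable(rows);
var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

var pipeline = mlContext.Transforms
    .Concatenate("Features", nameof(HousingData.Size), nameof(HousingData.Bedrooms))
    .Append(mlContext.Transforms.NormalizeMinMax("Features"))
    .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: nameof(HousingData.Price)));

var model = pipeline.Fit(split.TrainSet);

// Score the whole test set in one pass, then read the results back as objects.
IDataView scored = model.Transform(split.TestSet);
var predictions = mlContext.Data
    .CreateEnumerable<HousingPrediction>(scored, reuseRowObject: false)
    .ToList();

Console.WriteLine($"Scored {predictions.Count} rows");

public class HousingData
{
    public float Price { get; set; }
    public float Size { get; set; }
    public float Bedrooms { get; set; }
}

public class HousingPrediction
{
    [ColumnName("Score")]
    public float PredictedPrice { get; set; }
}
```

Transform is lazy: rows are scored only when the result is enumerated, which is why the call to ToList is what triggers the work here.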
🧠 Quick Check: Lesson 4
What is the primary purpose of MLContext in ML.NET?
In an ML.NET pipeline, what order should the steps follow?
Lesson Summary
ML.NET is a free, .NET-native machine learning framework that integrates seamlessly with C# applications.
MLContext is the central object that orchestrates all ML operations including data loading, training, and evaluation.
Pipelines chain together data transformations and model trainers for a complete ML workflow.
ML.NET provides built-in trainers for classification, regression, clustering, and anomaly detection.
Use strongly-typed classes to define data schemas for type safety and compile-time error checking.