Lesson 3: Neural Networks Basics

Neural networks are the foundation of modern deep learning. In this lesson, you'll learn how artificial neurons work, explore network architectures, and study activation functions and backpropagation, the algorithm that computes the gradients used to train a neural network.

The Biological Inspiration: How Neurons Work

Artificial neural networks are inspired by biological brains. The human brain contains approximately 86 billion neurons connected by synapses. Each neuron receives signals, processes them, and transmits outputs to other neurons.

An artificial neuron mimics this behavior by:

Receiving multiple inputs (weighted by importance)
Summing the weighted inputs
Applying a non-linear activation function
Outputting the result to the next layer

Artificial Neuron Model

Input: x₁, x₂, x₃, ... xₙ
Weights: w₁, w₂, w₃, ... wₙ
Bias: b
Computation:
z = (x₁×w₁ + x₂×w₂ + ... + xₙ×wₙ) + b
output = activation_function(z)
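
The computation above can be sketched in a few lines of C#. This is a minimal illustration, not a reference implementation: the class name NeuronSketch is hypothetical, and sigmoid is assumed as the activation only because the model above does not fix one.

```csharp
using System;

public static class NeuronSketch
{
    // Sigmoid is an assumed example activation; any non-linear function could be used.
    public static double Sigmoid(double z) => 1.0 / (1.0 + Math.Exp(-z));

    // Computes z = (x1*w1 + x2*w2 + ... + xn*wn) + b, then applies the activation.
    public static double Output(double[] inputs, double[] weights, double bias)
    {
        if (inputs.Length != weights.Length)
            throw new ArgumentException("inputs and weights must have the same length.");

        double z = bias;
        for (int i = 0; i < inputs.Length; i++)
            z += inputs[i] * weights[i];

        return Sigmoid(z);
    }
}
```

For example, with inputs (1.0, 2.0), weights (0.5, -0.25), and bias 0, the weighted sum is z = 0.5 - 0.5 = 0, so the neuron outputs sigmoid(0) = 0.5.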

Perceptrons: The Simplest Neural Network

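A perceptron is the simplest form of a neural network: a single artificial neuron with a step activation. It is used for binary classification tasks. The perceptron algorithm learns from its mistakes: whenever it misclassifies a training example, it nudges each weight in the direction that moves the output toward the correct label.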

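// Simple Perceptron Example in C#
using System;

public class Perceptron {
    private readonly double[] weights;
    private double bias;
    private readonly double learningRate = 0.01;

    public Perceptron(int inputSize) {
        weights = new double[inputSize];
        bias = 0;
        // Initialize weights randomly in the range [-0.5, 0.5)
        var random = new Random();
        for (int i = 0; i < weights.Length; i++)
            weights[i] = random.NextDouble() - 0.5;
    }

    // Returns 1 if the weighted sum plus bias is non-negative, otherwise 0
    public int Predict(double[] inputs) {
        if (inputs.Length != weights.Length)
            throw new ArgumentException("Expected " + weights.Length + " inputs.", nameof(inputs));

        double sum = bias;
        for (int i = 0; i < weights.Length; i++)
            sum += inputs[i] * weights[i];
        return sum >= 0 ? 1 : 0; // Activation: Step function
    }

    // Perceptron learning rule: weights change only when a prediction is wrong
    public void Train(double[][] trainingData, int[] labels, int epochs) {
        if (trainingData.Length != labels.Length)
            throw new ArgumentException("trainingData and labels must have the same length.");

        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int i = 0; i < trainingData.Length; i++) {
                int prediction = Predict(trainingData[i]);
                int error = labels[i] - prediction; // -1, 0, or +1 for labels in {0, 1}

                // Update weights
                for (int j = 0; j < weights.Length; j++)
                    weights[j] += learningRate * error * trainingData[i][j];
                bias += learningRate * error;
            }
        }
    }
}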
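
To see the learning rule in action, the sketch below applies the same update to the logical AND function. It is a self-contained illustration under stated assumptions: the class name AndGateDemo is hypothetical, weights start at zero instead of random values so the run is repeatable, and the learning rate is passed in as a parameter.

```csharp
using System;

public static class AndGateDemo
{
    // Step activation, the same rule as the Predict method in the perceptron class.
    public static int Predict(double[] weights, double bias, double[] inputs)
    {
        double sum = bias;
        for (int i = 0; i < inputs.Length; i++)
            sum += inputs[i] * weights[i];
        return sum >= 0 ? 1 : 0;
    }

    // Trains on the four rows of the AND truth table and returns the learned parameters.
    public static (double[] Weights, double Bias) Train(int epochs, double learningRate)
    {
        double[][] data =
        {
            new[] { 0.0, 0.0 }, new[] { 0.0, 1.0 },
            new[] { 1.0, 0.0 }, new[] { 1.0, 1.0 }
        };
        int[] labels = { 0, 0, 0, 1 };

        var weights = new double[2]; // assumption: zero initialization for repeatability
        double bias = 0.0;

        for (int epoch = 0; epoch < epochs; epoch++)
        {
            for (int i = 0; i < data.Length; i++)
            {
                int error = labels[i] - Predict(weights, bias, data[i]);
                for (int j = 0; j < weights.Length; j++)
                    weights[j] += learningRate * error * data[i][j];
                bias += learningRate * error;
            }
        }
        return (weights, bias);
    }
}
```

Calling AndGateDemo.Train(20, 0.5) should yield parameters that classify all four input pairs correctly, because AND is linearly separable. A single perceptron cannot learn XOR in the same way, which is one motivation for the multi-layer networks covered later in this lesson.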

Activation Functions

Activation functions introduce non-linearity into neural networks. Without them, stacking layers would be equivalent to a single linear transformation, limiting the network's ability to learn complex patterns.
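
To see why, consider two layers with no activation between them. The symbols W_1, b_1, W_2, and b_2 are assumed notation for the weights and biases of each layer; they do not appear elsewhere in this lesson.

```latex
\begin{aligned}
h &= W_1 x + b_1 \\
y &= W_2 h + b_2 \\
  &= W_2 (W_1 x + b_1) + b_2 \\
  &= (W_2 W_1)\,x + (W_2 b_1 + b_2)
\end{aligned}
```

The result has the form W x + b with W = W_2 W_1 and b = W_2 b_1 + b_2, so the two layers together compute nothing a single linear layer could not.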

Sigmoid

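Maps any input to the range (0, 1). Commonly used in the output layer for binary classification. Its gradient shrinks toward zero for large positive or negative inputs, which causes the vanishing gradient problem in deep networks.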

ReLU (Rectified Linear Unit)

Returns max(0, x). A common default for hidden layers in deep networks. It is cheap to compute, and its gradient is exactly 1 for positive inputs, which helps avoid vanishing gradients.

Tanh

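Maps input to the range (-1, 1). Its output is zero-centered, which often makes it easier to train than sigmoid in hidden layers, but it still saturates for large inputs and remains prone to vanishing gradients.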

Softmax

Converts a vector of raw output scores into a probability distribution: every value lies between 0 and 1, and the values sum to 1. Used in the output layer for multi-class classification.
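
The four functions above might be written as follows. This is a hedged sketch: the class name Activations is hypothetical, and subtracting the maximum score inside Softmax is a common numerical-stability technique that the lesson itself does not mention.

```csharp
using System;
using System.Linq;

public static class Activations
{
    // Maps any real input to (0, 1).
    public static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // Passes positive inputs through unchanged and clamps negatives to 0.
    public static double ReLU(double x) => Math.Max(0.0, x);

    // Maps any real input to (-1, 1), centered on 0.
    public static double Tanh(double x) => Math.Tanh(x);

    // Turns a non-empty vector of raw scores into probabilities that sum to 1.
    public static double[] Softmax(double[] scores)
    {
        // Subtracting the max leaves the result unchanged but avoids overflow in Math.Exp.
        double max = scores.Max();
        double[] exps = scores.Select(s => Math.Exp(s - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }
}
```

Note that Sigmoid, ReLU, and Tanh act on one value at a time, while Softmax needs the whole output vector because each probability depends on all of the scores.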

Network Architecture: Layers and Connections

Neural networks consist of three types of layers:

Input Layer: Receives raw data (not counted in layer depth)
Hidden Layers: Process information and learn patterns (a network can have anywhere from one to hundreds or more)
Output Layer: Produces final predictions

A network with more than one hidden layer is called a deep neural network. The depth allows the network to learn hierarchical representations—lower layers detect simple features, while deeper layers combine them into complex concepts.

// Simple Multi-Layer Network Architecture
Input Layer: 784 neurons (28×28 pixel images)
    ↓
Hidden Layer 1: 128 neurons (ReLU activation)
    ↓
Hidden Layer 2: 64 neurons (ReLU activation)
    ↓
Hidden Layer 3: 32 neurons (ReLU activation)
    ↓
Output Layer: 10 neurons (Softmax activation, for digits 0-9)
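
It can help to count how many trainable parameters an architecture like this contains. The sketch below assumes every layer is fully connected with one bias per neuron, which the diagram implies but does not state outright; the class name ParameterCount is hypothetical.

```csharp
using System;

public static class ParameterCount
{
    // Counts weights and biases for a fully connected network.
    // layerSizes lists the neuron count of each layer, starting with the input layer.
    public static long Count(int[] layerSizes)
    {
        long total = 0;
        for (int i = 1; i < layerSizes.Length; i++)
        {
            long weights = (long)layerSizes[i - 1] * layerSizes[i]; // one weight per connection
            long biases = layerSizes[i];                            // one bias per neuron
            total += weights + biases;
        }
        return total;
    }
}
```

For the network above, ParameterCount.Count(new[] { 784, 128, 64, 32, 10 }) returns 111146: 100,480 parameters between the input and the first hidden layer, then 8,256, 2,080, and 330 for the layers that follow. Most of the parameters sit in the first layer because it connects to all 784 input pixels.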

Forward Propagation: How Data Flows Through the Network

Forward propagation is the process of passing input data through the network to generate predictions:

Input data enters the input layer
Each neuron in the next layer receives weighted inputs from all neurons in the previous layer (in a fully connected network) and adds its bias
An activation function transforms the resulting sum
The process repeats through all hidden layers
The output layer produces the final predictions

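Each layer's computation is essentially a matrix multiplication followed by an activation function, so forward propagation is fast: even networks with millions of parameters can typically produce a prediction in milliseconds on modern hardware.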
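
The forward pass described above can be sketched as a loop over layers. This is a simplified illustration: the names ForwardPass, DenseLayer, and Forward are hypothetical, weights are stored as one row per output neuron, and ReLU is applied to every layer (a real classifier such as the digit network above would typically use softmax on the output layer instead).

```csharp
using System;

public static class ForwardPass
{
    public static double ReLU(double x) => Math.Max(0.0, x);

    // One fully connected layer:
    // output[j] = activation(sum over i of input[i] * weights[j][i] + biases[j])
    public static double[] DenseLayer(
        double[] input, double[][] weights, double[] biases, Func<double, double> activation)
    {
        var output = new double[biases.Length];
        for (int j = 0; j < output.Length; j++)
        {
            double z = biases[j];
            for (int i = 0; i < input.Length; i++)
                z += input[i] * weights[j][i];
            output[j] = activation(z);
        }
        return output;
    }

    // Chains the layers: the output of each layer becomes the input of the next.
    public static double[] Forward(double[] input, double[][][] weights, double[][] biases)
    {
        double[] current = input;
        for (int layer = 0; layer < weights.Length; layer++)
            current = DenseLayer(current, weights[layer], biases[layer], ReLU);
        return current;
    }
}
```

As a small worked case, take the input (1, 2), a hidden layer with weight rows (1, -1) and (0.5, 0.5) and zero biases, and an output neuron with weights (2, 2) and bias 1. The hidden sums are -1 and 1.5, ReLU turns them into 0 and 1.5, and the output is 2×0 + 2×1.5 + 1 = 4.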

Backpropagation: The Training Algorithm

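Backpropagation is the algorithm used to train neural networks. It computes the gradient of the loss with respect to every weight, and an optimizer such as gradient descent then uses those gradients to update the weights. Each training iteration works in two phases: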

Backpropagation Process

Phase 1: Forward Pass
• Input data flows through the network
• Compute the output and the loss (error)

Phase 2: Backward Pass
• Calculate the gradient of the loss with respect to each weight and bias using the chain rule
• Update the weights and biases using gradient descent

Repeat both phases until the loss stops improving (convergence)

The name "backpropagation" comes from the fact that errors propagate backward through the network, from output layer to input layer, allowing each neuron to learn how much it contributed to the final error.
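
The backward flow can be traced by hand on a very small network. The sketch below is an illustration under several assumptions that are not part of the lesson: a single input, one sigmoid hidden neuron, one linear output neuron, and a squared-error loss of 0.5 × (y - target)². The names TinyBackprop, Params, Loss, and Step are hypothetical.

```csharp
using System;

public static class TinyBackprop
{
    // Parameters of a 1-1-1 network: h = sigmoid(W1*x + B1), y = W2*h + B2.
    public sealed class Params
    {
        public double W1, B1, W2, B2;
    }

    public static double Sigmoid(double z) => 1.0 / (1.0 + Math.Exp(-z));

    // Forward pass only: returns the squared-error loss for one example.
    public static double Loss(Params p, double x, double target)
    {
        double h = Sigmoid(p.W1 * x + p.B1);
        double y = p.W2 * h + p.B2;
        return 0.5 * (y - target) * (y - target);
    }

    // One training step: forward pass, backward pass, gradient descent update.
    public static void Step(Params p, double x, double target, double learningRate)
    {
        // Phase 1: forward pass
        double h = Sigmoid(p.W1 * x + p.B1);
        double y = p.W2 * h + p.B2;

        // Phase 2: backward pass, starting at the output and moving toward the input
        double dY = y - target;           // dLoss/dy
        double dW2 = dY * h;              // dLoss/dW2
        double dB2 = dY;                  // dLoss/dB2
        double dH = dY * p.W2;            // error propagated back to the hidden neuron
        double dZ1 = dH * h * (1.0 - h);  // chain rule through sigmoid: sigmoid'(z) = h * (1 - h)
        double dW1 = dZ1 * x;             // dLoss/dW1
        double dB1 = dZ1;                 // dLoss/dB1

        // Gradient descent: move each parameter against its gradient
        p.W1 -= learningRate * dW1;
        p.B1 -= learningRate * dB1;
        p.W2 -= learningRate * dW2;
        p.B2 -= learningRate * dB2;
    }
}
```

Notice that dH is computed from dY, and dZ1 from dH: each gradient reuses the one computed for the layer after it. That reuse is what makes backpropagation efficient, and calling Step repeatedly on the same example should drive the value returned by Loss toward zero.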

🧠 Quick Check — Lesson 3

What is the purpose of activation functions in neural networks?

In backpropagation, which direction do gradients flow to update weights?

Lesson Summary

✅ Artificial neurons mimic biological neurons by combining weighted inputs with bias and applying activation functions.
✅ Perceptrons are single-neuron networks for binary classification; deeper networks learn more complex patterns.
✅ Activation functions (ReLU, Sigmoid, Tanh) introduce non-linearity, allowing networks to learn curved decision boundaries.
✅ Forward propagation passes data through layers; backpropagation trains the network by computing gradients.
✅ Deep networks have multiple hidden layers that learn hierarchical features from simple to complex.

Up Next

Lesson 4: ML.NET Framework
