Backpropagation means: send the error backward through the network so each weight learns how much it contributed to the mistake.
In our tiny model there is only one layer, so “backward” looks like:
error = prediction - actual
weight1 -= learning_rate × error × study_hours
weight2 -= learning_rate × error × sleep_hours
bias -= learning_rate × error
That error × input pattern is the seed of full backprop. Deep nets chain the same idea with calculus (chain rule) so every layer gets a fair share of blame.
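As a minimal sketch, the four update lines above could look like this in plain Python for a single row. The starting weights, the example row, and the learning rate of 0.1 are assumptions chosen for illustration, and `update_step` is a hypothetical helper name, not something from the tutorial's repo.

```python
import math

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def update_step(weight1, weight2, bias, study_hours, sleep_hours, actual,
                learning_rate=0.1):
    """Apply one error × input update for a single student row."""
    z = weight1 * study_hours + weight2 * sleep_hours + bias  # raw score
    prediction = sigmoid(z)
    error = prediction - actual
    # Each weight is nudged by the error scaled by the input it multiplied.
    weight1 -= learning_rate * error * study_hours
    weight2 -= learning_rate * error * sleep_hours
    bias -= learning_rate * error  # the bias "input" is effectively 1
    return weight1, weight2, bias, error

# Hypothetical row: 2 study hours, 7 sleep hours, the student passed (actual = 1).
w1, w2, b, err = update_step(0.0, 0.0, 0.0,
                             study_hours=2.0, sleep_hours=7.0, actual=1.0)
print(w1, w2, b, err)
```

Because the prediction (0.5) is below the label (1), the error is negative, so all three parameters move up. The larger input (sleep_hours) produces the larger nudge.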
[Interactive animation: the error travels backward from Loss through Prediction and Raw z to the Weights (via the inputs) and the Bias.]
Backprop is not magic. It is bookkeeping: work out how much each weight contributed to the error, then nudge that weight in the opposite direction.
Being upfront builds trust, so here is what this tiny model leaves out. Real stacks usually add:
More layers and neurons — depth creates feature hierarchies
Better losses — cross-entropy for classification, MSE for regression
Optimizers — Adam, momentum, weight decay
Regularization — dropout, early stopping, data augmentation
Scale — batching, GPUs, distributed training
Frameworks — PyTorch, TensorFlow, JAX
None of that changes the core loop you practiced:
Predict → measure error → adjust → repeat.
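The loop above can be sketched end to end in plain Python. The four rows, the learning rate, and the epoch count below are hypothetical values picked for illustration; they are not the tutorial's dataset or settings.

```python
import math

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical rows: (study_hours, sleep_hours, actual), 1 = pass, 0 = fail.
rows = [(1.0, 5.0, 0), (2.0, 6.0, 0), (4.0, 7.0, 1), (5.0, 8.0, 1)]

weight1, weight2, bias = 0.0, 0.0, 0.0
learning_rate = 0.02  # assumed value
epochs = 1000         # assumed value
losses = []           # average squared error per epoch

for epoch in range(epochs):                      # repeat
    total = 0.0
    for study_hours, sleep_hours, actual in rows:
        # Predict
        z = weight1 * study_hours + weight2 * sleep_hours + bias
        prediction = sigmoid(z)
        # Measure error
        error = prediction - actual
        total += error ** 2
        # Adjust
        weight1 -= learning_rate * error * study_hours
        weight2 -= learning_rate * error * sleep_hours
        bias -= learning_rate * error
    losses.append(total / len(rows))

print(f"first epoch loss: {losses[0]:.4f}")
print(f"last epoch loss:  {losses[-1]:.4f}")
```

Comparing the first and last entries of `losses` is a quick way to confirm that training is moving predictions toward the labels.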
You built a tiny neural network by hand.
Idea | You did it
Neuron | input × weight + bias
Probability | sigmoid
Decision | threshold 0.5
Loss | squared error
Training | epoch loop updating weights
Multi-input | w1, w2
Backprop preview | error × input updates
Runnable code: githubRepo/2026/05/20/tiny-neural-network-by-hand/.
Knowledge Check
Question 1: In our tutorial, what is the main purpose of the bias term?
A. To store the dataset labels
B. To shift the raw output up or down independent of the input values (Correct)
C. To replace sigmoid during training
D. To count how many epochs have run
Explanation: Bias adds a constant offset, so the model can fit patterns that a weight-only model could not: without a bias, the raw output is always zero when every input is zero.

Question 2: A student studies 2.5 hours. After training, sigmoid gives 0.52. What label do we assign with threshold 0.5?
A. Fail
B. Pass (Correct)
C. Retry training
D. Unknown
Explanation: 0.52 ≥ 0.5, so we label Pass, even though the probability shows the example is near the boundary.

Question 3: Why does training try to reduce loss?
A. So the dataset gets smaller each epoch
B. So predictions move closer to actual labels on average (Correct)
C. So we can remove sigmoid
D. So learning rate increases automatically
Explanation: Lower loss means smaller squared errors: predictions align better with the actual labels.

Question 4: When we added sleep hours, how did weight2 get updated each row?
A. weight2 += learning_rate × actual
B. weight2 -= learning_rate × error × sleep_hours (Correct)
C. weight2 = sigmoid(sleep_hours)
D. weight2 is frozen; only weight1 trains
Explanation: Each weight is nudged by the error scaled by the input that used that weight, the same pattern as weight1 with study_hours.

Question 5: Which statement is most accurate about our pass/fail model?
A. It reasons about student motivation
B. It applies fixed math with learned numbers (Correct)
C. It only works with exactly four students in the world
D. It cannot run in Python without PyTorch
Explanation: The model applies learned weights and bias, with no reasoning involved. Four rows were just our teaching dataset; you can add more rows and retrain.

Question 6: What is backpropagation, in one line?
A. A way to download datasets from the cloud
B. Sending error backward to decide how each weight should change (Correct)
C. Replacing loss with accuracy
D. A type of GPU driver
Explanation: Backprop distributes error to weights (and deeper layers in big nets) so updates point toward lower loss.

Question 7: You set learning_rate = 1.0 and loss spikes wildly. What likely happened?
A. Sigmoid broke because Python is slow
B. Updates were too large and overshot good weights (Correct)
C. Bias became unnecessary
D. The model ran out of memory on four rows
Explanation: A huge learning rate takes big steps; weights can overshoot and bounce instead of settling.
Add more rows and features in the repo’s examples/ folder
Read about logistic regression; you basically built one
Try a framework tutorial next, but trace one batch through loss.backward() so you can see it doing the same bookkeeping you did by hand
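Before reaching for a framework, one way to trace a single row by hand is to compare the error × input pattern with a finite-difference estimate of the gradient. The sketch below assumes the cross-entropy (log) loss that logistic regression minimizes, for which error × input is the exact gradient; under squared error the exact gradient would carry an extra sigmoid-slope factor, prediction × (1 − prediction). The weights, bias, and row are hypothetical, and the helper names are invented for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def raw_score(weights, bias, inputs):
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def log_loss(weights, bias, inputs, actual):
    """Cross-entropy loss for one row (assumed here; see the lead-in)."""
    p = sigmoid(raw_score(weights, bias, inputs))
    return -(actual * math.log(p) + (1 - actual) * math.log(1 - p))

def hand_gradient(weights, bias, inputs, actual):
    """The tutorial's pattern: error × input for each weight."""
    error = sigmoid(raw_score(weights, bias, inputs)) - actual
    return [error * x for x in inputs]

def numeric_gradient(weights, bias, inputs, actual, eps=1e-6):
    """Central finite differences: nudge one weight, watch the loss change."""
    grads = []
    for i in range(len(weights)):
        up = list(weights)
        down = list(weights)
        up[i] += eps
        down[i] -= eps
        diff = log_loss(up, bias, inputs, actual) - log_loss(down, bias, inputs, actual)
        grads.append(diff / (2 * eps))
    return grads

# Hypothetical values for illustration only.
weights, bias = [0.3, -0.1], 0.05
inputs, actual = [2.5, 7.0], 1  # study_hours, sleep_hours, label

by_hand = hand_gradient(weights, bias, inputs, actual)
by_nudging = numeric_gradient(weights, bias, inputs, actual)
for name, h, n in zip(["weight1", "weight2"], by_hand, by_nudging):
    print(f"{name}: hand={h:.6f} numeric={n:.6f}")
```

If the two columns agree to several decimal places, the hand-derived update matches the slope of the loss, which is the same quantity a framework's automatic differentiation would report for this loss.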
When you finish the quiz, head to the completion page for a summary and project ideas.