A: To create a perceptron in PyTorch and train it on the logical OR function, we can follow these steps:
-
Define the model: Create a simple single-layer perceptron (SLP) using PyTorch. We'll use 2 input neurons (since the OR function takes two inputs), 1 output neuron, and no hidden layer.
-
Set up the dataset: The logical OR truth table for two inputs is as follows:
Input 1 | Input 2 | Output (OR) |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 1 |
- Train the model: We'll use binary cross-entropy loss for the training, as this is a binary classification problem, and we'll use Stochastic Gradient Descent (SGD) for optimization.
import torch
import torch.nn as nn
import torch.optim as optim
# Define the Perceptron model (single-layer)
class Perceptron(nn.Module):
def __init__(self):
super(Perceptron, self).__init__()
# Input layer with 2 inputs, output layer with 1 output
self.fc = nn.Linear(2, 1)
# Sigmoid activation function for binary classification
self.sigmoid = nn.Sigmoid()
def forward(self, x):
# Forward pass through the perceptron
x = self.fc(x)
x = self.sigmoid(x)
return x
# Define the dataset for Logical OR
# X contains inputs, y contains the target output
X = torch.tensor([[0.0, 0.0],
[0.0, 1.0],
[1.0, 0.0],
[1.0, 1.0]], dtype=torch.float32) # Input data
y = torch.tensor([[0.0], # 0 OR 0 = 0
[1.0], # 0 OR 1 = 1
[1.0], # 1 OR 0 = 1
[1.0]], dtype=torch.float32) # 1 OR 1 = 1
# Initialize the model
model = Perceptron()
# Set up the loss function (binary cross-entropy) and optimizer (Stochastic Gradient Descent)
criterion = nn.BCELoss() # Binary Cross-Entropy Loss
optimizer = optim.SGD(model.parameters(), lr=0.1) # SGD optimizer
# Train the model
epochs = 10000
for epoch in range(epochs):
# Zero the gradients
optimizer.zero_grad()
# Forward pass
output = model(X)
# Compute the loss
loss = criterion(output, y)
# Backward pass (compute gradients)
loss.backward()
# Update the weights
optimizer.step()
# Print the loss every 1000 epochs for monitoring
if (epoch + 1) % 1000 == 0:
print(f"Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}")
# Test the model (after training)
with torch.no_grad(): # No need to compute gradients during testing
predictions = model(X)
predictions = predictions.round() # Round the output to 0 or 1
print("\nPredictions after training:")
for i, pred in enumerate(predictions):
print(f"Input: {X[i].numpy()} -> Prediction: {pred.item()} (Target: {y[i].item()})")
-
Model Definition:
- The perceptron model consists of a single fully connected (linear) layer (
nn.Linear(2, 1)
), which takes 2 inputs (one for each binary input of the OR function) and produces 1 output. - The output is passed through a sigmoid activation to ensure the result is between 0 and 1. This is important for binary classification.
- The perceptron model consists of a single fully connected (linear) layer (
-
Training Data:
X
is a tensor representing the inputs for the logical OR function, with each row being a pair of binary inputs.y
is a tensor representing the corresponding target outputs (logical OR of the inputs).
-
Training Loop:
- Forward pass: The input is passed through the perceptron model.
- Loss calculation: The binary cross-entropy loss function is used to measure how far the model’s predictions are from the true outputs.
- Backward pass: The
loss.backward()
function computes the gradients. - Optimizer step: The optimizer (
SGD
) updates the model's weights based on the computed gradients.
-
Testing:
- After training, the model is tested using the same inputs (
X
). The output predictions are rounded to 0 or 1 using.round()
, since this is a binary classification problem.
- After training, the model is tested using the same inputs (
Epoch [1000/10000], Loss: 0.6215
Epoch [2000/10000], Loss: 0.5305
Epoch [3000/10000], Loss: 0.4706
Epoch [4000/10000], Loss: 0.4301
Epoch [5000/10000], Loss: 0.3984
Epoch [6000/10000], Loss: 0.3724
Epoch [7000/10000], Loss: 0.3504
Epoch [8000/10000], Loss: 0.3315
Epoch [9000/10000], Loss: 0.3149
Epoch [10000/10000], Loss: 0.3006
Predictions after training:
Input: [0. 0.] -> Prediction: 0.0 (Target: 0.0)
Input: [0. 1.] -> Prediction: 1.0 (Target: 1.0)
Input: [1. 0.] -> Prediction: 1.0 (Target: 1.0)
Input: [1. 1.] -> Prediction: 1.0 (Target: 1.0)
- Model Structure: The model is simple with just one layer (the perceptron), and it learns to approximate the OR function.
- Loss Function: Binary Cross-Entropy (
BCELoss
) is used because it's ideal for binary classification tasks. - Optimization: Stochastic Gradient Descent (SGD) is used to optimize the model's weights during training.
This code defines and trains a single-layer perceptron on the logical OR function using PyTorch. After training for 10,000 epochs, the perceptron should successfully predict the OR function outputs for the given inputs. The results are close to the expected logical OR outputs: 0, 1, 1, 1
.