Tinker With a Neural Network Right Here in Your Browser. Press the spacebar to play or pause.

[Interactive playground controls: epoch counter, dataset selector ("Which dataset do you want to use?"), feature selector ("Which properties do you want to feed in?"), and editable hidden layers (click anywhere to edit; hover a neuron to see its output larger; connection line thickness shows the weights mixing the outputs). The output panel reports test loss, test accuracy, and training loss. Colors show data, neuron, and weight values.]

Federated Learning playground guide

Enable FL by pressing "Run federated". Click on local models to start local-only training, then click play or step to aggregate all models after local training. Parameter descriptions are given below.

Federated Learning parameters guide

This playground supports several federated learning algorithms with extensive hyperparameter control. Below is a comprehensive guide to all available parameters:

Core FL Settings

Algorithm (dropdown): The FL optimization algorithm, one of FedAvg (standard), FedProx (proximal), FedAdam (server-side Adam), or SCAFFOLD (variance reduction).
Clients (2-54): Total number of participating clients in the federation.
Client Fraction (0.05-1.0): Fraction of clients selected per round (the C parameter).
Data Balance (0-1): Evenness of the data distribution across clients; 1 is an even distribution, 0 is extremely skewed.
Data Heterogeneity (1.0-100): Dirichlet distribution parameter controlling class heterogeneity across clients; higher values mean more IID data, lower values mean more heterogeneous data.
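As a concrete illustration of the aggregation step behind these settings, here is a minimal sketch of weighted FedAvg in TypeScript. It assumes each client update is a flat Float32Array of model weights; the names ClientUpdate and fedAvg are illustrative, not the playground's actual API.

```ts
// Illustrative sketch of weighted FedAvg aggregation (not the playground's actual code).
interface ClientUpdate {
  weights: Float32Array;  // flattened local model parameters after local training
  numSamples: number;     // |D_k|, used when "Weight by size" is enabled
}

function fedAvg(updates: ClientUpdate[], weightBySize: boolean): Float32Array {
  const dim = updates[0].weights.length;
  const aggregated = new Float32Array(dim);
  const totalSamples = updates.reduce((sum, u) => sum + u.numSamples, 0);
  for (const u of updates) {
    // Uniform averaging unless weighting by client dataset size |D_k|.
    const coeff = weightBySize ? u.numSamples / totalSamples : 1 / updates.length;
    for (let i = 0; i < dim; i++) {
      aggregated[i] += coeff * u.weights[i];
    }
  }
  return aggregated;  // becomes the new global model for the next round
}
```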

Advanced Settings

Client LR (0.001-0.2): Local learning rate for client-side SGD optimization.
Dropout (0.0-0.9): Probability that a selected client silently drops out, simulating a communication failure.
Weight by size (checkbox): Use weighted aggregation based on client dataset sizes (|D_k|).
μ (FedProx) (0.0-1.0): Proximal term strength for the FedProx algorithm; adds a μ||w - w₀||² penalty to the local objectives.
Server LR (FedAdam) (0.001-0.1): Server-side learning rate for the FedAdam optimizer.
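To make the μ (FedProx) parameter concrete, here is a hedged sketch of a FedProx-style local SGD step. It follows the common convention in which a proximal penalty of (μ/2)·||w − w₀||² contributes μ·(w − w₀) to the gradient; whether the playground uses the μ or μ/2 scaling is an implementation detail, and the name fedProxStep is illustrative.

```ts
// Illustrative FedProx-style local SGD step (not the playground's actual code).
// The proximal penalty pulls local weights back toward the global model w0
// received at the start of the round; mu = 0 recovers plain FedAvg local SGD.
function fedProxStep(
  w: Float32Array,      // current local weights
  w0: Float32Array,     // global weights at the start of the round
  grad: Float32Array,   // gradient of the local loss at w
  clientLR: number,     // "Client LR" parameter
  mu: number            // "μ (FedProx)" parameter
): void {
  for (let i = 0; i < w.length; i++) {
    w[i] -= clientLR * (grad[i] + mu * (w[i] - w0[i]));
  }
}
```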

Differential Privacy

Diff. Privacy (checkbox): Enable client-level differential privacy (server-level when unchecked).
DP Clip (0.0-10.0): L2-norm clipping threshold for gradient/update vectors.
DP σ (0.0-2.0): Gaussian noise multiplier; the effective noise is σ × clip_norm.
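The two DP parameters combine in the usual clip-then-noise pattern: the update vector (a client's, or the aggregate on the server, depending on the checkbox) is clipped to an L2 norm of at most DP Clip, then Gaussian noise with standard deviation σ × clip_norm is added. A minimal sketch, assuming flat Float32Array updates; privatizeUpdate and gaussian are illustrative names, not the playground's API.

```ts
// Illustrative clip-and-noise step for differential privacy (not the playground's actual code).
function gaussian(): number {
  // Box-Muller transform for a standard normal sample.
  const u = 1 - Math.random();
  const v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

function privatizeUpdate(update: Float32Array, clipNorm: number, sigma: number): Float32Array {
  // DP Clip: rescale the update if its L2 norm exceeds the threshold.
  const norm = Math.sqrt(update.reduce((s, x) => s + x * x, 0));
  const scale = Math.min(1, clipNorm / norm);
  // DP σ: effective noise standard deviation is sigma * clipNorm.
  return update.map((x) => x * scale + sigma * clipNorm * gaussian());
}
```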

Clustered Federated Learning

Clustered FL (checkbox): Enable clustering clients into groups with separate models.
Similarity Metric (dropdown): Distance metric for clustering, either cosine similarity or L2 distance.
Clusters K (1-8): Number of client clusters (separate models).
Recluster Every (1-20): How often clients are reclustered, in rounds.
Warmup Rounds (0-10): Initial rounds before the first clustering operation.
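A hedged sketch of how these parameters could fit together: after the warmup rounds, each client's update is compared to each cluster's centroid under the chosen similarity metric and the client is assigned to the closest cluster, repeating every "Recluster Every" rounds. The names similarity and assignClusters are illustrative, not the playground's actual API.

```ts
// Illustrative cluster assignment for clustered FL (not the playground's actual code).
type Metric = 'cosine' | 'l2';

function similarity(a: Float32Array, b: Float32Array, metric: Metric): number {
  let dot = 0, na = 0, nb = 0, dist = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
    dist += (a[i] - b[i]) * (a[i] - b[i]);
  }
  return metric === 'cosine'
    ? dot / (Math.sqrt(na) * Math.sqrt(nb) + 1e-12)
    : -Math.sqrt(dist);  // negated so that "larger is more similar" for both metrics
}

function assignClusters(
  clientUpdates: Float32Array[],
  centroids: Float32Array[],   // one per cluster, length K = "Clusters K"
  metric: Metric
): number[] {
  return clientUpdates.map((u) => {
    let best = 0;
    for (let k = 1; k < centroids.length; k++) {
      if (similarity(u, centroids[k], metric) > similarity(u, centroids[best], metric)) best = k;
    }
    return best;  // index of the cluster whose model this client will train next
  });
}
```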

Federated Learning Charts

When FL is enabled, four real-time charts track key metrics during training:

Client Participation

Blue line: Percentage of total clients selected per round (Client Fraction × 100)

Green line: Percentage of selected clients that actually participated (after dropout)

Higher participation generally leads to better convergence but increases communication costs.
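A small sketch of how the two plotted percentages could be derived each round; the function name and the way dropouts are sampled are assumptions, not the playground's exact logic.

```ts
// Illustrative per-round participation metrics.
function participationStats(totalClients: number, clientFraction: number, dropoutProb: number) {
  const selected = Math.max(1, Math.round(clientFraction * totalClients));
  // Each selected client independently drops out with probability dropoutProb.
  let participated = 0;
  for (let i = 0; i < selected; i++) {
    if (Math.random() >= dropoutProb) participated++;
  }
  return {
    selectedPct: (selected / totalClients) * 100,      // blue line
    participatedPct: (participated / selected) * 100,  // green line
  };
}
```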

Communication Cost

Orange line: Total communication overhead per round in KB

Includes the model weights sent to clients and the updates sent back to the server. In clustered FL, this also includes per-cluster model synchronization.

Trade-off: more parameters mean better expressivity but higher communication costs.
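As a rough model of what this chart measures, the per-round cost can be estimated as selected clients × model size × bytes per parameter, counted once for the downlink and once for the uplink. A sketch assuming 32-bit float parameters; commCostKB is an illustrative name, not the playground's exact accounting.

```ts
// Rough communication-cost estimate per round, in KB.
function commCostKB(selectedClients: number, numParams: number, bytesPerParam = 4): number {
  const downlinkBytes = selectedClients * numParams * bytesPerParam; // global model to clients
  const uplinkBytes = selectedClients * numParams * bytesPerParam;   // client updates to server
  return (downlinkBytes + uplinkBytes) / 1024;
}
```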

Client Loss Distribution

Red line: Maximum loss across all participating clients

Purple line: Average loss across all participating clients

Blue-grey line: Minimum loss across all participating clients

A large spread indicates unfairness: some clients perform much worse than others.

Convergence Rate

Brown line: Rate of training loss improvement per round

Calculated as: (previous_loss - current_loss) / previous_loss

Higher values indicate faster learning. The rate typically decreases as training progresses and approaches the optimum.
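The plotted rate follows directly from the formula above; a one-line sketch:

```ts
// Relative improvement of training loss between consecutive rounds (brown line).
function convergenceRate(previousLoss: number, currentLoss: number): number {
  return previousLoss > 0 ? (previousLoss - currentLoss) / previousLoss : 0;
}
```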

Why do both cosine similarity and L2 distance exist?

Ghosh et al. [26] propose an L2-distance clustering algorithm to determine the distribution similarity of the clients. This approach has the limitation that it only works if the clients' risk functions are convex and the minima of the different clusters are well separated. The L2 distance is also unable to distinguish congruent from incongruent settings, which means the method will incorrectly split up clients in the conventional congruent non-IID setting.


Um, What Is a Neural Network?

It’s a technique for building a computer program that learns from data. It is based very loosely on how we think the human brain works. First, a collection of software “neurons” are created and connected together, allowing them to send messages to each other. Next, the network is asked to solve a problem, which it attempts to do over and over, each time strengthening the connections that lead to success and diminishing those that lead to failure. For a more detailed introduction to neural networks, Michael Nielsen’s Neural Networks and Deep Learning is a good place to start. For a more technical overview, try Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.

This Is Cool, Can I Repurpose It?

Please do! We’ve open sourced it on GitHub with the hope that it can make neural networks a little more accessible and easier to learn. You’re free to use it in any way that follows our Apache License. And if you have any suggestions for additions or changes, please let us know.

We’ve also provided some controls below to enable you to tailor the playground to a specific topic or lesson. Just choose which features you’d like to be visible below, then save this link, or refresh the page.

What Do All the Colors Mean?

Orange and blue are used throughout the visualization in slightly different ways, but in general orange shows negative values while blue shows positive values.

The data points (represented by small circles) are initially colored orange or blue, which correspond to positive one and negative one.

In the hidden layers, the lines are colored by the weights of the connections between neurons. Blue shows a positive weight, which means the network is using that output of the neuron as given. An orange line shows that the network is assigning a negative weight.

In the output layer, the dots are colored orange or blue depending on their original values. The background color shows what the network is predicting for a particular area. The intensity of the color shows how confident that prediction is.

What Library Are You Using?

We wrote a tiny neural network library that meets the demands of this educational visualization. For real-world applications, consider the TensorFlow library.

Credits

This was created by Daniel Smilkov and Shan Carter. This is a continuation of many people’s previous work — most notably Andrej Karpathy’s convnet.js demo and Chris Olah’s articles about neural networks. Many thanks also to D. Sculley for help with the original idea and to Fernanda Viégas and Martin Wattenberg and the rest of the Big Picture and Google Brain teams for feedback and guidance.