Inside the Black Box
Large Language Models (LLMs) often hallucinate. To trust their outputs, we need a way to see what is happening inside the network.
The Visualization Pipeline
- Extract Activations: Hook into the PyTorch model layers.
- Reduce Dimensionality: Use t-SNE or UMAP to project 1024 dimensions to 3D.
- Render: WebGL scatter plot.
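The first step above, hooking into the model, can be sketched with PyTorch's forward-hook API. The toy two-layer model below is illustrative, not the model from this post; the idea is just to capture a layer's output as it flows through:

```python
import torch
import torch.nn as nn

# Toy stand-in model: an 8 -> 1024 -> 4 MLP (illustrative only).
model = nn.Sequential(nn.Linear(8, 1024), nn.ReLU(), nn.Linear(1024, 4))

captured = {}

def hook(module, inputs, output):
    # Store a detached copy so downstream visualization code
    # can't accidentally touch the autograd graph.
    captured["activations"] = output.detach()

# Attach the hook to the 1024-dimensional hidden layer (index 0 here).
handle = model[0].register_forward_hook(hook)

with torch.no_grad():
    model(torch.randn(16, 8))

handle.remove()  # always detach hooks when done
print(captured["activations"].shape)  # torch.Size([16, 1024])
```

Removing the hook afterwards matters in practice: stale hooks keep firing on every forward pass and leak memory.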
[Figure: Neural Map — 3D scatter plot of the projected activations]
Python Snippet
```python
import torch
from sklearn.manifold import TSNE

# Get hidden states from the final layer
with torch.no_grad():
    outputs = model(input_ids, output_hidden_states=True)
hidden_states = outputs.hidden_states[-1]  # (batch, seq_len, hidden_dim)

# t-SNE expects a 2-D array: flatten (batch, seq, hidden) to (batch * seq, hidden)
flat = hidden_states.reshape(-1, hidden_states.shape[-1]).cpu().numpy()

# Project to 2D (use n_components=3 for the 3D WebGL view)
tsne = TSNE(n_components=2)
projected = tsne.fit_transform(flat)
```
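For large activation sets, t-SNE can be slow; a linear projection is a cheap baseline for the 3D view. Here is a minimal from-scratch PCA sketch in NumPy (the random array is a toy stand-in for real hidden states, not output from the model above):

```python
import numpy as np

# Toy stand-in for hidden states: 256 tokens x 1024 dimensions.
rng = np.random.default_rng(0)
activations = rng.normal(size=(256, 1024))

# Center the data, then take the top 3 principal directions via SVD.
centered = activations - activations.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)

# Project onto the first 3 components: one (x, y, z) point per token,
# ready to hand off to the WebGL scatter plot.
projected = centered @ vt[:3].T
print(projected.shape)  # (256, 3)
```

PCA preserves global variance rather than local neighborhoods, so clusters look different from t-SNE/UMAP output, but it is deterministic and fast enough to rerun on every layer.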