
🔬 The Well — Physics Simulation Explorer

15TB Datasets Enhanced by Agent R
🌊

The Well — Physics Simulation Datasets

A 15TB collection of diverse physics simulations for machine learning. Explore 21 datasets spanning fluid dynamics, astrophysics, biological systems, and more — now with Agent R's normalization enhancements.

15TB Total Data
21 Datasets
3 Normalizers

🤖 Agent R Enhancements

📐

MinMaxNormalization

New third normalization strategy alongside ZScore and RMS. Normalizes data to [0, 1] using per-field min/max statistics. Guards against constant fields with a configurable min_denom.

✓ PR #1 Merged
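
As a rough illustration of the min/max scaling and the constant-field guard described above, here is a minimal sketch. `MinMaxSketch` is a hypothetical stand-in, not the class from the_well.data, and the `min_denom` default shown is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MinMaxSketch:
    """Hypothetical stand-in for a min-max normalizer; not the library class."""

    mins: Dict[str, float]
    maxs: Dict[str, float]
    min_denom: float = 1e-4  # assumed default; the real value may differ

    def _denom(self, field: str) -> float:
        # Clamp the range so a constant field (max == min) never divides by zero.
        return max(self.maxs[field] - self.mins[field], self.min_denom)

    def normalize(self, values: List[float], field: str) -> List[float]:
        return [(v - self.mins[field]) / self._denom(field) for v in values]

    def denormalize(self, values: List[float], field: str) -> List[float]:
        return [v * self._denom(field) + self.mins[field] for v in values]


norm = MinMaxSketch(mins={"u": -3.0, "c": 5.0}, maxs={"u": 3.0, "c": 5.0})
u_norm = norm.normalize([-3.0, 0.0, 3.0], "u")  # [0.0, 0.5, 1.0]
c_norm = norm.normalize([5.0, 5.0], "c")        # constant field, no ZeroDivisionError
u_back = norm.denormalize(u_norm, "u")          # back to the original values
```

Without the clamp, the constant field "c" would divide by zero; with it, every value maps to 0.0.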
🐛

Mutable Default Args Fixed

WellDataset and WellDataModule used shared mutable default lists for include_filters/exclude_filters, so state could leak between instances. Both parameters are now Optional[List[str]] = None, with None coerced inside the body.

✓ PR #1 Merged
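
The pitfall behind this fix is easy to reproduce in isolation. The sketch below uses two hypothetical helper functions, not the WellDataset constructor, to show how a default list is shared across calls and how the None pattern avoids it.

```python
from typing import List, Optional


def buggy_filters(name: str, include_filters: List[str] = []) -> List[str]:
    # The default list is created once, at definition time, and shared by every call.
    include_filters.append(name)
    return include_filters


def fixed_filters(name: str, include_filters: Optional[List[str]] = None) -> List[str]:
    # Coerce None to a fresh list on every call; copy a caller-supplied list.
    filters = [] if include_filters is None else list(include_filters)
    filters.append(name)
    return filters


buggy_first = list(buggy_filters("a"))   # ["a"]
buggy_second = list(buggy_filters("b"))  # ["a", "b"]: state from the first call leaked
fixed_first = fixed_filters("a")         # ["a"]
fixed_second = fixed_filters("b")        # ["b"]
```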
⚙️

Static Method Signature

Metric.eval had a spurious self parameter in its @staticmethod definition, silently shifting positional arguments. Removed self from the declaration.

✓ PR #1 Merged
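
To see why a stray self in a static method shifts arguments without raising, consider this sketch. The classes and the (x, y, scale) signature are hypothetical and chosen only to make the shift visible; they are not the actual Metric.eval signature.

```python
class BuggyMetric:
    @staticmethod
    def eval(self, x, y, scale=1.0):
        # @staticmethod binds no instance, so "self" is just the first positional slot.
        return (x - y) * scale


class FixedMetric:
    @staticmethod
    def eval(x, y, scale=1.0):
        return (x - y) * scale


# Same call, different results: in the buggy version 5.0 lands in "self",
# 3.0 in x, 2.0 in y, and scale silently falls back to its default.
buggy = BuggyMetric.eval(5.0, 3.0, 2.0)  # (3.0 - 2.0) * 1.0 = 1.0
fixed = FixedMetric.eval(5.0, 3.0, 2.0)  # (5.0 - 3.0) * 2.0 = 4.0
```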
⚠️

Proper Warning Emission

The Resize augmentation reported problems with a bare print(), which callers could neither filter nor catch. It now calls warnings.warn(…, UserWarning, stacklevel=2), so the message goes through Python's standard warnings machinery.

✓ PR #1 Merged
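
The practical difference is that a warning can be recorded, silenced, or escalated by the caller, while printed text cannot. A minimal sketch, assuming a hypothetical resize helper and message (the real Resize augmentation and its wording are not shown here):

```python
import warnings


def resize(size: int, max_size: int = 64) -> int:
    """Hypothetical helper that clamps oversized requests and warns about it."""
    if size > max_size:
        warnings.warn(
            f"Requested size {size} exceeds {max_size}; clamping.",
            UserWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return max_size
    return size


# Callers can record the warning instead of losing it in stdout...
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = resize(128)

# ...or escalate it to an exception, for example in a test suite.
with warnings.catch_warnings():
    warnings.simplefilter("error", UserWarning)
    try:
        resize(128)
        raised = False
    except UserWarning:
        raised = True
```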
📦

Improved Exports

All three normalizers — ZScoreNormalization, RMSNormalization, and new MinMaxNormalization — are now importable directly from the_well.data.

✓ PR #1 Merged
🔢

total_trajectories Property

New WellMetadata.total_trajectories computed property returns sum(n_trajectories_per_file), replacing scattered sum(meta.n_trajectories_per_file) call sites.

✓ PR #1 Merged
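
A computed property like this takes only a few lines. The sketch below uses a hypothetical `MetadataSketch` dataclass rather than the real WellMetadata, which carries many more fields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetadataSketch:
    """Hypothetical stand-in for WellMetadata with a single field."""

    n_trajectories_per_file: List[int] = field(default_factory=list)

    @property
    def total_trajectories(self) -> int:
        # Single source of truth for the sum that call sites used to repeat.
        return sum(self.n_trajectories_per_file)


meta = MetadataSketch(n_trajectories_per_file=[8, 8, 4])
total = meta.total_trajectories  # 20
```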

📊 Interactive Normalization Comparison


ZScore Normalization

Centers data to zero mean with unit standard deviation. Best for Gaussian data. Formula: z = (x − μ) / σ

from the_well.data import ZScoreNormalization

norm = ZScoreNormalization(stats, fields, const_fields)
x_norm = norm.normalize(x, "field")  # z-score
x_back = norm.denormalize_flattened(x_norm, "variable")
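
To make the formula z = (x − μ) / σ concrete, here is a small worked example in plain Python. The helper functions are hypothetical, and the statistics are computed from the sample itself purely for illustration; this is not how ZScoreNormalization obtains them.

```python
import math
from typing import List


def zscore(values: List[float], mu: float, sigma: float) -> List[float]:
    # z = (x - mu) / sigma
    return [(v - mu) / sigma for v in values]


def inverse_zscore(z_values: List[float], mu: float, sigma: float) -> List[float]:
    # x = z * sigma + mu
    return [z * sigma + mu for z in z_values]


samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu = sum(samples) / len(samples)                                       # 5.0
sigma = math.sqrt(sum((v - mu) ** 2 for v in samples) / len(samples))  # 2.0

z = zscore(samples, mu, sigma)           # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
restored = inverse_zscore(z, mu, sigma)  # recovers the original samples
```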


🎬 Live Physics Simulations

Rayleigh-Bénard Convection

Thermal convection driven by temperature gradients. A fluid heated from below becomes unstable and forms convective cells.

MHD Turbulence

Magnetohydrodynamic (MHD) turbulence as seen in extragalactic fluids. Magnetic field lines interact with plasma flows.

Gray-Scott Reaction Diffusion

Turing-type pattern formation via two chemical species. Self-organizing spots and stripes emerge from random initial conditions.

Active Matter

Self-propelled particles exhibiting collective motion and flocking behavior, as seen in biological systems.

🗂️ Dataset Catalog

💻 Usage Examples

# Install The Well (enhanced fork by Agent R)
# pip install git+https://github.com/barbrickdesign/the_well-enhancedByAgentR

from torch.utils.data import DataLoader

from the_well.data import (
    WellDataset,
    WellDataModule,
    ZScoreNormalization,
    RMSNormalization,
    MinMaxNormalization,  # ← new in Agent R's enhancement
)

# Load a dataset (stream from HuggingFace)
trainset = WellDataset(
    well_base_path="hf://datasets/polymathic-ai/",
    well_dataset_name="active_matter",
    well_split_name="train",
)
loader = DataLoader(trainset, batch_size=8, shuffle=True)

# Apply MinMax normalization (Agent R enhancement)
# "u" and "v" are illustrative field names; substitute the fields
# and batch keys that your dataset actually provides.
norm = MinMaxNormalization(
    stats={"min": {"u": -3.0, "v": -2.0}, "max": {"u": 3.0, "v": 2.0}},
    core_field_names=["u", "v"],
    core_constant_field_names=[],
)

for batch in loader:
    u_norm = norm.normalize(batch["u"], "u")  # → [0, 1]
    u_back = norm.denormalize(u_norm, "u")    # → original
    # train your model...

# WellMetadata.total_trajectories (Agent R enhancement)
meta = trainset.metadata
total = meta.total_trajectories  # sum(n_trajectories_per_file)
print(f"Total trajectories: {total}")