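import numpy as np

np.random.seed(42)  # fix the seed so the printed numbers are reproducible
samples = np.random.randn(1000)  # 1000 draws from a standard normal
# samples.std() uses NumPy's default ddof=0 (population standard deviation)
print(f"mean={samples.mean():.4f}, std={samples.std():.4f}")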
Dell Zhang
2026-05-11
This is a test post. Equations work: $\nabla_\theta J(\theta) = \mathbb{E}_\pi\left[\nabla_\theta \log \pi_\theta(a \mid s) \cdot Q^\pi(s, a)\right]$.
Code chunks execute:
mean=0.0193, std=0.9787
And display math too:

$$J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\left[\sum_{t=0}^{T} r(s_t, a_t)\right]$$