# Main `ebnm` methods and the normal prior family

Function `ebnm()` is the main interface for fitting the empirical Bayes
normal means model; it is a “Swiss army knife” that allows for various
choices of prior family \(\mathcal{G}\) as well as providing multiple
options for fitting and tuning models. For example, we can fit a normal
means model with the prior family \(\mathcal{G}\) taken to be the family of
normal distributions:

```
x <- wOBA$x
s <- wOBA$s
names(x) <- wOBA$Name
names(s) <- wOBA$Name
fit_normal <- ebnm(x, s, prior_family = "normal", mode = "estimate")
```

(The default behavior is to fix the prior mode at zero. Since we
certainly do not expect the distribution of true wOBA skill to have a
mode at zero, we set `mode = "estimate"`.)

We note in passing that the `ebnm` package has a second
model-fitting interface, in which each prior family gets its own
function:

```
fit_normal <- ebnm_normal(x, s, mode = "estimate")
```

Textual and graphical overviews of results can be obtained using,
respectively, the `summary()` and `plot()` methods.
The summary method appears as follows:

```
summary(fit_normal)
#>
#> Call:
#> ebnm_normal(x = x, s = s, mode = "estimate")
#>
#> EBNM model was fitted to 688 observations with _heteroskedastic_ standard errors.
#>
#> The fitted prior belongs to the _normal_ prior family.
#>
#> 2 degrees of freedom were used to estimate the model.
#> The log likelihood is 989.64.
#>
#> Available posterior summaries: _mean_, _sd_.
#> Use method fitted() to access available summaries.
#>
#> A posterior sampler is _not_ available.
#> One can be added via function ebnm_add_sampler().
```

The `plot()` method visualizes results, comparing the
“observed” values \(x_i\) (the initial
wOBA estimates) against the empirical Bayes posterior mean estimates
\(\hat{\theta}_i\):

```
plot(fit_normal)
```

The dashed line shows the diagonal \(x = y\), which makes shrinkage effects clearly visible. In particular, the most extreme wOBAs on either end of the spectrum are strongly shrunk towards the league average (around .300).
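To see why the most extreme estimates are pulled in hardest, it may help to recall the posterior mean under a normal prior. The following is an illustrative sketch in base R, not the `ebnm` internals: with prior \(N(\mu, \tau^2)\) and observation \(x_i \sim N(\theta_i, s_i^2)\), the posterior mean is a precision-weighted average of the observation and the prior mean. The numbers used below (a prior mode of .300 and prior standard deviation .03) are hypothetical values chosen to mimic this example, not quantities taken from the fitted model.

```r
# Illustrative sketch (not ebnm internals): under a normal prior
# N(mu, tau^2) and likelihood x ~ N(theta, s^2), the posterior mean
# is a precision-weighted average of the observation x and the prior
# mean mu. The weight on x shrinks toward zero as s grows.
posterior_mean <- function(x, s, mu, tau) {
  w <- tau^2 / (tau^2 + s^2)  # weight placed on the observation
  w * x + (1 - w) * mu
}

# Hypothetical values: prior mode .300 (roughly league average),
# prior sd .03. A noisy observation is shrunk almost all the way back;
# a precise one moves much less.
posterior_mean(x = 0.500, s = 0.080, mu = 0.300, tau = 0.03)  # heavy shrinkage
posterior_mean(x = 0.500, s = 0.015, mu = 0.300, tau = 0.03)  # mild shrinkage
```

The same observed value of .500 lands in very different places depending on its standard error, which is exactly the pattern visible in the plot.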

Since `plot()` returns a “ggplot” object (Wickham 2016), the plot can
conveniently be customized using `ggplot2` syntax. For example, one can
vary the color of the points by the number of plate appearances:

```
plot(fit_normal) +
  geom_point(aes(color = sqrt(wOBA$PA))) +
  labs(x = "wOBA", y = "EB estimate of true wOBA skill",
       color = expression(sqrt(PA))) +
  scale_color_gradient(low = "blue", high = "red")
```

By varying the color of points, we see that the wOBA estimates with higher standard errors or fewer plate appearances (blue points) tend to be shrunk toward the league average much more strongly than wOBAs from hitters with many plate appearances (red points).

Above, we used `head()` to view data for the first 6
hitters in the dataset. Let’s now see what the EBNM analysis suggests
might be their “true” wOBA skill. To examine the results more closely,
we use the `fitted()` method, which returns a posterior
summary for each hitter:

```
print(head(fitted(fit_normal)), digits = 3)
#>                 mean     sd
#> Khalil Lee     0.303 0.0287
#> Chadwick Tromp 0.308 0.0286
#> Otto Lopez     0.310 0.0283
#> James Outman   0.311 0.0282
#> Matt Carpenter 0.339 0.0254
#> Aaron Judge    0.394 0.0184
```

The wOBA estimates of the first four ballplayers are shrunk strongly toward the league average, reflecting the fact that these players had very few plate appearances (and indeed, we were not swayed by their very high initial wOBA estimates).

Carpenter had many more plate appearances (154) than these other four players, but according to this model we should remain skeptical about his strong performance; after factoring in the prior, we judge his “true” performance to be much closer to the league average, downgrading an initial estimate of .472 to the final posterior mean estimate of .339.
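A quick back-of-envelope check makes the degree of shrinkage concrete. Under a normal prior, the posterior mean is a weighted average \(w x_i + (1 - w)\hat{\mu}\), so the implied weight on Carpenter's data can be recovered from the two numbers above. This sketch assumes the estimated prior mode is roughly the league average of .300 mentioned earlier; it is an illustration, not a computation performed by `ebnm`.

```r
# Back-of-envelope check (assumes an estimated prior mode of ~.300,
# roughly the league average; not an ebnm computation).
x_obs  <- 0.472  # Carpenter's initial wOBA estimate
pm     <- 0.339  # posterior mean reported above
mu_hat <- 0.300  # assumed prior mode
w <- (pm - mu_hat) / (x_obs - mu_hat)  # implied weight on the data
round(w, 2)
```

In words, the model puts only about a quarter of the weight on Carpenter's observed performance, with the rest going to the prior.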