The Wikipedia statement is right, and the statement you attribute to Billingsley is wrong, in some sense inherently so. Convergence in distribution is a notion of convergence of measures: a sequence of random variables can converge in distribution even if the variables live on totally different probability spaces, and their joint distribution never enters. In contrast, convergence in mean (i.e. in $L^1$) only makes sense for random variables defined on the same probability space, since the quantity $E|X_n - X|$ depends on the joint distribution of $(X_n, X)$. So it doesn't even make sense for the former to imply the latter.
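To make the contrast concrete, here are the two notions in a standard form (the usual textbook statements, not a quotation from Billingsley). Convergence in distribution,
$$X_n \xrightarrow{d} X \quad\iff\quad E f(X_n) \to E f(X) \ \text{ for every bounded continuous } f,$$
involves only the marginal laws of $X_n$ and $X$, whereas convergence in mean,
$$X_n \xrightarrow{L^1} X \quad\iff\quad E|X_n - X| \to 0,$$
requires $X_n - X$ to be a well-defined random variable, i.e. all the variables to live on one probability space.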
Even if your random variables happen to be defined on the same probability space, the implication fails in general, as Jacob Katz pointed out. For a very explicit counterexample, let $\lbrace X_i \rbrace_{i=1}^\infty$ be iid Bernoulli with parameter $1/2$. Trivially the $X_i$ converge in distribution (they all have the same law), and they are uniformly integrable because they are uniformly bounded. But $E|X_i - X_j| = 1/2$ for all $i \ne j$, so the $X_i$ are not Cauchy in mean, and hence do not converge in $L^1$.
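For completeness, the value $1/2$ comes from a one-line computation: $|X_i - X_j|$ is itself a Bernoulli variable, equal to $1$ exactly when $X_i \ne X_j$, so by independence
$$E|X_i - X_j| = P(X_i \ne X_j) = P(X_i = 0)P(X_j = 1) + P(X_i = 1)P(X_j = 0) = \tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2}.$$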
It's possible that Billingsley is in error, but I think it's more likely that you have missed some context. Can you post the relevant text?