Below I show that Rudin's approximation arises simply by applying the secant method, a difference analog of Newton's method for finding successively better approximations to roots.
As the linked Wikipedia article shows, the recurrence relation for the secant method is as below.
$$\rm S_{n+1}= \dfrac{S_{n-1}\ f\:(S_n) - S_n\ f\:(S_{n-1})}{f\:(S_n)-f\:(S_{n-1})}\qquad\qquad\qquad\qquad$$
For $\rm\ (S_{n-1},S_n,S_{n+1}) = (q,p,p')\ $ and $\rm\ f\:(x) = x^2-d\:,\:$ we obtain
$$\rm p'\ =\ \dfrac{q\:(p^2-d) - p\:(q^2-d)}{p^2-d-(q^2-d)}\ =\ \dfrac{(p-q)\:(p\:q+d)}{p^2-q^2}\ =\ \dfrac{p\:q+d}{p+q}$$
Finally specializing $\rm\: q = 2 = d\: $ yields Rudin's approximation $\rm\displaystyle\ p'\ =\ \frac{2\:p+2}{\ \:p+2}$
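As a quick sanity check, here is a minimal Python sketch of the specialization above. The names `secant_step` and `rudin_step` are mine, not Rudin's, and the starting value $\rm\ p = 1\ $ is an arbitrary choice. Note that one endpoint is held fixed at $\rm\ q = 2\ $ on every step, whereas the usual secant iteration updates both points. Exact rationals are used so the two formulas can be compared with `==`.

```python
from fractions import Fraction

def secant_step(f, s_prev, s_cur):
    """One secant update, written exactly as in the recurrence above."""
    return (s_prev * f(s_cur) - s_cur * f(s_prev)) / (f(s_cur) - f(s_prev))

def rudin_step(p):
    """The specialized form p' = (2p + 2) / (p + 2)."""
    return (2 * p + 2) / (p + 2)

def f(x):
    # f(x) = x^2 - d with d = 2
    return x * x - 2

q = Fraction(2)   # fixed endpoint (q = 2 = d)
p = Fraction(1)   # assumed starting value with p^2 < 2
iterates = [p]
for _ in range(4):
    nxt = secant_step(f, q, p)
    # The secant update and the simplified formula agree exactly.
    assert nxt == rudin_step(p)
    p = nxt
    iterates.append(p)

print(iterates)
```

Starting from $\rm\ p = 1\ $ the iterates are $\rm\ 1,\ 4/3,\ 7/5,\ 24/17,\ 41/29,\ $ each with square below $2$.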
The secant method has beautiful connections with the group law on conics.
To learn about this folklore, I highly recommend Sam Northshield's Associativity of the Secant Method. The reader already familiar with the group law on elliptic curves, but unfamiliar with the degenerate case of conics, might also find helpful some of Franz Lemmermeyer's expositions, e.g. Conics - a poor man's elliptic curves.
Finally, note that this approximation can be derived purely algebraically as follows.
Given positive lower and upper approximations $\rm\ p,\,q\ $ to a square root $\rm\ \sqrt d,\ $ we may obtain a better lower approximation $\rm\ p'\ $ by "composing" them, namely:
THEOREM $\rm\displaystyle\quad\ \ q\ >\ \sqrt d\ > \ p\ \ \:\Rightarrow\:\ \ \sqrt d\ > \ p'\ >\ p\quad\ \ for\quad\ p' \:=\ \frac{p\:q+d}{p+q} $
Proof: $\rm\quad\displaystyle 0\ \: >\ (q-\sqrt d)\ \ (p-\sqrt d)\ =\ p\:q+d - (p+q)\:\sqrt d\ \ \Rightarrow\ \ \sqrt d\ >\ p'$
Finally $\rm\quad\quad\displaystyle p'-p\ =\ \frac{p\:q+d}{p+q} - p\ =\ \frac{\ d - p^2}{p+q}\: >\ 0\ \ \Rightarrow\ \ p'\ >\ p$
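The theorem can also be spot-checked numerically. The sketch below is only an illustration, not a proof: `compose` and `check` are hypothetical helper names, $\rm\ d = 2\ $ and the grid of tenths are arbitrary choices, and I assume $\rm\ p > 0\ $ so that $\rm\ p+q > 0.\ $ Working with exact rationals lets us test $\rm\ \sqrt d > p'\ $ as $\rm\ p'^2 < d\ $ without any floating-point error.

```python
from fractions import Fraction

def compose(p, q, d):
    """The composed approximation p' = (p*q + d) / (p + q)."""
    return (p * q + d) / (p + q)

def check(p, q, d):
    """Return True if sqrt(d) > p' > p, given q > sqrt(d) > p > 0."""
    assert p > 0 and p * p < d < q * q
    p_new = compose(p, q, d)
    # p_new > 0, so p_new < sqrt(d) is equivalent to p_new^2 < d.
    return p < p_new and p_new * p_new < d

d = 2
grid = [Fraction(n, 10) for n in range(1, 31)]   # 1/10, 2/10, ..., 3
results = [
    check(p, q, d)
    for p in grid
    for q in grid
    if p * p < d < q * q
]
all_ok = all(results)
print(all_ok)
```

For instance, `compose(Fraction(7, 5), Fraction(3, 2), 2)` gives $\rm\ 41/29,\ $ which lies strictly between $\rm\ 7/5\ $ and $\rm\ \sqrt 2.$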