I don't know about a geometric interpretation, but here is a brief sketch of a proof. First we need to be precise about what we mean by "convergence." In the naive sense, that is, pointwise, Fourier series don't always converge. (If you change the value of a function at a single point, the Fourier series remains unchanged.) The sense in which they do always converge is in the Hilbert space $L^2([0, 1])$, which has inner product $\langle f, g \rangle = \int_0^1 \overline{g(x)} f(x) \, dx$; this induces a norm $\|f\| = \sqrt{\langle f, f \rangle}$, which in turn induces a metric $d(f, g) = \|f - g\|$. In $L^2([0, 1])$ let $X$ be the subspace spanned by the functions $e^{2\pi i nx}, n \in \mathbb{Z}$, that is, the space of trigonometric polynomials. It is fairly straightforward to verify that the functions $e^{2\pi i nx}$ are pairwise orthogonal and have norm $1$; generally I think about this in a representation-theoretic way, as a special case of the orthogonality relations for characters.
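The orthonormality check is a one-line integral. As a sketch, using the inner product above (conjugate on the second argument) and writing $e_n(x) = e^{2\pi i n x}$ as shorthand for this computation only:

$$\langle e_m, e_n \rangle = \int_0^1 \overline{e^{2\pi i n x}} \, e^{2\pi i m x} \, dx = \int_0^1 e^{2\pi i (m - n) x} \, dx = \begin{cases} 1 & \text{if } m = n, \\ \left[ \dfrac{e^{2\pi i (m - n) x}}{2\pi i (m - n)} \right]_0^1 = 0 & \text{if } m \ne n. \end{cases}$$

The second case vanishes because $e^{2\pi i k} = 1$ for every integer $k$.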
Then the statement that Fourier series converge is equivalent to the statement that $X$ is dense in $L^2([0, 1])$. Why? Given a sequence in $X$ converging to an element $f$ of $L^2([0, 1])$, we can compute the Fourier coefficients of its terms. Each coefficient is an inner product with a fixed $e^{2\pi i nx}$, so it depends continuously on the term and hence converges to the corresponding Fourier coefficient of $f$. That these coefficients actually represent $f$, in the sense that the partial sums of its Fourier series converge to $f$ in $L^2([0, 1])$, is a standard Hilbert space argument, and you should take a course in functional analysis if you want to learn this kind of stuff thoroughly.
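For the record, one direction of that standard argument can be sketched in a few lines. Here $e_n(x) = e^{2\pi i n x}$, $\hat{f}(n) = \langle f, e_n \rangle$, and $S_N f = \sum_{|n| \le N} \hat{f}(n) e_n$ are notation assumed just for this sketch. By orthonormality, $f - S_N f$ is orthogonal to every $e_n$ with $|n| \le N$, so $S_N f$ is the closest point to $f$ in the span of those $e_n$:

$$\|f - S_N f\| \le \|f - p\| \quad \text{for every } p \in \operatorname{span}\{e_n : |n| \le N\}.$$

If $X$ is dense, then given $\epsilon > 0$ there is some $p \in X$ with $\|f - p\| < \epsilon$. This $p$ only involves finitely many $e_n$, say those with $|n| \le N_0$, so for every $N \ge N_0$

$$\|f - S_N f\| \le \|f - p\| < \epsilon,$$

which is exactly the statement that $S_N f \to f$ in $L^2([0, 1])$. The converse direction is immediate, since each $S_N f$ lies in $X$.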
Now, something else you need to know about $L^2([0, 1])$ is that the subspace $Y$ consisting of all step functions is dense in it. (If you have trouble believing this, first convince yourself that every continuous function on $[0, 1]$ is a uniform limit, and hence an $L^2$ limit, of step functions, and then believe me that the continuous functions are dense in $L^2([0, 1])$. In fact, $L^2([0, 1])$ can be defined as the completion of $C([0, 1])$ with respect to the $L^2$ norm.) So to show that $X$ is dense, it suffices to show that the closure of $X$ contains $Y$. In fact, it suffices to show that the closure of $X$ contains a step function with a single bump, say
$$a(x) = \begin{cases} 0 & \text{if } 0 \le x \le \frac{1}{3} \text{ or } \frac{2}{3} \le x \le 1 \\ 1 & \text{otherwise} \end{cases}$$
and to take linear combinations, translations, and dilations of this. In other words, it suffices to prove convergence for square waves. But one can do the computations directly here. There is a standard picture to stare at, and of course if you have ever actually heard a square wave you should believe that audio engineers, at least, are perfectly capable of approximating square waves by sines and cosines.
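To make "do the computations directly" a bit more concrete, here is a minimal numerical sketch in Python. Integrating $e^{-2\pi i n x}$ over $(\frac{1}{3}, \frac{2}{3})$ gives the coefficients $c_0 = \frac{1}{3}$ and $c_n = (-1)^n \frac{\sin(\pi n / 3)}{\pi n}$ for $n \ne 0$. Since the exponentials are orthonormal, the squared $L^2$ distance from $a$ to its $N$-th partial sum is $\frac{1}{3} - \sum_{|n| \le N} |c_n|^2$. The function names below are made up for illustration, and the quadrature routine is only an independent sanity check, not part of the proof.

```python
import cmath
import math


def bump_coefficient(n: int) -> float:
    """n-th Fourier coefficient of the indicator of (1/3, 2/3) on [0, 1].

    Closed form of the integral of exp(-2*pi*i*n*x) over (1/3, 2/3);
    it happens to be real because the bump is symmetric about x = 1/2.
    """
    if n == 0:
        return 1.0 / 3.0
    return (-1) ** n * math.sin(math.pi * n / 3) / (math.pi * n)


def squared_l2_error(N: int) -> float:
    """Squared L^2 distance between the bump and its N-th partial sum.

    Uses Pythagoras for an orthonormal family:
    ||a - S_N a||^2 = ||a||^2 - sum_{|n| <= N} |c_n|^2, with ||a||^2 = 1/3.
    """
    captured = sum(bump_coefficient(n) ** 2 for n in range(-N, N + 1))
    return 1.0 / 3.0 - captured


def squared_l2_error_by_quadrature(N: int, samples: int = 20000) -> float:
    """Same quantity, estimated by a midpoint rule on |a(x) - S_N a(x)|^2."""
    total = 0.0
    for k in range(samples):
        x = (k + 0.5) / samples
        a = 1.0 if 1 / 3 < x < 2 / 3 else 0.0
        partial_sum = sum(
            bump_coefficient(n) * cmath.exp(2j * math.pi * n * x)
            for n in range(-N, N + 1)
        )
        total += abs(a - partial_sum) ** 2
    return total / samples


if __name__ == "__main__":
    for N in (1, 10, 100, 1000):
        print(N, squared_l2_error(N))
```

The printed errors should shrink roughly like $1/(\pi^2 N)$: slowly, but they do go to zero, which is the $L^2$ convergence claimed for this particular square wave.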