Is it possible to characterize the distributional derivative as some sort of "best linear approximation" of a distribution (à la the Fréchet/Gâteaux derivatives), when viewed in the appropriate spaces?
For example, let $f \in D'$, $\phi \in D$, and let $\psi_x \in D$ be a mollified ramp function so that $\psi_x(x,y,z,\dots)=x$ on the support of $\phi$. What can we say about the quantity:
$$\frac{f(\phi)-f(\phi+s\psi_x)}{s}-\frac{\partial f}{\partial x}(\phi)$$
where $\frac{\partial f}{\partial x}$ is the usual distributional derivative?
Going further, if we relax the allowed "derivative directions" so that $\psi$ is any element of $D$, does it even make sense to talk about the quantity "$Df:D\rightarrow D'$":
$$Df(\phi)(\psi):=\lim_{s\rightarrow 0} \frac{f(\phi)-f(\phi+s\psi)}{s}$$ ?
In the space of distributions we don't have a norm, but we still have a vector space structure and a notion of convergence, so it seems that something like this is at least plausible.
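One observation that may help frame the question: taking the definition of $Df$ above literally, and using only the fact that every $f \in D'$ is a linear functional on $D$, the difference quotient can be computed directly for any $s \neq 0$:

$$\frac{f(\phi)-f(\phi+s\psi)}{s}=\frac{f(\phi)-f(\phi)-s\,f(\psi)}{s}=-f(\psi)$$

So, at least formally, the limit exists for every $\psi \in D$ and depends on neither $s$ nor $\phi$. For comparison, the usual distributional derivative is defined by $\frac{\partial f}{\partial x}(\phi)=-f\left(\frac{\partial \phi}{\partial x}\right)$.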