Given sample data $x_1, \ldots, x_n$ generated from a probability distribution $f(x \mid \theta)$ ($\theta$ being an unknown parameter), a statistic $T(x_1, \ldots, x_n)$ of the sample data is called sufficient if $f(x \mid \theta, t) = f(x \mid t)$, where $t = T(x_1, \ldots, x_n)$ — i.e., the conditional distribution of the sample given $T$ does not depend on $\theta$.
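For concreteness, the standard textbook example: if $x_1, \ldots, x_n$ are i.i.d. $\mathrm{Bernoulli}(\theta)$ and $T = \sum_{i=1}^n x_i$, then

$$
f(x_1, \ldots, x_n \mid T = t, \theta) = \frac{\theta^t (1-\theta)^{n-t}}{\binom{n}{t} \theta^t (1-\theta)^{n-t}} = \binom{n}{t}^{-1},
$$

which is free of $\theta$: once you know the total number of successes, every arrangement of the individual outcomes is equally likely.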
However, I'm always a bit confused by this definition, since I think of a sufficient statistic as a function that gives just as much information about $\theta$ as the original data itself (which seems a little different from the definition above).
The definition of Bayesian sufficiency, on the other hand, does mesh with my intuition: $T$ is a Bayesian sufficient statistic if $f(\theta \mid t) = f(\theta \mid x)$ for every prior on $\theta$.
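As a sanity check on this second definition, here's a minimal numerical sketch (the Beta(1, 1) prior, the Bernoulli model, and the sample size are my own illustrative choices, not part of the question): it computes $f(\theta \mid x)$ from the full sample and $f(\theta \mid t)$ from $t = \sum_i x_i$ alone, on a grid, and confirms they match.

```python
import math
import numpy as np

# Sketch of Bayesian sufficiency for the Bernoulli model with T = sum(x),
# under a uniform Beta(1, 1) prior (an illustrative assumption).
rng = np.random.default_rng(0)
n = 20
x = rng.binomial(1, 0.3, size=n)  # hypothetical sample, true theta = 0.3
t = int(x.sum())

theta = np.linspace(0.001, 0.999, 999)  # grid over the parameter space
prior = np.ones_like(theta)             # uniform prior density

# Posterior from the full sample: likelihood prod_i theta^{x_i} (1-theta)^{1-x_i}.
lik_x = theta**t * (1 - theta)**(n - t)
post_x = lik_x * prior
post_x /= post_x.sum()

# Posterior from t alone: Binomial(n, theta) likelihood at t. The binomial
# coefficient is constant in theta, so it cancels when we normalize.
lik_t = math.comb(n, t) * theta**t * (1 - theta)**(n - t)
post_t = lik_t * prior
post_t /= post_t.sum()

print(np.allclose(post_x, post_t))  # True: f(theta | x) == f(theta | t)
```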
So why is the first definition of sufficiency important? What does it capture that Bayesian sufficiency doesn't, and how should I think about it?
[Note: I believe that every (classically) sufficient statistic is also Bayesian sufficient, but that the converse can fail in the infinite-dimensional case, according to Wikipedia.]
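For what it's worth, here's the forward implication as I understand it, via the factorization theorem ($f(x \mid \theta) = g(t, \theta)\, h(x)$):

$$
f(\theta \mid x) \propto f(x \mid \theta)\, f(\theta) = g(t, \theta)\, h(x)\, f(\theta) \propto g(t, \theta)\, f(\theta),
$$

which depends on the data only through $t$, so $f(\theta \mid x) = f(\theta \mid t)$.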