There are two ways to handle this.
The most common way is simply to assume some form of set theory in the metatheory, e.g. ZFC. Then you can use all the normal semantic and model-theoretic methods that you are used to. The advantages of this method are that it matches the way that mathematicians actually think about things, and that it allows us to use all the theorems we're used to.
Another option is to reinterpret claims made in the metatheory as syntactic claims. For example, the claim "the model must be infinite" can be reinterpreted to mean that the theory itself (!) proves each of the sentences
$$ \lnot (\exists c_0) (\forall x) (x = c_0) $$
$$ \lnot (\exists c_0, c_1) (\forall x) (x = c_0 \lor x = c_1) $$
$$ \lnot (\exists c_0, c_1, c_2) (\forall x) (x = c_0 \lor x = c_1 \lor c = c_2) $$
and more generally, for each $k > 0$,
$$
\lnot (\exists c_0, \ldots, c_k ) (\forall x) (x = c_0 \lor \cdots \lor c = c_k) $$
The fact that each of these sentences is provable in the theory can be verified using a very weak metatheory. There's a simple and uniform way, for any fixed $k$, to generate a proof from your two axioms for that particular $k$.
This reinterpretation process is well known to logicians, and it can be applied very generally. The main limitation is whether the property in question can be stated in the language of the theory itself (possibly as an infinite set of sentences, as with the example above).
Gödel's completeness theorem says that if a sentence in the language of a theory is true (semantically) in every model of the theory, then that sentence is provable (syntactically) within the theory itself. So we could, if we wanted, ignore the "true" part in those cases and focus just on the "provable" part. The cost of doing that is that it arguably misses the point, since we usually think semantically about models, and that it does not apply to properties of models that cannot be expressed within the language of the theory itself.
Because this reinterpretation process is usually very routine, it's rare for logicians to bother to mention it, even though we're well aware it could be done. We just write in the normal model-theoretic (set theoretic) way unless there is a particular reason that we need to focus on the syntactic interpretation.