0

Suppose I have a probability mass function $p(Y)$ and values $X_0, X_1, \dots, X_i$. I want to assign values to $X_{i+1}, \dots, X_n$ such that the whole sequence of $X$ values follows the distribution $p(Y)$, where $Y$ is the set of all distinct values of $X$. Is there a technique for doing this?

  • 0
    If this is a discrete distribution, then $p(x)$ gives the probability that a random variable is equal to some value. So then the question is: given the probabilities, what values should one assign to the random variables $X_{i+1}, \dots, X_n$? 2010-11-13
  • 0
    @trevor: Yes, you are right. 2010-11-13
  • 0
    What is $X$ here? Is it the sequence $(X_0,\ldots,X_n)$, interpreted as $n+1$ samples of a discrete probability distribution? If so, I'm not sure there is a meaningful sense in which a sequence of specified samples follows a given distribution, whereas a different sequence does not. For example, does the sequence $(\mathrm{H,H,H,H,H})$ follow the distribution of a fair coin toss? If so, would this be a satisfactory result for you? 2010-11-13

1 Answer

1

As long as no value in the sequence falls outside the support of the distribution, it is a sequence that could have been generated by that distribution (the all-heads sequence for a fair coin toss is a good example).

What you can try to do is minimize the distance between the empirical CDF of the data and the CDF of your distribution. You have to decide which distance metric to use (the maximum absolute difference is one obvious choice). Once you have that, it boils down to solving an optimization problem each time a new data point is added.
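As a rough illustration of that idea, here is a minimal Python sketch. It assumes the distribution is discrete with an orderable support, uses the maximum absolute CDF difference as the distance, and picks each new value greedily, one point at a time. The greedy step is a simplification and is not guaranteed to give the globally best completion. The function names `cdf_distance` and `extend_sequence` are made up for this example.

```python
def cdf_distance(counts, pmf, support):
    """Maximum absolute difference between the empirical CDF and the target CDF."""
    total = sum(counts[v] for v in support)
    empirical_cum = 0.0
    target_cum = 0.0
    worst = 0.0
    for v in support:  # support must be in increasing order
        empirical_cum += counts[v] / total
        target_cum += pmf[v]
        worst = max(worst, abs(empirical_cum - target_cum))
    return worst


def extend_sequence(observed, pmf, n_total):
    """Greedily append values until the sequence has n_total entries.

    observed: the already fixed values X_0, ..., X_i
    pmf:      dict mapping each support value to its probability
    n_total:  desired length of the completed sequence
    """
    support = sorted(pmf)
    counts = {v: 0 for v in support}
    for x in observed:
        if x not in counts:
            raise ValueError(f"observed value {x!r} is outside the support")
        counts[x] += 1

    result = list(observed)
    while len(result) < n_total:
        best_value, best_dist = None, float("inf")
        for v in support:
            counts[v] += 1  # tentatively add v
            dist = cdf_distance(counts, pmf, support)
            counts[v] -= 1  # undo the tentative addition
            if dist < best_dist:  # ties go to the smallest value
                best_value, best_dist = v, dist
        counts[best_value] += 1
        result.append(best_value)
    return result
```

For example, with a fair coin encoded as `{0: 0.5, 1: 0.5}` and the first three values fixed at `0`, calling `extend_sequence([0, 0, 0], {0: 0.5, 1: 0.5}, 6)` fills the remaining slots with `1`, which brings the empirical CDF back in line with the target.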