One approach to model comparison in a Bayesian framework introduces a Bernoulli indicator variable that selects which of two models is likely to be the “true model”. When applying MCMC-based tools to fit such a model, it is common to use pseudo-priors to improve mixing in the chains. See here for a very accessible treatment of why pseudo-priors are useful. Apparently, it is well known that the choice of pseudo-priors does not affect the long-run behavior of the chains (and therefore inference from the model); it only affects short-term mixing.
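For concreteness, my understanding of the construction (following Carlin and Chib) is that the joint state always carries both parameter vectors, with the currently unselected model’s parameters drawn from a pseudo-prior, and that the full conditional for the indicator is

$$
P(M = k \mid \theta_1, \theta_2, y) \;\propto\; f_k(y \mid \theta_k)\, p_k(\theta_k)\, \tilde{p}_j(\theta_j)\, \pi_k, \qquad j \neq k,
$$

where $f_k$ is model $k$’s likelihood, $p_k$ its prior, $\tilde{p}_j$ the pseudo-prior for the other model’s parameters, and $\pi_k$ the prior probability of model $k$. Note that the pseudo-prior density appears directly in this full conditional, which is what makes my question below seem pressing.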
This question seeks intuition about how it can possibly be the case that the pseudo-priors do not affect inference from the model. In particular, something must be wrong either with the understanding I have expressed above or with the reasoning I describe below, and I want to know exactly what I am getting wrong.
Suppose that in reality Model 1 and Model 2 are roughly equally probable given the data. But suppose that Jane inadvertently chooses pseudo-priors for Model 1 that are far away from the region of high posterior density. Suppose that Jane’s pseudo-priors for Model 2, by contrast, fall directly in the region of high posterior density. Now consider what happens to a chain as it wanders around. When it lands in Model 1, it might remain in Model 1 for a little while, but it can jump to Model 2 fairly easily. When it lands in Model 2, the poorly chosen pseudo-priors on Model 1 make it difficult to jump back to Model 1. It sure seems to me that, as a result, the chain will spend a lot more time in Model 2 than in Model 1, which would lead to invalid inference.
So which part of this all am I misunderstanding?
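To make the setup concrete, here is a minimal sketch of the kind of sampler I have in mind: a toy Carlin–Chib Gibbs sampler with two deliberately identical Gaussian models, so that the true posterior probability of each model is exactly 0.5 by symmetry. All names, the toy data, and the specific “bad” pseudo-prior (a normal centered several posterior standard deviations away) are my own illustration, not anyone’s recommended settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data. Both models say y ~ N(mu, 1) with prior mu ~ N(0, 1),
# so the exact posterior model probabilities are 0.5 / 0.5.
y = rng.normal(0.3, 1.0, size=20)
n, ybar = len(y), y.mean()

# Conjugate full conditional for mu given y (identical for both models):
post_var = 1.0 / (1.0 + n)           # prior precision 1 + data precision n
post_mean = post_var * n * ybar
post_sd = np.sqrt(post_var)

def log_norm(x, m, s):
    """Log density of N(m, s^2) at x."""
    return -0.5 * np.log(2 * np.pi * s**2) - 0.5 * ((x - m) / s) ** 2

def run_chain(pseudo1, pseudo2, iters=30000):
    """Carlin & Chib Gibbs sampler. pseudo1/pseudo2 are (mean, sd) of the
    pseudo-prior for each model's parameter, used while the chain sits in
    the *other* model. Returns the fraction of iterations spent in Model 1."""
    M, mu1, mu2 = 1, 0.0, 0.0
    visits1 = 0
    for _ in range(iters):
        if M == 1:
            mu1 = rng.normal(post_mean, post_sd)  # real full conditional
            mu2 = rng.normal(*pseudo2)            # pseudo-prior draw
        else:
            mu2 = rng.normal(post_mean, post_sd)
            mu1 = rng.normal(*pseudo1)
        # Full conditional for the indicator: likelihood x prior for the
        # "live" parameter, pseudo-prior density for the other one.
        lp1 = (np.sum(log_norm(y, mu1, 1.0)) + log_norm(mu1, 0.0, 1.0)
               + log_norm(mu2, *pseudo2))
        lp2 = (np.sum(log_norm(y, mu2, 1.0)) + log_norm(mu2, 0.0, 1.0)
               + log_norm(mu1, *pseudo1))
        p1 = 1.0 / (1.0 + np.exp(np.clip(lp2 - lp1, -700, 700)))
        M = 1 if rng.random() < p1 else 2
        visits1 += (M == 1)
    return visits1 / iters

# "Good" pseudo-priors: both match the posterior of the off-model parameter.
good = run_chain(pseudo1=(post_mean, post_sd), pseudo2=(post_mean, post_sd))
# "Jane's" chain: Model 1's pseudo-prior sits far from the posterior
# (moderately far here, so the demo still switches in a finite run; an even
# worse choice just slows switching further without changing the target).
bad = run_chain(pseudo1=(2.0, 1.0), pseudo2=(post_mean, post_sd))
print(good, bad)
```

When I run this, both fractions come out close to 0.5; the badly placed pseudo-prior makes model switches rarer (in both directions), but does not appear to bias the long-run time spent in each model, which is exactly the claim I am asking about.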