
bayesian - Neural Nets with Pymc3

Problem description:

I am trying to use pymc3 to sample from the posterior over a set of single-hidden-layer neural networks, so that I can then convert the model to a hierarchical one, as in Radford M. Neal's thesis. My first model looks like this:

import numpy as np
import pymc3 as pm
import theano.tensor as T

def sample(nHiddenUnts, X, Y):
    nFeatures = X.shape[1]
    with pm.Model() as model:
        # priors
        bho = pm.Normal('hiddenOutBias', mu=0, sd=100)
        who = pm.Normal('hiddenOutWeights', mu=0, sd=100, shape=(nHiddenUnts, 1))
        bih = pm.Normal('inputBias', mu=0, sd=100, shape=nHiddenUnts)
        wih = pm.Normal('inputWeights', mu=0, sd=100, shape=(nFeatures, nHiddenUnts))

        # network output: sigmoid hidden layer, linear output layer
        netOut = T.dot(T.nnet.sigmoid(T.dot(X, wih) + bih), who) + bho

        # likelihood
        likelihood = pm.Normal('likelihood', mu=netOut, sd=0.001, observed=Y)

        start = pm.find_MAP()
        step = pm.Metropolis()
        trace = pm.sample(100000, step, start, progressbar=True)
    return trace

In the second model, hyperpriors are added on the precisions of the noise and of the input-to-hidden and hidden-to-output weights and biases (e.g. bihTau is the precision of the input->hidden bias). The hyperprior parameters are chosen so that the priors are broad, and the variables are log-transformed.

# Gamma hyperpriors (precisions)
bhoTau, log_bhoTau = model.TransformedVar('bhoTau',
                                          pm.Gamma.dist(alpha=1, beta=1e-2, testval=1e-4),
                                          pm.logtransform)
WhoTau, log_WhoTau = model.TransformedVar('WhoTau',
                                          pm.Gamma.dist(alpha=1, beta=1e-2, testval=1e-4),
                                          pm.logtransform)
bihTau, log_bihTau = model.TransformedVar('bihTau',
                                          pm.Gamma.dist(alpha=1, beta=1e-2, testval=1e-4),
                                          pm.logtransform)
wihTau, log_wihTau = model.TransformedVar('wihTau',
                                          pm.Gamma.dist(alpha=1, beta=1e-2, testval=1e-4),
                                          pm.logtransform)
noiseTau, log_noiseTau = model.TransformedVar('noiseTau',
                                              pm.Gamma.dist(alpha=1, beta=1e-2, testval=1e+4),
                                              pm.logtransform)

# priors
bho = pm.Normal('hiddenOutBias', mu=0, tau=bhoTau)
who = pm.Normal('hiddenOutWeights', mu=0, tau=WhoTau, shape=(nHiddenUnts, 1))
bih = pm.Normal('inputBias', mu=0, tau=bihTau, shape=nHiddenUnts)
wih = pm.Normal('inputWeights', mu=0, tau=wihTau, shape=(nFeatures, nHiddenUnts))

...

start = pm.find_MAP()
step = pm.NUTS(scaling=start)

where bho, who, bih and wih are the biases and weights of the hidden-to-output and input-to-hidden layers.

To check my models, 3 to 5 sample points in [0, 1) were drawn from a one-dimensional toy function of the following form:

def g(x):
    return np.prod(x + np.sin(2 * np.pi * x), axis=1)
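For context, a minimal usage sketch of how such toy data could be fed to the first model (this is not from the original post; the number of points, the reshape of Y, and the choice of 10 hidden units are my own assumptions):

# Hypothetical usage sketch: draw a few 1-D points in [0, 1) from g
# and run the first (fixed-hyperparameter) model on them.
X = np.random.uniform(0.0, 1.0, size=(5, 1))   # 5 one-dimensional inputs in [0, 1)
Y = g(X)[:, None]                               # column vector to match netOut's (n, 1) shape
trace = sample(nHiddenUnts=10, X=X, Y=Y)        # 10 hidden units is an arbitrary choice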

The first model (constant hyperparameters) works fine, but when I sample from the joint posterior of the hyperparameters and parameters, i.e. replace the priors in the first listing above with the ones in the second, neither find_MAP() nor the sampling method converges regardless of the number of samples, and the resulting ANNs do not interpolate the sample points. I then tried to add the hyperpriors to my model one by one. The only one that could be added without problems was the noise precision; if I include any of the others, the sampler does not converge to the posterior. I tried this using one step method for all the model variables, and also a combination of two separate step methods over the parameters and the hyperparameters (as sketched below). In all cases and with different numbers of samples the problem persisted.
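For reference, a minimal sketch of the two step-method setups described above (not from the original post; it assumes the variables from the second listing are in scope, that the log-transformed precisions are the free variables, and the draw count of 5000 is arbitrary):

with model:
    start = pm.find_MAP()

    # (a) one step method over all model variables
    step_all = pm.NUTS(scaling=start)
    trace_all = pm.sample(5000, step_all, start, progressbar=True)

    # (b) two separate step methods: one over the network parameters,
    #     one over the (log-transformed) precision hyperparameters
    step_params = pm.NUTS(vars=[bho, who, bih, wih], scaling=start)
    step_hyper = pm.Metropolis(vars=[log_bhoTau, log_WhoTau, log_bihTau,
                                     log_wihTau, log_noiseTau])
    trace_split = pm.sample(5000, [step_params, step_hyper], start, progressbar=True)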
