Functions to set up optimisers (which find parameters that maximise the joint density of a model) and change their tuning parameters, for use in opt(). For details of the algorithms and how to tune them, see the SciPy optimiser docs or the TensorFlow optimiser docs.

nelder_mead()

powell()

cg()

bfgs()

newton_cg()

l_bfgs_b(maxcor = 10, maxls = 20)

tnc(max_cg_it = -1, stepmx = 0, rescale = -1)

cobyla(rhobeg = 1)

slsqp()

gradient_descent(learning_rate = 0.01)

adadelta(learning_rate = 0.001, rho = 1, epsilon = 1e-08)

adagrad(learning_rate = 0.8, initial_accumulator_value = 0.1)

adagrad_da(learning_rate = 0.8, global_step = 1L,
initial_gradient_squared_accumulator_value = 0.1,
l1_regularization_strength = 0, l2_regularization_strength = 0)

momentum(learning_rate = 0.001, momentum = 0.9, use_nesterov = TRUE)

adam(learning_rate = 0.1, beta1 = 0.9, beta2 = 0.999,
epsilon = 1e-08)

ftrl(learning_rate = 1, learning_rate_power = -0.5,
initial_accumulator_value = 0.1, l1_regularization_strength = 0,
l2_regularization_strength = 0)

proximal_gradient_descent(learning_rate = 0.01,
l1_regularization_strength = 0, l2_regularization_strength = 0)

proximal_adagrad(learning_rate = 1, initial_accumulator_value = 0.1,
l1_regularization_strength = 0, l2_regularization_strength = 0)

rms_prop(learning_rate = 0.1, decay = 0.9, momentum = 0,
epsilon = 1e-10)
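Tuning parameters are fixed when the optimiser object is constructed; the configured object is then handed to opt(). A minimal sketch (assuming a greta model `m`, as built in the Examples section below):

```r
# configure adam with a smaller step size than the default,
# then pass the optimiser object to opt()
res <- opt(m, optimiser = adam(learning_rate = 0.01))
```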

## Arguments

| Argument | Description |
|---|---|
| `maxcor` | maximum number of 'variable metric corrections' used to define the approximation to the hessian matrix |
| `maxls` | maximum number of line search steps per iteration |
| `max_cg_it` | maximum number of hessian * vector evaluations per iteration |
| `stepmx` | maximum step for the line search |
| `rescale` | log10 scaling factor used to trigger rescaling of the objective |
| `rhobeg` | reasonable initial changes to the variables |
| `learning_rate` | the size of steps (in parameter space) towards the optimal value |
| `rho` | the decay rate |
| `epsilon` | a small constant used to condition gradient updates |
| `initial_accumulator_value` | initial value of the 'accumulator' used to tune the algorithm |
| `global_step` | the current training step number |
| `initial_gradient_squared_accumulator_value` | initial value of the accumulators used to tune the algorithm |
| `l1_regularization_strength` | L1 regularisation coefficient (must be 0 or greater) |
| `l2_regularization_strength` | L2 regularisation coefficient (must be 0 or greater) |
| `momentum` | the momentum of the algorithm |
| `use_nesterov` | whether to use Nesterov momentum |
| `beta1` | exponential decay rate for the 1st moment estimates |
| `beta2` | exponential decay rate for the 2nd moment estimates |
| `learning_rate_power` | power on the learning rate, must be 0 or less |
| `decay` | discounting factor for the gradient |

## Value

an optimiser object that can be passed to opt().

## Details

cobyla() does not provide information about the number of iterations or about convergence, so these elements of the output are set to NA.
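Code that inspects these elements should therefore guard against NA when cobyla() is used. A sketch, assuming a greta model `m` and assuming the output list exposes elements named `iterations` and `convergence` (names not confirmed by this page):

```r
res <- opt(m, optimiser = cobyla())
# cobyla() reports neither iteration counts nor convergence status
if (is.na(res$convergence)) {
  message("convergence information not available for this optimiser")
}
```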

## Examples

# use optimisation to find the mean and sd of some data
x <- rnorm(100, -2, 1.2)
mu <- variable()
sd <- variable(lower = 0)
distribution(x) <- normal(mu, sd)
m <- model(mu, sd)

# configure optimisers & parameters via 'optimiser' argument to opt
opt_res <- opt(m, optimiser = bfgs())

# compare results with the analytic solution
opt_res$par
c(mean(x), sd(x))
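One caveat when comparing against the analytic solution: sd(x) uses the unbiased (n - 1) denominator, while the maximum-likelihood estimate that opt() converges to divides by n, so the two sd values differ slightly for finite samples. The maximum-likelihood analogue can be computed directly:

```r
# maximum-likelihood sd divides by n rather than n - 1
n <- length(x)
c(mean(x), sqrt((n - 1) / n) * sd(x))
```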