Resolved: Why are there extra equations when using extra neurons in GEKKO?

I'm a bit puzzled about why extra equations are introduced into the optimization problem (using GEKKO) when increasing the number of neurons in an ANN that is used, e.g., within the objective function or in the constraints. I was hoping to find the answer in this paper, but I can't seem to pinpoint the reason.
This is the log of a baseline example I made, using two Gekko_NN_SKlearn functions.
When I change the number of neurons of one function from [25,20,20,10] to [50,40,40,40], I get the following log:
Hence, a significant number of extra objects, variables, intermediates, connections, equations, and residuals are introduced.
Many thanks in advance for your replies!

Best Answer:

The paper discusses this briefly in Section 3.1.3.

The prediction functions used in both the TensorFlow and Scikit-learn neural networks use linear algebra to relate the layers and neurons of the network to one another. Each connection from a neuron in one layer to a neuron in the following layer has its own weight, and each neuron in the following layer also has a corresponding bias value, as shown in Figure 1.

Figure 1

Each connection contributes a weight, each neuron a bias, and each hidden neuron an activation equation. The activation function, such as the rectified linear unit (ReLU) or the hyperbolic tangent, introduces nonlinearity and bounds each neuron's output (tanh, for example, maps values into the interval (-1, 1)).
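As a concrete sketch of that linear-algebra step (the layer sizes and random values here are illustrative, not taken from the post), one dense layer computes a weighted sum plus bias for each output neuron and then applies the activation:

```python
import numpy as np

# One dense layer: every output neuron receives a weighted sum of the
# inputs plus its own bias, followed by an activation function.
# Sizes are made up for illustration.
rng = np.random.default_rng(0)
n_in, n_out = 3, 2

x = rng.normal(size=n_in)           # outputs of the previous layer
W = rng.normal(size=(n_out, n_in))  # one weight per connection
b = rng.normal(size=n_out)          # one bias per output neuron

z = W @ x + b                       # the linear-algebra step
a = np.tanh(z)                      # activation (hyperbolic tangent)

print(a.shape)  # (2,) -- one activation value per output neuron
```

In GEKKO, each of these weighted sums and each activation becomes an equation in the optimization problem, which is why the model size tracks the network size.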

As you increase the number of neurons from [25,20,20,10] to [50,40,40,40], new weights, biases, and activation equations are introduced for each new neuron and connection. These objects show up as the increase in the number of equations, variables, intermediates, etc. in the optimization problem.
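To see the scale of the growth, you can count the weights and biases of a fully connected network directly. The hidden-layer sizes below come from the question; the input and output sizes (2 features, 1 output) are assumed for illustration:

```python
# Count weights and biases in a fully connected network, to see why
# larger hidden layers mean many more variables/equations in GEKKO.
# Input size (2) and output size (1) are assumed here; the hidden
# layer sizes are the ones from the question.

def n_parameters(layer_sizes):
    """Weights (n_in * n_out per layer pair) plus biases (n_out)."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

small = [2, 25, 20, 20, 10, 1]   # hidden layers [25, 20, 20, 10]
large = [2, 50, 40, 40, 40, 1]   # hidden layers [50, 40, 40, 40]

print(n_parameters(small))  # 1236
print(n_parameters(large))  # 5511
```

Under these assumptions the parameter count grows more than fourfold, and each new parameter and neuron brings its corresponding variables and equations into the GEKKO model.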
Hopefully this helps!

If you have a better answer, please add a comment, thank you!