Question on new syntax to train GENN (JENN) models #545

Closed · bpaul4 opened this issue Apr 8, 2024 · 6 comments

bpaul4 commented Apr 8, 2024

I am updating an example that previously implemented the old GENN methods to be compatible with the new JENN dependency, and I am encountering issues with calling the training methods.

Originally, I built my training method as:

import numpy as np
from smt.utils.neural_net.model import Model

# reshape training data arrays
X = np.reshape(x_train, (n_x, n_m))
Y = np.reshape(z_train, (n_y, n_m))
J = np.reshape(grad_train, (n_y, n_x, n_m))

# set up and train model

# Train neural net
model = Model.initialize(X.shape[0], Y.shape[0], deep=2, wide=6)  # 2 hidden layers with 6 neurons each
model.train(
    X=X,  # input data
    Y=Y,  # output data
    J=J,  # gradient data
    num_iterations=num_iterations,  # number of optimizer iterations per mini-batch
    mini_batch_size=mini_batch_size,  # used to divide data into training batches (use for large data sets)
    num_epochs=num_epochs,  # number of passes through data
    alpha=alpha,  # learning rate that controls optimizer step size
    beta1=beta1,  # tuning parameter to control ADAM optimization
    beta2=beta2,  # tuning parameter to control ADAM optimization
    lambd=lambd,  # lambd = 0. = no regularization, lambd > 0 = regularization
    gamma=gamma,  # gamma = 0. = no grad-enhancement, gamma > 0 = grad-enhancement
    seed=None,  # set to value for reproducibility
    silent=True,  # set to True to suppress training output
)

I updated it to:

import numpy as np
from jenn.model import NeuralNet

# reshape training data arrays
X = np.reshape(x_train, (n_x, n_m))
Y = np.reshape(z_train, (n_y, n_m))
J = np.reshape(grad_train, (n_y, n_x, n_m))

# set up and train model

# Train neural net

hidden_layer_sizes = [6, 6]
model = NeuralNet([X.shape[0]] + hidden_layer_sizes + [Y.shape[0]])
model.parameters.initialize()
model.fit(
    x=X,  # input data
    y=Y,  # output data
    dydx=J,  # gradient data
    is_normalize=is_normalize,
    alpha=alpha,  # learning rate that controls optimizer step size
    lambd=lambd,  # lambd = 0. = no regularization, lambd > 0 = regularization
    gamma=gamma,  # gamma = 0. = no grad-enhancement, gamma > 0 = grad-enhancement
    beta1=beta1,  # tuning parameter to control ADAM optimization
    beta2=beta2,  # tuning parameter to control ADAM optimization
    epochs=epochs,  # number of passes through data
    batch_size=batch_size,  # used to divide data into training batches (use for large data sets)
    max_iter=max_iter,  # number of optimizer iterations per mini-batch
    shuffle=False,
    random_state=None,
    is_backtracking=False,
    is_verbose=True,
)

but encountered an error in model.fit():

Traceback (most recent call last):
  File "C:\[...]\jenn\model.py", line 141, in fit
    self.history = train_model(
  File "C:\[...]\jenn\core\training.py", line 121, in train_model
    batches = data.mini_batches(batch_size, shuffle, random_state)
  File "C:\[...]\jenn\core\data.py", line 229, in mini_batches
    batches = mini_batches(X, batch_size, shuffle, random_state)
  File "C:\[...]\jenn\core\data.py", line 51, in mini_batches
    if mini_batch:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Is this the correct way to call the new interface, or is there an SMT module that may be utilized instead?

Paul-Saves (Contributor) commented:

@shb84 do you know the problem?

shb84 (Contributor) commented Apr 8, 2024

It does indeed look like a bug on the JENN side. Thank you for finding it. It will be fixed in the next release.

When shuffle is False and there is more than one mini-batch, the indices are generated with numpy.arange, which is what yields the error shown. When shuffle=True, the indices are correctly cast as a list suitable for logical comparison. The easy fix, for now, is to set shuffle=True.
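
For example, a minimal sketch of the workaround, reusing the fit() call from the original post (only the shuffle argument changes; all other keyword arguments stay as they were):

# Workaround until the fix is released: shuffle=True makes the mini-batch
# indices a list, avoiding the ambiguous truth-value check on a numpy array.
model.fit(
    x=X,  # input data
    y=Y,  # output data
    dydx=J,  # gradient data
    shuffle=True,  # workaround for the mini_batches() bug described above
    # ... remaining keyword arguments unchanged from the original call
)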

To answer the other question, the API shown in the user's example is actually that of the upstream JENN library. There is a separate SMT API (which simply maps to the JENN API under the hood). Here is an example from the SMT docs:

import numpy as np
import matplotlib.pyplot as plt

from smt.surrogate_models import GENN

# Test function
def f(x):
    import numpy as np  # need to repeat for sphinx_auto_embed
    return x * np.sin(x)

def df_dx(x):
    import numpy as np  # need to repeat for sphinx_auto_embed
    return np.sin(x) + x * np.cos(x)

# Domain
lb = -np.pi
ub = np.pi

# Training data
m = 4
xt = np.linspace(lb, ub, m)
yt = f(xt)
dyt_dxt = df_dx(xt)

# Validation data
xv = lb + np.random.rand(30, 1) * (ub - lb)
yv = f(xv)
dyv_dxv = df_dx(xv)

# Instantiate
genn = GENN()

# Likely the only options a user will interact with
genn.options["hidden_layer_sizes"] = [6, 6]
genn.options["alpha"] = 0.1
genn.options["lambd"] = 0.1
genn.options["gamma"] = 1.0  # 1 = gradient-enhanced on, 0 = gradient-enhanced off
genn.options["num_iterations"] = 500
genn.options["is_backtracking"] = True

# Train
genn.load_data(xt, yt, dyt_dxt)
genn.train()

# Plot comparison
if genn.options["gamma"] == 1.0:
    title = "with gradient enhancement"
else:
    title = "without gradient enhancement"
x = np.arange(lb, ub, 0.01)
y = f(x)
y_pred = genn.predict_values(x)
fig, ax = plt.subplots()
ax.plot(x, y_pred)
ax.plot(x, y, "k--")
ax.plot(xv, yv, "ro")
ax.plot(xt, yt, "k+", mew=3, ms=10)
ax.set(xlabel="x", ylabel="y", title=title)
ax.legend(["Predicted", "True", "Test", "Train"])
plt.show()

bpaul4 (Author) commented Apr 9, 2024

@shb84 thank you for your answer and the example. In my original setup with the old (non-JENN) implementation, I had difficulty using the SMT load_data method to load a dataset with 6 inputs, 2 outputs, and a derivative set containing partials for both outputs: X is (6, 102), Y is (2, 102), and J is (2, 6, 102). The method seemed to expect J to have shape (n_x, n_m).

What is the proper way to have the SMT GENN training consider both sets of derivatives (dy1/dx and dy2/dx for all x)?

shb84 (Contributor) commented Apr 9, 2024

For all SMT work, the correct format to use is the one found in the SMT docs.

For clarity, JENN is a separate library that does indeed use a different data format, but you can ignore that as an SMT user. The updated GENN module expects SMT-formatted data only. Refer to the docs for what that format is.

Under the hood, the load_data method is implemented as follows:

def load_data(self, xt, yt, dyt_dxt=None):
    """Load all training data into surrogate model in one step.

    :param model: SurrogateModel object for which to load training data
    :param xt: smt data points at which response is evaluated
    :param yt: response at xt
    :param dyt_dxt: gradient at xt
    """
    m, n_x = (xt.size, 1) if xt.ndim <= 1 else xt.shape
    m, n_y = (yt.size, 1) if yt.ndim <= 1 else yt.shape

    # Reshape arrays
    xt = xt.reshape((m, n_x))
    yt = yt.reshape((m, n_y))

    # Load values
    self.set_training_values(xt, yt)

    # Load partials
    if dyt_dxt is not None:
        dyt_dxt = dyt_dxt.reshape((m, n_x))
        for i in range(n_x):
            self.set_training_derivatives(xt, dyt_dxt[:, i].reshape((m, 1)), i)

It is simply a convenience method that feeds the SMT API methods set_training_values and set_training_derivatives in one step. Hence, you can always fall back on those methods to load your data if you keep having trouble.
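
For instance, here is a minimal sketch, not from the thread, of falling back on those two methods directly for the multi-output case described above (6 inputs, 2 outputs, 102 samples). It assumes the GENN surrogate accepts multi-output training data; the transposes convert from the (n_x, n_m) layout in the question to SMT's samples-first (n_m, n_x) layout:

import numpy as np
from smt.surrogate_models import GENN

# Hypothetical placeholder data in the JENN-style layout from the question,
# with samples along the last axis.
n_x, n_y, n_m = 6, 2, 102
X = np.random.rand(n_x, n_m)       # inputs, shape (n_x, n_m)
Y = np.random.rand(n_y, n_m)       # outputs, shape (n_y, n_m)
J = np.random.rand(n_y, n_x, n_m)  # partials dY[j]/dX[i], shape (n_y, n_x, n_m)

genn = GENN()

# SMT expects samples along the first axis: xt is (n_m, n_x), yt is (n_m, n_y).
genn.set_training_values(X.T, Y.T)

# For each input i, pass the (n_m, n_y) array of partials dY/dX[i] as the
# training derivatives with respect to the i-th input.
for i in range(n_x):
    genn.set_training_derivatives(X.T, J[:, i, :].T, i)

genn.train()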

Please let me know if you still have issues. Better yet, if you can provide a self-contained example with the data that generates the issue, I'd be happy to get it running.

shb84 (Contributor) commented Apr 15, 2024

@bpaul4 Just circling back to see if these answers had resolved your issue or if you still needed help.

relf added the question label on Apr 18, 2024

bpaul4 (Author) commented Apr 19, 2024

Hi @shb84, thank you for your help. I am studying the documentation and working through my example, and I will let you know if I have further questions. I believe this issue may be closed for now.

relf closed this as completed on Apr 20, 2024