I've started training a MACE machine-learning potential for studying the CuZr system on a machine with an A100 SXM4 GPU. In the first hour, 8 of 200 epochs completed, so the full training run should take roughly 26 hours. GPU utilization is at 97%.
This is the output I'm currently getting over ssh:
2026-04-03 07:31:07.499 INFO: Started training, reporting errors on validation set
2026-04-03 07:31:07.499 INFO: Loss metrics on validation set
2026-04-03 07:31:30.769 INFO: Initial: head: Default, loss=46.37037277, RMSE_E_per_atom= 873.34 meV, RMSE_F= 1930.13 meV / A
2026-04-03 07:39:21.923 INFO: Epoch 0: head: Default, loss=0.39649794, RMSE_E_per_atom= 366.33 meV, RMSE_F= 184.76 meV / A
2026-04-03 07:47:07.410 INFO: Epoch 1: head: Default, loss=0.30796188, RMSE_E_per_atom= 369.49 meV, RMSE_F= 162.78 meV / A
2026-04-03 07:54:58.245 INFO: Epoch 2: head: Default, loss=0.27049792, RMSE_E_per_atom= 362.55 meV, RMSE_F= 152.84 meV / A
2026-04-03 08:02:38.333 INFO: Epoch 3: head: Default, loss=0.24780937, RMSE_E_per_atom= 358.34 meV, RMSE_F= 146.50 meV / A
2026-04-03 08:10:27.889 INFO: Epoch 4: head: Default, loss=0.22986519, RMSE_E_per_atom= 352.11 meV, RMSE_F= 141.31 meV / A
2026-04-03 08:18:17.769 INFO: Epoch 5: head: Default, loss=0.21684781, RMSE_E_per_atom= 346.24 meV, RMSE_F= 137.28 meV / A
2026-04-03 08:25:58.736 INFO: Epoch 6: head: Default, loss=0.20675728, RMSE_E_per_atom= 342.44 meV, RMSE_F= 134.09 meV / A
2026-04-03 08:33:38.457 INFO: Epoch 7: head: Default, loss=0.19868423, RMSE_E_per_atom= 339.41 meV, RMSE_F= 131.45 meV / A
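
To sanity-check the 26-hour estimate, here is a minimal Python sketch that reads the epoch timestamps from the log above and extrapolates to 200 epochs. It assumes the time per epoch stays roughly constant for the rest of the run and that epochs are numbered 0 to 199; the names `EPOCH_RE`, `epoch_times`, and `estimate` are made up for this example and are not part of MACE.

```python
import re
from datetime import datetime

# Epoch lines copied from the log above.
LOG = """\
2026-04-03 07:39:21.923 INFO: Epoch 0: head: Default, loss=0.39649794, RMSE_E_per_atom= 366.33 meV, RMSE_F= 184.76 meV / A
2026-04-03 07:47:07.410 INFO: Epoch 1: head: Default, loss=0.30796188, RMSE_E_per_atom= 369.49 meV, RMSE_F= 162.78 meV / A
2026-04-03 07:54:58.245 INFO: Epoch 2: head: Default, loss=0.27049792, RMSE_E_per_atom= 362.55 meV, RMSE_F= 152.84 meV / A
2026-04-03 08:02:38.333 INFO: Epoch 3: head: Default, loss=0.24780937, RMSE_E_per_atom= 358.34 meV, RMSE_F= 146.50 meV / A
2026-04-03 08:10:27.889 INFO: Epoch 4: head: Default, loss=0.22986519, RMSE_E_per_atom= 352.11 meV, RMSE_F= 141.31 meV / A
2026-04-03 08:18:17.769 INFO: Epoch 5: head: Default, loss=0.21684781, RMSE_E_per_atom= 346.24 meV, RMSE_F= 137.28 meV / A
2026-04-03 08:25:58.736 INFO: Epoch 6: head: Default, loss=0.20675728, RMSE_E_per_atom= 342.44 meV, RMSE_F= 134.09 meV / A
2026-04-03 08:33:38.457 INFO: Epoch 7: head: Default, loss=0.19868423, RMSE_E_per_atom= 339.41 meV, RMSE_F= 131.45 meV / A
"""

# Hypothetical helper pattern: timestamp followed by "INFO: Epoch N:".
EPOCH_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) INFO: Epoch (\d+):"
)


def epoch_times(log_text):
    """Map epoch index -> timestamp at which that epoch was reported."""
    times = {}
    for line in log_text.splitlines():
        match = EPOCH_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            times[int(match.group(2))] = stamp
    return times


def estimate(log_text, total_epochs):
    """Return (mean seconds per epoch, seconds left after the last logged epoch)."""
    times = epoch_times(log_text)
    first, last = min(times), max(times)
    # Average over the intervals between the first and last logged epochs.
    sec_per_epoch = (times[last] - times[first]).total_seconds() / (last - first)
    # Assumption: epochs are numbered 0 .. total_epochs - 1.
    remaining = (total_epochs - 1 - last) * sec_per_epoch
    return sec_per_epoch, remaining


sec_per_epoch, remaining = estimate(LOG, total_epochs=200)
print(f"{sec_per_epoch / 60:.2f} min per epoch")
print(f"{total_hours:.1f} h for all 200 epochs" if (total_hours := 200 * sec_per_epoch / 3600) else "")
print(f"{remaining / 3600:.1f} h remaining after epoch 7")
```

With these timestamps the average comes out a little under 8 minutes per epoch, which is consistent with the rough figure of 26 hours for 200 epochs. The timestamps include the validation pass after each epoch, so the estimate covers that overhead as well.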