A Basic Training Example Using ggml
I just want to share what I have been working on recently. This is an example of training an MNIST VAE, with the goal of using only the ggml pipeline and its implementation of the Adam optimizer.
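To make that concrete, here is a minimal toy sketch of what handing a loss to ggml's built-in Adam looks like: mark the trainable tensors with `ggml_set_param`, build a scalar loss, and pass it to `ggml_opt`. This only illustrates the API shape (names such as `GGML_OPT_ADAM` have been renamed in newer ggml versions) and is not the VAE code itself.

```c
// Toy sketch: fit a single scalar with ggml's built-in Adam optimizer.
// Assumes the classic ggml_opt API; enum/field names differ across versions
// (e.g. GGML_OPT_ADAM was later renamed GGML_OPT_TYPE_ADAM).
#include "ggml.h"
#include <stdio.h>

int main(void) {
    struct ggml_init_params ip = {
        .mem_size   = 16*1024*1024,  // scratch memory for tensors and graphs
        .mem_buffer = NULL,
        .no_alloc   = false,
    };
    struct ggml_context * ctx = ggml_init(ip);

    // trainable parameter: marked with ggml_set_param so ggml keeps its gradient
    struct ggml_tensor * a = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1);
    ggml_set_f32(a, 0.0f);
    ggml_set_param(ctx, a);

    // fixed input and target
    struct ggml_tensor * x = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1);
    struct ggml_tensor * b = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1);
    ggml_set_f32(x, 2.0f);
    ggml_set_f32(b, 6.0f);

    // scalar loss: (x*a - b)^2
    struct ggml_tensor * diff = ggml_sub(ctx, ggml_mul(ctx, x, a), b);
    struct ggml_tensor * loss = ggml_sum(ctx, ggml_sqr(ctx, diff));

    // ggml_opt builds the forward/backward graphs and runs Adam internally
    struct ggml_opt_params params = ggml_opt_default_params(GGML_OPT_ADAM);
    params.adam.n_iter = 100;
    ggml_opt(ctx, params, loss);

    printf("a = %f (expected ~3)\n", ggml_get_f32_1d(a, 0));
    ggml_free(ctx);
    return 0;
}
```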
There aren’t many training examples using ggml. The only one I found is baby-llama, but I don’t think its way of doing optimization is quite right. I did find another training example in llama.cpp that shows a proper way of using Adam.
Some of the modifications I had to make (a rough sketch of how they fit together follows this list):
- Reuse the same forward and backward graphs during training
- Changes to the Adam and L-BFGS optimizers so the GPU backend works
- Several missing ops added to both the CPU and CUDA backends
- Hooks (callbacks) added to the optimizer to run tests and generate samples during training
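For context, the graph-reuse and callback pieces roughly fit together as in the sketch below: build the forward and backward graphs once, keep one persistent optimizer state, and resume Adam on the same graphs for every batch. This is only a rough sketch assuming the `ggml_opt_*` API as it appears in recent ggml/llama.cpp; exact signatures (especially the callback arguments and `ggml_build_backward_expand`) differ between ggml versions, and `build_vae_loss` / `load_batch` are hypothetical helpers standing in for the real model and data code.

```c
// Rough sketch of reusing one forward/backward graph pair across training steps,
// with an optimizer callback. Assumes the ggml_opt_* API from recent
// ggml/llama.cpp; exact signatures vary between versions.
#include "ggml.h"
#include <stddef.h>
#include <stdint.h>

// hypothetical helpers: build the scalar VAE loss and fill the input with a batch
extern struct ggml_tensor * build_vae_loss(struct ggml_context * ctx, struct ggml_tensor * input);
extern void load_batch(struct ggml_tensor * input, int batch);

// optimizer hook: called every iteration; a convenient place to log the loss,
// run a test pass, or sample a few digits from the decoder
static void on_iteration(void * data, int accum_step, float * sched, bool * cancel) {
    (void) data; (void) accum_step; (void) sched;
    *cancel = false;
}

void train(struct ggml_context * ctx, struct ggml_tensor * input,
           int64_t n_params, int n_epochs, int n_batches) {
    struct ggml_tensor * loss = build_vae_loss(ctx, input);

    // build the forward and backward graphs ONCE ...
    struct ggml_cgraph * gf = ggml_new_graph_custom(ctx, GGML_DEFAULT_GRAPH_SIZE, true);
    ggml_build_forward_expand(gf, loss);
    struct ggml_cgraph * gb = ggml_graph_dup(ctx, gf);
    ggml_build_backward_expand(ctx, gf, gb, true);

    // ... and one persistent Adam state (moments, iteration count)
    struct ggml_opt_context opt;
    struct ggml_opt_params params = ggml_opt_default_params(GGML_OPT_ADAM);
    params.adam.n_iter = 1;                      // one Adam step per resume call
    ggml_opt_init(ctx, &opt, params, n_params);  // n_params = total trainable elements

    for (int epoch = 0; epoch < n_epochs; ++epoch) {
        for (int batch = 0; batch < n_batches; ++batch) {
            load_batch(input, batch);  // overwrite the input tensor in place
            // reuse the same gf/gb every step instead of rebuilding the graphs
            ggml_opt_resume_g(ctx, &opt, loss, gf, gb, on_iteration, NULL);
        }
    }
}
```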
Below are some samples from the VAE trained on MNIST after each epoch (10 epochs in total).