Day 21

Last Friday my advisors showed me how to connect to the RIT servers from home, so I worked on my project over the weekend. After taking my training code apart and putting it back together, I finally figured out that the metric calculations were causing the accumulating memory usage. After fixing it I was able to run the code once on the simplified model (though I didn't get any graphs from it because of an error), but now that I'm sure it's working I'm going to train the original model.
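For anyone curious, here's a minimal sketch of the kind of leak I mean, assuming PyTorch (the names here are placeholders, not my actual project code): accumulating a loss or metric tensor directly keeps each batch's autograd graph alive, so memory grows every iteration, while taking a plain number with `.item()` doesn't.

```python
import torch
from torch import nn

# Toy stand-ins for the real model, loss, and optimizer (all hypothetical).
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

running_loss = 0.0
for step in range(30):
    x = torch.randn(16, 10)
    y = torch.randint(0, 2, (16,))
    loss = criterion(model(x), y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Leaky version: `running_loss += loss` holds a reference to the
    # whole computation graph for every batch, so memory keeps climbing.
    # Fixed version: .item() (or .detach()) returns just the value.
    running_loss += loss.item()

print(running_loss / 30)
```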

The system administrator also taught me how to use a screen session so I can detach from the terminal and still have the code running in the background (in case the computer loses connection, which happened twice at home).
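For my own reference, the basic workflow looks something like this (the session and script names are just examples):

```
screen -S training      # start a named session
python train.py         # launch the run inside it (script name is a placeholder)
# press Ctrl-A, then d, to detach; the run keeps going on the server
screen -ls              # list running sessions
screen -r training      # reattach later, even after a dropped connection
```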

Right now I'm training three models at the same time. It's slower (it's been almost 7 hours; the one with the simplified model has gone through around 15/30 epochs and the ones with the original model have gone through around 10/30 epochs), but I think it'll still take less time overall than training them one at a time. The first is the unpaired datasets with the original model, the second is the paired datasets with the original model, and the third is the unpaired datasets with the simplified model (because I'm curious). I'll start testing them after they're done, and potentially also train the simplified model with the paired data if I have time.

I also finished the testing code (I think). I ended up changing the structure quite a bit. I was having issues turning the outputs from the model into images that could be displayed. At first I tried to convert the tensors into PIL images but couldn't figure out how to display them, so instead I converted the tensors into numpy arrays and used Matplotlib (it worked). I wasn't running it with the actual updated model parameters, so the printed outputs looked pretty bad. Hopefully my actual results won't...
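The tensor-to-numpy route looks roughly like this. This is a minimal sketch, not my exact testing code: it assumes a PyTorch CHW image tensor, and the [-1, 1] de-normalization step is an assumption that depends on how the data was preprocessed.

```python
import torch
import matplotlib.pyplot as plt

def show_output(tensor):
    img = tensor.detach().cpu()           # drop the autograd graph, move to CPU
    img = ((img + 1) / 2).clamp(0, 1)     # assumed: undo [-1, 1] normalization
    array = img.permute(1, 2, 0).numpy()  # CHW -> HWC, the layout Matplotlib expects
    plt.imshow(array)
    plt.axis("off")
    plt.show()

# Usage with a random stand-in for a model output:
show_output(torch.rand(3, 64, 64) * 2 - 1)
```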
