
Saving changes to read

Jason Antic committed 6 years ago
commit 56c0e2f24e
1 changed file with 20 additions and 6 deletions

README.md (+20 -6)

@@ -101,11 +101,7 @@ Except the generator is a **pretrained U-Net**, and I've just modified it to hav
 This is also very straightforward – it's just one-to-one generator/critic iterations and a higher critic learning rate. This schedule is modified to incorporate a "threshold" critic loss that makes sure the critic is "caught up" before moving on to generator training.  This is particularly useful for the "NoGAN" method described below.
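
To make the "threshold" idea concrete, here's a minimal sketch of what that scheduling logic can look like. This illustrates the idea only, not DeOldify's actual trainer; every name and value in it is a placeholder:

```python
# Sketch: keep taking critic steps until its loss is below a threshold,
# so the critic is "caught up" before each generator step.
def train_epoch(gen_step, critic_step, batches,
                crit_thresh=0.5, max_crit_iters=10):
    for batch in batches:
        # Critic first: iterate (up to a cap) until its loss clears the bar.
        for _ in range(max_crit_iters):
            if critic_step(batch) < crit_thresh:
                break
        # Then a single generator iteration (one-to-one by default).
        gen_step(batch)
```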
 
 #### **NoGAN**
-There's no paper here! This is a new type of GAN training that I've developed to solve some key problems in the previous DeOldify model. The gist is that you get the benefits of GAN training while spending minimal time doing direct GAN training. During this very short amount of GAN training the generator not only gets the full realistic colorization capabilities that used to take days of progressively resized GAN training, but it also doesn't accrue any of the artifacts and other ugly baggage of GANs. As far as I know this is a new technique. And it's incredibly effective. 
-
-The steps are as follows: First train the generator in a conventional way by itself with just the feature loss. Then you generate images from that, and train the critic on distinguishing between those outputs and real images as a basic binary classifier. Finally, you train the generator and critic together in a GAN setting (starting right at the target size of 192px in this case).  You do this only until the critic loss levels out (which is usually within 1%-5% of training data), and then you repeat the cycle starting from generating generator images again, finetuning the critic on those, and GAN training in the same manner.  This is repeated until there's no longer a noticeable benefit (about 4 to 7 of these cycles based on experience). You'll notice the critic loss won't do the usual "dip" at the beginning of training by this point.  This approach requires very little actual GAN training- only 5%-40% of the Imagenet dataset is iterated through in total!  More data is required for larger models.  I attribute the limiting of GAN training to greatly reduced artifacts and errors in the end result.
-
-This builds upon a technique developed in collaboration with Jeremy Howard and Sylvain Gugger for Fast.AI's Lesson 7 in version 3 of Practical Deep Learning for Coders Part I. The particular lesson notebook can be found here: https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson7-superres-gan.ipynb  
+There's no paper here! This is a new type of GAN training that I've developed to solve some key problems in the previous DeOldify model. The gist is that you get the benefits of GAN training while spending minimal time doing direct GAN training.  More details are at the bottom of the readme (it's a doozy).
 
 #### **Generator Loss**
 Loss during NoGAN learning has two parts:  One is a basic Perceptual Loss (or Feature Loss) based on VGG16 – this just biases the generator model to replicate the input image.  The second is the loss score from the critic.  For the curious – Perceptual Loss isn't sufficient by itself to produce good results.  It tends to just encourage a bunch of brown/green/blue – you know, cheating on the test, basically, which neural networks are really good at doing!  The key thing to realize here is that GANs are essentially learning the loss function for you – which is really one big step closer to the ideal that we're shooting for in machine learning.  And of course you generally get much better results when you get the machine to learn something you were previously hand coding.  That's certainly the case here.
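
For the curious, here is a rough PyTorch sketch of what a two-part loss like this can look like. The VGG16 layer cut, the 1:1 weighting, and the omitted input normalization are my assumptions for illustration, not DeOldify's exact choices:

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Frozen VGG16 feature extractor for the perceptual (feature) loss.
vgg = models.vgg16(pretrained=True).features[:16].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def generator_loss(fake, real, critic, critic_weight=1.0):
    # Part 1: perceptual loss biases the generator toward replicating the image.
    perceptual = F.l1_loss(vgg(fake), vgg(real))
    # Part 2: the critic's score on the fakes – the "learned" loss function.
    adversarial = F.binary_cross_entropy_with_logits(
        critic(fake), torch.ones(fake.size(0), 1))
    return perceptual + critic_weight * adversarial
```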
@@ -158,7 +154,9 @@ jupyter lab
 
 From there you can start running the notebooks in Jupyter Lab, via the URL it provides in the console.  
 
-#### More Details for Those So Inclined
+
+--------------------------
+#### Installation Details
 
 This project is built around the wonderful Fast.AI library.  Prereqs, in summary:
 * **Fast.AI 1.0.46** (and its dependencies)
@@ -166,7 +164,9 @@ This project is built around the wonderful Fast.AI library.  Prereqs, in summary
 * **Tensorboard** (i.e. install Tensorflow) and **TensorboardX** (https://github.com/lanpa/tensorboardX).  I guess you don't *have* to but man, life is so much better with it.  Fast.AI now comes with built-in support for this – you just need to install the prereqs: `conda install -c anaconda tensorflow-gpu` and `pip install tensorboardX`
 * **ImageNet** – Only if you're training, of course. It has proven to be a great dataset for my purposes.  http://www.image-net.org/download-images
 
+--------------------------
 #### Pretrained Weights 
+
 To start right away on your own machine with your own images or videos without training the models yourself, you'll need to download the weights and drop them in the /models/ folder.
 
 [Download image weights here](https://www.dropbox.com/s/3e4dqky91h824ik/ColorizeImages_gen.pth)
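
If you'd rather script the download, something like this (standard library only; the `?dl=1` suffix is just Dropbox's direct-download convention) drops the file where the notebooks expect it:

```python
import os
import urllib.request

os.makedirs('models', exist_ok=True)
url = 'https://www.dropbox.com/s/3e4dqky91h824ik/ColorizeImages_gen.pth?dl=1'
urllib.request.urlretrieve(url, 'models/ColorizeImages_gen.pth')
```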
@@ -180,6 +180,20 @@ And you can do video colorization in this notebook:  [VideoColorizer.ipynb](Vide
 
 The notebooks should be able to guide you from here.
 
+-------------------------
+#### **What is NoGAN???**
+This is a new type of GAN training that I've developed to solve some key problems in the previous DeOldify model. The gist is that you get the benefits of GAN training while spending minimal time doing direct GAN training.  Instead, most of the training time is spent pretraining the generator and critic separately with more straightforward, fast, and reliable conventional methods. During the very short window of actual GAN training, the generator not only gains the full realistic colorization capabilities that used to take days of progressively resized GAN training, but it also doesn't accrue nearly as much of the artifacts and other ugly baggage of GANs. In fact, depending on your approach, you can pretty much eliminate glitches and artifacts entirely. As far as I know this is a new technique. And it's incredibly effective. 
+
+The steps are as follows: First, train the generator in a conventional way by itself with just the feature loss. Next, generate images from that generator and train the critic to distinguish between those outputs and real images as a basic binary classifier. Finally, train the generator and critic together in a GAN setting (starting right at the target size of 192px in this case).  Now for the weird part:  All the useful GAN training here takes place within a very small window of time.  There's an inflection point where it appears the critic has transferred everything useful to the generator. Past this point, image quality oscillates between the best you can get at the inflection point and predictably bad output (orangish skin, overly red lips, etc).  There appears to be no productive training after this point.  And this point lies within training on just 1% to 3% of the ImageNet data!  That amounts to about 30-60 minutes of training at 192px.  
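
To make the three phases concrete, here is a minimal PyTorch sketch of the schedule. The toy models, random stand-in data, and L1 "feature loss" are all illustrative placeholders – this shows the shape of NoGAN training, not DeOldify's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

generator = nn.Sequential(nn.Conv2d(1, 3, 3, padding=1))  # stand-in for the U-Net
critic = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1),
                       nn.AdaptiveAvgPool2d(1), nn.Flatten())
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
c_opt = torch.optim.Adam(critic.parameters(), lr=2e-4)  # higher critic LR

def batches(n=8):  # fake grayscale/color pairs standing in for ImageNet
    for _ in range(n):
        yield torch.rand(4, 1, 32, 32), torch.rand(4, 3, 32, 32)

def feature_loss(fake, real):  # stand-in for the VGG16 perceptual loss
    return F.l1_loss(fake, real)

def critic_loss(fake, real):  # basic binary classifier objective
    return (F.binary_cross_entropy_with_logits(critic(fake), torch.zeros(4, 1)) +
            F.binary_cross_entropy_with_logits(critic(real), torch.ones(4, 1)))

# Phase 1: pretrain the generator alone, conventionally, with just feature loss.
for gray, color in batches():
    g_opt.zero_grad()
    feature_loss(generator(gray), color).backward()
    g_opt.step()

# Phase 2: pretrain the critic as a plain binary classifier on
# generator outputs vs. real images.
for gray, color in batches():
    c_opt.zero_grad()
    critic_loss(generator(gray).detach(), color).backward()
    c_opt.step()

# Phase 3: brief GAN training with one-to-one generator/critic iterations,
# stopped early (in practice: at the inflection point, 1%-3% of ImageNet).
for gray, color in batches():
    c_opt.zero_grad()
    critic_loss(generator(gray).detach(), color).backward()
    c_opt.step()

    g_opt.zero_grad()
    fake = generator(gray)
    g_loss = feature_loss(fake, color) + F.binary_cross_entropy_with_logits(
        critic(fake), torch.ones(4, 1))
    g_loss.backward()
    g_opt.step()
```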
+
+The hard part is finding this inflection point.  So far, I've accomplished this by making a whole bunch of model save checkpoints (every 0.1% of the data iterated on) and then just looking for the point where images look great before they go totally bonkers with orange skin (always the first thing to go). What I'd really like to figure out is the tell-tale sign of the inflection point.  Unfortunately, nothing definitive is jumping out at me yet.  For one, it happens while the training loss is still decreasing – not when it flattens out, which would seem more reasonable on the surface.  If there were an easy number or formula that detects this inflection point, we could stop early at exactly that point and reliably end up with a very good generator. 
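
Continuing the sketch above, the checkpointing itself is trivial – the point is the cadence (one save per ~0.1% of the data) so the inflection point can be found by eye afterwards. Counts and paths here are illustrative:

```python
import os
import torch

os.makedirs('checkpoints', exist_ok=True)
n_batches = 1000                         # pretend this is one pass over the data
save_every = max(1, n_batches // 1000)   # one checkpoint per ~0.1% of the data
for i, (gray, color) in enumerate(batches(n_batches)):
    # ... one one-to-one generator/critic GAN iteration, as in phase 3 above ...
    if i % save_every == 0:
        torch.save(generator.state_dict(), f'checkpoints/gen_{i:06d}.pth')
```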
+
+Another key thing about NoGAN training is that you can repeat pretraining the critic on generated images after the initial GAN training, then repeat the GAN training itself in the same fashion.  This is how I was able to get the extra colorful results with the "artistic" model.  But this does come at a cost currently – the output of the generator becomes increasingly inconsistent, and you have to experiment with render resolution (render_factor) to get the best result.  But the renders are still glitch-free and way more consistent than I was ever able to achieve with the original DeOldify model. You can do about five of these repeat cycles, give or take, before you get diminishing returns, as far as I can tell.  
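
As a purely hypothetical sketch of that render_factor experimentation – the `colorize` helper below is a placeholder, not the project's confirmed API; the real entry point lives in the colorization notebooks:

```python
def colorize(path, render_factor):
    # Placeholder for the notebook's actual colorization call.
    print(f'would render {path} at render_factor={render_factor}')

# Sweep a range of render resolutions and compare the outputs by eye.
for rf in range(10, 45, 5):
    colorize('test_images/example.png', render_factor=rf)
```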
+
+Keep in mind – I haven't been entirely rigorous in figuring out what's going on in NoGAN; I'll save that for a paper. That means there's a good chance I'm wrong about something.  But I think it's definitely worth putting out there now, because I'm finding it very useful – it's solving most of the remaining problems I had in DeOldify.
+
+This builds upon a technique developed in collaboration with Jeremy Howard and Sylvain Gugger for Fast.AI's Lesson 7 in version 3 of Practical Deep Learning for Coders Part I. The particular lesson notebook can be found here: https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson7-superres-gan.ipynb  
+
 
 ### Want More?