
Deep learning for non-specialists now possible with transfer learning


CEA-List recently developed a method for selecting the existing neural network best suited to adaptation and reuse in a new target application. This advance makes it possible to train convolutional neural networks (CNNs) for classification without subject matter expertise or large datasets.

Published on 18 October 2022

With demand for artificial intelligence growing faster than ever, new uses for convolutional neural networks (CNNs) in classification are emerging every day. Unfortunately, training CNNs requires the input of subject matter experts, of whom there are simply not enough. Transfer learning offers a potential solution that could make it easier for non-specialists to develop new AI-powered solutions and fill the gap in cases where data is lacking. It works by adapting a neural network that has already been trained on a large dataset to a new target application. Specifically, a softmax function is applied at the last layer of the network to output a classification prediction, and the weights of that layer are learned for the new task.
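As an illustration of the final step described above, here is a minimal sketch of a softmax output layer applied to features from a pretrained network. The names `softmax` and `predict`, and the idea of passing the features in as a plain list, are assumptions made for this example; they are not taken from CEA-List's work.

```python
import math

def softmax(logits):
    """Turn raw final-layer scores into class probabilities."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(features, weights, biases):
    """Apply a linear final layer, then softmax, to frozen features.

    `features` stands in for the output of a pretrained network
    (hypothetical here); `weights` holds one row per target class.
    """
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs
```

In transfer learning, only `weights` and `biases` would need to be learned for the new task, while the network that produces `features` is reused as is.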

While transfer learning is widely used, it is also time-consuming. Multiple neural networks, usually taken from an open-source library, must each be adapted and then tested to determine the best model for the target use case. In short, each candidate network must learn the weights of its final layer for the classification problem at hand.
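The conventional selection procedure described above can be sketched as follows. This is a toy illustration of the time-consuming baseline, not of CEA-List's method: each candidate is represented by a hypothetical feature-extraction function, a softmax final layer is trained on its frozen features with plain gradient descent, and the candidate with the best validation accuracy is kept. All names, the synthetic data, and the hyperparameters are assumptions made for the example.

```python
import math
import random

def softmax(logits):
    shift = max(logits)  # numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_head(feats, labels, n_classes, epochs=200, lr=0.5):
    """Fit only the final softmax layer on frozen features (batch gradient descent)."""
    dim, n = len(feats[0]), len(feats)
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * dim for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for x, y in zip(feats, labels):
            logits = [sum(w * v for w, v in zip(W[k], x)) + b[k]
                      for k in range(n_classes)]
            probs = softmax(logits)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == y else 0.0)  # cross-entropy gradient
                gb[k] += err
                for j in range(dim):
                    gW[k][j] += err * x[j]
        for k in range(n_classes):
            b[k] -= lr * gb[k] / n
            for j in range(dim):
                W[k][j] -= lr * gW[k][j] / n
    return W, b

def accuracy(feats, labels, W, b):
    hits = 0
    for x, y in zip(feats, labels):
        logits = [sum(w * v for w, v in zip(row, x)) + bk
                  for row, bk in zip(W, b)]
        hits += int(logits.index(max(logits)) == y)
    return hits / len(labels)

def select_backbone(candidates, train, val, n_classes):
    """Train one head per candidate; return the best name and all scores."""
    (x_tr, y_tr), (x_va, y_va) = train, val
    scores = {}
    for name, extract in candidates.items():
        W, b = train_head([extract(x) for x in x_tr], y_tr, n_classes)
        scores[name] = accuracy([extract(x) for x in x_va], y_va, W, b)
    return max(scores, key=scores.get), scores

def make_toy_data(n, rng):
    """Two classes separated along the first coordinate; the second is noise."""
    xs, ys = [], []
    for i in range(n):
        y = i % 2
        center = 2.0 if y == 1 else -2.0
        xs.append((rng.gauss(center, 0.5), rng.gauss(0.0, 1.0)))
        ys.append(y)
    return xs, ys

rng = random.Random(0)
train, val = make_toy_data(80, rng), make_toy_data(40, rng)
# Hypothetical "pretrained networks": one keeps the useful coordinate, one drops it.
candidates = {
    "informative": lambda x: [x[0], x[1]],
    "uninformative": lambda x: [x[1]],
}
best, scores = select_backbone(candidates, train, val, n_classes=2)
print(best, scores)
```

The cost of this loop grows with the number of candidates, since every one of them needs its own training run before it can be compared with the others.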

CEA-List came up with a novel approach for determining how suitable a model is for a given target application, without the need to train the model's final layer. Instead, a theoretical analysis of the softmax layer's statistical behavior is performed using a set of parameters from the available data. It is a quick, easy way to get a general idea of the layer's characteristics. The researchers tested the approach on several use cases, where it proved effective at assessing how suitable a candidate network is for a given target application, saving a significant amount of time on the selection of the initial network. A patent has been filed on this new method.

In the short term, the goal is to scale the method up to larger datasets related to real-world business scenarios. Later, the researchers will apply it to problems other than classification, such as detection or segmentation.

