VGGNet
VGGNet is a family of deep convolutional neural networks developed by the Visual Geometry Group at the University of Oxford. The most common models are VGG-16 and VGG-19, named for their number of weight layers: VGG-16 has 13 convolutional layers and 3 fully connected layers, while VGG-19 has 16 convolutional layers plus 3 fully connected layers.
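The naming scheme can be checked with a short sketch. The lists below follow the standard per-block channel configurations of the two models ("M" marking a 2×2 max pool); the helper name `weight_layers` is illustrative, not from any library:

```python
# Per-block channel configurations of VGG-16 and VGG-19.
# "M" marks a 2x2 max-pooling layer (which carries no weights).
VGG16 = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
         512, 512, 512, "M", 512, 512, 512, "M"]
VGG19 = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
         512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]

def weight_layers(cfg, fc_layers=3):
    """Count convolutional layers, then total weight layers (conv + FC)."""
    conv = sum(1 for x in cfg if x != "M")
    return conv, conv + fc_layers

print(weight_layers(VGG16))  # (13, 16)
print(weight_layers(VGG19))  # (16, 19)
```

Counting the numeric entries plus the three fully connected layers recovers the names VGG-16 and VGG-19.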
The key idea is to use small 3×3 convolution filters throughout the network, stacked in deep blocks with 2×2 max pooling between blocks. This simple, uniform design makes the model very deep while keeping the number of parameters manageable. For example, two stacked 3×3 convolutions cover the same 5×5 receptive field as a single 5×5 convolution, but with fewer parameters (18C² versus 25C² weights for C input and output channels) and an extra nonlinearity between the two layers.
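The arithmetic behind this trade-off can be made concrete. The sketch below assumes stride-1 convolutions with C input and output channels and no bias; the function names are illustrative:

```python
def conv_weights(k, c_in, c_out):
    """Number of weights in a single k x k convolution (no bias)."""
    return k * k * c_in * c_out

def receptive_field(kernels):
    """Receptive field of stacked stride-1 convolutions:
    each k x k layer extends the field by k - 1."""
    return 1 + sum(k - 1 for k in kernels)

C = 64  # illustrative channel width
stacked = 2 * conv_weights(3, C, C)  # two 3x3 layers: 18 * C^2 = 73728
single = conv_weights(5, C, C)       # one 5x5 layer:  25 * C^2 = 102400

print(receptive_field([3, 3]), receptive_field([5]))  # 5 5
print(stacked, single)  # 73728 102400
```

For C = 64, the stacked pair uses about 28% fewer weights than the single 5×5 layer while covering the same 5×5 receptive field.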
VGG nets achieved strong results in the ImageNet challenge (ILSVRC) 2014, winning the localization task and placing second in classification, and were widely used as baselines and backbones in later work, including ResNet, Fast R-CNN, and neural style transfer. They showed that deep networks of small filters can outperform shallower networks of larger ones and helped popularize small filters over larger ones. Over time, newer architectures such as Inception, ResNet, and DenseNet overtook VGG in accuracy and efficiency. RepVGG (2021) is an updated variant of the idea.
VGG was originally implemented in Caffe and trained with data parallelism across multiple GPUs; training a single model took about 2–3 weeks on a setup of four NVIDIA Titan Black GPUs.