Pollen recognition has a wide range of industrial and scientific applications. It guides the energy industry to potential oil and gas deposits, it provides proxy data for climate-change scientists, and it helps increase agricultural production. However, pollen recognition is time-consuming because it is usually done by visual inspection. Current automated solutions rely on pre-designed measurements of texture and contours, which must be tuned to find the optimal features for each dataset. Also, most methods classify pollen using single-focus images, which require pollen grains to be captured at specific focal planes. We take a different approach. Instead of using single-focus images, we use stacks of multifocal images (i.e., z-stacks) to account for both visual characteristics and 3-D information. Using deep-learning methods, we automatically learn from the data the visual characteristics best suited to classifying pollen. Specifically, we train convolutional and recurrent neural networks (CNNs and RNNs) to learn the optimal features and to recognize a pollen grain from a sequence of multifocal images acquired by an optical microscope. Additionally, we transfer knowledge from a pre-trained network to ours to improve its classification accuracy and convergence speed. We evaluated our method on 392 z-stack sequences covering 10 types of pollen grains, with 10 images per sequence. On this dataset, our method achieved a classification rate of 100%.
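The data flow described above (a convolutional feature extractor applied to each focal plane, followed by a recurrent network that reads the planes in order) can be illustrated with a minimal sketch. It is not the authors' architecture: the layer sizes, the single convolutional layer, the Elman-style recurrence, the helper names, and the random untrained weights are all assumptions chosen for brevity, and the input is synthetic data standing in for a real z-stack. Only the stack depth of 10 images and the 10 pollen classes come from the text.

```python
import math
import random

NUM_CLASSES = 10  # pollen types, as stated in the text
STACK_DEPTH = 10  # images per z-stack, as stated in the text


def conv2d_valid(image, kernel):
    """Valid 2-D cross-correlation of one grayscale image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def cnn_features(image, kernels):
    """Conv layer + ReLU + global average pooling: one feature per kernel."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_valid(image, kernel)
        vals = [max(0.0, v) for row in fmap for v in row]
        feats.append(sum(vals) / len(vals))
    return feats


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(h, x, w_xh, w_hh):
    """Elman recurrence: h_t = tanh(W_xh x_t + W_hh h_{t-1})."""
    return [math.tanh(a + b)
            for a, b in zip(matvec(w_xh, x), matvec(w_hh, h))]


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]


def rand_matrix(rng, rows, cols, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


def init_params(rng, n_kernels=4, hidden=8):
    # Hypothetical sizes; random weights stand in for trained ones.
    return {
        "kernels": [rand_matrix(rng, 3, 3) for _ in range(n_kernels)],
        "w_xh": rand_matrix(rng, hidden, n_kernels),
        "w_hh": rand_matrix(rng, hidden, hidden),
        "w_out": rand_matrix(rng, NUM_CLASSES, hidden),
    }


def classify_stack(stack, params):
    """Run the CNN on each focal plane, feed the features to the RNN in order,
    and map the final hidden state to class probabilities."""
    h = [0.0] * len(params["w_hh"])
    for image in stack:
        x = cnn_features(image, params["kernels"])
        h = rnn_step(h, x, params["w_xh"], params["w_hh"])
    return softmax(matvec(params["w_out"], h))


rng = random.Random(0)
params = init_params(rng)
# Synthetic 8x8 images stand in for the focal planes of one pollen grain.
stack = [rand_matrix(rng, 8, 8, scale=1.0) for _ in range(STACK_DEPTH)]
probs = classify_stack(stack, params)
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
```

Because the recurrent state is updated plane by plane, the prediction depends on the whole stack rather than on any single focal plane, which is the property the multifocal approach relies on. In practice the convolutional weights would be initialized from a pre-trained network and then trained further, rather than drawn at random as here.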