Nanophotonics (Jan 2023)

Deep reinforcement learning empowers automated inverse design and optimization of photonic crystals for nanoscale laser cavities

  • Li Renjie,
  • Zhang Ceyao,
  • Xie Wentao,
  • Gong Yuanhao,
  • Ding Feilong,
  • Dai Hui,
  • Chen Zihan,
  • Yin Feng,
  • Zhang Zhaoyu

DOI
https://doi.org/10.1515/nanoph-2022-0692
Journal volume & issue
Vol. 12, no. 2
pp. 319 – 334

Abstract

Photonics inverse design relies on human experts who draw on their experience and intuition to search for a design topology that satisfies certain optical specifications, a process that is relatively labor-intensive, slow, and sub-optimal. Machine learning has emerged as a powerful tool to automate this inverse design process. However, supervised or semi-supervised deep learning is unsuitable for this task because of: (1) a severe shortage of available training data, caused by the high computational complexity of physics-based simulations along with a lack of open-source datasets and/or the need for a pre-trained neural network model; (2) the issue of one-to-many mapping, or non-unique solutions; and (3) the inability to optimize the photonic structure beyond inverse design. Reinforcement Learning (RL) has the potential to overcome these three challenges. Here, we propose Learning to Design Optical-Resonators (L2DO), which leverages RL to autonomously inverse design nanophotonic laser cavities without any prior knowledge while retrieving unique design solutions. L2DO incorporates two different algorithms: Deep Q-learning and Proximal Policy Optimization. We evaluate L2DO on two laser cavities: a long photonic crystal (PC) nanobeam and a PC nanobeam with an L3 cavity, both popular structures for semiconductor lasers. Trained for less than 152 hours on limited hardware resources, L2DO improved state-of-the-art results in the literature by over two orders of magnitude and obtained 10 times better performance than a human expert working on the same task for over a month. L2DO first learned to meet the required maxima of Q-factors (>50 million) and then proceeded to optimize some additional good-to-have features (e.g., resonance frequency, modal volume). Compared with iterative human designs and inverse design via supervised learning, L2DO can achieve over two orders of magnitude higher sample efficiency without suffering from the three issues above. This work confirms the potential of deep RL algorithms to surpass human designs and marks a solid step towards a fully automated AI framework for photonics inverse design.

Keywords