In this paper, we propose a deep learning approach for the forward modeling and inverse design of photonic devices containing embedded active metasurface structures. In particular, we demonstrate that combining neural network design of metasurfaces with scattering matrix-based optimization significantly reduces the computational overhead while facilitating accurate objective-driven design. As an example, we apply our approach to the design of a continuously tunable bandpass filter in the mid-wave infrared, featuring a narrow passband (∼10 nm), high quality factors (Q-factors ∼ 10²), and large out-of-band rejection (optical density ≥ 3). The design consists of a metasurface made of the optical phase-change material Ge₂Sb₂Se₄Te (GSST) atop a silicon heater, sandwiched between two distributed Bragg reflectors (DBRs). The proposed design approach can be generalized to the modeling and inverse design of photonic devices with arbitrary responses that incorporate active metasurfaces.