Traffic congestion and traffic accidents are problems that can be mitigated by cooperative adaptive cruise control (CACC). In this work, we used deep reinforcement learning for CACC and assessed its potential to outperform model-based methods. We investigated the trade-off between distance-error minimization and energy-consumption minimization while still ensuring operational safety. In addition to a string stability condition, robustness against burst errors in communication was also incorporated, and the effect of preview information was assessed. The controllers were trained using the proximal policy optimization (PPO) algorithm and validated by comparison with a model-based controller. Their performance was evaluated in terms of mean energy consumption and root-mean-square distance error. In our evaluation scenarios, the learning-based controllers reduced energy consumption by 17.9% on average compared with the model-based controller.
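
For context, the two evaluation metrics named above can be computed as in the following minimal sketch. It assumes logged inter-vehicle gap and traction-power signals sampled at a fixed interval; all function and variable names are illustrative and not taken from the paper, and the treatment of regenerative braking depends on the vehicle model used.

```python
import numpy as np

def rms_distance_error(gap: np.ndarray, gap_ref: np.ndarray) -> float:
    """Root-mean-square error between actual and desired gaps over one run [m]."""
    e = gap - gap_ref
    return float(np.sqrt(np.mean(e**2)))

def episode_energy(power_w: np.ndarray, dt_s: float) -> float:
    """Energy consumed over one run [J], integrating traction power over time.

    Assumes a fixed sampling interval dt_s; how regenerated energy is
    credited is an assumption left to the vehicle model.
    """
    return float(np.trapz(power_w, dx=dt_s))

def relative_energy_reduction(e_learned: float, e_baseline: float) -> float:
    """Relative energy saving of the learned controller vs. a baseline [%]."""
    return 100.0 * (e_baseline - e_learned) / e_baseline
```

Averaging `episode_energy` over all evaluation scenarios and comparing it against the model-based baseline via `relative_energy_reduction` yields a figure analogous to the 17.9% average reduction reported above.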