Page object detection is crucial for document understanding. Different granularities for objects can result in different performances. In this study, block level region object detection is considered among the inherent hierarchical structure for document images. Inspired by Mask R-CNN (Region-based Convolutional Neural Networks) method, an end to end network is proposed to perform object classification, bounding box identification, and page object mask generation at the same time. Latex based synthetic document generation is designed for enlarging the training data. A large number of synthetic page images are generated for training to alleviate the insufficient dataset problem. Compared with existing page object competition methods, the proposed method achieves better results, with mAP of 0.917 on page objects such as table, figure and maths detection.