Imaging systems have long been designed in two separate steps: experience-driven optical design followed by sophisticated image processing. Although recent advances in computational imaging aim to bridge this gap in an end-to-end fashion, the image formation models used in these approaches have been quite simplistic, built either on simple wave optics models such as the Fourier transform or on similar paraxial models. Such models support the optimization of only a single lens surface, which limits the achievable image quality. To overcome these limitations, we propose a general end-to-end complex lens design framework enabled by a differentiable ray tracing image formation model. Specifically, our model relies on a differentiable ray tracing rendering engine to render optical images over the full field, accounting for all on-axis and off-axis aberrations governed by geometric optics. Our design pipeline can jointly optimize the lens module and the image reconstruction network for a specific imaging task. We demonstrate the effectiveness of the proposed method on two typical applications: large field-of-view imaging and extended depth-of-field imaging. Both simulation and experimental results show superior image quality compared with conventional lens designs. Our framework offers a competitive alternative for the design of modern imaging systems.
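To make the idea of a differentiable ray tracing image formation model concrete, the following is a minimal sketch and not the rendering engine described above. It assumes a 2D setting with a single spherical air-to-glass surface, collimated on-axis rays, and one design parameter, the surface curvature `c`. It traces exact (non-paraxial) rays with the vector form of Snell's law and uses a small forward-mode automatic differentiation class to obtain the gradient of the mean squared spot size on the sensor. All names (`Dual`, `sensor_height`, `spot_loss`, `optimize`) and all numeric values are hypothetical and chosen for illustration only.

```python
import math

class Dual:
    """Forward-mode autodiff number: a value and its derivative w.r.t. one parameter."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, o):
        o = Dual.lift(o)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__

    def __sub__(self, o):
        o = Dual.lift(o)
        return Dual(self.val - o.val, self.der - o.der)

    def __rsub__(self, o):
        return Dual.lift(o) - self

    def __mul__(self, o):
        o = Dual.lift(o)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__

    def __truediv__(self, o):
        o = Dual.lift(o)
        return Dual(self.val / o.val,
                    (self.der * o.val - self.val * o.der) / (o.val * o.val))

    def __rtruediv__(self, o):
        return Dual.lift(o) / self

def dsqrt(x):
    """Square root of a Dual, propagating the derivative."""
    r = math.sqrt(x.val)
    return Dual(r, x.der / (2.0 * r))

# Illustrative constants (assumptions, not taken from the paper).
N_AIR, N_GLASS = 1.0, 1.5
SENSOR_Z = 30.0                    # sensor plane, measured from the surface vertex
HEIGHTS = [0.5, 1.0, 1.5, 2.0]     # entrance heights of the collimated rays

def sensor_height(c, h):
    """Trace one ray, parallel to the axis at height h, to the sensor plane."""
    # Sag of a spherical surface with curvature c at height h.
    z_s = c * h * h / (1.0 + dsqrt(1.0 - c * c * h * h))
    # Unit surface normal facing the incoming ray: (z, y) = (c*z_s - 1, c*h).
    n_z, n_y = c * z_s - 1.0, c * h
    mu = N_AIR / N_GLASS
    cos_i = 1.0 - c * z_s          # -(d . n) for incident direction d = (1, 0)
    cos_t = dsqrt(1.0 - mu * mu * (1.0 - cos_i * cos_i))
    # Vector form of Snell's law: t = mu*d + (mu*cos_i - cos_t)*n.
    k = mu * cos_i - cos_t
    t_z, t_y = mu + k * n_z, k * n_y
    # Propagate the refracted ray in a straight line to the sensor plane.
    return h + t_y / t_z * (SENSOR_Z - z_s)

def spot_loss(c):
    """Mean squared ray height on the sensor (squared RMS spot radius)."""
    total = Dual(0.0)
    for h in HEIGHTS:
        y = sensor_height(c, h)
        total = total + y * y
    return total / len(HEIGHTS)

def optimize(c0=0.05, lr=0.002, steps=200):
    """Plain gradient descent on the curvature, using the autodiff gradient."""
    c = c0
    for _ in range(steps):
        loss = spot_loss(Dual(c, 1.0))   # seed derivative dc/dc = 1
        c -= lr * loss.der
    return c, spot_loss(Dual(c)).val

if __name__ == "__main__":
    c_opt, final_loss = optimize()
    print("optimized curvature:", c_opt, "final loss:", final_loss)
```

In this toy setup the optimized curvature should settle slightly below the paraxial prediction of 3 / `SENSOR_Z`, because the loss averages over marginal rays that are affected by spherical aberration, an effect a paraxial model would not capture. The framework described above goes well beyond this sketch: it optimizes multiple surfaces over the full field jointly with an image reconstruction network, for which reverse-mode differentiation in a tensor library would be the more natural choice.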