The catalog of gravitational-wave events is growing, and so are our hopes of constraining the underlying astrophysics of stellar-mass black-hole mergers by inferring the distributions of, e.g., masses and spins. While conventional analyses parametrize this population with simple phenomenological models, we propose an innovative physics-first approach that compares gravitational-wave data against astrophysical simulations. We combine state-of-the-art deep-learning techniques with hierarchical Bayesian inference and exploit our approach to constrain the properties of repeated black-hole mergers from the gravitational-wave events in the most recent LIGO/Virgo catalog. Deep neural networks allow us to (i) construct a flexible population model that accurately emulates simulations of hierarchical mergers, (ii) estimate selection effects, and (iii) recover the branching ratios of repeated-merger generations. Among our results we find that: the distribution of host-environment escape speeds favors values <100 km/s but is relatively flat; first-generation black holes are born with a maximum mass that is compatible with current estimates from pair-instability supernovae; there is multimodal substructure in both the mass and spin distributions due to repeated mergers; and binaries with a higher-generation component make up at least 15% of the underlying population. The deep-learning pipeline we present is ready to be used in conjunction with realistic astrophysical population-synthesis predictions.