A benchmark of expert-level academic questions to assess AI capabilities
Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, et al. (large collaboration)
January 29, 2026. Nature.