Zhonghao He

Hi! I am Zhonghao, a master’s student at the University of Cambridge. I do research in AI alignment, interpretability, human-AI interaction, and machine ethics, with a focus on creating AI assistants for human moral & cultural progress and preventing LLM-induced lock-in. I will be graduating in 2025 and I am seeking PhD positions!

Research

You may read my published work on Google Scholar. The CV page of this site has a more up-to-date list of my ongoing and past projects.

My current “Hamming Problems” (the most important problems I can work on) are:

  • Will we experience an LLM-induced value lock-in? (See this manuscript for a technical paper in which we propose the “lock-in hypothesis” and study lock-in with data analysis, simulations, and formal modeling.)
  • Can we train LLMs to uplift humans by using truth-seeking as the underlying objective, opinion-change data as ground truth for RLHF, and explicit evaluation of AI-assisted human performance? (An algorithmic paper aiming for NeurIPS 2025.)

I strive to become a “full stack researcher,” which, in my definition, means combining technical sophistication (experiments, mathematical formulation, and engineering) with deep engagement with problems (both technical and societal). Building technologies for human betterment is hard, so let’s get this one right.

Ultimately, I want to build AIs for human excellence (or “arete,” in the Greek conception) and moral progress, which requires both sound societal mechanism design and epistemic tools with which individuals can better exercise their agency.

Operationalizing these concepts takes a lot of effort. Currently I am actively exploring the following topics: mechanistic interpretability, computational neuroscience, AI ethics, alignment, political philosophy, virtue ethics, multi-agent systems, AI for science, human-computer interfaces, and collective intelligence.

Contacts

You may simply book a quick call via Calendly.