Zhonghao He


Hi! I am Zhonghao, a master’s student at the University of Cambridge. I do research in AI alignment, interpretability, human-AI interaction, and machine ethics, with a focus on creating truth-seeking AI, improving the collective epistemic environment, and facilitating moral progress. My previous work has been accepted at ICML, AIES, and ICLR (workshop). I am graduating this year and will be seeking PhD positions!

Research

You may read my published work on Google Scholar. The CV page on this site has a more up-to-date list of my ongoing and past projects.

My current “Hamming Problems” (the most important problems I can work on) are:

  • Will humanity experience an LLM-induced lock-in because of the feedback loops between humans and LLMs? (Accepted by ICML; read The Lock-in Hypothesis)
  • Can we train truth-seeking AI when agentic AI starts to mediate almost all of our information intake? (A benchmark paper under review; read a manuscript here)

I strive to become a “full stack researcher,” which, by my definition, means combining technical sophistication (experiments, mathematical formulation, and engineering) with deep engagement with problems (both technical and societal). Building technologies for human betterment is hard, so let’s get this one right.

Ultimately, I want to build AIs for human excellence (or “arete,” in the Greek conception) and moral progress, which requires both sound societal mechanism design and epistemic tools with which individuals can better exercise their agency.

A lot of effort is required to operationalize these concepts. Currently, I am actively exploring the following topics: mechanistic interpretability, computational neuroscience, AI ethics, alignment, political philosophy, virtue ethics, multi-agent systems, AI for science, human-computer interfaces, and collective intelligence.

Contacts

You may simply book a quick call via Calendly.