---
layout: about
title: about
permalink: /

profile:
  align: right
  image: profile_columbia_totleneck.JPG # original prof_pic.jpg. May still persist in different locations.
  image_circular: false # crops the image to make it circular
  # more_info: >
  #   <p>555 your office number</p>
  #   <p>123 your address street</p>
  #   <p>Your City, State 12345</p>

news: true # includes a list of news items
selected_papers: true # includes a list of papers marked as "selected={true}"
social: true # includes social icons at the bottom of the page
---
Hi! I am Zhonghao (何忠豪), a master’s student at the University of Cambridge. I work on AI alignment, interpretability, and human-AI interaction research.
My work has been accepted at NeurIPS, ICML, ACM FAccT, and an ICLR workshop, among other venues. My main interest is designing machines that help humans learn, think, and deliberate. Currently I focus on two things: developing truth-seeking AI (Bayesian and truth-exploring), and solving “positive feedback loop” problems in tech products: LLM sycophancy, confirmation bias in reasoning models, social media echo chambers, and polarization.
I am serving as a mentor at the Supervised Program for Alignment Research and the Algoverse AI Safety Fellowship. Check out our research updates and idea portal if you would like to work with me!
I am graduating this year and will be seeking research & PhD positions!
## Research
You can read my published work on Google Scholar. My ongoing and past projects are kept more up to date on the CV page of this site.
My current “Hamming Problems” (the most important problems I can work on) are:
- Will humanity experience an LLM-induced lock-in because of the feedback loops between humans and LLMs? (Accepted at ICML; read The Lock-in Hypothesis)
- Can we train truth-seeking AI when agentic AI starts to mediate almost all of our information intake? (Accepted at NeurIPS; read a manuscript here)
I strive to become a “full-stack researcher,” by which I mean combining technical sophistication (experiments, mathematical formulation, and engineering) with deep engagement with problems, both technical and societal. Building technologies for human betterment is hard; let’s get this one right.
Ultimately, I want to build AIs for human excellence (or “arete,” in the Greek conception) and moral progress, which requires both sound societal mechanism design and epistemic tools with which individuals can better exercise their agency.
A lot of effort is required to operationalize those concepts, but currently I am actively exploring the following topics: mechanistic interpretability, computational neuroscience, AI ethics, alignment, political philosophy, virtue ethics, multi-agent systems, AI for science, human-computer interaction, and collective intelligence.
## Contacts
You can simply book a quick call via Calendly.