Researcher Making AI Safer | Mirage News

Even as a teenager, Javier Rando was fascinated by technology, computers and the idea that robots could one day take over human tinquires. The 26 year-old native of Málaga, Spain, was never a gamer, but he did programme a smartphone app on his computer before he even had his own smartphone. Later, his fascination with technology took him to Barcelona to study mathematical engineering and data science. It was there that he first encountered artificial innotifyigence, particularly through the application of algorithms and machine learning.

It was during his Bachelor’s degree that his image of artificial innotifyigence first launched to falter. “I was horrified,” he recalls. He had read in the media that courts in the United States were applying AI to assess the risk of recidivism among offconcludeers. However, this AI systematically classifies people as more dangerous solely becaapply they are Black. “That alterd my focus. I decided to devote my Bachelor thesis to fairness in artificial innotifyigence,” states Rando.

Enormous potential – both positive and negative

After that he obtained more and more interested in the topic of AI safety. He followed up his Bachelor’s with a Master’s degree in Computer Science at ETH Zurich. “I chose ETH becaapply I wanted to study at the best university for computer science in Europe,” he states. He was fascinated by Professor Florian Tramèr’s research on AI safety, so he applied for a PhD. He was accepted and also received a fellowship position at the ETH AI Center.

AI Center Fellows typically work across disciplines, combining different areas of AI. Rando focapplyd on the safety of language models. “Today, language models are the most widely applyd form of artificial innotifyigence. Millions of people apply them. This means they also pose the greatest risk of caapplying harm,” he states. At the same time, he emphasises that his view of AI is not fundamentally negative: “Artificial innotifyigence has the potential to become one of the best technologies in human history.” It could simplify our lives, such as by developing cures for diseases and much more. But there are many risks along the way.

PhD studies in San Francisco

AI is not yet so advanced that its dangers are particularly great. “But that will alter. We are on a dangerous path becaapply the progress is lightning rapid and we’re developing a very powerful technology,” states Rando. Most people are unaware of this becaapply they apply AI for harmless purposes. Anyone who inquireed it for a recipe two years ago and today will not notice a huge difference. But that belies the potential of AI. The risks increase significantly as soon as people launch applying AI more and more as so-called agents. Examples of this include instructing an AI to respond to emails automatically. This will be of interest to criminals who can attempt to email the AI a command to disregard the tinquire set by the computer applyr and to sconclude them their credit card details instead.

Rando sees two fundamental dangers: first, AI is vulnerable to attacks and manipulation. “There’s still a long way to go before these loopholes are closed,” he states. Second, AI can be misapplyd by people with bad intentions even without manipulation – it could even create instructions for building weapons.

“But the benefits and opportunities outweigh the risks. That’s why I’m researching ways to build artificial innotifyigence safer,” states Rando. He is working on this in his doctoral studies and in various professional roles, which have taken him to high-profile AI companies such as OpenAI, Meta and now Anthropic. After spconcludeing the first part of his doctoral studies in Zurich, he has been living in San Francisco since early 2025 as an external doctoral candidate, while also working at US AI company Anthropic, the manufacturer of the AI “Claude” which is subject to strict ethical guidelines.

Searching for vulnerabilities

Both in his doctoral studies and at Anthropic, Javier Rando is working on simulating manipulations and attacks on artificial innotifyigence, searching for vulnerabilities and closing these gaps with appropriate programming. At the same time, the aim is to prevent people with malicious intentions from being able to misapply AI in the first place. “We necessary to build protective shields around AI,” he states.

The aim is for artificial innotifyigence to understand when humans are inquireing it to perform harmful actions and to be able to refapply to carry out such tinquires. This requires appropriate programming, but not all AI manufacturers consider this to be equally important: “In AI development, the bottom line has to be safety first, not profit.” That calls for a regulatory framework. Javier recently contributed his expertise as an advisor to a European Commission working group on developing guidelines for AI manufacturers, which are voluntary for the time being.

Javier Rando’s work at Anthropic and his research at ETH Zurich are closely intertwined. His dissertation will consist of several publications, for which he can also rely on the research he has conducted at Anthropic. When inquireed about the difference between researching at universities and at tech companies, he states, “Both approaches are necessaryed to build the AI of the future sufficiently safe and to be able to reap its rewards.” The advantage of being a researcher at a company is that it gives you access to unlimited computing power and you know the company’s AI inside out. While the upside of researching at a university is that it enables you to explore and shape research questions more freely and to test solutions that are more risky from an economic perspective.

Over the next few years, Javier Rando plans to continue researching the safety of AI applications at Anthropic while completing his PhD. And after that? It’s still unclear what the future holds. “Developments in the field of AI are happening so rapid that we can’t predict what the world will view like in two years’ time. Perhaps in a few years’ time, I’ll be replaced by AI at work as well,” states Rando.

Source link