Multi-model validation: How Elfworks uses four AI models to de-risk Australian tax research


In the Australian accounting industry, artificial intelligence (AI) has moved from an interesting toy to a strategic necessity. However, the primary barrier to adoption remains the "hallucination" risk, which is the tendency for Large Language Models (LLMs) to generate confident but incorrect assertions. Elfworks, an Australian-built AI tax research platform, is tackling this challenge through a sophisticated, multi-model architectural approach.

14 May 2026 By Elfworks 6 minutes read

Elfworks was co-founded by tax professional Jimmy McPhedran and technologist Ian Youngman, and built specifically for the way Australian accounting firms actually work.

Unlike generic AI tools that rely on a single engine, Elfworks runs every query through four leading LLMs: Grok, ChatGPT, Gemini and Claude. By leveraging these models simultaneously, the platform mitigates individual algorithmic bias and maximises technical accuracy.

But first, the data

The foundation of the Elfworks methodology is the belief that an LLM is only as reliable as the data it is permitted to access. While general-purpose models often draw from the vast, unverified expanse of the open internet, Elfworks narrows the scope to what Youngman describes as "the bedrock of truth."

"We recognised early on that even the most advanced model will fail if it is scanning blog posts instead of legitimate information sources," says Youngman. "By directing our AI agents exclusively at legitimate, authoritative sources, specifically Federal and State legislation, case law and ATO public releases, we immediately see a significant jump in reliability. However, the data source is only half the battle. The real breakthrough was in how we manage the 'personalities' of the models themselves."
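As a rough illustration of what restricting agents to authoritative sources could look like, the sketch below filters candidate URLs against an allowlist. It is not Elfworks' implementation: the function names and the specific domains are assumptions chosen for the example.

```python
from urllib.parse import urlparse

# Hypothetical allowlist. The article names the source categories
# (legislation, case law, ATO public releases) but not specific sites,
# so these domains are illustrative assumptions only.
AUTHORITATIVE_DOMAINS = {"legislation.gov.au", "ato.gov.au", "austlii.edu.au"}


def is_authoritative(url: str) -> bool:
    """Return True if the URL's host is an allowlisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in AUTHORITATIVE_DOMAINS)


def filter_sources(urls):
    """Keep only the URLs that point at allowlisted sources, preserving order."""
    return [u for u in urls if is_authoritative(u)]
```

Matching on the full host name or a dot-prefixed suffix, rather than a plain substring, keeps look-alike domains from slipping through.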

Multi-model validation

Like humans, every AI model has inherent strengths and weaknesses. Elfworks rigorously tests new versions of LLMs and new entrants into the AI market (such as DeepSeek), and uses what it learns about their relative strengths to develop its "Agentic AI" framework.

In this system, the models do not simply provide four separate answers. Instead, they operate as a collaborative committee. "Getting these four models to work together was a monumental task," Youngman explains. "We’ve developed a process where they challenge each other’s logic and assumptions. If Claude applies a tax law principle to a search query, Gemini might be tasked with checking Claude’s logic. They often then defer to one another or agree to disagree. Not unlike the peer-review process in professional services firms."
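The committee idea Youngman describes can be pictured with a minimal sketch in which each model answers the question and an assigned reviewer model then critiques that answer. Everything here is hypothetical: the models are stand-in functions rather than real API clients, and `cross_review`, `make_stub` and the prompt wording are assumptions made for illustration, not Elfworks' actual process.

```python
from typing import Callable, Dict, Tuple

# A "model" here is any function that takes a prompt and returns text.
Model = Callable[[str], str]


def make_stub(name: str) -> Model:
    """Build a stand-in model that echoes the first line of its prompt."""
    return lambda prompt: f"[{name}] response to: {prompt.splitlines()[0]}"


def cross_review(
    question: str,
    models: Dict[str, Model],
    reviewers: Dict[str, str],
) -> Tuple[Dict[str, str], Dict[str, Dict[str, str]]]:
    """Collect one answer per model, then have each assigned reviewer critique it.

    `reviewers` maps an author model's name to the name of the model that checks it.
    """
    answers = {name: ask(question) for name, ask in models.items()}
    reviews = {}
    for author, reviewer in reviewers.items():
        prompt = (
            f"Question: {question}\n"
            f"Answer by {author}: {answers[author]}\n"
            "Check the reasoning and state whether you agree."
        )
        reviews[author] = {"reviewer": reviewer, "critique": models[reviewer](prompt)}
    return answers, reviews


models = {name: make_stub(name) for name in ("Claude", "Gemini")}
answers, reviews = cross_review("Is the expense deductible?", models, {"Claude": "Gemini"})
```

In a real system the stubs would be replaced by calls to each provider, and the critiques would feed a further round in which models defer or record their disagreement.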

The consensus and the alternative view

In Australian tax, the "correct" answer is often a matter of interpretation, given the Gordian knot that is Australian tax law. This is where Elfworks’ "Consensus" engine provides a critical advantage for accountants.

McPhedran emphasises that while a unified answer is helpful, it is often the outliers that provide the most value to a tax partner.

"Our platform is built to seek a consensus view on identified issues, the point where all four models agree on the application of the law," says McPhedran. "But we don’t stop there. We specifically task the agents to collate 'Alternative Views'. In Australian tax, being aware of a potentially contrary interpretation by the Commissioner or a differing judicial perspective is vital for risk management."
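One simplified way to picture the split between a consensus view and alternative views is sketched below. It assumes each model's position has already been reduced to a short comparable label, which is itself a hard step the sketch skips. The function name and the output fields are hypothetical, and the rule used (consensus only when every model agrees, as McPhedran describes) is a simplification rather than a description of how the platform works.

```python
from collections import Counter
from typing import Dict, Optional


def summarise_positions(positions: Dict[str, str]) -> Dict[str, object]:
    """Split model positions into a consensus (if unanimous) and alternative views.

    `positions` maps a model name to a short label for the view it reached.
    """
    counts = Counter(positions.values())
    leading_view, votes = counts.most_common(1)[0]
    unanimous = votes == len(positions)
    consensus: Optional[str] = leading_view if unanimous else None
    # Any view that differs from the leading one is kept, not discarded.
    alternatives = {m: v for m, v in positions.items() if v != leading_view}
    return {
        "consensus": consensus,
        "leading_view": leading_view,
        "alternative_views": alternatives,
    }


result = summarise_positions(
    {"Grok": "deductible", "ChatGPT": "deductible", "Gemini": "deductible", "Claude": "capital"}
)
```

Here three models share a view and one dissents, so no consensus is reported and the dissenting position is surfaced as an alternative view for the practitioner to weigh.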

This approach ensures that practitioners receive not just a streamlined summary but a comprehensive briefing note that mirrors the due diligence of a senior tax associate.

A new standard for the Australian firm

By balancing the specific computational biases of Grok, ChatGPT, Gemini and Claude, Elfworks provides Australian accountants with a research tool that is grounded in law, refined by debate, and delivered with the necessary professional scepticism.

For firms looking to navigate the complexities of the Australian tax system, the message from Elfworks is clear: four heads, especially when they are the world's most advanced AI models, are significantly better than one.

See Elfworks multi-model validation in action. Visit Elfworks.ai for a free trial.
