Recoletos Multidisciplinary Research Journal (Jun 2025)
Using Large Language Models to Transform Estate Planning
Abstract
Background: Advances in generative artificial intelligence (AI) and large language models (LLMs) have opened new possibilities for democratizing access to professional services. Non-Muslims in Malaysia can use AI-driven tools to draft their wills without legal assistance, but empirical evaluations of the reliability of LLM chatbots for this purpose are lacking. Methods: Grounded in the Technology Acceptance Model, this study assesses the accuracy, legal validity, comprehensibility, and reliability of five prominent LLM chatbots: ChatGPT 3.5, ChatGPT 4.0, Claude Sonnet, Gemini Pro, and Microsoft Copilot. Results: ChatGPT 4.0 consistently outperformed the other models across all complexity levels in both succession-related and drafting-related question tasks, showing the highest reliability and accuracy. Gemini Pro performed well on introductory and intermediate queries, particularly in drafting simple wills. In contrast, Copilot and Claude Sonnet exhibited high variability and struggled with complex queries. Across all chatbots, performance declined significantly as query complexity increased. Qualitative assessment revealed inconsistencies, misinterpretations, and occasional legal inaccuracies, particularly when prompts contained incomplete information. Conclusion: While some LLM chatbots, particularly ChatGPT 4.0, show potential as reliable tools for basic estate planning, their limitations in handling complex legal instructions underscore the need for caution. By shedding light on the role of AI in legal contexts, this research contributes to both scholarly and practical dialogues and improves understanding of AI's potential to reshape the legal landscape, particularly in estate planning.
Keywords