Thursday, 7 September 2023

FactLLaMA: A Smart Model for Automated Fact-Checking

Introduction

Fact-checking is a crucial task for verifying the accuracy and reliability of information, especially in the era of social media and fake news. However, fact-checking is also a challenging task that requires complex reasoning and external knowledge. How can we leverage the power of natural language processing (NLP) and artificial intelligence (AI) to automate fact-checking and make it more efficient and scalable?

FactLLaMA is a model developed by researchers at the Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University, Kowloon, Hong Kong. The model was developed with the goal of optimizing instruction-following language models with external knowledge for automated fact-checking. The motivation behind developing FactLLaMA was to address the limitations of existing instruction-following language models (IFLMs) for fact-checking. IFLMs are models that can follow natural language instructions to perform various tasks, such as answering questions, generating summaries, or verifying facts. However, IFLMs often lack external knowledge and rely on shallow heuristics to make decisions. For example, an IFLM may verify a fact by simply matching keywords or phrases in the instruction and the input text, without understanding the meaning or context of the information.

What is FactLLaMA? 

FactLLaMA is a model that uses external knowledge to optimize instruction-following language models for automated fact-checking. The model is designed to improve the accuracy of fact-checking by incorporating external knowledge into the language model’s predictions.

Key Features of FactLLaMA

FactLLaMA has several key features that make it a novel and effective model for automated fact-checking. Some of these features are:

  • Instruction-following: FactLLaMA can follow natural language instructions to perform fact-checking tasks. This makes it more flexible and user-friendly than models that require fixed or predefined formats or templates for input or output.
  • External knowledge: FactLLaMA can leverage external knowledge from factual statements extracted from reliable sources. This makes it more knowledgeable and reliable than models that rely on shallow heuristics or internal knowledge only.
  • Adaptive attention: FactLLaMA can adaptively attend to different types of external knowledge based on the instruction and the input text. This makes it more attentive and selective than models that use uniform or fixed attention mechanisms.
  • Verdict and explanation: FactLLaMA can output a verdict (such as “True” or “False”) and an explanation (such as “According to Wikipedia, …”) for each fact-checking task. This makes it more informative and transparent than models that output a verdict only or an explanation only.

Capabilities/Use Cases of FactLLaMA

FactLLaMA has many capabilities and use cases for automated fact-checking. Some of these are:

  • Social media fact-checking: FactLLaMA can verify facts from social media posts, such as tweets, Facebook posts, or Instagram captions. For example, given an instruction like “Verify if Elon Musk tweeted that he will donate $6 billion to end world hunger” and an input text like “Elon Musk tweeted on October 31, 2021: ‘If WFP can describe on this Twitter thread exactly how $6B will solve world hunger, I will sell Tesla stock right now and do it.’”, FactLLaMA can output a verdict like “False” and an explanation like “According to Snopes, Elon Musk did not tweet that he will donate $6 billion to end world hunger, but rather he challenged the World Food Programme (WFP) to prove how $6 billion can solve world hunger”.
  • News article fact-checking: FactLLaMA can verify facts from news articles, such as headlines, summaries, or quotes. For example, given an instruction like “Verify if the headline ‘China launches world’s first quantum satellite’ is true” and an input text like “China launches world’s first quantum satellite”, FactLLaMA can output a verdict like “True” and an explanation like “According to BBC News, China launched the world’s first quantum satellite in August 2016, which aims to establish ‘hack-proof’ communications between space and the ground”.
  • Web page fact-checking: FactLLaMA can verify facts from web pages, such as Wikipedia articles, product reviews, or personal blogs. For example, given an instruction like “Verify if the Wikipedia article ‘List of highest-grossing films’ is accurate” and an input text like “The following is a list of the highest-grossing films of all time, ranked by worldwide box office gross revenue adjusted for inflation as of 2023”, FactLLaMA can output a verdict like “False” and an explanation like “According to Box Office Mojo, the list of the highest-grossing films of all time, ranked by worldwide box office gross revenue adjusted for inflation as of 2023 is different from the Wikipedia article. For example, the Wikipedia article ranks Avatar (2009) as the highest-grossing film with $3.3 billion, while Box Office Mojo ranks Gone with the Wind (1939) as the highest-grossing film with $3.8 billion”.

By incorporating external knowledge into its predictions, FactLLaMA can help to identify false or misleading information more accurately than traditional fact-checking methods.

How does FactLLaMA work?

FactLLaMA incorporates external knowledge into an instruction-following language model to make more accurate predictions about whether a given statement is true or false. Rather than adding separate encoder modules, it builds on the pretrained LLaMA model, a decoder-only transformer: the instruction, the retrieved evidence, and the input claim are combined into a single text sequence, and the model generates the verdict as text.
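
To make this concrete, here is a minimal sketch of how an instruction, retrieved evidence, and a claim might be packed into one text sample. The section markers ("### Instruction:" and so on), the function name build_sample, and the example label are assumptions made for illustration; the template the authors actually use may differ.

```python
def build_sample(instruction, evidence, claim, label=None):
    """Combine instruction, evidence, and claim into one prompt string.

    The "###" section markers are a hypothetical template, not taken
    from the FactLLaMA paper or repository.
    """
    # One bullet per retrieved evidence snippet.
    evidence_text = "\n".join(f"- {snippet}" for snippet in evidence)
    prompt = (
        f"### Instruction:\n{instruction}\n\n"
        f"### Evidence:\n{evidence_text}\n\n"
        f"### Input:\n{claim}\n\n"
        f"### Response:\n"
    )
    # During tuning, the label is the text the model learns to generate.
    return {"prompt": prompt, "target": label}


sample = build_sample(
    instruction="Verify whether the following claim is true or false.",
    evidence=["China launched the world's first quantum satellite in August 2016."],
    claim="China launches world's first quantum satellite",
    label="true",  # hypothetical label string
)
print(sample["prompt"] + sample["target"])
```

At inference time the same prompt would be built without a label, and the text the model generates after the response marker would be read as the verdict.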

Optimizing instruction-following models with external evidence using LoRA
source - https://arxiv.org/ftp/arxiv/papers/2309/2309.00240.pdf

As shown in the figure above, the methodology for instruct-tuning FactLLaMA with external evidence for automatic fact-checking consists of two key components: the generation of instruction-evidence-input claim samples and the instruct-tuning of a generative pretrained language model using these samples. Each sample is generated by combining the instruction, evidence, and input claim into a single sequence. The evidence is collected using the Google API to retrieve relevant information from reputable sources. The classification task is converted into a sequence-to-sequence problem suitable for generative transformer models by framing fact-checking as text generation. The pretrained LLaMA model is then instruct-tuned using LoRA (Low-Rank Adaptation), which keeps the pretrained weights frozen and trains a small set of low-rank matrices to minimize a loss that measures the difference between the predicted fact-check results and the ground-truth labels of the training dataset.
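
The low-rank idea behind LoRA can be illustrated in a few lines of plain Python. In this sketch, a frozen weight matrix W is augmented with a trainable update B·A scaled by alpha/r, where r is the rank. B starts at zero, so the adapted layer initially behaves exactly like the pretrained one. The matrix sizes, rank, and alpha below are illustrative assumptions, not the settings used for FactLLaMA.

```python
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, alpha):
    """Compute h = W x + (alpha / r) * B (A x).

    W is the frozen pretrained weight; only A and B would be trained.
    """
    r = len(A)  # rank = number of rows of A
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def trainable_params(d_out, d_in, r):
    """Number of trainable values in A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


random.seed(0)
d_out, d_in, r = 4, 6, 2  # toy sizes chosen for illustration
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]  # zero init: no change at the start
x = [1.0] * d_in

h0 = lora_forward(x, W, A, B, alpha=16)
print(h0 == matvec(W, x))  # the update is zero before any training
print(trainable_params(4096, 4096, 8), "vs", 4096 * 4096)
```

For a 4096 × 4096 weight matrix and rank 8, the adapter has 65,536 trainable values compared with 16,777,216 for the full matrix, which is why this style of tuning is far cheaper than updating every weight.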

Performance evaluation of FactLLaMA model

To evaluate the performance of FactLLaMA, researchers conducted experiments on two widely used fact-checking datasets: RAWFC and LIAR. The results demonstrate that the approach achieves state-of-the-art performance in fact-checking tasks. The methods were compared based on precision, recall, and F1-score, which are commonly used metrics to assess the performance of classification tasks. 
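
For readers unfamiliar with these metrics, the sketch below computes precision, recall, and F1 for a multi-class task. It assumes macro-averaging (each class weighted equally), which is a common choice for fact-checking benchmarks but is an assumption here; the label names and predictions are made up for illustration and are not results from the paper.

```python
def macro_prf(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Define a metric as 0 when its denominator is 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Made-up gold labels and predictions for four claims.
y_true = ["true", "false", "half", "false"]
y_pred = ["true", "false", "false", "false"]
precision, recall, f1 = macro_prf(y_true, y_pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy example the "half" class is never predicted, so it contributes zero to all three averages. This is how macro-averaging penalizes a model that ignores a minority class.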

Results on the RAWFC dataset
source - https://arxiv.org/ftp/arxiv/papers/2309/2309.00240.pdf

As shown in the table above for the RAWFC dataset, traditional machine learning methods achieve moderate results, while more advanced models outperform them. Interestingly, LLaMA without tuning performs relatively poorly compared to the other methods. However, when instruct-tuning is applied, performance improves significantly, particularly when external knowledge is incorporated. Instruct-tuned LLaMA with external knowledge achieves the highest F1-score, surpassing all other methods and demonstrating the effectiveness of leveraging external evidence.

Results on the LIAR dataset
source - https://arxiv.org/ftp/arxiv/papers/2309/2309.00240.pdf

Similar patterns can be observed on the LIAR dataset, as shown in the table above. Once again, LLaMA without tuning performs poorly, but instruct-tuning leads to substantial improvements. Incorporating external knowledge in the instruct-tuning process further enhances performance, with instruct-tuned LLaMA with external knowledge again achieving the highest F1-score.

How to access and use this model?

FactLLaMA can be accessed and used through its GitHub repository, which contains the official code for the paper. The raw datasets used in the project can be downloaded from the CofCED GitHub repository. To use the model, follow the instructions provided in the repository to install the required packages and run the code.

If you are interested in learning more about the FactLLaMA model, all relevant links are provided under the 'Source' section at the end of this article.

Limitations

FactLLaMA is a novel and effective model for automated fact-checking, but it also has some limitations and challenges that need to be addressed in future work. Some of these are:

  • Data quality: FactLLaMA relies on external knowledge from factual statements extracted from reliable sources, but these sources may not always be accurate or up-to-date. Therefore, FactLLaMA may inherit these errors or inaccuracies from its external knowledge sources and produce incorrect or inconsistent outputs.
  • Data coverage: FactLLaMA uses large-scale data from the CofCED repository that covers various topics and domains, but this data may not cover all possible facts or scenarios that appear in real-world fact-checking tasks. Therefore, FactLLaMA may lack external knowledge or relevant factual statements for some fact-checking tasks and produce vague or generic outputs.
  • Data diversity: FactLLaMA uses a single type of external knowledge (factual statements) for fact-checking tasks, but there may be other types of external knowledge that can be useful or informative for fact-checking. Therefore, FactLLaMA may benefit from incorporating multiple types of external knowledge for fact-checking tasks and producing more diverse and rich outputs.
  • Model generalization: FactLLaMA is evaluated on two benchmark datasets (RAWFC and LIAR) for fact-checking tasks, but it may not generalize well to other datasets or domains that have different formats or styles.

Conclusion  

FactLLaMA is a powerful tool for improving the accuracy of automated fact-checking. By incorporating external knowledge into instruction-following language models, FactLLaMA offers a unique approach to identifying false or misleading information online. With continued development and research, FactLLaMA has the potential to become an even more valuable tool for promoting truth and accuracy in online content.


Source
research paper - https://arxiv.org/abs/2309.00240
research document - https://arxiv.org/ftp/arxiv/papers/2309/2309.00240.pdf
project details - https://thcheung.github.io/factllama/
GitHub repo - https://github.com/thcheung/FactLLaMA
Raw dataset - https://github.com/Nicozwy/CofCED

