Data in Brief (Apr 2025)
A dataset dedicated to the training of large language models for agronomic management practices and production in Norwegian agriculture
Abstract
This dataset covers agricultural management practices and production in Norway and is derived from three websites: Nibio.no, Plantevernleksikonet.no, and nlr.no. All collected data is in Norwegian. The data is stored as JSON files (raw format) and covers topics pertinent to Norwegian agriculture, such as crop rotation, soil health, plant protection, and sustainable farming techniques. It was collected with three Python scripts, each adapted to one of the websites. The cleaned text data can be used to train or evaluate natural language processing (NLP) models in an experimental Norwegian context, or to adapt large language models (LLMs) to the domain of Norwegian agriculture in the Norwegian language.
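As a rough illustration of how JSON files like these might be turned into a plain-text corpus for NLP experiments, the sketch below reads every JSON file in a folder and collects the text of each record. The abstract does not describe the JSON schema, so the field name `text`, the directory layout, and the helper names `load_records` and `extract_texts` are all assumptions made for this example, not part of the published dataset.

```python
import json
from pathlib import Path


def load_records(data_dir):
    """Yield one dict per record from every *.json file in data_dir.

    Assumption: each file holds either a single JSON object or a list
    of objects. The real dataset layout may differ.
    """
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            payload = json.load(fh)
        items = payload if isinstance(payload, list) else [payload]
        for item in items:
            if isinstance(item, dict):
                yield item


def extract_texts(records, text_key="text", min_chars=1):
    """Return whitespace-normalised text strings, skipping exact duplicates.

    `text_key` is a hypothetical field name; adjust it to the actual schema.
    """
    seen = set()
    texts = []
    for record in records:
        value = record.get(text_key)
        if not isinstance(value, str):
            continue
        cleaned = " ".join(value.split())  # collapse runs of whitespace
        if len(cleaned) >= min_chars and cleaned not in seen:
            seen.add(cleaned)
            texts.append(cleaned)
    return texts
```

Calling `extract_texts(load_records("path/to/json"))` would then give a list of Norwegian text passages that can be passed to a tokenizer or a fine-tuning pipeline. Reading with an explicit `encoding="utf-8"` keeps Norwegian characters such as æ, ø, and å intact.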