LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models
Conference proceedings contribution
Publication date:
2025
Abstract:
Extracting structured information from unstructured text is crucial for modeling real-world processes, but traditional schema mining relies on semi-structured data, limiting scalability. This paper introduces schema-miner, a novel tool that combines large language models with human feedback to automate and refine schema extraction. Through an iterative workflow, it organizes properties from text, incorporates expert input, and integrates domain-specific ontologies for semantic depth. Applied to materials science—specifically atomic layer deposition—schema-miner demonstrates that expert-guided LLMs generate semantically rich schemas suitable for diverse real-world applications.
CRIS type:
4.1 Conference proceedings contribution
Keywords:
Human-in-the-loop Workflow; Large Language Models; Schema Discovery; Schema Mining; Scientific Schemas
Authors:
Sadruddin, Sameer; D'Souza, Jennifer; Poupaki, Eleni; Watkins, Alex; Babaei Giglou, Hamed; Rula, Anisa; Karasulu, Bora; Auer, Sören; Mackus, Adrie; Kessels, Erwin
Book title:
Lecture Notes in Computer Science
Published in: