1. Introduction to Core Concepts
This section introduces the foundational concepts that underpin the tools, recipes, and applications in this cookbook. Understanding these core ideas will help you see where value can be added, which capabilities are feasible, and how to approach implementation.
Here are the central concepts we will cover:
Linked Open Data (LOD)
What it is: the principles, formats, and best practices.
Why it matters: LOD enables interoperability among heritage datasets, and supports enriching content by linking entities (e.g. people, places, events) across sources.
Large Language Models (LLMs)
What they are: how LLMs are trained, and what kinds of tasks they can perform (generation, classification, summarization, etc.).
Caveats: limitations and biases to watch out for.
Named Entity Recognition (NER)
The process of identifying salient entities (names of people, places, dates, etc.) in unstructured text.
Why it matters: NER helps structure heritage data and makes it linkable.
Entity Disambiguation
Once entities are recognized, how do we determine which real-world entity each mention refers to?
This involves mapping each mention to candidate entities in LOD resources, resolving ambiguity between candidates, and checking that the chosen match is correct.
Relationship Extraction
Going beyond isolated entities: identifying how entities are connected (e.g. artist created artwork, place located in region).
These relationships can be encoded in graphs or knowledge bases.
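To make the last three concepts more concrete, here is a minimal sketch of how recognition, disambiguation, and relationship extraction might fit together on a single sentence. It is a toy illustration, not how production tools work: the matching is plain string lookup rather than a trained model, and the `CANDIDATES` table, the `ex:` identifiers, the keyword sets, and the function names are all hypothetical stand-ins for real LOD resources and libraries.

```python
import re

# Hypothetical mini knowledge base: surface form -> candidate entities.
# The "ex:" identifiers are invented for illustration, not real LOD URIs.
CANDIDATES = {
    "Rembrandt": [
        {"id": "ex:Rembrandt_van_Rijn", "type": "Person",
         "keywords": {"painter", "painted", "artist"}},
        {"id": "ex:Rembrandt_Iowa", "type": "Place",
         "keywords": {"town", "city", "iowa"}},
    ],
    "The Night Watch": [
        {"id": "ex:The_Night_Watch", "type": "Artwork",
         "keywords": {"painting", "painted", "canvas"}},
    ],
}


def recognize(text):
    """Toy NER: find known surface forms by exact string match."""
    found = []
    for surface in CANDIDATES:
        for match in re.finditer(re.escape(surface), text):
            found.append((match.start(), surface))
    # Return mentions in the order they appear in the text.
    return [surface for _, surface in sorted(found)]


def disambiguate(surface, text):
    """Toy disambiguation: pick the candidate whose keywords
    overlap most with the words in the surrounding text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return max(CANDIDATES[surface], key=lambda c: len(c["keywords"] & words))


def extract_relations(text, entities):
    """Toy relation extraction: one hand-written pattern, 'X painted Y'."""
    triples = []
    for subj_surface, subj in entities.items():
        for obj_surface, obj in entities.items():
            if f"{subj_surface} painted {obj_surface}" in text:
                triples.append((subj["id"], "ex:created", obj["id"]))
    return triples


text = "Rembrandt painted The Night Watch in 1642."
mentions = recognize(text)
entities = {surface: disambiguate(surface, text) for surface in mentions}
triples = extract_relations(text, entities)

for triple in triples:
    print(triple)
```

Each resulting tuple has the subject, predicate, object shape that graphs and knowledge bases typically store, which is why the output of relationship extraction can be linked back to LOD resources. In practice each of the three functions would be replaced by a trained model or an LLM prompt, and the candidates would come from an external knowledge base rather than a hard-coded table.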