2.14. Disambiguation with Candidates
When using LLMs, we can leverage context to help determine the correct identifiers for the entities we find. One of the largest challenges with LLMs is getting them to generate the correct identifier for a specific entity. Without context, an LLM will confidently generate a believable-looking identifier. When checked, however, these identifiers often turn out not to exist or to point to the wrong entity.
We solve this problem with context. An LLM can receive context in one of two ways: either we give it the context directly, or we use the LLM agentically with tools so that it can retrieve the context itself. Both have their advantages, but both work on the same principle: context lets the LLM select the correct identifier so that it does not need to hallucinate one. Hallucinations are still possible, but they become less likely when we give the LLM a list of options to choose from.
In this notebook, we will explore the first of these options, where we provide the LLM with a list of candidates that were generated in the previous data notebook. To make things easier, we have pasted the output from that notebook here.
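One practical consequence of supplying candidates is that the model's answer can be checked against them. The sketch below is a minimal illustration, not part of the original notebook: the helper names `candidate_ids` and `is_known_id` are hypothetical, and the `sample` list is a tiny stand-in that mirrors the shape of the candidate records used later on this page.

```python
def candidate_ids(records):
    """Collect the set of candidate ids offered for each entity mention."""
    return {
        record["text"]: {candidate["id"] for candidate in record["candidates"]}
        for record in records
    }


def is_known_id(records, entity_text, chosen_id):
    """Return True only if chosen_id was one of the candidates for entity_text."""
    return chosen_id in candidate_ids(records).get(entity_text, set())


# Tiny stand-in with the same shape as the candidate records (hypothetical subset).
sample = [
    {
        "text": "Monet",
        "label": "PERSON",
        "candidates": [
            {"id": "Q296", "name": "Claude Monet"},
            {"id": "Q2959838", "name": "Charles Monnet"},
        ],
    }
]

print(is_known_id(sample, "Monet", "Q296"))        # an id that was offered
print(is_known_id(sample, "Monet", "Q999999999"))  # an id that was never offered
```

An identifier that fails this check was not taken from the supplied list, so it is a good signal that the model invented it. Note that this simple version assumes each mention text appears only once in the records.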
It is also worth noting that providing the LLM with the necessary context is often considerably cheaper (assuming you are using a paid model) than letting the model agentically query the web or use other tools. We will see this in the next notebook.
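Because paid models are typically billed by the amount of text sent and received, the size of the candidate list matters too. The following is a rough sketch, not something the original notebook does: `compact_candidate` and `compact_records` are hypothetical helpers that keep only the id, name, a first description, and the birth and death dates. Which fields are safe to drop is an assumption that should be checked against your own data, since trimming too much can remove the detail the model needs.

```python
import json


def compact_candidate(candidate, max_descriptions=1):
    """Keep only the fields most useful for telling candidates apart."""
    slim = {"id": candidate["id"], "name": candidate["name"]}
    descriptions = [d["content"] for d in candidate.get("descriptions", [])]
    if descriptions:
        slim["descriptions"] = descriptions[:max_descriptions]
    for key in ("birthDate", "deathDate"):
        if key in candidate:
            slim[key] = candidate[key][:10]  # keep YYYY-MM-DD, drop the time part
    return slim


def compact_records(records, max_descriptions=1):
    """Apply compact_candidate to every candidate of every entity mention."""
    return [
        {
            "text": record["text"],
            "label": record["label"],
            "candidates": [
                compact_candidate(c, max_descriptions) for c in record["candidates"]
            ],
        }
        for record in records
    ]


# Stand-in record with the same shape as the candidate data (hypothetical subset).
sample = [
    {
        "text": "Monet",
        "label": "PERSON",
        "start_char": 22,
        "end_char": 27,
        "candidates": [
            {
                "id": "Q296",
                "type": "Person",
                "name": "Claude Monet",
                "classifications": [
                    {"id": "Q1028181", "type": "Type", "name": "pintor"}
                ],
                "descriptions": [
                    {"content": "French painter (1840–1926)", "classifications": []},
                    {"content": "pintor francés", "classifications": []},
                ],
                "birthDate": "1840-11-14T00:00:00",
                "deathDate": "1926-12-05T00:00:00",
            }
        ],
    }
]

full = json.dumps(sample, ensure_ascii=False)
slim = json.dumps(compact_records(sample), ensure_ascii=False)
print(len(full), len(slim))  # character counts as a rough proxy for prompt size
```

Character count is only a proxy for tokens, but it is enough to compare two versions of the same list before sending either one to the model.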
2.14.1. Installing Packages
!pip install pydantic openai spacy pandas python-dotenv
2.14.2. Getting Our ENV Variables
First, we need to set up our environment variables to access the OpenAI API. We’ll use the python-dotenv package to load environment variables from a .env file, which should contain our OPENAI_API_KEY. This keeps our API key secure by not hardcoding it directly in our code.
import sys
sys.path.append("..")
from dotenv import load_dotenv
import os
import pandas as pd
import json
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
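Since `os.getenv` returns `None` when a variable is missing, a forgotten `.env` entry only surfaces later as an authentication error. A small guard can make that failure explicit. This is an optional sketch; `require_env` is a hypothetical helper, not part of the original notebook or of python-dotenv.

```python
import os


def require_env(name):
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value


# Possible usage after load_dotenv() has run:
# OPENAI_API_KEY = require_env("OPENAI_API_KEY")
```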
2.14.3. Visualization Functions
This bit of code is just for making it easier to visualize our data at the end of the notebook.
from openai import OpenAI
client = OpenAI(api_key=OPENAI_API_KEY)
CANDIDATES = [{
"text": "Monet",
"label": "PERSON",
"start_char": 22,
"end_char": 27,
"candidates": [
{
"id": "person/31450df4-cb6b-44f0-8335-38593ea70104",
"type": "Person",
"name": "Jean-Baptiste de Lamarck",
"classifications": [
{
"id": "concept/6f652917-4c07-4d51-8209-fcdd4f285343",
"type": "Type",
"name": "male"
},
{
"id": "concept/e46688bf-8720-4f67-85b2-d9e048b95506",
"type": "Type",
"name": "Naturalists"
},
{
"id": "concept/b3a2d21c-2782-4da3-aaa4-53c444c4735e",
"type": "Type",
"name": "Biologists"
},
{
"id": "concept/d799dcc0-7c99-494b-91c2-0ecc04fd8bc9",
"type": "Type",
"name": "Officers"
},
{
"id": "concept/7e91736d-7107-4494-9695-542e76cbf320",
"type": "Type",
"name": "French"
},
{
"id": "concept/b779de71-e499-43aa-abd3-ad991a0d1375",
"type": "Type",
"name": "Botanists"
},
{
"id": "concept/0e64c455-7fd1-414a-89ce-38102f009ac4",
"type": "Type",
"name": "Zoologists"
},
{
"id": "concept/49390038-5b23-441e-b8c5-b4b44d2c04a7",
"type": "Type",
"name": "Faculty"
},
{
"id": "concept/787eed88-09dd-4961-99af-cd53378f3ce6",
"type": "Type",
"name": "Chemists"
},
{
"id": "concept/4dbea3b6-9049-40bf-bc16-5b0a064ceb56",
"type": "Type",
"name": "Meteorologists"
},
{
"id": "concept/9cb213a4-799a-4d64-b755-5980b3045a60",
"type": "Type",
"name": "Paleontologists"
},
{
"id": "concept/62ba8667-022f-4c6f-88e2-d843f1462a08",
"type": "Type",
"name": "Malacologists"
},
{
"id": "concept/50674beb-e61a-4f72-a34d-58e64f498bbc",
"type": "Type",
"name": "Encyclopedists"
},
{
"id": "concept/51a2fcfd-d4b4-42af-872b-f8dcf4a62ced",
"type": "Type",
"name": "Authors"
}
],
"descriptions": [
{
"content": "Chevalier; Professor; franz\u00f6sischer Naturforscher, Biologe",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "French naturalist (1744-1829)",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "naturalista franc\u00e9s (1744-1829)",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "officier, naturaliste et professeur de zoologie fran\u00e7ais (1744-1829)",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "qu\u00edmico franc\u00eas",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
}
],
"member_of": [
{
"id": "group/ae7678d4-6bf2-452c-8a7f-d170176fd5d3",
"type": "Group",
"name": "Soci\u00e9t\u00e9 philomathique de Paris"
},
{
"id": "group/c2379f3d-47ad-4270-b905-380c91904a8d",
"type": "Group",
"name": "Acad\u00e9mie de Berlin"
},
{
"id": "group/c50693b7-126c-4c3b-8f84-1bf21c662e65",
"type": "Group",
"name": "Bavarian Academy of Sciences and Humanities"
}
],
"birthDate": "1744-08-01T00:00:00",
"birthPlace": {
"id": "place/8940d47c-7650-4cef-b06e-46e30af65a04",
"type": "Place",
"name": "Bazentin"
},
"deathDate": "1829-12-18T00:00:00",
"deathPlace": {
"id": "place/8e117529-3872-494c-ab5f-8d7800be2c64",
"type": "Place",
"name": "Paris"
}
},
{
"id": "person/642a0152-1567-4fbe-93f3-66f11c5cab9a",
"type": "Person",
"name": "Claude Monet",
"classifications": [
{
"id": "concept/7e91736d-7107-4494-9695-542e76cbf320",
"type": "Type",
"name": "French"
},
{
"id": "concept/6f652917-4c07-4d51-8209-fcdd4f285343",
"type": "Type",
"name": "male"
},
{
"id": "concept/0588f9d1-03e3-4b52-b2bf-dd41e601dcdc",
"type": "Type",
"name": "Artists"
},
{
"id": "concept/98e4295b-7e89-4836-b601-a195888b6257",
"type": "Type",
"name": "caricaturists"
},
{
"id": "concept/4f377430-c1ec-432d-b00c-d70264520e8e",
"type": "Type",
"name": "Landscape painters"
},
{
"id": "concept/5272d911-5ccb-4a45-8571-1fed0176d361",
"type": "Type",
"name": "Painters"
},
{
"id": "concept/b455d036-ded0-4b6a-b94a-d693dcd7dba4",
"type": "Type",
"name": "owners"
},
{
"id": "concept/7ec0c9f8-b1ea-46d7-b5e6-36f23129db6c",
"type": "Type",
"name": "Impressionist artists"
}
],
"descriptions": [
{
"content": "French painter",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "French, 1840\u20131926",
"classifications": [
{
"id": "concept/54e35d81-9548-4b4e-8973-de02b09bf9da",
"type": "Type",
"name": "display biography"
}
]
},
{
"content": "French painter, 1840-1926",
"classifications": [
{
"id": "concept/54e35d81-9548-4b4e-8973-de02b09bf9da",
"type": "Type",
"name": "display biography"
}
]
},
{
"content": "He was a successful caricaturist in his native Le Havre, but after studying plein-air landscape painting, he moved to Paris in 1859. He soon met future Impressionists Camille Pissarro and Pierre-Auguste Renoir. Renoir and Monet began painting outdoors together in the late 1860s, laying the foundations of Impressionism. In 1874, with Pissarro and Edgar Degas, Monet helped organize the Soci\u00e9t\u00e9 Anonyme des Artistes, Peintres, Sculpteurs, Graveurs, etc., the formal name of the Impressionists' group. During the 1870s Monet developed his charateristic technique for rendering atmospheric outdoor light, using broken, rhythmic brushwork. Throughout his career, he remained loyal to the Impressionists' early goal of capturing the transitory effects of nature through direct observation. In 1890 he began creating paintings in series, depicting the same subject under various conditions and at different times of the day. His late pictures, made when he was half-blind, are shimmering pools of color almost totally devoid of form.",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "Peintre. - \u00c9tabli \u00e0 Giverny en 1883",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
}
],
"birthDate": "1840-11-14T00:00:00",
"birthPlace": {
"id": "place/8e117529-3872-494c-ab5f-8d7800be2c64",
"type": "Place",
"name": "Paris"
},
"deathDate": "1926-12-05T00:00:00",
"deathPlace": {
"id": "place/1eead86b-4570-4217-b675-fb1fa81f2670",
"type": "Place",
"name": "Giverny"
}
},
{
"id": "person/bad186a1-bc28-4709-8edb-eca3a9faf387",
"type": "Person",
"name": "Monet, Jean, 1932-",
"birthDate": "1932-01-01T00:00:00"
},
{
"id": "person/39884fa6-b0e5-4fdf-98a7-1788f4bad5fb",
"type": "Person",
"name": "Monet, J.-C. (Jean-Claude)",
"classifications": [
{
"id": "concept/7e91736d-7107-4494-9695-542e76cbf320",
"type": "Type",
"name": "French"
}
],
"birthDate": "1941-01-01T00:00:00"
},
{
"id": "person/f368e56b-fe27-4f6f-9e16-75725afe8e31",
"type": "Person",
"name": "Carter, Frances Monet"
},
{
"id": "person/a1dccf2f-48c7-43cb-a51c-cc2b4fa54958",
"type": "Person",
"name": "Monet, Paul",
"classifications": [
{
"id": "concept/6f652917-4c07-4d51-8209-fcdd4f285343",
"type": "Type",
"name": "male"
},
{
"id": "concept/d799dcc0-7c99-494b-91c2-0ecc04fd8bc9",
"type": "Type",
"name": "Officers"
},
{
"id": "concept/7e91736d-7107-4494-9695-542e76cbf320",
"type": "Type",
"name": "French"
}
],
"descriptions": [
{
"content": "Franz\u00f6sischer Offizier der Ehrenlegion, Kapit\u00e4n einer kolonialen Artillerie",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
}
],
"member_of": [
{
"id": "group/f6827e5b-8e0f-413c-ada1-4c4e0fc594d6",
"type": "Group",
"name": "Minist\u00e8re des colonies"
},
{
"id": "group/a9170801-fd5d-4cbe-b55e-843465f806ab",
"type": "Group",
"name": "Acad\u00e9mie Goncourt"
},
{
"id": "group/9064c768-7424-4ccf-9fca-4d8d357ce73a",
"type": "Group",
"name": "Vi\u1ec7t Nam Thanh Ni\u00ean H\u1ed9i"
}
],
"birthDate": "1884-01-13T00:00:00",
"birthPlace": {
"id": "place/8de9ae57-d9e0-44f4-8950-54442e0506a1",
"type": "Place",
"name": "Angers"
},
"deathDate": "1941-05-26T00:00:00"
},
{
"id": "person/2c8940dc-46bb-4029-a2f6-65fd24e8be8c",
"type": "Person",
"name": "Laurette Alexis-Monet",
"classifications": [
{
"id": "concept/a309a746-9e51-4c34-b207-7f4773d2ac1a",
"type": "Type",
"name": "female"
},
{
"id": "concept/7e91736d-7107-4494-9695-542e76cbf320",
"type": "Type",
"name": "French"
}
],
"birthDate": "1923-07-10T00:00:00",
"deathDate": "2011-12-15T00:00:00"
},
{
"id": "person/39b693f8-d2da-4d4e-a7a7-fb3dfd8d769d",
"type": "Person",
"name": "Monet, Alice K. B.",
"classifications": [
{
"id": "concept/a309a746-9e51-4c34-b207-7f4773d2ac1a",
"type": "Type",
"name": "female"
}
]
},
{
"id": "person/60391131-8295-487a-b786-1216c9cc63ef",
"type": "Person",
"name": "Monet-Viera, Molly"
},
{
"id": "person/8b0a5321-663f-4681-a034-a57cf47e9383",
"type": "Person",
"name": "Monet, Chantal",
"classifications": [
{
"id": "concept/a309a746-9e51-4c34-b207-7f4773d2ac1a",
"type": "Type",
"name": "female"
},
{
"id": "concept/303558a7-ab8f-4b09-a7f7-fffc993a84f5",
"type": "Type",
"name": "Journalists"
},
{
"id": "concept/83155191-338b-4396-90ee-f9a625bcbfd3",
"type": "Type",
"name": "Belgian"
}
]
},
{
"id": "Q296",
"type": "Person",
"name": "Claude Monet",
"classifications": [
{
"id": "Q1028181",
"type": "Type",
"name": "pintor"
},
{
"id": "Q1925963",
"type": "Type",
"name": "artista gr\u00e1fico"
}
],
"descriptions": [
{
"content": "French painter (1840\u20131926)",
"classifications": []
},
{
"content": "pintor franc\u00e9s",
"classifications": []
},
{
"content": "peintre impressionniste fran\u00e7ais",
"classifications": []
},
{
"content": "pintor franc\u00eas (1840-1926)",
"classifications": []
},
{
"content": "franz\u00f6sischer Maler des Impressionismus (1840\u20131926)",
"classifications": []
}
],
"birthDate": "1840-11-14T00:00:00",
"birthPlace": {
"id": "Q90",
"type": "Place",
"name": "Paris"
},
"deathDate": "1926-12-05T00:00:00",
"deathPlace": {
"id": "Q165061",
"type": "Place",
"name": "Giverny"
}
},
{
"id": "Q24698278",
"type": "Person",
"name": "Monet",
"descriptions": [
{
"content": "family name",
"classifications": []
},
{
"content": "apellido",
"classifications": []
},
{
"content": "nom de famille",
"classifications": []
},
{
"content": "sobrenome",
"classifications": []
},
{
"content": "Familienname",
"classifications": []
}
]
},
{
"id": "Q2959838",
"type": "Person",
"name": "Charles Monnet",
"classifications": [
{
"id": "Q1028181",
"type": "Type",
"name": "pintor"
}
],
"descriptions": [
{
"content": "French court painter (1732-1808)",
"classifications": []
},
{
"content": "pintor franc\u00e9s",
"classifications": []
},
{
"content": "peintre fran\u00e7ais",
"classifications": []
},
{
"content": "pintor franc\u00eas",
"classifications": []
},
{
"content": "franz\u00f6sischer Hofmaler",
"classifications": []
}
],
"birthDate": "1732-01-10T00:00:00",
"birthPlace": {
"id": "Q90",
"type": "Place",
"name": "Paris"
},
"deathDate": "1819-03-19T00:00:00",
"deathPlace": {
"id": "Q90",
"type": "Place",
"name": "Paris"
}
},
{
"id": "Q8142",
"type": "Person",
"name": "\u901a\u8ca8",
"descriptions": [
{
"content": "generally accepted medium of exchange for goods or services",
"classifications": []
},
{
"content": "medio de cambio utilizado para bienes o servicios",
"classifications": []
},
{
"content": "instrument de paiement en vigueur en un lieu et \u00e0 une \u00e9poque donn\u00e9e",
"classifications": []
},
{
"content": "unidade monet\u00e1ria, meio de pagamento",
"classifications": []
},
{
"content": "Verfassung und Ordnung des gesamten Geldwesens eines Staates",
"classifications": []
}
]
},
{
"id": "Q119729672",
"type": "Person",
"name": "Monet",
"descriptions": [
{
"content": "given name",
"classifications": []
}
]
},
{
"id": "Q234900",
"type": "Person",
"name": "Linda Darnell",
"classifications": [
{
"id": "Q2259451",
"type": "Type",
"name": "stage actor"
},
{
"id": "Q10798782",
"type": "Type",
"name": "television actor"
},
{
"id": "Q10800557",
"type": "Type",
"name": "film actor"
}
],
"descriptions": [
{
"content": "American actress (1923\u20131965)",
"classifications": []
},
{
"content": "actriz estadounidense",
"classifications": []
},
{
"content": "actrice am\u00e9ricaine",
"classifications": []
},
{
"content": "US-amerikanische Schauspielerin",
"classifications": []
},
{
"content": "Amerikaans actrice (1923\u20131965)",
"classifications": []
}
],
"birthDate": "1923-10-16T00:00:00",
"birthPlace": {
"id": "Q16557",
"type": "Place",
"name": "Dallas"
},
"deathDate": "1965-04-10T00:00:00",
"deathPlace": {
"id": "Q1531184",
"type": "Place",
"name": "Glenview"
}
},
{
"id": "Q223162",
"type": "Person",
"name": "Mon\u00e9teau",
"descriptions": [
{
"content": "commune in Yonne, France",
"classifications": []
},
{
"content": "comuna francesa",
"classifications": []
},
{
"content": "commune fran\u00e7aise du d\u00e9partement de l'Yonne",
"classifications": []
},
{
"content": "comuna francesa",
"classifications": []
},
{
"content": "franz\u00f6sische Gemeinde",
"classifications": []
}
]
}
]
},
{
"text": "Argenteuil",
"label": "LOCATION",
"start_char": 121,
"end_char": 131,
"candidates": [
{
"id": "place/4699255d-458a-4795-8b04-2614f1c171db",
"type": "Place",
"name": "Argenteuil",
"part_of": [
{
"id": "place/b7e88db4-e572-46e6-9617-8a2594bcfa8c",
"type": "Place",
"name": "Argenteuil"
}
]
},
{
"id": "place/b4b825fd-4b8e-4642-b37b-0d076a5ccf74",
"type": "Place",
"name": "Argenteuil",
"part_of": [
{
"id": "place/682402f8-cdc4-4ebc-ae38-5b3824d2e4aa",
"type": "Place",
"name": "Quebec"
}
]
},
{
"id": "place/2f05fdc5-7e9e-4936-bde8-84a88347fde7",
"type": "Place",
"name": "Argenteuil",
"descriptions": [
{
"content": "regional county municipality in Quebec, Canada",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "municipalit\u00e9 r\u00e9gionale de comt\u00e9 du Qu\u00e9bec (Canada)",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
}
],
"part_of": [
{
"id": "place/cd467ccf-665a-423f-a1b5-1785869d960f",
"type": "Place",
"name": "Laurentides"
}
]
},
{
"id": "place/bae1a4f6-a9f0-4bb6-83fc-faec8611194a",
"type": "Place",
"name": "arrondissement of Argenteuil",
"descriptions": [
{
"content": "arrondissement of France",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "distrito de Francia",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "arrondissement fran\u00e7ais",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "Verwaltungseinheit in Frankreich",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "arrondissement in Val-d'Oise, Frankrijk",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
}
],
"part_of": [
{
"id": "place/bb803f08-8018-4a00-814a-6fceb3ec6d28",
"type": "Place",
"name": "Essonne"
}
]
},
{
"id": "place/b7e88db4-e572-46e6-9617-8a2594bcfa8c",
"type": "Place",
"name": "Argenteuil",
"descriptions": [
{
"content": "Argenteuil is a commune in the Val-d'Oise department in the \u00cele-de-France region, located about 15 kilometers northwest of Paris, France. (AI generated)",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "Stadt im nordwestlichen Vorortbereich von Paris, an der Seine, im D\u00e9partement Val d'Oise, Frankreich",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "commune in Val-d'Oise, France",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "comuna francesa",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "commune fran\u00e7aise du d\u00e9partement du Val-d'Oise",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
}
],
"part_of": [
{
"id": "place/6271aae3-a32f-4aaa-883d-9c99a803a09c",
"type": "Place",
"name": "France"
},
{
"id": "place/bae1a4f6-a9f0-4bb6-83fc-faec8611194a",
"type": "Place",
"name": "arrondissement of Argenteuil"
},
{
"id": "place/bb803f08-8018-4a00-814a-6fceb3ec6d28",
"type": "Place",
"name": "Essonne"
},
{
"id": "place/67b2f4c7-5915-483e-af7e-8e7c218e1b53",
"type": "Place",
"name": "Grand Paris"
},
{
"id": "place/4699255d-458a-4795-8b04-2614f1c171db",
"type": "Place",
"name": "Argenteuil"
},
{
"id": "place/c4067590-40a9-462c-995a-9c58f100e6e6",
"type": "Place",
"name": "Argenteuil"
}
]
},
{
"id": "place/4f8c46e0-3701-4871-8073-0116e17eeed1",
"type": "Place",
"name": "Saint-Andr\u00e9-d'Argenteuil",
"descriptions": [
{
"content": "municipality in Quebec, Canada",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "municipio en la\u00a0provincia\u00a0de\u00a0Quebec,\u00a0Canad\u00e1",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
},
{
"content": "municipalit\u00e9 au Qu\u00e9bec (Canada)",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
}
],
"part_of": [
{
"id": "place/123bf43c-269e-40fd-b37d-c564dce9ce9b",
"type": "Place",
"name": "Qu\u00e9bec"
},
{
"id": "place/2f05fdc5-7e9e-4936-bde8-84a88347fde7",
"type": "Place",
"name": "Argenteuil"
},
{
"id": "place/b4b825fd-4b8e-4642-b37b-0d076a5ccf74",
"type": "Place",
"name": "Argenteuil"
},
{
"id": "place/cd467ccf-665a-423f-a1b5-1785869d960f",
"type": "Place",
"name": "Laurentides"
}
]
},
{
"id": "place/1a1b5be6-9f94-4a05-88af-1d49ea123f3c",
"type": "Place",
"name": "Argenteuil (Qu\u00e9bec : Division de recensement)"
},
{
"id": "place/b1a8acf8-392b-4739-af32-8db989d806d0",
"type": "Place",
"name": "Argenteuil (Qu\u00e9bec)"
},
{
"id": "place/afacb2bd-8041-4d59-a700-18009fae3ad1",
"type": "Place",
"name": "North River (Argenteuil, Qu\u00e9bec)"
},
{
"id": "place/9a2c2a4c-dd02-4e54-907a-3b6208174a06",
"type": "Place",
"name": "Argenteuil",
"classifications": [
{
"id": "concept/4c4443fb-d094-4de4-a5cb-5e3078d58f06",
"type": "Type",
"name": "Cities and towns"
}
],
"descriptions": [
{
"content": "Silver deposits here were exploited by Gauls; town was destroyed by Normans, but rebuilt; convent here was endowed by Charlemagne; was famous in the 12th century for abbess H\u00e9lo\u00efse, of the tragic H\u00e9lo\u00efse-Abelard romance; currently a residential area.",
"classifications": [
{
"id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
"type": "Type",
"name": "descriptive note"
}
]
}
],
"part_of": [
{
"id": "place/d5aeace4-86fa-4193-a508-4fa6c615432d",
"type": "Place",
"name": "\u00cele-de-France"
}
]
},
{
"id": "Q181946",
"type": "Place",
"name": "Argenteuil",
"descriptions": [
{
"content": "commune in Val-d'Oise, France",
"classifications": []
},
{
"content": "comuna francesa",
"classifications": []
},
{
"content": "commune fran\u00e7aise du d\u00e9partement du Val-d'Oise",
"classifications": []
},
{
"content": "comuna francesa",
"classifications": []
},
{
"content": "franz\u00f6sische Gemeinde",
"classifications": []
}
],
"part_of": [
{
"id": "Q511613",
"type": "Place",
"name": "arrondissement of Argenteuil"
},
{
"id": "Q12784",
"type": "Place",
"name": "Val-d'Oise"
},
{
"id": "Q16665915",
"type": "Place",
"name": "M\u00e9tropole du Grand Paris"
}
]
},
{
"id": "Q645211",
"type": "Place",
"name": "Argenteuil",
"descriptions": [
{
"content": "regional county municipality in Quebec, Canada",
"classifications": []
},
{
"content": "municipalit\u00e9 r\u00e9gionale de comt\u00e9 du Qu\u00e9bec (Canada)",
"classifications": []
}
],
"part_of": [
{
"id": "Q2304022",
"type": "Place",
"name": "Laurentides"
}
]
},
{
"id": "Q1151230",
"type": "Place",
"name": "Argenteuil-sur-Arman\u00e7on",
"descriptions": [
{
"content": "commune in Yonne, France",
"classifications": []
},
{
"content": "comuna francesa",
"classifications": []
},
{
"content": "commune fran\u00e7aise du d\u00e9partement de l'Yonne",
"classifications": []
},
{
"content": "comuna francesa",
"classifications": []
},
{
"content": "franz\u00f6sische Gemeinde",
"classifications": []
}
],
"part_of": [
{
"id": "Q1724141",
"type": "Place",
"name": "canton of Ancy-le-Franc"
},
{
"id": "Q12816",
"type": "Place",
"name": "Yonne"
},
{
"id": "Q700536",
"type": "Place",
"name": "arrondissement of Avallon"
}
]
},
{
"id": "Q2860941",
"type": "Place",
"name": "Argenteuil",
"descriptions": [
{
"content": "provincial electoral district in Quebec, Canada",
"classifications": []
},
{
"content": "circonscription electorale provinciale du Qu\u00e9bec, Canada",
"classifications": []
},
{
"content": "Provinzwahlkreis in Qu\u00e9bec",
"classifications": []
}
],
"part_of": [
{
"id": "Q176",
"type": "Place",
"name": "Quebec"
}
]
},
{
"id": "Q3095674",
"type": "Place",
"name": "Argenteuil",
"descriptions": [
{
"content": "railway station in Argenteuil, France",
"classifications": []
},
{
"content": "estaci\u00f3n de tren en Francia",
"classifications": []
},
{
"content": "gare ferroviaire fran\u00e7aise",
"classifications": []
},
{
"content": "Bahnhof in Frankreich",
"classifications": []
},
{
"content": "spoorwegstation in Frankrijk",
"classifications": []
}
],
"part_of": [
{
"id": "Q181946",
"type": "Place",
"name": "Argenteuil"
}
]
},
{
"id": "Q2860945",
"type": "Place",
"name": "Argenteuil",
"descriptions": [
{
"content": "painting by \u00c9douard Manet, 1874",
"classifications": []
},
{
"content": "cuadro de \u00c9douard Manet",
"classifications": []
},
{
"content": "tableau d'\u00c9douard Manet",
"classifications": []
},
{
"content": "pintura de \u00c9douard Manet",
"classifications": []
},
{
"content": "Gem\u00e4lde von \u00c9douard Manet aus dem Jahr 1874",
"classifications": []
}
]
},
{
"id": "Q20188741",
"type": "Place",
"name": "Argenteuil",
"descriptions": [
{
"content": "painting by Claude Monet (c. 1872, National Gallery of Art)",
"classifications": []
},
{
"content": "cuadro de Claude Monet",
"classifications": []
},
{
"content": "peinture de Claude Monet (v. 1872, National Gallery of Art)",
"classifications": []
},
{
"content": "pintura de Claude Monet",
"classifications": []
},
{
"content": "\u00d6lgem\u00e4lde von Claude Monet",
"classifications": []
}
]
}
]
}
]
TEXT = "This painting depicts [Monet](PERSON)'s first wife, [Camille](PERSON), outside on a snowy day passing by the [French](LOCATION) doors of their home at [Argenteuil](LOCATION). Her face is rendered in a radically bold Impressionist technique of mere daubs of paint quickly applied, just as the snow and trees are defined by broad, broken strokes of pure white and green."
MODEL = "gpt-4o-mini"
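The text above marks each entity inline as `[surface text](LABEL)`. As a minimal sketch of how those annotations could be read back out, the regular expression below pulls the mention and label pairs and can also strip the markup. The names `ANNOTATION`, `extract_mentions`, and `strip_annotations` are hypothetical and not part of the original notebook; the pattern assumes labels are written in uppercase letters, as they are here.

```python
import re

# Matches inline annotations of the form [surface text](LABEL).
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([A-Z_]+)\)")


def extract_mentions(annotated_text):
    """Return (surface text, label) pairs in order of appearance."""
    return ANNOTATION.findall(annotated_text)


def strip_annotations(annotated_text):
    """Replace each [surface text](LABEL) with just the surface text."""
    return ANNOTATION.sub(r"\1", annotated_text)


# First sentence of the annotated text used in this notebook.
sample = (
    "This painting depicts [Monet](PERSON)'s first wife, [Camille](PERSON), "
    "outside on a snowy day passing by the [French](LOCATION) doors of their "
    "home at [Argenteuil](LOCATION)."
)

print(extract_mentions(sample))
print(strip_annotations(sample))
```

Listing the mentions this way also makes it easy to see which annotated entities have candidate records and which do not.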
prompt = """
Disambiguate the entities in the following text.
{text}
Here are the Candidates:
{candidates}
Only return the JSON output, nothing else. Do so with the following schema:
Return a list of entities with the following schema:
class Entity(BaseModel):
entity_text: str
label: str
wikidata_id: str
sources: list[str]
"""
formatted_prompt = prompt.format(candidates=CANDIDATES, text=TEXT)
print(formatted_prompt)
Disambiguate the entities in the following text.
This painting depicts [Monet](PERSON)'s first wife, [Camille](PERSON), outside on a snowy day passing by the [French](LOCATION) doors of their home at [Argenteuil](LOCATION). Her face is rendered in a radically bold Impressionist technique of mere daubs of paint quickly applied, just as the snow and trees are defined by broad, broken strokes of pure white and green.
Here are the Candidates:
[{'text': 'Monet', 'label': 'PERSON', 'start_char': 22, 'end_char': 27, 'candidates': [{'id': 'person/31450df4-cb6b-44f0-8335-38593ea70104', 'type': 'Person', 'name': 'Jean-Baptiste de Lamarck', 'classifications': [{'id': 'concept/6f652917-4c07-4d51-8209-fcdd4f285343', 'type': 'Type', 'name': 'male'}, {'id': 'concept/e46688bf-8720-4f67-85b2-d9e048b95506', 'type': 'Type', 'name': 'Naturalists'}, {'id': 'concept/b3a2d21c-2782-4da3-aaa4-53c444c4735e', 'type': 'Type', 'name': 'Biologists'}, {'id': 'concept/d799dcc0-7c99-494b-91c2-0ecc04fd8bc9', 'type': 'Type', 'name': 'Officers'}, {'id': 'concept/7e91736d-7107-4494-9695-542e76cbf320', 'type': 'Type', 'name': 'French'}, {'id': 'concept/b779de71-e499-43aa-abd3-ad991a0d1375', 'type': 'Type', 'name': 'Botanists'}, {'id': 'concept/0e64c455-7fd1-414a-89ce-38102f009ac4', 'type': 'Type', 'name': 'Zoologists'}, {'id': 'concept/49390038-5b23-441e-b8c5-b4b44d2c04a7', 'type': 'Type', 'name': 'Faculty'}, {'id': 'concept/787eed88-09dd-4961-99af-cd53378f3ce6', 'type': 'Type', 'name': 'Chemists'}, {'id': 'concept/4dbea3b6-9049-40bf-bc16-5b0a064ceb56', 'type': 'Type', 'name': 'Meteorologists'}, {'id': 'concept/9cb213a4-799a-4d64-b755-5980b3045a60', 'type': 'Type', 'name': 'Paleontologists'}, {'id': 'concept/62ba8667-022f-4c6f-88e2-d843f1462a08', 'type': 'Type', 'name': 'Malacologists'}, {'id': 'concept/50674beb-e61a-4f72-a34d-58e64f498bbc', 'type': 'Type', 'name': 'Encyclopedists'}, {'id': 'concept/51a2fcfd-d4b4-42af-872b-f8dcf4a62ced', 'type': 'Type', 'name': 'Authors'}], 'descriptions': [{'content': 'Chevalier; Professor; französischer Naturforscher, Biologe', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'French naturalist (1744-1829)', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'naturalista francés (1744-1829)', 'classifications': [{'id': 
'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'officier, naturaliste et professeur de zoologie français (1744-1829)', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'químico francês', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'member_of': [{'id': 'group/ae7678d4-6bf2-452c-8a7f-d170176fd5d3', 'type': 'Group', 'name': 'Société philomathique de Paris'}, {'id': 'group/c2379f3d-47ad-4270-b905-380c91904a8d', 'type': 'Group', 'name': 'Académie de Berlin'}, {'id': 'group/c50693b7-126c-4c3b-8f84-1bf21c662e65', 'type': 'Group', 'name': 'Bavarian Academy of Sciences and Humanities'}], 'birthDate': '1744-08-01T00:00:00', 'birthPlace': {'id': 'place/8940d47c-7650-4cef-b06e-46e30af65a04', 'type': 'Place', 'name': 'Bazentin'}, 'deathDate': '1829-12-18T00:00:00', 'deathPlace': {'id': 'place/8e117529-3872-494c-ab5f-8d7800be2c64', 'type': 'Place', 'name': 'Paris'}}, {'id': 'person/642a0152-1567-4fbe-93f3-66f11c5cab9a', 'type': 'Person', 'name': 'Claude Monet', 'classifications': [{'id': 'concept/7e91736d-7107-4494-9695-542e76cbf320', 'type': 'Type', 'name': 'French'}, {'id': 'concept/6f652917-4c07-4d51-8209-fcdd4f285343', 'type': 'Type', 'name': 'male'}, {'id': 'concept/0588f9d1-03e3-4b52-b2bf-dd41e601dcdc', 'type': 'Type', 'name': 'Artists'}, {'id': 'concept/98e4295b-7e89-4836-b601-a195888b6257', 'type': 'Type', 'name': 'caricaturists'}, {'id': 'concept/4f377430-c1ec-432d-b00c-d70264520e8e', 'type': 'Type', 'name': 'Landscape painters'}, {'id': 'concept/5272d911-5ccb-4a45-8571-1fed0176d361', 'type': 'Type', 'name': 'Painters'}, {'id': 'concept/b455d036-ded0-4b6a-b94a-d693dcd7dba4', 'type': 'Type', 'name': 'owners'}, {'id': 'concept/7ec0c9f8-b1ea-46d7-b5e6-36f23129db6c', 'type': 'Type', 'name': 'Impressionist artists'}], 'descriptions': [{'content': 'French 
painter', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'French, 1840–1926', 'classifications': [{'id': 'concept/54e35d81-9548-4b4e-8973-de02b09bf9da', 'type': 'Type', 'name': 'display biography'}]}, {'content': 'French painter, 1840-1926', 'classifications': [{'id': 'concept/54e35d81-9548-4b4e-8973-de02b09bf9da', 'type': 'Type', 'name': 'display biography'}]}, {'content': "He was a successful caricaturist in his native Le Havre, but after studying plein-air landscape painting, he moved to Paris in 1859. He soon met future Impressionists Camille Pissarro and Pierre-Auguste Renoir. Renoir and Monet began painting outdoors together in the late 1860s, laying the foundations of Impressionism. In 1874, with Pissarro and Edgar Degas, Monet helped organize the Société Anonyme des Artistes, Peintres, Sculpteurs, Graveurs, etc., the formal name of the Impressionists' group. During the 1870s Monet developed his charateristic technique for rendering atmospheric outdoor light, using broken, rhythmic brushwork. Throughout his career, he remained loyal to the Impressionists' early goal of capturing the transitory effects of nature through direct observation. In 1890 he began creating paintings in series, depicting the same subject under various conditions and at different times of the day. His late pictures, made when he was half-blind, are shimmering pools of color almost totally devoid of form.", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'Peintre. 
- Établi à Giverny en 1883', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'birthDate': '1840-11-14T00:00:00', 'birthPlace': {'id': 'place/8e117529-3872-494c-ab5f-8d7800be2c64', 'type': 'Place', 'name': 'Paris'}, 'deathDate': '1926-12-05T00:00:00', 'deathPlace': {'id': 'place/1eead86b-4570-4217-b675-fb1fa81f2670', 'type': 'Place', 'name': 'Giverny'}}, {'id': 'person/bad186a1-bc28-4709-8edb-eca3a9faf387', 'type': 'Person', 'name': 'Monet, Jean, 1932-', 'birthDate': '1932-01-01T00:00:00'}, {'id': 'person/39884fa6-b0e5-4fdf-98a7-1788f4bad5fb', 'type': 'Person', 'name': 'Monet, J.-C. (Jean-Claude)', 'classifications': [{'id': 'concept/7e91736d-7107-4494-9695-542e76cbf320', 'type': 'Type', 'name': 'French'}], 'birthDate': '1941-01-01T00:00:00'}, {'id': 'person/f368e56b-fe27-4f6f-9e16-75725afe8e31', 'type': 'Person', 'name': 'Carter, Frances Monet'}, {'id': 'person/a1dccf2f-48c7-43cb-a51c-cc2b4fa54958', 'type': 'Person', 'name': 'Monet, Paul', 'classifications': [{'id': 'concept/6f652917-4c07-4d51-8209-fcdd4f285343', 'type': 'Type', 'name': 'male'}, {'id': 'concept/d799dcc0-7c99-494b-91c2-0ecc04fd8bc9', 'type': 'Type', 'name': 'Officers'}, {'id': 'concept/7e91736d-7107-4494-9695-542e76cbf320', 'type': 'Type', 'name': 'French'}], 'descriptions': [{'content': 'Französischer Offizier der Ehrenlegion, Kapitän einer kolonialen Artillerie', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'member_of': [{'id': 'group/f6827e5b-8e0f-413c-ada1-4c4e0fc594d6', 'type': 'Group', 'name': 'Ministère des colonies'}, {'id': 'group/a9170801-fd5d-4cbe-b55e-843465f806ab', 'type': 'Group', 'name': 'Académie Goncourt'}, {'id': 'group/9064c768-7424-4ccf-9fca-4d8d357ce73a', 'type': 'Group', 'name': 'Việt Nam Thanh Niên Hội'}], 'birthDate': '1884-01-13T00:00:00', 'birthPlace': {'id': 'place/8de9ae57-d9e0-44f4-8950-54442e0506a1', 'type': 'Place', 
'name': 'Angers'}, 'deathDate': '1941-05-26T00:00:00'}, {'id': 'person/2c8940dc-46bb-4029-a2f6-65fd24e8be8c', 'type': 'Person', 'name': 'Laurette Alexis-Monet', 'classifications': [{'id': 'concept/a309a746-9e51-4c34-b207-7f4773d2ac1a', 'type': 'Type', 'name': 'female'}, {'id': 'concept/7e91736d-7107-4494-9695-542e76cbf320', 'type': 'Type', 'name': 'French'}], 'birthDate': '1923-07-10T00:00:00', 'deathDate': '2011-12-15T00:00:00'}, {'id': 'person/39b693f8-d2da-4d4e-a7a7-fb3dfd8d769d', 'type': 'Person', 'name': 'Monet, Alice K. B.', 'classifications': [{'id': 'concept/a309a746-9e51-4c34-b207-7f4773d2ac1a', 'type': 'Type', 'name': 'female'}]}, {'id': 'person/60391131-8295-487a-b786-1216c9cc63ef', 'type': 'Person', 'name': 'Monet-Viera, Molly'}, {'id': 'person/8b0a5321-663f-4681-a034-a57cf47e9383', 'type': 'Person', 'name': 'Monet, Chantal', 'classifications': [{'id': 'concept/a309a746-9e51-4c34-b207-7f4773d2ac1a', 'type': 'Type', 'name': 'female'}, {'id': 'concept/303558a7-ab8f-4b09-a7f7-fffc993a84f5', 'type': 'Type', 'name': 'Journalists'}, {'id': 'concept/83155191-338b-4396-90ee-f9a625bcbfd3', 'type': 'Type', 'name': 'Belgian'}]}, {'id': 'Q296', 'type': 'Person', 'name': 'Claude Monet', 'classifications': [{'id': 'Q1028181', 'type': 'Type', 'name': 'pintor'}, {'id': 'Q1925963', 'type': 'Type', 'name': 'artista gráfico'}], 'descriptions': [{'content': 'French painter (1840–1926)', 'classifications': []}, {'content': 'pintor francés', 'classifications': []}, {'content': 'peintre impressionniste français', 'classifications': []}, {'content': 'pintor francês (1840-1926)', 'classifications': []}, {'content': 'französischer Maler des Impressionismus (1840–1926)', 'classifications': []}], 'birthDate': '1840-11-14T00:00:00', 'birthPlace': {'id': 'Q90', 'type': 'Place', 'name': 'Paris'}, 'deathDate': '1926-12-05T00:00:00', 'deathPlace': {'id': 'Q165061', 'type': 'Place', 'name': 'Giverny'}}, {'id': 'Q24698278', 'type': 'Person', 'name': 'Monet', 'descriptions': [{'content': 
'family name', 'classifications': []}, {'content': 'apellido', 'classifications': []}, {'content': 'nom de famille', 'classifications': []}, {'content': 'sobrenome', 'classifications': []}, {'content': 'Familienname', 'classifications': []}]}, {'id': 'Q2959838', 'type': 'Person', 'name': 'Charles Monnet', 'classifications': [{'id': 'Q1028181', 'type': 'Type', 'name': 'pintor'}], 'descriptions': [{'content': 'French court painter (1732-1808)', 'classifications': []}, {'content': 'pintor francés', 'classifications': []}, {'content': 'peintre français', 'classifications': []}, {'content': 'pintor francês', 'classifications': []}, {'content': 'französischer Hofmaler', 'classifications': []}], 'birthDate': '1732-01-10T00:00:00', 'birthPlace': {'id': 'Q90', 'type': 'Place', 'name': 'Paris'}, 'deathDate': '1819-03-19T00:00:00', 'deathPlace': {'id': 'Q90', 'type': 'Place', 'name': 'Paris'}}, {'id': 'Q8142', 'type': 'Person', 'name': '通貨', 'descriptions': [{'content': 'generally accepted medium of exchange for goods or services', 'classifications': []}, {'content': 'medio de cambio utilizado para bienes o servicios', 'classifications': []}, {'content': 'instrument de paiement en vigueur en un lieu et à une époque donnée', 'classifications': []}, {'content': 'unidade monetária, meio de pagamento', 'classifications': []}, {'content': 'Verfassung und Ordnung des gesamten Geldwesens eines Staates', 'classifications': []}]}, {'id': 'Q119729672', 'type': 'Person', 'name': 'Monet', 'descriptions': [{'content': 'given name', 'classifications': []}]}, {'id': 'Q234900', 'type': 'Person', 'name': 'Linda Darnell', 'classifications': [{'id': 'Q2259451', 'type': 'Type', 'name': 'stage actor'}, {'id': 'Q10798782', 'type': 'Type', 'name': 'television actor'}, {'id': 'Q10800557', 'type': 'Type', 'name': 'film actor'}], 'descriptions': [{'content': 'American actress (1923–1965)', 'classifications': []}, {'content': 'actriz estadounidense', 'classifications': []}, {'content': 'actrice 
américaine', 'classifications': []}, {'content': 'US-amerikanische Schauspielerin', 'classifications': []}, {'content': 'Amerikaans actrice (1923–1965)', 'classifications': []}], 'birthDate': '1923-10-16T00:00:00', 'birthPlace': {'id': 'Q16557', 'type': 'Place', 'name': 'Dallas'}, 'deathDate': '1965-04-10T00:00:00', 'deathPlace': {'id': 'Q1531184', 'type': 'Place', 'name': 'Glenview'}}, {'id': 'Q223162', 'type': 'Person', 'name': 'Monéteau', 'descriptions': [{'content': 'commune in Yonne, France', 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': "commune française du département de l'Yonne", 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': 'französische Gemeinde', 'classifications': []}]}]}, {'text': 'Argenteuil', 'label': 'LOCATION', 'start_char': 121, 'end_char': 131, 'candidates': [{'id': 'place/4699255d-458a-4795-8b04-2614f1c171db', 'type': 'Place', 'name': 'Argenteuil', 'part_of': [{'id': 'place/b7e88db4-e572-46e6-9617-8a2594bcfa8c', 'type': 'Place', 'name': 'Argenteuil'}]}, {'id': 'place/b4b825fd-4b8e-4642-b37b-0d076a5ccf74', 'type': 'Place', 'name': 'Argenteuil', 'part_of': [{'id': 'place/682402f8-cdc4-4ebc-ae38-5b3824d2e4aa', 'type': 'Place', 'name': 'Quebec'}]}, {'id': 'place/2f05fdc5-7e9e-4936-bde8-84a88347fde7', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'regional county municipality in Quebec, Canada', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'municipalité régionale de comté du Québec (Canada)', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'part_of': [{'id': 'place/cd467ccf-665a-423f-a1b5-1785869d960f', 'type': 'Place', 'name': 'Laurentides'}]}, {'id': 'place/bae1a4f6-a9f0-4bb6-83fc-faec8611194a', 'type': 'Place', 'name': 'arrondissement of Argenteuil', 'descriptions': 
[{'content': 'arrondissement of France', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'distrito de Francia', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'arrondissement français', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'Verwaltungseinheit in Frankreich', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': "arrondissement in Val-d'Oise, Frankrijk", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'part_of': [{'id': 'place/bb803f08-8018-4a00-814a-6fceb3ec6d28', 'type': 'Place', 'name': 'Essonne'}]}, {'id': 'place/b7e88db4-e572-46e6-9617-8a2594bcfa8c', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': "Argenteuil is a commune in the Val-d'Oise department in the Île-de-France region, located about 15 kilometers northwest of Paris, France. 
(AI generated)", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': "Stadt im nordwestlichen Vorortbereich von Paris, an der Seine, im Département Val d'Oise, Frankreich", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': "commune in Val-d'Oise, France", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'comuna francesa', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': "commune française du département du Val-d'Oise", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'part_of': [{'id': 'place/6271aae3-a32f-4aaa-883d-9c99a803a09c', 'type': 'Place', 'name': 'France'}, {'id': 'place/bae1a4f6-a9f0-4bb6-83fc-faec8611194a', 'type': 'Place', 'name': 'arrondissement of Argenteuil'}, {'id': 'place/bb803f08-8018-4a00-814a-6fceb3ec6d28', 'type': 'Place', 'name': 'Essonne'}, {'id': 'place/67b2f4c7-5915-483e-af7e-8e7c218e1b53', 'type': 'Place', 'name': 'Grand Paris'}, {'id': 'place/4699255d-458a-4795-8b04-2614f1c171db', 'type': 'Place', 'name': 'Argenteuil'}, {'id': 'place/c4067590-40a9-462c-995a-9c58f100e6e6', 'type': 'Place', 'name': 'Argenteuil'}]}, {'id': 'place/4f8c46e0-3701-4871-8073-0116e17eeed1', 'type': 'Place', 'name': "Saint-André-d'Argenteuil", 'descriptions': [{'content': 'municipality in Quebec, Canada', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'municipio en la\xa0provincia\xa0de\xa0Quebec,\xa0Canadá', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'municipalité au Québec (Canada)', 
'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'part_of': [{'id': 'place/123bf43c-269e-40fd-b37d-c564dce9ce9b', 'type': 'Place', 'name': 'Québec'}, {'id': 'place/2f05fdc5-7e9e-4936-bde8-84a88347fde7', 'type': 'Place', 'name': 'Argenteuil'}, {'id': 'place/b4b825fd-4b8e-4642-b37b-0d076a5ccf74', 'type': 'Place', 'name': 'Argenteuil'}, {'id': 'place/cd467ccf-665a-423f-a1b5-1785869d960f', 'type': 'Place', 'name': 'Laurentides'}]}, {'id': 'place/1a1b5be6-9f94-4a05-88af-1d49ea123f3c', 'type': 'Place', 'name': 'Argenteuil (Québec : Division de recensement)'}, {'id': 'place/b1a8acf8-392b-4739-af32-8db989d806d0', 'type': 'Place', 'name': 'Argenteuil (Québec)'}, {'id': 'place/afacb2bd-8041-4d59-a700-18009fae3ad1', 'type': 'Place', 'name': 'North River (Argenteuil, Québec)'}, {'id': 'place/9a2c2a4c-dd02-4e54-907a-3b6208174a06', 'type': 'Place', 'name': 'Argenteuil', 'classifications': [{'id': 'concept/4c4443fb-d094-4de4-a5cb-5e3078d58f06', 'type': 'Type', 'name': 'Cities and towns'}], 'descriptions': [{'content': 'Silver deposits here were exploited by Gauls; town was destroyed by Normans, but rebuilt; convent here was endowed by Charlemagne; was famous in the 12th century for abbess Héloïse, of the tragic Héloïse-Abelard romance; currently a residential area.', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'part_of': [{'id': 'place/d5aeace4-86fa-4193-a508-4fa6c615432d', 'type': 'Place', 'name': 'Île-de-France'}]}, {'id': 'Q181946', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': "commune in Val-d'Oise, France", 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': "commune française du département du Val-d'Oise", 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': 'französische Gemeinde', 'classifications': []}], 'part_of': [{'id': 
'Q511613', 'type': 'Place', 'name': 'arrondissement of Argenteuil'}, {'id': 'Q12784', 'type': 'Place', 'name': "Val-d'Oise"}, {'id': 'Q16665915', 'type': 'Place', 'name': 'Métropole du Grand Paris'}]}, {'id': 'Q645211', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'regional county municipality in Quebec, Canada', 'classifications': []}, {'content': 'municipalité régionale de comté du Québec (Canada)', 'classifications': []}], 'part_of': [{'id': 'Q2304022', 'type': 'Place', 'name': 'Laurentides'}]}, {'id': 'Q1151230', 'type': 'Place', 'name': 'Argenteuil-sur-Armançon', 'descriptions': [{'content': 'commune in Yonne, France', 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': "commune française du département de l'Yonne", 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': 'französische Gemeinde', 'classifications': []}], 'part_of': [{'id': 'Q1724141', 'type': 'Place', 'name': 'canton of Ancy-le-Franc'}, {'id': 'Q12816', 'type': 'Place', 'name': 'Yonne'}, {'id': 'Q700536', 'type': 'Place', 'name': 'arrondissement of Avallon'}]}, {'id': 'Q2860941', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'provincial electoral district in Quebec, Canada', 'classifications': []}, {'content': 'circonscription electorale provinciale du Québec, Canada', 'classifications': []}, {'content': 'Provinzwahlkreis in Québec', 'classifications': []}], 'part_of': [{'id': 'Q176', 'type': 'Place', 'name': 'Quebec'}]}, {'id': 'Q3095674', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'railway station in Argenteuil, France', 'classifications': []}, {'content': 'estación de tren en Francia', 'classifications': []}, {'content': 'gare ferroviaire française', 'classifications': []}, {'content': 'Bahnhof in Frankreich', 'classifications': []}, {'content': 'spoorwegstation in Frankrijk', 'classifications': []}], 'part_of': [{'id': 'Q181946', 'type': 'Place', 
'name': 'Argenteuil'}]}, {'id': 'Q2860945', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'painting by Édouard Manet, 1874', 'classifications': []}, {'content': 'cuadro de Édouard Manet', 'classifications': []}, {'content': "tableau d'Édouard Manet", 'classifications': []}, {'content': 'pintura de Édouard Manet', 'classifications': []}, {'content': 'Gemälde von Édouard Manet aus dem Jahr 1874', 'classifications': []}]}, {'id': 'Q20188741', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'painting by Claude Monet (c. 1872, National Gallery of Art)', 'classifications': []}, {'content': 'cuadro de Claude Monet', 'classifications': []}, {'content': 'peinture de Claude Monet (v. 1872, National Gallery of Art)', 'classifications': []}, {'content': 'pintura de Claude Monet', 'classifications': []}, {'content': 'Ölgemälde von Claude Monet', 'classifications': []}]}]}]
Only return the JSON output, nothing else. Do so with the following schema:
Return a list of entities with the following schema:
class Entity(BaseModel):
    entity_text: str
    label: str
    wikidata_id: str
    sources: list[str]
response = client.responses.create(
    model="gpt-4o",
    input=formatted_prompt,
)
output_text = response.output_text
print(output_text)
```json
[
{
"entity_text": "Monet",
"label": "PERSON",
"wikidata_id": "Q296",
"sources": ["Claude Monet"]
},
{
"entity_text": "Argenteuil",
"label": "LOCATION",
"wikidata_id": "Q181946",
"sources": ["commune in Val-d'Oise, France"]
}
]
```
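def parse_json_with_sources(text):
    # Keep everything after the opening json fence.
    json_data = text.split("```json", 1)[1]
    # Split once on the closing fence; any trailing text is returned as-is.
    json_data, sources = json_data.split("```", 1)
    json_data = json.loads(json_data)
    return json_data, sources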
json_output, sources = parse_json_with_sources(output_text)
print(json_output)
[{'entity_text': 'Monet', 'label': 'PERSON', 'wikidata_id': 'Q296', 'sources': ['Claude Monet']}, {'entity_text': 'Argenteuil', 'label': 'LOCATION', 'wikidata_id': 'Q181946', 'sources': ["commune in Val-d'Oise, France"]}]
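Because hallucinations are still possible even with candidates in the prompt, it can help to confirm that each returned identifier was actually one of the options offered. The following is a minimal sketch of such a check, not part of the original notebook: `find_unknown_ids`, `candidates_by_text`, and `sample_output` are hypothetical names, and the candidate sets are a trimmed-down stand-in for the full candidate lists shown in the prompt above.

```python
def find_unknown_ids(entities, candidates_by_text):
    """Return the entities whose wikidata_id was not among the candidates offered."""
    unknown = []
    for entity in entities:
        # Entities with no candidate list at all are treated as unknown.
        allowed = candidates_by_text.get(entity["entity_text"], set())
        if entity["wikidata_id"] not in allowed:
            unknown.append(entity)
    return unknown


# Hypothetical, shortened candidate sets (IDs taken from the prompt above).
candidates_by_text = {
    "Monet": {"Q296", "Q24698278", "Q2959838"},
    "Argenteuil": {"Q181946", "Q645211", "Q1151230"},
}

# Same shape as the parsed model output.
sample_output = [
    {"entity_text": "Monet", "label": "PERSON", "wikidata_id": "Q296",
     "sources": ["Claude Monet"]},
    {"entity_text": "Argenteuil", "label": "LOCATION", "wikidata_id": "Q181946",
     "sources": ["commune in Val-d'Oise, France"]},
]

print(find_unknown_ids(sample_output, candidates_by_text))
```

An empty list means every identifier came from the candidates. Anything returned by the function is worth inspecting by hand before it is written to the final output.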
from spacy import displacy
import spacy
doc = annotated_text_to_spacy_doc(TEXT)
displacy.render(doc, style="ent")
output_ents = []
pandas_output = []
for ent in doc.ents:
    found = False
    for item in json_output:
        if item["entity_text"] == ent.text:
            # Matched: link the entity label to its Wikidata page.
            output_ents.append({"start": ent.start_char, "end": ent.end_char, "label": f'{ent.label_} <a href="https://www.wikidata.org/wiki/{item["wikidata_id"]}">{item["wikidata_id"]}</a>'})
            pandas_output.append({"entity_text": item["entity_text"], "label": item["label"], "wikidata_id": item["wikidata_id"], "ent_start": ent.start_char, "ent_end": ent.end_char})
            found = True
    if not found:
        # The model returned no identifier for this entity; keep it without a link.
        output_ents.append({"start": ent.start_char, "end": ent.end_char, "label": ent.label_})
        pandas_output.append({"entity_text": ent.text, "label": ent.label_, "wikidata_id": None, "ent_start": ent.start_char, "ent_end": ent.end_char})
dic_ents = {
    "text": doc.text,
    "ents": output_ents,
    "title": None
}
displacy.render(dic_ents, manual=True, style="ent")
2.15. Getting the Data as a DataFrame#
df = pd.DataFrame(pandas_output)
df
| | entity_text | label | wikidata_id | ent_start | ent_end |
|---|---|---|---|---|---|
| 0 | Monet | PERSON | Q296 | 22 | 27 |
| 1 | Camille | PERSON | None | 43 | 50 |
| 2 | French | LOCATION | None | 91 | 97 |
| 3 | Argenteuil | LOCATION | Q181946 | 121 | 131 |
df.to_csv("../../output/entities.csv", index=False)