2.14. Disambiguation with Candidates

When using LLMs, we can leverage context to help determine the correct identifiers for the entities we find. One of the largest challenges with LLMs is getting them to generate the correct identifier for a specific entity. Without context, an LLM will confidently generate a believable-looking identifier. When checked, however, these identifiers often turn out not to exist or to point to the wrong entity entirely.

We solve this problem with context. An LLM can receive context in one of two ways: either we supply the context ourselves, or we use the LLM agentically with tools so that it can retrieve the context for itself. Both have their advantages, and both rest on the same principle: context lets the LLM select the correct identifier rather than hallucinate one. Hallucinations are still possible, but they become less likely when we give the LLM a list of options to choose from.
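Because hallucinations remain possible even with a list of options, it can help to check that whatever identifier the model returns was actually among the candidates we offered. The snippet below is a minimal sketch of such a guard; the function name `validate_choice` is hypothetical (it is not part of this notebook), and the sample records are trimmed to `id` and `name` only.

```python
def validate_choice(chosen_id, candidates):
    """Return the candidate whose id matches chosen_id, or None if it was never offered."""
    by_id = {candidate["id"]: candidate for candidate in candidates}
    return by_id.get(chosen_id)


# Two trimmed records taken from the candidate list used in this notebook
sample_candidates = [
    {"id": "person/642a0152-1567-4fbe-93f3-66f11c5cab9a", "name": "Claude Monet"},
    {"id": "Q2959838", "name": "Charles Monnet"},
]

# An id that was offered resolves to its record
match = validate_choice("Q2959838", sample_candidates)

# An id that was never offered (a possible hallucination) resolves to None
missing = validate_choice("Q000000", sample_candidates)
```

A `None` result is a signal to retry the request or flag the entity for manual review rather than to accept the identifier.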

In this notebook, we will explore the first of these options, where we provide the LLM with a list of candidates that were generated in the previous data notebook. To make things easier, we have pasted the output from that notebook here.
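To make "providing a list of candidates" concrete, the sketch below flattens candidate records into a short text block that could be placed in a prompt. This is only one possible layout, not necessarily the one used later in this notebook: the helper names `summarize_candidate` and `build_candidate_prompt` are hypothetical, and the sample entity contains two records trimmed from the pasted candidate output.

```python
def summarize_candidate(candidate, max_descriptions=2):
    """Build a one-line summary: id, name, birth year (if any), and a few descriptions."""
    parts = [candidate["id"], candidate["name"]]
    if candidate.get("birthDate"):
        # birthDate is an ISO string such as "1840-11-14T00:00:00"; keep only the year
        parts.append("born " + candidate["birthDate"][:4])
    descriptions = [d["content"] for d in candidate.get("descriptions", [])]
    if descriptions:
        parts.append("; ".join(descriptions[:max_descriptions]))
    return " | ".join(parts)


def build_candidate_prompt(entity):
    """Render the entity mention followed by one bullet line per candidate."""
    lines = [f'Entity: "{entity["text"]}" ({entity["label"]})', "Candidates:"]
    lines += [f"- {summarize_candidate(c)}" for c in entity["candidates"]]
    return "\n".join(lines)


# Trimmed sample in the same shape as the candidate output pasted in this notebook
sample_entity = {
    "text": "Monet",
    "label": "PERSON",
    "candidates": [
        {
            "id": "Q296",
            "name": "Claude Monet",
            "descriptions": [{"content": "French painter (1840\u20131926)"}],
            "birthDate": "1840-11-14T00:00:00",
        },
        {
            "id": "Q24698278",
            "name": "Monet",
            "descriptions": [{"content": "family name"}],
        },
    ],
}

prompt = build_candidate_prompt(sample_entity)
print(prompt)
```

Keeping only the identifier, name, and a couple of descriptions is a design choice that trades detail for a shorter prompt; fields such as `classifications` or `member_of` could be added when candidates are hard to tell apart.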

It is also worth noting that providing the LLM with the necessary context is often considerably cheaper (assuming you are using a paid model) than letting the model agentically query the web or use other tools. We will see this in the next notebook.

2.14.1. Installing Packages

!pip install pydantic openai spacy pandas python-dotenv

2.14.2. Getting Our ENV Variables

First, we need to set up our environment variables to access the OpenAI API. We’ll use the python-dotenv package to load environment variables from a .env file, which should contain our OPENAI_API_KEY. This keeps our API key secure by not hardcoding it directly in our code.

import sys
sys.path.append("..")
from dotenv import load_dotenv
import os
import pandas as pd
import json
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
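Note that `os.getenv` returns `None` when a variable is missing, so a misnamed or absent `.env` entry only surfaces later as an authentication error. One optional way to fail early is sketched below; `require_env` is a hypothetical helper, not something this notebook or python-dotenv provides.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable, or raise a clear error if it is unset or empty."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment."
        )
    return value


# Example usage after load_dotenv() has run (commented out so the sketch has no side effects):
# OPENAI_API_KEY = require_env("OPENAI_API_KEY")
```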

2.14.3. Visualization Functions

This bit of code is just for making it easier to visualize our data at the end of the notebook.

import re
import spacy
from spacy.tokens import Doc, Span


def annotated_text_to_spacy_doc(text, nlp=None):
    """
    Converts annotated text in format [Entity](LABEL) to a spaCy Doc with entity spans.
    
    Args:
        text (str): Text with annotations like "[Tom](PERSON) worked for [Microsoft](ORGANIZATION)"
        nlp (spacy.Language, optional): spaCy language model. If None, uses blank English model.
    
    Returns:
        spacy.tokens.Doc: spaCy document with entity spans set
        
    Example:
        >>> text = "[Tom](PERSON) worked for [Microsoft](ORGANIZATION) in 2020 before he lived in [Rome](LOCATION)."
        >>> doc = annotated_text_to_spacy_doc(text)
        >>> spacy.displacy.render(doc, style="ent")
    """
    if nlp is None:
        nlp = spacy.blank("en")
    
    # Pattern to match [text](LABEL) format
    pattern = r'\[([^\]]+)\]\(([^)]+)\)'
    
    # Parse the text to extract tokens and entity information
    tokens = []
    entity_spans = []  # List of (start_token_idx, end_token_idx, label)
    custom_labels = set()
    
    # Split text by the pattern and process each part
    last_end = 0
    token_idx = 0
    
    for match in re.finditer(pattern, text):
        # Add tokens before the entity
        before_entity = text[last_end:match.start()]
        if before_entity.strip():
            # Tokenize the text before the entity
            before_tokens = before_entity.split()
            tokens.extend(before_tokens)
            token_idx += len(before_tokens)
        
        # Add the entity tokens
        entity_text = match.group(1)
        entity_label = match.group(2)
        custom_labels.add(entity_label)
        
        # Tokenize the entity text
        entity_tokens = entity_text.split()
        start_token_idx = token_idx
        tokens.extend(entity_tokens)
        token_idx += len(entity_tokens)
        end_token_idx = token_idx
        
        # Store entity span information
        entity_spans.append((start_token_idx, end_token_idx, entity_label))
        
        last_end = match.end()
    
    # Add any remaining tokens after the last entity
    remaining = text[last_end:]
    if remaining.strip():
        remaining_tokens = remaining.split()
        tokens.extend(remaining_tokens)
    
    # Add custom labels to the NLP model if they don't exist
    if "ner" not in nlp.pipe_names:
        ner = nlp.add_pipe("ner")
    else:
        ner = nlp.get_pipe("ner")
    
    for label in custom_labels:
        ner.add_label(label)
    
    # Create spaces array (True for tokens that should have a space after them)
    # Simple heuristic: all tokens except the last one get a space
    spaces = [True] * len(tokens)
    if tokens:
        spaces[-1] = False
    
    # Create the Doc from tokens
    doc = Doc(nlp.vocab, words=tokens, spaces=spaces)
    
    # Create entity spans
    entities = []
    for start_idx, end_idx, label in entity_spans:
        if start_idx < len(doc) and end_idx <= len(doc):
            span = Span(doc, start_idx, end_idx, label=label)
            entities.append(span)
    
    # Set entities on the document
    doc.ents = entities
    
    return doc


def visualize_annotated_text(text, nlp=None, style="ent", jupyter=True):
    """
    Convenience function to convert annotated text and visualize it with displaCy.
    
    Args:
        text (str): Text with annotations like "[Tom](PERSON) worked for [Microsoft](ORGANIZATION)"
        nlp (spacy.Language, optional): spaCy language model. If None, uses blank English model.
        style (str): displaCy style ("ent" or "dep")
        jupyter (bool): Whether to render for Jupyter notebook
    
    Returns:
        Rendered visualization (HTML string if not in Jupyter)
    """
    doc = annotated_text_to_spacy_doc(text, nlp)
    
    # spaCy is already imported at module level, so a missing install would have
    # failed there; import displacy explicitly and render the document.
    from spacy import displacy

    return displacy.render(doc, style=style, jupyter=jupyter)

from openai import OpenAI

# Client used for the OpenAI API calls in this notebook
client = OpenAI(api_key=OPENAI_API_KEY)

CANDIDATES = [{
  "text": "Monet",
  "label": "PERSON",
  "start_char": 22,
  "end_char": 27,
  "candidates": [
    {
      "id": "person/31450df4-cb6b-44f0-8335-38593ea70104",
      "type": "Person",
      "name": "Jean-Baptiste de Lamarck",
      "classifications": [
        {
          "id": "concept/6f652917-4c07-4d51-8209-fcdd4f285343",
          "type": "Type",
          "name": "male"
        },
        {
          "id": "concept/e46688bf-8720-4f67-85b2-d9e048b95506",
          "type": "Type",
          "name": "Naturalists"
        },
        {
          "id": "concept/b3a2d21c-2782-4da3-aaa4-53c444c4735e",
          "type": "Type",
          "name": "Biologists"
        },
        {
          "id": "concept/d799dcc0-7c99-494b-91c2-0ecc04fd8bc9",
          "type": "Type",
          "name": "Officers"
        },
        {
          "id": "concept/7e91736d-7107-4494-9695-542e76cbf320",
          "type": "Type",
          "name": "French"
        },
        {
          "id": "concept/b779de71-e499-43aa-abd3-ad991a0d1375",
          "type": "Type",
          "name": "Botanists"
        },
        {
          "id": "concept/0e64c455-7fd1-414a-89ce-38102f009ac4",
          "type": "Type",
          "name": "Zoologists"
        },
        {
          "id": "concept/49390038-5b23-441e-b8c5-b4b44d2c04a7",
          "type": "Type",
          "name": "Faculty"
        },
        {
          "id": "concept/787eed88-09dd-4961-99af-cd53378f3ce6",
          "type": "Type",
          "name": "Chemists"
        },
        {
          "id": "concept/4dbea3b6-9049-40bf-bc16-5b0a064ceb56",
          "type": "Type",
          "name": "Meteorologists"
        },
        {
          "id": "concept/9cb213a4-799a-4d64-b755-5980b3045a60",
          "type": "Type",
          "name": "Paleontologists"
        },
        {
          "id": "concept/62ba8667-022f-4c6f-88e2-d843f1462a08",
          "type": "Type",
          "name": "Malacologists"
        },
        {
          "id": "concept/50674beb-e61a-4f72-a34d-58e64f498bbc",
          "type": "Type",
          "name": "Encyclopedists"
        },
        {
          "id": "concept/51a2fcfd-d4b4-42af-872b-f8dcf4a62ced",
          "type": "Type",
          "name": "Authors"
        }
      ],
      "descriptions": [
        {
          "content": "Chevalier; Professor; franz\u00f6sischer Naturforscher, Biologe",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "French naturalist (1744-1829)",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "naturalista franc\u00e9s (1744-1829)",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "officier, naturaliste et professeur de zoologie fran\u00e7ais (1744-1829)",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "qu\u00edmico franc\u00eas",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        }
      ],
      "member_of": [
        {
          "id": "group/ae7678d4-6bf2-452c-8a7f-d170176fd5d3",
          "type": "Group",
          "name": "Soci\u00e9t\u00e9 philomathique de Paris"
        },
        {
          "id": "group/c2379f3d-47ad-4270-b905-380c91904a8d",
          "type": "Group",
          "name": "Acad\u00e9mie de Berlin"
        },
        {
          "id": "group/c50693b7-126c-4c3b-8f84-1bf21c662e65",
          "type": "Group",
          "name": "Bavarian Academy of Sciences and Humanities"
        }
      ],
      "birthDate": "1744-08-01T00:00:00",
      "birthPlace": {
        "id": "place/8940d47c-7650-4cef-b06e-46e30af65a04",
        "type": "Place",
        "name": "Bazentin"
      },
      "deathDate": "1829-12-18T00:00:00",
      "deathPlace": {
        "id": "place/8e117529-3872-494c-ab5f-8d7800be2c64",
        "type": "Place",
        "name": "Paris"
      }
    },
    {
      "id": "person/642a0152-1567-4fbe-93f3-66f11c5cab9a",
      "type": "Person",
      "name": "Claude Monet",
      "classifications": [
        {
          "id": "concept/7e91736d-7107-4494-9695-542e76cbf320",
          "type": "Type",
          "name": "French"
        },
        {
          "id": "concept/6f652917-4c07-4d51-8209-fcdd4f285343",
          "type": "Type",
          "name": "male"
        },
        {
          "id": "concept/0588f9d1-03e3-4b52-b2bf-dd41e601dcdc",
          "type": "Type",
          "name": "Artists"
        },
        {
          "id": "concept/98e4295b-7e89-4836-b601-a195888b6257",
          "type": "Type",
          "name": "caricaturists"
        },
        {
          "id": "concept/4f377430-c1ec-432d-b00c-d70264520e8e",
          "type": "Type",
          "name": "Landscape painters"
        },
        {
          "id": "concept/5272d911-5ccb-4a45-8571-1fed0176d361",
          "type": "Type",
          "name": "Painters"
        },
        {
          "id": "concept/b455d036-ded0-4b6a-b94a-d693dcd7dba4",
          "type": "Type",
          "name": "owners"
        },
        {
          "id": "concept/7ec0c9f8-b1ea-46d7-b5e6-36f23129db6c",
          "type": "Type",
          "name": "Impressionist artists"
        }
      ],
      "descriptions": [
        {
          "content": "French painter",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "French, 1840\u20131926",
          "classifications": [
            {
              "id": "concept/54e35d81-9548-4b4e-8973-de02b09bf9da",
              "type": "Type",
              "name": "display biography"
            }
          ]
        },
        {
          "content": "French painter, 1840-1926",
          "classifications": [
            {
              "id": "concept/54e35d81-9548-4b4e-8973-de02b09bf9da",
              "type": "Type",
              "name": "display biography"
            }
          ]
        },
        {
          "content": "He was a successful caricaturist in his native Le Havre, but after studying plein-air landscape painting, he moved to Paris in 1859. He soon met future Impressionists Camille Pissarro and Pierre-Auguste Renoir. Renoir and Monet began painting outdoors together in the late 1860s, laying the foundations of Impressionism. In 1874, with Pissarro and Edgar Degas, Monet helped organize the Soci\u00e9t\u00e9 Anonyme des Artistes, Peintres, Sculpteurs, Graveurs, etc., the formal name of the Impressionists' group. During the 1870s Monet developed his charateristic technique for rendering atmospheric outdoor light, using broken, rhythmic brushwork. Throughout his career, he remained loyal to the Impressionists' early goal of capturing the transitory effects of nature through direct observation. In 1890 he began creating paintings in series, depicting the same subject under various conditions and at different times of the day. His late pictures, made when he was half-blind, are shimmering pools of color almost totally devoid of form.",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "Peintre. - \u00c9tabli \u00e0 Giverny en 1883",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        }
      ],
      "birthDate": "1840-11-14T00:00:00",
      "birthPlace": {
        "id": "place/8e117529-3872-494c-ab5f-8d7800be2c64",
        "type": "Place",
        "name": "Paris"
      },
      "deathDate": "1926-12-05T00:00:00",
      "deathPlace": {
        "id": "place/1eead86b-4570-4217-b675-fb1fa81f2670",
        "type": "Place",
        "name": "Giverny"
      }
    },
    {
      "id": "person/bad186a1-bc28-4709-8edb-eca3a9faf387",
      "type": "Person",
      "name": "Monet, Jean, 1932-",
      "birthDate": "1932-01-01T00:00:00"
    },
    {
      "id": "person/39884fa6-b0e5-4fdf-98a7-1788f4bad5fb",
      "type": "Person",
      "name": "Monet, J.-C. (Jean-Claude)",
      "classifications": [
        {
          "id": "concept/7e91736d-7107-4494-9695-542e76cbf320",
          "type": "Type",
          "name": "French"
        }
      ],
      "birthDate": "1941-01-01T00:00:00"
    },
    {
      "id": "person/f368e56b-fe27-4f6f-9e16-75725afe8e31",
      "type": "Person",
      "name": "Carter, Frances Monet"
    },
    {
      "id": "person/a1dccf2f-48c7-43cb-a51c-cc2b4fa54958",
      "type": "Person",
      "name": "Monet, Paul",
      "classifications": [
        {
          "id": "concept/6f652917-4c07-4d51-8209-fcdd4f285343",
          "type": "Type",
          "name": "male"
        },
        {
          "id": "concept/d799dcc0-7c99-494b-91c2-0ecc04fd8bc9",
          "type": "Type",
          "name": "Officers"
        },
        {
          "id": "concept/7e91736d-7107-4494-9695-542e76cbf320",
          "type": "Type",
          "name": "French"
        }
      ],
      "descriptions": [
        {
          "content": "Franz\u00f6sischer Offizier der Ehrenlegion, Kapit\u00e4n einer kolonialen Artillerie",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        }
      ],
      "member_of": [
        {
          "id": "group/f6827e5b-8e0f-413c-ada1-4c4e0fc594d6",
          "type": "Group",
          "name": "Minist\u00e8re des colonies"
        },
        {
          "id": "group/a9170801-fd5d-4cbe-b55e-843465f806ab",
          "type": "Group",
          "name": "Acad\u00e9mie Goncourt"
        },
        {
          "id": "group/9064c768-7424-4ccf-9fca-4d8d357ce73a",
          "type": "Group",
          "name": "Vi\u1ec7t Nam Thanh Ni\u00ean H\u1ed9i"
        }
      ],
      "birthDate": "1884-01-13T00:00:00",
      "birthPlace": {
        "id": "place/8de9ae57-d9e0-44f4-8950-54442e0506a1",
        "type": "Place",
        "name": "Angers"
      },
      "deathDate": "1941-05-26T00:00:00"
    },
    {
      "id": "person/2c8940dc-46bb-4029-a2f6-65fd24e8be8c",
      "type": "Person",
      "name": "Laurette Alexis-Monet",
      "classifications": [
        {
          "id": "concept/a309a746-9e51-4c34-b207-7f4773d2ac1a",
          "type": "Type",
          "name": "female"
        },
        {
          "id": "concept/7e91736d-7107-4494-9695-542e76cbf320",
          "type": "Type",
          "name": "French"
        }
      ],
      "birthDate": "1923-07-10T00:00:00",
      "deathDate": "2011-12-15T00:00:00"
    },
    {
      "id": "person/39b693f8-d2da-4d4e-a7a7-fb3dfd8d769d",
      "type": "Person",
      "name": "Monet, Alice K. B.",
      "classifications": [
        {
          "id": "concept/a309a746-9e51-4c34-b207-7f4773d2ac1a",
          "type": "Type",
          "name": "female"
        }
      ]
    },
    {
      "id": "person/60391131-8295-487a-b786-1216c9cc63ef",
      "type": "Person",
      "name": "Monet-Viera, Molly"
    },
    {
      "id": "person/8b0a5321-663f-4681-a034-a57cf47e9383",
      "type": "Person",
      "name": "Monet, Chantal",
      "classifications": [
        {
          "id": "concept/a309a746-9e51-4c34-b207-7f4773d2ac1a",
          "type": "Type",
          "name": "female"
        },
        {
          "id": "concept/303558a7-ab8f-4b09-a7f7-fffc993a84f5",
          "type": "Type",
          "name": "Journalists"
        },
        {
          "id": "concept/83155191-338b-4396-90ee-f9a625bcbfd3",
          "type": "Type",
          "name": "Belgian"
        }
      ]
    },
    {
      "id": "Q296",
      "type": "Person",
      "name": "Claude Monet",
      "classifications": [
        {
          "id": "Q1028181",
          "type": "Type",
          "name": "pintor"
        },
        {
          "id": "Q1925963",
          "type": "Type",
          "name": "artista gr\u00e1fico"
        }
      ],
      "descriptions": [
        {
          "content": "French painter (1840\u20131926)",
          "classifications": []
        },
        {
          "content": "pintor franc\u00e9s",
          "classifications": []
        },
        {
          "content": "peintre impressionniste fran\u00e7ais",
          "classifications": []
        },
        {
          "content": "pintor franc\u00eas (1840-1926)",
          "classifications": []
        },
        {
          "content": "franz\u00f6sischer Maler des Impressionismus (1840\u20131926)",
          "classifications": []
        }
      ],
      "birthDate": "1840-11-14T00:00:00",
      "birthPlace": {
        "id": "Q90",
        "type": "Place",
        "name": "Paris"
      },
      "deathDate": "1926-12-05T00:00:00",
      "deathPlace": {
        "id": "Q165061",
        "type": "Place",
        "name": "Giverny"
      }
    },
    {
      "id": "Q24698278",
      "type": "Person",
      "name": "Monet",
      "descriptions": [
        {
          "content": "family name",
          "classifications": []
        },
        {
          "content": "apellido",
          "classifications": []
        },
        {
          "content": "nom de famille",
          "classifications": []
        },
        {
          "content": "sobrenome",
          "classifications": []
        },
        {
          "content": "Familienname",
          "classifications": []
        }
      ]
    },
    {
      "id": "Q2959838",
      "type": "Person",
      "name": "Charles Monnet",
      "classifications": [
        {
          "id": "Q1028181",
          "type": "Type",
          "name": "pintor"
        }
      ],
      "descriptions": [
        {
          "content": "French court painter (1732-1808)",
          "classifications": []
        },
        {
          "content": "pintor franc\u00e9s",
          "classifications": []
        },
        {
          "content": "peintre fran\u00e7ais",
          "classifications": []
        },
        {
          "content": "pintor franc\u00eas",
          "classifications": []
        },
        {
          "content": "franz\u00f6sischer Hofmaler",
          "classifications": []
        }
      ],
      "birthDate": "1732-01-10T00:00:00",
      "birthPlace": {
        "id": "Q90",
        "type": "Place",
        "name": "Paris"
      },
      "deathDate": "1819-03-19T00:00:00",
      "deathPlace": {
        "id": "Q90",
        "type": "Place",
        "name": "Paris"
      }
    },
    {
      "id": "Q8142",
      "type": "Person",
      "name": "\u901a\u8ca8",
      "descriptions": [
        {
          "content": "generally accepted medium of exchange for goods or services",
          "classifications": []
        },
        {
          "content": "medio de cambio utilizado para bienes o servicios",
          "classifications": []
        },
        {
          "content": "instrument de paiement en vigueur en un lieu et \u00e0 une \u00e9poque donn\u00e9e",
          "classifications": []
        },
        {
          "content": "unidade monet\u00e1ria, meio de pagamento",
          "classifications": []
        },
        {
          "content": "Verfassung und Ordnung des gesamten Geldwesens eines Staates",
          "classifications": []
        }
      ]
    },
    {
      "id": "Q119729672",
      "type": "Person",
      "name": "Monet",
      "descriptions": [
        {
          "content": "given name",
          "classifications": []
        }
      ]
    },
    {
      "id": "Q234900",
      "type": "Person",
      "name": "Linda Darnell",
      "classifications": [
        {
          "id": "Q2259451",
          "type": "Type",
          "name": "stage actor"
        },
        {
          "id": "Q10798782",
          "type": "Type",
          "name": "television actor"
        },
        {
          "id": "Q10800557",
          "type": "Type",
          "name": "film actor"
        }
      ],
      "descriptions": [
        {
          "content": "American actress (1923\u20131965)",
          "classifications": []
        },
        {
          "content": "actriz estadounidense",
          "classifications": []
        },
        {
          "content": "actrice am\u00e9ricaine",
          "classifications": []
        },
        {
          "content": "US-amerikanische Schauspielerin",
          "classifications": []
        },
        {
          "content": "Amerikaans actrice (1923\u20131965)",
          "classifications": []
        }
      ],
      "birthDate": "1923-10-16T00:00:00",
      "birthPlace": {
        "id": "Q16557",
        "type": "Place",
        "name": "Dallas"
      },
      "deathDate": "1965-04-10T00:00:00",
      "deathPlace": {
        "id": "Q1531184",
        "type": "Place",
        "name": "Glenview"
      }
    },
    {
      "id": "Q223162",
      "type": "Person",
      "name": "Mon\u00e9teau",
      "descriptions": [
        {
          "content": "commune in Yonne, France",
          "classifications": []
        },
        {
          "content": "comuna francesa",
          "classifications": []
        },
        {
          "content": "commune fran\u00e7aise du d\u00e9partement de l'Yonne",
          "classifications": []
        },
        {
          "content": "comuna francesa",
          "classifications": []
        },
        {
          "content": "franz\u00f6sische Gemeinde",
          "classifications": []
        }
      ]
    }
  ]
},
{
  "text": "Argenteuil",
  "label": "LOCATION",
  "start_char": 121,
  "end_char": 131,
  "candidates": [
    {
      "id": "place/4699255d-458a-4795-8b04-2614f1c171db",
      "type": "Place",
      "name": "Argenteuil",
      "part_of": [
        {
          "id": "place/b7e88db4-e572-46e6-9617-8a2594bcfa8c",
          "type": "Place",
          "name": "Argenteuil"
        }
      ]
    },
    {
      "id": "place/b4b825fd-4b8e-4642-b37b-0d076a5ccf74",
      "type": "Place",
      "name": "Argenteuil",
      "part_of": [
        {
          "id": "place/682402f8-cdc4-4ebc-ae38-5b3824d2e4aa",
          "type": "Place",
          "name": "Quebec"
        }
      ]
    },
    {
      "id": "place/2f05fdc5-7e9e-4936-bde8-84a88347fde7",
      "type": "Place",
      "name": "Argenteuil",
      "descriptions": [
        {
          "content": "regional county municipality in Quebec, Canada",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "municipalit\u00e9 r\u00e9gionale de comt\u00e9 du Qu\u00e9bec (Canada)",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        }
      ],
      "part_of": [
        {
          "id": "place/cd467ccf-665a-423f-a1b5-1785869d960f",
          "type": "Place",
          "name": "Laurentides"
        }
      ]
    },
    {
      "id": "place/bae1a4f6-a9f0-4bb6-83fc-faec8611194a",
      "type": "Place",
      "name": "arrondissement of Argenteuil",
      "descriptions": [
        {
          "content": "arrondissement of France",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "distrito de Francia",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "arrondissement fran\u00e7ais",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "Verwaltungseinheit in Frankreich",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "arrondissement in Val-d'Oise, Frankrijk",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        }
      ],
      "part_of": [
        {
          "id": "place/bb803f08-8018-4a00-814a-6fceb3ec6d28",
          "type": "Place",
          "name": "Essonne"
        }
      ]
    },
    {
      "id": "place/b7e88db4-e572-46e6-9617-8a2594bcfa8c",
      "type": "Place",
      "name": "Argenteuil",
      "descriptions": [
        {
          "content": "Argenteuil is a commune in the Val-d'Oise department in the \u00cele-de-France region, located about 15 kilometers northwest of Paris, France. (AI generated)",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "Stadt im nordwestlichen Vorortbereich von Paris, an der Seine, im D\u00e9partement Val d'Oise, Frankreich",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "commune in Val-d'Oise, France",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "comuna francesa",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "commune fran\u00e7aise du d\u00e9partement du Val-d'Oise",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        }
      ],
      "part_of": [
        {
          "id": "place/6271aae3-a32f-4aaa-883d-9c99a803a09c",
          "type": "Place",
          "name": "France"
        },
        {
          "id": "place/bae1a4f6-a9f0-4bb6-83fc-faec8611194a",
          "type": "Place",
          "name": "arrondissement of Argenteuil"
        },
        {
          "id": "place/bb803f08-8018-4a00-814a-6fceb3ec6d28",
          "type": "Place",
          "name": "Essonne"
        },
        {
          "id": "place/67b2f4c7-5915-483e-af7e-8e7c218e1b53",
          "type": "Place",
          "name": "Grand Paris"
        },
        {
          "id": "place/4699255d-458a-4795-8b04-2614f1c171db",
          "type": "Place",
          "name": "Argenteuil"
        },
        {
          "id": "place/c4067590-40a9-462c-995a-9c58f100e6e6",
          "type": "Place",
          "name": "Argenteuil"
        }
      ]
    },
    {
      "id": "place/4f8c46e0-3701-4871-8073-0116e17eeed1",
      "type": "Place",
      "name": "Saint-Andr\u00e9-d'Argenteuil",
      "descriptions": [
        {
          "content": "municipality in Quebec, Canada",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "municipio en la\u00a0provincia\u00a0de\u00a0Quebec,\u00a0Canad\u00e1",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        },
        {
          "content": "municipalit\u00e9 au Qu\u00e9bec (Canada)",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        }
      ],
      "part_of": [
        {
          "id": "place/123bf43c-269e-40fd-b37d-c564dce9ce9b",
          "type": "Place",
          "name": "Qu\u00e9bec"
        },
        {
          "id": "place/2f05fdc5-7e9e-4936-bde8-84a88347fde7",
          "type": "Place",
          "name": "Argenteuil"
        },
        {
          "id": "place/b4b825fd-4b8e-4642-b37b-0d076a5ccf74",
          "type": "Place",
          "name": "Argenteuil"
        },
        {
          "id": "place/cd467ccf-665a-423f-a1b5-1785869d960f",
          "type": "Place",
          "name": "Laurentides"
        }
      ]
    },
    {
      "id": "place/1a1b5be6-9f94-4a05-88af-1d49ea123f3c",
      "type": "Place",
      "name": "Argenteuil (Qu\u00e9bec : Division de recensement)"
    },
    {
      "id": "place/b1a8acf8-392b-4739-af32-8db989d806d0",
      "type": "Place",
      "name": "Argenteuil (Qu\u00e9bec)"
    },
    {
      "id": "place/afacb2bd-8041-4d59-a700-18009fae3ad1",
      "type": "Place",
      "name": "North River (Argenteuil, Qu\u00e9bec)"
    },
    {
      "id": "place/9a2c2a4c-dd02-4e54-907a-3b6208174a06",
      "type": "Place",
      "name": "Argenteuil",
      "classifications": [
        {
          "id": "concept/4c4443fb-d094-4de4-a5cb-5e3078d58f06",
          "type": "Type",
          "name": "Cities and towns"
        }
      ],
      "descriptions": [
        {
          "content": "Silver deposits here were exploited by Gauls; town was destroyed by Normans, but rebuilt; convent here was endowed by Charlemagne; was famous in the 12th century for abbess H\u00e9lo\u00efse, of the tragic H\u00e9lo\u00efse-Abelard romance; currently a residential area.",
          "classifications": [
            {
              "id": "concept/b9d84f17-662e-46ef-ab8b-7499717f8337",
              "type": "Type",
              "name": "descriptive note"
            }
          ]
        }
      ],
      "part_of": [
        {
          "id": "place/d5aeace4-86fa-4193-a508-4fa6c615432d",
          "type": "Place",
          "name": "\u00cele-de-France"
        }
      ]
    },
    {
      "id": "Q181946",
      "type": "Place",
      "name": "Argenteuil",
      "descriptions": [
        {
          "content": "commune in Val-d'Oise, France",
          "classifications": []
        },
        {
          "content": "comuna francesa",
          "classifications": []
        },
        {
          "content": "commune fran\u00e7aise du d\u00e9partement du Val-d'Oise",
          "classifications": []
        },
        {
          "content": "comuna francesa",
          "classifications": []
        },
        {
          "content": "franz\u00f6sische Gemeinde",
          "classifications": []
        }
      ],
      "part_of": [
        {
          "id": "Q511613",
          "type": "Place",
          "name": "arrondissement of Argenteuil"
        },
        {
          "id": "Q12784",
          "type": "Place",
          "name": "Val-d'Oise"
        },
        {
          "id": "Q16665915",
          "type": "Place",
          "name": "M\u00e9tropole du Grand Paris"
        }
      ]
    },
    {
      "id": "Q645211",
      "type": "Place",
      "name": "Argenteuil",
      "descriptions": [
        {
          "content": "regional county municipality in Quebec, Canada",
          "classifications": []
        },
        {
          "content": "municipalit\u00e9 r\u00e9gionale de comt\u00e9 du Qu\u00e9bec (Canada)",
          "classifications": []
        }
      ],
      "part_of": [
        {
          "id": "Q2304022",
          "type": "Place",
          "name": "Laurentides"
        }
      ]
    },
    {
      "id": "Q1151230",
      "type": "Place",
      "name": "Argenteuil-sur-Arman\u00e7on",
      "descriptions": [
        {
          "content": "commune in Yonne, France",
          "classifications": []
        },
        {
          "content": "comuna francesa",
          "classifications": []
        },
        {
          "content": "commune fran\u00e7aise du d\u00e9partement de l'Yonne",
          "classifications": []
        },
        {
          "content": "comuna francesa",
          "classifications": []
        },
        {
          "content": "franz\u00f6sische Gemeinde",
          "classifications": []
        }
      ],
      "part_of": [
        {
          "id": "Q1724141",
          "type": "Place",
          "name": "canton of Ancy-le-Franc"
        },
        {
          "id": "Q12816",
          "type": "Place",
          "name": "Yonne"
        },
        {
          "id": "Q700536",
          "type": "Place",
          "name": "arrondissement of Avallon"
        }
      ]
    },
    {
      "id": "Q2860941",
      "type": "Place",
      "name": "Argenteuil",
      "descriptions": [
        {
          "content": "provincial electoral district in Quebec, Canada",
          "classifications": []
        },
        {
          "content": "circonscription electorale provinciale du Qu\u00e9bec, Canada",
          "classifications": []
        },
        {
          "content": "Provinzwahlkreis in Qu\u00e9bec",
          "classifications": []
        }
      ],
      "part_of": [
        {
          "id": "Q176",
          "type": "Place",
          "name": "Quebec"
        }
      ]
    },
    {
      "id": "Q3095674",
      "type": "Place",
      "name": "Argenteuil",
      "descriptions": [
        {
          "content": "railway station in Argenteuil, France",
          "classifications": []
        },
        {
          "content": "estaci\u00f3n de tren en Francia",
          "classifications": []
        },
        {
          "content": "gare ferroviaire fran\u00e7aise",
          "classifications": []
        },
        {
          "content": "Bahnhof in Frankreich",
          "classifications": []
        },
        {
          "content": "spoorwegstation in Frankrijk",
          "classifications": []
        }
      ],
      "part_of": [
        {
          "id": "Q181946",
          "type": "Place",
          "name": "Argenteuil"
        }
      ]
    },
    {
      "id": "Q2860945",
      "type": "Place",
      "name": "Argenteuil",
      "descriptions": [
        {
          "content": "painting by \u00c9douard Manet, 1874",
          "classifications": []
        },
        {
          "content": "cuadro de \u00c9douard Manet",
          "classifications": []
        },
        {
          "content": "tableau d'\u00c9douard Manet",
          "classifications": []
        },
        {
          "content": "pintura de \u00c9douard Manet",
          "classifications": []
        },
        {
          "content": "Gem\u00e4lde von \u00c9douard Manet aus dem Jahr 1874",
          "classifications": []
        }
      ]
    },
    {
      "id": "Q20188741",
      "type": "Place",
      "name": "Argenteuil",
      "descriptions": [
        {
          "content": "painting by Claude Monet (c. 1872, National Gallery of Art)",
          "classifications": []
        },
        {
          "content": "cuadro de Claude Monet",
          "classifications": []
        },
        {
          "content": "peinture de Claude Monet (v. 1872, National Gallery of Art)",
          "classifications": []
        },
        {
          "content": "pintura de Claude Monet",
          "classifications": []
        },
        {
          "content": "\u00d6lgem\u00e4lde von Claude Monet",
          "classifications": []
        }
      ]
    }
  ]
}
]
TEXT = "This painting depicts [Monet](PERSON)'s first wife, [Camille](PERSON), outside on a snowy day passing by the [French](LOCATION) doors of their home at [Argenteuil](LOCATION). Her face is rendered in a radically bold Impressionist technique of mere daubs of paint quickly applied, just as the snow and trees are defined by broad, broken strokes of pure white and green."
MODEL = "gpt-4o-mini"
prompt = """
Disambiguate the entities in the following text.

{text}

Here are the Candidates:

{candidates}

Only return the JSON output, nothing else. Do so with the following schema:

Return a list of entities with the following schema:
class Entity(BaseModel):
    entity_text: str
    label: str
    wikidata_id: str
    sources: list[str]

"""
formatted_prompt = prompt.format(candidates=CANDIDATES, text=TEXT)
print(formatted_prompt)
Disambiguate the entities in the following text.

This painting depicts [Monet](PERSON)'s first wife, [Camille](PERSON), outside on a snowy day passing by the [French](LOCATION) doors of their home at [Argenteuil](LOCATION). Her face is rendered in a radically bold Impressionist technique of mere daubs of paint quickly applied, just as the snow and trees are defined by broad, broken strokes of pure white and green.

Here are the Candidates:

[{'text': 'Monet', 'label': 'PERSON', 'start_char': 22, 'end_char': 27, 'candidates': [{'id': 'person/31450df4-cb6b-44f0-8335-38593ea70104', 'type': 'Person', 'name': 'Jean-Baptiste de Lamarck', 'classifications': [{'id': 'concept/6f652917-4c07-4d51-8209-fcdd4f285343', 'type': 'Type', 'name': 'male'}, {'id': 'concept/e46688bf-8720-4f67-85b2-d9e048b95506', 'type': 'Type', 'name': 'Naturalists'}, {'id': 'concept/b3a2d21c-2782-4da3-aaa4-53c444c4735e', 'type': 'Type', 'name': 'Biologists'}, {'id': 'concept/d799dcc0-7c99-494b-91c2-0ecc04fd8bc9', 'type': 'Type', 'name': 'Officers'}, {'id': 'concept/7e91736d-7107-4494-9695-542e76cbf320', 'type': 'Type', 'name': 'French'}, {'id': 'concept/b779de71-e499-43aa-abd3-ad991a0d1375', 'type': 'Type', 'name': 'Botanists'}, {'id': 'concept/0e64c455-7fd1-414a-89ce-38102f009ac4', 'type': 'Type', 'name': 'Zoologists'}, {'id': 'concept/49390038-5b23-441e-b8c5-b4b44d2c04a7', 'type': 'Type', 'name': 'Faculty'}, {'id': 'concept/787eed88-09dd-4961-99af-cd53378f3ce6', 'type': 'Type', 'name': 'Chemists'}, {'id': 'concept/4dbea3b6-9049-40bf-bc16-5b0a064ceb56', 'type': 'Type', 'name': 'Meteorologists'}, {'id': 'concept/9cb213a4-799a-4d64-b755-5980b3045a60', 'type': 'Type', 'name': 'Paleontologists'}, {'id': 'concept/62ba8667-022f-4c6f-88e2-d843f1462a08', 'type': 'Type', 'name': 'Malacologists'}, {'id': 'concept/50674beb-e61a-4f72-a34d-58e64f498bbc', 'type': 'Type', 'name': 'Encyclopedists'}, {'id': 'concept/51a2fcfd-d4b4-42af-872b-f8dcf4a62ced', 'type': 'Type', 'name': 'Authors'}], 'descriptions': [{'content': 'Chevalier; Professor; französischer Naturforscher, Biologe', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'French naturalist (1744-1829)', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'naturalista francés (1744-1829)', 'classifications': [{'id': 
'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'officier, naturaliste et professeur de zoologie français (1744-1829)', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'químico francês', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'member_of': [{'id': 'group/ae7678d4-6bf2-452c-8a7f-d170176fd5d3', 'type': 'Group', 'name': 'Société philomathique de Paris'}, {'id': 'group/c2379f3d-47ad-4270-b905-380c91904a8d', 'type': 'Group', 'name': 'Académie de Berlin'}, {'id': 'group/c50693b7-126c-4c3b-8f84-1bf21c662e65', 'type': 'Group', 'name': 'Bavarian Academy of Sciences and Humanities'}], 'birthDate': '1744-08-01T00:00:00', 'birthPlace': {'id': 'place/8940d47c-7650-4cef-b06e-46e30af65a04', 'type': 'Place', 'name': 'Bazentin'}, 'deathDate': '1829-12-18T00:00:00', 'deathPlace': {'id': 'place/8e117529-3872-494c-ab5f-8d7800be2c64', 'type': 'Place', 'name': 'Paris'}}, {'id': 'person/642a0152-1567-4fbe-93f3-66f11c5cab9a', 'type': 'Person', 'name': 'Claude Monet', 'classifications': [{'id': 'concept/7e91736d-7107-4494-9695-542e76cbf320', 'type': 'Type', 'name': 'French'}, {'id': 'concept/6f652917-4c07-4d51-8209-fcdd4f285343', 'type': 'Type', 'name': 'male'}, {'id': 'concept/0588f9d1-03e3-4b52-b2bf-dd41e601dcdc', 'type': 'Type', 'name': 'Artists'}, {'id': 'concept/98e4295b-7e89-4836-b601-a195888b6257', 'type': 'Type', 'name': 'caricaturists'}, {'id': 'concept/4f377430-c1ec-432d-b00c-d70264520e8e', 'type': 'Type', 'name': 'Landscape painters'}, {'id': 'concept/5272d911-5ccb-4a45-8571-1fed0176d361', 'type': 'Type', 'name': 'Painters'}, {'id': 'concept/b455d036-ded0-4b6a-b94a-d693dcd7dba4', 'type': 'Type', 'name': 'owners'}, {'id': 'concept/7ec0c9f8-b1ea-46d7-b5e6-36f23129db6c', 'type': 'Type', 'name': 'Impressionist artists'}], 'descriptions': [{'content': 'French 
painter', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'French, 1840–1926', 'classifications': [{'id': 'concept/54e35d81-9548-4b4e-8973-de02b09bf9da', 'type': 'Type', 'name': 'display biography'}]}, {'content': 'French painter, 1840-1926', 'classifications': [{'id': 'concept/54e35d81-9548-4b4e-8973-de02b09bf9da', 'type': 'Type', 'name': 'display biography'}]}, {'content': "He was a successful caricaturist in his native Le Havre, but after studying plein-air landscape painting, he moved to Paris in 1859. He soon met future Impressionists Camille Pissarro and Pierre-Auguste Renoir. Renoir and Monet began painting outdoors together in the late 1860s, laying the foundations of Impressionism. In 1874, with Pissarro and Edgar Degas, Monet helped organize the Société Anonyme des Artistes, Peintres, Sculpteurs, Graveurs, etc., the formal name of the Impressionists' group. During the 1870s Monet developed his charateristic technique for rendering atmospheric outdoor light, using broken, rhythmic brushwork. Throughout his career, he remained loyal to the Impressionists' early goal of capturing the transitory effects of nature through direct observation. In 1890 he began creating paintings in series, depicting the same subject under various conditions and at different times of the day. His late pictures, made when he was half-blind, are shimmering pools of color almost totally devoid of form.", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'Peintre. 
- Établi à Giverny en 1883', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'birthDate': '1840-11-14T00:00:00', 'birthPlace': {'id': 'place/8e117529-3872-494c-ab5f-8d7800be2c64', 'type': 'Place', 'name': 'Paris'}, 'deathDate': '1926-12-05T00:00:00', 'deathPlace': {'id': 'place/1eead86b-4570-4217-b675-fb1fa81f2670', 'type': 'Place', 'name': 'Giverny'}}, {'id': 'person/bad186a1-bc28-4709-8edb-eca3a9faf387', 'type': 'Person', 'name': 'Monet, Jean, 1932-', 'birthDate': '1932-01-01T00:00:00'}, {'id': 'person/39884fa6-b0e5-4fdf-98a7-1788f4bad5fb', 'type': 'Person', 'name': 'Monet, J.-C. (Jean-Claude)', 'classifications': [{'id': 'concept/7e91736d-7107-4494-9695-542e76cbf320', 'type': 'Type', 'name': 'French'}], 'birthDate': '1941-01-01T00:00:00'}, {'id': 'person/f368e56b-fe27-4f6f-9e16-75725afe8e31', 'type': 'Person', 'name': 'Carter, Frances Monet'}, {'id': 'person/a1dccf2f-48c7-43cb-a51c-cc2b4fa54958', 'type': 'Person', 'name': 'Monet, Paul', 'classifications': [{'id': 'concept/6f652917-4c07-4d51-8209-fcdd4f285343', 'type': 'Type', 'name': 'male'}, {'id': 'concept/d799dcc0-7c99-494b-91c2-0ecc04fd8bc9', 'type': 'Type', 'name': 'Officers'}, {'id': 'concept/7e91736d-7107-4494-9695-542e76cbf320', 'type': 'Type', 'name': 'French'}], 'descriptions': [{'content': 'Französischer Offizier der Ehrenlegion, Kapitän einer kolonialen Artillerie', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'member_of': [{'id': 'group/f6827e5b-8e0f-413c-ada1-4c4e0fc594d6', 'type': 'Group', 'name': 'Ministère des colonies'}, {'id': 'group/a9170801-fd5d-4cbe-b55e-843465f806ab', 'type': 'Group', 'name': 'Académie Goncourt'}, {'id': 'group/9064c768-7424-4ccf-9fca-4d8d357ce73a', 'type': 'Group', 'name': 'Việt Nam Thanh Niên Hội'}], 'birthDate': '1884-01-13T00:00:00', 'birthPlace': {'id': 'place/8de9ae57-d9e0-44f4-8950-54442e0506a1', 'type': 'Place', 
'name': 'Angers'}, 'deathDate': '1941-05-26T00:00:00'}, {'id': 'person/2c8940dc-46bb-4029-a2f6-65fd24e8be8c', 'type': 'Person', 'name': 'Laurette Alexis-Monet', 'classifications': [{'id': 'concept/a309a746-9e51-4c34-b207-7f4773d2ac1a', 'type': 'Type', 'name': 'female'}, {'id': 'concept/7e91736d-7107-4494-9695-542e76cbf320', 'type': 'Type', 'name': 'French'}], 'birthDate': '1923-07-10T00:00:00', 'deathDate': '2011-12-15T00:00:00'}, {'id': 'person/39b693f8-d2da-4d4e-a7a7-fb3dfd8d769d', 'type': 'Person', 'name': 'Monet, Alice K. B.', 'classifications': [{'id': 'concept/a309a746-9e51-4c34-b207-7f4773d2ac1a', 'type': 'Type', 'name': 'female'}]}, {'id': 'person/60391131-8295-487a-b786-1216c9cc63ef', 'type': 'Person', 'name': 'Monet-Viera, Molly'}, {'id': 'person/8b0a5321-663f-4681-a034-a57cf47e9383', 'type': 'Person', 'name': 'Monet, Chantal', 'classifications': [{'id': 'concept/a309a746-9e51-4c34-b207-7f4773d2ac1a', 'type': 'Type', 'name': 'female'}, {'id': 'concept/303558a7-ab8f-4b09-a7f7-fffc993a84f5', 'type': 'Type', 'name': 'Journalists'}, {'id': 'concept/83155191-338b-4396-90ee-f9a625bcbfd3', 'type': 'Type', 'name': 'Belgian'}]}, {'id': 'Q296', 'type': 'Person', 'name': 'Claude Monet', 'classifications': [{'id': 'Q1028181', 'type': 'Type', 'name': 'pintor'}, {'id': 'Q1925963', 'type': 'Type', 'name': 'artista gráfico'}], 'descriptions': [{'content': 'French painter (1840–1926)', 'classifications': []}, {'content': 'pintor francés', 'classifications': []}, {'content': 'peintre impressionniste français', 'classifications': []}, {'content': 'pintor francês (1840-1926)', 'classifications': []}, {'content': 'französischer Maler des Impressionismus (1840–1926)', 'classifications': []}], 'birthDate': '1840-11-14T00:00:00', 'birthPlace': {'id': 'Q90', 'type': 'Place', 'name': 'Paris'}, 'deathDate': '1926-12-05T00:00:00', 'deathPlace': {'id': 'Q165061', 'type': 'Place', 'name': 'Giverny'}}, {'id': 'Q24698278', 'type': 'Person', 'name': 'Monet', 'descriptions': [{'content': 
'family name', 'classifications': []}, {'content': 'apellido', 'classifications': []}, {'content': 'nom de famille', 'classifications': []}, {'content': 'sobrenome', 'classifications': []}, {'content': 'Familienname', 'classifications': []}]}, {'id': 'Q2959838', 'type': 'Person', 'name': 'Charles Monnet', 'classifications': [{'id': 'Q1028181', 'type': 'Type', 'name': 'pintor'}], 'descriptions': [{'content': 'French court painter (1732-1808)', 'classifications': []}, {'content': 'pintor francés', 'classifications': []}, {'content': 'peintre français', 'classifications': []}, {'content': 'pintor francês', 'classifications': []}, {'content': 'französischer Hofmaler', 'classifications': []}], 'birthDate': '1732-01-10T00:00:00', 'birthPlace': {'id': 'Q90', 'type': 'Place', 'name': 'Paris'}, 'deathDate': '1819-03-19T00:00:00', 'deathPlace': {'id': 'Q90', 'type': 'Place', 'name': 'Paris'}}, {'id': 'Q8142', 'type': 'Person', 'name': '通貨', 'descriptions': [{'content': 'generally accepted medium of exchange for goods or services', 'classifications': []}, {'content': 'medio de cambio utilizado para bienes o servicios', 'classifications': []}, {'content': 'instrument de paiement en vigueur en un lieu et à une époque donnée', 'classifications': []}, {'content': 'unidade monetária, meio de pagamento', 'classifications': []}, {'content': 'Verfassung und Ordnung des gesamten Geldwesens eines Staates', 'classifications': []}]}, {'id': 'Q119729672', 'type': 'Person', 'name': 'Monet', 'descriptions': [{'content': 'given name', 'classifications': []}]}, {'id': 'Q234900', 'type': 'Person', 'name': 'Linda Darnell', 'classifications': [{'id': 'Q2259451', 'type': 'Type', 'name': 'stage actor'}, {'id': 'Q10798782', 'type': 'Type', 'name': 'television actor'}, {'id': 'Q10800557', 'type': 'Type', 'name': 'film actor'}], 'descriptions': [{'content': 'American actress (1923–1965)', 'classifications': []}, {'content': 'actriz estadounidense', 'classifications': []}, {'content': 'actrice 
américaine', 'classifications': []}, {'content': 'US-amerikanische Schauspielerin', 'classifications': []}, {'content': 'Amerikaans actrice (1923–1965)', 'classifications': []}], 'birthDate': '1923-10-16T00:00:00', 'birthPlace': {'id': 'Q16557', 'type': 'Place', 'name': 'Dallas'}, 'deathDate': '1965-04-10T00:00:00', 'deathPlace': {'id': 'Q1531184', 'type': 'Place', 'name': 'Glenview'}}, {'id': 'Q223162', 'type': 'Person', 'name': 'Monéteau', 'descriptions': [{'content': 'commune in Yonne, France', 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': "commune française du département de l'Yonne", 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': 'französische Gemeinde', 'classifications': []}]}]}, {'text': 'Argenteuil', 'label': 'LOCATION', 'start_char': 121, 'end_char': 131, 'candidates': [{'id': 'place/4699255d-458a-4795-8b04-2614f1c171db', 'type': 'Place', 'name': 'Argenteuil', 'part_of': [{'id': 'place/b7e88db4-e572-46e6-9617-8a2594bcfa8c', 'type': 'Place', 'name': 'Argenteuil'}]}, {'id': 'place/b4b825fd-4b8e-4642-b37b-0d076a5ccf74', 'type': 'Place', 'name': 'Argenteuil', 'part_of': [{'id': 'place/682402f8-cdc4-4ebc-ae38-5b3824d2e4aa', 'type': 'Place', 'name': 'Quebec'}]}, {'id': 'place/2f05fdc5-7e9e-4936-bde8-84a88347fde7', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'regional county municipality in Quebec, Canada', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'municipalité régionale de comté du Québec (Canada)', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'part_of': [{'id': 'place/cd467ccf-665a-423f-a1b5-1785869d960f', 'type': 'Place', 'name': 'Laurentides'}]}, {'id': 'place/bae1a4f6-a9f0-4bb6-83fc-faec8611194a', 'type': 'Place', 'name': 'arrondissement of Argenteuil', 'descriptions': 
[{'content': 'arrondissement of France', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'distrito de Francia', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'arrondissement français', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'Verwaltungseinheit in Frankreich', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': "arrondissement in Val-d'Oise, Frankrijk", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'part_of': [{'id': 'place/bb803f08-8018-4a00-814a-6fceb3ec6d28', 'type': 'Place', 'name': 'Essonne'}]}, {'id': 'place/b7e88db4-e572-46e6-9617-8a2594bcfa8c', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': "Argenteuil is a commune in the Val-d'Oise department in the Île-de-France region, located about 15 kilometers northwest of Paris, France. 
(AI generated)", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': "Stadt im nordwestlichen Vorortbereich von Paris, an der Seine, im Département Val d'Oise, Frankreich", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': "commune in Val-d'Oise, France", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'comuna francesa', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': "commune française du département du Val-d'Oise", 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'part_of': [{'id': 'place/6271aae3-a32f-4aaa-883d-9c99a803a09c', 'type': 'Place', 'name': 'France'}, {'id': 'place/bae1a4f6-a9f0-4bb6-83fc-faec8611194a', 'type': 'Place', 'name': 'arrondissement of Argenteuil'}, {'id': 'place/bb803f08-8018-4a00-814a-6fceb3ec6d28', 'type': 'Place', 'name': 'Essonne'}, {'id': 'place/67b2f4c7-5915-483e-af7e-8e7c218e1b53', 'type': 'Place', 'name': 'Grand Paris'}, {'id': 'place/4699255d-458a-4795-8b04-2614f1c171db', 'type': 'Place', 'name': 'Argenteuil'}, {'id': 'place/c4067590-40a9-462c-995a-9c58f100e6e6', 'type': 'Place', 'name': 'Argenteuil'}]}, {'id': 'place/4f8c46e0-3701-4871-8073-0116e17eeed1', 'type': 'Place', 'name': "Saint-André-d'Argenteuil", 'descriptions': [{'content': 'municipality in Quebec, Canada', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'municipio en la\xa0provincia\xa0de\xa0Quebec,\xa0Canadá', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}, {'content': 'municipalité au Québec (Canada)', 
'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'part_of': [{'id': 'place/123bf43c-269e-40fd-b37d-c564dce9ce9b', 'type': 'Place', 'name': 'Québec'}, {'id': 'place/2f05fdc5-7e9e-4936-bde8-84a88347fde7', 'type': 'Place', 'name': 'Argenteuil'}, {'id': 'place/b4b825fd-4b8e-4642-b37b-0d076a5ccf74', 'type': 'Place', 'name': 'Argenteuil'}, {'id': 'place/cd467ccf-665a-423f-a1b5-1785869d960f', 'type': 'Place', 'name': 'Laurentides'}]}, {'id': 'place/1a1b5be6-9f94-4a05-88af-1d49ea123f3c', 'type': 'Place', 'name': 'Argenteuil (Québec : Division de recensement)'}, {'id': 'place/b1a8acf8-392b-4739-af32-8db989d806d0', 'type': 'Place', 'name': 'Argenteuil (Québec)'}, {'id': 'place/afacb2bd-8041-4d59-a700-18009fae3ad1', 'type': 'Place', 'name': 'North River (Argenteuil, Québec)'}, {'id': 'place/9a2c2a4c-dd02-4e54-907a-3b6208174a06', 'type': 'Place', 'name': 'Argenteuil', 'classifications': [{'id': 'concept/4c4443fb-d094-4de4-a5cb-5e3078d58f06', 'type': 'Type', 'name': 'Cities and towns'}], 'descriptions': [{'content': 'Silver deposits here were exploited by Gauls; town was destroyed by Normans, but rebuilt; convent here was endowed by Charlemagne; was famous in the 12th century for abbess Héloïse, of the tragic Héloïse-Abelard romance; currently a residential area.', 'classifications': [{'id': 'concept/b9d84f17-662e-46ef-ab8b-7499717f8337', 'type': 'Type', 'name': 'descriptive note'}]}], 'part_of': [{'id': 'place/d5aeace4-86fa-4193-a508-4fa6c615432d', 'type': 'Place', 'name': 'Île-de-France'}]}, {'id': 'Q181946', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': "commune in Val-d'Oise, France", 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': "commune française du département du Val-d'Oise", 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': 'französische Gemeinde', 'classifications': []}], 'part_of': [{'id': 
'Q511613', 'type': 'Place', 'name': 'arrondissement of Argenteuil'}, {'id': 'Q12784', 'type': 'Place', 'name': "Val-d'Oise"}, {'id': 'Q16665915', 'type': 'Place', 'name': 'Métropole du Grand Paris'}]}, {'id': 'Q645211', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'regional county municipality in Quebec, Canada', 'classifications': []}, {'content': 'municipalité régionale de comté du Québec (Canada)', 'classifications': []}], 'part_of': [{'id': 'Q2304022', 'type': 'Place', 'name': 'Laurentides'}]}, {'id': 'Q1151230', 'type': 'Place', 'name': 'Argenteuil-sur-Armançon', 'descriptions': [{'content': 'commune in Yonne, France', 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': "commune française du département de l'Yonne", 'classifications': []}, {'content': 'comuna francesa', 'classifications': []}, {'content': 'französische Gemeinde', 'classifications': []}], 'part_of': [{'id': 'Q1724141', 'type': 'Place', 'name': 'canton of Ancy-le-Franc'}, {'id': 'Q12816', 'type': 'Place', 'name': 'Yonne'}, {'id': 'Q700536', 'type': 'Place', 'name': 'arrondissement of Avallon'}]}, {'id': 'Q2860941', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'provincial electoral district in Quebec, Canada', 'classifications': []}, {'content': 'circonscription electorale provinciale du Québec, Canada', 'classifications': []}, {'content': 'Provinzwahlkreis in Québec', 'classifications': []}], 'part_of': [{'id': 'Q176', 'type': 'Place', 'name': 'Quebec'}]}, {'id': 'Q3095674', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'railway station in Argenteuil, France', 'classifications': []}, {'content': 'estación de tren en Francia', 'classifications': []}, {'content': 'gare ferroviaire française', 'classifications': []}, {'content': 'Bahnhof in Frankreich', 'classifications': []}, {'content': 'spoorwegstation in Frankrijk', 'classifications': []}], 'part_of': [{'id': 'Q181946', 'type': 'Place', 
'name': 'Argenteuil'}]}, {'id': 'Q2860945', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'painting by Édouard Manet, 1874', 'classifications': []}, {'content': 'cuadro de Édouard Manet', 'classifications': []}, {'content': "tableau d'Édouard Manet", 'classifications': []}, {'content': 'pintura de Édouard Manet', 'classifications': []}, {'content': 'Gemälde von Édouard Manet aus dem Jahr 1874', 'classifications': []}]}, {'id': 'Q20188741', 'type': 'Place', 'name': 'Argenteuil', 'descriptions': [{'content': 'painting by Claude Monet (c. 1872, National Gallery of Art)', 'classifications': []}, {'content': 'cuadro de Claude Monet', 'classifications': []}, {'content': 'peinture de Claude Monet (v. 1872, National Gallery of Art)', 'classifications': []}, {'content': 'pintura de Claude Monet', 'classifications': []}, {'content': 'Ölgemälde von Claude Monet', 'classifications': []}]}]}]

Only return the JSON output, nothing else. Do so with the following schema:

Return a list of entities with the following schema:
class Entity(BaseModel):
    entity_text: str
    label: str
    wikidata_id: str
    sources: list[str]
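The printed prompt is long because every candidate carries its full record. Below is a minimal sketch of trimming each candidate to a few fields before formatting the prompt, assuming the candidate structure shown above (`id`, `name`, `descriptions`, `part_of`). The helper name `compact_candidates` and the sample data are hypothetical and not part of the original notebook.

```python
import json


def compact_candidates(entities, max_descriptions=1):
    """Keep only id, name, a few descriptions, and parent names per candidate."""
    compact = []
    for entity in entities:
        slim = []
        for cand in entity.get("candidates", []):
            item = {"id": cand["id"], "name": cand["name"]}
            descriptions = [d["content"] for d in cand.get("descriptions", [])]
            if descriptions:
                item["descriptions"] = descriptions[:max_descriptions]
            parents = [p["name"] for p in cand.get("part_of", [])]
            if parents:
                item["part_of"] = parents
            slim.append(item)
        compact.append(
            {"text": entity["text"], "label": entity["label"], "candidates": slim}
        )
    return compact


# Hypothetical sample that mirrors the structure of CANDIDATES.
sample = [{
    "text": "Argenteuil", "label": "LOCATION", "start_char": 121, "end_char": 131,
    "candidates": [{
        "id": "Q181946", "type": "Place", "name": "Argenteuil",
        "descriptions": [
            {"content": "commune in Val-d'Oise, France", "classifications": []},
            {"content": "comuna francesa", "classifications": []},
        ],
        "part_of": [{"id": "Q12784", "type": "Place", "name": "Val-d'Oise"}],
    }],
}]

compact_sample = compact_candidates(sample)
print(json.dumps(compact_sample, ensure_ascii=False, indent=2))
```

This is a trade-off rather than a free win: fields dropped here, such as birth dates or classifications, may be exactly what the model needs to separate two similar candidates, so the kept fields should be chosen per entity type.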
response = client.responses.create(
    model=MODEL,  # reuse the model name defined above instead of hardcoding another one
    input=formatted_prompt,
)

output_text = response.output_text
print(output_text)
```json
[
    {
        "entity_text": "Monet",
        "label": "PERSON",
        "wikidata_id": "Q296",
        "sources": ["Claude Monet"]
    },
    {
        "entity_text": "Argenteuil",
        "label": "LOCATION",
        "wikidata_id": "Q181946",
        "sources": ["commune in Val-d'Oise, France"]
    }
]
```
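def parse_json_with_sources(text):
    """Parse the model's JSON reply; return (data, any text after the closing fence)."""
    if "```json" in text:
        # Keep what follows the opening fence, then split once at the closing fence.
        body = text.split("```json", 1)[1]
        json_data, _, sources = body.partition("```")
    else:
        # The model returned bare JSON with no Markdown fence.
        json_data, sources = text, ""
    return json.loads(json_data), sources.strip()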


json_output, sources = parse_json_with_sources(output_text)
print(json_output)
[{'entity_text': 'Monet', 'label': 'PERSON', 'wikidata_id': 'Q296', 'sources': ['Claude Monet']}, {'entity_text': 'Argenteuil', 'label': 'LOCATION', 'wikidata_id': 'Q181946', 'sources': ["commune in Val-d'Oise, France"]}]
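Because hallucinated identifiers are still possible even with candidates in the prompt, it may be worth checking that each returned identifier actually appears among the candidates offered for that entity. The sketch below assumes the output and candidate structures shown above. The helper `check_ids` and the sample data are hypothetical, and `Q999999999` is a made-up identifier used only to show a failing case.

```python
def check_ids(entities, candidates):
    """Split model output into (valid, invalid) by checking each returned id
    against the candidate ids offered for the same entity text."""
    allowed = {}
    for entry in candidates:
        ids = {cand["id"] for cand in entry.get("candidates", [])}
        allowed.setdefault(entry["text"], set()).update(ids)

    valid, invalid = [], []
    for entity in entities:
        offered = allowed.get(entity.get("entity_text"), set())
        if entity.get("wikidata_id") in offered:
            valid.append(entity)
        else:
            invalid.append(entity)
    return valid, invalid


# Hypothetical stand-ins for CANDIDATES and json_output.
sample_candidates = [
    {"text": "Monet", "candidates": [{"id": "Q296"}, {"id": "Q2959838"}]},
    {"text": "Argenteuil", "candidates": [{"id": "Q181946"}, {"id": "Q645211"}]},
]
sample_output = [
    {"entity_text": "Monet", "label": "PERSON", "wikidata_id": "Q296", "sources": []},
    # Made-up id that was never offered as a candidate.
    {"entity_text": "Argenteuil", "label": "LOCATION", "wikidata_id": "Q999999999", "sources": []},
]

valid, invalid = check_ids(sample_output, sample_candidates)
print("valid:", valid)
print("invalid:", invalid)
```

Entities that land in `invalid` could be dropped, retried, or flagged for manual review, depending on how much a wrong link costs downstream.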
from spacy import displacy
import spacy
doc = annotated_text_to_spacy_doc(TEXT)
displacy.render(doc, style="ent")
This painting depicts Monet PERSON 's first wife, Camille PERSON , outside on a snowy day passing by the French LOCATION doors of their home at Argenteuil LOCATION . Her face is rendered in a radically bold Impressionist technique of mere daubs of paint quickly applied, just as the snow and trees are defined by broad, broken strokes of pure white and green.
output_ents = []
pandas_output = []
for ent in doc.ents:
    found = False
    for item in json_output:
        if item["entity_text"] == ent.text:
            output_ents.append({"start": ent.start_char, "end": ent.end_char, "label": f'{ent.label_} <a href="https://www.wikidata.org/wiki/{item["wikidata_id"]}">{item["wikidata_id"]}</a>'})
            pandas_output.append({"entity_text": item["entity_text"], "label": item["label"], "wikidata_id": item["wikidata_id"], "ent_start": ent.start_char, "ent_end": ent.end_char})
            found = True
            break  # stop at the first match so the same span is not added twice
    if not found:
        # The model returned no identifier for this entity; keep it without a link.
        output_ents.append({"start": ent.start_char, "end": ent.end_char, "label": ent.label_})
        pandas_output.append({"entity_text": ent.text, "label": ent.label_, "wikidata_id": None, "ent_start": ent.start_char, "ent_end": ent.end_char})
dic_ents = {
    "text": doc.text,
    "ents": output_ents,
    "title": None
}

displacy.render(dic_ents, manual=True, style="ent")
This painting depicts Monet PERSON Q296 's first wife, Camille PERSON , outside on a snowy day passing by the French LOCATION doors of their home at Argenteuil LOCATION Q181946 . Her face is rendered in a radically bold Impressionist technique of mere daubs of paint quickly applied, just as the snow and trees are defined by broad, broken strokes of pure white and green.
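The candidate list mixes Wikidata identifiers such as `Q181946` with other identifiers such as `place/4699255d-...`, so the model could return a non-Wikidata value in the `wikidata_id` field, and a Wikidata URL built from it would be broken. Below is a minimal sketch of building the link only when the value looks like a Wikidata item ID. The helper name `entity_label` is hypothetical.

```python
import re

# Wikidata item ids are the letter Q followed by a positive integer.
WIKIDATA_ID = re.compile(r"Q[1-9]\d*")


def entity_label(label, identifier):
    """Return the displaCy label, adding a Wikidata link only for Q-ids."""
    if identifier and WIKIDATA_ID.fullmatch(identifier):
        url = f"https://www.wikidata.org/wiki/{identifier}"
        return f'{label} <a href="{url}">{identifier}</a>'
    return label


print(entity_label("PERSON", "Q296"))
print(entity_label("LOCATION", "place/4699255d-458a-4795-8b04-2614f1c171db"))
print(entity_label("PERSON", None))
```

A helper like this could replace the inline f-string in the loop above, so unlinked and linked entities go through the same code path.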

2.15. Getting the Data as a DataFrame#

df = pd.DataFrame(pandas_output)
df
   entity_text     label  wikidata_id  ent_start  ent_end
0        Monet    PERSON         Q296         22       27
1      Camille    PERSON         None         43       50
2       French  LOCATION         None         91       97
3   Argenteuil  LOCATION      Q181946        121      131
df.to_csv("../../output/entities.csv", index=False)