Expert-driven entity extraction

Take live entity extraction further by defining complex, hierarchical fields to extract from PDFs using automated Functions on Objects.

Overview

This example shows how to define hierarchical, complex, and categorical fields for direct extraction from PDF files. Deploy it on your own input PDF data and specify the fields you need to extract.
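To make the idea of hierarchical and categorical fields concrete, here is a minimal Python sketch of a field definition tree. It is an illustration only: `FieldDefinition`, `validate_value`, `leaf_paths`, and the sample contract fields are hypothetical names invented for this sketch, not part of any Palantir Foundry API, and the schema used by the example itself may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FieldDefinition:
    """Hypothetical description of one field to extract from a PDF."""

    name: str
    description: str = ""
    # Categorical fields carry a closed set of allowed values.
    options: Optional[List[str]] = None
    # Hierarchical fields carry nested sub-fields.
    children: List["FieldDefinition"] = field(default_factory=list)

    @property
    def is_categorical(self) -> bool:
        return self.options is not None


def validate_value(definition: FieldDefinition, value: str) -> bool:
    """Free-text fields accept any value; categorical fields only their options."""
    if definition.is_categorical:
        return value in definition.options
    return True


def leaf_paths(definition: FieldDefinition, prefix: str = "") -> List[str]:
    """Return the dotted path of every leaf field in the hierarchy."""
    path = f"{prefix}.{definition.name}" if prefix else definition.name
    if not definition.children:
        return [path]
    paths: List[str] = []
    for child in definition.children:
        paths.extend(leaf_paths(child, path))
    return paths


# Sample field tree for a contract-like document (invented for this sketch).
contract = FieldDefinition(
    name="contract",
    children=[
        FieldDefinition(name="governing_law", options=["US", "UK", "Other"]),
        FieldDefinition(
            name="parties",
            children=[FieldDefinition(name="buyer"), FieldDefinition(name="seller")],
        ),
    ],
)
```

Calling `leaf_paths(contract)` lists the fields that would actually need extracting, while `validate_value` shows how a categorical field can reject values outside its option set.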

Example Uses

Extracting structured data from unstructured PDF documents.
Defining and extracting complex, hierarchical fields from legal documents.
Automating the extraction of categorical data from financial reports.
Creating a custom extraction workflow for industry-specific documents.

Features

Automate: Use Palantir Foundry's automation capabilities to run complex, recursive functions.
Functions: Use Palantir Foundry's functions to construct hierarchical job specs and run the extraction.
Write Back Ontology: Shape the extracted data and write it back into a structured ontology.
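The features above describe building hierarchical job specs and running extraction recursively. The following platform-independent Python sketch shows one way that pattern could work. It does not call any Foundry API: `JobSpec`, `build_job_spec`, `run_job`, and `stub_extract` are hypothetical names, and the stub extractor is a placeholder for whatever function or model performs the real extraction.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class JobSpec:
    """Hypothetical extraction job for one field; child jobs cover nested fields."""

    field_path: str
    children: List["JobSpec"] = field(default_factory=list)


def build_job_spec(name: str, subtree: Optional[dict], parent_path: str = "") -> JobSpec:
    """Recursively turn a nested dict of field names into a tree of job specs.

    A key maps to a dict of sub-fields, or to None when the field is a leaf.
    """
    path = f"{parent_path}.{name}" if parent_path else name
    children = [build_job_spec(k, v, path) for k, v in (subtree or {}).items()]
    return JobSpec(field_path=path, children=children)


def run_job(spec: JobSpec, text: str, extract: Callable[[str, str], str]) -> dict:
    """Run leaf jobs through `extract`; parent jobs collect their children's results."""
    key = spec.field_path.split(".")[-1]
    if not spec.children:
        return {key: extract(spec.field_path, text)}
    merged: dict = {}
    for child in spec.children:
        merged.update(run_job(child, text, extract))
    return {key: merged}


def stub_extract(field_path: str, text: str) -> str:
    """Placeholder extractor: return the text after '<leaf name>:' on a matching line."""
    key = field_path.split(".")[-1]
    for line in text.splitlines():
        if line.lower().startswith(key + ":"):
            return line.split(":", 1)[1].strip()
    return ""


# Sample hierarchy and document text (invented for this sketch).
tree = {"parties": {"buyer": None, "seller": None}, "governing_law": None}
spec = build_job_spec("contract", tree)
document_text = "Buyer: Acme Corp\nSeller: Globex\nGoverning_law: UK"
result = run_job(spec, document_text, stub_extract)
```

The nested `result` dictionary mirrors the field hierarchy, which is the kind of structured output that a write-back step could then map onto ontology objects.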

Next Steps

Explore the Workshop workflow to define your own extraction fields.
Integrate the extracted data into your existing data pipelines.
Optimize the hierarchical extraction structure for better performance.

Implementation

Dataset

Datasets can be either uploaded statically or synchronized dynamically with external systems via Data connections.

document_job_seed

document_seed

field_definition_seed

field_extraction_job_seed

field_option_seed

process_seed

Pipeline

Ontology

Application

Actual results and experiences may vary. Substitute notional data with organizational data to deploy an operational workflow.
