Expert-driven entity extraction
Palantir Developers

Extend live entity extraction by defining complex hierarchical fields and extracting them from PDFs with automated Functions on Objects.

Overview

This example extends unstructured data extraction by letting you define hierarchical, complex, and categorical fields that are extracted directly from PDF files. Deploy it on your input PDF data and specify the fields you need to extract.
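
To make "hierarchical" and "categorical" fields concrete, here is a minimal sketch of how such field definitions could be modeled. It is an illustration only: the `FieldDefinition` class, its attributes, and the `leaf_paths` helper are hypothetical names chosen for this sketch and are not part of Palantir Foundry or of this example's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class FieldDefinition:
    """Hypothetical field definition (an assumption, not a Foundry type)."""

    name: str
    kind: str = "text"  # "text", "categorical", or "group"
    # Allowed values; only meaningful for categorical fields.
    options: list[str] = field(default_factory=list)
    # Nested fields; only meaningful for group fields.
    children: list[FieldDefinition] = field(default_factory=list)


def leaf_paths(definition: FieldDefinition, prefix: str = "") -> Iterator[str]:
    """Yield the dotted path of every leaf field, depth first."""
    path = f"{prefix}.{definition.name}" if prefix else definition.name
    if not definition.children:
        yield path
        return
    for child in definition.children:
        yield from leaf_paths(child, path)


# A notional contract with one categorical field and one nested group.
contract = FieldDefinition(
    name="contract",
    kind="group",
    children=[
        FieldDefinition(
            name="governing_law",
            kind="categorical",
            options=["New York", "England and Wales", "Other"],
        ),
        FieldDefinition(
            name="parties",
            kind="group",
            children=[
                FieldDefinition(name="buyer"),
                FieldDefinition(name="seller"),
            ],
        ),
    ],
)

print(list(leaf_paths(contract)))
# ['contract.governing_law', 'contract.parties.buyer', 'contract.parties.seller']
```

Flattening the tree into dotted paths is one possible way to list every value an extraction run needs to produce, while the tree itself keeps the parent-child structure.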

Example Uses

  • Extracting structured data from unstructured PDF documents.
  • Defining and extracting complex, hierarchical fields from legal documents.
  • Automating the extraction of categorical data from financial reports.
  • Creating a custom extraction workflow for industry-specific documents.

Features

  • Automate: Use Palantir Foundry's automation capabilities to run complex, recursive function executions.
  • Functions: Use Palantir Foundry's functions to construct hierarchical job specs and execute extraction.
  • Write Back Ontology: Structure the extracted data and write it back to the ontology.
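
The recursive, hierarchical execution described in the features above can be sketched in plain Python. This is a rough illustration under stated assumptions, not Foundry's Functions or Automate API: the dictionary layout, `build_job_spec`, `run_job`, and `stub_extract` are all hypothetical, and the stub extractor returns canned values in place of a real PDF extraction step.

```python
from typing import Any, Callable, Optional


def build_job_spec(field_def: dict, document_id: str, parent_path: str = "") -> dict:
    """Turn a nested field definition into a nested job spec (hypothetical format)."""
    name = field_def["name"]
    path = f"{parent_path}.{name}" if parent_path else name
    return {
        "document_id": document_id,
        "name": name,
        "path": path,
        "options": field_def.get("options", []),
        "children": [
            build_job_spec(child, document_id, path)
            for child in field_def.get("children", [])
        ],
    }


def run_job(job: dict, extract: Callable[[str, str], str]) -> Any:
    """Run a job spec recursively: groups recurse, leaves call the extractor."""
    if job["children"]:
        return {child["name"]: run_job(child, extract) for child in job["children"]}
    value: Optional[str] = extract(job["document_id"], job["path"])
    # Categorical leaves accept only values from their option list.
    if job["options"] and value not in job["options"]:
        return None
    return value


def stub_extract(document_id: str, path: str) -> str:
    """Stand-in extractor with canned answers (assumption for this sketch)."""
    canned = {
        "invoice.currency": "USD",
        "invoice.vendor.name": "Acme Ltd",
        "invoice.vendor.country": "Atlantis",
    }
    return canned.get(path, "")


invoice_def = {
    "name": "invoice",
    "children": [
        {"name": "currency", "options": ["USD", "EUR", "GBP"]},
        {
            "name": "vendor",
            "children": [
                {"name": "name"},
                {"name": "country", "options": ["US", "DE", "UK"]},
            ],
        },
    ],
}

job = build_job_spec(invoice_def, "doc-001")
result = run_job(job, stub_extract)
print(result)
# {'currency': 'USD', 'vendor': {'name': 'Acme Ltd', 'country': None}}
```

In this sketch a categorical value outside its option list is dropped to `None` rather than written back; a real workflow might instead flag it for expert review.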

Next Steps

  • Explore the Workshop workflow to define your own extraction fields.
  • Integrate the extracted data into your existing data pipelines.
  • Optimize the hierarchical extraction structure for better performance.

Implementation

Dataset (6)
Datasets can either be statically uploaded, or dynamically synchronized to external systems via Data connections.
  • document_job_seed
  • document_seed
  • field_definition_seed
  • field_extraction_job_seed
  • field_option_seed
  • process_seed
Pipeline
Ontology (6)
Application
Actual results and experiences may vary. Substitute notional data with organizational data to deploy an operational workflow.