This example shows how to define hierarchical, complex, and categorical fields and extract them directly from PDF files. Deploy it on your input PDF data, then specify the fields you need to extract.
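To make "hierarchical, complex, and categorical fields" concrete, here is a minimal sketch of what such a field definition might look like as a data structure. Everything in it is an assumption for illustration: `FieldSpec`, `leaf_paths`, the `kind` values, and the sample contract fields are hypothetical names, not part of the example's actual schema or any Palantir Foundry API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldSpec:
    """Hypothetical description of one field to extract from a PDF."""

    name: str
    kind: str  # assumed kinds: "text", "categorical", or "group"
    description: str = ""
    categories: List[str] = field(default_factory=list)  # categorical fields only
    children: List[FieldSpec] = field(default_factory=list)  # group fields only

    def validate(self) -> None:
        """Check this field and all nested fields for consistency."""
        if self.kind not in {"text", "categorical", "group"}:
            raise ValueError(f"{self.name}: unknown kind {self.kind!r}")
        if self.kind == "categorical" and not self.categories:
            raise ValueError(f"{self.name}: categorical field needs categories")
        if self.kind != "group" and self.children:
            raise ValueError(f"{self.name}: only group fields may have children")
        for child in self.children:
            child.validate()


def leaf_paths(spec: FieldSpec, prefix: str = "") -> List[str]:
    """Return dotted paths of every non-group field, depth first."""
    path = f"{prefix}.{spec.name}" if prefix else spec.name
    if spec.kind != "group":
        return [path]
    paths: List[str] = []
    for child in spec.children:
        paths.extend(leaf_paths(child, path))
    return paths


# Illustrative field hierarchy for a legal document (made-up fields).
contract = FieldSpec(
    name="contract",
    kind="group",
    children=[
        FieldSpec("title", "text", "Title of the agreement"),
        FieldSpec("governing_law", "categorical", categories=["US", "UK", "EU", "other"]),
        FieldSpec(
            "parties",
            "group",
            children=[FieldSpec("buyer", "text"), FieldSpec("seller", "text")],
        ),
    ],
)
contract.validate()
print(leaf_paths(contract))
```

Modeling groups as fields that contain other fields keeps the definition recursive, so the same validation and traversal logic handles any nesting depth.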
Example Uses
- Extracting structured data from unstructured PDF documents.
- Defining and extracting complex, hierarchical fields from legal documents.
- Automating the extraction of categorical data from financial reports.
- Creating a custom extraction workflow for industry-specific documents.
Features
- Automate: Use Palantir Foundry's automation capabilities to run complex, recursive function executions.
- Functions: Use Palantir Foundry's functions to construct hierarchical job specs and execute the extraction.
- Write Back Ontology: Assemble the extracted data and write it back into a structured ontology.
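The features above describe building hierarchical job specs, running them recursively, and shaping the results for write-back. The following is a rough sketch of that flow under stated assumptions: `build_job_spec`, `run_job`, and `stub_extract_leaf` are hypothetical helpers, the extractor is a trivial stand-in that reads `label: value` lines from plain text, and no Foundry API is called. The real example's job spec format and extraction logic may differ.

```python
from typing import Any, Callable, Dict, List, Optional

Field = Dict[str, Any]    # hypothetical: {"name", optional "children" or "categories"}
JobSpec = Dict[str, Any]  # hypothetical: {"path", "sub_jobs", optional "categories"}


def build_job_spec(field: Field, parent_path: str = "") -> JobSpec:
    """Recursively turn a nested field definition into a nested job spec."""
    path = f"{parent_path}.{field['name']}" if parent_path else field["name"]
    job: JobSpec = {"path": path, "sub_jobs": []}
    if "children" in field:
        job["sub_jobs"] = [build_job_spec(c, path) for c in field["children"]]
    else:
        job["categories"] = field.get("categories")
    return job


def run_job(
    job: JobSpec,
    text: str,
    extract_leaf: Callable[[str, str, Optional[List[str]]], Optional[str]],
) -> List[Dict[str, Any]]:
    """Recursively execute a job spec, returning one flat row per leaf field."""
    if job["sub_jobs"]:
        rows: List[Dict[str, Any]] = []
        for sub in job["sub_jobs"]:
            rows.extend(run_job(sub, text, extract_leaf))
        return rows
    value = extract_leaf(job["path"], text, job.get("categories"))
    return [{"field_path": job["path"], "value": value}]


def stub_extract_leaf(
    path: str, text: str, categories: Optional[List[str]]
) -> Optional[str]:
    """Stand-in extractor: find a 'label: value' line for the leaf name."""
    label = path.split(".")[-1]
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == label:
            value = value.strip()
            # Categorical fields only accept one of the allowed values.
            if categories is not None and value not in categories:
                return None
            return value
    return None


# Made-up field hierarchy and already-extracted PDF text.
invoice = {
    "name": "invoice",
    "children": [
        {"name": "number"},
        {"name": "currency", "categories": ["USD", "EUR"]},
        {
            "name": "vendor",
            "children": [
                {"name": "name"},
                {"name": "country", "categories": ["US", "DE"]},
            ],
        },
    ],
}
pdf_text = "number: INV-001\ncurrency: EUR\nname: Acme GmbH\ncountry: FR\n"

rows = run_job(build_job_spec(invoice), pdf_text, stub_extract_leaf)
for row in rows:
    print(row)
```

Each row pairs a dotted field path with its extracted value, a flat shape that could then be mapped onto object properties during write-back. That mapping step is not shown here because it depends on the target ontology.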
Next Steps
- Explore the Workshop workflow to define your own extraction fields.
- Integrate the extracted data into your existing data pipelines.
- Optimize the hierarchical extraction structure for better performance.