
Supported Document File Types

The currently supported file types are:

  • pdf
  • jpg (or jpeg)
  • png
  • docx
  • pptx
  • xlsx
  • csv
  • tsv
  • json
  • txt
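
Before uploading, it can help to check a file's extension against the list above. The following is a minimal sketch, not part of the GroundX SDK: the names SUPPORTED_EXTENSIONS and is_supported_file_type are hypothetical, and matching extensions case-insensitively is an assumption rather than documented GroundX behavior.

```python
from pathlib import Path

# Extensions copied from the list above; "jpeg" is the listed alias of "jpg".
SUPPORTED_EXTENSIONS = frozenset(
    {"pdf", "jpg", "jpeg", "png", "docx", "pptx", "xlsx", "csv", "tsv", "json", "txt"}
)


def is_supported_file_type(filename: str) -> bool:
    """Return True if the filename's final extension is in the supported list.

    Hypothetical helper. Case-insensitive matching is an assumption.
    """
    # Path.suffix yields the last extension with its dot (".PDF"),
    # or an empty string when the name has no extension.
    suffix = Path(filename).suffix
    return suffix.lstrip(".").lower() in SUPPORTED_EXTENSIONS
```

A check like this only inspects the file name, not the file contents, so it is a quick pre-filter rather than a guarantee that ingestion will succeed.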

Additional Data Sources

The GroundX ingestion pipeline can also crawl websites and ingest their content using the endpoint.

Note: the crawler extracts page content from the source HTML, and a complex or unusual page structure can sometimes cause it to extract that content incorrectly.
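
To illustrate the kind of issue the note above describes, here is a small sketch of naive text extraction from source HTML using only the Python standard library. It is not the GroundX crawler and makes no claim about how that crawler works; TextExtractor, extract_text, and SAMPLE are hypothetical names used for illustration only.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text nodes in document order, skipping script and style."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_text(html: str) -> list:
    """Return the non-empty text nodes found in the given HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical page: navigation and footer surround the actual content.
SAMPLE = (
    "<html><body>"
    "<nav><a href='/'>Home</a><a href='/docs'>Docs</a></nav>"
    "<main><p>Supported file types include pdf and png.</p></main>"
    "<footer>Contact us</footer>"
    "</body></html>"
)

print(extract_text(SAMPLE))
# ['Home', 'Docs', 'Supported file types include pdf and png.', 'Contact us']
```

In this sketch the navigation links and footer text end up alongside the main paragraph, which is one way page structure can affect what a scraper extracts. Reviewing a sample of crawled content after ingestion is a reasonable precaution.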