LangChain document loaders. Built with Docusaurus.

Document loaders load data into LangChain's standard Document format for use cases such as retrieval-augmented generation (RAG). They convert data from various formats (e.g., CSV, PDF, HTML) into standardized Document objects for LLM applications, and they make it possible to integrate and process diverse data sources, such as YouTube, Wikipedia, and GitHub. LangChain has hundreds of integrations with data sources, including Slack, Notion, and Google Drive.

The API reference gives a description and package for each document loader type (webpages, PDFs, cloud providers, social platforms, etc.), along with the list of available loaders, their parameters, and examples. Available integrations are listed on the Document loaders integrations page; see the individual pages for more on each category. A guide dated Jun 14, 2025 covers the types of document loaders available in LangChain, various chunking strategies, and practical examples for implementing them.

How to create a custom Document Loader

Applications based on LLMs frequently entail extracting data from databases or files, like PDFs, and converting it into a format that LLMs can utilize. In LangChain, this usually involves creating Document objects, which encapsulate the extracted text (page_content) along with metadata, a dictionary containing details about the document. To handle different types of documents in a straightforward way, LangChain provides several document loader classes. Document loaders implement the BaseLoader interface.

LangChain.js categorizes document loaders in two different ways:

  • File loaders, which load data into LangChain formats from your local filesystem.
  • Web loaders, which load data from remote sources.

Integrations

  • 📄️ AirbyteLoader: Airbyte is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.
  • 📄️ Airbyte CDK (Deprecated): Note: AirbyteCDKLoader is deprecated.
  • 📄️ acreom: acreom is a dev-first knowledge base with tasks running on local markdown files.
  • Google Cloud Storage (GCS): a document loader that allows you to load documents from storage buckets.
  • Docling: parses PDF, DOCX, PPTX, HTML, and other formats into a rich unified representation including document layout, tables, etc., making them ready for generative AI workflows like RAG.
  • JSON: a notebook provides a quick overview for getting started with the JSON document loader. For detailed documentation of all JSONLoader features and configurations, head to the API reference.

One demonstration project also integrates with multiple AI models, such as Google's Gemini and OpenAI, to generate insights from the loaded documents.

LangChain Python API Reference: document_loaders

Document Loaders are classes to load Documents.

GitLoader

class langchain_community.document_loaders.git.GitLoader(repo_path: str, clone_url: str | None = None, branch: str | None = 'main', file_filter: Callable[[str], bool] | None = None)

Load Git repository files. The repository can be local on disk, available at repo_path, or remote at clone_url, in which case it is cloned to repo_path. Currently, it supports only text files.

© Copyright 2023, LangChain Inc.
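The loader pattern described above (a class that turns a source into Document objects carrying page_content and metadata) can be sketched without installing anything. The following is a minimal illustration under stated assumptions, not LangChain's code: `Document` here is a stand-in dataclass, and `LineLoader` is a hypothetical loader invented for this example. In a real project one would presumably subclass `BaseLoader` and yield LangChain's own `Document` instead.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List


@dataclass
class Document:
    """Stand-in for a LangChain Document: extracted text plus a metadata dict."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class LineLoader:
    """Hypothetical loader: one Document per non-empty line of a text file."""

    def __init__(self, file_path: str) -> None:
        self.file_path = file_path

    def lazy_load(self) -> Iterator[Document]:
        # Stream the file so large inputs are never held in memory at once.
        with open(self.file_path, encoding="utf-8") as handle:
            for line_number, line in enumerate(handle, start=1):
                text = line.strip()
                if text:  # skip blank lines
                    yield Document(
                        page_content=text,
                        metadata={"source": self.file_path, "line": line_number},
                    )

    def load(self) -> List[Document]:
        # Eager variant: materialize everything lazy_load() yields.
        return list(self.lazy_load())
```

Keeping the generator (`lazy_load`) as the primary method and deriving `load` from it means callers can choose between streaming and loading everything at once.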
One guide covers how to load all documents in a directory. A demonstration project uses LangChain's document loaders to process various types of data, including text files, PDFs, CSVs, and web pages. Document loaders are usually used to load many Documents in a single run.
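Loading every document in a directory in a single run can be illustrated with the standard library alone. This is a rough sketch, not LangChain's directory loader: the `Document` dataclass is a stand-in, and `load_directory` with its `pattern` parameter is a hypothetical helper that assumes UTF-8 text files.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List


@dataclass
class Document:
    """Stand-in for a LangChain Document: extracted text plus a metadata dict."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def load_directory(root: str, pattern: str = "**/*.txt") -> List[Document]:
    """Hypothetical helper: read every file under root that matches pattern."""
    documents: List[Document] = []
    # "**" makes the glob recursive; sorting keeps the result order stable.
    for path in sorted(Path(root).glob(pattern)):
        if path.is_file():
            documents.append(
                Document(
                    page_content=path.read_text(encoding="utf-8"),
                    metadata={"source": str(path)},
                )
            )
    return documents
```

Recording the file path under a `source` key in the metadata lets later steps trace each Document back to the file it came from.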