Building a Custom ChatGPT Using Your Documents

Anil Tiwari
3 min read · Apr 25, 2023

We are going to build a custom ChatGPT using your own document library. We will take advantage of the latest developments in large language models (LLMs) such as OpenAI's GPT-3 and GPT-4.

There are two approaches to building a question-answering system with an LLM.

  1. Fine-tune GPT: A fine-tuned model does not restrict its answers to your data; it draws on your data plus everything GPT was originally trained on. It can therefore drift out of context and answer questions that have nothing to do with your documents. It also requires retraining the model whenever new data arrives.
  2. Semantic search + ChatGPT LLM: This approach is a better fit for a question-answering system because it gives context-specific answers and makes it easy to update your data with new information, whereas fine-tuning GPT requires retraining the model.

We are going to use Approach #2 here.
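
To make Approach #2 more concrete, here is a minimal sketch of the flow: embed the documents, find the passage closest to the question, and build a prompt that restricts the model to that passage. Everything in it is an assumption for illustration only. The `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory `index` list stands in for a vector database, and the sample documents and the helper names (`search`, `build_prompt`) are hypothetical. The call to the LLM itself is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index, question, top_k=2):
    """Return the top_k stored passages most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question, passages):
    """Assemble a prompt that limits the model to the retrieved context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Hypothetical sample documents; in practice these come from your library.
documents = [
    "Refunds are processed within 14 days of receiving the returned item.",
    "Our support desk is open Monday to Friday, 9am to 5pm.",
    "Shipping to Europe takes 5 to 7 business days.",
]
index = [(doc, embed(doc)) for doc in documents]

# Adding new information is just another index entry; nothing is retrained.
new_doc = "Gift cards expire two years after purchase."
index.append((new_doc, embed(new_doc)))

question = "How long do refunds take?"
passages = search(index, question, top_k=1)
prompt = build_prompt(question, passages)
print(prompt)  # this prompt would then be sent to the LLM
```

The point of the sketch is the design choice rather than the scoring function: the model only sees passages retrieved from your own documents, and new documents become searchable as soon as they are added to the index.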

The following libraries and frameworks will be used to build our question-answering system.

  1. LangChain: A powerful library for developing LLM-based applications.
  2. Pinecone: This will serve as the vector database for storing your embedding vectors and performing semantic search.
  3. Streamlit: This will be used to build and deploy the app.
  4. OpenAI: The LLM library from OpenAI.

Anil Tiwari

Technology Lead, AI, Machine Learning, TensorFlow, App modernization, Design and development of enterprise apps. https://techbabas.com