Building LLM-Powered Apps with the OPL Stack | by Wen Yang | Apr, 2023

OPL: OpenAI, Pinecone, and Langchain for knowledge-based AI assistants
I remember a month ago, Eugene Yan posted a poll on LinkedIn:
Are you feeling the FOMO from not working on LLMs/Generative AI?
Most answered "Yes". It's easy to understand why, given the sweeping attention generated by chatGPT and now the release of gpt-4. People describe the rise of Large Language Models (LLMs) as feeling like the iPhone moment. Yet I think there's really no need to feel the FOMO. Consider this: missing out on the chance to develop the iPhone didn't preclude the ample potential for building innovative iPhone apps. The same goes for LLMs. We have just entered the dawn of a new era, and now is the perfect time to harness the magic of integrating LLMs to build powerful applications.
In this post, I'll cover the following topics:
- What is the OPL stack?
- How to use OPL to build chatGPT with domain knowledge (essential components with a code walkthrough)
- Production considerations
- Common misconceptions
OPL stands for OpenAI, Pinecone, and Langchain, which has increasingly become the industry solution for overcoming two limitations of LLMs:
- LLM hallucination: chatGPT will sometimes provide wrong answers with overconfidence. One of the underlying causes is that these language models are trained to predict the next word very effectively, or the next token to be precise. Given an input text, chatGPT returns the words with the highest probability, which doesn't mean that chatGPT has reasoning ability.
- Less up-to-date knowledge: chatGPT's training data is limited to internet data prior to Sep 2021. Therefore, it will produce less satisfying answers if your questions are about recent trends or topics.
The common solution is to add a knowledge base on top of LLMs and use Langchain as the framework to build the pipeline. The essential components of each technology can be summarized as follows:
- OpenAI:
– provides API access to powerful LLMs such as chatGPT and gpt-4
– provides embedding models to convert text to embeddings.
- Pinecone: provides embedding vector storage, semantic similarity comparison, and fast retrieval.
- Langchain: it comprises 6 modules (Models, Prompts, Indexes, Memory, Chains, and Agents).
– Models offers flexibility in embedding models, chat models, and LLMs, including but not limited to OpenAI's offerings. You can also use other models from Hugging Face, such as BLOOM and FLAN-T5.
– Memory: there are a variety of ways to allow chatbots to remember past conversation history. From my experience, entity memory works well and is efficient.
– Chains: if you're new to Langchain, Chains is a great place to start. It follows a pipeline-like structure to process user input, select the LLM model, apply a prompt template, and search the relevant context from the knowledge base (a minimal sketch follows below).
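To make the Chains idea concrete, here is a minimal sketch of a single LLMChain that applies a prompt template to user input. The template wording and the question are made up for illustration, and it assumes OPENAI_API_KEY is set in your environment:
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# A toy prompt template (illustrative only)
template = "You are an outdoor-gear expert. Answer the question: {question}"
prompt = PromptTemplate(input_variables=["question"], template=template)

llm = OpenAI(temperature=0)  # reads OPENAI_API_KEY from the environment
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run(question="What are the best trail running shoes under $200?"))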
Next, I'll walk through the app I built using the OPL stack.
The app I built is called chatOutside, which has two major sections:
- chatGPT: lets you chat with chatGPT directly, in a format similar to a Q&A app, where you receive a single input and output at a time.
- chatOutside: allows you to chat with a version of chatGPT that has expert knowledge of outdoor activities and trends. The format is more like a chatbot, where all messages are recorded as the conversation progresses. I've also included a section that provides source links, which can boost user confidence and is always useful to have.
As you can see, if you ask the same question: "What're the best running shoes in 2023? My budget is around $200", chatGPT will say "as an AI language model, I don't have access to information from the future", whereas chatOutside will give you more up-to-date answers, along with source links.
There are three major steps involved in the development process:
- Step 1: Build an Outside knowledge base in Pinecone
- Step 2: Use Langchain for the Question & Answering service
- Step 3: Build our app in Streamlit
Implementation details for each step are discussed below.
Step 1: Build an Outside Knowledge Base in Pinecone
- Step 1.1: I connected to our Outside catalog database and selected articles published between January 1st, 2022, and March 29th, 2023. This provided roughly 20,000 records.
Next, we need to perform two data transformations.
- Step 1.2: convert the above dataframe to a list of dictionaries to ensure the data can be upserted correctly into Pinecone (the expected shape is illustrated right after the snippet).
# Convert dataframe to a list of dicts for the Pinecone data upsert
data = df_item.to_dict('records')
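For context, the fields relied on later in the upsert loop (id, url, title, content) look roughly like this after the conversion; the rows below are made-up placeholders, not real catalog data:
import pandas as pd

# Illustrative stand-in for the real Outside catalog pull (placeholder values)
df_item = pd.DataFrame([
    {"id": 1, "url": "https://example.com/article-1",
     "title": "Best Trail Running Shoes of 2023", "content": "Full article text ..."},
    {"id": 2, "url": "https://example.com/article-2",
     "title": "Bay Area Hikes with Water Views", "content": "Full article text ..."},
])
data = df_item.to_dict('records')
# data[0] -> {'id': 1, 'url': 'https://example.com/article-1', 'title': ..., 'content': ...}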
- Step 1.3: split the content into smaller chunks using Langchain's RecursiveCharacterTextSplitter. The benefit of breaking documents down into smaller chunks is twofold:
– A typical article can be more than 1,000 characters, which is very long. Imagine we want to retrieve the top-3 articles as context to prompt chatGPT; we could easily hit the 4,000-token limit once the question and prompt template are added.
– Smaller chunks provide more relevant information, resulting in better context to prompt chatGPT.
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=400,
    chunk_overlap=20,
    length_function=tiktoken_len,
    separators=["\n\n", "\n", " ", ""]
)
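The length_function above refers to a token counter that isn't shown in the snippet. A minimal sketch using tiktoken, assuming the cl100k_base encoding (which matches OpenAI's recent models):
import tiktoken

tokenizer = tiktoken.get_encoding("cl100k_base")

def tiktoken_len(text):
    # count tokens rather than characters, so chunk_size=400 means 400 tokens
    tokens = tokenizer.encode(text, disallowed_special=())
    return len(tokens)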
After splitting, each record's content was broken down into multiple chunks, each with fewer than 400 tokens.
One thing worth noting is that the text splitter used is called RecursiveCharacterTextSplitter, which is recommended by Harrison Chase, the creator of Langchain. The basic idea is to split by paragraph first, then split by sentence, with overlap (20 tokens). This helps preserve meaningful information and context from the surrounding sentences.
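A quick sanity check on one record (this assumes the data list and the tiktoken_len helper sketched above; the exact numbers will vary with your articles):
chunks = text_splitter.split_text(data[0]['content'])
print(len(chunks))                           # number of chunks for the first article
print(max(tiktoken_len(c) for c in chunks))  # each chunk should stay under ~400 tokens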
- Step 1.4: upsert data to Pinecone. The code below is adapted from James Briggs's wonderful tutorial.
import yaml
import pinecone
from langchain.embeddings.openai import OpenAIEmbeddings

# 0. Initialize the Pinecone client
with open('./credentials.yml', 'r') as file:
    cre = yaml.safe_load(file)
# pinecone API
pinecone_api_key = cre['pinecone']['apikey']
pinecone.init(api_key=pinecone_api_key, environment="us-west1-gcp")

# 1. Create a new index
index_name = 'outside-chatgpt'

# 2. Use OpenAI's ada-002 as the embedding model
model_name = 'text-embedding-ada-002'
embed = OpenAIEmbeddings(
    document_model_name=model_name,
    query_model_name=model_name,
    openai_api_key=OPENAI_API_KEY
)
embed_dimension = 1536

# 3. Check if the index already exists (it shouldn't if this is the first time)
if index_name not in pinecone.list_indexes():
    # if it doesn't exist, create the index
    pinecone.create_index(
        name=index_name,
        metric='cosine',
        dimension=embed_dimension
    )

# 4. Connect to the index
index = pinecone.Index(index_name)
We batch upload and embed all articles, which took about 20 minutes to upsert 20k records. Make sure to adjust the tqdm import according to your environment (you don't need to import both!)
# If using a terminal
from tqdm.auto import tqdm
# If using a Jupyter notebook
from tqdm.autonotebook import tqdm
from uuid import uuid4

batch_limit = 100
texts = []
metadatas = []

for i, record in enumerate(tqdm(data)):
    # 1. Get metadata fields for this record
    metadata = {
        'item_uuid': str(record['id']),
        'source': record['url'],
        'title': record['title']
    }
    # 2. Create chunks from the record text
    record_texts = text_splitter.split_text(record['content'])
    # 3. Create individual metadata dicts for each chunk
    record_metadatas = [{
        "chunk": j, "text": text, **metadata
    } for j, text in enumerate(record_texts)]
    # 4. Append these to the current batches
    texts.extend(record_texts)
    metadatas.extend(record_metadatas)
    # 5. Special case: if we have reached the batch_limit, we can upsert the texts
    if len(texts) >= batch_limit:
        ids = [str(uuid4()) for _ in range(len(texts))]
        embeds = embed.embed_documents(texts)
        index.upsert(vectors=zip(ids, embeds, metadatas))
        texts = []
        metadatas = []
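One detail worth noting: any leftover texts that never reached the batch_limit still need a final upsert once the loop ends. A minimal sketch, following the same pattern as inside the loop:
# Upsert the final partial batch left over after the loop
if len(texts) > 0:
    ids = [str(uuid4()) for _ in range(len(texts))]
    embeds = embed.embed_documents(texts)
    index.upsert(vectors=zip(ids, embeds, metadatas))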
After upserting the Outside articles data, we can check our Pinecone index using index.describe_index_stats(). One of the stats to pay attention to is index_fullness, which was 0.2 in our case. This means the Pinecone pod was 20% full, suggesting that a single p1 pod can store roughly 100k articles.
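For reference, this is all it takes to pull those stats (the output shown is illustrative, not exact):
stats = index.describe_index_stats()
print(stats)
# e.g. {'dimension': 1536, 'index_fullness': 0.2, 'namespaces': {'': {'vector_count': ...}}}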
Step 2: Use Langchain for the Question & Answering Service
Note: Langchain has been updating very quickly recently; the version used in the code below is 0.0.118.
The above sketchnote illustrates how data flows during the inference stage:
- The user asks a question: "What are the best running shoes in 2023?".
- The question is converted into an embedding using the ada-002 model.
- The user question embedding is compared with all vectors stored in Pinecone using the similarity_search function, which retrieves the top 3 text chunks that are most likely to answer the question.
- Langchain then passes the top 3 text chunks as context, along with the user question, to gpt-3.5 (ChatCompletion) to generate the answer.
All of this can be achieved with fewer than 30 lines of code:
from langchain.vectorstores import Pinecone
from langchain.chains import VectorDBQAWithSourcesChain
from langchain.embeddings.openai import OpenAIEmbeddings

# 1. Specify Pinecone as the vectorstore
# =======================================
# 1.1 get the pinecone index name
index = pinecone.Index(index_name)  # 'outside-chatgpt'

# 1.2 specify the embedding model
model_name = 'text-embedding-ada-002'
embed = OpenAIEmbeddings(
    document_model_name=model_name,
    query_model_name=model_name,
    openai_api_key=OPENAI_API_KEY
)

# 1.3 provide the text_field
text_field = "text"
vectorstore = Pinecone(
    index, embed.embed_query, text_field
)

# 2. Wrap the chain
# (llm is a chat model, e.g. the ChatOpenAI gpt-3.5-turbo defined in utils.py below)
qa_with_sources = VectorDBQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",
    vectorstore=vectorstore
)
Now we can test it by asking a hiking-related question: "Can you recommend some advanced hiking trails with views of water in the California Bay Area?"
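A quick way to sanity-check the chain from the notebook (the returned dict holds the generated answer and the source URLs pulled from the Pinecone metadata; the exact wording will vary):
query = ("Can you recommend some advanced hiking trails with views of water "
         "in the California Bay Area?")
response = qa_with_sources(query)
print(response['answer'])   # generated answer
print(response['sources'])  # source URLs from the matched chunks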
Step 3: Build our app in Streamlit
After verifying that the logic works in a Jupyter notebook, we can assemble everything and build a frontend using Streamlit. In our Streamlit app, there are two python files:
– app.py: the main python file for the frontend that powers the app
– utils.py: the supporting functions that will be called by app.py
Here's what my utils.py looks like:
import pinecone
import streamlit as st
from langchain.chains import VectorDBQAWithSourcesChain
from langchain.chat_models import ChatOpenAI
from langchain.vectorstores import Pinecone
from langchain.embeddings.openai import OpenAIEmbeddings

# ------OpenAI: LLM---------------
OPENAI_API_KEY = st.secrets["OPENAI_KEY"]
llm = ChatOpenAI(
    openai_api_key=OPENAI_API_KEY,
    model_name='gpt-3.5-turbo',
    temperature=0.0
)

# ------OpenAI: Embed model-------------
model_name = 'text-embedding-ada-002'
embed = OpenAIEmbeddings(
    document_model_name=model_name,
    query_model_name=model_name,
    openai_api_key=OPENAI_API_KEY
)

# --- Pinecone ------
pinecone_api_key = st.secrets["PINECONE_API_KEY"]
pinecone.init(api_key=pinecone_api_key, environment="us-west1-gcp")
index_name = "outside-chatgpt"
index = pinecone.Index(index_name)
text_field = "text"
vectorstore = Pinecone(index, embed.embed_query, text_field)

# ======= Langchain QA-with-sources chain =======
def qa_with_sources(query):
    qa = VectorDBQAWithSourcesChain.from_chain_type(
        llm=llm,
        chain_type="stuff",
        vectorstore=vectorstore
    )
    response = qa(query)
    return response
And finally, here's what my app.py looks like:
import os
import openai
import streamlit as st
from PIL import Image
from streamlit_chat import message
from utils import *

openai.api_key = st.secrets["OPENAI_KEY"]
# For Langchain
os.environ["OPENAI_API_KEY"] = openai.api_key

# ==== Section 1: Streamlit Settings ======
with st.sidebar:
    st.markdown("# Welcome to chatOutside 🙌")
    st.markdown(
        "**chatOutside** allows you to talk to a version of **chatGPT** \n"
        "that has access to the latest Outside content! \n"
    )
    st.markdown(
        "Unlike chatGPT, chatOutside can't make stuff up\n"
        "and will answer from the Outside knowledge base. \n"
    )
    st.markdown("👩‍🏫 Developer: Wen Yang")
    st.markdown("---")
    st.markdown("# Under The Hood 🎩 🐇")
    st.markdown("How to prevent Large Language Model (LLM) hallucination?")
    st.markdown("- **Pinecone**: vector database for Outside knowledge")
    st.markdown("- **Langchain**: to remember the context of the conversation")

# Homepage title
st.title("chatOutside: Outside + ChatGPT")
# Hero Image
image = Image.open('VideoBkg_08.jpg')
st.image(image, caption='Get Outside!')

st.header("chatGPT 🤖")

# ====== Section 2: ChatGPT only ======
def chatgpt(prompt):
    res = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[
            {"role": "system",
             "content": "You are a friendly and helpful assistant. "
                        "Answer the question as truthfully as possible. "
                        "If unsure, say you don't know."},
            {"role": "user", "content": prompt},
        ],
        temperature=0,
    )["choices"][0]["message"]["content"]
    return res

input_gpt = st.text_input(label='Chat here! 💬')
output_gpt = st.text_area(label="Answered by chatGPT:",
                          value=chatgpt(input_gpt), height=200)
# ========= End of Section 2 ===========
# ========== Section 3: chatOutside ============================
st.header("chatOutside 🏕️")

def chatoutside(query):
    # start a chat with chatOutside
    try:
        response = qa_with_sources(query)
        answer = response['answer']
        source = response['sources']
    except Exception as e:
        print("I'm afraid your question failed! This is the error: ")
        print(e)
        return None

    if len(answer) > 0:
        return answer, source
    else:
        return None
# ============================================================
# ========== Section 4: Display chatOutside in chatbot style ===========
if 'generated' not in st.session_state:
    st.session_state['generated'] = []
if 'past' not in st.session_state:
    st.session_state['past'] = []
if 'source' not in st.session_state:
    st.session_state['source'] = []

def clear_text():
    st.session_state["input"] = ""

# We will get the user's input by calling the get_text function
def get_text():
    input_text = st.text_input('Chat here! 💬', key="input")
    return input_text

user_input = get_text()

if user_input:
    # source contains urls from Outside
    output, source = chatoutside(user_input)
    # store the output
    st.session_state.past.append(user_input)
    st.session_state.generated.append(output)
    st.session_state.source.append(source)
    # Display the source urls
    st.write(source)

if st.session_state['generated']:
    for i in range(len(st.session_state['generated'])-1, -1, -1):
        message(st.session_state["generated"][i], key=str(i))
        message(st.session_state['past'][i], is_user=True,
                avatar_style="big-ears", key=str(i) + '_user')