Langchain과 Gradio를 이용해 수면 상담 Chatbot 만들기 (진행 중)

수면 상담 ChatBot을 만들자!

수면 장애의 가장 일반적인 치료방법은 아마 약물치료일 것입니다.

신경정신과에 가서 수면제 처방을 받으면 가장 간단하죠.

그런데 약물은 내성이 생길 수도 있고, 제약회사에서는 관련성을 부인하지만

수면제를 먹은 상태에서 몽유병 환자처럼 돌아다니다 사고를 치거나

자해나 자살을 시도한 사례가 부작용을 의심하게 합니다.

불면증 인지행동치료(CBT-I: Cognitive Behavioral Therapy for Insomnia)는

신경정신과에서도 인정받은 치료방법이지만 CBT-I 훈련을 받은 상담사(의사!)와

1주일에 1회씩 4~6주간 상담 세션을 진행해야 합니다.

우리나라에서도 모바일 앱 형태로 만들어진 digital CBT-I가 식약처에서 디지털 치료기기로

승인받아 의사의 처방을 받으면 사용할 수 있습니다. 그런데 시장에서 그리 성공한 것 같지는

않습니다. 앱으로 영어공부하는 것도 며칠이면 지겨워지는데 수면 교육 같은 걸 앱으로

받는게 재미있을리가…

미국에서는 CBT-I 앱을 미군을 대상에로 만들어 무료 배포하고 있습니다.

또한 CBT-I 매뉴얼도 인터넷에 공개되어 있습니다.

2주간 배운 랭체인과 RAG를 이용하면 수면 장애가 있는 사람들을 위한

chatbot을 만들 수 있을거란 근거없는 자신감이 생겼습니다.

RAG 만들기

먼저 RAG에 사용할 자료를 구해 봅시다!

https://cbtiweb.org/ResourceFiles/CBTI-MTherapistMaterials_03232020170216354.pdf

CBTI-MTherapistMaterials_03232020170216354.pdf

PDF를 읽어서 Embedding을 만들고 FAISS를 VectorStore로 활용하겠습니다.

def build_db():
    global cbti_manual_db

    # Load the text file and build a vector store with FAISS
    loader = PyPDFLoader("CBT-I.pdf")
    document = loader.load()

    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0, 
                                                    separators=["\n\n",
                                                                "\n",
                                                                " ",
                                                                ".",
                                                                ",",
                                                                "",])
    docs = text_splitter.split_documents(document)

    embeddings = OpenAIEmbeddings()

    cbti_manual_db = FAISS.from_documents(docs, embeddings)

매번 실행할 때마다 PDF를 읽고, 텍스트를 자르고,

VectorStore를 만드는 과정이 불합리해 보였습니다.

FAISS를 한번 만들면 저장을 하고, 그 다음부터는 저장된 index를 읽도록 수정합니다.

def build_db():
    global cbti_manual_db
    cbti_db_file_name = "cbti_faiss_index"
    
    embeddings = OpenAIEmbeddings()

    # if there's a saved faiss db load it. or make one and save it.
    try:
        cbti_manual_db = FAISS.load_local(cbti_db_file_name, embeddings=embeddings, allow_dangerous_deserialization=True)
    except Exception as e:
        print(f"Failed to load FAISS index: {e}.")

        # Load the text file and build a vector store with FAISS
        loader = PyPDFLoader("Cognitive Behavioral Treatment of Insomnia A Session-by-Session Guide.pdf")
        document = loader.load()

        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size = 1000, 
            chunk_overlap = 200,
            separators=["\n\n", "\n", " ", ". ", ", ", "?", "! "]
        )
        docs = text_splitter.split_documents(document)                

        cbti_manual_db = FAISS.from_documents(docs, embeddings)
        cbti_manual_db.save_local(cbti_db_file_name)

Gradio + Langchain으로 Chatbot 만들기

챗봇을 만들어 본 일이 없어 어떻게 만들지 고민이었는데

11기 1주차 랭체인방 발표를 듣고 Gradio를 알게 됐습니다.

Gradio에 제공하는 memory 기능과 RAG를 연결해서 만들면 간단할 것 같았는데

공부가 부족해서 메모리 연계는 아직 미완성입니다.

RAG에 사용한 원 데이터가 영어로 되어 있어

사용자가 입력한 질문을 영어로 번역하게 해놓았습니다.

이 부분도 어떻게 처리할지 고민이네요!

def predict(message, history):
    global cbti_manual_db

    history_langchain_format = []

    for human, ai in history:
        history_langchain_format.append(HumanMessage(content=human))
        history_langchain_format.append(AIMessage(content=ai))   

    history_langchain_format.append(HumanMessage(content=message))

    # Translate the message into English to enhance RAG
    en_prompt = ChatPromptTemplate.from_messages([
        ("system", "Translate the user message into English"),
        ("user", "{message}")
    ])

    en_chain = en_prompt | llm | parser
    en_message = en_chain.invoke({"message": message})

    # define retriever with similarity search
    retriever = cbti_manual_db.as_retriever(search_type="similarity", search_kwargs={"k": 6})
    prompt = hub.pull("rlm/rag-prompt")

    rag_chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
    )

    gpt_response = rag_chain.invoke(en_message)
    
    return gpt_response.content


build_db()
gr.ChatInterface(predict).launch()

실행 결과

영어 심은데 영어가 나왔습니다 T.T

보완할 점

RAG의 참고 데이터가 영어일 때 한국어로 질문하는 경우 어떻게 처리할지
메모리 기능 동작하도록 수정
ChatGPT에 파일을 업로드 했을 때보다 많이 떨어지는 답변 수준을 어떻게 개선할 것인가

추가로 보완 작업을 했습니다.

영어 데이터에서 RAG를 수행하기 위한 꼼수

참고 데이터가 영문이므로 사용자 입력을 영어로 번역해서

RAG를 수행하는게 합리적일 것 같습니다.

영문으로 번역하기 위한 모듈을 추가했습니다.

질문이 어떤 언어로 되어 있는지를 먼저 확인하고,

영어인 경우에는 번역하는 과정을 거치지 않습니다.

번역된 질문과 원래의 언어 정보를 return합니다.

def translate_to_eng(org_msg):
     # Detect the language of original message
    detect_lang_prompt = ChatPromptTemplate.from_messages([
        ("system", "Detect the language used for the message. Answer with the language name in English"),
        ("user", "{message}")
    ])

    detect_lang_chain = detect_lang_prompt | llm | parser
    org_lang = detect_lang_chain.invoke({"message": org_msg})

    # Translate the message into English to enhance RAG
    if org_lang != "English":        
        en_prompt = ChatPromptTemplate.from_messages([
            ("system", "Translate the user message into English"),
            ("user", "{message}")
        ])

        en_chain = en_prompt | llm | parser
        en_message = en_chain.invoke({"message": org_msg})
    else:
        en_message = org_msg   
    
    return en_message, org_lang

나중에 영어로 나온 답변을 원래 질문할 때의 언어로 바꾸기 위한 모듈도 추가해 줍니다

def translate_to_org(msg, org_lang):
    if org_lang == "English":
        return msg

    # Translate the message into org_lang
    translate_prompt = ChatPromptTemplate.from_messages([
        ("system", "Translate the user message into {org_lang} with your expertise. Use a natural and easy words and expressions"),
        ("user", "{message}")
    ])

    translate_chain = translate_prompt | llm | parser
    result_message = translate_chain.invoke({"org_lang": org_lang, "message": msg})
    
    return result_message

Chat History를 반영한 chatbot 구성

Langchain에 있는 document를 참고하여 삽질을 거듭한 끝에

기존의 채팅 내용을 참조하는 RAG를 겨우 만들어냈습니다

def predict(message, history):
    global cbti_manual_db

    history_langchain_format = []

    for human, ai in history:
        history_langchain_format.append(HumanMessage(content=human))
        history_langchain_format.append(AIMessage(content=ai))       

    en_message, org_lang = translate_to_eng(message)
    #org_lang = "English"

    # define retriever with similarity search
    retriever = cbti_manual_db.as_retriever(search_type="similarity", search_kwargs={"k": 6})


    # question + history => new question
    contextualize_q_system_prompt = (
        "Given a chat history and the latest user question "
        "which might reference context in the chat history, "
        "formulate a standalone question which can be understood "
        "without the chat history. Do NOT answer the question, "
        "just reformulate it if needed and otherwise return it as is."
    )

    contextualize_q_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", contextualize_q_system_prompt),
            MessagesPlaceholder("chat_history"),
            ("human", "{input}"),
        ]
    )
    history_aware_retriever = create_history_aware_retriever(
        llm, retriever, contextualize_q_prompt
    )

    # Q&A assistant which takes the reformulated question and answer for it
    system_prompt = (
        "You are an assistant for question-answering tasks in the field of sleep medicine. "
        "You talk and behave as a professional sleep consultant and an expert CBT-I practitioner"
        "Use easy words to help the user to improve sleep quality."
        "Use the following pieces of retrieved context to answer "
        "the question. If you don't know the answer, say that you "
        "don't know. Use three sentences maximum and keep the "
        "answer concise."
        "\n\n"
        "{context}"
    )

    qa_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", system_prompt),
            MessagesPlaceholder("chat_history"),
            ("human", "{input}"),
        ]
    )

    question_answer_chain = qa_prompt | llm

    # build RAG chain
    rag_chain = create_retrieval_chain(history_aware_retriever | format_docs, question_answer_chain)

    #rag_chain = create_retrieval_chain()
    # rag_chain = create_
    '''
    rag_chain = (
        {"context": retriever | format_docs, "input": RunnablePassthrough()}
        | prompt
        | llm
    )
    '''

    try:
        gpt_response = rag_chain.invoke({"input": en_message, "chat_history": history_langchain_format})
    except Exception as e:
        print("error during RAG chain invocation:", e)
        raise e
    
    # add message to history_langchain_format after RAG gives the answer
    history_langchain_format.append(HumanMessage(content=message))    

    # Translate the answer into the original language
    response = translate_to_org(gpt_response['answer'].content, org_lang)
    
    return response

수정 보완 후 실행 결과

처음보다 답변 수준이 많이 좋아진 것 같습니다. 제 이름도 잘 기억하고 있네요!

개선 후에도 여전히 남아있는 보완할 점

답변 수준이 아직 마음에 들지 않습니다. Prompt를 좀 더 만져봐야겠습니다.
속도가 느린데 원인을 모르겠네요! Retriever가 문제일 것 같습니다.

#11기-랭체인

⏰ 가장 빠르게 AI를 배우는 곳 | 지피터스 AI스터디 17기 🚀