[8th Cohort LangChain] Trying Out Project Design with Multiple Agents

It helps to read "[8th Cohort LangChain] Can Multi-Agent Systems (MAS) Be Used for Project Design? Part 1" before this post.

1. import

import functools
import random
from collections import OrderedDict
from typing import Callable, List

import tenacity
from langchain.chat_models import ChatOpenAI
from langchain.output_parsers import RegexParser
from langchain.prompts import (
    PromptTemplate,
)
from langchain.schema import (
    HumanMessage,
    SystemMessage,
)
from langchain.globals import set_debug

set_debug(True)

from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())

functools: A Python standard library module for working with higher-order functions. For example, partial fixes some of a function's arguments and returns a new function.
random: Generates random numbers. Here it decides, with a given probability, whether the conversation ends.
collections.OrderedDict: Creates an ordered dictionary that remembers the order in which items were added.
typing.Callable, List: Provide type hints. Callable is the type of a callable object (such as a function), and List is the type of a list.
tenacity: Implements retry logic for functions. It is useful when a function should be run again under certain conditions.
langchain.chat_models.ChatOpenAI: The part of the langchain library that wraps OpenAI's chat models.
langchain.output_parsers.RegexParser: Parses model output with a regular expression.
langchain.prompts.PromptTemplate: Builds conversation prompts from templates.
langchain.schema.HumanMessage, SystemMessage: Define the message types used in the conversation. HumanMessage represents a user message and SystemMessage a system message.
langchain.globals.set_debug(True): Turns on langchain's debug mode, which makes debugging information easier to trace.
dotenv: Manages environment variables. load_dotenv(find_dotenv()) loads the variables from a .env file and applies them to the code.
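
The listings in this post call print_w, which none of them define. The following is a minimal sketch of such a helper, assuming it is meant to print a line and also write it to a log file. The function body, the LOG_PATH name, and the file location are all assumptions, not the author's implementation.

```python
import tempfile
from pathlib import Path

# Hypothetical log location; the original post does not say where output is written.
LOG_PATH = Path(tempfile.gettempdir()) / "multi_agent_log.txt"


def print_w(text: str, path: Path = LOG_PATH) -> None:
    """Print text to the console and append the same text to a log file.

    Assumed behaviour: only the name print_w comes from the post.
    """
    print(text)
    with path.open("a", encoding="utf-8") as log_file:
        log_file.write(text + "\n")
```

With a definition like this placed before the class definitions, the print_w calls in the later listings resolve; replacing them with plain print would work just as well.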

2. Class

class DialogueAgent:
    def __init__(
        self,
        name: str,
        system_message: SystemMessage,
        model: ChatOpenAI,
    ) -> None:
        self.name = name
        self.system_message = system_message
        self.model = model
        self.prefix = f"{self.name}: "
        self.reset()

    def reset(self):
        self.message_history = ["Here is the conversation so far."]

    def send(self) -> str:
        """
        Applies the chatmodel to the message history
        and returns the message string
        """
        message = self.model(
            [
                self.system_message,
                HumanMessage(content="\n".join(self.message_history + [self.prefix])),
            ]
        )
        return message.content

    def receive(self, name: str, message: str) -> None:
        """
        Concatenates {message} spoken by {name} into message history
        """
        self.message_history.append(f"{name}: {message}")


class DialogueSimulator:
    def __init__(
        self,
        agents: List[DialogueAgent],
        selection_function: Callable[[int, List[DialogueAgent]], int],
    ) -> None:
        self.agents = agents
        self._step = 0
        self.select_next_speaker = selection_function

    def reset(self):
        for agent in self.agents:
            agent.reset()

    def inject(self, name: str, message: str):
        """
        Initiates the conversation with a {message} from {name}
        """
        for agent in self.agents:
            agent.receive(name, message)

        # increment time
        self._step += 1

    def step(self) -> tuple[str, str]:
        # 1. choose the next speaker
        speaker_idx = self.select_next_speaker(self._step, self.agents)
        speaker = self.agents[speaker_idx]

        # 2. next speaker sends message
        message = speaker.send()

        # 3. everyone receives message
        for receiver in self.agents:
            receiver.receive(speaker.name, message)

        # 4. increment time
        self._step += 1

        return speaker.name, message

class IntegerOutputParser(RegexParser):
    def get_format_instructions(self) -> str:
        return "Your response should be an integer delimited by angled brackets, like this: <int>."


class DirectorDialogueAgent(DialogueAgent):
    def __init__(
        self,
        name,
        system_message: SystemMessage,
        model: ChatOpenAI,
        speakers: List[DialogueAgent],
        stopping_probability: float,
    ) -> None:
        super().__init__(name, system_message, model)
        self.speakers = speakers
        self.next_speaker = ""

        self.stop = False
        self.stopping_probability = stopping_probability
        self.termination_clause = "Finish the conversation by stating a concluding message and thanking everyone."
        self.continuation_clause = "Do not end the conversation. Keep the conversation going by adding your own ideas."

        # 1. have a prompt for generating a response to the previous speaker
        self.response_prompt_template = PromptTemplate(
            input_variables=["message_history", "termination_clause"],
            template=f"""{{message_history}}

Follow up with an insightful comment.
{{termination_clause}}
{self.prefix}
        """,
        )

        # 2. have a prompt for deciding who to speak next
        self.choice_parser = IntegerOutputParser(
            regex=r"<(\d+)>", output_keys=["choice"], default_output_key="choice"
        )
        self.choose_next_speaker_prompt_template = PromptTemplate(
            input_variables=["message_history", "speaker_names"],
            template=f"""{{message_history}}

Given the above conversation, select the next speaker by choosing index next to their name: 
{{speaker_names}}

{self.choice_parser.get_format_instructions()}

Do nothing else.
        """,
        )

        # 3. have a prompt for prompting the next speaker to speak
        self.prompt_next_speaker_prompt_template = PromptTemplate(
            input_variables=["message_history", "next_speaker"],
            template=f"""{{message_history}}

The next speaker is {{next_speaker}}. 
Prompt the next speaker to speak with an insightful question.
{self.prefix}
        """,
        )

    def _generate_response(self):
        # if self.stop = True, then we will inject the prompt with a termination clause
        sample = random.uniform(0, 1)
        self.stop = sample < self.stopping_probability

        print_w(f"\tStop? {self.stop}\n")

        response_prompt = self.response_prompt_template.format(
            message_history="\n".join(self.message_history),
            termination_clause=self.termination_clause if self.stop else "",
        )

        self.response = self.model(
            [
                self.system_message,
                HumanMessage(content=response_prompt),
            ]
        ).content

        return self.response

    @tenacity.retry(
        stop=tenacity.stop_after_attempt(2),
        wait=tenacity.wait_none(),  # No waiting time between retries
        retry=tenacity.retry_if_exception_type(ValueError),
        before_sleep=lambda retry_state: print_w(
            f"ValueError occurred: {retry_state.outcome.exception()}, retrying..."
        ),
        retry_error_callback=lambda retry_state: 0,  # default to speaker 0 when all retries are exhausted
    )
    def _choose_next_speaker(self) -> int:
        speaker_names = "\n".join(
            [f"{idx}: {name}" for idx, name in enumerate(self.speakers)]
        )
        choice_prompt = self.choose_next_speaker_prompt_template.format(
            message_history="\n".join(
                self.message_history + [self.prefix] + [self.response]
            ),
            speaker_names=speaker_names,
        )

        choice_string = self.model(
            [
                self.system_message,
                HumanMessage(content=choice_prompt),
            ]
        ).content
        choice = int(self.choice_parser.parse(choice_string)["choice"])

        return choice

    def select_next_speaker(self):
        return self.chosen_speaker_id

    def send(self) -> str:
        """
        Applies the chatmodel to the message history
        and returns the message string
        """
        # 1. generate and save response to the previous speaker
        self.response = self._generate_response()

        if self.stop:
            message = self.response
        else:
            # 2. decide who to speak next
            self.chosen_speaker_id = self._choose_next_speaker()
            self.next_speaker = self.speakers[self.chosen_speaker_id]
            print_w(f"\tNext speaker: {self.next_speaker}\n")

            # 3. prompt the next speaker to speak
            next_prompt = self.prompt_next_speaker_prompt_template.format(
                message_history="\n".join(
                    self.message_history + [self.prefix] + [self.response]
                ),
                next_speaker=self.next_speaker,
            )
            message = self.model(
                [
                    self.system_message,
                    HumanMessage(content=next_prompt),
                ]
            ).content
            message = " ".join([self.response, message])

        return message

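DialogueAgent class

Purpose: Represents an individual agent taking part in the conversation.
Implementation:
- __init__: Initializes the agent's name, system message, and chat model.
- reset: Resets the conversation history to its initial state.
- send: Uses the chat model to generate and return a new message based on the current conversation history.
- receive: Appends a message spoken by an agent to the conversation history.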
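
The send/receive cycle described above can be tried without an API key. This is a simplified sketch, not the class from the listing: SketchAgent and fake_model are hypothetical names, plain strings replace the LangChain message objects, and a stub function stands in for ChatOpenAI.

```python
from typing import Callable, List


class SketchAgent:
    """Simplified stand-in for DialogueAgent that works on plain strings."""

    def __init__(self, name: str, model: Callable[[str], str]) -> None:
        self.name = name
        self.model = model  # any callable that maps a prompt string to a reply
        self.prefix = f"{name}: "
        self.reset()

    def reset(self) -> None:
        self.message_history: List[str] = ["Here is the conversation so far."]

    def send(self) -> str:
        # The full history plus the agent's own prefix becomes the prompt.
        return self.model("\n".join(self.message_history + [self.prefix]))

    def receive(self, name: str, message: str) -> None:
        self.message_history.append(f"{name}: {message}")


def fake_model(prompt: str) -> str:
    # Hypothetical stub for the chat model: reports how many lines it was shown.
    return f"I have seen {len(prompt.splitlines())} lines."


agent = SketchAgent("Frontend Developer", fake_model)
agent.receive("Audience member", "How should we structure the Pong page?")
reply = agent.send()
agent.receive(agent.name, reply)  # the speaker stores its own message too
print(reply)
print(agent.message_history)
```

The point of the sketch is that the agent keeps no state other than the growing message_history, so every call to send replays the whole conversation to the model.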

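DialogueSimulator class

Purpose: Manages several DialogueAgent objects and drives the conversation forward.
Implementation:
- __init__: Stores the list of agents and the function that selects the next speaker.
- reset: Resets the conversation history of every agent.
- inject: Injects a message from outside to start the conversation or steer it in a new direction.
- step: Selects the next agent to speak, has that agent generate a message, and then delivers the message to every agent.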
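
The inject/step flow can be sketched with scripted agents in place of model-backed ones. ScriptedAgent, SketchSimulator, and the round_robin selection rule are hypothetical names introduced for this illustration; the post itself selects speakers through a director rather than round-robin.

```python
from typing import Callable, List, Tuple


class ScriptedAgent:
    """Hypothetical agent: replies with a fixed line and records what it hears."""

    def __init__(self, name: str, line: str) -> None:
        self.name = name
        self.line = line
        self.heard: List[str] = []

    def send(self) -> str:
        return self.line

    def receive(self, name: str, message: str) -> None:
        self.heard.append(f"{name}: {message}")


class SketchSimulator:
    """Same control flow as DialogueSimulator, minus reset()."""

    def __init__(
        self,
        agents: List[ScriptedAgent],
        selection_function: Callable[[int, List[ScriptedAgent]], int],
    ) -> None:
        self.agents = agents
        self._step = 0
        self.select_next_speaker = selection_function

    def inject(self, name: str, message: str) -> None:
        for agent in self.agents:
            agent.receive(name, message)
        self._step += 1

    def step(self) -> Tuple[str, str]:
        speaker = self.agents[self.select_next_speaker(self._step, self.agents)]
        message = speaker.send()
        for receiver in self.agents:  # the speaker also hears its own message
            receiver.receive(speaker.name, message)
        self._step += 1
        return speaker.name, message


def round_robin(step: int, agents: List[ScriptedAgent]) -> int:
    # Hypothetical selection rule used only in this sketch.
    return step % len(agents)


pm = ScriptedAgent("Project Manager", "Let's agree on the folder tree first.")
dev = ScriptedAgent("Frontend Developer", "I'll draft the canvas module.")
simulator = SketchSimulator([pm, dev], round_robin)
simulator.inject("Audience member", "Design a Pong page.")
turns = [simulator.step() for _ in range(2)]
print(turns)
```

Because inject already advances the step counter to 1, the first call to step sees step 1, which matters for any selection function that depends on odd and even steps.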

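IntegerOutputParser class

Purpose: Parses responses that are given as an integer.
Implementation:
- get_format_instructions: Returns the instruction text that tells the model to answer with an integer wrapped in angle brackets.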
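
To see what the <(\d+)> pattern accepts, here is a small sketch that uses only the standard re module. parse_choice is a hypothetical helper, not part of langchain and not RegexParser's implementation; it raises ValueError on unparseable text, which is the exception type the retry decorator in the listing watches for.

```python
import re

# Same pattern that the listing passes to IntegerOutputParser.
CHOICE_PATTERN = re.compile(r"<(\d+)>")


def parse_choice(text: str) -> int:
    """Return the first integer wrapped in angle brackets, or raise ValueError."""
    match = CHOICE_PATTERN.search(text)
    if match is None:
        raise ValueError(f"Could not parse an integer choice from: {text!r}")
    return int(match.group(1))


print(parse_choice("The next speaker should be <2>."))

try:
    parse_choice("I think the Frontend Developer should go next.")
except ValueError as exc:
    print(f"rejected: {exc}")
```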

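DirectorDialogueAgent class

Purpose: Inherits from DialogueAgent and manages and controls the flow of the conversation.
Implementation:
- __init__: Calls the parent class constructor and initializes the additional member variables.
- _generate_response: Decides whether to end the conversation and generates a response to the conversation so far.
- _choose_next_speaker: Selects the next agent to speak. It is wrapped in the tenacity.retry decorator, and the model's choice is parsed by IntegerOutputParser.
- send: Manages the flow of the conversation. It either ends the conversation or hands the floor to the next agent.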
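
The retry settings on _choose_next_speaker (two attempts, retry only on ValueError, fall back to index 0) can be written out by hand to make the behaviour explicit. This is a plain-Python stand-in for illustration, not how tenacity works internally; choose_with_fallback and ask_model are hypothetical names.

```python
from typing import Callable


def choose_with_fallback(
    ask_model: Callable[[], str], attempts: int = 2, fallback: int = 0
) -> int:
    """Try to read an integer choice; after `attempts` failures return `fallback`."""
    for attempt in range(1, attempts + 1):
        try:
            return int(ask_model())
        except ValueError as exc:
            print(f"ValueError occurred on attempt {attempt}: {exc}")
    return fallback


# Scripted replies stand in for model output (assumption for the demo).
replies = iter(["two", "3"])
print(choose_with_fallback(lambda: next(replies)))  # second attempt succeeds

bad_replies = iter(["first", "second"])
print(choose_with_fallback(lambda: next(bad_replies)))  # both fail, fallback is used
```

One consequence of the fallback is worth noting: when the model never produces a parseable index, the first speaker in the list is chosen silently, so that agent may speak more often than the others.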

3. Functions


topic = "Your goal is to design a single-page Pong game webpage using vanilla JavaScript and Django. Write the file and folder structure as a tree. Use code blocks with markdown for class names, descriptions, and goals. All courses should actively utilize your area of expertise and communicate with each other. All courses should be documented."
director_name = "Jane Doe"
agent_summaries = OrderedDict(
    {
        "Project Manager": ("Project Director", "Project Manager"),
        "Frontend Developer": ("Develop using pure vanilla JavaScript. You'll also need to leverage the Bootstrap toolkit", "Frontend Developer"),
        "Backend developers": ("Develop your backend using the Django framework", "Backend developers"),
        "Database Administrator": (" Manage and optimize PostgreSQL databases.", "Database Administrator"),
        "System architecture": ("Design and implement the backend as microservices.", "System architecture"),
    }
)

agent_summary_string = "\n- ".join(
    [""]
    + [
        f"{name}: {role}, located in {location}"
        for name, (role, location) in agent_summaries.items()
    ]
)

conversation_description = f"""This is a conversation about designing for this topic: {topic}.
This discussion features a team of experts, including {agent_summary_string}, who each bring unique insights and expertise to the development and implementation of gaming platforms. You should have specific programming design patterns, file and folder structures, and names and functional descriptions of classes."""
agent_descriptor_system_message = SystemMessage(
    content="Talk about the specifics of the design based on each team member's role."
)


def generate_agent_description(agent_name, agent_role, agent_location):
    agent_specifier_prompt = [
        agent_descriptor_system_message,
        HumanMessage(
            content=f"""{conversation_description}
            Please reply with a professional description of {agent_name}, who is a {agent_role} in {agent_location}, that emphasizes their particular role and location.
            Speak directly to {agent_name}
            Do not add anything else."""
        ),
    ]
    agent_description = ChatOpenAI(model="gpt-4", temperature=1.0)(agent_specifier_prompt).content
    return agent_description


def generate_agent_header(agent_name, agent_role, agent_location, agent_description):
    return f"""{conversation_description}
Your name is {agent_name}, your role is {agent_role}, and you are located in {agent_location}.
Your description is as follows: {agent_description}
You are discussing the topic: {topic}.
Your goal is to provide the most informative, professional, and direct design on the topic from the perspective of your role and location.
"""


def generate_agent_system_message(agent_name, agent_header):
    return SystemMessage(
        content=(
            f"""{agent_header}
You will speak in the style of {agent_name}, and exaggerate your personality.
You should have specific programming design patterns, file and folder structures, and names and functional descriptions of classes and functions.
Do not say the same things over and over again.
Speak in the first person from the perspective of {agent_name}
Do not change roles!
Do not speak from the perspective of anyone else.
Speak only from the perspective of {agent_name}.
Stop speaking the moment you finish speaking from your perspective.
Do not add anything else.
    """
        )
    )


agent_descriptions = [
    generate_agent_description(name, role, location)
    for name, (role, location) in agent_summaries.items()
]
agent_headers = [
    generate_agent_header(name, role, location, description)
    for (name, (role, location)), description in zip(
        agent_summaries.items(), agent_descriptions
    )
]
agent_system_messages = [
    generate_agent_system_message(name, header)
    for name, header in zip(agent_summaries, agent_headers)
]

for name, description, header, system_message in zip(
    agent_summaries, agent_descriptions, agent_headers, agent_system_messages
):
    print_w(f"\n\n{name} Description:")
    print_w(f"\n{description}")
    print_w(f"\nHeader:\n{header}")
    print_w(f"\nSystem Message:\n{system_message.content}")

topic_specifier_prompt = [
    SystemMessage(content="You can make a task more specific."),
    HumanMessage(
        content=f"""{conversation_description}
        Tell us more about your topic. 
        Organize your topic into one question that needs to be answered.
        Demonstrate your subject matter expertise.
        Stick to the specified topic.
        Don't add anything else."""
    ),
]
specified_topic = ChatOpenAI(model="gpt-4", temperature=1.0)(topic_specifier_prompt).content

print_w(f"Original topic:\n{topic}\n")
print_w(f"Detailed topic:\n{specified_topic}\n")

def select_next_speaker(
    step: int, agents: List[DialogueAgent], director: DirectorDialogueAgent
) -> int:
    """
    If the step is odd, then select the director
    Otherwise, the director selects the next speaker.
    """
    # the director speaks on odd steps
    if step % 2 == 1:
        idx = 0
    else:
        # here the director chooses the next speaker
        idx = director.select_next_speaker() + 1  # +1 because we excluded the director
    return idx

director = DirectorDialogueAgent(
    name=director_name,
    system_message=agent_system_messages[0],
    model=ChatOpenAI(model="gpt-4", temperature=0.2),
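    # director_name is not a key of agent_summaries, so filtering on it removed nothing
    # and let the director pick an index past the end of `agents`; skip the first entry instead.
    speakers=list(agent_summaries.keys())[1:],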
    stopping_probability=0.2,
)

agents = [director]
for name, system_message in zip(
    list(agent_summaries.keys())[1:], agent_system_messages[1:]
):
    agents.append(
        DialogueAgent(
            name=name,
            system_message=system_message,
            model=ChatOpenAI(model="gpt-4", temperature=0.2),
        )
    )

simulator = DialogueSimulator(
    agents=agents,
    selection_function=functools.partial(select_next_speaker, director=director),
)
simulator.reset()
simulator.inject("Audience member", specified_topic)
print_w(f"(Audience member): {specified_topic}")
print_w("\n")

while True:
    name, message = simulator.step()
    print_w(f"({name}): {message}")
    print_w("\n")
    if director.stop:
        break

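generate_agent_description function

Purpose: Generates a professional description of each agent.
Usage: Builds a prompt from agent_descriptor_system_message and a HumanMessage that asks for a description of the agent, then passes it to the ChatOpenAI model to generate the description.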

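generate_agent_header function

Purpose: Generates the header information for each agent.
Usage: Combines the agent's name, role, location, and description into a header that sets the context of the conversation.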
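
It can help to print what the name, role, and location fields actually turn into before sending them to the model. The sketch below reuses the join expression from the listing on a shortened, two-entry copy of agent_summaries (the shortening is mine) and shows that the second tuple element is rendered after "located in".

```python
from collections import OrderedDict

# Shortened copy of the post's agent_summaries, for illustration only.
agent_summaries = OrderedDict(
    {
        "Project Manager": ("Project Director", "Project Manager"),
        "Frontend Developer": ("Develop using pure vanilla JavaScript.", "Frontend Developer"),
    }
)

# Same expression as in the listing: the leading "" makes the first item
# start with "\n- " as well, so every agent gets its own bullet line.
agent_summary_string = "\n- ".join(
    [""]
    + [
        f"{name}: {role}, located in {location}"
        for name, (role, location) in agent_summaries.items()
    ]
)
print(agent_summary_string)
```

Since the tuples here hold a task description and a role name rather than a role and a place, the rendered text reads "located in Project Manager"; rewording the template or the tuples may be worth considering.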

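generate_agent_system_message function

Purpose: Generates the system message each agent uses in the conversation.
Usage: Builds, from each agent's header, a system message that states the rules of the conversation and the agent's role.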

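select_next_speaker function

Purpose: Provides the logic for selecting the next agent to speak.
Usage: Depending on the current step of the conversation, either the DirectorDialogueAgent picks the next speaker (even steps) or the DirectorDialogueAgent itself speaks automatically (odd steps).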
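
The simulator calls its selection function with only (step, agents), so functools.partial is what binds the director as the third argument. The sketch below shows the resulting turn order with a stub director; StubDirector is a hypothetical class that always picks the same speaker.

```python
import functools
from typing import List


class StubDirector:
    """Hypothetical director that always picks the speaker at a fixed index."""

    def __init__(self, chosen_speaker_id: int) -> None:
        self.chosen_speaker_id = chosen_speaker_id

    def select_next_speaker(self) -> int:
        return self.chosen_speaker_id


def select_next_speaker(step: int, agents: List[object], director: StubDirector) -> int:
    if step % 2 == 1:
        return 0  # the director sits at index 0 and speaks on odd steps
    return director.select_next_speaker() + 1  # +1 skips the director's own slot


director = StubDirector(chosen_speaker_id=2)
agents = [director, "Frontend Developer", "Backend developers", "Database Administrator"]

# partial fixes `director`, leaving the two-argument signature the simulator expects.
selector = functools.partial(select_next_speaker, director=director)
order = [selector(step, agents) for step in range(1, 5)]
print(order)  # [0, 3, 0, 3]
```

The director's choice is an index into its own speakers list, which excludes the director, so the +1 offset only lines up if speakers is exactly agents[1:] in the same order.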

4. Prompts

topic = "Your goal is to design a single-page Pong game webpage using vanilla JavaScript and Django. Write the file and folder structure as a tree. Use code blocks with markdown for class names, descriptions, and goals. All courses should actively utilize your area of expertise and communicate with each other. All courses should be documented."
director_name = "Jane Doe"
agent_summaries = OrderedDict(
    {
        "Project Manager": ("Project Director", "Project Manager"),
        "Frontend Developer": ("Develop using pure vanilla JavaScript. You'll also need to leverage the Bootstrap toolkit", "Frontend Developer"),
        "Backend developers": ("Develop your backend using the Django framework", "Backend developers"),
        "Database Administrator": (" Manage and optimize PostgreSQL databases.", "Database Administrator"),
        "System architecture": ("Design and implement the backend as microservices.", "System architecture"),
    }
)

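topic, which is the goal of the whole task, holds the objective that every agent keeps checking against from then on.

The prompts that describe each agent's role and rules are stored in agent_summaries as a dictionary and used from there.

Beyond these, each class also defines its own prompts internally.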

class DirectorDialogueAgent
...
        self.termination_clause = "Finish the conversation by stating a concluding message and thanking everyone."
        self.continuation_clause = "Do not end the conversation. Keep the conversation going by adding your own ideas."

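To end the conversation, termination_clause is appended to the end of the director's response prompt whenever the random draw in _generate_response falls below stopping_probability. continuation_clause is defined but is not used anywhere in this listing.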
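
Because each director turn ends the conversation independently with probability stopping_probability, the number of director turns follows a geometric distribution with mean 1 / stopping_probability, which is 5 for the 0.2 used in this post. The sketch below checks that by simulation; should_stop and director_turns_until_stop are hypothetical helper names, and the seed is arbitrary.

```python
import random


def should_stop(stopping_probability: float, rng: random.Random) -> bool:
    # Same comparison as in _generate_response: stop when the draw is below the threshold.
    return rng.uniform(0, 1) < stopping_probability


def director_turns_until_stop(stopping_probability: float, rng: random.Random) -> int:
    """Count director turns up to and including the one that ends the conversation."""
    turns = 1
    while not should_stop(stopping_probability, rng):
        turns += 1
    return turns


rng = random.Random(0)  # fixed seed so the run is repeatable
runs = [director_turns_until_stop(0.2, rng) for _ in range(10_000)]
average = sum(runs) / len(runs)
print(f"average director turns: {average:.2f} (theory: {1 / 0.2:.0f})")
```

Each director turn that does not stop is followed by one turn from another agent, so the total number of model calls per run is noticeably larger than the director turn count, which is relevant when estimating cost.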

self.choose_next_speaker_prompt_template = PromptTemplate(
            input_variables=["message_history", "speaker_names"],
            template=f"""{{message_history}}

Given the above conversation, select the next speaker by choosing index next to their name: 
{{speaker_names}}

{self.choice_parser.get_format_instructions()}

Do nothing else.
        """,
        )

This is the prompt for selecting the next agent.
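
The templates in DirectorDialogueAgent mix two kinds of braces, which is easy to misread. Inside an f-string, {self.prefix} is filled in immediately, while {{message_history}} is emitted as the literal placeholder {message_history} for the template to fill later. A minimal sketch, using str.format as a stand-in for PromptTemplate.format (an assumption that the two behave alike for simple placeholders):

```python
prefix = "Project Manager: "

# Single braces are substituted now; doubled braces survive as {placeholders}.
template = f"""{{message_history}}

Follow up with an insightful comment.
{{termination_clause}}
{prefix}"""

print(template)
print("---")

# Second stage: the remaining placeholders are filled at call time.
filled = template.format(
    message_history="Audience member: How should the Pong page be structured?",
    termination_clause="",
)
print(filled)
```

This two-stage substitution is why the agent's name is baked into the template once at construction, while the history and the termination clause change on every call.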

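5. Overall order of operation

1. Environment setup and initialization

- Import the required libraries (`import`) and load the environment variables (`dotenv`). This is needed to configure the system settings and environment.

- Turn on debug mode (`set_debug(True)`).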

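2. Class definitions for the agents and the simulator

- Define the basic agent that takes part in the conversation (`DialogueAgent`), the director agent that manages the flow of the conversation (`DirectorDialogueAgent`), and the dialogue simulator that manages them (`DialogueSimulator`).

3. Generating agent descriptions and system messages

- Use the function that generates a description of each agent's role and location (`generate_agent_description`).

- Based on these descriptions, generate the system message each agent uses during the conversation (`generate_agent_system_message`).

4. Creating the agent and simulator instances

- Create the agents for each role and the director agent, then initialize a dialogue simulator instance that contains them.

5. Starting the conversation

- Reset the dialogue simulator (`simulator.reset`) and inject the initial message from the audience member to start the conversation (`simulator.inject`).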

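6. Running the conversation

- The `while` loop advances the conversation step by step. Each step goes through the following:

- `simulator.step` is called to select the next agent to speak.

- The selected agent generates a message, and every agent receives it.

- The director agent (`DirectorDialogueAgent`) decides whether to end the conversation.

7. Ending the conversation

- Once the director agent decides to end the conversation, the `while` loop exits and the conversation is over.

This code uses `langchain` and OpenAI's GPT-4 chat model to simulate a complex conversation scenario, implementing each agent's role and interactions to create a realistic conversational setting.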

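6. Dead ends I hit on the way here

CustomAgent

I tried to reach the goal with a custom agent that chooses among several tools, but only realized later that the approach itself was wrong.
- A custom agent is meant for plugging in tools that langchain does not support
- A conversational agent, however, is a basic agent and needs no tools

There is no need to wrap the agents in yet another, larger agent that manages them.

I realized that each agent's work does not need to be split as finely as I first expected.
So I cut the set down to only the agents that were needed, which gave the current form.
The reason is that the token limits are clear: the 3.5-16k version is far too weak at writing code to cover everything,
while 4 or 4-turbo does not offer enough tokens, and the cost pressure is severe.
With every run costing more than 0.5 dollars for what is personal curiosity and study, a larger project is not feasible in the current state.

AutoGPT

Same problem as CustomAgent.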

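7. Takeaways

After going around in circles, I ended up back at the Cookbook. Most problems can be solved with the examples there.
The prompt engineering used in the internal implementation is harder to improve than I expected.
- I tried to improve it but saw little effect, so unless you plan to build everything from scratch, trusting and using langchain's internal prompts does not seem like a bad choice.
There is room to develop this further with more time.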

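8. Future

Once gpt-4-128k is released, it seems worth ignoring the cost and trying to hand over the coding of the entire project as well.
Vector DB embeddings could save some more tokens. This probably has to be added to the class functionality, and I still need to study this part.
A step should be added that, once the whole process is finished, leaves out the unnecessary conversation and presents only the results in a well-organized form.
Although I gave up on it earlier, I would also like to try more finely divided work using more agents.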

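Closing

It is a real pity that I could not spend much time on LangChain coding during this 8th cohort. It was the cohort in which I tried the most things, but also the one in which I felt my own shortcomings the most, so it gave me a reason to keep growing. Thank you to the many camp members who took part in the 8th cohort with me.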

Result links

GitHub (code) - Langchain_study/Multi_agent.py at main · cfcf26/Langchain_study (github.com)

Results - https://plume-plume-a35.notion.site/cff9149e7bb94a61bd8fc0c6484fdc52?pvs=4
