Extracting Data with Effective Prompt Engineering

1. Give Clear Instructions
To get an LLM to extract accurate data, give it clear, specific instructions. Clear instructions reduce ambiguity and help the model pull out the right data.

Example:
Your goal is to extract structured information from the user’s input that matches the form described below. When extracting information please make sure it matches the type information exactly. Do not add any attributes that do not appear in the schema shown below.

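2. Provide Reference Examples
Provide several input/output examples so the model can see what kind of data it is expected to extract. Reference examples help the model understand the task and make its output more consistent.

Example: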
Input: Songs by paul simon
Output: <json>{"musicrequest": {"artist": ["paul simon"]}}</json>

Input: Please stop the music
Output: <json>{"musicrequest": {"action": "stop"}}</json>

Input: play something
Output: <json>{"musicrequest": {"action": "play"}}</json>
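
The examples above follow one fixed pattern, so they can be generated from data rather than typed by hand. The following is a minimal sketch in plain Python, not Kor's API; the names EXAMPLES and render_examples are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical example store: (user input, expected extraction) pairs
# copied from the reference examples above.
EXAMPLES = [
    ("Songs by paul simon", {"musicrequest": {"artist": ["paul simon"]}}),
    ("Please stop the music", {"musicrequest": {"action": "stop"}}),
    ("play something", {"musicrequest": {"action": "play"}}),
]


def render_examples(examples):
    """Render each pair as an Input/Output block with the JSON wrapped in <json> tags."""
    blocks = []
    for text, extraction in examples:
        payload = json.dumps(extraction, ensure_ascii=False)
        blocks.append(f"Input: {text}\nOutput: <json>{payload}</json>")
    # A blank line separates consecutive examples, as in the listing above.
    return "\n\n".join(blocks)


print(render_examples(EXAMPLES))
```

Keeping the examples as data makes it easier to add or swap examples without risking a formatting slip in the prompt text.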

3. Use an Extraction Schema
Define the structure of the data to extract so the model knows exactly which format to produce. The Kor library provides tools that make this easy to implement.

Example:
musicrequest: {
song: Array<string> // songs the user wants to play
album: Array<string> // albums the user wants to play
artist: Array<string> // artists the user wants to listen to
action: "play" | "stop" | "previous" | "next" // action to take; one of play, stop, previous, next
}
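
A schema is also useful after the model has answered, because the output can be checked against it. The sketch below is plain Python rather than Kor's API, and the function name validate_music_request is hypothetical. It assumes every attribute is optional, since the reference examples in this post each include only some of them.

```python
ALLOWED_ACTIONS = {"play", "stop", "previous", "next"}
LIST_FIELDS = {"song", "album", "artist"}


def validate_music_request(data):
    """Return a list of problems; an empty list means the data fits the schema."""
    if not isinstance(data, dict) or set(data) != {"musicrequest"}:
        return ["top level must be an object with the single key 'musicrequest'"]
    request = data["musicrequest"]
    if not isinstance(request, dict):
        return ["'musicrequest' must be an object"]

    problems = []
    for key, value in request.items():
        if key in LIST_FIELDS:
            # song, album, artist: Array<string>
            if not (isinstance(value, list) and all(isinstance(v, str) for v in value)):
                problems.append(f"'{key}' must be a list of strings")
        elif key == "action":
            # action: one of the four literal values
            if not isinstance(value, str) or value not in ALLOWED_ACTIONS:
                problems.append(f"'action' must be one of {sorted(ALLOWED_ACTIONS)}")
        else:
            # Assumption: attributes outside the schema are treated as errors.
            problems.append(f"unexpected attribute '{key}'")
    return problems
```

For instance, validate_music_request({"musicrequest": {"genre": ["rock"]}}) reports the unexpected attribute, which is the case the instruction "Do not add any attributes that do not appear in the schema" is meant to prevent.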

4. The Full Prompt
Your goal is to extract structured information from the user’s input that matches the form described below. When extracting information please make sure it matches the type information exactly. Do not add any attributes that do not appear in the schema shown below.

musicrequest: {
song: Array<string> // songs the user wants to play
album: Array<string> // albums the user wants to play
artist: Array<string> // artists the user wants to listen to
action: "play" | "stop" | "previous" | "next" // action to take; one of play, stop, previous, next
}

Please output the extracted information in JSON format. Do not output anything except for the extracted information. Do not add any clarifying information. Do not add any fields that are not in the schema. If the text contains attributes that do not appear in the schema, please ignore them. All output must be in JSON format and follow the schema specified above. Wrap the JSON in <json> tags.

Input: Songs by paul simon
Output: <json>{"musicrequest": {"artist": ["paul simon"]}}</json>

Input: Please stop the music
Output: <json>{"musicrequest": {"action": "stop"}}</json>

Input: play something
Output: <json>{"musicrequest": {"action": "play"}}</json>

Input: [user input]
Output:
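
The prompt ends with an open "Output:" slot, so two small steps remain around the model call: filling in the user input, and turning the reply back into data. The sketch below covers only those two steps in plain Python. The names build_prompt and parse_response are hypothetical, and the model call itself is left out because it depends on the client being used.

```python
import json
import re

# Matches the first <json>...</json> span; DOTALL lets the JSON span several lines.
JSON_TAG = re.compile(r"<json>(.*?)</json>", re.DOTALL)


def build_prompt(template, user_input):
    """Fill the final 'Input: [user input]' slot of the prompt template."""
    return template.replace("[user input]", user_input)


def parse_response(response):
    """Return the decoded JSON inside <json> tags, or None if it is missing or invalid."""
    match = JSON_TAG.search(response)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


# Only the tail of the template is shown here; in practice it would hold
# the full prompt text above.
prompt = build_prompt("Input: [user input]\nOutput:", "play hey jude")
print(prompt)
print(parse_response('<json>{"musicrequest": {"action": "stop"}}</json>'))
```

Returning None instead of raising keeps the caller in control: it can retry the request or fall back to a default when the model ignores the tag instruction.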

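Writing the prompt this way helps the LLM extract data accurately and consistently. Successful data extraction depends heavily on a well-designed prompt, and a good prompt can in turn noticeably improve the quality and efficiency of a project.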
