Creating Audio Files (wav, mp3) with EdgeTTS in Python

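🔹 Introduction

I use Python programs, written by prompting an AI ("mouth coding"), as English and Korean video teaching materials. Until now they played speech in real time through EdgeTTS, but two problems kept coming up: playback is occasionally delayed, and Korean speech is sometimes pronounced differently from the written text. To address both, I needed a way to play saved wav files instead of synthesizing speech in real time.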

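Process

✅ What tools did you use, and how did you use them?

I added a WAV/TTS selector to the voice settings area at the bottom right of the image.
When TTS is selected, speech is synthesized and played in real time.
When WAV is selected, the matching audio file is played instead.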
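
The WAV/TTS switch described above can be sketched as a small lookup: in WAV mode, use the pre-rendered file when it exists, otherwise fall back to live TTS. The function name `pick_playback`, the `audio_dir` layout, and the `en{n}.wav` naming are assumptions for illustration, not the app's actual code.

```python
from pathlib import Path
from typing import Optional, Tuple


def pick_playback(mode: str, audio_dir: Path, number: int,
                  prefix: str = "en") -> Tuple[str, Optional[Path]]:
    """Decide how sentence `number` should be played.

    Returns ("wav", path) when WAV mode is selected and the file exists,
    otherwise ("tts", None) so the caller can synthesize speech live.
    """
    if mode.upper() == "WAV":
        candidate = audio_dir / f"{prefix}{number}.wav"
        if candidate.is_file():
            return "wav", candidate
    # TTS mode, or WAV mode with a missing file: fall back to live synthesis.
    return "tts", None
```

Falling back to TTS when a file is missing keeps playback working for sentences that have not been rendered yet.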

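1️⃣ I asked Cursor for Python code that uses EdgeTTS to produce wav or mp3 files

Enter English sentences in column A of an Excel file and turn them into wav or mp3 files
Choose a range of sentences in column A to convert
File names use the Excel row number: en1.wav, en2.wav
Select the Excel document, then
- choose English, Korean, or Chinese,
- choose the sentence range,
- choose the output folder,
- choose wav or mp3 and click the Generate button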
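
The row-range and file-naming rules above can be sketched as a pure function that maps spreadsheet rows to output file names. `plan_audio_files` and its parameters are hypothetical names; it takes the column values as a plain list so the sketch runs without Excel.

```python
from typing import List, Optional, Tuple


def plan_audio_files(column_a: List[Optional[str]], first_row: int, last_row: int,
                     prefix: str = "en", ext: str = "wav") -> List[Tuple[str, str]]:
    """Map spreadsheet rows to (filename, sentence) pairs.

    `column_a[0]` is the value of cell A1, so row n lives at index n - 1.
    Empty cells are skipped; the row number is used in the file name.
    """
    jobs = []
    for row in range(max(first_row, 1), last_row + 1):
        if row - 1 >= len(column_a):
            break  # the requested range runs past the end of the data
        cell = column_a[row - 1]
        if cell is None or not str(cell).strip():
            continue
        jobs.append((f"{prefix}{row}.{ext}", str(cell).strip()))
    return jobs
```

With openpyxl, the input list could be collected as `[cell.value for cell in sheet["A"]]`, and each resulting pair then handed to the TTS step.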


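✅ Many voices per language: English has the most, with 13 voices, and beyond American English there are many more English voices, including British, Australian, and Singaporean. Chinese also comes in several variants (Mainland, Taiwan, Hong Kong), and Korean has 3 voices.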
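
To see how the voices split across regions, the voice list can be grouped by locale. This is a minimal sketch that assumes each voice is a dict with `Locale` and `ShortName` keys, as in the script later in this post; `group_voices_by_locale` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def group_voices_by_locale(voices: Iterable[Mapping[str, str]],
                           language: str) -> Dict[str, List[str]]:
    """Group voice short names by locale for one language code such as "en"."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for voice in voices:
        locale = voice["Locale"]
        # Match "en-US", "en-GB", ... but not another code that merely starts with "en".
        if locale == language or locale.startswith(language + "-"):
            grouped[locale].append(voice["ShortName"])
    return {locale: sorted(names) for locale, names in sorted(grouped.items())}
```

Passing the result of `await edge_tts.list_voices()` with `"en"`, `"ko"`, or `"zh"` would give one entry per regional variant.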

✅ Step 2: added a feature for choosing among the various English, Korean, and Chinese voices

List the voices for each language and pick one from a combo box
Preview the selected voice before generating
Added a playback speed setting as well
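
The speed setting has to be turned into the signed percent string that the `rate` argument expects, for example `"+200%"` for 3x. A minimal sketch, with `speed_to_rate` as a hypothetical helper name:

```python
def speed_to_rate(speed: float) -> str:
    """Convert a playback multiplier (1.0 = normal speed) to a signed percent string."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    percent = round((speed - 1) * 100)
    # The "+d" format always writes the sign, so slower speeds produce "-50%" rather than "+-50%".
    return f"{percent:+d}%"
```

For instance, `speed_to_rate(1.5)` gives `"+50%"` and `speed_to_rate(0.5)` gives `"-50%"`.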


import asyncio
import os
from datetime import datetime
from pathlib import Path

import edge_tts
import openpyxl
from edge_tts import list_voices

async def main():
    try:
        # Voice settings
        eng_voice = "en-US-SteffanNeural"  # English voice
        kor_voice = "ko-KR-SunHiNeural"    # Korean voice
        eng_speed = 3  # English at 3x speed
        kor_speed = 3  # Korean at 3x speed
        
        # Print the available voices
        voices = await list_voices()
        print("\nAvailable voices:")
        print("\nEnglish voices:")
        english_voices = [voice for voice in voices if voice["Locale"].startswith("en")]
        for voice in english_voices:
            print(f"- {voice['ShortName']}: {voice['FriendlyName']}")
            
        print("\nKorean voices:")
        korean_voices = [voice for voice in voices if voice["Locale"].startswith("ko")]
        for voice in korean_voices:
            print(f"- {voice['ShortName']}: {voice['FriendlyName']}")
        
        print("\nCurrent settings:")
        print(f"English voice: {eng_voice}")
        print(f"Korean voice: {kor_voice}")
        print(f"Playback speed: English {eng_speed}x, Korean {kor_speed}x")
        
        # Directory that contains this script
        script_dir = Path(os.path.dirname(os.path.abspath(__file__)))
        excel_path = script_dir / 'en600.xlsx'  # Excel file in the same directory as the script
        
        if not os.path.exists(excel_path):
            raise FileNotFoundError(f"Excel file not found: {excel_path}")
            
        workbook = openpyxl.load_workbook(excel_path)
        sheet = workbook.active
        
        # Read rows 2-5 of columns A and B (widen the range to read more rows)
        words = []
        for row in range(2, 6):  # rows 2 through 5; range() excludes the end value
            english = sheet[f'A{row}'].value
            korean = sheet[f'B{row}'].value
            if english and korean:  # skip empty (None) cells
                # Normalize Korean spacing
                korean = korean.replace(" ", "")  # remove every space
                korean = korean.strip()  # trim any remaining leading/trailing whitespace
                words.append((english, korean))
        
        if not words:
            raise ValueError("Could not read any words from the Excel file.")
        
        print("\nWords read:")
        print("\nNo.   English               Korean      Time")
        print("-" * 60)
        
        # Synthesize and play each English/Korean pair
        for i, (english, korean) in enumerate(words, 1):
            # English audio, using the English voice
            communicate = edge_tts.Communicate(
                text=english,
                voice=eng_voice,  # English voice
                rate=f"{int((eng_speed-1)*100):+d}%"  # signed percent, e.g. 3x -> "+200%"
            )
            
            # Write the English audio file.
            # Note: edge-tts writes MP3-encoded data; the .wav name does not convert it.
            eng_file = script_dir / f"output_eng_{i}.wav"
            if os.path.exists(eng_file):
                os.remove(eng_file)
                
            await communicate.save(str(eng_file))
            
            # Korean audio, using the Korean voice
            communicate = edge_tts.Communicate(
                text=korean,
                voice=kor_voice,  # Korean voice
                rate=f"{int((kor_speed-1)*100):+d}%"  # signed percent, works for speeds below 1x too
            )
            
            # Write the Korean audio file (same MP3-in-.wav caveat as above)
            kor_file = script_dir / f"output_kor_{i}.wav"
            if os.path.exists(kor_file):
                os.remove(kor_file)
                
            await communicate.save(str(kor_file))
            
            # Display width of the Korean string (non-ASCII characters take 2 columns, ASCII takes 1)
            korean_display_length = sum(2 if ord(c) > 127 else 1 for c in korean)
            padding = 40 - (28 + korean_display_length)  # 28 = fixed part (number 4 + gap 2 + English 20 + gap 2)
            padding = max(padding, 1)  # keep at least one space
            
            # Log the row and play both files
            current_time = datetime.now().strftime("%H:%M:%S.%f")[:-3]  # millisecond precision
            print(f"{i:4d}  {english:20}  {korean}{' ' * padding}{current_time}")
            os.system(f'afplay "{eng_file}"')  # afplay is macOS-only; the quotes protect paths with spaces
            os.system(f'afplay "{kor_file}"')
            
            # Delete the temporary files
            os.remove(eng_file)
            os.remove(kor_file)
            
    except FileNotFoundError as e:
        print(f"\nFile error: {e}")
        print("Script location:", script_dir)
    except Exception as e:
        print(f"\nAn error occurred: {str(e)}")
        # Clean up temporary files that were created before the failure
        for f in ['eng_file', 'kor_file']:
            if f in locals() and os.path.exists(locals()[f]):
                os.remove(locals()[f])

if __name__ == "__main__":
    asyncio.run(main())
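
One thing worth checking: as far as I know, edge-tts's `save()` writes MP3-encoded audio, so a file named `.wav` is not automatically a real WAV file. The sketch below checks the file header to tell the two apart; `is_riff_wave` is a hypothetical helper name, not part of edge-tts.

```python
from pathlib import Path


def is_riff_wave(path: Path) -> bool:
    """Return True if the file starts with a RIFF/WAVE header."""
    with open(path, "rb") as f:
        header = f.read(12)
    # A WAV file begins with b"RIFF", a 4-byte size field, then b"WAVE".
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"
```

If this returns False for a generated file and a player insists on true WAV input, converting with an external tool such as ffmpeg (`ffmpeg -i input.mp3 output.wav`) is one option.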
