Extracting Real-Time Trending Searches from Google Trends

Introduction

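- Project name: Issue-based "Poem of the Day" recommendation and automated content generation
- Purpose: Every morning, at the user's request, search for major domestic and international issues, recommend a poem related to the chosen issue, write a blog post, and reuse that post as content for other SNS accounts, improving work efficiency.
- Tasks: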

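1. Issue collection and classification
   - Sources: Collect real-time issues from Google Trends, Naver DataLab, and Reddit's trending section.
   - Method: Collect real-time trending searches and hot keywords from each platform, split them into domestic and international, and classify them into five main categories (society, economy, politics, culture, environment).
   - Automation: Use Python scripts and APIs to collect automatically on a schedule every morning.
2. Issue analysis and summary
   - Analysis target: For each collected keyword, gather additional information from Google, Reddit, and Naver for in-depth analysis.
   - Summary: Use the GPT-4 model to summarize each keyword's background and current status in roughly 100 to 150 characters, and save the result to Google Sheets.
3. User selection and interaction
   - Issue delivery: When the user requests an issue analysis, run step 2 (issue analysis and summary), save the results to Google Sheets, and send them to the user by email. The email includes a link for viewing the results.
   - User selection: After receiving the email, the user clicks the link or button to view the issue list, selects the issues of interest, and asks the system to generate content. The selected issues are saved in the system automatically.
4. Content generation and poem recommendation guidelines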
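Table of contents and writing guidelines
   1. Poem of the Day recommendation
Recommend today's poem based on the finally selected major recent social issue. Reflect topics that are currently drawing attention by consulting Google Trends or real-time trending searches, but do not name the search source.
Introduce today's poem, including its title and author.
Explain the main issue, then briefly explain why the poem is recommended against that background. The explanation should read naturally, like human language.
Do not recommend a poem that has already been recommended. Check the list of previously recommended poems to prevent duplicates.
Provide the full text of today's poem.
Copyright review: If there is a copyright issue, always provide the key lines or stanza of the poem and include a link where the full poem can be read.
If there is no link, explain why the whole poem cannot be published, and explain in an emotive way why the poem is worth reading in full.
If there is no copyright issue, provide the whole poem.
If the poem is not Korean, provide the original and the Korean translation in two separate tables.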
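Commentary and appreciation:
Author introduction: Introduce the author in exactly 200 characters. Use natural, concise sentences that give readers useful information.
Expert commentary: Search the internet and provide, in exactly 300 characters and with sources, commentary or interpretations of the poem by other writers, experts, or critics.
Write an emotive, sufficiently long appreciation that the target audience, women in their 20s to 50s, can relate to. Include specific lines from the poem and their interpretation so that readers can relate more deeply, and write it in exactly 300 characters.
Write the commentary and appreciation in the tone of the author's '****', in a deep, delicate way that explores inner emotions.
Why you need this poem today:
Explain the connection between the poem and today's issue.
Encouraging member interaction
Discussion question: Ask a question about the poem that invites members to share their impressions in the comments. Example: "How did this poem make you feel?"
Emotion-sharing space: Encourage members to express their feelings freely.
Add a fitting icon to each of items 1 through 4 in the table of contents, and keep the same icon for each item every time.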
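SEO and hashtags:
Keywords: Derive suitable keywords from the recommended poem, the related issue, and its emotional themes, and work them naturally into the body.
Hashtags: Write related hashtags that will be found easily on Naver, Google, and SNS.
Prompts for generating the background image and music:
Flux AI background image prompt: Write a prompt for generating a background image in Flux AI that suits the recommended poem.
Build the prompt from the poem's main imagery, including natural scenery, cities, people, mood, and emotion, so that the image reflects the poem's content and the feelings it evokes (for example comfort, peace, hope, or sadness).
The image must be high quality, at the level of a professional documentary photograph.
Write the prompt in English.
Suno background music prompt:
Generate background music that suits the recommended poem.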
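
The guidelines above fix several lengths exactly (a 200-character author introduction, 300-character expert commentary and appreciation) and forbid recommending the same poem twice. Rules like these can be checked mechanically once a draft exists. The following is a minimal sketch of such a check, not part of the original workflow: the function names, the section keys, and the idea of validating after generation are all assumptions made for illustration.

```python
# Hypothetical post-generation checks; every name and key here is illustrative.

# Exact character counts taken from the writing guidelines.
REQUIRED_LENGTHS = {
    "author_intro": 200,
    "expert_commentary": 300,
    "appreciation": 300,
}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical strings compare equal."""
    return " ".join(text.lower().split())


def is_duplicate(title: str, author: str, history: list[tuple[str, str]]) -> bool:
    """Return True if this (title, author) pair was already recommended."""
    seen = {(normalize(t), normalize(a)) for t, a in history}
    return (normalize(title), normalize(author)) in seen


def length_errors(sections: dict[str, str]) -> dict[str, int]:
    """Map each section whose length is wrong to its actual character count."""
    errors = {}
    for name, required in REQUIRED_LENGTHS.items():
        actual = len(sections.get(name, ""))  # len() counts Unicode code points
        if actual != required:
            errors[name] = actual
    return errors
```

A draft would pass when `is_duplicate(...)` is false and `length_errors(...)` returns an empty dictionary. Whether spaces and punctuation count toward the limits is left open by the guidelines; this sketch counts every character.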

Procedure

import time
from datetime import datetime, timedelta

import streamlit as st
from pytrends.request import TrendReq

class GoogleTrendsScraper:
    def __init__(self):
        # Korean language; pytrends expects tz as minutes west of UTC, so UTC+9 is -540
        self.pytrends = TrendReq(hl='ko', tz=-540)

    def build_payload_with_retry(self, kw_list, retries=3, delay=10):
        """Call build_payload, retrying with exponential backoff on HTTP 429."""
        for attempt in range(retries):
            try:
                self.pytrends.build_payload(
                    kw_list,
                    cat=0,
                    timeframe='now 1-d',
                    geo='KR',
                    gprop=''
                )
                return True
            except Exception as e:
                # Rate limiting is detected from the 429 text in the error message
                if 'returned a response with code 429' in str(e):
                    st.warning(f"Rate limited. Retrying in {delay} seconds.")
                    time.sleep(delay)
                    delay *= 2  # double the wait before the next attempt
                else:
                    st.error(f"Error while building the request: {str(e)}")
                    return False
        return False
        
    def get_interest_score(self, keyword):
        """Return the current and average interest for a keyword over the last 24 hours."""
        # Fallback returned whenever no data is available
        no_data = {'current': 'N/A', 'avg': 'N/A'}
        try:
            # build_payload_with_retry uses the 'now 1-d' timeframe (last 24 hours)
            if not self.build_payload_with_retry([keyword]):
                return no_data

            # Fetch the interest-over-time data
            interest_df = self.pytrends.interest_over_time()

            if not interest_df.empty and keyword in interest_df.columns:
                avg_score = interest_df[keyword].mean()
                current_score = interest_df[keyword].iloc[-1]  # most recent data point
                return {
                    'current': f"{current_score:.0f}",
                    'avg': f"{avg_score:.0f}",
                }
            return no_data

        except Exception as e:
            st.error(f"Error while fetching interest for {keyword}: {str(e)}")
            return no_data
                     
    def fetch_trending_keywords(self):
        try:
            # Fetch the trending keywords
            trending_searches_df = self.pytrends.trending_searches(pn='south_korea')
            if trending_searches_df.empty:
                st.error("Could not fetch trend data.")
                return []

            trends = []
            keywords = trending_searches_df.head(10).values.flatten().tolist()

            # Exclude problematic keywords
            problem_keywords = ['수능) 국어']
            keywords = [kw for kw in keywords if kw not in problem_keywords]

            # Process the keywords in batches of 5
            for i in range(0, len(keywords), 5):
                batch = keywords[i:i+5]
                time.sleep(60)  # wait 1 minute before each batch of 5 keywords
                try:
                    if not self.build_payload_with_retry(batch):
                        st.error(f"Failed to build payload for batch: {batch}")
                        continue  # move on to the next batch

                    interest_df = self.pytrends.interest_over_time()

                    if interest_df.empty:
                        st.error(f"No data returned for batch: {batch}")
                        continue  # move on to the next batch

                    for offset, keyword in enumerate(batch):
                        try:
                            if keyword in interest_df.columns:
                                avg = f"{interest_df[keyword].mean():.0f}"
                                current = f"{interest_df[keyword].iloc[-1]:.0f}"
                            else:
                                avg = current = 'N/A'

                            trends.append({
                                'rank': i + offset + 1,  # position in the filtered keyword list
                                'keyword': keyword,
                                'traffic': {'current': current, 'avg': avg},
                                'timestamp': datetime.now().strftime("%Y-%m-%d %H:%M:%S")
                            })
                        except Exception as e:
                            st.error(f"Error processing keyword '{keyword}': {str(e)}")
                            continue
                except Exception as e:
                    st.error(f"Error processing batch {batch}: {str(e)}")
                    continue
            st.write(f"Total trends collected: {len(trends)}")
            return trends

        except Exception as e:
            st.error(f"Error while fetching trend data: {str(e)}")
            return []

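    def get_cached_trends(self, cache_duration=300):  # 5-minute cache
        # Note: the cache lives on this instance, so it is lost when a new scraper is created
        current_time = time.time()

        # Reuse the cached result while it is still fresh
        if (hasattr(self, '_cache') and hasattr(self, '_cache_time') and
            current_time - self._cache_time < cache_duration):
            return self._cache

        # Otherwise fetch fresh data and store it with its timestamp
        trends = self.fetch_trending_keywords()
        self._cache = trends
        self._cache_time = current_time

        return trends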

def main():
    st.title("Google Trends Korea Real-time")

    # Placeholder for the status line
    status_container = st.empty()
    scraper = GoogleTrendsScraper()

    # Refresh button
    if st.button("Refresh"):
        st.rerun()  # replaces the deprecated st.experimental_rerun()

    # Show the current time in the placeholder
    with status_container:
        st.write(f"Top 10 real-time trending searches as of {datetime.now().strftime('%Y-%m-%d %H:%M')}")

    # Show a spinner while the data loads
    with st.spinner('Loading data...'):
        trends = scraper.get_cached_trends()

    if trends:
        for trend in trends:
            with st.container():
                col1, col2 = st.columns([3, 1])
                with col1:
                    st.write(f"**{trend['rank']}. {trend['keyword']}**")
                with col2:
                    traffic = trend['traffic']
                    st.write(f"Current interest: {traffic['current']}")
                    st.write(f"Average interest: {traffic['avg']}")

                st.write(f"Time: {trend['timestamp']}")
                st.divider()
    else:
        st.warning("Could not fetch trend data.")


if __name__ == "__main__":
    main()
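
The script above combines two ideas: splitting the keywords into batches of five, and retrying a rate-limited request with a delay that doubles each time. The sketch below isolates both as plain functions with no pytrends or Streamlit dependency, so they can be tried offline. `chunked`, `retry_with_backoff`, and the injectable `sleep` parameter are names invented here for illustration. Unlike the class above, this version retries on any exception rather than only on a 429 message, and it does not wait after the final failed attempt.

```python
import time
from typing import Callable, Iterator, Sequence


def chunked(items: Sequence[str], size: int = 5) -> Iterator[tuple[int, list[str]]]:
    """Yield (first_rank, batch) pairs, with ranks starting at 1."""
    for start in range(0, len(items), size):
        yield start + 1, list(items[start:start + size])


def retry_with_backoff(
    func: Callable[[], object],
    retries: int = 3,
    delay: float = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call func until it succeeds, doubling the wait after each failure.

    Returns True on success and False once every attempt has failed.
    """
    for attempt in range(retries):
        try:
            func()
            return True
        except Exception:
            if attempt == retries - 1:
                break  # no point waiting after the last attempt
            sleep(delay)
            delay *= 2  # exponential backoff: 10s, 20s, 40s, ...
    return False
```

Passing a fake `sleep` (for example a list's `append` method) records the waits instead of blocking, which makes the backoff schedule easy to inspect without waiting for real delays.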

Results and Lessons Learned

References (optional)

None
