Vision Recognition Solution Dev Diary 1, with ChatGPT

If you read this post and have advice on measuring foam ratio or on real-time video recognition, I would really appreciate hearing it.

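Purpose
1. Build a solution that analyzes the video coming in from a USB camera, tracks the state of the foam, and plays an alert sound when the foam reaches the finishing stage.
2. Use AI features at low cost to build a variety of solutions and offer customized ones to small and medium-sized businesses.
3. I would also like to modify the code so that it analyzes the foam ratio.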

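Development environment
1. Windows PC, VS Code + Python (env), no Anaconda
2. The model is trained with Teachable Machine (https://teachablemachine.withgoogle.com/) on the video footage I recorded.
3. Use the Teachable Machine model to analyze live USB camera video and improve the recognition rate for the finishing stage.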
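
A single-frame prediction can flicker between labels, so one way to make the finishing-stage detection steadier is to require several consecutive confident frames before reacting. The sketch below is only an illustration: the class name `EndingDebouncer`, the 10-frame streak, and the 0.95 threshold are assumptions, not values from this project.

```python
class EndingDebouncer:
    """Report True only after `required` consecutive confident target frames.

    Hypothetical helper: the name and the default values are illustrative.
    """

    def __init__(self, target="Ending", threshold=0.95, required=10):
        self.target = target
        self.threshold = threshold
        self.required = required
        self.streak = 0  # consecutive matching frames seen so far

    def update(self, label, confidence):
        """Feed one frame's prediction; return True once the streak is long enough."""
        if label == self.target and confidence >= self.threshold:
            self.streak += 1
        else:
            self.streak = 0  # any miss resets the count
        return self.streak >= self.required
```

In the capture loop, `update()` would be called once per frame with the parsed label and `prediction[0][index]`, and the alert would play only when it returns True.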

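Development progress - training on the data

1. Extract images from the recorded video under three labels and train on them
- Label 1. Starting (early foam)
- Label 2. Going
- Label 3. Ending

* The images were extracted using After Effects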
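
As an alternative to exporting frames by hand, the labeling could be scripted. The sketch below only plans which frame index goes under which label. The stage boundaries (in seconds), the function name `plan_frames`, and the sampling interval are all hypothetical, and reading the frames themselves (for example with `cv2.VideoCapture`) is left out.

```python
def plan_frames(duration_s, fps, every_s, boundaries):
    """Return (frame_index, label) pairs sampled every `every_s` seconds.

    `boundaries` is a list of (start_second, label) sorted by start time;
    each label applies from its start until the next entry begins.
    """
    plan = []
    n = 0
    while n * every_s < duration_s:
        t = n * every_s
        label = None
        for start, name in boundaries:
            if t >= start:
                label = name  # keep the latest stage that has already started
        if label is not None:
            plan.append((int(round(t * fps)), label))
        n += 1
    return plan


# Hypothetical 6-second clip at 30 fps, one sample every 2 seconds
stages = [(0, "Starting"), (2, "Going"), (4, "Ending")]
print(plan_frames(6, 30, 2, stages))
```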

2. Download the model obtained from Teachable Machine. The file comes named keras_model.h5, which I found curious...

Teachable Machine also provides example code, so I tried running it. It did not run...
😟

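3. This is when I started talking with ChatGPT.
I am continuing the conversation based on the code below. Since I have gone back and recorded new footage, I plan to retrain on it and revise the code further.
Still, thanks to ChatGPT I keep asking, trying, and improving step by step, which is nice.
Of course, being taught by someone who really knows the subject would be best....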

import tensorflow as tf
import cv2  # Install opencv-python
import numpy as np
from tensorflow.keras.models import load_model
import pygame
# Disable scientific notation for clarity
np.set_printoptions(suppress=True)

# Load the model
model = load_model("keras_model.h5", compile=False)

# Load the labels
class_names = open("labels.txt", "r", encoding="utf-8").readlines()

# CAMERA can be 0 or 1 based on default camera of your computer
camera = cv2.VideoCapture(0)

while True:
    # Grab the webcamera's image.
    ret, image = camera.read()

    # Stop if the camera did not return a frame (unplugged or in use elsewhere)
    if not ret:
        break

    # Resize the raw image into (224-height,224-width) pixels
    image = cv2.resize(image, (224, 224), interpolation=cv2.INTER_AREA)

    # Show the image in a window
    cv2.imshow("Webcam Image", image)

    # Make the image a numpy array and reshape it to the models input shape.
    image = np.asarray(image, dtype=np.float32).reshape(1, 224, 224, 3)

    # Normalize the image array
    image = (image / 127.5) - 1

    # Run the model on the current frame
    prediction = model.predict(image)
    index = np.argmax(prediction)
    class_name = class_names[index]
    confidence_score = prediction[0][index]
    
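    # "거품끝" ("foam finished") must match the label text in labels.txt
    if class_name[2:].strip() == "거품끝":
        # Initialize the mixer only once, and do not restart a sound that is already playing
        if not pygame.mixer.get_init():
            pygame.mixer.init()
        if not pygame.mixer.music.get_busy():
            pygame.mixer.music.load('LatinBgm2.mp3')
            pygame.mixer.music.play()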

    # Print prediction and confidence score
    print("Class:", class_name[2:], end="")
    print("Confidence Score:", int(np.round(confidence_score * 100)), "%")


    # Listen to the keyboard for presses.
    keyboard_input = cv2.waitKey(1)

    # 27 is the ASCII for the esc key on your keyboard.
    if keyboard_input == 27:
        break

camera.release()
cv2.destroyAllWindows()
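
One fragile spot in the listing above is `class_name[2:]`, which assumes each line of labels.txt starts with a one-digit index and a space (for example `0 Starting`). That file layout is an assumption inferred from the slice, and `parse_label` is a hypothetical helper name. Splitting on the first space would keep working even if the index had two digits:

```python
def parse_label(line):
    """Return the label text from a line such as '0 Starting'."""
    line = line.strip()
    index, sep, name = line.partition(" ")
    # Fall back to the whole line when there is no numeric prefix
    if sep and index.isdigit():
        return name.strip()
    return line


print(parse_label("2 거품끝\n"))
print(parse_label("10 Ending\n"))
```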

In the code above, I want to compare the values below.
Please write an if statement so that [pygame.mixer.init()
        pygame.mixer.music.load('LatinBgm2.mp3')
        pygame.mixer.music.play()] runs only when both conditions hold: class_name is "거품끝" and confidence_score is 95 or higher.
class_name = class_names[index]
confidence_score = prediction[0][index]
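
A minimal sketch of the condition asked for above. One detail worth noting: `prediction[0][index]` is a probability between 0 and 1, so "95 or higher" corresponds to 0.95. The helper name `should_play_alarm` is hypothetical, and it assumes the label line looks like `2 거품끝`, as the `[2:]` slice in the listing implies.

```python
def should_play_alarm(class_name, confidence_score, target="거품끝", threshold=0.95):
    """True when the predicted label is the target and the confidence is high enough."""
    label = class_name[2:].strip()  # drop the "N " index prefix and the newline
    return label == target and confidence_score >= threshold


# Inside the loop, this would guard the pygame calls:
# if should_play_alarm(class_name, confidence_score):
#     pygame.mixer.music.play()
print(should_play_alarm("2 거품끝\n", 0.97))
print(should_play_alarm("2 거품끝\n", 0.80))
```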

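Things to resolve
1. Finish the code with ChatGPT, using the model obtained from Teachable Machine and trained on the existing footage to detect the Ending point
2. Find out whether the Ending point can be recognized in real time from the foam ratio alone, without a model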
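
Whether a model-free approach works depends on the footage, but a first experiment could be a simple brightness threshold. This sketch assumes foam appears brighter than the liquid underneath. The function name `foam_ratio` and the threshold of 200 are illustrative guesses that would need tuning on real frames.

```python
import numpy as np


def foam_ratio(frame, threshold=200):
    """Fraction of pixels whose brightness is at or above `threshold`.

    `frame` is an H x W x 3 uint8 image, such as one returned by camera.read().
    """
    gray = frame.astype(np.float32).mean(axis=2)  # rough grayscale: channel average
    return float((gray >= threshold).mean())


# Synthetic check: top half white ("foam"), bottom half dark ("liquid")
demo = np.zeros((4, 4, 3), dtype=np.uint8)
demo[:2] = 255
print(foam_ratio(demo))
```

The Ending point could then be defined as the ratio crossing some level, which again would have to be chosen from real recordings.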

https://lovetang.tistory.com/12

See you in Dev Diary 2
[Image: the data currently being used for training]
