🧠 Building a Korean Character Recognition Model with PyTorch
Hangul, the Korean writing system, is both elegant and systematic. Each character is a combination of components called Jamo: 초성 (initial), 중성 (medial), and 종성 (final). In this project, I built a deep learning model using PyTorch to recognize these components from rendered character images. Here's how I did it.

✨ Project Overview
This project involves:
- Generating synthetic images of Hangul characters with their bounding boxes.
- Decomposing characters into Jamo components.
- Training a convolutional neural network to classify each character's 초성, 중성, and 종성.
- Performing inference on single or multiple images and visualizing the results.

Let’s break it down.

🏗️ Step 1: Generate Hangul Character Dataset
We use a TrueType font (like Malgun Gothic) to render images of Hangul syllables and compute their bounding boxes. Each image is saved along with Jamo annotations in JSON.

generate_korean_chars.py

from PIL import Image, ImageDraw, ImageFont
import os
i...
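
The Jamo decomposition mentioned in the overview can be sketched with plain Unicode arithmetic. Precomposed Hangul syllables occupy U+AC00 through U+D7A3 and are ordered by initial, then medial, then final, so the three indices fall out of integer division. This is a minimal sketch, not necessarily how generate_korean_chars.py does it: the names `decompose_syllable` and `jamo_indices` are hypothetical, and the use of compatibility Jamo characters for the labels is an assumption.

```python
# Hypothetical helper: split a precomposed Hangul syllable into its Jamo.
# Assumes the standard Unicode layout: 19 initials x 21 medials x 28 finals,
# where final index 0 means "no final consonant".

HANGUL_BASE = 0xAC00  # code point of '가', the first precomposed syllable
HANGUL_LAST = 0xD7A3  # code point of '힣', the last precomposed syllable

CHOSEONG = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")               # 19 initials
JUNGSEONG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")            # 21 medials
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals


def jamo_indices(ch: str) -> tuple[int, int, int]:
    """Return (initial, medial, final) class indices for one syllable."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    code = ord(ch)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError(f"{ch!r} is not a precomposed Hangul syllable")
    offset = code - HANGUL_BASE
    initial = offset // (21 * 28)        # each initial spans 588 syllables
    medial = (offset % (21 * 28)) // 28  # each medial spans 28 syllables
    final = offset % 28                  # 0 means no final consonant
    return initial, medial, final


def decompose_syllable(ch: str) -> tuple[str, str, str]:
    """Return the (초성, 중성, 종성) Jamo characters for one syllable."""
    initial, medial, final = jamo_indices(ch)
    return CHOSEONG[initial], JUNGSEONG[medial], JONGSEONG[final]


if __name__ == "__main__":
    for syllable in "한글가":
        print(syllable, jamo_indices(syllable), decompose_syllable(syllable))
```

The integer indices are convenient as classification targets for the three outputs (19, 21, and 28 classes respectively), while the Jamo characters are easier to read in a JSON annotation.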