Breaking Limits in Scene Text Recognition: CUEE MDAP Lab’s AI Super-Resolution Model
From the CUEE MDAP Lab (Multimedia Data Analytics and Processing Research Unit), Department of Electrical Engineering, Chulalongkorn University

Turning Blurry Words into Crystal-Clear Text with AI
Have you ever taken a photo of a sign or document, only to find the text too blurry to read? Machines face the same problem. When computer-vision systems try to read text from low-quality images, whether on traffic signs, street names, or scanned documents, they often struggle. The task of reading text in natural images is known as Scene Text Recognition (STR), and low image quality is one of its biggest obstacles.
To fix this, researchers developed a technique called Scene Text Image Super-Resolution (STISR). Think of it like giving glasses to a blurry image: it sharpens the text, making it clearer and easier for both humans and AI systems to understand.
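To see why plain upscaling is not enough, here is a toy sketch (plain Python, purely illustrative, not the MADN method): nearest-neighbour upscaling just enlarges each pixel, so a blurry letter stays blurry. Closing that gap, by actually reconstructing strokes, is what STISR models learn to do.

```python
# Toy nearest-neighbour upscaling: enlarges pixels but adds no new detail.
# Illustrative only -- real STISR models learn to reconstruct fine strokes.

def upscale_nearest(image, factor):
    """Upscale a 2D grid of pixel values by repeating each pixel."""
    out = []
    for row in image:
        stretched = [p for p in row for _ in range(factor)]
        out.extend(list(stretched) for _ in range(factor))
    return out

# A tiny "low-res" patch: a faint vertical stroke (value 9) on background (0).
low_res = [
    [0, 9, 0],
    [0, 9, 0],
]
for row in upscale_nearest(low_res, 2):
    print(row)
# The stroke is simply duplicated into a bigger block of the same pixels:
# the image is larger, but no sharper than before.
```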

📸 The Problem with Blurry Text
Cameras in real life—like the ones in phones, drones, or surveillance systems—don’t always capture perfect images. Poor lighting, shaky hands, or distance can make text hard to read. Traditional sharpening and super-resolution methods improve the overall picture, but they often miss the fine details that matter most for text: edges, strokes, and character shapes.

🤖 Enter MADN: Multi-Attention with Diffusion Network
A new model called MADN has been designed to solve this exact issue. Instead of just brightening or sharpening the whole image, MADN uses smart AI techniques to focus only on the important parts—the letters themselves.
Here’s how it works:
- Attention Modules: Like a human eye focusing on key details, MADN has a system that decides what parts of the image matter most. It zooms in on the letters while ignoring the background.
- Sequential Learning: Text isn’t just shapes; it’s a sequence of letters. MADN uses a memory mechanism called a Bidirectional Long Short-Term Memory (BLSTM) network to understand how characters relate to each other, helping it fill in missing or blurry parts more accurately.
- Diffusion Process: This clever trick gradually removes noise and sharpens details, almost like a digital artist carefully redrawing each letter.
The result? Text images that go from unreadable blobs to clear, structured words.
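The attention and diffusion ideas above can be sketched in a few lines of plain Python. Everything here is a simplified illustration under stated assumptions, not the actual MADN code: softmax attention weights emphasise text-like pixels, and a diffusion-style loop then refines a noisy estimate step by step (in a real diffusion model, the refinement direction is predicted by a neural network; here it is given directly to keep the sketch tiny).

```python
import math

# Illustrative sketch only -- not the MADN implementation.

# Attention: softmax scores decide which pixels matter most.
def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Diffusion-style refinement: repeatedly nudge a noisy estimate toward a
# clean target, removing a little noise each step. In a real model the
# "direction to move" is predicted by a trained network, not given.
def refine(noisy, target, steps=10, rate=0.3):
    x = list(noisy)
    for _ in range(steps):
        x = [xi + rate * (ti - xi) for xi, ti in zip(x, target)]
    return x

# Pixels with higher "text-likeness" scores get larger attention weights.
scores = [0.1, 2.0, 0.1]   # middle pixel looks like a character stroke
weights = softmax(scores)
features = [5.0, 9.0, 4.0]
attended = [w * f for w, f in zip(weights, features)]

# The refinement loop pulls a noisy stroke estimate toward the clean one.
restored = refine(noisy=[3.0, 12.0, 2.5], target=[0.0, 9.0, 0.0])
```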

In tests on the TextZoom dataset (a standard benchmark for this task), MADN outperformed previous models, delivering sharper images and higher recognition accuracy.
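The post doesn’t name the metrics, but super-resolution benchmarks like TextZoom are commonly scored with PSNR (peak signal-to-noise ratio) alongside downstream recognition accuracy. A minimal sketch of PSNR for 8-bit pixel values (the sample pixel lists are made up for illustration):

```python
import math

def psnr(reference, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio between two equal-length pixel lists."""
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_val ** 2 / mse)

clean   = [200, 180, 190, 210]
blurry  = [190, 170, 200, 205]
sharper = [198, 178, 192, 208]
print(psnr(clean, blurry))   # lower score: further from the clean image
print(psnr(clean, sharper))  # higher score: closer to the clean image
```

Higher PSNR means the restored image is numerically closer to the ground-truth sharp image; recognition accuracy then checks whether an STR model can actually read the restored text.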

🚀 A Step Toward Smarter Vision
What makes MADN stand out is its balance of accuracy and efficiency: it delivers state-of-the-art text clarity with moderate compute. As a next step, the paper notes plans for real-time optimization so the model can run on edge devices in autonomous-driving and smart-city scenarios.
🌍 Why This Matters
This technology isn’t just academic. It has a real-world impact in:

- Autonomous Driving: reading road signs in poor weather.
- Smart Cities: digitizing old or damaged signs.
- Document Scanning: recovering details from low-quality scans.
- Security and Surveillance: making sense of blurry camera footage.

In Short: The MADN model gives AI-powered “glasses” to blurry text images, making them readable and useful again. It’s a powerful step forward in how machines—and people—can interact with the world through clearer, sharper digital vision.
🔗 Learn more & official links
- CUEE MDAP official website
- Collaboration announcement: MDAP × Design Gateway
📬 Contact Us
Interested in research from the CUEE MDAP Lab? Beyond this MADN work, the lab’s projects include:
- Anomaly Detection
- Digital Image and Video Super‑resolution Techniques
- Digital Video Coding
- Face Recognition and Emotional Expression
Please reach out via Design Gateway to connect with the team and learn more.
