Subtitle Encoding Fixer

Auto-detect and fix subtitle file encoding. Supports SRT, ASS, SSA, VTT and LRC formats, with preview and batch processing.

Version 1.0.0

About Subtitle Encoding Fixer

Open a downloaded subtitle in your video player and what you see is not the Chinese, Japanese, Korean or Arabic text the file actually carries. Instead it is a wall of mojibake: ÎÄ×Ö where 文字 should be, or ��� in place of actual characters. The cause is almost always an encoding mismatch: the file was saved in GBK (Mainland China), Big5 (Taiwan), Shift-JIS (Japan), EUC-KR (Korea) or CP1252 (Western Windows), but your player assumed UTF-8 or a Western code page. The fix is to re-save the file as UTF-8, but guessing the source encoding by eye in a hex editor is slow.
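
To see where the garbling comes from, the minimal sketch below encodes a two-character string as GBK and then decodes those bytes the way a mismatched player might. The sample string and the choice of CP1252 as the wrong guess are illustrative assumptions, not something the tool exposes.

```python
# Illustrative only: reproduce mojibake by decoding GBK bytes with the wrong codec.
original = "文字"
raw = original.encode("gbk")  # bytes as a GBK-saved subtitle would store them

as_cp1252 = raw.decode("cp1252")                 # each byte becomes a Western letter or symbol
as_utf8 = raw.decode("utf-8", errors="replace")  # invalid UTF-8 sequences become U+FFFD
recovered = raw.decode("gbk")                    # the real source encoding restores the text

print(as_cp1252)  # ÎÄ×Ö
print(as_utf8)    # replacement characters
print(recovered)  # 文字
```

The bytes never change in any of the three cases; only the interpretation does, which is why re-saving as UTF-8 with the correct source encoding is a lossless fix.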

This subtitle encoding fixer auto-detects the source encoding with chardet, reports a confidence score, and rewrites the file as UTF-8 (with or without BOM — useful when the player needs a BOM to recognise the file as Unicode). Supported source encodings cover the East Asian common cases — GBK, GB2312, Big5, Shift-JIS, EUC-KR — plus CP1252 and Latin-1 for older Windows files. Source format doesn't matter: SRT, ASS, SSA, VTT, LRC all convert with their timing and structure preserved. A before/after preview lets you confirm characters are now legible before downloading. Files up to 10 MB.

Use it to fix the recurring "downloaded SRT shows ?? in VLC" problem, normalise a multi-language subtitle archive into UTF-8, re-save a GBK-encoded Chinese fan-sub for use in modern players, prepare subtitles for YouTube or Vimeo upload (which requires UTF-8), or convert legacy Shift-JIS anime subtitles before sharing. Files are processed in a stateless serverless function and discarded immediately after the response.

Subtitle Encoding Fixer Use Cases

  • Fixing a downloaded Chinese SRT that shows as ?? or mojibake in VLC or MPV
  • Normalising a multi-language subtitle archive into one consistent UTF-8 encoding
  • Preparing subtitles for YouTube or Vimeo upload (which require UTF-8)
  • Re-saving a Big5 Taiwanese subtitle for use in modern players that expect UTF-8
  • Converting legacy Shift-JIS anime subtitles before sharing or embedding
  • Fixing a Korean EUC-KR subtitle for use on a non-Korean Windows machine
  • Adding a UTF-8 BOM so older players reliably detect the file as Unicode

Subtitle Encoding Fixer Features

  • Auto-detects source encoding with chardet and reports a confidence score so you can tell when it is guessing
  • Supported source encodings: GBK, GB2312, Big5, Shift-JIS, EUC-KR, CP1252 and Latin-1, covering the common East Asian and Western legacy cases
  • Target encoding is always UTF-8, with optional BOM (utf-8-sig) for players that need a BOM to recognise Unicode
  • Source format agnostic — SRT, ASS, SSA, VTT, LRC all convert with timing and structure preserved
  • Before/after preview shows the garbled source text alongside the corrected UTF-8 so you can verify visually
  • Batch mode handles multiple files per submission, returned as a zip — useful for entire season archives
  • Files up to 10 MB processed in a stateless serverless function and discarded immediately after the response
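
The batch feature in the list above could be approximated locally along these lines. This is a sketch under assumptions: the function name `convert_batch` is hypothetical, and it applies one shared source encoding to every file for brevity, whereas the tool presumably detects each file separately.

```python
import io
import zipfile
from typing import Mapping


def convert_batch(files: Mapping[str, bytes], source_encoding: str) -> bytes:
    """Convert each file to UTF-8 and return the results as one in-memory zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, raw in files.items():
            text = raw.decode(source_encoding)  # undecodable bytes raise UnicodeDecodeError
            archive.writestr(name, text.encode("utf-8"))
    return buffer.getvalue()


# Usage: two GBK-encoded files in, one zip of UTF-8 files out.
season = {"ep01.srt": "你好".encode("gbk"), "ep02.srt": "世界".encode("gbk")}
zip_bytes = convert_batch(season, "gbk")
```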

How to Use Subtitle Encoding Fixer

Upload your garbled subtitle file

Drag-and-drop or click to select an .srt, .ass, .ssa, .vtt or .lrc file (up to 10 MB). Choose the file that currently shows as mojibake or ?? in your player; the tool produces a corrected UTF-8 version of it.

Click Detect (optional)

Detect runs chardet on the file's bytes and reports the most likely source encoding (GBK, Big5, Shift-JIS, EUC-KR, etc.) with a confidence score. A confidence below 60% means you may want to set the source manually if you know it.
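
A rough local equivalent of this step, assuming the third-party chardet package is installed, might look like the following. The helper names and the 0.6 cutoff are assumptions made for illustration; `chardet.detect` itself returns a dict that includes `encoding` and `confidence` keys.

```python
def detect_encoding(raw: bytes) -> dict:
    """Run chardet on raw bytes. Requires the third-party chardet package."""
    import chardet  # imported lazily so the helper below works without it

    return chardet.detect(raw)


def needs_manual_override(result: dict, threshold: float = 0.6) -> bool:
    """True when no encoding was found or confidence is below the assumed cutoff."""
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0
    return encoding is None or confidence < threshold


# Usage with hand-written results shaped like chardet's output:
print(needs_manual_override({"encoding": "GB2312", "confidence": 0.99}))  # False
print(needs_manual_override({"encoding": "Big5", "confidence": 0.45}))    # True
```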

Pick the target — UTF-8 or UTF-8 with BOM

UTF-8 (no BOM) is the modern standard and what most players want. UTF-8 with BOM (utf-8-sig) adds the byte-order mark at the start — useful when an older player or a Windows tool needs the BOM to confidently identify the file as Unicode.
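
The difference between the two targets is three bytes at the start of the file. A short sketch using Python's standard `utf-8` and `utf-8-sig` codecs (the sample text is arbitrary):

```python
import codecs

text = "字幕"
plain = text.encode("utf-8")         # no BOM
with_bom = text.encode("utf-8-sig")  # same bytes, prefixed with EF BB BF

print(with_bom[:3] == codecs.BOM_UTF8)       # True
print(with_bom[3:] == plain)                 # True
print(with_bom.decode("utf-8-sig") == text)  # True: the codec strips the BOM on read
```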

Click Convert

The serverless converter decodes the file's bytes from the detected (or manually chosen) source encoding into Unicode text, re-encodes that text as UTF-8 with the BOM choice you picked, and returns the result with all timing and structure unchanged.
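
The conversion described here amounts to a decode followed by an encode. The sketch below is not the tool's actual code; `convert_to_utf8` is a hypothetical name and stripping a pre-existing BOM is an added assumption so that already-UTF-8 input does not end up with two.

```python
def convert_to_utf8(raw: bytes, source_encoding: str, bom: bool = False) -> bytes:
    """Decode bytes from the source encoding and re-encode them as UTF-8."""
    text = raw.decode(source_encoding)  # strict: undecodable bytes raise UnicodeDecodeError
    text = text.lstrip("\ufeff")        # drop an existing BOM so it is never doubled
    return text.encode("utf-8-sig" if bom else "utf-8")


# Usage: a one-cue SRT saved as GBK, converted with a BOM.
srt = "1\r\n00:00:01,000 --> 00:00:03,500\r\n你好,世界\r\n"
fixed = convert_to_utf8(srt.encode("gbk"), "gbk", bom=True)
```

Because the function works on bytes and never opens the file in text mode, line endings pass through untouched.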

Preview and download

The before/after preview shows the source (which probably looks like gibberish in your browser too) vs the corrected UTF-8. Confirm Chinese, Japanese, Korean or accented characters render correctly before downloading.
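
A before/after comparison can be approximated by decoding the same bytes twice. In this sketch the function name `preview` is hypothetical, and "before" is assumed to be what a UTF-8-assuming viewer would show.

```python
def preview(raw: bytes, source_encoding: str, lines: int = 3):
    """Return (before, after): the first lines decoded as UTF-8, then decoded correctly."""
    before = raw.decode("utf-8", errors="replace").splitlines()[:lines]
    after = raw.decode(source_encoding).splitlines()[:lines]
    return before, after


sample = "1\n00:00:01,000 --> 00:00:02,000\nこんにちは\n".encode("shift_jis")
before, after = preview(sample, "shift_jis")
for garbled, fixed_line in zip(before, after):
    print(repr(garbled), "->", repr(fixed_line))
```

The ASCII cue number and timing line look identical on both sides; only the dialogue line differs.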

Subtitle Encoding Fixer FAQ

Why does my subtitle show ?? or garbled text in VLC?

The file is in a legacy encoding (GBK for Mainland Chinese, Big5 for Taiwanese, Shift-JIS for Japanese, EUC-KR for Korean, CP1252 for Western Windows). VLC and other modern players assume UTF-8 by default. Converting the file to UTF-8 once with this tool fixes it for every player going forward.

Which source encodings are supported?

GBK and GB2312 (Mainland Chinese), Big5 (Traditional Chinese / Taiwanese), Shift-JIS (Japanese), EUC-KR (Korean), CP1252 (Western European Windows), and Latin-1 (ISO-8859-1, older European files). UTF-8 input is also handled (the tool simply normalises it). The underlying chardet library covers about 30 encodings; these are the ones that show up in real subtitle workflows.

Should I choose UTF-8 with or without a BOM?

Most modern video players (VLC, MPV, mpv-android, Plex, Jellyfin) detect UTF-8 reliably without a BOM, so the no-BOM form is usually right. Add a BOM when you need backward compatibility: older Windows tools, some video editors that misdetect Unicode without it, and the occasional legacy player. YouTube and Vimeo both accept either.

What if the detection confidence is low?

Confidence scores below about 60 to 70% mean chardet is guessing. If you know the source language and time period, set the source encoding manually: GBK for modern Mainland Chinese, Big5 for older Taiwanese, Shift-JIS for older Japanese, and Windows-1252 for Western files saved on Windows before about 2010. The before/after preview is the fastest way to confirm you picked right.

Does the conversion change subtitle timing or formatting?

No. Only the byte-level encoding is rewritten. SRT timestamps, ASS dialogue lines, VTT cue settings and LRC timestamps all pass through unchanged. The converter operates on the file as a stream of decoded characters; structure preservation is automatic because encoding conversion is character-by-character.
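
One way to check this claim on your own files is to compare the timing lines before and after conversion. The sketch below assumes SRT or VTT style timestamps; the regular expression and the `timing_lines` helper are illustrative, not part of the tool.

```python
import re

# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm" (SRT) and the "." variant used by VTT.
TIMING = re.compile(r"\d{2}:\d{2}:\d{2}[,.]\d{3} --> \d{2}:\d{2}:\d{2}[,.]\d{3}")


def timing_lines(text: str) -> list:
    return TIMING.findall(text)


source_text = (
    "1\n00:00:01,000 --> 00:00:03,500\n안녕하세요\n\n"
    "2\n00:00:04,000 --> 00:00:06,000\n감사합니다\n"
)
raw = source_text.encode("euc_kr")                 # the legacy file on disk
converted = raw.decode("euc_kr").encode("utf-8")   # the converted file

print(timing_lines(converted.decode("utf-8")) == timing_lines(source_text))  # True
```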

Are my subtitle files stored?

No. The file is uploaded to a stateless serverless function, decoded and re-encoded, then discarded immediately after the response. Nothing is logged to durable storage. For unreleased media subtitles where even transit is a no-go, iconv on the command line (iconv -f gbk -t utf-8 input.srt > output.srt) does the same thing locally.

How is this different from converting in Notepad++?

Notepad++ can do the same conversion (Encoding → Convert to UTF-8) once you've correctly identified the source. The differences: this tool auto-detects the source (no guessing in Notepad++), handles multiple files in batch, and works in the browser without an install, which is useful on a Mac, Linux or any machine where Notepad++ isn't available. Use Notepad++ for one-off single-file work; use this for batch conversion or for the auto-detect.

UTF-8 is recommended. Use "UTF-8 with BOM" for Windows Media Player / PotPlayer


Subtitle Encoding Fixer Tutorial

Why Does This Happen?

Subtitle files downloaded from the internet are often encoded in GBK, Big5, or Shift-JIS, but most modern players expect UTF-8. When the encoding doesn't match, Chinese/Japanese/Korean characters appear as garbled text (mojibake).

How to Fix

  1. Upload your subtitle file (.srt, .ass, .vtt, etc.)
  2. Click "Detect Encoding" to see the current encoding
  3. Choose target encoding (UTF-8 recommended)
  4. Click "Convert Encoding" and download the fixed file

Supported Encodings

  • Chinese: GBK, GB2312, GB18030, Big5
  • Japanese: Shift-JIS, EUC-JP, ISO-2022-JP
  • Korean: EUC-KR, CP949
  • Western: CP1252 (Windows-1252), Latin-1 (ISO-8859-1)
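
If you script the conversion yourself in Python, the encodings listed above map onto standard-library codec names roughly as follows. The dictionary name and its labels are illustrative; the codec names on the right are the ones Python's `codecs` module registers.

```python
import codecs

PYTHON_CODECS = {
    "GBK": "gbk", "GB2312": "gb2312", "GB18030": "gb18030", "Big5": "big5",
    "Shift-JIS": "shift_jis", "EUC-JP": "euc_jp", "ISO-2022-JP": "iso2022_jp",
    "EUC-KR": "euc_kr", "CP949": "cp949",
    "CP1252": "cp1252", "Latin-1 (ISO-8859-1)": "latin_1",
}

for label, codec in PYTHON_CODECS.items():
    codecs.lookup(codec)  # raises LookupError if the running Python lacks the codec
```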