ZIP Unicode Processing

Fix and convert Unicode filenames in ZIP files. Supports encoding repair, file list extraction, and encoding conversion.

Version 1.0.0

About ZIP Unicode Processing

Someone in Shanghai zips a folder of design assets and emails it over. You extract on macOS and the filenames arrive as 寤鸿_˙??_璁捐.pdf: pure gibberish. A Japanese contractor sends a ZIP and Windows shows ƒvƒƒWƒFƒNƒg.txt instead of プロジェクト.txt. A German colleague's filenames have ü turned into garbage. The file contents are intact; the ZIP never recorded which encoding the filenames were stored in, and your extractor guessed wrong.

This tool reads the ZIP central directory, identifies the original filename encoding (UTF-8, GBK, GB2312, Shift-JIS / CP932, Latin1 or CP437) and re-encodes every entry name to UTF-8 with the proper Unicode flag set. You can repair the archive (the output is a new ZIP with corrected names), extract just the file list to verify the fix before downloading, or override the source encoding manually when auto-detection picks the wrong one. No installing The Unarchiver, Bandizip or 7-Zip; no fiddling with unzip -O cp932 on the command line.
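To make the central-directory inspection concrete, here is a minimal Python sketch using the standard zipfile module. It is not this tool's implementation; the function name inspect_names and the row layout are illustrative assumptions. It relies on two documented behaviours: the UTF-8 marker is general purpose bit 11 (0x800), and zipfile decodes unflagged names as CP437, so encoding the name back to CP437 recovers the bytes as stored.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: entry name is stored as UTF-8


def inspect_names(zip_bytes: bytes) -> list:
    """Return one row per entry: displayed name, UTF-8 flag, stored bytes.

    Hypothetical helper for illustration only.
    """
    rows = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            flagged = bool(info.flag_bits & UTF8_FLAG)
            # Unflagged names were decoded as CP437 by zipfile, so
            # re-encoding with CP437 gives back the original bytes.
            raw = info.filename.encode("utf-8" if flagged else "cp437")
            rows.append(
                {"shown_as": info.filename, "utf8_flag": flagged, "raw_bytes": raw}
            )
    return rows
```

An entry with utf8_flag False and non-ASCII raw_bytes is the kind that tends to come out garbled on another system.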

Typical scenarios: receiving a ZIP from a colleague on a different OS locale, downloading email attachments where the UTF-8 flag was never set, working with legacy archives created before ZIP 6.3 made UTF-8 metadata standard, or sharing project files across multilingual teams. Archives up to 50 MB are processed in a stateless serverless function and discarded after the response.

ZIP Unicode Processing Use Cases

Receiving ZIP archives from Chinese-, Japanese- or Korean-language Windows machines
Sharing project files across cross-OS teams (Windows / macOS / Linux) without filename garbling
Extracting archives from email attachments where the UTF-8 flag was not set correctly
Cleaning up legacy ZIP archives created before ZIP 6.3 UTF-8 metadata became standard
Verifying filenames before extracting a large download to avoid filesystem write errors
Converting GBK or Shift-JIS archives to UTF-8 for upload to cloud storage services
Recovering re-zipped archives where mojibake was baked into the inner filenames

ZIP Unicode Processing Features

Auto-detects original encoding — UTF-8, GBK, GB2312, Shift-JIS / CP932, Latin1, CP437
Three actions: fix archive (rewrite with UTF-8 names), extract file list, or convert from a specific source encoding
Outputs a clean ZIP with the proper UTF-8 flag set — opens identically on Windows, macOS and Linux
File-list preview shows original (garbled) and repaired names side-by-side before download
Handles ZIP archives up to 50 MB containing thousands of entries
No software install required — replaces The Unarchiver, Bandizip and unzip -O cp932 for ad-hoc fixes
Stateless processing — uploaded archive is discarded after the response, nothing retained on disk

How to Use ZIP Unicode Processing

Upload your garbled ZIP

Drag and drop a .zip archive (up to 50 MB) into the upload area. Inner files of any language work — the tool inspects the central directory to identify the original encoding.

Pick an action

Fix Encoding rewrites the archive with UTF-8 names (most common). Extract File List shows what's inside without modifying anything. Convert Encoding lets you force a specific source encoding if auto-detection picks the wrong one.
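For comparison, forcing a source encoding can be sketched locally in Python. This is an assumption-laden illustration, not the tool's code: list_names is a hypothetical helper. It uses the metadata_encoding parameter that zipfile.ZipFile accepts in read mode from Python 3.11, and falls back to undoing the default CP437 decode on older versions.

```python
import io
import zipfile


def list_names(zip_bytes: bytes, forced_encoding: str) -> list:
    """List entry names, decoding unflagged names with forced_encoding."""
    buf = io.BytesIO(zip_bytes)
    try:
        # Python 3.11+: names without the UTF-8 flag use this codec.
        with zipfile.ZipFile(buf, metadata_encoding=forced_encoding) as zf:
            return zf.namelist()
    except TypeError:
        # Older Pythons reject the keyword; reverse the CP437 decode by hand.
        buf.seek(0)
        with zipfile.ZipFile(buf) as zf:
            return [
                info.filename
                if info.flag_bits & 0x800
                else info.filename.encode("cp437").decode(forced_encoding)
                for info in zf.infolist()
            ]
```

Entries that already carry the UTF-8 flag are left alone in both branches, so forcing an encoding only affects names whose encoding was never declared.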

Click Start Processing

The tool scans every entry, decodes the original bytes, re-encodes to UTF-8 and writes a new archive with the proper Unicode flag set. Most ZIPs process in under a second.
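The scan, decode, re-encode and rewrite sequence can be approximated with Python's standard library. The sketch below is a simplified stand-in rather than the tool's implementation: repair_zip is a hypothetical name, the source encoding is passed in instead of detected, and entry contents are decompressed and recompressed instead of being copied through untouched. zipfile sets the UTF-8 flag itself when it writes a non-ASCII name.

```python
import io
import zipfile


def repair_zip(src_bytes: bytes, source_encoding: str) -> bytes:
    """Return a new archive whose entry names are stored as UTF-8."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src, zipfile.ZipFile(
        out, "w"
    ) as dst:
        for info in src.infolist():
            if info.flag_bits & 0x800:
                name = info.filename  # already declared as UTF-8
            else:
                # Undo zipfile's CP437 decode, then decode with the real codec.
                name = info.filename.encode("cp437").decode(source_encoding)
            fixed = zipfile.ZipInfo(name, date_time=info.date_time)
            fixed.compress_type = info.compress_type
            fixed.external_attr = info.external_attr  # keep permissions
            # Reads and recompresses the data; the decompressed content
            # stays identical, the compressed bytes may not.
            dst.writestr(fixed, src.read(info))
    return out.getvalue()
```

A call such as repair_zip(data, "gbk") returns the bytes of the repaired archive. Comments, extra fields and encrypted entries are not handled in this sketch.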

Verify the file list

The result panel shows original and repaired names so you can confirm the fix worked. If an entry still looks wrong after decoding, the source encoding was probably an unusual one; switch to Convert mode and try another encoding.
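Trying another encoding amounts to decoding the same stored bytes with a different codec and comparing the results by eye. A minimal Python sketch of that comparison follows; the candidate list and the function name candidate_decodings are assumptions for illustration, not the tool's detection logic.

```python
# Codec names as Python spells them; the selection is an assumption.
CANDIDATES = ("utf-8", "gbk", "cp932", "cp949", "latin-1", "cp437")


def candidate_decodings(raw: bytes, encodings=CANDIDATES) -> dict:
    """Map each codec that decodes raw without error to its result."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = raw.decode(enc)  # strict: invalid bytes raise
        except UnicodeDecodeError:
            continue  # this codec cannot represent the byte sequence
    return results
```

Single-byte codecs such as latin-1 and cp437 accept every byte sequence, so a successful decode proves little on its own. Which candidate reads as real words is a human judgment, and that is the reason the side-by-side file list matters.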

Download the cleaned ZIP

The output archive opens cleanly on any modern OS — no configuring system locale, installing The Unarchiver, or running unzip -O cp932 from the command line.

ZIP Unicode Processing FAQ

Why do filenames in my ZIP turn into gibberish when I extract on a different OS?

Older ZIP archives stored filenames in whatever encoding the local OS used at compression time: CP932 on Japanese Windows, GBK on Chinese Windows, Latin1 on Western European systems. The ZIP format had no field to say 'these names are GBK', so extractors on a different OS guess (often CP437) and produce mojibake. Version 6.3 of the ZIP specification added a UTF-8 flag in 2006, but tools that predate it, or that still write names in the local code page, keep producing non-UTF-8 archives.

Which filename encodings does this tool support?

UTF-8 (the modern standard), GBK and GB2312 (Simplified Chinese), Shift-JIS / CP932 (Japanese), CP949 (Korean, via auto-detect), Latin1 / ISO-8859-1 (Western European), and CP437 (the historical IBM PC default). Auto-detection picks the most likely encoding from byte patterns; you can override it manually in Convert mode if needed.

Is my ZIP file uploaded anywhere permanent?

The file is processed in a stateless serverless function and discarded immediately after the response is returned. Nothing is logged to disk and no copy is retained. If the archive contains sensitive payloads (source code, contracts, PII), the only copy that leaves the function is the repaired ZIP downloaded to your own browser.

What's the maximum file size?

50 MB per archive. For larger archives, prefer a desktop tool like The Unarchiver (macOS), Bandizip (Windows) or the command-line unzip -O <encoding> (Linux / macOS). Those have no practical size cap and run faster than uploading over the network.

Will this break the actual files inside the ZIP?

No. The tool only rewrites filename metadata in the ZIP central directory; the compressed file contents are passed through unchanged byte-for-byte. Any binary, document, image or video inside the archive opens identically to the original.

What if auto-detection picks the wrong encoding?

Switch to Convert mode and select the source encoding manually. If you know the archive came from Japanese Windows, force Shift-JIS / CP932; from Chinese Windows, force GBK; from older Korean Windows, force CP949. Auto-detection is heuristic; manual override is more reliable when you know the origin.

How is this different from The Unarchiver or running unzip -O cp932?

The Unarchiver is excellent on macOS but requires a download and install, and only fixes names during extraction: it doesn't produce a repaired ZIP you can re-share. unzip -O works on the command line but only extracts, with the same caveat. This tool produces a corrected ZIP you can hand back to your collaborator, runs anywhere with a browser, and supports more encodings than CP932-only tools.

ZIP Unicode Processing Tutorial

Fix Encoding

Select the ZIP file to fix
Choose the "Fix Encoding" processing mode
Click the "Start Processing" button
Wait for processing to complete, then download the fixed ZIP file

Extract File List

Select a ZIP file
Choose the "Extract File List" processing mode
Click the "Start Processing" button
View the file list, including original and fixed filenames

Convert Encoding

Select the ZIP file to convert
Choose the "Convert to UTF-8 Encoding" processing mode
Click the "Start Processing" button
Download the converted ZIP file

Features

Supports Unicode filename encoding repair
Supports ZIP file list extraction
Supports filename encoding conversion (UTF-8)
Automatically detects and fixes common encoding issues (GBK, GB2312, etc.)
Processing results include file list and error information
Supports ZIP files up to 50MB
