Read 120MB HTML file problem

Aikon · March 7, 2026, 1:22am

Hello all,

I've got a situation I'd really appreciate some help with. I have a WhatsApp chat exported as HTML, which is around 120MB. Browsers can't load it all because the size causes memory issues. I only need the messages with one particular contact, not all of them. Is there any tool that can help with this, please? Maybe one that splits the HTML file into smaller HTML files, so I can check the content of each? Any help appreciated.

Thanks all

yoursoul · March 7, 2026, 5:07am

Notepad++ is best for this purpose. Use its search functions and just select the file, or even the folder.

Aikon · March 7, 2026, 6:13am

It will grind to a halt and crash due to the sheer size (more than 72k records) when trying to load it completely in memory. I was thinking of a parser that would maybe load it piecewise.

yoursoul · March 7, 2026, 10:19pm

I hope this will work: https://github.com/riversun/LLPAD
Actually, I have used a tool like this for log files up to 500MB, but it was a long time ago and I can't recall its name. Still, give the GitHub repo a try.

Indianapolis · April 6, 2026, 9:50am

120MB WhatsApp HTML Chat Export: Extract Only One Contact’s Messages Without Browser Crashes

Yeah, that’s a classic pain — WhatsApp’s HTML exports can balloon to huge sizes when you’ve got years of chats, and browsers just choke trying to load the whole thing at once. You only need the messages from one specific contact, so there’s no reason to wrestle with the full 120MB file.

The good news is you don’t need any paid software or risky online uploaders. The cleanest and most reliable fix is a short Python script that parses the HTML and pulls out only the messages from that one contact (plus timestamps, media references, etc.). It runs locally, keeps everything private, and spits out a much smaller, usable file.

Best Solution: Quick Python Script (Takes 2–3 Minutes to Set Up)

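This uses BeautifulSoup with the lxml parser, which copes well with exported chat HTML. Be aware that it builds the whole document tree in memory, so for a 120MB file expect it to need several times the file size in RAM while it runs.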

Step 1: Install the tools (one-time only)
Open your terminal / command prompt and run:

Bash

pip install beautifulsoup4 lxml

Step 2: Save this script as extract_whatsapp_contact.py

Python

from bs4 import BeautifulSoup

# === CHANGE THESE TWO LINES ===
input_file = "your_chat.html"          # your 120MB file
contact_name = "Exact Contact Name"    # put the exact name as it appears in the chat
output_file = "chat_with_" + contact_name.replace(" ", "_") + ".html"

# Load the big file (the whole tree is built in memory, so this needs plenty of RAM)
print("Loading the big HTML file... (this might take a minute)")
with open(input_file, "r", encoding="utf-8") as f:
    soup = BeautifulSoup(f, "lxml")

# Find all message blocks
# Assumption: each message is a <div class="message">; adjust to match your export
messages = soup.find_all("div", class_="message")

needle = contact_name.lower()
extracted = []
for msg in messages:
    # Look for the sender name (adjust the class names if your export is different)
    sender_tag = msg.find("span", class_=["author", "sender", "chat-author"])
    if sender_tag is not None:
        sender = sender_tag.get_text()
    else:
        # Fallback: no sender tag found, so match the name anywhere in the message text.
        # This can also catch messages that merely mention the contact.
        sender = msg.get_text()

    if needle in sender.lower():
        extracted.append(str(msg))

# Build a clean new HTML file
if extracted:
    body = "\n".join(extracted)
    new_html = f"""<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>Chat with {contact_name}</title></head>
<body><h1>Chat with {contact_name}</h1><div class="chat-container">
{body}
</div></body></html>
"""

    with open(output_file, "w", encoding="utf-8") as f:
        f.write(new_html)

    print(f"✅ Done! Created {output_file} with only {len(extracted)} messages from {contact_name}")
else:
    print("No messages found for that contact name; double-check the spelling.")

Step 3: Run it

Bash

python extract_whatsapp_contact.py

The resulting file will be tiny (usually just a few MB) and opens instantly in any browser.

Tip: If the script doesn’t catch the messages, open a tiny part of your HTML in a text editor and search for the contact name — tell me what tags surround it and I can tweak the selector for you.
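If even a text editor struggles with the file, a few lines of Python can pull out just the bit of markup around the first place the name appears. This is only a rough sketch: `peek`, `radius` and `chunk_size` are names I made up, the search is case-sensitive, and it reads the file in small pieces so it never loads the whole thing.

```python
def peek(path, needle, radius=300, chunk_size=1 << 20):
    """Return the text around the first (case-sensitive) hit of needle, or None."""
    tail = ""  # end of the previous chunk, so a hit spanning two chunks is not missed
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        while chunk := f.read(chunk_size):
            window = tail + chunk
            pos = window.find(needle)
            if pos != -1:
                window += f.read(radius)  # make sure there is context after the hit
                return window[max(0, pos - radius): pos + len(needle) + radius]
            tail = window[-(radius + len(needle)):]
    return None
```

Calling `print(peek("your_chat.html", "Exact Contact Name"))` should show which tags and class names wrap the sender, which is what the selector in the script needs to match.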

Alternative: Split the Big File Into Smaller Chunks (If You Prefer Manual Checking)

If you really want to split the full HTML into smaller pieces first:

On Windows: Use a free tool like GSplit (portable, no install) → https://www.gdgsoft.com/gsplit/ Split by size (e.g. 10MB chunks) — then open each small HTML and Ctrl+F for the contact name.
On Mac/Linux: In terminal:

Bash
```
split -b 10m your_chat.html chunk_
```
Then add a .html extension to each chunk and open them one by one.

Note: Pure splitting can sometimes break the HTML formatting, so the Python method above is cleaner and more targeted.
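If you do want pieces, cutting where a message starts rather than at a fixed byte count avoids slicing a message in half. Here is a minimal sketch of that idea, assuming every message begins on its own line containing `<div class="message"` (that marker is a guess, check your file). The function name and the 5000-messages-per-chunk default are made up for illustration, and it won't help if the whole export sits on one giant line.

```python
from pathlib import Path

def split_on_messages(src, out_dir, marker='<div class="message"', per_chunk=5000):
    """Copy src into numbered chunk files, starting a new file only on a marker line."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    out = None
    count = 0  # messages written to the current chunk
    with open(src, "r", encoding="utf-8", errors="replace", newline="") as f:
        for line in f:  # one line at a time, never the whole file
            starts_message = marker in line
            if out is None or (starts_message and count >= per_chunk):
                if out is not None:
                    out.close()
                path = out_dir / f"chunk_{len(paths):04d}.html"
                paths.append(path)
                out = open(path, "w", encoding="utf-8", newline="")
                count = 0
            if starts_message:
                count += 1
            out.write(line)
    if out is not None:
        out.close()
    return paths
```

The chunks still lack the original `<head>` and closing tags, so browsers will render them without styling, but every message stays in one piece and Ctrl+F works.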

Quick Tips

Make a backup copy of the original 120MB file before doing anything.
The script runs completely offline — nothing gets uploaded.
If your export includes media, the links will still work as long as the media folder is in the same place.

This should get you exactly what you need without any memory headaches. Drop the exact contact name (or a sample of how it appears in the HTML) if you want me to adjust the script further.

You’re all set now.

Nightypop · April 8, 2026, 7:04pm

You can set up your own free web hosting here and read the HTML files easily in your browser: https://souini.eu.cc/

lee_lee · April 9, 2026, 8:05am

Thanks for this

Tempo_Khan · May 7, 2026, 4:40pm

1. Quick & Dirty Splitting (No Coding)

Use a text editor that handles huge files:
- EmEditor (free version supports huge files)
- Notepad++ with “Large File” plugin
- VS Code (with “Large File” settings)
Or use an online HTML splitter (if it accepts 120MB) or split the file into smaller parts using:
- 7-Zip → Split into volumes (e.g., 20MB each)
- HJSplit or GSplit (free file splitters)

Then open the smaller HTML pieces one by one and search for your contact’s name.

2. Other Good Tools

WhatsApp-Chat-Exporter (GitHub) → Great for handling large exports and splitting.
Convert the HTML to text first, then filter with any text tool.

Look on GitHub for more tools. I'm attaching one here: 1-Tool

TheStrength · July 4, 2026, 5:56pm

The file’s fine — the browser just tries to paint all ~72k messages at once and dies. Stop rendering it and the problem’s gone.

❌ browser → loads ALL 120MB at once      → 💀 out of memory
✅ tool    → reads/streams, keeps 1 person → ✅ tiny clean file

Every fix below runs 100% on your PC (never feed a private chat to an online “splitter”). Lazy pick = klogg. Grab a lane:

🖱️ open the beast without it biting — zero code

klogg — free viewer that reads a file off disk instead of loading it into RAM (memory), so 120MB (or 120GB) opens instantly. Type the contact’s name → every hit shows live. Win/Mac/Linux.
EmEditor — opens files up to 16TB; even the free tier eats huge files + regex (search-by-pattern).

⌨️ one line → their whole thread, ripped out

Okay with a terminal? ripgrep (rg):

rg -i -C 3 "Contact Name" chat.html > just_them.txt

-i ignores caps, -C 3 grabs 3 lines around each hit (context, not naked matches), > saves it to a new file. Whole thing, ~2 seconds.

Message spans several lines? ugrep does multi-line + name AND date searches.
rga searches inside the zip without even unpacking it.
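No rg on the machine? The `-i -C 3` idea from the one-liner above can be roughed out in plain Python, reading one line at a time. This is a sketch, not a ripgrep replacement: `grep_context` and `save_matches` are names I made up, and like rg it only helps if the export has line breaks between messages.

```python
from collections import deque

def grep_context(lines, needle, context=3):
    """Yield lines containing needle (case-insensitive) plus `context` lines around each hit."""
    needle = needle.lower()
    before = deque(maxlen=context)  # rolling window of recent lines not yet emitted
    after = 0                       # trailing context lines still owed
    for line in lines:
        if needle in line.lower():
            yield from before       # flush the leading context
            before.clear()
            yield line
            after = context
        elif after > 0:
            yield line
            after -= 1
        else:
            before.append(line)

def save_matches(src_path, dst_path, needle, context=3):
    """Stream src_path through grep_context and write the hits to dst_path."""
    with open(src_path, "r", encoding="utf-8", errors="replace") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(grep_context(src, needle, context))
```

`save_matches("chat.html", "just_them.txt", "Contact Name")` is the rough equivalent of the rg command, minus the speed.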

📄 walk away with a clean transcript (csv / json / txt)

Tiny Python script. Just use a parser that streams, or at least stays light on memory, instead of swallowing the whole file:

selectolax — fast, low-memory, grabs elements by tag; keep only your contact’s blocks. 5–30x lighter than BeautifulSoup (the library that chokes on big files).
lol-html — Cloudflare’s streaming HTML tool, literally built for files bigger than your RAM.
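For a feel of what streaming looks like without installing anything, here is a sketch using neither of the two above but Python's built-in `html.parser`, fed 1MB at a time so only the current message is ever held in memory. Big assumptions: each message is a `<div class="message">` with the sender in a `<span class="author">`. Those class names, plus `ContactFilter` and `extract`, are mine and not from any real export, so check your file first.

```python
from html.parser import HTMLParser

class ContactFilter(HTMLParser):
    """Report the text of <div class="message"> blocks whose author span matches a contact."""

    def __init__(self, contact, on_message):
        super().__init__(convert_charrefs=True)
        self.contact = contact.lower()
        self.on_message = on_message
        self.depth = 0          # div nesting inside the current message, 0 = outside
        self.in_author = False
        self.author = ""
        self.text = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and self.depth:
            self.depth += 1
        elif tag == "div" and "message" in classes:
            self.depth, self.author, self.text = 1, "", []
        elif tag == "span" and self.depth and "author" in classes:
            self.in_author = True
        if self.depth:
            self.text.append(" ")   # a tag separates words

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_author = False
        elif tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0 and self.contact in self.author.lower():
                body = " ".join("".join(self.text).split())
                self.on_message(" ".join(self.author.split()), body)
        if self.depth:
            self.text.append(" ")

    def handle_data(self, data):
        if self.in_author:
            self.author += data
        elif self.depth:
            self.text.append(data)

def extract(path, contact, chunk_size=1 << 20):
    """Return (author, text) pairs for one contact, reading the file in chunks."""
    found = []
    parser = ContactFilter(contact, lambda who, body: found.append((who, body)))
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        while chunk := f.read(chunk_size):
            parser.feed(chunk)      # incomplete tags are buffered until the next chunk
    parser.close()
    return found
```

It is slower than selectolax, but memory use stays flat no matter how big the file gets, and the `(author, text)` pairs can go straight into a CSV or JSON writer.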

🚪 the door WhatsApp left open (skip the HTML)

WhatsApp’s own Export Chat makes a plain _chat.txt, not HTML — your 120MB came from some converter. Still have the chat on your phone? Re-export one contact and there’s no giant HTML to fight.

whatstk — feed the .txt; df[df.username=='Their Name'] is the whole job. Ships a whatstk-to-csv command.
WhatsApp-Chat-Exporter — from the phone database, --include pulls just one number.
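If you'd rather not install anything, the .txt is simple enough to filter by hand. A rough sketch, assuming lines look like `[07/03/2026, 01:22:10] Name: text` or `07/03/2026, 01:22 - Name: text` (the exact date format depends on phone and locale, so treat both patterns as guesses). `messages_from` is a made-up name, and system notices without a sender get glued onto the previous message.

```python
import re

# Two assumed line layouts; adjust to whatever your own export shows.
BRACKETED = re.compile(r"^\[(?P<stamp>[^\]]+)\] (?P<sender>[^:]+): (?P<text>.*)")
DASHED = re.compile(
    r"^(?P<stamp>\d{1,2}[/.]\d{1,2}[/.]\d{2,4}, [^-]+) - (?P<sender>[^:]+): (?P<text>.*)"
)

def messages_from(lines, contact):
    """Yield (stamp, sender, text) for one sender; continuation lines join the previous message."""
    wanted = contact.lower()
    current = None
    for raw in lines:
        line = raw.rstrip("\n").lstrip("\u200e")  # drop the invisible mark some exports prepend
        m = BRACKETED.match(line) or DASHED.match(line)
        if m:
            if current and current[1].lower() == wanted:
                yield tuple(current)
            current = [m["stamp"], m["sender"], m["text"]]
        elif current:
            current[2] += "\n" + line  # multi-line message
    if current and current[1].lower() == wanted:
        yield tuple(current)
```

Pass it an open file, e.g. `messages_from(open("_chat.txt", encoding="utf-8"), "Their Name")`, and write the tuples out with the `csv` module if you want a spreadsheet.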

🕰️ old tricks that still slap (steal the fix)

Pulling one sender out of one giant file was solved decades ago — for email.

grepmail — yanks every complete message from one sender, never chopping one in half, stays light on memory. An email archive is the same shape as a chat log.
Copy the move from DiscordChatExporter (per-user filter + file split, built in) or sigtop (Signal, matches names by regex).
Chopping the file yourself? Cut on message boundaries, not size — awk RS never slices a message.
Name shows as JosÃ© instead of José? Just an encoding mixup, not a broken file — ftfy un-garbles it so your search matches again.
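For the garbled-name case, the most common cause is UTF-8 text that got decoded as Windows-1252 or Latin-1 somewhere along the way. ftfy handles many more cases; the stdlib-only sketch below covers just that one and leaves everything else untouched. `unmangle` is a made-up name.

```python
def unmangle(text):
    """Undo UTF-8 that was wrongly decoded as cp1252/Latin-1; return the input unchanged otherwise."""
    for wrong in ("cp1252", "latin-1"):
        try:
            return text.encode(wrong).decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue
    return text
```

Run the contact name you see in the file through it to learn the real spelling, or apply it line by line before searching.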

💡 where this actually saves your ass

Evidence for court / HR / a dispute — hand over only one person’s messages as a clean file, not 72k lines to scroll.
Backing up before you block or delete someone — keep just their thread.
Group chat ballooned to 200MB — pull one member’s messages out and dump the rest.
Old phone died, all you’ve got is the export — recover a single conversation from it.
Feeding your own chats to an AI / building a dataset — export one person’s lines straight to CSV.

Recap: peek with klogg, rip with ripgrep, or skip the whole mess via _chat.txt. One clean file, ~10 minutes, browser never touches it.

Never open the haystack. Reach in and pull the needle.
