Self-hosted WhatsApp message deduplication app with backup and TG forwarding

So I stopped being lazy and tried to solve a problem that has been bothering me for a while, with the help of Claude Opus. For context, I know nothing about coding. I’ve never studied it formally, and my field of work has very little in common with coding or software.

However, as a hobbyist with an interest in tech and PCs, I’ve picked up a few things through sheer osmosis by reading and watching tons of YouTube. I may struggle to write “Hello world” without clear instructions, but I can understand and follow manuals and docs, fiddle around with trial and error, and figure stuff out that way.

So this is my attempt to vibe-code with AI and figure out a solution to a problem that annoys me in everyday life. It might not be for everyone, but I am just so glad that people like me, with such minimal background, can use these tools and build solutions without bothering another human.

The Problem

Like many of you, my WhatsApp is filled with random groups, so much so that keeping track of everything has become an annoying task. Some large groups fill up so fast that you can’t even read half the messages if you aren’t constantly online. Most of these are hobby-based, like 3D printing or homelabbing. I have also joined a few deal-sharing and buy-sell groups, where I find the same users posting the same messages across different communities every day, and I hate reading duplicates. The same deal, meme, or news item gets forwarded to 10 different groups.

Another issue is that WhatsApp search is quite limited and cumbersome. Whenever I search for something, I can only see half the text in the results and have to open individual chats to check whether they actually match my query.

I am not sure if there already exists a tool to organize and de-duplicate messages for me so that I can easily read through and don’t waste my time every day scrolling long chats. I tried searching for one, but couldn’t find much info.

The Solution

I told Claude Opus what problems I was facing and asked how it would handle them. After a few hits and misses, this is what it came up with.

GitHub - bakasur-te/whatsapp-dedup-dashboard: Self-hosted WhatsApp message deduplication dashboard

WhatsApp Dedup Dashboard is a self-hosted application that:

  1. Captures all incoming messages in real-time
  2. Automatically detects duplicates (same sender + same content = duplicate)
  3. Stores only unique messages in a searchable database
  4. Provides a clean web dashboard to browse, filter, and search

How It Works

WhatsApp Web ──→ whatsapp-web.js ──→ Deduplication ──→ SQLite DB ──→ Dashboard
                                          ↓
                                    Telegram Bot (optional)

Message Flow

  1. You scan a QR code to link WhatsApp Web
  2. All incoming messages are captured via whatsapp-web.js
  3. Each message gets a unique hash (SHA-256 of sender + content)
  4. If the hash already exists → marked as duplicate, not stored
  5. Unique messages are saved to SQLite with full metadata
  6. Media files are deduplicated by content hash (same image = stored once)

Deduplication Logic

| Message Type | How Duplicates Are Detected |
|---|---|
| Text | SHA-256(sender_id + message_body) |
| Image/Video | SHA-256(file_content) |
| Mixed | Both checks applied |
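
To make the hashing idea concrete, here is a minimal sketch in Node.js. It is not the code from the repo. The function names, the in-memory Set standing in for the database lookup, and the newline separator between sender and body are all assumptions made for illustration.

```javascript
const { createHash } = require('crypto');

// Text messages: SHA-256 over sender ID plus message body.
// ASSUMPTION: the "\n" separator is added in this sketch so that
// ("ab", "c") and ("a", "bc") hash differently, assuming sender IDs
// never contain a newline. The post only says "sender_id + message_body".
function textKey(senderId, body) {
  return createHash('sha256').update(`${senderId}\n${body}`, 'utf8').digest('hex');
}

// Media: SHA-256 over the raw file bytes, so the same image hashes the
// same no matter who sent it or which group it arrived in.
function mediaKey(fileBytes) {
  return createHash('sha256').update(fileBytes).digest('hex');
}

// Hypothetical in-memory stand-in for "does this hash already exist?".
class Deduplicator {
  constructor() {
    this.seen = new Set();
  }

  // Returns true the first time a key is seen (and records it),
  // false for every later occurrence of the same key.
  isNew(key) {
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}

// Usage: the second identical message from the same sender is rejected.
const dedup = new Deduplicator();
const first = dedup.isNew(textKey('user-1', 'RTX 3060 for sale'));
const second = dedup.isNew(textKey('user-1', 'RTX 3060 for sale'));
console.log(first, second);
```

In a real setup the Set would presumably be replaced by a unique index on the hash column in SQLite, so the check survives restarts.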

Key Features

Dashboard

  • Dark-themed, responsive web UI
  • Filter by: links, prices, media, date range, specific chats
  • Full-text search across all messages
  • Click to view images/videos
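
For a feel of how the link and price filters could work, here is a rough sketch. The post does not say how the dashboard detects links or prices, so both regular expressions, the message shape (`chat`, `body`, `timestampMs`), and the function names are hypothetical.

```javascript
// ASSUMPTION: hypothetical patterns, not taken from the project.
const LINK_RE = /https?:\/\/\S+/i;
// Matches a currency marker (\u20B9 is the rupee sign, plus "$", "Rs", "INR")
// followed by a number such as 4,999 or 25000.50.
const PRICE_RE = /(?:\u20B9|\$|\b(?:rs\.?|inr))\s?\d[\d,]*(?:\.\d+)?/i;

// Tag a message body with what it appears to contain.
function tagMessage(text) {
  return { hasLink: LINK_RE.test(text), hasPrice: PRICE_RE.test(text) };
}

// filters: { links, prices, chat, from, to }, all optional.
// `from` and `to` are inclusive timestamps in milliseconds.
function filterMessages(messages, filters = {}) {
  return messages.filter((m) => {
    const tags = tagMessage(m.body);
    if (filters.links && !tags.hasLink) return false;
    if (filters.prices && !tags.hasPrice) return false;
    if (filters.chat && m.chat !== filters.chat) return false;
    if (filters.from !== undefined && m.timestampMs < filters.from) return false;
    if (filters.to !== undefined && m.timestampMs > filters.to) return false;
    return true;
  });
}
```

Calling `filterMessages(messages, { prices: true, chat: 'deals' })` would keep only messages from the hypothetical "deals" chat that mention a price.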

Backup Export

I am keeping all files locally, so right now, they are all unencrypted for ease of use.

  • Export all unique messages as browsable HTML files
  • JSON export for data portability
  • Media files included
  • View offline on any device

Telegram Forwarding

This is an optional feature. One may or may not want to use it. I included it mostly for buy/sell groups so that I can keep track of all deals in a single place without going through multiple groups.

  • Select specific groups (e.g., buy/sell groups), and ignore other groups like friends or hobby ones
  • Forward unique messages to a Telegram channel
  • 30-second merge window: Multiple messages from the same sender get combined
  • Never miss a deal, without the spam
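
The merge window can be pictured with a small sketch like the one below. It assumes the 30 seconds are counted from a sender's first buffered message and that merged texts are joined with newlines; the post does not specify either detail, and the class and method names are made up. Time is passed in explicitly so the logic is easy to follow and test.

```javascript
// Buffers messages per sender and releases them as one combined message
// once the window has elapsed.
class MergeWindow {
  constructor(windowMs = 30000) {
    this.windowMs = windowMs;
    this.pending = new Map(); // sender -> { firstAt, parts }
  }

  // Queue a message; the window starts at the sender's first message.
  add(sender, text, nowMs) {
    const entry = this.pending.get(sender);
    if (entry) {
      entry.parts.push(text);
    } else {
      this.pending.set(sender, { firstAt: nowMs, parts: [text] });
    }
  }

  // Return merged messages for every sender whose window has elapsed,
  // removing them from the buffer.
  flushDue(nowMs) {
    const ready = [];
    for (const [sender, entry] of this.pending) {
      if (nowMs - entry.firstAt >= this.windowMs) {
        ready.push({ sender, text: entry.parts.join('\n') });
        this.pending.delete(sender);
      }
    }
    return ready;
  }
}
```

A forwarder would call `add()` for each unique message and poll `flushDue(Date.now())` every second or so, sending whatever comes back to the Telegram channel.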

Auto-Cleanup

  • Group messages: deleted after 30 days
  • Individual chats: kept forever
  • Orphaned media files: auto-removed
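
The cleanup rules above boil down to something like this sketch. The message fields (`isGroup`, `timestampMs`, `mediaHash`) and the function names are assumptions for illustration, not the project's actual schema.

```javascript
const GROUP_RETENTION_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// Only group messages expire; individual chats are kept forever.
function isExpired(msg, nowMs) {
  return msg.isGroup && nowMs - msg.timestampMs > GROUP_RETENTION_MS;
}

// Returns the messages to keep and the media hashes that no kept
// message references any more (the "orphaned" files).
function cleanup(messages, mediaHashes, nowMs) {
  const kept = messages.filter((m) => !isExpired(m, nowMs));
  const referenced = new Set(kept.map((m) => m.mediaHash).filter(Boolean));
  const orphans = mediaHashes.filter((h) => !referenced.has(h));
  return { kept, orphans };
}
```

Because media is stored once per content hash, a file only becomes an orphan when the last message pointing at it is gone, which is why the orphan check runs after the expiry filter.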

Technology Stack

| Component | Technology |
|---|---|
| Backend | Node.js 20 |
| WhatsApp Integration | whatsapp-web.js (Puppeteer-based) |
| Database | SQLite (via better-sqlite3) |
| Web Server | Express.js |
| Frontend | Vanilla HTML/CSS/JS |
| Deployment | Docker |

Who Is This For?

Good fit if you:

  • Are in many WhatsApp groups with overlapping content
  • Want to search across all groups in one place
  • Need a local, unencrypted backup of your messages
  • Want to forward specific group messages to Telegram
  • Have a home server or VPS to run Docker

Not for you if:

  • You only use WhatsApp for 1:1 chats
  • You don’t have a server to host it
  • You need real-time sync across devices (this is view-only)

Privacy & Security

  • 100% self-hosted — Your data never leaves your server
  • No cloud dependencies — Works offline after initial setup
  • Unencrypted storage — You can read/export your own data anytime
  • Open source — Audit the code yourself

I am hosting it on my home Ubuntu server. No plan to use the cloud anywhere in this workflow.

Getting Started

  1. Clone the repo
  2. Run docker compose up -d --build
  3. Scan QR code at http://your-server:3000
  4. Done!

Full instructions in the README file.

I did this over the weekend with Claude Opus in Google Antigravity. I have some other features I plan to add, like link filtering or an AI summary of a single chat, but that might take some time, maybe some other free weekend.

I can’t imagine what these AI chatbots will be capable of coding in days to come. Have you guys also used vibe-coding to figure out any personal solutions? I would love to hear your experiences.

Amazing work! Will definitely try it

Interesting project.

I have only been using GitHub Pro and Perplexity Pro. Thanks to your post, I will be checking this out.

What other GPT offerings have you tried for software development?

Claude has been quite good for coding. I use Grok and ChatGPT for setting up stuff, experimenting with tools, solving errors, etc. Gemini does everything well now, imo, but it used to be bad until last year. Antigravity helps with running code directly on your machine, mostly in the Ubuntu subsystem on Windows.

I see many are going gaga over Claude Code, and some are recommending OpenCode apart from Antigravity. Have you used the other two, and how do they compare? Anyone who has used these can chime in.

If you are registered as a business on WhatsApp, using a third-party lib like whatsapp-web.js carries a minor risk of some action, maybe even a ban, from Meta in the future.

Regular users aren’t immune either, though it is less likely to happen to them. I’m not saying don’t use it, but be aware of the risks involved.

I started off with Claude Code, and it helped me automate a bunch of stuff that I use for my work. It worked well for my usage. Basically, it uses WSL in Windows to run stuff from the command line or PowerShell. Antigravity does the same thing, but it has a fancier UI and makes it easier to manage different projects. After moving to Antigravity, I am not going back to Claude Code. I haven’t tried OpenCode so far, so I can’t comment.

Thanks for mentioning this, I should have done that in the top post.
Yes, I am aware of the risks, but after researching a bit, my understanding is that Meta does this to stop people from sending out messages in bulk. This tool is view-only: there is no option to send messages, only to view them. That should significantly reduce the risk, but it is still there, so anyone wanting to use this should keep that in mind.

Btw, is there any other way to capture messages that does not violate any Meta policies? When I asked Claude, it answered in the negative. Meta doesn’t want any other tools between its official tools/APIs and the client side.

WhatsApp doesn’t charge its users. So, its business model depends on gatekeeping its users.

So no bots or unofficial libs will be tolerated. Even open platforms like Twitter and Reddit went back on their policies, and only closed ones like Discord are surviving.

There isn’t and will probably never be a blessed/official way to interact with them programmatically without paying big amounts.

Isn’t this all client-side JS handling based on their official WhatsApp Web? If yes, where is the risk?

To OP: excellent work, although I don’t have this use case. Now convert this to an APK and profit.

It is a word of caution. Even the lib’s page says as much.

As for how they can detect it, browser fingerprinting gives enough signals to figure it out immediately or later in the future. Fingerprinting also covers the OS, the GPU, and even attached peripherals.

They can tolerate it for a while to avoid backlash, and all it takes is one person abusing it for them to have a reason to block a few or all of those using the lib.

I have been using the Whatsapp Web to Go app on one of my Android tablets for the last 5-6 years, which is similar. No issues so far.

I think they need to call it out as a disclaimer because, one, they are obviously not official and, two, they depend on WA not changing things in the UI or functionality, which they then have to catch up on. It’s the same kind of cat-and-mouse game as YT and ReVanced.

Why am I seeing Google Antigravity in Reddit posts (that too in some India Tinder something sub, lol)? Now OP has mentioned it too. Is this a new kind of ad campaign?

post link

another one

Not really. They mainly want business users to use their API (and pay accordingly).