
Not surprised at all. Rage bait has been leading to clicks since before the dawn of AOL.

About that data though, just publish that. Throw the data and tooling up on github or huggingface if it's a massive dataset. Would be interested in comparing methodologies for deriving sentiment.





Here's the raw dataset: https://asof.app/static/hn_viral_dataset.json

159 stories that hit score 100 in my tracking, with HN points, comments, and first-seen timestamp.

Methodology:
- Snapshots every 30 minutes (1,576 total)
- Filtered to score=100 (my tracking cap)
- Deduped by URL, kept first occurrence
- Date range: Dec 2025 - Jan 2026
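For anyone wanting to reproduce the filter and dedupe steps, here is a minimal sketch. The field names `url`, `score`, and `seen_at` are assumptions for illustration and may not match the keys in the published JSON, and `viral_stories` is a hypothetical helper, not the tooling used for the dataset.

```python
def viral_stories(rows, cap=100):
    """Keep rows at the score cap, deduped by URL, earliest occurrence first.

    Each row is a dict with hypothetical keys: "url", "score", "seen_at".
    "seen_at" only needs to be sortable (e.g. an epoch timestamp).
    """
    seen_urls = set()
    kept = []
    # Sort by time so that "first occurrence" means the earliest snapshot.
    for row in sorted(rows, key=lambda r: r["seen_at"]):
        if row["score"] != cap:
            continue  # below the tracking cap, skip
        if row["url"] in seen_urls:
            continue  # this URL was already kept from an earlier snapshot
        seen_urls.add(row["url"])
        kept.append(row)
    return kept
```

Sorting before deduping matters if the snapshots are not already stored in chronological order; otherwise "first occurrence" would depend on file order.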

For sentiment, I ran GPT-4 on the full article text with a simple positive/negative/neutral classification. Not perfect but consistent enough to see the 2:1 pattern.
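A rough sketch of how the per-story labels could be tallied into a ratio. It assumes the classifier output is already a list of label strings, and that the 2:1 pattern compares negative to positive stories; both are assumptions, and `label_ratio` is a hypothetical helper.

```python
from collections import Counter

def label_ratio(labels, numerator="negative", denominator="positive"):
    """Count each label and return (counts, numerator/denominator ratio).

    The ratio is None when the denominator label never occurs.
    """
    counts = Counter(labels)
    if counts[denominator] == 0:
        return counts, None
    return counts, counts[numerator] / counts[denominator]

# Toy input, made up for illustration; not taken from the dataset.
toy_labels = ["negative", "negative", "positive", "neutral"]
counts, ratio = label_ratio(toy_labels)
```

Neutral stories are counted but left out of the ratio, so the share of neutral labels is worth reporting alongside it.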


Thought about this during the morning. I'm going to run the posts through ministral3:3b, mistral-small3.2:24b, and gpt-oss:20b this weekend to build a sentiment mapping and see what I get. I'm optimistic about ministral3:3b, but the other two are pretty good at this type of stuff.
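One simple way to compare the models once each has labeled the same stories is pairwise agreement. This is only a sketch: it assumes each model's output is a list of labels in the same story order, and `pairwise_agreement` is a hypothetical helper, not part of any of these models' tooling.

```python
from itertools import combinations

def pairwise_agreement(labels_by_model):
    """Fraction of stories on which each pair of models gives the same label.

    labels_by_model maps a model name to its list of labels, one per story,
    all in the same story order.
    """
    agreement = {}
    for a, b in combinations(sorted(labels_by_model), 2):
        labels_a, labels_b = labels_by_model[a], labels_by_model[b]
        if len(labels_a) != len(labels_b) or not labels_a:
            raise ValueError(f"{a} and {b} need equal, non-empty label lists")
        matches = sum(x == y for x, y in zip(labels_a, labels_b))
        agreement[(a, b)] = matches / len(labels_a)
    return agreement
```

Raw agreement does not correct for chance, so with a skewed label distribution a statistic such as Cohen's kappa would be a stricter comparison.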

Interesting, would love to see the results. I'll be checking back here if you care to share them.

Yeah, I plan to do a follow-up comment with data and results.


