
Solving trust 100% is hard. Having people review apps whose names are within a short Levenshtein distance (accounting for Unicode tricks etc.) of popular apps' names, and banning those apps, the accounts that created them and their suppliers of fake votes, is not that hard, especially for a company like Google. And look at those apps' descriptions: they are complete baloney, and any two-bit text classifier that a capable intern can put together in a weekend from off-the-shelf components can recognize that. These guys aren't even trying, and still aren't getting caught.

Yes, it may require some monetary investment, but we're talking about a $700bn company. They could afford it if they wanted to. If they are not doing it, that means they do not want to.



Of course, in hindsight you "only" have to calculate the Levenshtein distance between any product name and _all other_ product names on the store. That scales well. And all that in order to close one single avenue for fraudulent advertisement. Maybe it's a big one, and maybe the cost is recouped through improved customer relations. Maybe.

And maybe they implement this, and calculate hundreds of millions (billions?) of Levenshtein distances every day, but the next day someone publishes the same app with a Germanized name ("Was ist App Update") and fools a couple hundred thousand Germans. Now the solution is obvious: run the names through Google Translate for ALL languages and calculate the respective Levenshtein distances! It's foolproof! Shame on you, Google, for not doing it already! Simply irresponsible.


> and _all other_ product names on the store

Not true. Nobody fakes random products. It's the top-scoring ones that are getting faked, for the obvious reason that this is what people are looking for. If you're not in the top N (100, 200, whatever), faking you is useless: you're just replacing nobody with nobody (an exception may be bank apps, where faking even relatively obscure ones can be lucrative, but let's not get into niches for now). Just scanning against the top ones would knock the floor out from under most of the current fakers.

And of course you don't need to continuously re-scan the data - you need to scan only once, when the app is submitted or its name is changed. So, in summary, when adding an app or a release to the store, you need to check its name and description against a list - let's be generous - of 1000 strings, and maybe run a basic text classifier if you are feeling in a very AI mood today. Is that impossible to scale? Nope, it's fairly easy.
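To make that concrete, here is a minimal sketch of such a submission-time name check in Python. Everything in it is an assumption for illustration: the `TOP_APP_NAMES` list, the function names, and the `max_distance` threshold are hypothetical, and nothing here reflects how any real store works. It only folds case, compatibility forms, accents and zero-width characters; a real check would also want a confusables table for cross-script lookalikes.

```python
import unicodedata

# Hypothetical stand-in for the "top N" list; a real one would hold ~1000 names.
TOP_APP_NAMES = ["WhatsApp Messenger", "Facebook", "Instagram"]


def normalize(name: str) -> str:
    """Fold common Unicode tricks: compatibility forms (e.g. fullwidth letters),
    combining accents, invisible format characters, case and extra spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    kept = [
        ch for ch in decomposed
        if not unicodedata.combining(ch)        # drop accents split off by NFKD
        and unicodedata.category(ch) != "Cf"    # drop zero-width / format chars
    ]
    return " ".join("".join(kept).casefold().split())


def levenshtein(a: str, b: str) -> int:
    """Classic two-row dynamic-programming edit distance."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def suspicious_matches(candidate, top_names=TOP_APP_NAMES, max_distance=2):
    """Return (popular_name, distance) pairs close enough to warrant human review.
    Runs once per submission or rename: one pass over the short top list."""
    norm = normalize(candidate)
    hits = []
    for name in top_names:
        distance = levenshtein(norm, normalize(name))
        if distance <= max_distance:
            hits.append((name, distance))
    return hits
```

A distance of 0 means the name is identical after normalization, so the genuine publisher's account would have to be exempted before anything gets flagged. The cost per submission is one edit-distance computation per entry in the top list, which is the point being argued above.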

> but the next day someone publishes the same app

So your argument is that because simple checks are not perfect and do not cover 100% of possible fakery, we should do nothing and allow even the dumbest fakers to run free and fill the store with trash? Does that make sense to you? Because it doesn't make sense to me. Probably you decided that since your argument won't be perfect anyway, there's no point in even trying to make it minimally sensible?



