FR version is available. Content is displayed in original English for accuracy.
Advertisement
Advertisement
⚡ Community Insights
Discussion Sentiment
68% Positive
Analyzed from 2537 words in the discussion.
Trending Topics
#data#voice#mercor#never#company#companies#more#account#where#stolen

Discussion (83 Comments)Read Original on HackerNews
Awesome, if you're a victim of an AI company having your voice, you can help yourself by sending another AI company your voice!
> Audio is never used to train commercial models without explicit consent
I'm sure Mercor has explicit consent as well, legal teams are reasonably good at legally covering their asses with license terms.
It seems like a lot of people were basically wiretapping themselves with the Mercor app (Insightful) AND their businesses!!!
While a lot of Mercor "contractors" claim Mercor over-reached with data gathering, it's kind of smart because people are too afraid to complain too much knowing they'll not only lose their primary job, but also put a spotlight on forbidden dealing with their company's private information.
[0] https://www.wsj.com/tech/ai/mercor-ai-startup-personal-data-...
Selling the solution to the problem you caused ought to be illegal.
Court records are public in the US. If creditors want to know if you’ve been in financial trouble, they should check for bankruptcies and lawsuits, not the extrajudicial version of those that the credit reporting companies run based on hearsay.
Germans (because of course) have a word for this: "Datensparsamkeit". Being frugal with your data.
I don't know if it's the reason you imply. In the 70s, there were big debates in Germany about privacy and data storage. They spoke of one's data shadow (Datenschatten). I suspect this word comes from that tradition. The reason the word exists would then be the reflection (Verwaltigung) on WW2.
But yes. We Americans know Germans more for their silly big words. But statements like that can be misinterpreted as the German perspective of themselves doesn't quite match the American stereotypes.
In the US of course the government buys this sort of information legally from corporations.
There is also the rather famous example of how earlier census data was used in the 40’s.
Once the government has your data, they have it. The next generation of representatives may not follow all the same rules and norms
Who doesn’t want that old post going extinct forever when they were shit faced outside of a bar in Nashville but now they are in their mid-life and are “respectable” members of society.
Related "monetizing user data" seems to just mean ads. Ads on everything, forever, until the userbase gets fed up and moves to a new service that definitely won't do that, and the cycle repeats about every 3 years.
Nowadays you just throw all the data into a black box and believe whatever it says blindly.
I see this whenever an LLM’s impact is assessed. We know. The issue is scale and the ability for smaller and smaller groups (down to individuals) to execute at scale.
Fake news always existed. Now one dude in India can flood multiple sock puppet media accounts with right wing content/images (actual example) at a scale previously unimaginable.
Except no company is learning this lesson.
The enterprise threat model includes "our own users", and the modus operandi is to maintain as much information on that threat as possible.
[0] https://www.opensourceshakespeare.org/views/plays/play_view....
[1] https://www.etymonline.com/word/data
Mercer hasn't released many public statements over the incident. Social media posts aren't necessarily public; but I did find this breach notification sample filed with CA - https://oag.ca.gov/ecrime/databreach/reports/sb24-621099 . I guess we'll see if our legislators finally take data privacy seriously.
Mercor has definitely released statements with boilerplate "investigations are underway."
I jest but the majority of the "normal" people I know are happy to hand over biometrics because _it's easier_. We need to start branding biometrics as "forever passwords" or something to help people understand just what they're handing over when they validate access to their checking account or enter Disney World or whatever else.
Fingerprints, DNA, iris scans, gait patterns, etc. are all something you can't change (much like a permanent account ID) and are constantly being presented to the world (much like an email address). In addition under US law, police can compel presentation of fingerprints, but passwords are protected under the 5th amendment.
in a certain light, it's kind of admirable. they live like the world is the way it should be
So I could easily see a lot of people viewing this as a positive.
The deeper problem is that most of these companies collected this data because they could, not because they needed it for the core service. 'Datensparsamkeit' is the right frame: the voice samples were a liability sitting on a server waiting for exactly this.
I mean, just look at what happened to Crowdstrike....
Even have a nice UI on top.
https://voicebox.sh/
https://commonvoice.mozilla.org/en/languages
Half the time I call a company they say “we are recording your voice for security / authentication purposes”.
The companies that do that have all the information on me that they require for me to set up an account, so their data breaches will be just like this one, but 1000x larger.
Can we just fast forward through the part where this works for ID theft, past the firefox age verification plugin that uses these datasets, and even through the part where people in the plugin dataset are digital outcasts (this voice has been used too many times. Want to try another?)
At the end of this dark predictable tunnel, maybe there will be a ban on biometrics for important stuff, a repeal of the age verification laws, and actual privacy legislation with teeth.
good luck with this. most finance people deal with hundreds to thousands of clients. they obviously cant remember everyones code word. commonly used finance systems arent setup to securely store these codewords. they dont have processes or policies in place to implement or adhere to any sort of codeword verification.
>Rotate where voiceprints are still in use. [...] Do that now, ideally from a new recording in a different acoustic environment than the leaked sample.
would this even have an effect? i have never heard of "rotating" a voice print. isnt the whole point of a voice print that you cant really change it? if simply switching your environment completely changes your voice print, that would make voice prints utterly useless to begin with.
there are automated systems for this already. my bank, isp, etc. use them when you call in to skip the traditional verification steps. this fact is also highlighted in the article.
the problem is that there isnt typically a system in place for setting up or validating code words, so the advice given is not practical to implement.
The other use cases (like calling payroll, etc) likely don’t have the same protections and probably would be more effective.
I've had to open a bank account for a company here a few years ago and that was right on the bubble of this happening and they still had an option to come by in person with the proper documentation, which I did, now it is all outsourced.
These companies are the fattest targets and they're run by incompetents. You should assume that anything you give them will eventually be part of some hack.
Can't wait for them to crash and burn.
:)