
Discussion (73 Comments)
UK Biobank health data keeps ending up on GitHub
https://news.ycombinator.com/item?id=47875843
UK Biobank health data listed for sale in China, government confirms
https://news.ycombinator.com/item?id=47874732
If they've got anyone with a background in cyber security I can't see it.
https://www.ukbiobank.ac.uk/about-us/people-and-governance/
And then the CEO comes out with:
> We have never seen any evidence of any UK Biobank participant being re-identified by others.
This data contains sex, and at least month and year of birth. I can't see any sensible security-oriented technical person coming out with a line like that.
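The re-identification worry behind that comment can be sketched with a back-of-envelope calculation. The ~500,000 cohort size is UK Biobank's published figure; the 30 birth-year cohorts, uniform distribution, and the extra-field cardinality are my own illustrative assumptions, not from the thread:

```python
# Back-of-envelope quasi-identifier sketch (assumptions noted inline).
participants = 500_000          # approximate UK Biobank cohort size
buckets = 2 * 12 * 30           # sex x birth month x ~30 birth-year cohorts (assumption)
avg_bucket = participants / buckets
print(f"avg people per (sex, birth month, birth year): {avg_bucket:.0f}")  # ~694

# These fields alone don't single anyone out, but each additional field
# (postcode district, self-reported conditions, genotype) divides the
# bucket further; a handful of attributes is enough to reach bucket size 1.
extra_field_cardinality = 100   # e.g. ~100 postcode areas (assumption)
print(f"with one extra 100-valued field: {avg_bucket / extra_field_cardinality:.1f}")  # ~6.9
```

This is why "no evidence of re-identification" is a weak guarantee: the fields shrink the anonymity set quickly once linked with any outside data.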
I don’t get this trend of seeing a bad thing happen and then commenting that another bad thing exists and therefore “in fairness” we should downplay it.
Bad things are bad. Comparing them to other things we don’t like doesn’t make them less bad. I don’t like Palantir either but they’re not intentionally leaking health details so this comparison doesn’t even make any sense.
To many, they are. They're leaking information that has been trusted to the NHS to their own databases.
The fact that it's being done under government contract and (arguably) within the law shouldn't immediately make it any less bad.
Of course it should, to say otherwise is absurd
what, the NHS shouldn't have _any_ subcontracting? All data must only be held by sacred NHS monks in a vault somewhere?
As long as Palantir are holding the data on UK servers, to modern data security standards, and they have a contract to do so, they should be able to
yes
In fact the people I have spoken to who have worked on the Palantir platform were deeply suspicious of their users' willingness to treat data with respect, and so built security and immutable auditability in as foundational tech.
Palantir is indeed in many ways just a software vendor but we shouldn’t downplay that they have a much more explicit agenda than most other companies do in seeking government contracts.
[1] https://gizmodo.com/palantirs-billionaire-ceo-just-cant-stop...
…says a happy frog who will be as cooked as everyone else.
I personally would like data like this to simply be published, together with a law that says using the data to make personalized decisions affecting those individuals is punishable by life in prison.
Basically, this data is 'open source', but not for use in deciding insurance premiums, job offers, or the contents of news articles.
As a researcher who regularly deals with such data there is a MASSIVE difference. Yes, I have access to the data but I am restricted on how it can be stored (no cloud), what I can and can't do with it, and for some of it I'm even mandated to destroy it once the research project is over. I have the informed consent of every participant, some of whom withdrew halfway through the collection without any penalty to them.

I also don't need a new law because I'm already bound by existing ones, by the contract I signed when I joined, and by the confidentiality agreement I signed when the project started. While I don't know whether the leaker(s) will be identified, the existence of the data itself already calls for legal action while giving a starting point for investigation.
Your suggestion, on the other hand, seems to be "let's put this data out there without people's consent and make companies pinky promise that they won't use it in their black boxes in a way that's virtually impossible to detect or prosecute". Those two things are definitely not equivalent.
When you give data to O(20000) people, you have a 1 − 0.9999^20000 (high) probability that it will leak anyway (either 1-in-20000 people not following the rules, or just the accident/attack surface area).
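That estimate works out as follows. The 1-in-10,000 per-accessor leak rate is the comment's implicit assumption, not a measured figure:

```python
import math

n_accessors = 20_000
p_no_leak_each = 0.9999   # assumed per-accessor probability of NOT leaking

# Probability that at least one of the 20,000 accessors leaks:
p_leak = 1 - p_no_leak_each ** n_accessors
print(f"P(at least one leak) = {p_leak:.3f}")   # ~0.865

# Sanity check against the exponential approximation 1 - e^(-n*p):
approx = 1 - math.exp(-n_accessors * (1 - p_no_leak_each))
print(f"approximation        = {approx:.3f}")   # ~0.865
```

Even with a very optimistic 1-in-10,000 failure rate per accessor, the aggregate leak probability is roughly 86%; the point stands regardless of the exact rate chosen.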
This works well in theory but is basically unenforceable. It's barely possible, if possible at all, to audit how FB or google make ad targeting decisions - but once stuff gets into the fragmented ecosystem of data brokers and market intelligence consultancies all hope is lost.
To say nothing of state actors, like countries who might deny you a visa based on adverse medical info or otherwise use your information against you.
Like a clean room implementation requirement.
licensing it to researchers allows you to create, monitor, and enforce policies like the one you describe
stealing it does not
Given the whack-a-mole takedowns, it's pretty clear everyone involved knew what was going on.
If this is not traceable back to individuals, it would probably be good to make it public. But I assume the UK Biobank only gives access to trusted partners since, as we know in our 'data analytics' day and age, with enough general data quantity you can trace anything back to anyone if you have the resources. And the capitalist-surveillance economy certainly provides the profit motive.
https://nanoporetech.com/products/sequence/minion
But once your data has been digitized, even if it is under your control, the likelihood that it gets leaked is still high. Especially now with AI agents running everywhere, or people just asking AI services for medical advice.
Today the choice for advice is between low-quality local AI advice or higher-quality advice where you lose control of your data; the rational choice is probably losing your data control, even if it will almost certainly come back to bite you.
...until they're inevitably sold.
Should or shouldn't in general, but THIS one database shouldn't.
If I leak the medical information you confidentially shared with your doctor, does that mean you are okay with it because you opted in for that?
Or do the scope/details not matter for others, but only for your data?
*I have a much better word but I guess I shouldn't say it.