Discussion (181 Comments) | Read Original on HackerNews
Sometimes HN drives me crazy. From this thread you’d think telemetry is screen recording your every move and facial expression and sending it to the government. I’ve worked at places that had telemetry and it’s more along the granularity of “how many people clicked the secondary button on the third tab?” This is a far cry from “spying on users”.
And you know what happens when you reach out to talk to your customers like human beings instead of spying on them like animals? They like you more and they raise issues that your telemetry would never even think to measure.
It's called user research and client relationship management.
In the OSS world this is not a huge deal. You get some community that’s underserved by the product (ie software package) and they fork, modify, or build something else. If it turned out to be valuable, then you get the old solution complemented or replaced. In the business world this is an existential threat to the business - you want to make sure your users aren’t better served by a competitor who’s focusing on your blindspot.
However, the plural of "anecdote" is not "data". People are unreliable narrators, and you can only ask them so many questions in a limited time amid their busy lives. Also, there are trends which appear sooner in automated analytics by days, weeks, or even months than they would appear in data gathered by the most ambitious interview schedule.
There is a third, middle-ground option as well: surveys. They don't require as much time commitment from the user or the company as a sit-down interview. A larger number of people are willing to engage with them than are willing to schedule a call.
In my experience, all three are indispensable tools.
Yes, vendors can, do, and should talk to users, but then a lot of users don't like receiving cold messages from vendors (and some users go so far as to say that cold messages should _never_ be sent).
So, the alternative is to collect some soft telemetry to get usage metrics. As long as a company is upfront about it and provides an opt-out mechanism, I don't see a problem with it. Software projects (and the businesses around them) die if they don't make the right decisions.
As an open source author and maintainer, I very rarely hear from my users unless I put in the legwork to reach out to them so I completely identify with this.
Everything went to crap in the metric-based era that followed.
And the difference between what they do and what they want is equally shocking. If what they want isn’t in your app, they can’t do it and it won’t show up in your data.
Quantitative data doesn’t tell you what your users want or care about. It tells you only what they are doing. You can get similar data without spying on your users.
I don’t necessarily think all data gathering is equivalent to spying, but if it’s not entirely opt-in, I think it is effectively spying no matter what you’re collecting, varying only along a dimension of invasiveness.
Excellent point.
> but if it’s not entirely opt-in, I think it is effectively spying no matter what you’re collecting, varying only along a dimension of invasiveness.
Every web page visit is logged on the HTTP server, and that's been the default since the mid-1990s. Is that spying?
More like flying based on your knowledge as a pilot and not by the whims of your passengers.
For many CLIs and developer tooling, principled decisions need to reign. Accepting the unquantifiability of usage in a principled product is often difficult for those that are not the target demographic, but for developer tools specifically (be they programming languages, CLIs, APIs, SDKs, etc), cohesion and common sense are usually enough. It also seems real hard for product teams to accept the value of the status quo with these existing, heavily used tools.
We... we are talking about a CLI tool. A CLI tool that directly uses the API. A tool which already identifies itself with a User-Agent[0].
A tool which obviously knows who is using it. What information are you gathering by running telemetry on my machine that couldn't.. just. be. a. database. query?
Reading the justification the main thing they seem to want to know is if gh is being driven by a human or an agent... Which, F off with your creepy nonsense.
Please don't just use generic "but ma analytics!" when this obviously doesn't apply here?
[0]: https://github.com/cli/cli/blob/3ad29588b8bf9f2390be652f46ee...
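To make the "just a database query" point concrete, here is a minimal sketch of counting client usage purely from server-side request records. Everything in it is an assumption for illustration: the row fields, the User-Agent strings, and the helper names are hypothetical and do not describe GitHub's actual logging.

```python
from collections import Counter

# Hypothetical request-log rows. The field names and User-Agent strings
# are illustrative assumptions, not GitHub's real log schema.
REQUEST_LOG = [
    {"user_id": 1, "user_agent": "GitHub CLI 2.90.0", "path": "/repos/o/r/pulls"},
    {"user_id": 1, "user_agent": "GitHub CLI 2.90.0", "path": "/repos/o/r/issues"},
    {"user_id": 2, "user_agent": "curl/8.5.0", "path": "/repos/o/r/pulls"},
    {"user_id": 3, "user_agent": "GitHub CLI 2.89.0", "path": "/user"},
]


def client_name(user_agent: str) -> str:
    """Drop a trailing space-separated version so releases count as one client."""
    name, _, _version = user_agent.rpartition(" ")
    return name or user_agent


def requests_per_client(rows):
    """Count requests grouped by client name."""
    return Counter(client_name(row["user_agent"]) for row in rows)


def distinct_users_per_client(rows):
    """Count distinct authenticated users grouped by client name."""
    users = {}
    for row in rows:
        users.setdefault(client_name(row["user_agent"]), set()).add(row["user_id"])
    return {client: len(ids) for client, ids in users.items()}
```

A query like this only sees commands that actually reach the API. Anything the tool does purely locally would not show up in it.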
Why is it that startups and commercial software developers seem to be the only ones obsessed with telemetry? Why do they need it to "optimize user journeys" but open source projects do just fine while flying blind?
Unthinkingly leaning on metrics is likely to help you build a faster, stronger horse, while at the same time avoiding building a car, a bus or a tractor.
No, because users have different needs and thoughts from the developers. And because sometimes it's hard to get good feedback from people. Maybe everyone loves the concept of feature X, but then never uses it in practice for some reason. Or a given feature has a vocal fan base that won't actually translate to sales/real usage.
> Would Git have been significantly better if it had collected telemetry, or would the data not have just been a distraction?
I think yes, because git famously has a terrible UI, and any amount of telemetry would quickly tell you people fumble around a lot at first.
I imagine that in an alternate world, a git with telemetry would have come out with a less confusing UI because somebody would have looked at the stats and for instance have added "git restore" right from the very start, because "git checkout -- foo.txt" is an absolutely unintuitive command.
Compilers and whatnot seem to suffer from the same problem that programs like git(1) do. Once you've put it out there in the world you have no idea if someone will still use some corner of it thirty years from now.
Thankfully, GitHub has zero control over git. If they did have control they would have sunk the whole operation in year one.
> because somebody would have looked at the stats and for instance have added "git restore" right from the very start, because "git checkout -- foo.txt" is an absolutely unintuitive command.
How is git restore any better? Restoring what from when? At least git checkout is clear in what it does.
And this is exactly where disconnects happen, and where you need telemetry or something like it to tell you how your users actually use the system, rather than imagining how they should.
A technical user deep into the guts of Git thinks "you need to check out again this specific file".
A novice thinks "I want to restore this file to the state it had before I touched it".
Now we can argue about whether "restore" is the ideal word here, but all the same, end users tend to think in terms of "I want to undo what I did", and not in terms of git internals.
So a hypothetical git with telemetry would probably show people repeatedly trying "git restore", "git undo", "git revert", etc, trying to find an undo command.
1. git doesn’t have a UI, it’s a program run in a terminal environment. the terminal is the interface for the user.
2. git has a specific design that was intended to solve a specific problem in a specific way. mostly for linux kernel development. so, the UX might seem terrible to you — but remember that it wasn’t built for you, nor was it designed for people in their first ever coding boot camp. that was never git’s purpose.
3. the fact that every other tool was designed so poorly that everyone (eventually, mostly) jumped on git as a new standard is an expression of the importance of designing systems well.
Unfortunately this is due to a large part of "decision makers" being non-technical folks, not being able to understand how the tools are actually used, as they don't use such tools themselves. So some product manager "responsible" for development tooling needs this sort of stuff to be able to perform in their job, just as some clueless product manager in e-commerce absolutely has to overload your frontend with scripts tracking your behaviour, also to be able to perform in their job. Of course the question remains, why do those jobs exist in the first place, as the engineers were perfectly capable of designing interaction with their users before the VCs imposed the unfortunate paradigm of a deeply non-technical person somehow leading the design and development of highly technical products... So here we are, sharing our data with them, because how else will Joe collect his PM paycheck, in between prompting the AI for his slides and various "very important" meetings...
I'm not sure if you're implying it's obvious but it's not obvious to me that it would be unhelpful.
Telemetry is a really poor substitute for actually observing a couple of your users. But it's cheap and feels scientific and inclusive/fair (after all you are looking at everyone)
I mean no solution is perfect, and some underused things are just only sometimes extremely useful, but data used smartly is not a waste of time.
I guess that's the price of regular and non-invasive software.
Git has horrible design and ergonomics.
It is an excellent example of engineers designing interfaces for engineers without a good feedback loop.
Ironically, you just proved your point that engineers need to better understand how users are actually using their product, because their mental visualizations of how their product gets used are usually poor.
[1] https://docs.brew.sh/Analytics
1. It's anonymous
2. They're telling you they're doing it
3. You can opt out of it
Not saying that telemetry is more valuable than privacy, just that it's a straightforward decision for a company to make when real benefits are only counterbalanced by abstract privacy concerns. This is why it's so universally applied across apps and tools developed commercially.
If I run "gh alias set foo bar", and that takes even a marginally perceptible amount of time, I'll feel like the tool I'm using is poorly built since a local alias obviously doesn't need network calls.
I do see that `gh` is spawning a child to do sending in the background (https://github.com/cli/cli/blob/3ad29588b8bf9f2390be652f46ee...), which also is something I'd be annoyed at since having background processes lingering in a shell's session is bad manners for a command that doesn't have a very good reason to do so.
It's not that it's insufficient, new developers, product people and designers literally don't know how to make tasteful and useful decisions without first "asking users" by experimenting on them.
Used to be you built up an intuition for your user base, but considering everyone is changing jobs every year, I guess people don't have time for that anymore, so literally every decision is "data driven" and no user is super happy or not anymore, everyone is just "OK, that's fine".
All that said, having been in plenty of corporate environments I would be surprised if the data is anonymized and wouldn't be surprised if the primary motivator boils down to something like internal OKRs and politics.
The software user has no means to verify the explanation or disclosure is accurate or complete. Once the data is transferred to the company then the user has no control over where it goes, who sees it or how it is used
When the company states "We use the data for X" it is not promising to use the data for X in the future, nor does it prevent the company, or one of its "business partners", from using the data additionally for something else besides X
Why "explain" the reason for collecting telemetry
Why "disclose" how the data is used
What does this accomplish
This isn’t that surprising to me. Having usage data is important for many purposes. Even Debian has an opt-in usage tracker (popcon) to see what packages they should keep supporting.
What I’m curious about is why this is included in the CLI. Why aren’t they measuring this at the API level where they wouldn’t need to disclose it to anyone? What is done locally with the GH CLI tool that doesn’t interact with the GitHub servers?
I've repeatedly talked about this on HN; I call it Marketing Driven Development. It's when some Marketing manager goes to your IT manager and starts asking for things that no customer wants or needs, so they can track if their initiatives justify their job, aka are they bringing in more people to x feature?
Honestly, with something as sensitive as software developer tools, I think any sort of telemetry should ALWAYS be off by default.
Linux and Git are fully open source, and have big companies contribute to it. If a company like Google, Microsoft etc need a feature, they can usually afford to hire someone and develop _and_ maintain this feature.
Something like gh is the opposite. It's maintained by a single organisation, and the team maintaining it has finite resources. I don't think it's much to ask to understand what features are being used, what errors might come up, etc.
"oh no, they're aware of someone at the computer 19416146-F56B-49E4-BF16-C0D8B337BF7F running `gh api` a lot! that's spying!"
Yes, probably. Git is seriously hard to use beyond basic tasks. It has a byzantine array of commands, and the "porcelain" feels a lot closer to "plumbing" than it should. You and I are used to it, but that doesn't make it good.
I mean, it took 14 years before it gained a `switch` command! `checkout` and `reset` can do like six different things depending on how your arguments resolve, from nondestructive to very, very destructive; safe(r) operations like --force-with-lease are made harder to find than their more dangerous counterparts; it's a mess.
Analytics alone wouldn't solve the problem - you also need a team of developers who are willing to listen to their users, pore through usage data, and prioritize UX - but it would be something.
Sincerely, a Mercurial user from way back.
Because they're too shy, lazy, or socially awkward to actually ask their users questions.
They cover up this anxiety and laziness by saying that it costs too much, or it doesn't "scale." Both of these are false.
My company requires me to actually speak to the people who use the web sites I build; usually about every ten to twelve months. The company pays for my time, travel, and other expenses.
The company does this because it cares about the product. It has to, because it is beholden to the customers for its financial position, not to anonymous stock market trading bots a continent away.
Bug fixing absolutely gets taken care of immediately, and our customers are very active in telling us about them through these strange new feedback mechanisms known as "e-mail" and "a telephone."
But we don't spy on people to fix bugs.
Nothing that the big tech "telemetry" is doing is about bug fixes. In the article we're all talking about, the spying that Microsoft proposes isn't to fix bugs. Re-read what it wrote. It's all for things that may not appear for weeks, months, or years.
And to think that a trillion-dollar company like Microsoft can't figure out how, or doesn't have the money available to scale real customer feedback is just sticking your head in the sand and making excuses.
Microsoft doesn't need people to apologize for its failure.
Now, let's replicate this with GitHub. What can go wrong?
There are all sorts of best practices for getting info without vacuuming up everyone’s data in opaque ways.
I’m not saying they don’t engage in any of those practices, I am specifically talking about the hardware survey.
The hardware survey is not that.
The problem I have with a lot of these analytics is that while there are harmless ways to use it, there is this understanding that they could be tying your unique identifier to behavioral patterns which could be used to reconstruct your identity with machine learning. It's even worse if they include timestamps.
Why not just expose exactly what telemetry is being sent when it's sent? Like add an option that makes telemetry verbose, but doesn't send it unless you enable it. That way you can evaluate it before you decide to turn it on. Whenever you do the Steam Hardware survey it'll show you what gets sent. This is the right way to do it.
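A minimal sketch of that "show it before you send it" idea, under stated assumptions: the payload fields, the `report` helper, and its `enabled`/`verbose` switches are all hypothetical names made up for illustration, not the behaviour of gh or Steam.

```python
import json


def build_event(command: str, version: str) -> dict:
    # Hypothetical payload. A real tool would document its own fields.
    return {"command": command, "version": version}


def report(event: dict, *, enabled: bool, verbose: bool, send, out=print) -> bool:
    """Show the payload when verbose; transmit only when enabled.

    Returns True if the event was handed to `send`, False otherwise.
    """
    if verbose:
        # The user sees exactly what would leave the machine.
        out(json.dumps(event, sort_keys=True))
    if not enabled:
        return False
    send(event)
    return True
```

With `verbose=True` and `enabled=False`, the event is printed and nothing is transmitted, so a user can inspect the data first and opt in afterwards.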
The opt-out situation for gh CLI telemetry is actually trickier than it sounds. gh runs in CI/CD pipelines and server environments where you may not want any outbound connections to github.com at all, not because of privacy but because of networking constraints. In those environments, the telemetry being on by default means your CI fails or your Bastion host can't reach GitHub at all.
Compare this to git itself, which is entirely local until you explicitly push. The trust model is different: git will never phone home unless you configure it to. gh, being a wrapper around the GitHub API, has to make those calls to function - but that's separate from whether it should also be collecting and uploading your command patterns.
> Removes the env var that gates telemetry, so it will be on by default.
If you don't want your requests tracked, you're going to have to opt out of a lot more than this one setting.
Those two words have almost exactly opposite meanings, and as stated, they are literally saying they are collecting identifiable data.
Embrace, extend, extinguish.
The first two have been done.
I give it five years before the GH CLI is the only way to interact with GitHub repos.
Then the third will also be done, and the cycle is complete.
I'll take that bet. How much are you willing to put on it?
A quick summary of my Claude-assisted research at the Gist below. Top of mind is some kind of trusted intermediary service with a vested interest in striking a definable middle ground that is good enough for both sides (users and product-builders)
Gist: WIP 31 minutes in still cookin'
P.S. You look like villain from Temu.
For example
will run and poll the CI checks of a PR and exit 0 once they all pass.
Also, I believe GitHub Actions cache cannot be bulk deleted outside of the CLI. The first time I [hesitantly] used the gh CLI was to empty GitHub Actions cache. At the time it wasn't possible with the REST API or web interface.
And less social media shit, maybe adding better LFS alternative similar to huggingface and stuff.
Git isn't the popular choice in game dev because of this assets in tree hosting nonsense, why haven't we fixed it yet.
Similarly many edge cases. Also, they finally built stacked PRs, but man does it feel underbaked, and what, it's like 2+ years late.
Please just improve Github, make me feel like I will be missing out if I am not on Github because of the features not because I have to be because of work.
* Dev tools because you need to be able to trust they don't leak while you're working. Not all sites/locations/customers/projects allow leaks, and it's easier to just blacklist anything that does leak, so you know you can trust your tools, and the same habits, justfiles, etc work everywhere.
* libraries that leak deserve a special kind of hell. You add a library to your project, and now it might be leaking without warning. If a lot of libraries decide to leak, your application is now an unmanageable sieve.
If you do need to run telemetry, make it opt in or end user only. But if you as developer don't even have control then that's the worst.
Today I use a Golang CLI made with ~200K LOC to do essentially the same thing. Yay, efficiency?
Regulators should wake up and fine them hard, so hard to become existential. Make an example for others not to follow.
I know lots of idealists -- I went to a public policy school. And in some areas, I am one myself. We need them; they can push for their causes.
But if you ever find yourself working as a regulator, you'll find the world is complicated and messy. Regulators that overreach often make things worse for the very causes they support.
If you haven't yet, go find some regulators that have to take companies all the way to court and win. I have known some in certain fields. Learn from them. Some would probably really enjoy getting to talk to a disinterested third party to learn the domain. There are even ways to get involved as a sort of citizen journalist if you want.
But these sort of blanket calls for "make an example of GitHub" are probably a waste of time. I think a broader view is needed here. Think about the causal chain of problems and find a link where you have leverage. Then focus your effort on that link.
I live in the DC area, where ignorance of how the government works leads to people walking away and not taking you seriously. When tech people put comparable effort into understanding the machinery of government that they do into technology, that is awesome. There are some amazing examples of this if you look around.
There are no excuses. Tech people readily accept that they have to work around the warts of their infrastructure. (We are often lucky because we get to rebuild so much software ourselves.) But we forget what it's like to work with systems that have to resist change because they are coordination points between multiple stakeholders. The conflict is by design!
Anyhow, we have no excuse to blame the warts in our governmental system. You either fix them or work around them or both.
The world is a big broken machine. Almost no individual person is to blame. You just have to understand where to turn the wrench.
https://en.wiktionary.org/w/index.php?search=pseudoanonymous...
It is interesting how GitHub sort of prominently features this non-word in their article. Perhaps some South Asian or European person for whom English is a struggle.
There is no word that means "fake-anonymous". I would assume that the author of this article intended to write "pseudonymous" which is a real word with a real definition.
https://en.wiktionary.org/wiki/pseudonymous
But it would also be interesting if they very much intended the ambiguity of using a non-word that is more than it seems on the surface.
the old git command in your terminal
I think I'll keep using that
It might seem legit from them, but I'm quite sure that just listening to your users is enough. It is not like they lack a user base ready to interact with them or that they lack bugs or features to work on.
In most cases, the telemetry is more of a vanity metric that is rarely used. "Congratz to this team that did the flag that is the most used in the cli". But even for product decisions, it is hard to extract conclusions from current usage because what you can and will do today is already dependent on the way the cli is done. A feature might not be used a lot because it is not convenient to do, or not available in a good way compared to an alternative, but a usage report will not tell you whether it was useful or not. In the same way, when I buy a product, often there are a lot of features that I will never use, but that I'm happy to have. And I might not have bought the product, or might have bought another one, if they were not available. But the worst would be to have the manufacturer remove or disable a feature because it is not used...
Corporations can and will do every scummy thing permitted to them by law, so here we are. Until the US grows a backbone on issues of privacy, we shouldn't be surprised, I suppose. But the US won't be growing such a backbone anytime in the near future.
export GH_TELEMETRY=false
export DO_NOT_TRACK=true
gh config set telemetry disabled (starting from version 2.91.0, which this announcement refers to)
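As a rough sketch of how the three opt-outs listed above could combine, assuming any single signal is enough to disable telemetry. The precedence and the sets of accepted values are assumptions for illustration, not gh's actual implementation; only the variable names and the `telemetry` config key come from the thread.

```python
# Assumed spellings of "off" and "on"; the real tool may accept fewer or more.
FALSY = {"0", "false", "no", "off", "disabled"}
TRUTHY = {"1", "true", "yes", "on"}


def telemetry_enabled(env: dict, config: dict) -> bool:
    """Return False if any opt-out signal is present, True otherwise."""
    if env.get("GH_TELEMETRY", "").strip().lower() in FALSY:
        return False  # tool-specific environment variable
    if env.get("DO_NOT_TRACK", "").strip().lower() in TRUTHY:
        return False  # generic do-not-track environment variable
    if str(config.get("telemetry", "")).strip().lower() in FALSY:
        return False  # value written by `gh config set telemetry disabled`
    return True  # on by default, as described in the thread
```

Checking the environment before the config file is a design choice in this sketch, so a CI job can opt out without touching any file on disk.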
gh version 2.90.0 (2026-04-16) https://github.com/cli/cli/releases/tag/v2.90.0
$ gh config set telemetry disabled
! warning: 'telemetry' is not a known configuration key
Also note that even though you get a warning about an unknown config key, the value is actually set so you're future-proof. Check `grep telemetry ~/.config/gh/config.yml`
What's strange is if you check your `~/.config/gh/config.yml` it will put `telemetry: disabled` in there. But it will put anything in that `config.yml` lol.
> gh config set this-is-some-random-bullshit aww-shucks
> ! warning: 'this-is-some-random-bullshit' is not a known configuration key
But in my config.yml is
this-is-some-random-bullshit: aww-shucks
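For checking what actually landed in the file, here is a small sketch that does roughly what the `grep telemetry` suggestion does, but only matches a top-level `key: value` line. It assumes a flat YAML layout. The sample text and the `find_key` helper are hypothetical, and a real script should use a proper YAML parser.

```python
def find_key(config_text: str, key: str):
    """Return the value of a top-level `key: value` line, or None if absent."""
    for line in config_text.splitlines():
        if line.startswith((" ", "\t", "#")):
            continue  # skip nested entries and comments
        name, sep, value = line.partition(":")
        if sep and name.strip() == key:
            return value.strip()
    return None


# Hypothetical config contents, including an unknown key that was
# accepted with only a warning.
SAMPLE = """\
git_protocol: https
telemetry: disabled
some-unknown-key: some-value
"""
```

Here `find_key(SAMPLE, "telemetry")` returns `"disabled"`, while a key that was never set returns `None`.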