⚡ Community Insights
Discussion Sentiment
69% Positive
Analyzed from 14991 words in the discussion.
Trending Topics
#stars#github#star#more#project#metric#don#vcs#fake#signal
Discussion (367 Comments) | Read Original on HackerNews
Here are the things I look at in order:
* last commit date. Newer is better
* age. old is best if still updating. New is not great but tolerable if commits aren't rapid
* issues. Not the count, mind you, just looking at them. How are they handled, what kind of issues are lingering open.
* some of the code. No one is evaluating all of the code of libraries they use. You can certainly check some!
What do stars tell me? They are an indirect variable caused by the above things (driving real engagement and third-party interest) or otherwise fraud. The only way to tell is to look at the things I listed anyway.
I always treated stars like a bookmark "I'll come back to this project" and never thought of it as a quality metric. Years ago when this problem first surfaced I was surprised (but should not have been in retrospect) they had become a substitute for quality.
I hope the FTC comes down hard on this.
Edit:
* commit history: just browse the history to see what's there. What kind of changes are made and at what cadence.
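The checklist above (last commit date, project age, commit cadence) can be roughly sketched as a first-pass filter. This is only an illustration: the `RepoSnapshot` fields and every threshold below are assumptions made up for the example, not values anyone in the thread proposed, and reading the issues and the code still has to be done by hand.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RepoSnapshot:
    # Hypothetical fields; fill them from whatever source you trust.
    created: date
    last_commit: date
    commits_last_90_days: int


def first_pass(repo: RepoSnapshot, today: date) -> list[str]:
    """Return human-readable flags; an empty list means nothing stood out."""
    flags = []
    age_days = (today - repo.created).days
    idle_days = (today - repo.last_commit).days
    # Thresholds are illustrative assumptions, not from the discussion.
    if idle_days > 365:
        flags.append("no commits in over a year")
    if age_days < 180 and repo.commits_last_90_days > 300:
        flags.append("new project with very rapid commits")
    return flags
```

An empty result does not mean the project is good, only that none of these cheap checks tripped.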
I do it all the time, whenever there are competing libraries to choose among.
It's a heuristic that saves me time.
If one library has 1,000 stars and the other has 15, I'm going to default to the 1,000 stars.
I also look at download count and release frequency. Basically I don't want to use some obscure dependency for something critical.
There are clearly inflection points where stars become useful, with "nobody has ever used this package" and "Meta/Alphabet pays to develop/maintain this package" on the two extremes.
I'm less sure what the signal says in-between those extremes. We have 2 packages, one has 5,000 stars, the other has 10,000 stars - what does this actually tell me, apart from how many times each has gone viral on HN?
If the goals are marketing or targeting or mass-market appeal or hiring pools then those stars say something else.
Will you continue to do this after reading TFA?
i.e. if the maintainer is serious enough to buy stars, are they not, in theory, also likely to spend time/money on maintaining/improving the project?
Presumably he wouldn't just want fake users but also real users, which is a better signal than a purely hobby project that is vibe-coded on a whim over a weekend and abandoned?
Just because you make a decision quicker doesn't mean you saved any time. It is good to save time, but not at the expense of quality. You spend more buying cheap boots, and they don't even keep your feet dry.
A bad one.
I listed many other useful heuristics. Do you not find value in them? Do you find stars more valuable than them?
Take a moment to consider that stars may only be a useful metric for packages that were created prior to ~2015, when stars weren't such a strong vanity metric, and that are already very well established. This is preconditioning you to think "stars can still sometimes be useful, because I took a look at Facebook's React GH and it has a quarter million stars".
Sure, it's useful for that. But you aren't going to evaluate if the "React" package is safe. You already trivially know it is.
You'll be evaluating packages like "left-pad". Or any number of packages involved in the latest round of supply chain attacks.
For that matter, VCs are the ones stars are being targeted at, and potential employers (something this article doesn't cover, but some potential hires do hope to leverage on their resume).
If you are a VC, or an employer, it is a negative metric. If you are a dev evaluating packages, it is a vacuous metric that either tells you what you already know, or would be better answered looking at literally anything else within that repo.
The article also called out how download count can be faked trivially. I admit I have relied upon this in the past by mistake. Release frequency I do use as one metric.
When I care about making decisions for a system that will ingest 50k-250k TPS or need to respond in sub-second timings (systems I have worked on multiple times), you can bet "looking at stars" is a useless metric.
For personal projects, it is equally useless.
I care about how many tutorials are online. And today, I care more about whether there were enough textual artifacts for the LLMs to usefully build it into their memory and to search on. I care if their docs are good so I spend fewer tokens burning through their codebase for APIs. I care if they resolve issues in a timely manner. I care if they have meaningful releases and not just garbage nothings every week.
I didn't mean for this to sound like a rant. But seriously, I just can't imagine any scenario where "I look at stars" is a useful metric. You want to add it to the list? Sure. That is fine. But it should not be a deciding factor. I have chosen libraries with fewer stars because they had better metrics on things I cared about, and it was the correct choice (I ended up needing to evaluate them both anyhow. But I had my preference from the start).
Choosing the wrong package will waste you so much more time. Spend the 5 minutes evaluating for stuff that is important to your project.
My first scan of a GitHub repository is typically: check age of latest commit, check star count, check age of project. All of these things can be gamed, but this weeds out the majority of the noise when looking for a package to serve my needs. If the use case is serious, proper due diligence follows.
There was nothing about going into the logs to see if they could do the game's mechanical challenges, minimizing their damage taken. It made for a worse environment yet the players couldn't stop themselves from using such criteria.
In short, humans are lazy and default to numbers and colors when given the chance. When others question them on it, they can have a default easy answer of being part of the herd of zebras to get out of trouble.
To be honest, these days I have more faith in an application or library with a moderate development pace where maybe the last commit wasn't 2 seconds ago co-authored by claude (in the most blatant examples).
The same is true for amount of commits, the type of commits, release cadence and the amount of fixes and hotfixes in releases. I don't feel like being a glorified alpha tester so I look for maturity in a project.
Which more often than not means that, yes there needs be activity. But, it is also fine if it was two days ago and there is a clear sign of the same pattern over a longer period. Combined with a stable release cycle, sane versioning and clear changelogs that aren't just a list of the last 10 commit messages.
On your point of stars, I think they used to be a valid metric in a similar category. Namely, community behind the software. But it has been a while since that was true. Ever since I saw these star tracking graphs pop up on repos, I knew that there was no sense in paying attention to them anymore.
For example, let’s say I want to run some piece of software that I’ve heard about, and let’s say I trust that the software isn’t malware because of its reputation.
Most of the time, I’d be installing the software from somewhere that’s not GitHub. A lot of package managers will let anyone upload malware with a name that’s very similar to the software I’m looking for, designed to fool people like me. I need to defend against that. If I can find a GitHub repo that has a ton of stars, I can generally assume that it’s the software I’m looking for, and not a fake imitator, and I can therefore trust the installation instructions in its readme.
Except this is also not 100% safe, because as mentioned in TFA, stars can be bought.
There are many other far more useful metrics to look at first, and to focus on first, and to think about. Every time you think about stars, you'll forget the other stuff, or discount it in favor of stars.
Forget stars. They now no longer mean anything. Even if they did before, they don't anymore.
In it they explicitly call it out as a ranking metric
> Many of GitHub's repository rankings depend on the number of stars a repository has. In addition, Explore GitHub shows popular repositories based on the number of stars they have.
Yet another case of metric -> target -> useless metric
But in an age of bots/agents, that's just kicking the can down the road by making it easier to fudge regular activity of practically zero importance. Even worse for the ecosystem than paid like counts.
You just create 5 GitHub accounts, and spread your Claude Code commits to 5 separate accounts to make it look like there's 5 active contributors.
If anything, we're better off with a fake star economy that is the main thing most people are trying to game, so the signal to noise can still be that it (at least so far) seems pretty easy to tell how many REAL active contributors there are.
Though, I should note, 2 heads are not always better than 1.
I'm more interested in a repository that has commits only from two geniuses than a repository that has 100s of morons contributing to it.
https://en.wikiquote.org/wiki/Napoleon
I do.
I don't review the whole repo, but every single time I update dep versions, I always look at the full diff between the two. It doesn't take that long
Same here. I've starred over 1500 projects on Github over the years, and only because I wanted to save them for later use or as a reference for something I was working on. These days I'll occasionally use the star metric as a signal to avoid certain projects as overhyped (especially if the project has a stars-per-day meter).
You might not have, but the makers of dependencies that you use might have, so it's still problematic.
I have limited time on this Earth and at my employer. My job is not critical to life. I am comfortable with this level of pragmatism.
It’s a bloom filter of sorts for finding the right library.
* Most recent commit
* Total number of commits
This might have to die in the era of AI, but it's served me well for a long time. Rather than how many people are paying attention, it tries to measure the effort put in.
Sadly that is probably true.
At the very least I'd add release cadence to it and the quality of releases. Mature, good software will have hotfixes and patch releases every now and then. But not in every release and certainly not 50% of the changes. In the same sense I will often look at the effort put into changelogs. If they took the effort of putting things in categories, writing about possible breaking changes, etc., it is a possible indicator of some level of quality. At the very least I will have a lot more faith in software with good changelogs compared to something that is just a list of the last N commit messages.
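As a rough illustration of the "hotfixes now and then, but not half the releases" idea, the share of patch releases can be counted from version tags. This sketch assumes the project tags releases as MAJOR.MINOR.PATCH (optionally prefixed with `v`); the function name is hypothetical, and what counts as "too many" is left to the reader.

```python
def patch_release_share(versions: list[str]) -> float:
    """Share of releases whose PATCH component is non-zero.

    Assumes plain MAJOR.MINOR.PATCH tags, optionally prefixed with "v".
    """
    if not versions:
        raise ValueError("need at least one version")
    patches = 0
    for v in versions:
        major, minor, patch = (int(part) for part in v.lstrip("v").split("."))
        if patch != 0:
            patches += 1
    return patches / len(versions)
```

A number like this says nothing about changelog quality, which still has to be read.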
It's only not meaningful because of how other people can game it and fabricate it, but everything you just said, if it was only people like you, that would be a very meaningful number.
It doesn't even matter why you bookmarked it, and it doesn't matter that whatever the reason was, it doesn't prove the project as a whole is overall good or useful. Maybe you bookmarked it because you hate it and you want to keep track of it for reference in your ted talk about examples of all the worst stuff you hate, but really by the numbers adding up everyone's bookmarks, the more likely is that you found something interesting. It doesn't even matter what was interesting or why. The entire project could be worthless and the thing you're bookmarking was nothing more than some markdown trick in the readme. That's fine. That counts. Or it's all terrible, not a single thing of value, and the only reason to bookmark it is because it's the only thing that turned up in a search. Even that counts, because that still shows they tried to work on something no one else even tried to work on.
It's like, it doesn't matter how little a given star means, it still does mean something, and the aggregation does actually mean something, except for the fact of fakes.
Yes...which is why I said it is an indirect variable, as caused by the other things I pointed out above. Age, quality, code, utility, whether issues are addressed, interest, etc. Or fraud. Pretty cut and dry.
FWIW, I almost never star repos. Even ones I use or like. I don't see the utility for myself.
Aim for a more concise post and don't couch your statements in doubt next time if you want a productive conversation, because I don't know what you are trying to say.
Six million fake stars is just what this small crew found, likely in a matter of hours.
A fine of $53,088 times six million is $318.528 billion.
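The arithmetic checks out. A quick sketch, taking the commenter's premise (one maximum fine per fake star) as given rather than as a statement of how penalties would actually be assessed:

```python
FINE_PER_VIOLATION_USD = 53_088  # figure quoted in the comment above
FAKE_STARS = 6_000_000           # "six million fake stars"

total_usd = FINE_PER_VIOLATION_USD * FAKE_STARS
print(f"${total_usd:,} (~{total_usd / 1e9:.3f} billion)")
# prints: $318,528,000,000 (~318.528 billion)
```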
Just going hard after a small portion of that should both put an end to it and a slight dent in the deficit.
This kind of fraud is rampant because everyone concludes the way to win is not to make a real advance, but to simply game the system. Seems they are not wrong because the lack of enforcement makes the rules meaningless.
Instead I look at (in addition to the above):
1. Who is the author? Is it just some person chasing Internet clout by making tons of 'cool' libraries across different domains? Or are they someone senior working in an industry sector from which project might actually benefit in expertise?
2. Is the author working alone? Are there regular contributors? Is there an established governance structure? Is the project going to survive one person getting bored / burning out / signing an NDA / dying?
3. Is the project style over substance? Did it introduce logos, discord channels, mascots too early? Is it trying too hard to become The New Hot Thing?
4. What are the project's dependencies? Is its dependency set conservative or is it going to cause supply chain problems down the line?
5. What's the project's development cadence? Is it shipping features and breaking APIs too fast? Has it ever done a patch release or backported fixes, or does it always live at the bleeding edge?
6. NEW ARRIVAL 2026! Is the project actually carefully crafted and well designed, or is it just LLM slop? Am I about to discover that even though it's a bunch of code it doesn't actually work?
7. If the project is security critical (handles auth, public facing protocol parsing, etc.): do a deeper dive into the code.
Build a SaaS and you'll have "journalists" asking if they can include you in their new "Top [your category] Apps in [current year]", you just have to pay $5k for first place, $3k for second, and so on (with a promotional discount for first place, since it's your first interaction).
You'll get "promoters" offering to grow your social media following, which is one reason companies may not even realize that some of their own top accounts and GitHub stars are mostly bots.
You'll get "talent scouts" claiming they can find you experts exactly in your niche, but in practice they just scrape and spam profiles with matching keywords on platforms like LinkedIn once you show interest, while simultaneously telling candidates that they work with companies that want them.
And in hiring, you'll see candidates sitting in interview farms quite clearly in East Asia, connecting through Washington D.C. IPs, present themselves with generic European names, with synthetic camera backgrounds, who somehow ace every question, and list experience with every technology you mention in the job post in their CVs already (not hyperbole, I've seen exactly this happen).
If a metric or signal matters, there is already an ecosystem built to fake it, and faking it starts to be operational and just another part of doing business.
Have an upvote. The first one is free.
https://www.xkcd.com/2899/
Short term, you pay the cost of fake signaling, which is simply deadweight loss. People spend resources to inflate signals instead of improving the actual thing.
Medium term, I suppose you could see how it increases consumption, since users would probably try something with 100k stars instead of 2. GitHub wants to seem more used than it really is, and the repo owner benefits too.
Long term, the correspondence between how important a (distorted) system is perceived to be (GitHub, OSS, IT in general) and how important it really is collapses quite abruptly and unnecessarily, and you end up with a lemon market [0] where signals stop being reliable at all.
[0] https://en.wikipedia.org/wiki/The_Market_for_Lemons
I'm increasingly convinced the issue isn't feedback itself, but centralized, global, aggregated feedback that becomes game-able without stronger identity signals.
Right now the incentives are tied (correctly or not) to these global metrics, so you get a market for faking them, with money flowing to whoever is best at juicing that signal.
If instead the signal was based on actual usage and attributions by actual developers, the incentives shift. With localized insight (think "Yeah, I like Golang") it becomes both harder to fake and harder to get at the metric rollup.
Useful reputation on the web is actually much more localized and personal. I gladly receive updates on and would support the repos I've starred. If I could choose where to put my dollars (I'm not an investor), it would likely include the list of repos I've personally curated.
This suggests a different direction: instead of asking "how many stars does this have?", ask "who is actually depending on this, and in what context?" or, better, retroactively compare your top-n repos to mine and we'll get a metric seen through our lenses. If you want to include everyone in that aggregation you'll end up where we are now, but if instead you choose the list, well, the stars could align as a good metric once more.
The interesting part is that the web already contains most of that information, we just don't treat identity as a part of the signal (yet? universally?).
What's more it became obvious to me two or so years ago that GitHub is going the way of LinkedIn slowly but surely. Lots of professionals on there just because it's expected of them, some interact occasionally with the "social media" aspect of it and fewer still really thrive on that part. Time will tell how this will pan out but just look how many Developer and Linux influencers became huge on YouTube and other places this last year. Most of them barely had more than 10k subscribers 3 years ago and now people look to them for their next tech stack and hot framework/tool/library/distro and so on.
We've recently decided to complicate life of AI bots in our repo https://archestra.ai/blog/only-responsible-ai, hoping they will just choose those AI startups who are easier to engage with.
[0] http://www.stat.yale.edu/~jtc5/papers/Ancestors.pdf [1] https://pubmed.ncbi.nlm.nih.gov/11542058/
Specifically someone submitted a library that was only several days old, clearly entirely AI generated, and not particularly well built.
I noted my concerns with listing said library in my reply declining to do so, among them that it had "zero stars". The author was very aggressive and in his rant of a reply asked how many stars he needed. I declined to answer, that's not how this works. Stars are a consideration, not the be all end all.
You need real world users and more importantly real notability. Not stars. The stars are irrelevant.
This conversation happened on GitHub and since then I have had other developers wander into that conversation and demand I set a star count definition for my "vague notability requirement". I'm not going to, it's intentionally vague. When a metric becomes a target it ceases to be a good metric as they say.
I don't want the page to get overly long, and if I just listed everything with X star count I'd certainly list some sort of malware.
I am under no obligation to list your library. Stop being rude.
https://en.wikipedia.org/wiki/Goodhart's_law
I've been thinking about this a lot. These metrics are all just marketing signals to draw people's attention, trying to make some kind of deals. So the fix should be: make the cost of the signal match what it claims to represent. I'm obsessed with something called DUKI /djuːki/ (Decentralized Universal Kindness Income, a form of UBI) — the idea is that instead of stars or reviews, trust comes from deals pledging real money to the world for all as the deal happens. You can't fake that cheaply.
So the metric becomes the money itself — if you fake X amount, it costs you X, and the world will thank you by paying attention...
Imagine if GitHub let you back a star with real money — the more you put in, the more credible the star. And that money goes out as UBI for everyone. For attention makers, star anything you want, as much as you want. For attention takers, just follow the money to filter through all the noise that's so easy to manipulate...
Well that's the WHOLE problem of trust. There is so much work on blockchain in proof of work, proof of stake, etc in order to protect ourselves from attacks, e.g. https://en.wikipedia.org/wiki/Sybil_attack
If you do find a way it would apply to a lot more than "just" GitHub star for VCs.
Are VCs just that lazy about making investment decisions? Is this yet another side-effect of ZIRP[2] and too much money chasing a return? Is nobody looking too hard in the hope of catching the next rocket to the moon?
From the outside, investing based on GitHub stars seems insane. Like, this can't be a serious way of investing money. If you told me you were going to invest my money based on GitHub stars, I'd laugh, and then we'd have an awkward silence while I realize there isn't a punchline coming.
[0] I'm from Cleveland. I get to pick on them.
[1] https://en.wikipedia.org/wiki/List_of_Cleveland_Browns_seaso... I think their record speaks for itself.
[2] https://en.wikipedia.org/wiki/Zero_interest-rate_policy
It's a bit like the old article about evaluating software companies on whether they have version control or not. Everyone has version control now.
The entire game of startup investing is to identify breakout companies early. Social proof (when valid, not faked) of interest is one of the strongest signals of product market fit.
If a product has a lot of attention (users, headlines, stars, downloads, DAU) that’s a signal that it could also have a lot of customers some day. This is also why all of those metrics are targets for manipulation.
> This would be like an NFL team drafting a quarterback based on how many instagram followers they have
Major sports teams are about engaging fans. If a promising recruit had a huge social media presence then that could be a contributing factor toward trying to recruit that player.
This is actually easier to understand if you look at the inverse: Sometimes there are players with amazing stats but who have a cloud of controversy following them. Teams will skip over these problematic players despite their performance because having popular and engaging players is important for teams but having anti-popular players will drive away fans.
Over here the fans would be singing "You're getting sacked in the morning" halfway through that first season.
I guess not having relegation makes things slightly less ruthless for you.
The owner of the Cleveland Browns uses the team to generate more revenue. For NFL teams, performance has little to do with their value or ability to generate additional revenue.
There is no strong financial incentive to win in the NFL, aside from the owner's ego. The Browns' owner's ego is driven by money, and the result shows on the field.
Like an allegory for performative capitalism in America. Profit and quality completely decoupled in the wake of market capture (rent seeking).
But if they don't care about winning, why bother getting good draft picks?
I am so glad the proposed "European super league" was killed off so hard, so that we don't get a franchise model, it produces so many adverse incentives.
I do find the European football (soccer) model of promotion and relegation to be much more interesting, both from the standpoint of culling out perennially hopeless teams from top-tier competition, and for having a place for people to play who aren't absolute superstars.
plus, what is an NFL fan going to do, stop watching football? hahahahahaha
The Haslams? Yeah, they should really sell the team, but I figure in about 10-15 years, they'll move it out of Cleveland.
Former Seahawks fan here, it's easier than you think. (It wasn't their record, I stuck with them through the 90s after all, it was realizing what CTE meant for the players).
Regardless, a coach is given some leeway their first season. They were coming off a 3-13 season, so 1-15 isn't that much of a drop. Jackson could make the case that he needed another season to build his ideal roster.
Then after going 0-16, they were on track to get Mayfield. He could have made the case that if he can't win with Mayfield, then maybe he just can't win.
Then he didn't win with Mayfield.
It’s not that they’re necessarily careless, it’s just that the bigger the net the more fish you catch. And when you own both all the fishing boats and all the nets…might as well cast wide.
And once it gets out that it’s a selection criterion it gets gamed to hell and back.
[*] https://en.wikipedia.org/wiki/Goodhart%27s_law
Once surfaced, there’s other signals to filter if an initial conversation is even worth it.
Assuming everyone else is just stupid and it’s all luck is a good way to hold yourself back from your potential.
sounds like how the ufc does it
Not quite the same, but the New York Jets (one of the few NFL teams that can match the dysfunction of the Browns — they have the longest active playoff drought in big 4 North American sports) passed on a few successful players because the owner, Woody Johnson, reportedly didn't like their Madden (video game) ratings [0]:
> A few weeks later, Douglas and his Broncos counterpart, George Paton, were deep in negotiations for a trade that would have sent Jeudy to the Jets and given future Hall of Fame quarterback Aaron Rodgers another potential playmaker. The Broncos felt a deal was near. Then, abruptly, it all fell apart. In Denver’s executive offices, they couldn’t believe the reason why.
> Douglas told the Broncos that Johnson didn’t want to make the trade because the owner felt Jeudy’s player rating in “Madden NFL,” the popular video game, wasn’t high enough, according to multiple league sources. The Broncos ultimately traded the receiver to the Cleveland Browns. Last Sunday, Jeudy crossed the 1,000-yard receiving mark for the first time in his career.
...
> Johnson’s reference to Jeudy’s “Madden” rating was, to some in the Jets’ organization, a sign of Brick and Jack’s influence. Another example came when Johnson pushed back on signing free-agent guard John Simpson due to a lackluster “awareness” rating in Madden. The Jets signed Simpson anyway, and he has had a solid season: Pro Football Focus currently has him graded as the eighth-best guard in the NFL.
[0] https://www.nytimes.com/athletic/6005172/2024/12/19/woody-jo...
Union Labs is the most consequential case. It was ranked #1 on Runa Capital's ROSS Index for Q2 2025 - a widely cited VC industry report identifying the "hottest open-source startups" - with 54.2x star growth and 74,300 stars. Our analysis found 32.7% zero-repo accounts, 52% zero-follower accounts, and a fork-to-star ratio of 0.052. The StarScout analysis flagged it with 47.4% suspected fake stars. An influential investment-sourcing report that VCs rely on was topped by a project with nearly half its stars suspected as artificial.
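For readers wondering how figures like "zero-repo accounts", "zero-follower accounts" and "fork-to-star ratio" could be computed, here is a minimal sketch. The article's actual methodology is not described in this thread, so the `Stargazer` fields and the `profile_stats` function are hypothetical, and no threshold for "suspicious" is implied.

```python
from dataclasses import dataclass


@dataclass
class Stargazer:
    # Hypothetical per-account fields for accounts that starred a repo.
    public_repos: int
    followers: int


def profile_stats(stargazers: list[Stargazer], forks: int) -> dict[str, float]:
    """Compute simple shares over a repo's stargazers plus its fork-to-star ratio."""
    n = len(stargazers)
    if n == 0:
        raise ValueError("need at least one stargazer")
    zero_repo = sum(1 for s in stargazers if s.public_repos == 0)
    zero_follower = sum(1 for s in stargazers if s.followers == 0)
    return {
        "zero_repo_share": zero_repo / n,
        "zero_follower_share": zero_follower / n,
        "fork_to_star_ratio": forks / n,
    }
```

On their own these ratios prove nothing about any single account; they only describe the population of stargazers.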
Record labels did this with soundcloud or only picking up people who already had a following. Movies have done this repeatedly with adaptation (due respect to people who like their books remade faithfully, there's a reason the 70's/80's were that decade and it basically stopped once Comics/LotR arrived). A24 represents a disruption, not a normal studio. Books did this with webnovels or paying dirt cheap and making the Author market.
What people tend to forget is they're just resource gatekeepers. They could just choose to invest in offices with cats because an office with cats popped massively one time and you can't say they're wrong, because there's no alternative funding you can get to A/B test with. In theory there are different firms - but they often went to the same schools, same peer group, same fraternity/sororities, and once they're in the wild they all know each other. It's not a different behavior if it's VC or if it's Nashville.
The real question is how long before either governmental-busting or someone notices the lack of care with money and shops alternatives. In theory this is also partially why American firms face international risk - lacking people respecting their laziness, someone can break their model.
That said - I'm not saying they're not smart - just that often there's a tendency to delude that shortcuts taken represent a "good job" rather than "no one can really say we're doing it poorly".
Nevertheless, VCs are in fact pretty dumb sometimes and it'd be stupid to invest solely based on stars.
The only claim here is that there’s a report that tracks GitHub star growth that is presumably read by VCs:
>> Union Labs is the most consequential case. It was ranked #1 on Runa Capital's ROSS Index for Q2 2025 - a widely cited VC industry report identifying the "hottest open-source startups" - with 54.2x star growth and 74,300 stars. Our analysis found 32.7% zero-repo accounts, 52% zero-follower accounts, and a fork-to-star ratio of 0.052. The StarScout analysis flagged it with 47.4% suspected fake stars. An influential investment-sourcing report that VCs rely on was topped by a project with nearly half its stars suspected as artificial.
Again, the claim here is that the report is “influential”. Maybe?
Perhaps you don't owe morons, VCs, or rich people better, but you owe this community better if you're participating in it.
and it's not just ZIRP. every recent IPO or liquidity event creates literally 500 more of these guys.
Hold up — one can be mature without any of those things, but cars are especially optional.
VCs themselves probably suffer from chronic overestimation of their own intelligence, but there just aren't many good signals at the stage of companies they're looking at. No customers, no revenue; often just an idea and hopefully a prototype. GitHub stars are as good of a signal as letters of intent, which is to say: a bad signal, but at least a signal. Other than that, they have to just evaluate what the founders are telling them (generally unrealistically optimistic at best) and whatever market research they can do (which is hard enough for the founders to do for their own product; imagine doing this for a dozen different companies every day).
Of course GitHub stars are a terrible signal, but the bar for signal quality is just really low.
These people go to the extreme and feel they have to outdo each other in an arms race to win whatever category it is today.
You can have extreme ambitions without being a moron. It's possible for someone to be empathetic, but also really driven. The problem is that they are locked in a downward spiral and they can't possibly be vulnerable. It's only when they run out of money, or some other extreme event occurs that they change tack. That's moronic, especially when the outcomes are predictable.
There is a lot to be said about SV culture and the people that surround these VCs. A lot of people love these environments and more than tolerate the environment these VC folks create. It's hardly a new phenomenon.
speak for yourself, I guess? Some people know things in many areas. But even if they are not experts outside of their areas of expertise, they may recognize their limitations in other areas and thus avoid making costly mistakes. This may even be the rule for adults, rather than the exception.
Using things like github stars is clearly stupid, but not in the way you're suggesting. They're using the GH stars as a proxy metric for "someone else will come along and give money bags to this person later, so I should get in early so I can take that money eventually."
They're operating on a metric of success that is about influence, charisma, and connectedness, not revenue or technical excellence.
Again, VCs don't care if you'll make a profitable business some day. They're just interested in if someone else will come along and pay out giant bags of cash for it later in a liquidity event. If they get even one of those successes, all the stupid GH star watching pays off.
Here's another way of framing it: any harms from the false positives around "He has a lot of GH stars" or "He went to Stanford" or "I know his father at the country club" are more than mitigated by the one exit in 1000 that makes a bunch of people filthy rich.
We shouldn't expect VCs to be something they're not. But we are missing something in between VCs and "self financing" and "bootstrapping".
And if that's true, they should be slapped, hard. They're no longer performing a socially useful function, and have degraded towards pure financialization. Some middleman between fools and their money.
As much as I don't like Altman, VC should be pumping money into startups like Helios--companies pursuing cutting-edge technology that could totally fail (yes, that's an organic em-dash).
If you mentally say “well 90% fail so I’ll just throw in this dog shit to see what happens” then you increase the failure rate.
- VCs definitely cared about our Stars, especially in early stages, but not as our primary metric. I suppose Stars might be the primary metric if they're truly off the charts, but usually they're just one of many social proof signals an investor might look at.
- Investors, especially at the earliest stages, are quite a varied bunch. Some were diligent about looking at who was leaving Stars on the repo (i.e. are these accounts fake/do they belong to potential future customers). Some less so. This is true for basically every metric (see: startups that grossly misreport ARR)
- Fake GitHub stars were a thing way before 2022. I'd have to look in more detail at the methodology here, but I'd question any analysis that finds that paying for GitHub Stars (or any social following kind of metric) is a strictly post-2022 thing. Any metric that can be construed as social proof will immediately have its own grifter economy. Investors know this and (mostly) do their diligence.
Finally, showing numbers is hard for an early stage open source startup. At later stages, you should be able to show an actual business with typical metrics, but at the seed stage you often just have a repo and a website. Your goal is just to get a lot of people using your software. You can add telemetry to track that, but that's a thorny decision. GitHub Stars aren't a terrible proxy for popularity, provided that you audit the quality of the following. A project with a lot of organic stars and forks is, at the very least, a project that a lot of people are familiar with.
I'm not saying that GitHub Stars aren't wildly overvalued or gamed, but contextualized properly, they're a reasonable metric to consider, particularly at earlier stages. Most investors aren't just throwing millions at random repositories with 20k Stars from obviously spam accounts.
The market is completely flooded and promoters cannot practically sift through the sheer volume of mixes published online -- so they go by Internet points instead.
Github stars used to really mean something. Having 1k+ was considered a stable, mature library being used in prod by thousands of people. At 10k+ you were a top level open source project. Now they've been gamed by the dead internet just like everything else, and it's depressing as hell.
I believe that is how they made the final decision on Watson over Mayfield. Oh, wait, I don't think anything can explain that decision.
Also from Cleveland.
Go Guardians! Go Cavs!
I agree it's idiotic; I'm quite confident that it wouldn't be that hard to cheat this system, and even if there were absolutely no way to cheat the system, it's not like Hacker News points translate to smartness; my most upvoted posts have basically nothing to do with software engineering.
If you want to make it as an actor today, you need a social media following [1]. It is directly relevant to you getting cast. It also helps you connect with other actors, with producers and directors, etc.
Thing is, this isn't new. Before social media, your influence was measured in "tear sheets" [2], basically any published story that you're in. This could be something as simple as going to Cannes or Sundance or even just to the hottest club.
Sports also uses a points system (kind of) but it's meant to reflect ability. Take the NFL, for example. Going from high school to college and college to the NFL, you will have stats relevant to whatever position(s) you play. For a QB it's things like interceptions, passing yards, running yards, completed throw percentages, etc. You then have the NFL Combine [3]. This is an intensive camp where certain metrics are taken like how much you can lift, 40 yard dash, etc.
All of this tries to make it a science, or at least quantitative. But what I find funny is that despite all this work, it can still fail spectacularly. Like, being the #1 draft pick for the NFL is kind of a curse [4].
And then there's Tom Brady. For people unfamiliar with American sportsball, Tom Brady is arguably the greatest quarterback in the game's history, having 7 Super Bowl rings. Thing is, he was a 6th round draft pick in the 2000 NFL draft. For anyone not familiar with what that means, 6th (and especially 7th) round draft picks are like the bottom of the barrel. You're not expected to take a starting position. You may not even play unless 1-2 people get injured. Nobody expects you to be great.
[1]: https://www.backstage.com/magazine/article/social-media-acto...
[2]: https://avenueagency.wordpress.com/tag/tear-sheet
[3]: https://www.nfl.com/combine
[4]: https://www.si.com/more-sports/2011/01/13/sportscasting-exce...
It's purely incentives. Heavy competition for early signal identification has pushed them to crappier and crappier indicators.
Yes actually
Needless to say they didn’t like when I said this was a worthless metric and we needed to be using something like “working policies” or “time saved training”
I just wanted to build a good product but unfortunately good products are not relevant
There were no complementary workflows or infrastructure or anything.
It was explicitly a move to try to counter epic’s positioning and internally it was very obviously a JR versus Tim pissing contest (and JR was the only one in the contest because Tim didn’t give a fuck about Unity)
I have personally seen several company CEOs (that were billionaires!) do this in different ways. Sometimes hiring people because of it.
I think as a proxy it fails completely: astroturfing aside, stars don't guarantee popularity (and I bet the correlation is very weak; a lot of very fundamental system libraries have a small number of stars). Stars also don't guarantee quality.
And given that you can read the code, stars seem to be a completely pointless proxy. I'm teaching myself to skip the stars and skim through the code and evaluate the quality of both architecture and implementation. And I found that quite a few times I prefer a less-"starry" alternative after looking directly at the repo content.
Imagine you're choosing between 3 different alternatives, and each is 100,000 LOC. Is 'reading the code' really an option? You need a proxy.
Stars isn't a good one because it's an untrusted source. Something like a referral would be much better, but in a space where your network doesn't have much knowledge a proxy like stars is the only option.
100k is small, but you're right, it can be millions. I usually skim through the code tho, and it's not that hard. I don't need to fully read and understand the code.
What I look at is: high-level architecture (is there any, is it modular or one big lump of code, how modular it is, what kind of modules and components it has and how they interact), code quality (structuring, naming, aesthetics), bus factor (how many people contribute and understand the code base).
Looking at the commit history, closed vs open issues and pull requests provides a much more useful signal if you can't decide from the code.
(Sometimes still is, but the agents garbage does not help)
If the number of stars are in the thousands, tens of thousands, or hundreds of thousands, that might correlate with a serious project. But that should be visible by real, costly activity such as issues, PRs, discussion and activity.
(This was for admissions iirc - they had limited slots and a portion of them were allocated to people with a strong github rank.)
It is the meaning of having dozens or hundreds of stars that is undermined by the practice described at the linked post.
https://en.wikipedia.org/wiki/Goodhart%27s_law
That said, I believe the core problem is that GitHub belongs to Microsoft, and so it will keep drifting towards operating like a social network, i.e. engagement matters. It will take a good deal of will to get rid of Social Network Disease at scale.
There are much better ways of finding those who have good taste.
Two projects could look exactly the same from visible metrics, and one is a complete shell and the other a great project.
But they choose not to publish it.
And those same private signals more effectively spot the signal-rich stargazers than PageRank.
https://web.archive.org/web/20170715120119/http://advogato.o...
They don't.
I've helped with due diligence on a couple projects. VCs know that metrics can be gamed because they see it all the time. Stars, followers, views, clicks, likes. A portion of entrepreneurs have been gaming every metric since before you and I learned how to program. It has always been this way and always will.
Most of the VC-related comments have interpreted this article to mean that VCs are so dumb that they haven't realized that stars can be faked, but in reality VCs spend so much time sorting through fake metrics that they understand this probably better than most here.
If you've ever gone through due diligence for an acquisition or big investment round it's amazing how much work you have to do in order to prove that your metrics are real. When things got crazy after COVID there was a short time when VCs were trying to move so fast that they skipped this, but it resulted in some high profile fraud cases.
During normal times, you will get grilled on metrics. They might see stars as a signal for rising stars, but they're not throwing money at projects based on star count like many commenters assume. They will do a deeper dive before investing and they will call it off if things aren't adding up. The amount of diligence scales with the investment, so someone getting a $10K check can get away with a lot of fraud but that $2mm funding round isn't going to cross the finish line based on star count.
The fake accounts often star my old repos to look like real users. They are usually very sketchy if you think for a minute, for example starring 5,000 projects in a month and no other GitHub activity. One time I found a GitHub Sponsor ring, which must be a money laundering / stolen credit cards thing?
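The pattern described above (a burst of stars with no other activity) can be written down as a simple filter. This is only a sketch: the `AccountActivity` record and the `looks_like_star_bot` helper are hypothetical names, the 5,000-star threshold is taken from the example in the comment, and the per-account counts are assumed to have been collected already by whatever means you have.

```python
from dataclasses import dataclass


@dataclass
class AccountActivity:
    """Hypothetical 30-day activity summary for one account."""
    login: str
    stars_last_30_days: int
    other_events_last_30_days: int  # commits, issues, PRs, comments


def looks_like_star_bot(acct: AccountActivity, max_stars: int = 5000) -> bool:
    """Flag accounts that star at bulk volume and do nothing else.

    The default threshold mirrors the "5,000 projects in a month" example;
    it is an illustrative assumption, not a calibrated cutoff.
    """
    return (
        acct.stars_last_30_days >= max_stars
        and acct.other_events_last_30_days == 0
    )


if __name__ == "__main__":
    sample = [
        AccountActivity("bulk-starrer", 5000, 0),
        AccountActivity("busy-human", 5200, 40),
        AccountActivity("casual-user", 12, 3),
    ]
    for acct in sample:
        print(acct.login, looks_like_star_bot(acct))
```

A real detector would need a much lower threshold and more signals; the point is only that the "sketchy if you think for a minute" check is cheap to automate.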
Even 10 years ago most VCs we spoke to had wised up and discarded GitHub stars as a vanity metric.
GitHub should also introduce a way to bookmark a repo, additional to the existing options of sponsor/watch/fork/star-ing it.
one VC told me, you'll get more funding and upvotes if u don't put "india" in your username.
Founders need the ability to get traction, so if a VC gets a pitch and the project's repo has 0 stars, that's a strong signal that this specific team is just not able to put themselves out there, or that what they're making doesn't resonate with anyone.
When I mentioned that a small feature I shared got 3k views when I just mentioned it on Reddit, then investors' ears perked right up and I bet you're thinking "I wonder what that is, I'd like to see that!" People like to see things that are popular.
By the way, congrats on 200 stars on your project, I think that is definitely a solid indicator of interest and quality, and I doubt investors would ignore it.
I think VCs just know that there are no reliable systems, so they go with whatever's used.
- link: https://github.com/pathwaycom/pathway
- watch: 115, fork: 1.6k, star: 63.5k
- issues: 32, PR-s: 3
Compare that with another ETL tool, Apache Airflow, which I and many machine learning folks use:
- link: https://github.com/apache/airflow
- watch: 777, fork: 16.9k (!), star: 45.1k (only!)
- issues: 1200 (!), PR-s: 501 (!)
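To make the contrast concrete, here is a small sketch that normalizes the figures quoted above per thousand stars. The numbers are copied from the comment (the "k" values are rounded, so the results are approximate), and `per_thousand_stars` is just an illustrative helper name.

```python
# Figures as quoted in the comment; "1.6k" etc. are rounded values.
repos = {
    "pathwaycom/pathway": {
        "watch": 115, "forks": 1_600, "stars": 63_500, "issues": 32, "prs": 3,
    },
    "apache/airflow": {
        "watch": 777, "forks": 16_900, "stars": 45_100, "issues": 1_200, "prs": 501,
    },
}


def per_thousand_stars(stats: dict) -> dict:
    """Express every non-star counter as a rate per 1,000 stars."""
    scale = 1000 / stats["stars"]
    return {
        name: round(value * scale, 1)
        for name, value in stats.items()
        if name != "stars"
    }


if __name__ == "__main__":
    for name, stats in repos.items():
        print(name, per_thousand_stars(stats))
```

On these numbers pathway comes out at roughly 25 forks per thousand stars against roughly 375 for Airflow, and the gap in open issues and PRs per star is even wider.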
It’s easy to dunk on VCs, but the herd effect is rational after considering the typical VC’s background, the intense competition for good deals, and the job requirements — to prudently deploy capital.
Who wants to pitch their boss on investing $1-10M in a product no one uses, built by a team of anons?
This is not to defend the process, but merely explain it. It’s not so different from customer marketing. To win a VC, first understand the VC.
Once hired, VCs cannot easily get fired yet they exert immense strategic control.
Nonetheless, many founders interview summer interns harder than VCs.
Heuristic: after removing capital, would you hire the VC to be your boss?
Great VCs are worth the equity and will turbocharge startups. When you find one, don't haggle. Get a fair deal, and get right back to coding.
Bad VCs will destroy companies the same way soccer stars would destroy basketball teams if made the head coach.
you instantly got like 40k likes - but there was a catch
algorithm saw you getting a lot of likes from Iran/Pakistan, so went on recommending the post to those countries, got no response and stopped recommending said post altogether
in a sense, it became a self-regulating system, where fake impressions extinguish their very reason to be bought
In general, I’ve been dissatisfied with GitHub’s code search. It would be nice to see innovation here.
You'd want to discard a lot of the noise in the bottom 20% of linking power. You want to focus more on the 'trust' factor.
* https://arxiv.org/abs/2412.13459 (2024/2025) - Six Million (Suspected) Fake Stars in GitHub: A Growing Spiral of Popularity Contests, Spams, and Malware
Why am I not surprised big Capital corrupts everything. Also, Goodhart's law applies again: "When a measure becomes a target, it ceases to be a good measure".
HN Folks: What reliable, diverse signals do you use to quickly eval a repo's quality? For me it is: Maintenance status, age, elegance of API and maybe commit history.
PS: From the article:
> instead tracks unique monthly contributor activity - anyone who created an issue, comment, PR, or commit. Fewer than 5% of top 10,000 projects ever exceeded 250 monthly contributors; only 2% sustained it across six months.
> [...] recommends five metrics that correlate with real adoption: package downloads, issue quality (production edge cases from real users), contributor retention (time to second PR), community discussion depth, and usage telemetry.
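The "unique monthly contributor activity" metric quoted above is easy to sketch. This assumes you already have a list of `(login, date)` pairs, one per issue, comment, PR, or commit; how you collect them is out of scope here. The function names are hypothetical, and reading "sustained it across six months" as six consecutive calendar months is my assumption, not something the article spells out.

```python
from collections import defaultdict
from datetime import date


def monthly_unique_contributors(events):
    """Count distinct logins per (year, month).

    `events` is an iterable of (login, date) pairs, one per issue,
    comment, PR, or commit.
    """
    actors = defaultdict(set)
    for login, day in events:
        actors[(day.year, day.month)].add(login)
    return {month: len(logins) for month, logins in sorted(actors.items())}


def sustained(counts, threshold=250, months=6):
    """True if `months` consecutive calendar months all reach `threshold`."""
    run, prev = 0, None
    for (year, month), n in sorted(counts.items()):
        index = year * 12 + month  # makes Dec -> Jan consecutive
        if n < threshold:
            run = 0
        elif prev is not None and index == prev + 1 and run > 0:
            run += 1
        else:
            run = 1  # first qualifying month, or a gap reset the streak
        prev = index
        if run >= months:
            return True
    return False


if __name__ == "__main__":
    events = [
        ("alice", date(2025, 1, 3)),
        ("alice", date(2025, 1, 9)),   # same person, same month: counted once
        ("bob", date(2025, 1, 20)),
        ("alice", date(2025, 2, 1)),
    ]
    counts = monthly_unique_contributors(events)
    print(counts)
    print(sustained(counts, threshold=1, months=2))
```

Counting people instead of events is what makes this harder to inflate than stars: one account filing a hundred issues still counts once per month.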
Finding any curse words in hidden comments in the commit history is for me a good indication of a human working on a passion project, though ymmv.
And there are always exceptions to the exception of the exceptions.
In my opinion, nothing could be more wrong. GitHub's own ratings are easily manipulated and measure not necessarily the quality of the project itself, but rather its popularity. The problem is that popularity is rarely directly proportional to the quality of the project itself.
I'm building a product, and I'm seeing that what matters is distribution and communication rather than the development itself.
Unfortunately, a project's popularity is often directly proportional to the communication "built" around it and inversely proportional to its actual quality. This isn't always the case, but it often is.
Moreover, adopting effective and objective project evaluation tools is quite expensive for VCs.
I'm not supporting this view but it is what it is unfortunately.
VCs that invest based on stars either know something, I guess, or are just bad investors.
IMO choosing projects based on star count is terrible engineering practice.
Surely a project's popularity is often related to its utility. A useful and popular project seems like exactly the kind of thing a VC might be interested in.
Hype helps raise funds, of course, and sells, of course.
But it doesn't necessarily lead to long-term sustainability of investments.
It’s more expensive to compute, but the resulting scores would be more trustworthy unless I’m missing something.
https://github.com/karakeep-app/karakeep
Sounds useful.
I’ll star it and check it out later ;)
Unfortunately I still look at them, too, out of habit: The project or repo's star count _was_ a first filter in the past, and we must keep in mind it no longer is.
> Good reminder that everything gets gamed given the incentives.
Also known as Goodhart's law [1]: "When a measure becomes a target, it ceases to be a good measure".
Essentially, VCs screwed this one up for the rest of us, I think?
[1] https://en.wikipedia.org/wiki/Goodhart%27s_law
I'd suggest the first question to ask is "is the project an AI project or not?" If it is, don't pay attention to the stars; if it's not, use the stars as a first filter. That's the way I analyse projects on GitHub now.
I agree that it has been a first filter, but should it ever have been? A star only says that someone had a passing interest in a project. Not significantly different from a 'like' on a social media post.
As a side note, it's kind of disheartening that every time there is a metric related to popularity, some among us will try to game it for profit, basically to manipulate our natural bias.
As a side note, it's always a bit sad how the parasocial nature of the modern web makes us act like machines interfacing via simple widgets: we become mechanical robots ourselves, rationalising IO via simple metrics and forgetting that the map is never the territory.
Specifically if those avatars are cute anime girls.
I know you are half joking/not joking, but this is definitely a golden signal.
It’s supposed to get people to actually try your product. If they like it, they star it. Simple.
At that point, forcing the action just inflates numbers and strips them of any meaning.
Gaming stars to set it as a positive signal for the product to showcase is just SHIT.
It does feel like everything is a scam nowadays though. All the numbers seem fake; whether it's number of users, number of likes, number of stars, amount of money, number of re-tweets, number of shares issued, market cap... Maybe it's time we focus on qualitative metrics instead?
I measure my own projects by the enjoyment I got out of them. No sense in chasing validation from others when ones only metric will forever be what’s in their own control.
Is he really? I’ve only heard of him because HN is obsessed with his “AI” takes. Is he really that popular outside of this bubble?
We should do a hall of shame!
I'd give a lot of credit to Microsoft and the Github team if they went on a major ban/star removal wave of affected repos, akin to how Valve occasionally does a major sweep across CSGO2 banning verified cheaters.
For Microsoft this is another kind of sunk cost, so idk how much incentive they have to fix this situation.
My first Open Source project easily got off the ground just by being listed in SourceForge.
I am not successful at all with my current projects (admittedly I am not trying to be nowadays), so feel free to dismiss this advice, which predates LLM-driven development. But in the past I had decent success in forums, interacting with people who had a specific problem my project addressed. Less in stars, more in actual exchange of helpful contributions.
On Github stars, I'd argue they are the most suitable comparison, as all the funny business regarding stars should be, if at all, detectable by Github directly and ideally, bans would have the biggest deterrent effect, if they happened in larger waves, allowing the community to see who did engage in fraudulent behaviour.
Easily 1-3k stars per hackathon from students or hackathon participants for a cost of $1-5k. And some free marketing comes with it too, since participants may post on LinkedIn or other social media if they win something.
Github stars is akin to 'link popularity' or 'pagerank' which is ripe for abuse.
One way around it is to trust well known authors/users more. But it's hard to verify who is who. And accounts get bought/closed/hacked.
Another way is to hand over the algo in a way where individuals and groups can shape it, so there's no universal answer to everyone.
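A minimal sketch of both ideas, assuming each viewer keeps their own map from login to a trust weight. All names and weights here are hypothetical, and unknown accounts default to zero so purchased stars from throwaway accounts contribute nothing.

```python
def weighted_star_score(stargazers, trust, default=0.0):
    """Sum per-account trust weights instead of counting every star as 1.

    `trust` maps login -> weight and is supplied by the viewer, so two
    people can get different scores for the same repo.
    """
    return sum(trust.get(login, default) for login in stargazers)


if __name__ == "__main__":
    stargazers = ["alice", "bob", "bot-0001", "bot-0002"]

    # Two hypothetical viewers with different ideas of whom to trust.
    my_trust = {"alice": 1.0, "bob": 0.5}
    your_trust = {"bob": 1.0}

    print(weighted_star_score(stargazers, my_trust))
    print(weighted_star_score(stargazers, your_trust))
    print(len(stargazers))  # the raw count, for comparison
```

This does nothing about the hard part mentioned above (verifying who is who, and accounts being bought or hacked); it only shows that once you have trust weights, there is no longer one universal number to game.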
Need to move from skill downloads to skill usage.
They make it easier to sort through options, help with search and discovery, and at least give you a baseline signal for trust that can get better over time.
So to me, some signal is better than no signal at all.
- I think the zero follower account might be the weakest signal of a low quality account, I think I had zero followers for maybe 5+ years.
I guess it's like fake followers on other social media platforms.
To me, it just reflects a behaviour that is typical of humans: in many situations, we make decisions in fields we don't understand, so we evaluate things poorly.
https://www.youtube.com/@programmersarealsohuman5909
I paid github for years to keep my repos private...
But then I don't participate in the stars "economy" anyway, I don't star and I don't count stars, so I'm probably irrelevant for this study.
Stars only matter when there are very few, like if it has almost none, that’s a red flag. Otherwise it’s just noise.
The way to beautify the pig is to put lipstick on the pig!
This is just clearly...incorrect? You can both modify code without forking it and most software is distributed via a registry or binary download, which also wouldn't be represented in forks. For most projects, the number of forks is a lossy signal for how busy the contributor ecosystem is, nothing else.
Now that money is flowing to Github stars, no wonder people are buying fake "stars"? Seems capitalism is working as expected...
"We ran our own analysis sampling 150 profiles per repo across 20 projects and found repos where 36-76% of stargazers have zero followers and fork-to-star ratios 10x below organic baselines"
This does not look like an appropriate signal to use on GitHub, and I doubt that this is an organic baseline. If this is used as a metric, then the study might be flawed.
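Separately from whether zero followers is the right signal, a 150-profile sample also carries sampling error. A rough sketch using the standard Wilson score interval shows how wide the uncertainty on such a share is; the 54-of-150 input is a made-up illustration of a 36% sample share, not data from the article.

```python
import math


def wilson_interval(hits: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Hypothetical sample: 54 of 150 sampled stargazers have zero followers.
    low, high = wilson_interval(54, 150)
    print(f"{low:.3f} .. {high:.3f}")
```

With 150 profiles the interval is roughly plus or minus 8 percentage points, so a per-repo figure is fine for telling 36% from 76% but not for fine-grained comparison against a baseline.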
We figured out a workaround to limit activity to prior contributors only, and add a CI job that pushes a coauthored commit after passing captcha on our website. It cut the AI slop by 90%. Full write-up https://archestra.ai/blog/only-responsible-ai
> When nobody is forking a 157,000-star repository, nobody is using it
That is completely untrue. I don't fork a repo when I use it, only when I want to contribute to it (and I usually clean up my forks).
> Runa Capital publishes the ROSS (Runa Open Source Startup) Index quarterly, ranking the 20 fastest-growing open-source startups by GitHub star growth rate. Per TechCrunch, 68% of ROSS Index startups that attracted investment did so at seed stage, with $169 million raised across tracked rounds. GitHub itself, through its GitHub Fund partnership with M12 (Microsoft's VC arm), commits $10 million annually to invest in 8-10 open-source companies at pre-seed/seed stages based partly on platform traction.
This all smells like BS. If you are going to do an analysis, you need to do some sound maths on the amount of investment a project gets in relation to its GitHub stars.
All this says is that stars are considered in some ways, which is very far from saying that you get the fake stars and then you get investment.
This smells like bait for hating on people that get investment
...the "Likes" on a post - on FB, twttr, LI, HN, ...
...the "Hearts" on post
...the "bookmarks" on a post
...the "upvotes"
...its corollary, the "downvotes"
...the fake dollars in your fake game
...the fake lives in your fav fantasy game
...ad inf
Download counters are abused similarly and are even easier to inflate.
Understanding the real popularity of a project is now even harder with all the AI bots spamming about it.
but i think based on their statement that north of 90% of the buying repos were terminated by github, i'd say there would be very very many more fake stars without any github intervention.
i guess i just wish they hadn't made the first words of the article "Six million fake stars" without putting that into scale.
The thing is, they are all scammers whose emails go unopened… and the tragic thing is, most likely the VCs would require the same treatment if they did get all hyped up and try to get involved in my project.
There is nobody real who's desperately trying to reach me to extend a line of business credit. I'm not working in AI, rather the opposite, was not in crypto, etc etc, so I know it is just email scams from beginning to end, dozens every day.
It's kind of pitiful that if VCs tried to jump in, they would be indistinguishable from the scams.
https://claude.com/contact-sales/claude-for-oss
> Who should apply:
> You’re a primary maintainer or core team member of a public repo with 5,000+ GitHub stars
I can't blame people for maximizing star counts when benefits like these are tied to them. This is a $200 a month subscription, and it did tempt me a bit... Can't imagine what people would do if some venture capitalist dangled millions in front of them. I suppose they'd do pretty much anything.
It's weird that people are using stars as a signal though. Anyone can star a repository, it's essentially a public bookmark. I think the real popularity signal is the number of people participating in the project.
> As one commenter put it: "You can fake a star count, but you can't fake a bug fix that saves someone's weekend."
I'm curious what the research says here: can you actually structurally undermine the gamification of social influence scores? And I'm pretty sure fake bugfixes are almost trivial for LLMs to generate.
“gstack is not a hypothetical. It’s a product with real users:
75,000+ GitHub stars in 5 weeks
14,965 unique installations (opt-in telemetry, so real number is at least 2x higher)
305,309 skill invocations recorded since January 2026
~7,000 weekly active users at peak”
GitHub stars are a meaningless metric but I don’t think a high star count necessarily indicates bought stars. I don’t think Garry is buying stars for his project.
People star things because they want to be seen as part of the in-crowd, who knows about this magical futuristic technology, not because they care to use it.
Some companies are buying stars, sure, but the methodology for identifying it in this article is bad.
Just look at how many cool and legit open projects have the star-meter graph in their README.md - so of course people will start measuring against that metric and start gaming it.
I was surprised myself when I suddenly saw a starstruck badge on my profile. I never advertise my projects but I do feel honored when people think that my contributions are useful and stars are an easy way of showing that gratitude. At least I think that's how it was intended. And now someone is breaking that for scraps (or not scraps.)
This is exactly the bs that pushes services to not offer their own logins anymore, now you have to login with FB or GH or $randomFamousSvc instead of the more anonymous "by email" - just happened to me recently when I wanted to use a trial account, but I totally get it - with abuse trust is substituted with control. It's the same everywhere.. even voter ID.
Sorry, that went off track. I guess just don't look at the stars anymore. Wait, no, don't do that, stars are beautiful and so are you if you read all the way to here. Here's a * for you :)