Show HN: Superlog (YC P26) – Observability that installs itself and fixes bugs

MMagnanten•1 day ago•45 comments


Hey HN, we’re Nico and Arseniy, co-founders of Superlog (https://superlog.sh). We're building a self-installing, self-healing observability tool that's meant not to be opened. It has a wizard that runs daily to set up proper logging and an agent that investigates errors and opens PRs.

Super short demo: https://www.youtube.com/watch?v=xFhU9Mk247M.

In our earlier startups, we tried Sentry, Datadog, Grafana, and Dash0, and nothing was good enough. Proper telemetry and alerting still require a ton of manual setup. We struggled with adding good logs, so debugging was tough, especially as codebases grow at a faster pace. Meanwhile, the Datadog/Dash0 bill kept climbing, and we still spent engineering hours learning, configuring, and maintaining our observability tooling.

With Sentry, we found ourselves flooded by a stream of alerts in our Slack channel. Most were duplicates or lacked context, so alert fatigue and constant interrupts were a real pain. The #ops notification is consistently the worst feeling on a Saturday morning.

Too many times we’ve seen servers run out of memory and disk, and three AWS metrics give us three different values. Half of the graphs on dashboards are normally empty or outdated, and manually clicking through UIs, especially when the team is small, seems like a huge waste of time.

At some point we realized that solving this problem would be more valuable than the things we had been working on, and we had the expertise to do it, since Arseniy had spent years at Datadog, getting paged during the night to debug production incidents. So we decided to build a platform that would just work: agent-first, MCP-native, zero-setup.

Here’s how Superlog works: we have a wizard that scans your repo, and automatically instruments it with well-structured logs, traces and metrics via OpenTelemetry. We make sure to highlight main failure modes, endpoint performance, usage per tenant, and LLM/upstream cost (by callsite, tenant and model).

Errors get fingerprinted and grouped into incidents, so you see one issue, not a thousand duplicates. When you get a notification from Superlog, you see a clear failure summary, its inferred severity and impact upfront.
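To illustrate the grouping idea, here is a minimal sketch of error fingerprinting. It is not Superlog's actual implementation: the normalisation rules, the `fingerprint` and `group_into_incidents` helpers, and the error dict keys (`type`, `message`, `frame`) are all assumptions made for the example.

```python
import hashlib
import re
from collections import defaultdict

# Hypothetical normalisation rules: replace the parts of a message that vary
# between occurrences of the same underlying error with fixed placeholders.
_VOLATILE = [
    (re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b"), "<uuid>"),
    (re.compile(r"\b0x[0-9a-f]+\b"), "<hex>"),
    (re.compile(r"\d+"), "<n>"),
]


def fingerprint(error_type: str, message: str, top_frame: str) -> str:
    """Return a short, stable id for an error, ignoring volatile details."""
    normalised = message.lower()
    for pattern, placeholder in _VOLATILE:
        normalised = pattern.sub(placeholder, normalised)
    key = "|".join((error_type, normalised, top_frame))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def group_into_incidents(errors):
    """Group error dicts (keys: type, message, frame) by fingerprint."""
    incidents = defaultdict(list)
    for err in errors:
        incidents[fingerprint(err["type"], err["message"], err["frame"])].append(err)
    return dict(incidents)
```

With these rules, "Timeout after 3000 ms for user 42" and "Timeout after 5000 ms for user 7" raised from the same frame land in one incident, while an error of a different type gets its own.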

Then the agent investigates and tries to solve the issue. If it has enough context, it produces a concise and tested PR. If it doesn't, it posts its findings for the investigating team, and automatically pulls in the engineers that could contribute more context based on documentation, previous investigations and Slack threads.

Either way the output is one clean PR per incident, posted in Slack, that you can merge, ignore, or open as a Claude Code session and modify.

Three things we think are different from other observability vendors:

(1) We solve the setup pain. The wizard will instrument everything with native OTel SDKs, respecting the semantic conventions, with proper service and environment tagging. We’re also working on native automatic dashboards and alerts, so that you can see what’s going on at a glance and don’t miss subtle failure modes.

(2) Our telemetry doesn’t decay. The wizard runs daily, and keeps adding logs, alerts and dashboards where it’s needed. You don't have to remember to instrument new features. The next time something breaks, the data you need to debug it is already there.

(3) Our goal is to solve alert fatigue. We use agents to merge similar errors and refine the summaries, giving you relevant information upfront. We have a custom evaluation setup that makes sure that our summaries are dense and correct, and that severity and impact are on point. We also give you confidence scores for every LLM-enhanced metric so that wrong guesses don’t get boosted.

Important: Superlog telemetry is vendor-neutral, so you keep all the logs/metrics/traces we install. Pricing is on the site. We're early, so expect rough edges and please tell us when you find them.

You can try it at https://superlog.sh. We'd love to hear what you're using today, what's broken about it, and whether the "one mergeable PR per incident" model sounds useful or terrifying. Especially keen to hear from folks running integration-heavy products, anyone who's rolled their own observability, and anyone who has tried Sentry / Datadog MCPs and given up. Comments and feedback welcome!

Discussion (45 Comments)

eddy-sekorti•4 minutes ago

Congratulations on the launch, it looks good. Do you provide any API for other tools to integrate with Superlog?

htrp•about 21 hours ago

Not their fault

Railway, their hosting provider, is entirely down as well.

From https://status.railway.com/

>Identified

>Google Cloud has blocked our account, making some Railway services unavailable. We have escalated this directly with Google. The Railway Platform team has since confirmed access to Google Cloud and is working on restoring access to all workloads. We have access to some of our Google Cloud–hosted infrastructure and are working to restore the rest of the service. We apologize for the disruption.

signalbright•about 20 hours ago

That's true, unfortunately. Very sorry about this, and it's my fault that we weren't resilient to this failure. We'll be migrating the webapp to a more resilient architecture as a first priority.

The landing page is up again now but unfortunately will have to default to accepting demo requests for now :(

OsrsNeedsf2P•1 day ago

There's very few startups that I look at these days and don't think to myself, "I could just write a Claude skill for that". This one seems pretty cool. Congrats on launch

signalbright•1 day ago

Thank you! Super happy that's how you feel about Superlog. Let us know if you want to try it out and/or have any feedback :)

behat•1 day ago

>> Superlog scans your codebase and infrastructure to add new alerts, metrics and dashboards, preventing tricky failure modes and observability decay.

This is interesting, and my prior belief here has been that this automates a one time set up, and perhaps a quarterly clean-up or reactive monitoring changes that people do today. Curious what your experience has been - do teams accept these ongoing maintenance PRs at a good rate?

For full disclosure / context: we work in a related space - investigation agents for production issues.

jonnyasmar•about 21 hours ago

Building on the "investigation > patch" point — running Claude Code, Codex, and Gemini CLI daily, the pattern I keep noticing is that auto-fix is fine on "obvious bug, obvious fix" (off-by-one, null check, missing await, error not propagated). It falls over on "subtle invariant" bugs where the existing code is intentionally weird to preserve something non-obvious — the PR looks right and breaks something three modules away.

The tool I'd actually want isn't "tries harder to fix everything." It's one that credibly says "this touches an invariant I can't see — here's what I think might happen, you handle it." Calibrated humility beats confident patches.

Curious how your high-confidence threshold actually works. Self-reported model certainty (notoriously unreliable), test coverage in the affected area, blast-radius of the change, something else?

ottoid•about 21 hours ago

I would love to use it but the website is down

"Please check your network settings to confirm that your domain has provisioned.

If you are a visitor, please let the owner know you're stuck at the station."

Would love to learn more and consider being a customer!

signalbright•about 21 hours ago

Apologies, everyone! Indeed, our webapp unfortunately went down together with Railway. We're working hard to bring it back up.

e12e•1 day ago

Interesting project - but you need to add some information on where the data goes. As far as I can tell, code goes to some upstream AI provider (for installing, for analyzing).

Does telemetry go to some provider or a locally hosted solution? And then to your upstream AI provider for analysis?

signalbright•1 day ago

Thanks for the feedback!

When you're installing Superlog, you can use any coding agent you'd like, including a local model.

Your telemetry then goes into our data stores, and right now we have one DC on the US west coast.

Whenever there's an error log or trace, Superlog can analyze it and prepare a resolution PR (or a note if something needs to be done manually).

This can be turned off and then the incident can be sent to your own models via a webhook.

We use one of the frontier models for that (it's an upstream AI provider). We're working on our own fine-tuned version of a SoTA model to minimize dependency on other AI providers.

To investigate an incident, we clone the repo in our worker, and pass the repository files to a coding agent in a sandbox. The agent has an MCP that gives it access to the telemetry (logs/metrics/traces) of the project.

The coding agent will then investigate the incident and prepare a patch. It hands over the patch via a tool. The worker then deterministically pushes the patch to a branch and opens the PR.

This way the agent doesn't have full Git access and can't do anything it's not supposed to do in the repository.
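A minimal sketch of what such a handoff could look like on the worker side. This is an illustration under assumptions, not Superlog's code: `plan_pull_request`, `touched_paths`, `is_safe`, the `fix/<incident>-<digest>` branch scheme and the path rules are all hypothetical. The point is that the agent only supplies a diff, and everything Git-related is derived deterministically from it.

```python
import hashlib
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class PullRequestPlan:
    branch: str
    files: tuple
    diff: str


def touched_paths(diff: str) -> list:
    """Collect target paths from the '+++ b/<path>' lines of a unified diff.

    Simplification: pure deletions ('+++ /dev/null') are not counted.
    """
    paths = []
    for line in diff.splitlines():
        if line.startswith("+++ b/"):
            paths.append(line[len("+++ b/"):].strip())
    return paths


def is_safe(path: str) -> bool:
    """Reject absolute paths, parent-directory escapes and anything under .git."""
    parts = PurePosixPath(path).parts
    return bool(parts) and not path.startswith("/") and ".." not in parts and ".git" not in parts


def plan_pull_request(incident_id: str, diff: str) -> PullRequestPlan:
    """Validate the agent's diff and derive a deterministic branch name."""
    paths = touched_paths(diff)
    if not paths:
        raise ValueError("patch touches no files")
    unsafe = [p for p in paths if not is_safe(p)]
    if unsafe:
        raise ValueError(f"patch touches disallowed paths: {unsafe}")
    digest = hashlib.sha256(diff.encode("utf-8")).hexdigest()[:8]
    return PullRequestPlan(branch=f"fix/{incident_id}-{digest}", files=tuple(paths), diff=diff)
```

Because the branch name depends only on the incident id and the diff contents, handing over the same patch twice yields the same plan, and a patch that reaches outside the working tree is rejected before any Git command runs.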

tuo-lei•1 day ago

investigation is the hard part, not generating patches. we've had prod issues where the fix was obvious once you knew the cause, but finding the cause meant connecting an error trace to a config change from 3 deploys ago. if the MCP only surfaces traces and logs from one service the agent is going to propose workarounds instead of actual fixes. how deep does the investigation context actually go?

signalbright•1 day ago

Great question! The investigation agent has access to all the telemetry - not only one service. So we can actually trace the root cause in such complex cases!

There are good ways to link operations between different services with OpenTelemetry (for example, passing the parent trace id in an inter-service HTTP/gRPC request). It's a bit tedious to do by hand, that's why we're publishing the skill that does that for you.
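As a rough illustration of the trace-linking idea (a sketch, not the Superlog skill itself): the W3C Trace Context `traceparent` header carries the trace id and the caller's span id in the form `00-<trace-id>-<parent-id>-<flags>`. The helper names below are hypothetical stand-ins written against the standard library only, and they assume lowercase header keys. In a real service the OpenTelemetry SDK's propagators handle this for you.

```python
import re
import secrets

# Version 00 of the header: 32 hex chars of trace id, 16 of parent span id,
# and 2 of flags, all lowercase.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_trace_id() -> str:
    return secrets.token_hex(16)  # 16 random bytes -> 32 hex chars


def new_span_id() -> str:
    return secrets.token_hex(8)  # 8 random bytes -> 16 hex chars


def inject(headers: dict, trace_id: str, span_id: str, sampled: bool = True) -> dict:
    """Caller side: return a copy of the headers with traceparent added."""
    flags = "01" if sampled else "00"
    out = dict(headers)
    out["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return out


def extract(headers: dict):
    """Callee side: parse traceparent, or return None if absent or invalid."""
    match = _TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, parent_span_id, flags = match.groups()
    if set(trace_id) == {"0"} or set(parent_span_id) == {"0"}:
        return None  # all-zero ids are invalid in the W3C format
    return {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "sampled": (int(flags, 16) & 1) == 1,
    }
```

The callee starts its own span with the extracted `trace_id` and records `parent_span_id` as the parent, which is what lets a backend stitch the two services into one trace.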

And totally agreed on config changes and deploy info. We've seen that having good environment and version control (commit hash, file name, line number) tagging is extremely important for root cause analysis, so we go hard on this in the skills.

We also have many infra integrations in our roadmap to make sure that we can deeply analyze the infra/config side of things.

byoj•1 day ago

Interesting product, but I had a similar question. I think it will take a little time to mature for production systems: what I can see right now is very straightforward, and most observability providers already do this if you have an observability stack set up. We currently use OpenObserve; they have an AI agent that provides correlation, cause and fix for any issue. The real differentiator can be how accurately you can do the investigations, and how brutally you can steelman its ability to locate the issue, cause and fix. Good luck on the launch.

tommy29tmar•about 23 hours ago

Before running the install prompt, I’d want to see a dry run: which files it would touch, what telemetry leaves the box, provider calls, and what “high confidence” means. For debugging tools, generating a PR is the easy part; knowing whether it’s grounded in enough evidence is the part I’d worry about.

0xferruccio•1 day ago

Congrats on the launch, this looks very promising. I hadn't seen any installation that uses a URL to point to a skill, seems like an evolution of wizard scripts

That being said, for more complex setups like on Kubernetes where you need a collector and an operator, I found OTEL to be super painful to set up a couple of years ago. Has it gotten any easier now?

signalbright•1 day ago

Thank you! Glad you liked the install process :)

I'm afraid a collector and the operator are still the recommended way to go by OpenTelemetry (https://opentelemetry.io/docs/platforms/kubernetes/getting-s...). We're still working on a custom skill for Kubernetes, but the general skill should give you a sane default already.

A good way to start can be to start sending traces/logs directly by instrumenting the service and putting our backend as the collector.

I also help out personally whenever our clients have any questions on setting up the telemetry :)

sskates•1 day ago

I love the launch! Automated observability that feeds back into the product development process is the future of this category vs having to spend a lot of time configuring and managing the infrastructure yourself.

It's something we've thought a lot about at Amplitude. We'd love to talk.

signalbright•1 day ago

Awesome, let's definitely have a chat! I'll shoot an email via BF :)

exabrial•1 day ago

It deleted the codebase, which technically.. is a valid way to get rid of all of the bugs.

I kid, nice work. As others have said, investigation, and understanding "the why it was originally done that way", not the patch, is usually the lion's share of the work.

solfox•1 day ago

Love the concept! Some feedback: I went to sign up to give it a go, but the setup process left me feeling a bit untrusting, so I backed out for now. I'd prefer more explanation about what to expect, what I will get, how it is safe, etc. before asking me to run a prompt.

signalbright•1 day ago

Thank you! Very good point.

Right now, the prompt will enumerate all the services and install the OpenTelemetry SDK (https://opentelemetry.io/) in each service.

Then for every service, the skill will make sure that:

- Every time something breaks and an operator needs to take a look, there's an error log.
- All important steps in a process emit info/debug logs (so that an issue can be investigated).
- Operations are covered with spans with relevant attributes.
- Cost (LLM tokens), API performance (latency/RED), tenant activity (cost/usage per tenant) are covered by metrics so that you can use Superlog MCP to build cool dashboards.

For the most common stacks like NextJS, FastAPI, React Native/Expo etc., we have a custom skill that explains the best practices for that specific technology. For all the other stacks we ask the agent to use general best practices.

We have evals for all custom skills where we start from a starter project, run the agent with the skill and use LLM-as-a-judge to compare it to a human-written 'golden patch'.

In general, we try to:

- minimize diff, so that the instrumentation is easy to review
- make small chunks of additive diffs vs huge indents / moving logic around
- minimize new dependencies
- use well-supported and audited OTel SDKs vs custom libs

You can read the skills here: https://github.com/superloglabs/skills.

I'll make sure to add this to our landing and print this out as the agent writes the code!

Thank you for the feedback!

user-•1 day ago

I would love to try it but I got stuck when it asked for Slack since I don't use that.

signalbright•1 day ago

Hm, sorry about that!

I made the Slack onboarding step mandatory for now since we thought that a lot of our value was in sending investigations and PRs, and Slack was what we used ourselves.

What tool do you use for communication around your project? If you don't want to share publicly, could you please shoot a line to:

ash [at] superlog.sh?

Would love to learn about your use case in more detail too!

user-•1 day ago

Lots of places use various Slack alternatives or Teams/Google.

For my current project I would use webhooks/email just like I do currently for my monitoring and alerting.

quinncom•1 day ago

I don’t use Slack either. What about solo indie founders who don’t use “team communication”?

signalbright•1 day ago

Got it! What channel would you prefer instead? Would Telegram/WhatsApp/Signal/iMessage be good?

The platform itself doesn't need Slack to function, we just observed that users got more value if they could get notifications somehow, so I'm more than happy to add more comms platforms :)

evil-olive•1 day ago

on your pricing page:

> Start with one repo. Price the rest when the signal is real.

which makes it sound like possibly the $150/mo price is per-repo?

I think that could use some clarification - if I have 10 services in a monorepo vs 10 individual service repos, does that 10x my cost?

signalbright•1 day ago

Very good point, thank you! Let me remove this phrase, you're right, it's misleading.

The pricing is only by usage (traces/logs/metrics) and investigation credits. We don't charge extra for repos :)

rdataguy•about 23 hours ago

Seems very useful, congratulations on the launch!

3form•1 day ago

Any plans for an on-prem version?

signalbright•1 day ago

Good question! We don't have one as of today, just because we're iterating very quickly and a cloud version is the quickest way for us to keep things lean and up-to-date, but we're not far from having one.

Could you please send me an email at ash [at] superlog.sh? I'd love to hear more about your use case - we might have something for you very soon!

FantasyLabai•1 day ago

This is a very interesting idea and I'm excited to see where this goes. Congrats!

signalbright•1 day ago

Thank you! :)

tontinton•1 day ago

What's your moat?

signalbright•1 day ago

Great question! I like to think about this in two ways:

1. Counter-positioning. Most existing tools have invested heavily in their web platforms and compete on their UI/UX. But actually, what matters to our clients is that bugs are fixed. Our top clients would rather never open our tool at all. If our competitors want to beat us, they essentially have to fight against their established business models that hinge on users looking at their browsers.

2. Evals. To get the most accurate RCA you need a very good suite of evals: what was the right root cause of this bug? What is the right fix? We're investing in this heavily, and as one of the early movers we have a big advantage here.

At the same time, I tend to approach strategy with a lot of caution. A lot of the canonical reasoning behind 'startup positioning' is based on extrapolation from trends, but surprisingly few analogies work in economics.

Our focus right now is:

- talking to our users
- making sure they have the best experience

aloknnikhil•1 day ago

The typical issue I have seen with LLMs/agents is that they tend to be reactive in their fixes, so they "patch" the symptom rather than "fix" the root cause. Interested to see how you solve this problem.

signalbright•1 day ago

You're right! It's a big issue and I don't think there's a silver bullet.

We have an eval suite with code+telemetry fixtures and a golden RCA+patches and an LLM-as-a-Judge. So whenever we get feedback from our users and they're OK with it, we use their feedback to create an eval case (it's still quite manual since you have to calibrate the case).

We use Superlog to observe Superlog, so I often extract cases from our own errors. The PRs get better and better, but, of course, it's sort of a continuous improvement process.

TZubiri•1 day ago

Sorry to be crude, but this sounds either dead on arrival, or at least needing a pivot, or a rephrasing of the pitch:

The moment something changes the system, it no longer observes it, in fact observing something might cause it to change ( https://en.wikipedia.org/wiki/Observer_effect_(physics) )

Either it's a tool for observing or it's a tool for fixing issues, it cannot be both, by physical principle.

Best case scenario here is that the product succeeds, and then you need to instrument the product itself in order to observe it, like debugging the debugger. But it wouldn't be an observability tool, it would shift the product that needs to be observed from the previous source code that is now a target language into the new source code that is now your product.

signalbright•1 day ago

Love the analogy! We honestly just wanted to have this product ourselves, and that was our primary motivation behind building it.

I agree with the philosophical principle! If you give a rigid observer an incentive to 'remove bugs', it will happily silence all alerts and report success.

Our goal is to make sure that doesn't happen. The investigation agent is actually a separate agent with a separate goal.

In practice, we rarely see the agent just silencing stuff. When this happens, I get on it and make it an eval case :)

PhunkyPhil•1 day ago

How does a grep or read affect the observing system?

I guess the change in voltages, arrangement of registers, filling of buffers in the network stack are changing but... what?

TZubiri•about 12 hours ago

It is a well-known phenomenon in Haskell or OOP in general that just reading can cause systems to change their execution paths. Which is why it's not sufficient to make writes/setters private, but reads/getters too.

However, in the case of Superlog the path to system change is quite direct:

"We fix bugs. Superlog prepares a resolution PR for every incident. If Confidence Gate fails, it posts findings for the investigating team and pulls in the engineers who can add context."

The system literally pushes (or pull requests, whatever, GitHub is dead anyways) a code change.

philipallstar•about 22 hours ago

The observer effect is that observing itself modifies the system. This is nothing to do with that.