Discussion (51 Comments), originally on HackerNews
So I just define my types and then use typescript-json-schema or similar to build a JSON Schema at build time (i.e. from an npm script) which then I use to validate input using ajv.
The only thing I do on top of that is to use annotations like "@minimum 0" (or, in the email example, "@format email") where the base types are not enough, but those simply go inside comments.
So the compiled package only has ajv as a runtime dependency (which you're likely to have anyway, as it's everywhere). You're just defining regular types with some annotations on top, and using a dev dependency to build the JSON Schema for you. And as popular as zod is, I think JSON Schema is more of a standard and likely to stay with us longer.
I also reference those generated JSON Schemas from my OpenAPI definition, as a bonus.
Zod is the acceptable middle ground in my opinion. Zod will allow you to throw a schema against an object and it'll tell you "yes the result fits your schema". This is fine for most projects.
If you want to go zero-dependency, you can see how far you can get with TS's type system. Branded types are kinda cool. NewTypes are also cool, but also high maintenance. Unless you're building a library that millions depend on, it's probably not worth it.
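For anyone who hasn't seen the zero-dependency route, here is a minimal sketch of a branded type. The names `Email`, `parseEmail` and `sendWelcome` are made up for illustration, and the regex is deliberately loose, not a real email validator.

```typescript
// Hypothetical names throughout: Email, parseEmail, sendWelcome.
// The brand symbol is only declared, never created: it exists purely at the type level.
declare const emailBrand: unique symbol;

// A plain string at runtime; the brand is erased at compile time.
type Email = string & { readonly [emailBrand]: true };

// The only place a string is turned into an Email.
// Deliberately loose check, for illustration only.
function parseEmail(raw: string): Email | null {
  const trimmed = raw.trim();
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(trimmed) ? (trimmed as Email) : null;
}

// Accepts only values that went through parseEmail.
function sendWelcome(to: Email): string {
  return `Welcome mail queued for ${to}`;
}

const email = parseEmail(" ada@example.com ");
if (email !== null) {
  console.log(sendWelcome(email));
}
// sendWelcome("ada@example.com"); // compile error: string is not assignable to Email
```

Since `Email` is still a string, it can be passed anywhere a string is expected without unwrapping.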
What do you mean?
I've been into Effect for a long time, and it scales really well as your applications get more complex.
Schema is way more advanced than Zod by the way, both at type level and functionality it has a proper decoder/encoder architecture.
You can encode not just "string -> non-empty string -> valid email pattern" but "a confirmed email the user has clicked on" at the type level, by leveraging effectful schemas (and durable workflows if you want).
You may not need it 99% of the time, I myself rarely use that, but it's not a fair comparison.
Zod is more ergonomic, has easier APIs and is perfect for most users. I would not recommend Schema unless one buys the whole package.
The friction with the rest of the ecosystem is real, though. Most code out there expects you to handle errors with exceptions.
I get the impression that polymorphic return types could get in the way of JSC/V8/SpiderMonkey's JIT, but I haven't measured it and I'm not sure of the actual impact on hot and cold paths. Same for all the allocations caused by custom Option<T>/Result<T,E> implementations.
I think using Zod at the edge (with branded types and whatnot), while keeping return types as T/Promise<T> to keep a sane relationship with the ecosystem is a good middle ground.
If I could add one feature to Typescript it would be something like "as" that actually validates the result against the type system and can fail. Unfortunately, that's way, way easier said than done. It's the bad type of keyword that has unbounded runtime cost because it would have to be a runtime comparison, and there are a lot of design questions about how to write it. However, I still petulantly want it even though I can hardly define it. "zod" is pretty good but you can see how trying to add that as a "keyword" is nightmare fuel for a language-level change.
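The closest thing I know of in today's TypeScript is a hand-written assertion function. It is not the wished-for keyword: the checks are written by hand and can drift from the type. A rough sketch, with a made-up `User` shape and `assertIsUser` helper:

```typescript
// Hypothetical shape and helper, for illustration only.
interface User {
  name: string;
  age: number;
}

// After this call returns normally, the compiler treats `value` as User.
// The runtime checks are manual and must be kept in sync with the interface.
function assertIsUser(value: unknown): asserts value is User {
  if (typeof value !== "object" || value === null) {
    throw new TypeError("expected an object");
  }
  const candidate = value as Record<string, unknown>;
  if (typeof candidate["name"] !== "string") {
    throw new TypeError("name must be a string");
  }
  if (typeof candidate["age"] !== "number") {
    throw new TypeError("age must be a number");
  }
}

const input: unknown = JSON.parse('{"name":"Ada","age":36}');
assertIsUser(input);
// `input` is narrowed to User from here on.
console.log(input.name.toUpperCase());
```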
You can use Pydantic in Python and serde_derive in Rust. I assume most languages have a thing like that.
Suppose I have a User with some attributes like birthday, email and whether they have been verified.
In a common codebase you see `if (user.verified_at != null)` or something along those lines. With the parsing approach, I feel like I should have a type (or interface) for each of those cases.
(And imagine a method that accepts a user with a birthday and an email, to send an email the day before their birthday: would you create a UserWithBirthdayAndEmail type?) It feels like this is going to bloat the interface space. How do you tackle this problem?
The types are cheap to write (they're all derived) and have no runtime impact (types are erased at build/compile time) and these parsing functions are quite small to write
https://www.typescriptlang.org/play/?#code/FAFwngDgpgBAqgZyg...
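A minimal sketch of the derived-type idea, not the code behind the playground link. The `User` fields and the function names are assumptions made up for this example.

```typescript
// Hypothetical base type: optional data is modelled as `T | null`.
interface User {
  id: string;
  email: string | null;
  birthday: Date | null;
  verifiedAt: Date | null;
}

// Derived from User: same shape, but these two fields are known to be present.
type VerifiedUserWithBirthday = User & { birthday: Date; verifiedAt: Date };

// The parsing function: the one place where the null checks happen.
function parseVerifiedUserWithBirthday(user: User): VerifiedUserWithBirthday | null {
  const { birthday, verifiedAt } = user;
  if (birthday === null || verifiedAt === null) return null;
  return { ...user, birthday, verifiedAt };
}

// Downstream code takes the narrower type and never re-checks for null.
function birthdayGreeting(user: VerifiedUserWithBirthday): string {
  return `Happy birthday, user ${user.id} (born ${user.birthday.getUTCFullYear()})`;
}

const ada: User = {
  id: "u1",
  email: "ada@example.com",
  birthday: new Date(Date.UTC(1990, 0, 15)),
  verifiedAt: new Date(Date.UTC(2024, 0, 1)),
};
const parsed = parseVerifiedUserWithBirthday(ada);
if (parsed !== null) {
  console.log(birthdayGreeting(parsed));
}
```

The derived type is erased at compile time, so the only runtime cost is the null check inside the parsing function.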
Suppose you want to add one more property to VerifiedUserWithBirthday and UnverifiedUserWithBirthday, you might get 2 more new types, and somewhere at the higher layer call chains you need to know which enclosing type you should pass so that some method in the bottom chain will accept it.
I am sure there are more elegant ways, but I am struggling to generalize it to most enterprise SaaS CRUD apps, where you have one object with a bunch of properties and can conditionally traverse the code logic.
If you have VerifiedUserWithBirthday, any value that fails the parsing function is implicitly UnverifiedUserOrUserWithoutBirthday... No need to define it separately. You get the inverse type for free, i.e. a value that is of type User and not of type VerifiedUserWithBirthday.
A new property doesn't mean a new derived type. Only if that new property impacts what a VerifiedUserWithBirthday should represent should the VerifiedUserWithBirthday type be updated and even then, it's not a new type, just an update to an existing type. Again minimal updates needed.
The compiler handles all the validation and will tell you exactly where there are any issues - the compiler is what makes the maintenance cost quite low.
In your instance, you could have:
In this instance, your logic with a method that accepts birthday and email has all the information it needs to make its choice. The difference between this and an assert is that it gets checked at compile time (it can get quite expensive to do the check though).
What can you do in mainstream languages? As much as is worth and no more than that. String -> User is worth it, User -> UserWithBirthday is not.
The reason I've not is - say there's an optional field. Currently we call that null, probably, and check each time if it's there or not. I could instead make a type, like User and UserWithPhoneNumber. Should we be making types for each combination of present/absent fields? That can't be right.
The classic answer is to move the logic inside the domain object, or have a helper function outside the object, so you aren't constantly checking for field presence/absence, but are instead writing the logic once and calling some code.
I'm not sure in practice types can help with this. But I'd love to be proven wrong.
The combinatorial explosion you're picturing only shows up if you make a separate type per combination of present fields, but you don't need to. An independent optional field stays one `T | null`. You only reach for distinct types when fields are correlated and present together because they represent a state, and then it's a discriminated union on a status field, which is N states, not 2^N.
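To make the "N states, not 2^N" point concrete, here is a small sketch. The `Account` type, its three states and its field names are invented for the example.

```typescript
// Hypothetical account model. `phone` is an independent optional field, so it
// stays `string | null` in every state. `verifiedAt` and `suspendedReason` only
// exist in the states where they make sense.
type Account =
  | { status: "pending"; email: string; phone: string | null }
  | { status: "verified"; email: string; phone: string | null; verifiedAt: Date }
  | {
      status: "suspended";
      email: string;
      phone: string | null;
      verifiedAt: Date;
      suspendedReason: string;
    };

function describeAccount(account: Account): string {
  switch (account.status) {
    case "pending":
      return `${account.email} has not confirmed yet`;
    case "verified":
      return `${account.email} verified at ${account.verifiedAt.toISOString()}`;
    case "suspended":
      return `${account.email} suspended: ${account.suspendedReason}`;
    default: {
      // If a fourth status is added and not handled above, this line stops compiling.
      const unreachable: never = account;
      return unreachable;
    }
  }
}

console.log(describeAccount({ status: "pending", email: "ada@example.com", phone: null }));
```

Three states give three union members, regardless of how many independent nullable fields like `phone` are added.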
Using types like this also means you can more easily avoid assignment errors, as everything will have a very specific type (e.g. Age instead of int).
The short version is: the shape of a type is inherent to the type itself, but the optionality of its members is dependent on the situation. A type system that solves this problem separates these concepts to allow for this distinction.
I _suspect_ it's possible to implement something like that in typescript but I haven't tried it myself (and I doubt it's very ergonomic).
It's more about writing
over> monoid
nullables with `??` and `?.` are also give-or-take monoids. is it common though to `or` two MaybePhoneNumbers together or to apply a PhoneNumber->MaybePhoneNumber function to it? if not then why mention it?
let's see something meaningfully different like a database schema.
[1] https://esolangs.org/wiki/Trivial_brainfuck_substitution
I don't think disclosing helps here. If the article wasn't obviously generated, why would that affect you ?
The only issue I have is being half-way through the article and realizing I am reading hallucinated text. If I can mark the author once, I won't see them again. This works fine for me. You could argue that disclosing would fix this issue, but the issue is not that AI was used, but that it was not curated.
> Booleans look tidy until somebody adds a third case and exhaustiveness silently doesn’t kick in. Strings narrow honestly.
Like, nobody truly writes like that. It wouldn't get past any competent editor.
Strings narrow honestly? What does that even mean? This kind of 3-word precision is useless, and these phrases appear everywhere in the article. We get the point within the first sentence, no need to add more.
It’s frankly depressing when (2018) oldies-but-goodies get reposted here for the Nth time. The clarity of thought and obvious effort that went into communicating that thought was expected for top-voted posts at the time. Now those posts appear exceptional in this era’s standard of “the LLM just cleaned up my notes” slop.
If the result is better for having used AI, why wouldn't an author want to disclose it?
At the end of the day, the ideas within the content are what matters. An idea has or does not have merit regardless of if it was produced entirely by a person, or by a person using AI as an editor, or 100% generated by AI. If you need a disclosure on if an idea was produced by AI, you are saying that you have no interest on debating the content on the grounds of the arguments it is making, while simultaneously ceding you can’t tell the difference between someone using AI and someone who isn’t (which undermines one of the primary arguments against AI, that it makes for inferior outputs).
I don't speak typescript so am probably missing something obvious. but. why would you parse an email (or anything really) into a string (or string equivalent)? When parsed it will end up as a specific email object, that is, something closer to a C struct. What is the article's dance doing?
In sufficiently nominal type systems, I can hide the constructor for an EmailAddress type (as in: nobody can just construct an EmailAddress type). In Haskell speak, I can then export a function parseEmailAddress :: String -> EmailAddress. The function parseEmailAddress is the only place that has access to the constructor. Which means that the only way to turn a string into an EmailAddress is by calling parseEmailAddress.
Note that at runtime EmailAddress is just a string. The boundaries live in the type system, not on the value level. A structural typing system (as in TypeScript) does not enable that; it forces you to turn EmailAddress into something other than just a string.
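One way to get a hidden constructor in TypeScript is a wrapper class with a private constructor, at the cost of the value no longer being a bare string at runtime. A sketch under that assumption; the class, the `deliver` function and the loose regex are illustrative, not from the article.

```typescript
// Hypothetical wrapper class. The constructor is private, so `parse` is the
// only way to obtain an instance.
class EmailAddress {
  private readonly value: string;

  private constructor(value: string) {
    this.value = value;
  }

  // Deliberately loose check, for illustration only.
  static parse(raw: string): EmailAddress | null {
    const trimmed = raw.trim();
    return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(trimmed) ? new EmailAddress(trimmed) : null;
  }

  toString(): string {
    return this.value;
  }
}

function deliver(to: EmailAddress): string {
  return `delivering to ${to.toString()}`;
}

const address = EmailAddress.parse("ada@example.com");
if (address !== null) {
  console.log(deliver(address));
}
// new EmailAddress("x");  // compile error: constructor is private
// deliver({ value: "x", toString: () => "x" });  // compile error: the private
// field means a look-alike object literal is not assignable to EmailAddress
```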
Are you confusing Email vs EmailAddress? I think that in many cases would prefer to be EmailAddress represented as a dumb string at runtime. But if you don't, you will easily find other examples where you have 2 structurally similar types, that you don't want to mix up.
If I parsed an emailAddress, the thing that came out would look like {'domain':'example.com', 'user':'john-doe'}, or emailaddr.domain and emailaddr.user, plus an emailaddr.address method if you like that form. Even if what I parsed ended up as a single string-like field, I would still name that field: emailaddr.address.
Salutes for the bit on hiding the constructor, that makes a lot of sense.
It probably does not help that in my one attempt at making a javascript web application I did not bother trying to understand how javascript likes its objects and just forced a python-looking model onto it. If any of the web development team saw my code I would definitely get laughed out of the club.
The article's dance is to avoid having extra fields that are completely unnecessary here. They want some kind of nominal email type, that is actually a string, so can be used in places where a string is needed, but when a method requires an "email" you can't use any string.
It's a pretty common pattern in functional programming and in many other languages nowadays
It's the same thing. In the latter case, something has validated that your NonEmpty has a first and a last element. It's all validation before you stick it in a type that asserts that the validation is guaranteed to have occurred so every function receiving it doesn't need to do it itself.
Any non-trivial use of a type system will involve making guarantees the type system itself can not actually express [1]. There's nothing wrong with saying "this is a valid email in accordance with my standards" in a type. Merely using the type system to assert "I have some sort of value in the name and host fields" is valid but a degenerate use. "struct Email { name: Name, host: Hostname }" is an even stronger use of the type system, where Name and Hostname are themselves values you can only get by passing some incoming string through a validation process. Asserting that these things exist is just the most basic check possible, but your type still permits {name: "\0\0\0\0\0\0", host: "!"}, whereas under my definition, assuming that Name and Hostname are reasonably defined, that value will never be something that can be witnessed.
In fact in general, while I don't absolutely rigidly apply this, especially in smaller script-like programs, when a "string" appears in my strong types that specifically means "this has unbounded contents". It's an appropriate type for "stuff I got off a network" or "stuff a user typed". What stuff? Don't know. Haven't checked it yet. When I do it'll get a more specific type like a Username or DecodedUTF8String or something else. Thanks to people using way too many "strings" and "ints" in the world I have to constantly explain to my LLM that I want stronger types. I'm yet to find the invocation to put into my CLAUDE.md or equivalent to get it to do it right the first time consistently.
[1]: With a wistful stare into the distance acknowledging the theoretical utopia of dependent types... but it doesn't seem to be coming down from "theoretical" any time soon.
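The "struct Email { name: Name, host: Hostname }" idea translates to TypeScript roughly as below. This is a sketch: the validation rules for `Name` and `Hostname` are placeholder assumptions, far looser than any real standard.

```typescript
// Hypothetical component types; each can only be produced by its parser.
declare const nameBrand: unique symbol;
declare const hostBrand: unique symbol;
type Name = string & { readonly [nameBrand]: true };
type Hostname = string & { readonly [hostBrand]: true };

interface Email {
  readonly name: Name;
  readonly host: Hostname;
}

// Placeholder rule: letters, digits and a few punctuation characters.
function parseName(raw: string): Name | null {
  return /^[A-Za-z0-9._+-]+$/.test(raw) ? (raw as Name) : null;
}

// Placeholder rule: at least two dot-separated labels.
function parseHostname(raw: string): Hostname | null {
  return /^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$/.test(raw) ? (raw as Hostname) : null;
}

// An Email can only be assembled from parts that passed their own checks.
function parseEmail(raw: string): Email | null {
  const at = raw.lastIndexOf("@");
  if (at < 0) return null;
  const name = parseName(raw.slice(0, at));
  const host = parseHostname(raw.slice(at + 1));
  return name !== null && host !== null ? { name, host } : null;
}

console.log(parseEmail("john-doe@example.com"));
console.log(parseEmail("\0\0\0\0\0\0@!")); // rejected by both component parsers
```

A plain object literal such as `{ name: "\0\0\0\0\0\0", host: "!" }` does not type-check as `Email`, because bare strings are not assignable to `Name` or `Hostname`.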
What did you mean by that? You don't accept mutability or any inputs on your state of mind?
This one barely scrapes by at what feels like 30-40% "slop": "honestly", "the one thing", etc...
...but I did learn something about "Brand" types, and have personally tried to do more of "parse don't validate" in my own code.
Recently I did this similar trick for `exec( ValidExecutable(...) )` [python], where it required tagging/washing through a private function/variable to "get" the private bit.
All the scanners tend to light up when they see "exec" at all (eg: `exec( "pandoc" )` for PDF generation), but I needed to hard code a few "expected" pandoc locations so the imaginary hackers couldn't shadow "pandoc" on a path location they controlled.