BigInt Shipping in Firefox (wingolog.org)
327 points by ingve on May 23, 2019 | 163 comments


Here is the proposal[0].

There are important real use cases for this. I authored an implementation of the FNV64 hash function[1] at my last job. I had to use the BigInteger.js[2] library, and it genuinely surprised me that JavaScript in 2018 did not support something as simple as 64-bit integers.

[0]: https://tc39.github.io/proposal-bigint/

[1]: https://golang.org/src/hash/fnv/fnv.go

[2]: https://github.com/peterolson/BigInteger.js
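With native BigInt, that use case shrinks to a few lines. Here is a sketch of FNV-1a (64-bit) — the constants are the published FNV parameters, `fnv1a64` is a name invented for illustration, and the masking keeps the running product within 64 bits:

```javascript
// FNV-1a, 64-bit, over an array of byte values — a sketch using native BigInt.
const FNV64_OFFSET_BASIS = 0xcbf29ce484222325n;
const FNV64_PRIME = 0x100000001b3n;
const MASK64 = (1n << 64n) - 1n;

function fnv1a64(bytes) {
  let hash = FNV64_OFFSET_BASIS;
  for (const byte of bytes) {
    hash ^= BigInt(byte);                 // fold in the next octet
    hash = (hash * FNV64_PRIME) & MASK64; // multiply, truncated to 64 bits
  }
  return hash;
}
```

With BigInteger.js the same loop needed explicit method calls (`.xor`, `.multiply`, `.and`); operator syntax is the main ergonomic win here.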


JavaScript can't easily evolve past the optimization tricks used in the major implementations. They all heavily rely on NaN tagging which makes 64-bit value types problematic. For similar reasons the architecture of V8 in particular has effectively dictated the design of WebAssembly, especially regarding control flow constructs like goto and coroutines.


> For similar reasons the architecture of V8 in particular has effectively dictated the design of WebAssembly, especially regarding control flow constructs like goto and coroutines.

The weirdness of WebAssembly with regards to control flow constructs is rooted in Emscripten's "relooper", which was developed in a naive manner ignoring previous (superior) research on irreducible control flow.


Yes, but in both cases this is because the JS runtimes (V8 in particular) make it really difficult to feel okay with unconstrained control flow. The relooper may have been an influence, but it was primarily pressure from the V8 people (I was in those meetings).

I'm less certain on my memory but I recall that Spidermonkey was more easily able to do fancier things with control flow.


I worked at Mozilla Research in the early days of Asm.js and sat next to that team. I would frequently tell them that they should support arbitrary control flow, and that there were better ways to achieve it with Emscripten than the Relooper. There wasn't much interest in improvements at the time, although it appears that the LLVM WebAssembly backend now uses a better approach.

While SpiderMonkey itself might support fancier things with control-flow, they were writing a separate compiler for Asm.js that relied on structured control flow for SSA construction, but I have no experience with how things evolved after that. Reading the thread at https://github.com/WebAssembly/design/issues/796, it seems that V8 and SpiderMonkey have roughly equivalent design constraints.


I am pretty sure the reason for this is that alternative designs for control flow cannot be verified in linear time; Java, for example, has experienced epic denial-of-service issues with its verifier design, where code of specific forms leads to quadratic verification time.


There are a few different performance issues with the Java bytecode verifier. Are you referring to this one?

https://pdfs.semanticscholar.org/7dd2/b5359ab507fec1b1223db6...

The cause of this is that the dataflow analysis performs nontrivial merging of dataflow facts, requiring a reanalysis. In the WebAssembly design there is no merging; any divergence of stack effects is an error.

There are other, deeper issues with the Java design: bytecode subroutines, the fact that the Java subtype relation doesn't form a semilattice, and the way in which Java bytecode models object initialization.


I have an interest in this - can you point at the earlier, superior research, please?


Yeah - plus Javascript doesn't even support 32 bit integers.


The implementations usually do (in the sense of optimizing them)


v8 only has 31-bit 'SMI' and not 32-bit ints, iirc.



Correct, but it is the only one to do so.


You are right. We can store and manipulate Int32 values in typed arrays, though the arithmetic is limited to a few operations.
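For illustration, a minimal sketch of the 32-bit storage semantics typed arrays give you — values are true signed 32-bit integers and wrap on overflow:

```javascript
// Int32Array stores real 32-bit signed integers; writes go through ToInt32,
// so overflow wraps around rather than losing precision silently.
const a = new Int32Array(1);
a[0] = 2147483647; // INT32_MAX
a[0] += 1;         // the stored value wraps to -2147483648
```

The arithmetic itself still happens in doubles; only the store truncates, which is why this is cumbersome compared to a real integer type.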


Eh? What exactly is JavaScript's numeric type then? I always presumed it was a 32-bit, signed integer..


JavaScript uses doubles. Their precision is more than enough to cover 32-bit integers; in fact, integers are exact up to 53 bits, after which they become imprecise.


Lua does the same thing. It's not crazy.


Failing to distinguish between integer and floating-point types is crazy. Not as crazy as Tcl's everything being a string, but not the work of a mentally well person.


It's the work of RAM-constrained and time-constrained people in the early 1990s, following approaches developed in the 1970s for Lisp.

Packing as much as possible into a single cons cell, or a JS value, was (and is) important, so bit masking to distinguish between integers, floats, strings, functions, objects, was an inevitable approach.


Lisp has distinct integer and floating point types with the correct behavior for those types. At runtime they might get passed around as tagged data, but that's not visible in the language semantics. In JavaScript, by contrast, the language exposes only double-precision floats, and you have to use floats where you'd otherwise use integers.


I agree. A Lisp may not make available all bits of a machine word to an integer, but they won't mix up an integer and a float.

JavaScript tried to make things artificially "simpler" with implicit conversions. As usual, the lack of consistency only led to more eventual complexity than a sound solution would have in the first place.


"Numbers" actually conceal a lot of complexity.

Is 2 between 1.9 and 2.1? What about between 1.9 and 2.1111111111111111111111111111111111111111?

What about over/underflow? Do you wrap, clamp, throw an exception? Do you round or set +-inf? Can you divide by zero? Can you divide an integer by a floating point number, and if so what would the result be?

What happens if, say, you multiply an 8-bit and 16-bit integer and the result (using twos complement) doesn't fit in 8 bits?

"Use IEEE-754 doubles for everything" answers all these questions. I think JavaScript is basically junk, but I find at least this aspect to be rather elegant (to the point that I suspect it came from somewhere else, haha).


Lua 5.3 added 64-bit integers along with native bitwise operators. There's no end of bickering on the mailing-list over the finer details, but without a doubt 64-bit integers is a win.


> without a doubt 64-bit integers is a win

I'm not so sure about that. There is an alternative implementation of Lua called LuaJIT, which uses a similar NaN-tagging trick, and it also happens to be incredibly fast. LuaJIT uses the 5.1 version of the language, with some backported 5.2 features for compatibility, but it will likely never backport 64-bit integers from 5.3.

It's resulted in something like the Python 2/3 split. Except it's even worse because Lua has relatively little penetration as a general purpose scripting language, but is quite popular as a language that can be embedded in a program to add scripting capabilities. This means it's up to the developer of said program which version of Lua they choose to embed, and from what I can tell, most developers choose speed over 64-bit integer support.


> What exactly is JavaScript's numeric type then? I always presumed it was a 32-bit, signed integer.

It's not ideal to presume such a thing when there are ways to find out the answer, either from the link in a sibling comment or by observing JavaScript code that uses the Google Maps API or similar mapping APIs. How would you represent a latitude or longitude with a 32-bit integer?

I don't mean to pick on you. I have made these same kinds of assumptions too many times and it always got me in trouble.

Here's a particularly embarrassing example. When I did my first election results map for Google, I needed a way to represent the outline of a state. I saw that the Maps API supported polygons, so I thought "that's great, I can use a polygon for each state!"

Until the map went live and someone asked me "What happened to the northern part of Michigan [which isn't connected to the rest of the state]? And where are the rest of the Hawaiian Islands?"

It turned out that a single polygon wasn't enough to represent the outline of a state. Who would have guessed?

Don't let this happen to you. :-)



Kind of bummed this is a new primitive type. Every instance of `typeof a === 'number'` just became `typeof a === 'number' || typeof a === 'bigint'`...

EDIT: The more I think about it (and the more comments I get), the more I think this might actually be a good thing. I think bigint will probably not be used in places where integer numbers are currently used, and it might be unreasonable to expect code to work with both numbers and bigints. In that case, it's actually probably good that you can easily tell the difference using typeof, without requiring some hacky fix like `Array.isArray`.

It does mean that "don't mix numbers and bigints" is going to become very popular JS advice soon.
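For reference, a sketch of the distinction and the deliberate refusal to mix the two types:

```javascript
// typeof cleanly separates the two numeric types:
const kinds = [typeof 1, typeof 1n]; // ['number', 'bigint']

// Mixing them in arithmetic is a TypeError, not a silent coercion:
let mixed;
try {
  mixed = 1 + 1n;
} catch (e) {
  mixed = e.name; // 'TypeError'
}
```

So `typeof` genuinely distinguishes them, and the runtime refuses to blur the line for you.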


To be fair, it really shouldn’t be conflated with ‘number’ - it isn’t the same thing, the behavior differs. And you can (and probably should) use small helper functions for these kinds of checks, pretty much for this kind of flexibility. Kind of curious what other libs (like lodash) will do.


If you're really going to do it right, that small function should also be published as npm package (isnumeric) that takes at least 2 other packages (isnumber and isbigint) as dependencies.


I’m super strongly against this concept actually, which you might gather from mention of lodash.


I think they were making a joke.


Obviously it was some degree of tongue-in-cheek, but I’m not really that amused, to be honest. It needs a leftpad reference or something.


I see I’ve struck a nerve of some kind so I’d just like to clarify that I am not sorry.


Anybody who thinks that's actually good (or even real) advice is going to write horrible code regardless, IMHO.



That's horrifying, I thought the whole isNumber thing was a joke.

Do these guys assemble their application as hundreds (or thousands) of NPM packages containing single functions and then import them together? Because now I actually believe they are capable of it.



I'm still holding out for an int type, I use a double about once a year but I'm completely insecure about using integer operations (e.g. div, mod, as an offset-based index) in javascript.


You can use w3c typed array stuff at least... though its obviously cumbersome.

Bitwise operators also coerce to integer, so they should roughly always be OK.
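A small sketch of that coercion: bitwise operators run their operands through ToInt32, which both truncates fractions and wraps out-of-range values.

```javascript
// Bitwise operators coerce operands to signed 32-bit integers:
const wrapped = (2 ** 32 + 5) | 0;    // high bits are discarded → 5
const overflowed = (2 ** 31) | 0;     // wraps into the negative range
const quot = (7 / 2) | 0;             // cheap integer division → 3
```

This is why `| 0` shows up so often as a poor man's int cast (and why asm.js used it as a type annotation).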


For code that actually needs bigints mixing numbers and bigints is just asking for horrible bugs like when a bunch of Twitter clients completely broke after 2^53 posts had been made.


By that logic, you should also test if `a` is a string that can be parsed into a number/bigint as well. Or if it's a Uint8Array that you can decode into a number/bigint.

But that's not how you write software. In an application, you have a canonical representation that you pass through your business logic.


I don't know if that follows. I think it's reasonable to expect a piece of code that works with integer numbers to also work with bigints. But it's not really reasonable to expect it to work with anything that closely resembles a number. That sort of weak typing is much more problematic and should be avoided.

But I dunno, maybe it shouldn't be expected to work with bigints either. It's not a regular int, so I doubt people will start using these as array indices in for loops or what have you, so maybe it should be a different type...


Well, bigints are fine as array indices. However they are not interoperable with numbers in any type of arithmetic. I don't think it's reasonable to expect to use bigints in any place where one uses numbers.

"Regular int" is not a distinct type in javascript. There are now two numeric types: 64-bit floating point, and bigint.
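A sketch of the index claim: property keys are stringified, so a bigint index resolves to the same element as the equivalent number, even though mixed arithmetic is forbidden.

```javascript
const arr = ['a', 'b', 'c'];
// ToPropertyKey converts 1n to the string "1", the same key the number 1 produces:
const viaBigInt = arr[1n];
const viaNumber = arr[1];
```

Both reads hit the same property, so bigint indices work; it's only arithmetic between the two types that throws.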


Obligatory informational: JavaScript numbers are IEEE-754 standard. You get 53 bits of precision, but can still represent other numbers outside that range, and there are the usual issues relating to dealing with floats.

    Math.pow(2,53)
    // 9007199254740992
    Math.pow(2,53)+1
    // 9007199254740992
    Math.pow(2,53)+2
    // 9007199254740994
    0.1 + 0.2
    // 0.30000000000000004
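For contrast, the same neighborhood with BigInt stays exact (a small sketch; the `n` suffix marks bigint literals):

```javascript
// Doubles round 2^53 + 1 down to 2^53; bigints represent it exactly.
const big = 2n ** 53n + 1n;   // 9007199254740993n, exact
const doubled = 2 ** 53 + 1;  // rounds back to 2 ** 53
```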


But it's really not. Pretty major difference between floats and arbitrary precision, like how you'll get different arithmetic results passing in 42 vs 42n. Just like any language that doesn't let you mix float and integer operands. Also, what's even the return value?

You need to make a deliberate decision about your data, not deck the halls with just-in-case programming. The software you write will be much better for it.

Besides, even if you could justify a bunch of typeof checks and you were doing it everywhere, Javascript already has a way to spare you from repetitive code: a function.


> In an application, you have a canonical representation that you pass through your business logic.

Not if your code works with arbitrary inputs.

I have some collation sequences that I’ll need to tweak.


It seems to me that in applications where there is widespread use of bigint, you would simply be doing `typeof a === 'bigint'`. And in nearly all other applications, it would remain simply `typeof a === 'number'`.


That’s what we did for our financial and technical (science/math) backend services that use node.js. We just don’t use the built-in number/float libraries, instead using an abstraction library based on bignumber.js/decimal.js etc, consistently.


It may be a good thing, for historical/backward-compatibility reasons.

But there's no reason that fixed-size ints and IEEE floats can't be intermixed with arbitrary-size/precision numbers, even in a small language, such as Scheme:

https://schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-...

Racket adds a bit more:

https://docs.racket-lang.org/reference/numbers.html

IIRC, Brendan Eich was aware of Scheme when he defined what's now known as JavaScript, but on a crazy-tight schedule, and JS wasn't intended to be an applications language.


You can't do arithmetic with numbers and bigints, so when would you ever accept a parameter that could be either a number or a bigint? The places where `typeof a === 'number' || typeof a === 'bigint'` could be used seem rare to me.


What makes you think that, exactly?


The more I think about it, the less I like this new feature. They should have implemented BigInt as a class instead and relied on a transcoder to make algorithms easier to write.


Does anyone know what BigInt serialization and deserialization to JSON looks like in practice in Chrome and Firefox?


You can define a toJSON property on the BigInt prototype that gets used for serialization:

    Object.defineProperty(BigInt.prototype, "toJSON", {
        value() {
            "use strict"; // keeps `this` as the primitive bigint
            return String(this);
        }
    });
Of course you could also serialize to a Number, but that risks losing precision. For example, the Twitter API often returns IDs as both a Number ("id") and a String ("id_str") to safely handle cases where the value might fall outside of a normal double-precision float: https://developer.twitter.com/en/docs/tweets/data-dictionary...
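With such a patch in place (repeated here so the snippet stands alone), digits beyond double precision survive stringification:

```javascript
// Patch BigInt serialization to strings, then round-trip a value
// that a double could not hold exactly:
Object.defineProperty(BigInt.prototype, "toJSON", {
  value() {
    "use strict"; // keeps `this` as the primitive bigint
    return String(this);
  }
});

const out = JSON.stringify({ id: 9007199254740993n }); // '{"id":"9007199254740993"}'
```

The receiving side then has to know to call `BigInt()` on that string — which is exactly the id/id_str dance the Twitter API does.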


I'm sure it's for backwards compatibility, but hot damn is that one hell of a footgun. Even with that in place, would you ever feel confident returning an ID that didn't fit in a double, knowing the sheer amount of code that definitely doesn't handle that case?


Only Strings are safe,

forty years past K&R.

Wish I knew Tcl.


I like your haiku.

But stay away from TCL...

It is such a mess.


I love TCL, personally. It's my favorite language to play around in.


Fixed it for you:

> I love TCL.

> It's my favorite language

> to play around in.


Your attempt conflicts with both previous haikus. By their evidence, "TCL" is only two syllables, but you treat it as three.

This does make me curious how they're pronouncing TCL in two syllables, though.


TCL is pronounced "tickle." https://en.wikipedia.org/wiki/Tcl


> third party ids are not yours. Treat them as strings, the other party can change them to string any day (happened to me more than once over the years) or they could use a different integer size than what you support (this case, or an unlimited integer value type - some languages have that too)


Only in JS do people seem okay with constantly mucking with built-ins.


Ruby has a culture around it too: https://www.justinweiss.com/articles/3-ways-to-monkey-patch-...

And you can get some of the patching effect in any language with uniform function/method calls, though typically with a little more scoping and thus less ability to inflict unexpected side effects: https://en.wikipedia.org/wiki/Uniform_Function_Call_Syntax


For pretty much the past decade (especially the last few years) I feel the culture has shifted to consider core monkey patches a pretty bad practice.


Sure, but it's also a bad practice in Javascript.



If people could muck with builtins in Python, they would. And worse, they'd do it with inheritance and metaclasses. I say this as a Python dev.


Python 3 mucked with built-in types...


I don't think monkey patching is really common in any version of python. I've not seen it done anywhere.


Setuptools monkeypatches distutils, and pip always uses setuptools, so pretty much everything installed from PyPI is the fruit of a monkeypatch.


I was talking about the introduction of bytes, bytearray, and the not-quite-the-same types that were backported to 2.x.


Afaict, JSON serialization is not supported yet; I got a "TypeError: BigInt value can't be serialized in JSON". I'm on Firefox Developer Edition 68.0b3.


I suppose one could use a replacer[0] when calling JSON.stringify and a reviver[1] when calling JSON.parse

[0] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
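A sketch of that approach, assuming both ends of the wire are under our control and bigints are tagged with a trailing "n" (the tag choice is an assumption, echoing the language's own literal suffix):

```javascript
// Serialize bigints as "…n"-tagged strings via a replacer,
// and revive them via the matching reviver.
const payload = { id: 9007199254740993n, count: 3 };

const json = JSON.stringify(payload, (key, value) =>
  typeof value === 'bigint' ? value.toString() + 'n' : value);

const restored = JSON.parse(json, (key, value) =>
  typeof value === 'string' && /^-?\d+n$/.test(value)
    ? BigInt(value.slice(0, -1))
    : value);
```

The replacer sees the bigint before JSON.stringify gets a chance to throw on it. The obvious caveat (raised below in the thread) is that any legitimate string that happens to look like "8n" would be misrevived, so this only works when you control the schema.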


You can certainly serialize to a string but knowing when you should revive to a BigInt seems tricky. You could prefix the strings I guess?


How about serializing to an array ["bigint", "170141183460469231731687303715884105727"]? It takes more space, but is easy to recognize, and easy to generalize to other types.


Presumably that could be confusing if you ever happen to have an actual array whose first element is the string "bigint".


Well, no matter what you do, surely you plan on documenting your API to such a bare-bones level that your user isn't racking their brain over whether a key is a number or an array of data.


Nah, you just have to also encode all arrays like

    ["array", [...]]
;-)


True, but if you control both the serialisation and deserialisation you can get away with it


Reuse the literal? (postfix with 'n')


You don't want strings that are "digits..n" to automatically deserialize into bigints. What if you have a string like "8n", which is a tractor model, or "9n", which is a smartphone model?

But it would be nice to have literal 8n to convert to a bigint 8n. That's not in the JSON standard though.


JSON serialization does not seem to be supported yet. `JSON.stringify(42n)` gives me a TypeError "Do not know how to serialize a BigInt" in Chrome. Deserialization support also doesn't exist as far as I'm aware. Would be great to be able to choose to decode integer JSON literals as BigInt, but I'm not holding my breath.


You could decode all of them as BigInt, but then you would be accepting a large overhead for the much greater number of constants like 1 or 42 where it's unnecessary. Everywhere you pass that JSON data would need to be updated to operate in terms of BigInt instead of number.

If you only decoded big ones as BigInt, you would have the same problem of updating the code but now you have 2 codepaths at every callsite!

It's perhaps a defect in the JSON spec that it doesn't have a provision to support BigInt, but it's not really clear what should happen!


You need some way of serializing that makes clear the value was serialized as a BigInt and thus must be deserialized as a BigInt, without the likelihood of misinterpreting data like 9n being a tractor model. A single indicator would be too error prone, so the schema has to be more complex. With that complexity it needs to perform well and be (cheaply) verifiable.

Quickest thing I can think of is storing it as a power of 2 plus the difference, ending with the character n for good measure. It is easy to serialize/deserialize, and you can verify the number by re-serializing it to check that it is a proper BigInt and not, for instance, a String. Confusion would still be possible, but rare. So 1 would be stored as 0+1n, 42 would be stored as 5+10n, etc.


BigInt's "n" suffix appears to stand for "numeric", but that seems confusing when typeof 1 === "number" and typeof 1n === "bigint". Why not an "i" suffix for "integer"? I understand that a "b" suffix is probably reserved in case JavaScript adds support for binary literals.

https://tc39.github.io/proposal-bigint/#prod-BigIntLiteralSu...


An 'i' suffix is often used in languages for complex numbers. For example, in Ruby '1i' evaluates to the complex number 0+1i.

I suspect they didn't want to conflate the two types.


1i is a bit more difficult to read than 1n in many fonts. I don't think any numeral is plausibly confused with n, so 'iNt' makes some sense as a suffix IMO.


As an aside, c# uses "L" to denote literals as 64-bit integers. "l" works too, but "L" is encouraged for the same reason you give ("l" could be confused as "1")


64-bit integers are referred to as 'long' in languages I'm more familiar with, so personally I would have expected an "L" suffix. 32-bit integers are referred to as 'integers' in many languages, so I wouldn't expect an 'i' suffix. In any case, I certainly agree that "n" is confusing since Javascript already has a numeric type.


Maybe to avoid confusion with mathematical imaginary notation.


I suspect it is more that in mathematics, n is used to indicate integer values as opposed to for example x, which is used with real values.


> BitInt's "n" suffix appears to stand for "numeric"

It actually stands for "not numeric"


For context, more general browser compatibility:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

In Chrome, but not Safari or Edge.


For those who care about JavaScript in embedded environments, Moddable released BigInt support in their open-source XS engine a few months ago: http://blog.moddable.com/blog/bigint/


What's with the comment at the end of the blog entry? lol.


I know right? Big accusation (that’s wrong), with nothing to back it up?


Maybe it's the algorithmic evolution of those gibberish comments WordPress blogs used to get all the time...


I think the comment has been deleted, do you remember what it was?


Basically, somebody left a comment saying the author was wrong and didn't know what he was talking about in a very aggressive way without demonstrating where it was wrong and why.

Sorry for the drama. I know it's off topic, but it's all I can contribute since I know very little about the implementation of base data types.


Amusingly the drama continued with a new post:

> censoring my comment will not resolve the inherent dishonesty of this article!!!

Even more amusingly, the commenter's name links to disney.com. Or perhaps it is a strange new astroturfing campaign.


Keeping up: the commenter has now acknowledged that the previous accusation was baseless.


Andy has previously had some asshats telling him he's wrong on things where he obviously is not. I don't know if there is one individual that holds a grudge or if he attracts an annoying number of know-it-alls.


This is big news! Shouldn’t the title be amended to include that it’s in Firefox Beta for now?


That's awesome! Works great on my Pi digits calculation demo :) https://observablehq.com/@mourner/calculating-pi-digits


Once I was working with a database with large integer ids loading them through json and showing them in the browser. There was a bug where ids didn't match when updating a record. The issue was that the json decoder for Javascript didn't handle large numbers, so an id like 12345678901234567 would show up as 12345678901234568. That was fun to debug.

I wonder if the interpreters will convert between BigInts and small ints automatically now? The backend side was PHP, which did, so we never noticed the problem there.
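The round-off described above is easy to reproduce directly:

```javascript
// JSON numbers parse to doubles; 12345678901234567 is not representable,
// so it rounds to the nearest double, 12345678901234568.
const id = JSON.parse('{"id": 12345678901234567}').id;
```

Every last digit of the original is in the JSON text, but the double produced by the parser has already lost it by the time your code sees the value.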


This is one reason why people use strings for IDs in JSON, even if they are actually integers.


Now I can have 2^128 grid squares in my procedurally generated browser MMO, instead of 2^106. That sounds like a joke, but it's actually a fact.


I'm curious why this wasn't shipped as a replacement of the existing Number type implementation, instead of this new type and syntax.


Because Numbers in JavaScript are floats, and not allowing floats in numbers would be a serious backward-compatibility issue.

Also, I'm assuming there are performance issues with using bigints.


Sounds like you're correct according to MDN:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


If I understand you correctly (which I’m not sure I do), I imagine it is because they can’t simply convert everyone’s floats ( type number) to ints (type bigint) as that would break a large portion of the internet.


Because number is practically the same as double, and you can't just go and change double values to integer values. Also, doubles are fast. BigInts are not.


Python has only Number though, and that can get as large as fits in your memory. Not sure why they didn't ship this with decimals and enable it by default in JavaScript.

I once wrote a replacement for the + operator in javascript because computers can't do proper addition (0.1+0.2!=0.3). It was basically remembering the sign and handling some other notation like 1e100, splitting on the dot, and adding them up the normal way (as integers). To merge the result of the two additions, take care of the carry (if any) and concatenate the two numbers again with a dot.

So from my primitive understanding, if you're already storing a variable-length number (a space win in most cases, compared to the previous 8-byte float), you might as well store two variable-length numbers. Something like ABxx xxxx for the first byte, where if A=0 there is no decimal part (no second variable-length number) and if B=0 there is no next byte (the variable length part of it), then use Bxxx xxxx for each following byte. Then you have arbitrary size and decimals with perfect addition (no floating point approximations anymore).

Would it be that much slower, that it's an awful idea to do by default? Like, would it be noticeable on an average website with the usual godawful amount of javascript? Python gets away with it, so that would be weird. And you could still introduce special syntax (like, I don't know, suffixing an n maybe) to use old and faster primitive types for those who really need that.


Decimals are conceptually similar to binary numbers, just with a different base. So no, they can’t represent all rationals accurately, and yes, they involve approximation.

For example, you can’t represent 1/3 in decimal, for the exact same reason you can’t represent 1/5 in binary. (1/3 is the infinitely repeating decimal 0.3333..., whereas 1/5 is the infinitely repeating binary 0.00110011001100...)

In general if you need to perfectly represent rational arithmetic you shouldn’t be using decimal or binary; you should have a type with an integer numerator and integer denominator. I don’t see the value of making arithmetic dramatically slower without actually solving the approximation issue.

If you are ever calling == on floating point numbers, you are doing something seriously wrong. Floating point numbers are supposed to be used for scientific and numerical computations where there is a notion of measurement error, and “exactly equal” is nonsense, so yes, speed is the entire point.

That’s why making a language without integers is such a serious mistake.


> [first two paragraphs]

Do you really expect I mentioned this example, mentioned I wrote some code that solves this issue, and still never looked up or came across an explanation of why most programming languages behave this way?

> If you are ever calling == on floating point numbers, you are doing something seriously wrong.

Not sure if the 'you' is actually directed to me or if it could be replaced with 'one', but since I mention that it would be nice to do so, I guess I should feel addressed. Thanks for saying I'm doing things seriously wrong, that really helps.

Your comment completely steers any further comments down this thread towards explaining to me why floating point addition is fast but imprecise, rather than what I mentioned that I am actually wondering about: is it that much slower to do arbitrarily large integers by default (separate from the decimal issue), and secondarily solve the decimal issue at the same time (given the example I mention of the method that solves it, at least for addition, in roughly O(2))?


> Do you really expect I mentioned this example, mentioned I wrote some code that solves this issue, and still never looked up or came across an explanation of why most programming languages behave this way?

Well, what you described does not actually solve the issue despite you claiming that it has, so I thought you might be confused. Which is not an insult -- many people are confused about this issue.

And you appear to have misunderstood my comment, which is not about explaining to you why binary arithmetic is imprecise, which you obviously already know. It is about explaining that decimal arithmetic is also imprecise, for the exact same reason, which is something that much fewer people understand.

Your "fix" makes it so that 1/10 + 2/10 == 3/10, but it still doesn't make it so that 1/3 + 1/3 == 2/3. So how is it actually "precise"?

To answer your question about speed: yes, doing things with arbitrarily-sized integers is much slower than doing them with floats (or normal integers for that matter). In the best case, you add at least one branch to every arithmetic operation. And a binary-coded decimal scheme like you described would be even slower still.

It doesn't really matter whether it would make the average website slower, since the average website should not be using floats (OR binary-coded decimals like your scheme) in the first place except for calculating layouts or other numeric calculations where asking whether 0.1 + 0.2 == 0.3 would never come up. For discrete computations they should be using integers -- that's what integers are for.


> Your "fix" makes it so that 1/10 + 2/10 == 3/10, but it still doesn't make it so that 1/3 + 1/3 == 2/3. So how is it actually "precise"?

Fair point! I don't think anyone ever put it quite this way. I mean, I knew that 1/3 cannot be represented in decimal and that decimal, like binary, is imprecise for the exact same reason, but I don't think anyone asked me about the definition of precise and why I think my version of addition fits that definition of precise better :). I think the answer is that, in code, we type in decimal: 0.1+0.2 and not 0b0.1+0b0.10 (if that would even be valid syntax). We work in base 10 most of the time, so we know that operations on 1/3 can not have infinite precision. But that's just something I came up with on the spot, I'm not sure that this is the true reason why it feels more correct.

> It doesn't really matter whether it would make the average website slower, since the average website should not be using floats

Fair enough about floats, but why about arbitrarily large integers? The feature being introduced could have been introduced as 'works out of the box' instead of 'opt in using the n suffix'.

Actually, I just realized it would probably break code that does bit shifts. Maybe that's why bigint is not the default?


> I think the answer is that, in code, we type in decimal: 0.1+0.2

I think you are exactly right. People think of decimals as being the "actual", "primary", "fundamental" numbers, and binary as being an imperfect representation of those. Whereas in reality, both binary and decimal are imperfect representations of rational numbers, and we only think of decimal as being more fundamental because of our writing system.

> Fair enough about floats, but what about arbitrarily large integers

How exactly would you represent them? The best way I can think of is:

    struct BigInt {
        int64_t first_64;
        char *data;   // extra, dynamically allocated limbs (null if unused)
        int data_len; // 0 means the value fits entirely in first_64
    };
This would allow you to avoid doing a dynamic allocation for the most common case of being under 64 bits. And the addition algorithm would probably special case that too, and look something like this:

    BigInt add(BigInt x, BigInt y) {
        if (x.data_len == 0 && y.data_len == 0) {
            int64_t new_val;
            // a checked add (e.g. GCC/Clang's builtin) detects the carry
            // without invoking signed-overflow undefined behavior
            if (__builtin_add_overflow(x.first_64, y.first_64, &new_val)) {
                return add_slow_path(x, y);
            }
            return {new_val, nullptr, 0};
        } else {
            return add_slow_path(x, y);
        }
    }
As you can see, there is a ton of complexity here, even just for the simplest possible case, replacing what was before literally just one instruction, e.g. addq %rbx, %rcx. Also, each number is now represented by a 24-byte struct (20 bytes of fields plus alignment padding) instead of 8 bytes, so each 64-byte cache line fits only 2 values instead of 8. Because of all this, it would be dramatically slower.

This is just for the easiest case of no overflow! If you overflow and have to then go allocate memory dynamically and loop over it, it would of course be even worse.


> Python has only Number though, and that can get as large as fits in your memory.

This is not true.

    >>> type(1)
    <class 'int'>
    >>> type(1.5)
    <class 'float'>


Oh, my bad. I thought those were abstracted away.

Still though, any int can get as large as you like by default, with no weird n suffix (which I never saw in any other language -- just like most of Javascript's other recently added syntax, by the way; it's the new Perl).
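For reference, the suffix in action (this is the standard BigInt syntax, nothing hypothetical):

```javascript
// Plain Numbers silently lose precision past 2^53:
console.log(2 ** 53 + 1);    // 9007199254740992 (the +1 is lost)

// BigInts, written with the n suffix, do not:
console.log(2n ** 53n + 1n); // 9007199254740993n
```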

I do wonder where I got this notion of Number. Is there some other language that has this?


I think stuff like wolfram language and mathematica probably have some "universal" numeric type.

However, I don't know a single mainstream application programming language that has a single numeric type that can handle: arbitrarily large integers, floating point values, and correct decimal arithmetic (0.1 + 0.2 == 0.3). I have at least a passing familiarity with probably about a dozen general purpose programming languages, and none of them can do it. If anyone knows of one, I'd be interested to learn about it.


The common name for what you call a "universal numeric type" is a "numeric tower". Most Lisp dialects have something like that. What that means is that you have classes for small integers (fixnum), arbitrary-precision integers (bignum), fractions, floats, and even complex numbers, along with the appropriate abstract base classes (e.g. integer, rational, real...), and arithmetic operations transparently use the most appropriate type for the result, i.e. the result of "1 / 10" comes out as "1/10" (of type fraction) and not as the float "0.1000...something".

Python 3 takes mostly the same approach to number types.


How odd to notice that my brain really messed that number type up. I could swear Python has a type called (capitalized) Number and that this handles arbitrarily large numbers as well as decimals. Seems like that 'memory' is completely fictional.


Python does actually have an abstract type for numbers, and it is called Number: https://docs.python.org/3/library/numbers.html.


> I once wrote a replacement for the + operator in javascript because computers can't do proper addition (0.1+0.2!=0.3).

That's normal floating point math behaviour and fine in most use cases. You can't fix this for add, sub, mul and div without serious performance implications.
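A small illustration of both the behaviour and the usual cheap workaround, scaling to integers instead of paying for exact decimal arithmetic everywhere (plain JS, nothing hypothetical):

```javascript
// The classic IEEE 754 surprise:
console.log(0.1 + 0.2 === 0.3); // false
console.log(0.1 + 0.2);         // 0.30000000000000004

// Working in integer tenths sidesteps it, since small
// integers are represented exactly in doubles:
console.log((1 + 2) / 10 === 0.3); // true
```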

> Would it be that much slower, that it's an awful idea to do by default?

Yes. Easily an order of magnitude slower, perhaps two. The fixed size of uint32, float, double, etc. is an important property that allows computations to be fast. And since the size is fixed, you have to accept trade-offs in how or which numbers can be represented. Also, as someone else already mentioned, you can't even store 1/3 as a single numeric value. You'd have to store it as a rational number. Things get super complicated once you combine rational numbers in computations and complex formulas.

I'm doing lots of number-crunching tasks with javascript, with performance of up to almost ~50% of equivalent C++ code. If js had used a non-natively supported number format by default, it would have been useless for me.


I did some testing with pypy, which also works with arbitrarily large integers but iirc does JIT compilation instead of interpretation (like cpython would do), so that should be similar to JS in V8, except that it has arbitrarily large integers.

    a = Math.pow(2, 1023)
    t = new Date().getTime()
    for (var i = 0; i < 1e7; i++) {
        a += i;
    }
    console.log(new Date().getTime() - t);
vs

    a = pow(2, 1024)
    t = time.time()
    for i in range(int(1e7)):
        a += i
    print(time.time() - t)
Both were run a bunch of times: pypy does it in 279ms and nodejs in 250ms. I chose 1024 for Python because that is where JS starts to return Infinity, so the JS code does operations on a number just below that. The time seems to be spent in the loop, as an empty loop or a loop doing a+=0 is 20x faster.

Lowering the exponent to 100, JS spends 269ms and pypy 142ms. Not sure why that is, but having arbitrarily large integers doesn't seem to make this arithmetic any slower.

I don't know how to quickly toy around with fraction-based floats, but at least for arbitrarily large integers, I'm not sure why we're going to have to put up with new syntax.


25 nanoseconds is much longer than a normal double-precision addition, loop counter increment, and conditional jump back to the top of a loop should take, so there's something other than the time taken by additions that's going on in your benchmark. I'm not a Node.JS expert, but I suspect it's not getting JITted properly, or getting poorly optimized if so.

I tried in C:

    #include <math.h>
    #include <stdio.h>
    int main()
    {
      double a = pow(2, 100);
      for (double i = 0; i < 1e7; ++i) {
        a += i;
      }
      printf("%f\n", a);
    }
and timed it. The time taken was 17ms.


Rearranged your js sample a bit; now it runs in ~26ms (first time) and ~11ms (subsequent times) instead of ~220ms in the Chrome developer console.

    {
        let a = Math.pow(2, 1023)
        let t = performance.now();
        let max = 1e7;
        for (let i = 0; i < max; i++) {
         a += i;
        }
        console.log(performance.now() - t);
    }
The main problem was that a should be declared with let.

That benchmark is a bit strange/flawed anyway. You're initializing a as pow(2, 1023), then adding numbers in the loop. But since a is already such a large double value, the result won't change: the numbers you add are too small to make a dent in it. a isn't an exact integer; it's a double with limited precision for large integer values, and at that magnitude the gap between adjacent representable values is far larger than anything the loop adds.
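This is easy to check directly; at 2^1023 the gap between adjacent doubles is 2^971, which dwarfs anything the loop adds:

```javascript
const a = Math.pow(2, 1023);
// 1e7 is far below the spacing between doubles at this magnitude,
// so the sum rounds straight back to a:
console.log(a + 1e7 === a); // true
```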


> I once wrote a replacement for the + operator in javascript because computers can't do proper addition (0.1+0.2!=0.3).

...this is a terrible, bad-faith way of making a complaint that isn't even valid. Computers do do proper addition. And they handle the exact example you give perfectly, if that's what you want. JavaScript might have difficulty with that problem, but that's because, unlike the computers it's implemented for, it has no integers.

What do you think "proper addition" would involve if I asked you to tell me 1/7 + 1/3, on paper?


The correct answer to someone who insists upon "proper addition" is 10/21 and for it to be annoyingly slow so that they learn to be sure if they really care about "proper addition" or are just being awkward.

If you mean how should the machine do that, it can find the least common multiple of 3 and 7 (which is 21) and then convert both fractions to be in that denominator, then simplify if possible. This is, as I said, annoyingly slow, but if you want "proper" answers that's what you got.
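A minimal sketch of that procedure with JS BigInts; the helper names (gcd, addFrac) and the [numerator, denominator] pair representation are made up for illustration:

```javascript
// Euclid's algorithm on BigInts (0n is falsy, so the loop terminates)
function gcd(a, b) {
  while (b) { [a, b] = [b, a % b]; }
  return a;
}

// Exact fraction addition: common denominator, then reduce by the gcd
function addFrac([n1, d1], [n2, d2]) {
  const n = n1 * d2 + n2 * d1;
  const d = d1 * d2;
  const g = gcd(n < 0n ? -n : n, d);
  return [n / g, d / g];
}

console.log(addFrac([1n, 7n], [1n, 3n])); // 10/21, exactly
```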

I wouldn't bother because I'm aware that _Almost All Real Numbers are Normal_ and so they're usually completely impossible to express in this fashion anyway and we should stop our foolish pretence that you can add non-integers together and expect to get "correct" answers just because it can be done for some easy cases.


I'm sorry but where did I insist upon proper addition? I'll annotate the parts of my post that might be mistaken for it:

> a replacement for the + operator in javascript because computers can't do proper addition (0.1+0.2!=0.3).

Just saying they can't do it (I edited this: first I said that JS doesn't do it properly, but I thought that was rather too narrow. I guess 'computers' is too broad again. Pick a name, you know what I mean)

> [an explanation of what worked for me some years ago] Then you have arbitrary size and decimals with perfect addition (no floating point approximations anymore).

Again, just mentioning that this would solve it for addition, not saying this is the perfect way for life, the universe, and everything.

> Would it be that much slower, that it's an awful idea to do by default?

See, I'm not insisting on anything, I'm wondering and asking.

> Python gets away with it,

It doesn't do correct decimal addition either, so I'm not even focusing on resolving floating point inaccuracies, I'm more interested in "if it would be so slow to do arbitrary precision integers---oh and by the way, wouldn't it also solve this addition thing?"

Now, you also didn't exactly say that I was insisting, you said "someone who insists". So maybe this only applies to your parent comment. But every time I bring it up, people stumble over each other to tell me why it is this way. I already know why it is this way. There are a lot of other words in the comment that one could reply to, and it's rather frustrating that they're completely overshadowed - every time - by people ignoring everything except those twelve magic characters: 0.1+0.2!=0.3.


>> a replacement for the + operator in javascript because computers can't do proper addition (0.1+0.2!=0.3).

> Just saying they can't do it (I edited this: first I said that JS doesn't do it properly, but I thought that was rather too narrow. I guess 'computers' is too broad again. Pick a name, you know what I mean)

You can say it, but that won't make it true. They can do it, and they do do it. Your comment is so much nonsense. The algorithm computers use to add 0.1 and 0.2 is the same algorithm that you use, which is, unsurprisingly, why they produce correct results.

> There is a lot other words in the comment that one could reply to, and it's rather frustrating that it's completely overshadowed - every time - by people ignoring everything except those twelve magic characters: 0.1+0.2!=0.3.

I'll point out again that I'm focusing on your completely unjustified claim that "computers can't do proper addition", which you didn't bother to include in "those twelve magic characters" that everyone is complaining about.


>> You can say it, but that won't make it true. They can do it, and they do do it. Your comment is so much nonsense. The algorithm computers use to add 0.1 and 0.2 is the same algorithm that you use, which is, unsurprisingly, why they produce correct results.

Are you aware of floating point math? That's what most languages, and almost all languages aimed at performance, use. It's defined in the IEEE 754 standard and supported on a hardware level in many devices.

Here is a listing of the result of 0.1 + 0.2 in various languages: https://0.30000000000000004.com/

As the URL already indicates, most languages, including C, Rust, C++, Java, Javascript, Clojure, FORTRAN, Python, etc. evaluate this to 0.30000000000000004. I'm fine with this, I need fast rather than precise math. But it's not "correct".


I'm so glad we have two shipping implementations now


I can't wait to kick the tires on this feature. I wonder how it compares to the integer implementation in WebAssembly.


WebAssembly does not have arbitrary-precision integers, so it is able to execute arithmetic as direct hardware instructions. BigInt, on the other hand, is arbitrary precision, so it obviously can't do that.


Now let's wait another decade before they add BigDecimal...


While not directly related to BigDecimal, my employer (Bloomberg) primarily sponsored this effort through the arduous TC39 process, and sponsored the implementation in any browser vendor that wanted it, primarily as a stepping stone to standardising decimal floating point.

We see BigInt as a relatively uncontroversial but important extension of the numeric type system of JavaScript. Once the door is open for a type that is not immediately compatible with Number, potentially new numeric types can be introduced. Perhaps BigDecimal (or Rational?) can follow.


I think a friend of mine said he was in a meeting back when Javascript was being standardized. Mike Cowlishaw proposed that Javascript use REXX's Decimal type instead of IEEE 754.

Everyone thought that was a really bad stupid idea.


It probably is a bad idea when it's "instead" rather than "in addition". IEEE 754 floating point computations are much, much faster (given hardware support), while the precision is sufficient for most applications that don't have to deal with money.


I talked to my friend, he said Mike Cowlishaw tried very patiently explaining multiple times the need for a decimal type and basically ran into your argument.

I.e. if we provide a decimal type as the default, then everyone will use it and it'll be 'slow'; thus this is a bad idea, so we should not even make it an option. Because IEEE 754 is fast, and fast is 'good'.

Frankly that decision was the worst language decision ever made bar none.


The technical committee are a bunch of dumbasses. There's no good reason to not specify bigint and bigrat at the same time.


Can it represent and manipulate IPv6 addresses (128-bit quantities) efficiently? Not having to split them into two 64-bit values or deal with them as bit strings may be an improvement for some things.


Internally it is implemented by splitting the number into an array of conveniently sized chunks (traditionally called limbs). That might mean the machine word size for an optimized assembly implementation on many common platforms, or one bit less than the native word size for a portable C implementation (and architectures without a carry flag), or some fixed number like 15 bits (IIRC mbedTLS has, or at least used to have, a hardcoded limb size of 15 bits in its bignum implementation).
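As to the question above: yes, with BigInt a 128-bit IPv6 address fits in a single value, and shifts and masks behave as you'd expect (the constant below is just the documentation address 2001:db8::1):

```javascript
// 2001:db8::1 as one 128-bit BigInt
const ip = 0x20010db8000000000000000000000001n;
const MASK64 = (1n << 64n) - 1n;

console.log((ip >> 64n).toString(16));   // high 64 bits: "20010db800000000"
console.log((ip & MASK64).toString(16)); // low 64 bits:  "1"
```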


I feel sorry for whoever has to write the polyfill for this...


Bignums themselves are actually pretty easy to implement as a plain library, provided your language has a uint8-array type (not just a String type, which might require the content be utf-8 valid, or might only have facilities to operate on codepoints, or might cut the data off at the first NUL.)

JS already has a Uint8Array, so bignum libraries are easy. (Let me tell you though, there were JS bignum libraries before Uint8Array, and they were, ahem, “interesting.” I think a popular one built a byte-array abstraction on top of hex-encoded strings, and then built bignums on that.)

Actually supporting the literal syntax or the operators via a plain library is impossible; but you can just do it the other way around: encourage developers to use a bignum library (for now), and have said library just “bake down” to native BigInts when they’re available.

Since everybody until now who was using bignums in JS was using them through a library, this won’t be a hard change to make.

(And it’s already been done! https://www.npmjs.com/package/big-integer now bakes down to native BigInts when it can.)

As for the brazen developers who start writing code to directly use the native BigInt... I suppose they’ll have to have two versions of their compiled code units in their minified blob, with a loader shim that evals out the version of the class/module that does/doesn’t use bigints. Maybe we’ll see an JS-pipeline pass to generate this. (I’m not a JS dev, so I’m not sure if there’s already support for this kind of thing.)


Babel (a popular transpiler) already has a plug-in for BigInts https://babeljs.io/docs/en/babel-plugin-syntax-bigint


> Bignums themselves are actually pretty easy to implement as a plain library, provided your language has a uint8-array type (not just a String type, which might require the content be utf-8 valid, or might only have facilities to operate on codepoints, or might cut the data off at the first NUL.)

Or if your String type doesn't do any of those. Javascript strings will happily hold an arbitrary series of 16 bit numbers.

Even when something requires valid code points, you can still easily store arbitrary 8 bit values. Or 15/16/17/20 depending on how fussy you are. Unless you want it to be hacky, it shouldn't be notably worse than an array.


> Actually supporting the literal syntax or the operators via a plain library is impossible; but you can just do it the other way around: encourage developers to use a bignum library (for now), and have said library just “bake down” to native BigInts when they’re available.

Can't you rewrite the syntax into something usable at runtime? Not that it would be a good idea.


This ...might be possible, but the performance would indeed suck; you’d need to essentially ship half a compiler toolchain in your polyfill, and ensure that it gets run separately, non-async, so that the browser finishes installing it before attempting to interpret any more <script> tags. (And, presumably, all of your app’s source would be in the next script tag, so that your app can take advantage of native bigints; so your app wouldn’t load at all until the toolchain had finished bootstrapping.)

I’m guessing this has been done, to let the browser run—without backend compilation—<script type=“foo”> where foo is CoffeeScript or TypeScript or ClojureScript or what-have-you; but this one would be especially bad, because the thing it’d be targeting would still be Javascript, and so you’d have to rewrite all Javascript sources you encounter, with most of them likely never using native bigints but suffering the double-parsing overhead anyway.

Oh, and you’d have to override the ES6 module loader with one that also rewrites what it loads. Ugly.


A more popular approach to building a bigint library is to store a number as a series of limbs, where each limb is an unsigned 32-bit or 31-bit quantity. Since an IEEE 754 double allows exact integer arithmetic up to 2^53, that's fine.
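A minimal sketch of that representation; I use base-2^30 limbs (one illustrative choice, not a requirement) so that per-limb sums plus carry stay comfortably within the exact-integer range of a double:

```javascript
const BASE = 2 ** 30;

// Add two bignums stored as little-endian arrays of base-2^30 limbs.
// Each partial sum is at most 2*(BASE-1)+1 < 2^31, exact in a double.
function addLimbs(a, b) {
  const out = [];
  let carry = 0;
  for (let i = 0; i < Math.max(a.length, b.length) || carry; i++) {
    const s = (a[i] || 0) + (b[i] || 0) + carry;
    out.push(s % BASE);
    carry = Math.floor(s / BASE);
  }
  return out;
}

console.log(addLimbs([BASE - 1, BASE - 1], [1])); // [0, 0, 1], i.e. 2^60
```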


Yeah, the Chrome team basically recommended not to, favoring using something like Babel instead:

https://developers.google.com/web/updates/2018/05/bigint#pol...


> and they are also making it infeasible (in most cases) to transpile BigInt code to fallback code using Babel or similar tools.

That doesn't sound like a recommendation.


Yeah. The recommendation is to write your code now making explicit calls to a specific bigint library. Later, when the JavaScript in the browsers you need to target has built in bigint support, Babel can replace your library calls with code that uses the built in bigints.

For this to work, you need a library whose bigint behavior either exactly matches the built-in implementation, or whose deviations are known and can reasonably be accounted for when translating to built-in bigints.

The library they recommend is a JavaScript translation of their implementation of built in bigints, so should match numerically what the built in implementation does.

There may be other libraries that would be suitable, but it would be nice if everyone agreed on just one for this so that when the time comes to use Babel or similar to replace the library calls with use of the built ins, it only has to deal with one library.


Note the comma: I was specifically referring to the advice against attempting to polyfill it. The Babel reference was attempting to convey the idea that instead you need something on the same order of complexity as a compiler to do this properly, but the details about JSBI are why I linked to their writeup rather than trying to summarize it.


There are some good big integer libraries already around.

https://www.npmjs.com/package/big-integer


JSBI[1] is a polyfill for the BigInt semantics, without the syntax. Then there's a Babel transform[2] to compile to the native syntax if all your targets support it.

[1]: https://github.com/GoogleChromeLabs/jsbi

[2]: https://github.com/GoogleChromeLabs/babel-plugin-transform-j...


Have you tried implementing a simple bigint library? Except for division everything is easy. I can even write it in assembler, and for example addition is just an add instruction followed by some adc instructions. Subtraction likewise. Assuming you want grade school multiplication (not Karatsuba or something fancy) it's also straightforward.


Can you actually do a polyfill for this? I'd have guessed that it was too low level for that.


> This contrasts with JavaScript number values, which have the well-known property of only being able to precisely represent integers between -2⁵³ and 2⁵³.

JS's Number can represent way more integers than just those between -2⁵³ and 2⁵³. There's a great StackOverflow answer[1] that answers a similar question, ("What is the biggest "no-floating" integer that can be stored in an IEEE 754 double type without losing precision?") which also applies here (A JS Number is an IEEE double):

> The biggest/largest integer that can be stored in a double without losing precision is the same as the largest possible value of a double. That is, DBL_MAX or approximately 1.8 × 10³⁰⁸ (if your double is an IEEE 754 64-bit double). It's an integer. It's represented exactly. What more do you want?

> Go on, ask me what the largest integer is, such that it and all smaller integers can be stored in IEEE 64-bit doubles without losing precision. An IEEE 64-bit double has 52 bits of mantissa, so I think it's 2⁵³:

More seriously, this is great news, and I think it'll be nice to have such a numeric type in JS.

[1]: https://stackoverflow.com/a/1848762


You seem to be doing an uncharitable reading of what you quoted. They can precisely represent numbers outside that range, true. They cannot represent all numbers outside that range.


That's obviously not what was intended by the question, come on.



