I've been programming JS for a very long time and have learned to just stop trying to do a generic deep copy. Since JS is a dynamically typed language, it will always lead to issues down the road. Instead I write domain specific merge methods for whatever objects I'm merging.
function mergeOptions(...options) {
  let result = {};
  for (const opt of options) {
    result = {
      ...result,
      ...opt,
      arrayValue: [
        ...(result.arrayValue || []),
        ...(opt.arrayValue || [])
      ],
      deepObject: {
        ...result.deepObject,
        ...opt.deepObject
      }
    };
  }
  return result;
}
Know the shape of your objects and merging deeply becomes painless and won't have edge-cases.
If anyone here ends up using it, make sure you learn how it works first. There are performance implications. It may not matter but it's important to know what they are. Also it's generally best practice to learn how magical utilities like immer do what they do before using them!
Java's deserialization will instantiate any classes that the data tells it to, which in practice leads to myriad vulnerabilities as many classes have constructors that can be used to write files, execute shell commands, etc. Many programmers didn't realize this, and bad things happened.
Using structured cloning for deep copies is clever, but may or may not give you the behavior you want for SharedArrayBuffers. The copied value would be a new SAB with the same underlying data buffer, so that changes to one value will be visible to the other. That's good for most uses of structured cloning, but it's not what I would expect from a deep copy.
The point of structured clone was originally for passing data to web workers. Since the point of shared array buffers is to share data with workers, it makes sense that the structured clone algorithm keeps the SAB identity.
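SharedArrayBuffer sharing is hard to demonstrate portably outside a worker setup, but the identity-preserving flavor of the algorithm shows up even for plain objects: duplicate references inside the cloned graph stay duplicates. A small sketch (requires an environment with a global `structuredClone`, e.g. Node 17+):

```javascript
// Duplicate references inside the graph are preserved as duplicates.
const shared = { n: 1 };
const src = { a: shared, b: shared };
const copy = structuredClone(src);
console.log(copy.a === copy.b); // true — internal aliasing is preserved
console.log(copy.a === shared); // false — but it is new memory
copy.a.n = 2;
console.log(copy.b.n);          // 2 — one underlying object in the copy, too
```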
Also, a program that only works in one browser, and only due to that browser being willing to open security holes other browsers are not willing to open, is arguably still defective. That said, that might describe a lot of things people are doing nowadays (e.g. anything that uses WebUSB).
Fundamentally, deep-copy in Javascript is a typed operation and no "generic" deep-copy algorithm is possible--you have to know what meaning the value is supposed to have to copy it.
There's nothing inherently wrong with structured clone, but it's only for JSON-able objects with extensions for circular references and some built in value-like classes. It's also special-cased for safe transmission between Javascript domains so it has a bunch of undesirable behavior for local copies. (no dom, no functions, no symbols, no properties...)
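A quick check of what structuredClone accepts, assuming a runtime with the global available (Node 17+ or a modern browser): value-like built-ins and circular references work, functions throw.

```javascript
const ok = { when: new Date(0), map: new Map([['k', 1]]) };
ok.self = ok; // circular reference is fine
const c = structuredClone(ok);
console.log(c.when instanceof Date); // true — Dates survive as Dates
console.log(c.map.get('k'));         // 1 — Maps survive too
console.log(c.self === c);           // true — circularity is preserved

try {
  structuredClone({ fn: () => {} });
} catch (e) {
  console.log(e.name); // 'DataCloneError' — functions are rejected
}
```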
Even primitive types can't be safely copied unless you know what they're going to be used for later, a nodejs file descriptor is just an integer but it's also a reference to an OS resource that can't be duplicated without a syscall.
> Fundamentaly, deep-copy in Javascript is a typed operation and no "generic" deep-copy algorithm is possible--you have to know what meaning the value is supposed to have to copy it.
Why? I see no problem if the deep-copy behaves exactly the same as the original, from the perspective of any operation in the Javascript API (except for the === operator).
Exactly the same as the original is a shallow copy.
Deep copy roughly means "If I do `dst = deepcopy(src)`, modifying anything in the world I can reach through a reference from dst should have no side effect visible through src" which is a reasonable thing to ask in some special cases (a tree of plain old javascript values, a DOM node) but not reasonable in others (a database connection object, a file descriptor, a user-id)
Like, what should a deep copy function do if it reaches a reference to the global object? or a function that closes over some references in the object being copied?
The answer is always "it depends on what the object will be used for later and what specific behavior you want"
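That definition of deep copy, in code: a shallow copy fails it, because a mutation reachable through dst is visible through src.

```javascript
const src = { nested: { n: 1 } };
const dst = { ...src };    // shallow copy: dst.nested is the same object
dst.nested.n = 99;
console.log(src.nested.n); // 99 — side effect visible through src
```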
The main difference between === and Object.is() is that === treats +0 and -0 as equivalent, despite the fact that some math operations treat them differently. I think they added Object.is() because it was awkwardly difficult to make code tell the difference between +0 and -0.
It is fundamentally very hard (impossible?) to deep copy everything in JavaScript. References cannot be escaped because they may be hidden in non-introspectable (crucially, non-cloneable) places. Viz:
function f() { var x = 0; return function () { return x++; }; }
var x = { foo: f() };
console.log(x.foo());
var y = { foo: f() };
console.log(y.foo());
var z = someDeepCopy(y); // someDeepCopy is hypothetical
console.log(z.foo());
console.log(x.foo());
console.log(y.foo());
console.log(z.foo());
If a copy were sufficiently deep then one could expect:
0
0
1
1
1
2
However if it were not deep one would get:
0
0
1
1
2
3
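The "not deep" branch is exactly what object spread gives you — the copy shares the original closure and its counter. A runnable sketch (variable names changed so it stands alone):

```javascript
function g() { let n = 0; return () => n++; }
const y2 = { foo: g() };
const z2 = { ...y2 };  // shallow: z2.foo is the very same function object
console.log(y2.foo()); // 0
console.log(z2.foo()); // 1 — shared counter
console.log(y2.foo()); // 2
```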
Even if one allows a deep copying of closures then this still might not work as an object which contains two (potentially different) functions closing over the same binding (ie particular instance of a particular variable) may be copied into two functions each closing over their own separate binding.
I think the only good solution to this is to either give up trying to do deep copies or give up immutability and stop caring about deep copies.
The language really should have a true immutable type (without freezing, etc.) and deep copy method built in, with as many caveats and parameters as needed. Coroutines would be awesome as well. (Yes, I'm thinking, "How could JavaScript be more like Go or Erlang?")
And then it needs to stop adding new features for at least a couple years so the world can catch up.
This is presented in the linked document with the following context-specific gotchas:
> Unfortunately, this method only works when the source object contains serializable value types and does not have any circular references. An example of a non-serializable value type is the Date object - it is printed in a non ISO-standard format and cannot be parsed back to its original value :(.
That's false. If you use JSON.stringify on an object, it will call the toJSON method of each value recursively. The toJSON method of a Date returns the ISO string.
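Checking the claim directly: Date's toJSON returns the ISO string, and JSON.stringify uses it.

```javascript
const d = new Date(0);
console.log(d.toJSON());                  // '1970-01-01T00:00:00.000Z'
console.log(JSON.stringify({ when: d })); // '{"when":"1970-01-01T00:00:00.000Z"}'
```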
I did not say that JSON.parse would parse the dates.
I'm saying that the article is wrong when it says that JSON.stringify does not transform Dates into ISO string.
Right after that, the article says "cannot be parsed back to its original value" when it definitely can be parsed back. JSON.parse does not do it by default but that was never his point.
> Right after that, the article says "cannot be parsed back to its original value" when it definitely can be parsed back. JSON.parse does not do it by default but that was never his point.
The broader point, which is entirely correct, is that you don't get back an exact copy of the object you want to clone, because the date fields don't end up being the same type.
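What the round trip actually returns: the value survives as a string, the type does not.

```javascript
const original = { when: new Date(0) };
const roundTripped = JSON.parse(JSON.stringify(original));
console.log(original.when instanceof Date);     // true
console.log(roundTripped.when instanceof Date); // false
console.log(typeof roundTripped.when);          // 'string'
```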
Why? Why would people want to clone objects? When I have encountered this in the past it is from people who are new to the language.
My advice to any person who really believes they need a clone of an object: do some self-reflection on why you think you need a cloned object. Any other approach is more efficient and simpler in the code.
It is an extra step, but when using redux or something like that you have to serialize stuff anyway to store it, and it is especially useful for mobile react-native stuff in keeping state when the phone restarts or the connection fails.
I would recommend immer (https://github.com/mweststrate/immer) instead of ImmutableJS. You can work with regular JS objects, plus it plays much nicer with TypeScript.
I suppose immutable.js supports partial reuse of objects, that is, if you only change the value of one attribute, the changed copy has this attribute set differently, but the rest is shallow-copied?
If so, indeed immutable objects would not run into the problem of copying, as long as you can afford them to be immutable. (That is, you're not working with any APIs that assume and use mutability.)
ImmutableJS is a pretty huge library. I'd conjecture most web apps don't have a legitimate need for something so comprehensive and would be better off with a simpler solution.
If you know the shapes of your objects ahead of time you can create one-off functions, which will probably be faster and require far less code.
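A sketch of that "one-off function" approach — `setUserName` and the state shape are made-up examples, not a library API:

```javascript
// Hand-rolled immutable update for one known shape: copy only the path
// that changes, share everything else.
function setUserName(state, name) {
  return { ...state, user: { ...state.user, name } };
}

const s1 = { user: { name: 'a', id: 1 }, items: [1, 2] };
const s2 = setUserName(s1, 'b');
console.log(s2.user.name);          // 'b'
console.log(s1.user.name);          // 'a' — original untouched
console.log(s1.items === s2.items); // true — unchanged branches are shared
```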
I cannot recommend ramda enough. It provides the immutability and flexibility of immutablejs, but because it's built on a functional paradigm, the logic is fully composable, so complex, deeply nested changes are very simple and readable.
both are used to approach the same problem: immutable manipulation of javascript data. The fact that immutableJS provides a persistent object is an implementation detail in the strategy used to solve that problem.
Basically, it's an AST translator that lets you use imperative syntax against immutable types. That is, `x.a = b` becomes the clunky `x = x.set('a', b)`, and it really gets convenient when you have complex structures.
Would it be worth it to look into a babel plugin for Javascript and ImmutableJS?
I like the shallow copy, and never needed a deep copy. I am using JS for a few years tops, mostly React.
To me it's exactly what native languages do with pointers. In some languages (like Delphi) it's implicit (like JS) and in some (like C) it's explicit in the syntax.
"It works if you provide your own additional code that loops over and fixes known problems" is not the same as "it works". It's useful, but at most it saves you a few lines of boilerplate code (and perhaps a few negligible cycles) to write your own recurser to do the same thing immediately after the transform.
> Do you want JSON.stringify to keep dates intact ? The whole point is to turn it to a string.
There are ways to serialize data structures to strings. JSON doesn't generally do that, and it's been to its benefit as it helps keep it generic and easily usable from many languages.
That said, there are serialization tools which can correctly serialize and unserialize more complex structures, given the correct circumstances (only core constructs, or the objects in question have serialization helpers, or they are ensured to not have features that might cause problems such as references to outside data).
The point of this discussion is Javascript object cloning, not turning objects into a string. Offering up JSON and then responding to a criticism of it based on its methods as "the whole point is to turn it into a string" is somewhat ridiculous given many of the other options you're espousing JSON over don't use strings at all.
EDIT: Also, how can you determine if the string should be a date, or a string? Sure, if it fits, you can always convert it to a Date, but if your first object is coming from something that sends ISO Dates as strings and you used JSON.parse with a reviver, you would get a different object.
Every single time I see a similar article on Javascript, I feel so lucky for being able to use Clojurescript instead.
Seriously - the Javascript platform is great. The language itself? Not so nice. Clojurescript makes so many things simply better.
Rust has a Clone trait which provides the .clone() method, which isn’t a shallow clone or a deep clone, due to Rust’s ownership model—the terms “shallow clone” and “deep clone” actually don’t make sense in Rust.
For completeness, I must mention that when you get to types like Rc<T>, deep cloning becomes a meaningful operation, but even there it’s not exposed as a deep clone method, but rather via make_mut (https://doc.rust-lang.org/std/rc/struct.Rc.html#method.make_...) which skips the deep part of cloning if there are no other references to the inner value.
Rust’s ownership model is absolutely delightful to work with. I miss it all the time when working in Python or JavaScript, and encounter and write bugs that would have been structurally impossible in Rust from time to time—to say nothing of inefficiencies that would have been either inexpressible or unreasonable in Rust.
Most functional programming languages don't make a distinction between values and references, so this is kind of a moot point since the copying/merging is never exposed to the developer.
That article is wrong. A date can definitely be serialized in JS. It is in fact converted to an ISO string when it is transformed into JSON, but the article says it does not. The author needs to learn about JSON.stringify and the toJSON method of built-ins.
Ok but please read carefully because JSON.parse can in fact recover dates. The JSON.parse function accepts a 'reviver' function as an argument to do just that. 99% of the time it's not useful, so there is no reason for JSON.parse to do it by default, but JSON.parse definitely can parse dates back if you use all its features.
I like that it does not do it by default but I can make it parse the dates if I want.
And what do you pass as the reviver argument? Your own function to check a string to see if it's valid IS8601 and parse it? That's just using parse()'s recursive walking of the object: it does no date reviving itself, right?
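One possible reviver, as a sketch: revive any string that matches an ISO-8601 timestamp. The regex and the revive-everything policy are assumptions — real code may need per-field knowledge of the schema, for exactly the reason discussed below.

```javascript
const ISO = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{1,3})?Z$/;
const reviver = (key, value) =>
  typeof value === 'string' && ISO.test(value) ? new Date(value) : value;

const text = JSON.stringify({ when: new Date(0), note: 'plain string' });
const back = JSON.parse(text, reviver);
console.log(back.when instanceof Date); // true — revived from the ISO string
console.log(back.when.getTime());       // 0
console.log(typeof back.note);          // 'string' — left alone
```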
That way you can get a different result, as you could actually have a Date object and a date-as-a-string stored somewhere. You will need a custom stringify callback to differentiate which should be converted back to a date on parse and which should stay a string.
Still, this is not pure JSON this way; it's your own transformations.
Indeed it can, but I believe the point is that it's not round-tripable, meaning you can't parse the resulting serialized JSON back into its original form, the one that contained the Date objects.
If you could do so, JSON.stringify/parse would be a convenient way to do a deep clone.
Yeah that's right. All I'm saying is that no matter what, the following sentence is false:
> An example of a non-serializable value type is the Date object - it is printed in a non ISO-standard format and cannot be parsed back to its original value :(.
This will lead to bugs: as a human you'll inadvertently miss a merge or a clone and retain references that you don't want.
The inability of a language or runtime to correctly and quickly clone a structure is an upsetting fact of JavaScript.
This is a classic example: https://www.cvedetails.com/cve/CVE-2015-7501/
(Many more can be found under the CWE-502 "Deserialization of Untrusted Data" category)
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
Anything beyond this and you are begging for trouble, because there are always context-specific gotchas.
source: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe... https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
let d = new Date(); let s = JSON.parse(JSON.stringify(d));
s will not be a clone of d.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
The article says it does not return an ISO string. I'm saying it does. The author has agreed with me and will update the article.
Objects are hash maps that store data.
someObj.toJSON = function() { return { foo: this.foo, bar: this.bar } }
Or make your functions return new objects.
Personally, I prefer not having to mutate data, even wrapped in a produce method
No, variables are simply references to objects. Objects aren't references - they're referents.
var a = new Date();
cloneDeep({ a }).a === { a }.a;
This returns false. Use JSON.stringify if you care about the content. cloneDeep might be useful if you don't care about data integrity.
Do you want JSON.stringify to keep dates intact? The whole point is to turn it into a string.
[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
JSON.parse(JSON.stringify(obj)) !== deepClone(obj)
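A few inputs where the two sides of that inequality actually diverge:

```javascript
const obj = { keep: 1, gone: undefined, fn: () => {}, nan: NaN };
const viaJson = JSON.parse(JSON.stringify(obj));
console.log('gone' in viaJson); // false — undefined-valued keys are dropped
console.log('fn' in viaJson);   // false — functions are dropped
console.log(viaJson.nan);       // null — NaN serializes as null
console.log(viaJson.keep);      // 1 — plain values survive
```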
So one can’t infer JSON-clonability from TypeScript/JavaScript types. Learned this the hard way.
> x = 4
4
> y = new (x.constructor)(x)
[Number: 4]
> x.constructor
[Function: Number]
> y.constructor
[Function: Number]
> typeof x
'number'
> typeof y
'object'
> x
4
> y
[Number: 4]
Is y a clone of x?
- Types with shared semantics (e.g. Rc<T>, Arc<T>), or holding onto these internally
- Types holding borrowed references (these will still reference the same data).
C++'s copy constructors are too, but there are more caveats, although they're similar in principle to Rust's caveats.
:)
You can make JSON parse arbitrary sublanguages if you're willing to put in the work.
- Roundtrip-able (reversible?)
- Pure
- Effectful
- Mutation-doing
- Idempotent/Nullipotent
- Algebraically closed over the set of its inputs
- Etc.
Every time I hear about a new one I wish I had such a list
You mean injective
Involution, if you mean calling a function twice will give the original input.