@antirez: You continue to be an inspiration. Your words are thoughtful, your thinking clear and well-documented, and despite rare lapses your English skills are exceptional.
I use redis. I don't need ACLs, but I certainly don't disagree with your assessment, nor begrudge the time, care and attention to detail which is obvious in your design-work.
Thanks so much for the kind words! I'll try hard to find the balance in the future between new, old and stable... Nobody wants a bloated Redis I bet, since I believe that many members of the Redis community have a similar aesthetic mindset about software.
Apart from Redis, I just wanted to say thank you for starting dump1090. The community that has grown around it is massive, and it has given me a new hobby and interest. It is a great demonstration of the power of open source software and FLOSS culture as a whole.
Not to mention there are now businesses built on the data provided by this free software, being run by volunteers (like me).
>"I really want Redis to be a set of orthogonal data structures that the user can put together, and not a set of tools that are ready to use."
This is why I fell in love with Redis right from the beginning. Having a data store that gets out of the way and just gives you the structures (a lot like programming data structures) to do what you need feels awesome. Picking up the commands and putting them to use feels as natural as reaching for any vector/list/whatever structures I use anyway in my programs.
Thanks so much for everything you're doing antirez.
I've said it before and I'll say it again, thanks for Redis antirez. It is an amazing piece of software in the utility it provides and I have used it in my work for so long and just never ever had it fail on me. It does what is written on the box every single time.
Thank you for the thoughtful design, for saying no to so many things and keeping Redis pure and useful and for your awesome stewardship. You are an inspiration in all those things!
Not a security expert, but in time_independent_strcmp(), first comment about strlen()s: couldn't the attacker use his own accounts with known passwords to determine the length of some other user's password? Also, given the name of this function I would expect the comparison to be time independent, even if attacker can change both strings' lengths... Or am I missing something? Haven't touched C in a loooong time... :)
Hello, the attacker here can control only one string: we want the time taken by the function to be independent from the POV of the string provided by the user. That is, we don't want the user-controlled string to affect the time the function takes. The other string is the string set inside the database. As for information leaks about the length, that would be completely acceptable: the user is very likely to hit the password length anyway just by picking a random number, so there is no real protection there. The problem with this kind of timing attack is that it can leak the actual user string content. Such a function should hopefully prevent this problem.
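To make the property concrete, here is a minimal Python sketch in the same spirit (it is not the Redis code, which is C). Both inputs are padded into fixed-size buffers, every byte pair is XORed, and the length difference is folded in at the end, so the loop does the same amount of work whatever bytes the user supplies. `MAX_LEN` and `fixed_work_equal` are hypothetical names chosen for the illustration, and the Python interpreter itself gives no hard timing guarantees.

```python
MAX_LEN = 512  # hypothetical upper bound on password length (an assumption)


def fixed_work_equal(stored: bytes, supplied: bytes) -> bool:
    """Compare two byte strings while always scanning MAX_LEN bytes."""
    if len(stored) > MAX_LEN or len(supplied) > MAX_LEN:
        return False
    # Pad both inputs to the same fixed size so the loop length is constant.
    buf_a = stored.ljust(MAX_LEN, b"\0")
    buf_b = supplied.ljust(MAX_LEN, b"\0")
    diff = 0
    for x, y in zip(buf_a, buf_b):
        diff |= x ^ y  # accumulate differences without an early exit
    # Padding could hide a length mismatch, so fold the lengths in too.
    diff |= len(stored) ^ len(supplied)
    return diff == 0
```

In real Python code the standard library's `hmac.compare_digest` is the tool to reach for; the loop above only spells out the idea.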
The comments in the code make it seem like absolutely no information about the secret is leaked:
> /* Again the time of the following two copies is proportional to
> * len(a) + len(b) so no info is leaked. */
> memcpy(bufa,a,alen);
> memcpy(bufb,b,blen);
If the attacker controls one of the inputs, the execution time reveals something about the length of the other input, right?
Or maybe you just meant that the length is leaked but the contents are not leaked? (I agree that it's generally considered ok for "timing-safe equals" functions to leak the length of the secret. But if you ARE allowed to leak the length, you can simplify the code by just checking the length in the beginning and exiting if they're not equal.)
And if you don't want to leak the length, it's easy: pre-SHA-512 the secret and then only compare hashes instead of comparing the full strings.
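A rough sketch of the hash-then-compare idea, in Python for brevity: only a SHA-512 digest of the secret is kept, the supplied string is hashed too, and the two fixed-length digests are compared with `hmac.compare_digest`. The names `make_digest`, `matches_secret` and `stored_digest` are made up for the example, and a real password store would normally add a salt and a slow KDF, which this sketch leaves out.

```python
import hashlib
import hmac


def make_digest(secret: bytes) -> bytes:
    """Hash the secret once; only this 64-byte digest needs to be stored."""
    return hashlib.sha512(secret).digest()


def matches_secret(stored_digest: bytes, supplied: bytes) -> bool:
    """Hash the supplied value and compare the two fixed-length digests."""
    supplied_digest = make_digest(supplied)
    # Both digests are always 64 bytes, so the length of the secret is hidden.
    return hmac.compare_digest(stored_digest, supplied_digest)


stored_digest = make_digest(b"s3cret-password")
print(matches_secret(stored_digest, b"s3cret-password"))  # True
print(matches_secret(stored_digest, b"x"))                # False
```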
Thank you for taking the time to explain. Also the fact that a total stranger to this code (like me) can even reason about it, is incredible.
My point was not so much that there is a bug, but more that the assumptions that this function makes are not obvious (to me at least) from its signature or the header comment. I could easily imagine someone using this function assuming that it does, you know, time insensitive string comparison, but this is only true in a very limited context. Even renaming vars to `a_unchangeable` and `b_user_controlled` (or a comment) would help. But again, this is just a very minor point from a passing stranger, so feel free to ignore it. :)
There seems to be a chronic need in the Redis community to avoid confrontation (or even be bucketed) in the same space as Kafka.
Redis Streams is probably one of the most path-breaking things that Redis is doing - and can completely change what it stands for. That's not a bad thing..but it doesn't even get a title by itself but is rather called "data structures".
Here's a question - do you anticipate competing with Kafka? There are a lot of us that are cheering Redis on for this...but it seems there is internal reluctance.
It's like the long standing "Redis is not memcached" statement - even WordPress has a "choose Redis or memcached for object caching" option.
> It's like the long standing "Redis is not memcached" statement - even WordPress has a "choose Redis or memcached for object caching" option.
As was said in the post, Redis isn't memcached, but "a bunch of Redis nodes on the same machine together with a Redis Cluster proxy" is kind of equivalent to memcached.
Redis and memcached are on different architectural levels—one Redis instance represents one shard of a sharded data-store with an IO-concurrency of r=1 w=1, while one memcached instance represents a complete data-store with an IO-concurrency of r=N w=1. To build a caching "virtual appliance" for deploying on some big hardware, a memcached appliance would just be a memcached instance; while a Redis appliance would contain N Redis nodes + 1 Redis Cluster proxy.
It's a bit like comparing, say, UDP with TCP. There are features you can layer on top of UDP to make it act equivalently to TCP (as e.g. QUIC does), but you can also just use UDP by itself. UDP is not designed to be TCP or "compete" with TCP; but UDP happens to be able to be used—when you add extra stuff on top—for the same things TCP can be used for.
Redis, like UDP, is a flexible, low-level "infrastructure component" that can be customized into tools to solve any number of use-cases—but, like UDP, Redis sets out to address only a very limited set of use-cases when used as a standalone "tool" without such customization. I personally think of Redis a lot like an OS distribution—you could make any number of virtual appliances by customizing e.g. Debian with your own extra packages and building the resulting appliance-nodes into your desired architecture; and you can make any number of data-store servers (solving any number of problems) by customizing Redis with your own commands, and then connecting the resulting nodes into your desired architecture.
Redis itself isn't going to compete directly with Kafka. Kafka, like memcached, is on a higher architectural layer than Redis. But there's nothing stopping some downstream developer from building something on that layer, e.g. a "RedisMQ" server customized into a Kafka-killer.
>Redis itself isn't going to compete directly with Kafka. But there's nothing stopping some downstream developer from building something
So this is a philosophical rather than a technical thing. For a hobbyist open source project, I would agree with you there - there are only so many resources that someone can carve out of his day job.
For a company that raised a 60mil venture funding and specifically created a license to prevent IP leakage, this is disappointing.
I want to pay Redis and I don't mind their licensing. But I'm then questioning the philosophy if everything meaningful has to be done by unpaid open-source developers/startups, who might be on tenuous footing to do anything meaningful with it because of the licensing.
Redis is built by antirez as a "hobbyist open source project." It is a small core, and will stay that way. It is an infrastructure component. It is fully open-source.
Redis Labs is the company that raised 60mil venture funding, and Redis Labs' business model is exactly to create such "distributions" of Redis for their customers. These distributions are not open-source; they are the thing that has the weird license.
Redis Labs does not own or control Redis. antirez owns and controls Redis.
You could create a similar company of your own which is also selling Redis distributions (as a product, as a service, etc.) and make money off of doing so. Since Redis itself—the thing antirez develops—is open-source, nothing stops you from doing this, just like nothing stops Redis Labs from doing this. No "unpaid open-source developers" are needed.
antirez just happens to work for Redis Labs—but not in the sense that they can really tell him what to do. They just want to pay him to ensure he keeps making it, because their business depends on the continued health of the open-source project at its core. (It's about the same situation as Guido van Rossum working for Dropbox, or Yukihiro Matsumoto working for Heroku.)
If you want to talk about the well-being of some project, downstream of Redis, to build a full-featured multipurpose DBMS "tool" with Redis at its core, that is fully open-source... well, that project doesn't exist. Redis Labs is not(!) that project; Redis Labs is a services company. But there's nothing stopping such a project from existing, and there's nothing stopping a foundation springing up around it to take care of it and its developers (like e.g. the Linux Foundation, or the PostgreSQL EU+US foundations.)
But that software project, if it existed, would no more be "the Redis project" than https://openresty.org/en/ is "the Nginx project." It'd just be a customized distribution, by separate people.
It's true, I don't like to compare Redis to other products. I just want users to understand when Redis is a good candidate for their project and evaluate it. Comparing what I did with what others did is too much of a taboo for me. I like to think instead that all this software tries to do things without trying to win over the others, and that it is up to the developers out there to vote with their usage.
Sure thing. I respect that. But what is happening is that this is causing a wide reluctance to even play in the same space.
We have been super excited about Redis Streams for quite some time now. But we are not seeing any implementation use cases, tools...or even your awesome blog posts around streaming use cases.
It would be great to see some more literature on the streaming stuff and things that you can (and are) doing with it...and giving it a top level importance (beyond just a new data structure).
A key difference I observed was that if a Kafka consumer crashes, a rebalance is triggered by kafka after which the remaining consumers seamlessly start consuming the messages from the last committed offset of the failed consumer.
Whereas with Redis streams I had to write code in my application to periodically poll and claim unacked messages pending for more than some threshold time.
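The polling step described above might look roughly like the Python sketch below. It uses the extended form of `XPENDING` to list pending entries and `XCLAIM` to take over the ones that have been idle too long. `execute` is a hypothetical callable that sends one command and returns the raw reply as nested lists (it stands in for whatever client you use), and `claim_stale` is an invented name, so treat this as an outline rather than a drop-in implementation.

```python
from typing import Any, Callable, List


def claim_stale(execute: Callable[..., Any], stream: str, group: str,
                consumer: str, min_idle_ms: int, count: int = 10) -> List[Any]:
    """Claim entries that another consumer left unacknowledged for too long."""
    # Extended XPENDING reply: one [id, consumer, idle_ms, deliveries] per entry.
    pending = execute("XPENDING", stream, group, "-", "+", count)
    stale_ids = [entry[0] for entry in pending if entry[2] >= min_idle_ms]
    if not stale_ids:
        return []
    # XCLAIM re-checks the idle time server side, so a message that was
    # acknowledged or claimed in the meantime is simply not returned.
    return execute("XCLAIM", stream, group, consumer, min_idle_ms, *stale_ids)
```

A worker would call this on a timer, process whatever comes back, and `XACK` each entry when done. Later Redis releases (6.2 and up) added `XAUTOCLAIM`, which folds the two steps into one command.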
> I really believe that in the future of Redis “client side caching” will be a big thing. It’s the logical step in every scalable system.
As soon as this happens I will have no use for Redis. I already have better scalable datastores. I often use Redis as a cache for those. If Redis isn't a low-latency, consistent key-value/key-structure store anymore then in my mind it isn't Redis anymore.
Interesting that coleifer wrote the comment. I really respect their open source code. It’s no BS in their task queue program https://github.com/coleifer/huey
> "I really believe that in the future of Redis “client side caching” will be a big thing. It’s the logical step in every scalable system."
How is this a logical step in the context of microservices? I feel there is a trade-off. If I have to scale a service to N instances where each of them cache the same type of data, I'm potentially duplicating that data N times, which means I need to reserve that much additional RAM per instance. I can avoid this by storing the data centrally in a very fast cache, aka Redis. On the other hand, even the fastest standalone cache will be much slower than accessing data in process memory.
In my experience microservice architectures end up very network traffic heavy. By virtue of having N instances, multiple instances can and will request the same data over and over. User A requests page 1 of results for "hello world" and goes to instance X, user B requests page 1 of results for "hello world" and goes to instance Y, etc.
Given that the stability and availability of your central data store in your model is likely more important than the N instances talking to it, caching data on those instances is definitely beneficial, rather than asking it for the same data over and over.
You need not reserve that much additional RAM; modern infrastructures end up with unused RAM anyway, by virtue of being optimised for the CPU usage of stateless microservices. Why not put it to use? A smart cache eviction strategy can help optimise what gets held in RAM and what doesn't. A Redis client working in tandem with the Redis server can help with that and save you from writing a bunch of code to do this.
> If I have to scale a service to N instances where each of them cache the same type of data.
Actually, they don't. If you're scaling a multi-user system to millions of users across hundreds/thousands of instances of your micro-service, they're likely to cache different data because different user sessions will be handled by each instance of your app. So, local caching in each micro-service is valuable. You just need to set an upper limit on how much memory to consume.
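As a small illustration of "set an upper limit", here is a Python sketch of a per-instance cache that evicts the least recently used entry once it holds more than a fixed number of items. `BoundedCache` and its `load` callback are hypothetical names (the callback stands in for the round trip to Redis), and the cap is on entry count, which is only a proxy for memory.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class BoundedCache:
    """In-process cache with LRU eviction and a hard cap on entry count."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def __len__(self) -> int:
        return len(self._data)

    def get(self, key: Hashable, load: Callable[[Hashable], Any]) -> Any:
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        value = load(key)  # cache miss: fetch from the central store
        self._data[key] = value
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # drop the least recently used
        return value

    def invalidate(self, key: Hashable) -> None:
        """Forget a key, e.g. when told that the value changed centrally."""
        self._data.pop(key, None)
```

The hard part in practice is knowing when to call `invalidate`, which is where cooperation between client and server would come in.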
Depends on what your application demands. I've worked on applications where response times of more than 3ms could cost the company thousands of dollars. And I've worked on applications where nobody would even notice 30ms...
I may add that 99% of the very big Redis users out there do some form of client side caching... at scale it has basically been common for many years now.
Yes, in-memory is the fastest, but I think it's useful when comparing SQL queries (slow) to querying Redis (faster) for cache data, especially for distributed systems.
I have no valuable addition to the discussion, hope it's okay, but I just want to say thank you antirez for your impressive work. You're leading the way for the rest of us to follow.
There is still a lot to define, but it will be multi-threaded, written in C, in a separate repository, but part of the Redis project conceptually. Please ask for any info you want (but not everything is defined yet...). In theory such a project is not needed, because at this point we should have all the clients with a great cluster implementation. Reality is a bit different :-) We need to do something about that and allow developers to talk with the proxy as if it were a single instance (but with the same multi-key operation limitations as Redis Cluster itself, and without SELECT for multiple DBs), without implementing any cluster protocol.
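For context on the multi-key limitation mentioned above: Redis Cluster assigns each key to one of 16384 hash slots using CRC16 of the key, and a multi-key command only works when all its keys land in the same slot. A minimal Python sketch of that mapping follows, including the `{...}` hash tag rule from the cluster specification. `key_slot` is a name invented here, and the sketch relies on `binascii.crc_hqx` with an initial value of 0 being the same CRC16 variant the specification describes (its check value for "123456789" is 0x31C3).

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key: bytes) -> int:
    """Map a key to a cluster hash slot, honouring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


print(key_slot(b"foo"))  # 12182
print(key_slot(b"{user1000}.following") == key_slot(b"{user1000}.followers"))  # True
```

Keys that share a hash tag end up in the same slot, which is how applications keep related keys usable together behind a cluster or a proxy.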
First of all, these two statements seem to be at odds:
> I/O threading is not going to happen in Redis AFAIK, because after much consideration I think it’s a lot of complexity without a good reason
vs.
> the way I want to scale Redis is by improving the support for multiple Redis instances to be executed in the same host, especially via Redis Cluster
Having to run multiple Redis instances, with a separate "Redis Cluster [proxy]", sounds like "I didn't originally use threads, and while threads are technically the correct solution (because it's the standard way to do things and not all that complicated), I don't feel like rewriting Redis to use threads the way it should have been done from the start; thus I've come up with an alternative hack that is not ideal but makes my life easier". While I have always argued that memcached is not automatically better solely because it is multi-threaded, the idea that an additional point of failure – a technically unnecessary proxy/broker – is somehow a better solution than threads seems like an attempt to save face for mistakes made during early development.
As for the ACL:
> You get a library from the internet, and it looks to work well. Now why on the earth such library, that you don’t know line by line, should be able to call “FLUSHALL” and flush away your database instantly
What can I say other than "give me a break". Every large codebase has hundreds/thousands of dependencies, and any one of them could slip in a malicious block of code. My analysis: anyone who would be concerned about some library calling FLUSHALL would never even install Redis - a C program that, for all we know, is spamming emails or running a Bitcoin miner while it runs. The fact is, all 3rd party software/dependencies come with risk; if I'm trusting your software not to screw me while I sleep, I think I can trust a basic library not to hijack my Redis server with a FLUSHALL command. Really, this is not even remotely a valid argument. Whatsoever.
Then there's this:
> maybe you just hired a junior developer that is keeping calling “KEYS” on the Redis instance, while your company Redis policy is “No KEYS command”.
Ok, now you're just making shit up. Here's an idea: you don't need ACL to restrict access to KEYS, as that command should have been deprecated and entirely removed in favour of SCAN long ago. You mentioned "I think it’s a lot of complexity without a good reason" regarding multi-threaded Redis; yet here you are advocating an entire, complex ACL layer... for what purpose? For some imaginary malicious scenario where my Redis library is going to call FLUSHALL, or a "junior developer" is allowed to call "KEYS *" – a command that should not even be available in 2019?
I started this comment intending to provide unfiltered – but polite/constructive – feedback. The more I reread your original post, the more frustrated I got at the fact you seem to have lost all common sense. You're trying to justify adding a massive amount of complexity, including a clustering/proxy strategy instead of simple threading, and ACL to save us from imaginary threats. I've previously been in the stage of development that I believe you might be in now – you've run out of truly "good ideas" that would actually benefit your user base, and are now attempting to justify new features as necessary, when the truth is you're just looking to write new code that you personally find to be "new and exciting", rather than iterating on the existing, boring codebase.
I've spent the past 4-5 years advocating that memcached is effectively dead because of Redis. The more you try to make Redis some kind of ACL-secured, clustered database – rather than the lightweight cache store we learned to love – the more likely another project will step in and replace Redis the same way Redis replaced memcached.
This is completely tangential but it might help some people, especially those where English isn't their first language:
> she/he (not sure)
English is really strange around gender, and idiomatic styles have changed over time. It has always been a valid style in English to use "they" to refer to a single person of unknown gender [1]. This feels natural to English speakers when the subject is unknown and could be one of many potential people, as in:
When the first guest arrives at the party, give them a balloon.
Here, "them" only refers to a single person, but it's correct.
For the past hundred years or so, it was also common to use "he" to refer to an unknown hypothetical person. You see a lot of textbooks that do this. In theory, this wasn't supposed to come across as being exclusionary, but obviously it is. It's an explicitly gendered pronoun.
The feminist movement rightly drew attention to this problem, and writers experimented with a variety of approaches. Sometimes, at the beginning of a book you'll see an explicit "apology" saying something like "we use 'he' but don't mean it to only apply to males". Some simply switched to using "she" for everything. Others switch between "he" and "she" throughout, randomly or use the longer "he or she".
But, lately, it seems like the style is settling down towards simply using the already established singular they for all of these cases. To a native speaker, it feels a little weird at first, but you quickly get used to it. It's less jarring than "he" or "she" for most readers now.
It has a lot going for it:
* It has more established history than any other form.
* It doesn't force the author to pick an arbitrary meaningless gender.
* It's shorter than the awkward "he or she" (which also still forces you to decide which gender to put first).
* It doesn't exclude people who prefer neither "he" nor "she" as their pronoun.
So, if you're trying to figure out what pronoun to use when you don't know a specific one that is correct, default to "they".
I find that pretty obnoxious to read (or listen to). It's much more natural, in my experience, for text (or speech) to use pronouns generally and explicit names sparingly.
the singular they as a compromise works ok but it leads to a larger question about whether there should be separate pronouns for a single person vs. multiple people.
while gender/sex is often beside the point, number often matters. we can keep the gendered pronouns for the few cases where it's needed, and use ungendered pronouns in the general case, but that requires an unambiguous singular ungendered pronoun.
> the singular they as a compromise works ok but it leads to a larger question about whether there should be separate pronouns for a single person vs. multiple people.
English already dropped it with 2nd person, settling on the plural (and formal) “you” as the sole pronoun and dropping the informal singular “thee”; and for 2nd person distinguishing number in the pronoun is probably more often an impediment to discussion than 3rd person (a 3rd person pronoun needs an explicit referent in the framing context anyway, so you don't lose much by not adding a reminder of number into the pronoun itself.)
Yes, in an ideal case, it would be nice if there were separate pronouns for "single unspecified gender" and "multiple". Unfortunately, we don't get to reboot spoken languages.
Even if we could, it's still not clear what the ideal solution would be. There are cases where you need to communicate ambiguous number too. It might be nice to have pronouns that distinguish "definitely more than one" from "potentially more than one".
the peculiar bit is how permeated language (e.g., english and other western languages) is with (women's) sexual availability--not just with gendered pronouns, but whole words, endings, titles (miss vs mrs. vs ms.), etc. i'm sure there are theories for why, but that emphasis is so heavy. many eastern languages, for example, emphasize seniority/hierarchy more.
It is correct. Usage makes it correct, and it has been used this way for centuries. It's your incorrect labelling which results in the unnecessary spread of gendered pronouns.
>No, no it is not. I assure you, it is not correct.
Language evolves, and this means that the rules of grammar do too. The MLA already allows for pronoun usage to match that of the preferred pronoun of whomever you are writing about, and is considering allowing for the singular they in general. Some style guidelines already allow it. Others still proscribe it.
Plenty of news organizations, including the Washington Post, have set their style guidelines to use the singular they.
It's grammatically correct in plenty of places today, and the current trend is for that to increase, not decrease. You're fighting a losing battle here.
Yes, and some opinions happen to be plain wrong, and others are similarly entitled to say so. The entitlement to state an opinion does not guarantee its correctness.
Thought I just had: using just "he" is obviously exclusionary because it makes males the default. Likewise, people don't just use "she" either. But given that there are people whose preferred pronouns are "they/them", wouldn't that make "they" equally exclusionary?
No, because singular they is a well-established idiom that conveys "unknown gender", it subsumes people who actually prefer "they" as well as people who prefer "he" and "she".
English needs a construction for referring to an individual that you don't know. They/them is fine. Someone somewhere is going to decide to get up in arms about that, and there's just nothing to be done about it since it will be a demand to know them before you know them (logically impossible), so don't worry about it.
The they/them pronouns are used in place of he/him/she/her because they are ungendered. I fail to see how the use of ungendered pronouns is exclusionary.
Off-topic: this post, like other posts on the author's blog, is written in Markdown. But rendered inside <pre> tags. Seems a little odd considering the site also uses regular Serif fonts for the title / meta.
Standard Notes, combined with their "advanced Markdown editor" extension, has a middle ground I think works really well: they render the Markdown formatting characters, as well as applying the formatting itself.
I agree, the current text is not very readable, but imagine it served raw as txt... Btw sooner or later I'll indeed replace it with something else. I need to find the time, also to convert the old posts.
This >> "An important thing to realize is that Redis has not a solid roadmap, over the years I found that opportunistic development is a huge win over having a roadmap. Something is demanded? I see the need? I’m in the mood to code it? It’s the right moment because there are no other huge priorities? There are a set of users that are helping the design process, giving hints, ideas, testing stuff? It’s the right moment, let’s do it."
I’ve handled this “chaos” successfully for years. It starts to break down as the company grows and matures. I used to be able to code something up and explain it to 3 people so they could document it, support it, use it etc. Now with ~30 it’s a lot more difficult and I upset folks when I deviate from the roadmap.
I used to like Redis, because of its simplicity. But these days it is growing so complex, with so many distributed algorithms being implemented, that I'd much rather use FoundationDB, which at least has the benefit of being extensively tested (in fact it was written test-first) and provides a full distributed database with transactions.
I think there would be value in a "simple Redis": a well-maintained tool for a certain class of problems which do not require a distributed database.
Redis is still very simple and most features are orthogonal and completely isolated. Don't trust my words, check yourself, if you count the lines of code with "cloc" in Redis unstable, it is composed of 64890 lines of code. In any other system this amount of code is used, maybe, for the query parser. A single developer can still read the Redis source code in a few days and understand how it works and modify any part alone.
EDIT: I also want to stress that, at the same time, the newer releases include a simplification effort. For instance, the Lua scripting side-effects problems are going to be removed completely by simplifying how the replication of Lua scripts is performed. Redis 5 is already like that, but the dead code was not yet removed, for safety. Redis 6 will completely remove many parts of the code. So the stress is not just on adding, but also on refactoring. ACLs themselves made it possible to refactor authentication into separate functions inside acl.c, lowering the overall complexity.
To be honest, the top level comment in this chain reads more like an attempt at a plug for FoundationDB than a legitimate criticism of Redis. You could s/Redis/<literally any other DB>/g
I can second that. It's been a few years since I actually did this, but within about two working days' worth of reading, I had consumed and understood a good chunk of the code and overall control flow. I was able to make some reasonably invasive changes within a few hours each after that (initially adding "msg/trigger on expire" and check-and-set operations on hashes, then a bit later a change to memory management that was a few percent of global speedup, and a partial COW implementation for strings). Admittedly, I have a fair amount of experience working on interpreters in C, which is pretty close to what plain Redis does, just simpler and hidden behind http.
A quick glance suggests that this would still be similarly possible today. Redis is by no means perfect, but it deserves its reputation for being accessible!
PS: Thanks for the mention of the Lua side effects changes. I'll be curious to read up about your solutions to those puzzles.
Impressively small or large? It's tiny compared to most storage products/databases.
I would say that's mostly because, when it gets down to brass tacks, Redis (ignoring the orchestration in Cluster) really is a fairly simple interpreter with simple memory management (largely delegated to the malloc implementation) that glues together a fair number of nice data structure/algorithm implementations that have little codependence. That last point is really, really important for its developer accessibility. (antirez called this modularity elsewhere I think.) It also helps keep code size down.
Has redis broken any of the "simple" features? I don't use it that intensively, but always got the impression that most new things are optional and easily ignored.
This is actually what I love about redis. They added clustering, obviously a massive feature, in 3-4 and you could totally ignore it if you wanted to, while still receiving the stability and speed increases without changing much of anything.
Exactly, the same with ACLs and RESP3. With ACLs a lot of design work was done in order to make sure that it is impossible to notice any difference if you don't know about ACLs. The old "requirepass" now sets the password for the default user, and the default user ships with exactly the defaults of what connections could do before. Backward compatibility and the ability to ignore any part of Redis you don't use are among my main goals.
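For readers who have not seen the ACL syntax, here is a small Python sketch of how the FLUSHALL/KEYS restriction discussed in this thread could be expressed. It only builds the argument list for an `ACL SETUSER` command, assuming the rule syntax of the Redis 6 ACL documentation (`on`, `>password`, `~pattern`, `+@all`, `-command`). The helper name, the user name and the password are invented for the example.

```python
from typing import Iterable, List


def acl_setuser_args(username: str, password: str,
                     denied: Iterable[str] = ("flushall", "keys")) -> List[str]:
    """Build an ACL SETUSER command: everything allowed except `denied`."""
    rules = [
        "on",             # enable the user
        ">" + password,   # add this password
        "~*",             # allow every key pattern
        "+@all",          # start from all commands...
    ]
    rules += ["-" + name.lower() for name in denied]  # ...then remove a few
    return ["ACL", "SETUSER", username] + rules


print(" ".join(acl_setuser_args("app", "example-password")))
```

Connections that never authenticate as such a user keep using the default user, so nothing changes for them.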
If new features are implemented as 'composable' algorithms, meaning that they:
a) can be used, optionally
b) can be composed with each other into new data processing (eg filtering/aggregation) models
c) do not reduce, significantly complicate, or create many 'exception rules' to an otherwise coherent model
Then those are good additions, in my view.
Also, what I personally find useful about Redis is its protocol. It is a very useful and powerful paradigm when different systems by different authors/companies implement the same protocol. It reduces cognitive overload on users (programmers) and helps them keep building their domain expertise, rather than constantly re-learning the APIs of tools or libraries.
I think it is sad that downvoting is used to stifle non-popular opinions. It makes people with potentially interesting (or at least non-mainstream) views less likely to participate. I do not mind people disagreeing, but the downvote is used to "bury" comments, and as a result some people will never see them.
Think about it: if you downvote anything you disagree with, eventually all you will see is opinions that you agree with. I don't know about you, but that is not what I expect from HN.
I bet you'd receive a hell of a lot of postcards of thanks from around the world. Things like that feel more "real" than emails and words on forums.
@antirez do you have dogs? I’ve got some branded leashes with your name on em!
You are a pillar to a lot of us. (And I’m not a Redis user)
This is why I fell in love with Redis right from the beginning. Having a data store that gets out of the way and just gives you the structures (a lot like programming data structures) to do what you need feels awesome. Picking up the commands and putting them to use feels as natural as reaching for any vector/list/whatever structures I use anyway in my programs.
Thanks so much for everything you're doing antirez.
Thank you for the thoughtful design, for saying no to so many things and keeping Redis pure and useful and for your awesome stewardship. You are an inspiration in all those things!
Not a security expert, but in time_independent_strcmp(), first comment about strlen()s: couldn't the attacker use his own accounts with known passwords to determine the length of some other user's password? Also, given the name of this function I would expect the comparison to be time independent, even if attacker can change both strings' lengths... Or am I missing something? Haven't touched C in a loooong time... :)
> /* Again the time of the following two copies is proportional to
>  * len(a) + len(b) so no info is leaked. */
> memcpy(bufa,a,alen);
> memcpy(bufb,b,blen);
If the attacker controls one of the inputs, the execution time reveals something about the length of the other input, right?
Or maybe you just meant that the length is leaked but the contents are not leaked? (I agree that it's generally considered ok for "timing-safe equals" functions to leak the length of the secret. But if you ARE allowed to leak the length, you can simplify the code by just checking the length in the beginning and exiting if they're not equal.)
And if you don't want to leak the length, it's easy: pre-SHA-512 the secret and then only compare hashes instead of comparing the full strings.
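A minimal sketch of that pre-hashing idea, in Python rather than the C of the original function (so this is an illustration, not Redis code): both sides are reduced to fixed-length SHA-512 digests, and the digests are compared with `hmac.compare_digest`, which is designed to take the same time regardless of where the inputs differ. The function names are hypothetical.

```python
import hashlib
import hmac


def hash_secret(secret: bytes) -> bytes:
    """Hash the secret once, ahead of time; only this digest is stored."""
    return hashlib.sha512(secret).digest()


def check_password(stored_digest: bytes, candidate: bytes) -> bool:
    """Compare fixed-length digests, so the secret's length never matters."""
    candidate_digest = hashlib.sha512(candidate).digest()
    return hmac.compare_digest(stored_digest, candidate_digest)


stored = hash_secret(b"correct horse")
print(check_password(stored, b"correct horse"))  # True
print(check_password(stored, b"x"))              # False
```

Hashing the candidate still takes time proportional to the candidate's length, but that length is chosen by the attacker anyway, so it reveals nothing about the secret.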
My point was not so much that there is a bug, but more that the assumptions that this function makes are not obvious (to me at least) from its signature or the header comment. I could easily imagine someone using this function assuming that it does, you know, time insensitive string comparison, but this is only true in a very limited context. Even renaming vars to `a_unchangeable` and `b_user_controlled` (or a comment) would help. But again, this is just a very minor point from a passing stranger, so feel free to ignore it. :)
And thanks for providing Redis!
Redis Streams is probably one of the most path-breaking things that Redis is doing - and can completely change what it stands for. That's not a bad thing..but it doesn't even get a title by itself but is rather called "data structures".
Here's a question - do you anticipate competing with Kafka? There are a lot of us that are cheering Redis on for this...but it seems there is internal reluctance.
It's like the long standing "Redis is not memcached" statement - even WordPress has a "choose Redis or memcached for object caching" option.
As was said in the post, Redis isn't memcached, but "a bunch of Redis nodes on the same machine together with a Redis Cluster proxy" is kind of equivalent to memcached.
Redis and memcached are on different architectural levels—one Redis instance represents one shard of a sharded data-store with an IO-concurrency of r=1 w=1, while one memcached instance represents a complete data-store with an IO-concurrency of r=N w=1. To build a caching "virtual appliance" for deploying on some big hardware, a memcached appliance would just be a memcached instance; while a Redis appliance would contain N Redis nodes + 1 Redis Cluster proxy.
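To illustrate the "N nodes plus a proxy" picture, here is a rough sketch of how a proxy could decide which node owns a key. Redis Cluster maps each key to one of 16384 hash slots using CRC16 (the XMODEM variant, which Python exposes as `binascii.crc_hqx` with an initial value of 0). The even split of slots across nodes and the node names are assumptions for the example; a real cluster assigns slots by configuration, and hash tags (`{...}`) are ignored here.

```python
import binascii
from typing import Sequence

HASH_SLOTS = 16384  # number of hash slots in Redis Cluster


def key_slot(key: bytes) -> int:
    """CRC16/XMODEM of the key, modulo the slot count (hash tags ignored)."""
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


def node_for_key(key: bytes, nodes: Sequence[str]) -> str:
    """Pick a node assuming slots are split evenly, in order (assumption)."""
    slots_per_node = -(-HASH_SLOTS // len(nodes))  # ceiling division
    return nodes[key_slot(key) // slots_per_node]


nodes = ["node-a:7000", "node-b:7001", "node-c:7002", "node-d:7003"]
for key in (b"user:1", b"user:2", b"session:abc"):
    print(key, key_slot(key), node_for_key(key, nodes))
```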
It's a bit like comparing, say, UDP with TCP. There are features you can layer on top of UDP to make it act equivalently to TCP (as e.g. QUIC does), but you can also just use UDP by itself. UDP is not designed to be TCP or "compete" with TCP; but UDP happens to be able to be used—when you add extra stuff on top—for the same things TCP can be used for.
Redis, like UDP, is a flexible, low-level "infrastructure component" that can be customized into tools to solve any number of use-cases—but, like UDP, Redis sets out to address only a very limited set of use-cases when used as a standalone "tool" without such customization. I personally think of Redis a lot like an OS distribution—you could make any number of virtual appliances by customizing e.g. Debian with your own extra packages and building the resulting appliance-nodes into your desired architecture; and you can make any number of data-store servers (solving any number of problems) by customizing Redis with your own commands, and then connecting the resulting nodes into your desired architecture.
Redis itself isn't going to compete directly with Kafka. Kafka, like memcached, is on a higher architectural layer than Redis. But there's nothing stopping some downstream developer from building something on that layer, e.g. a "RedisMQ" server customized into a Kafka-killer.
So this is a philosophical rather than a technical thing. For a hobbyist open source project, I would agree with you there - there are only so many resources that someone can carve out of their day job.
For a company that raised 60mil in venture funding and specifically created a license to prevent IP leakage, this is disappointing.
I want to pay Redis and I don't mind their licensing. But I'm then questioning the philosophy if everything meaningful has to be done by unpaid open-source developers/startups, who might be on tenuous footing to do anything meaningful with it because of the licensing.
Redis is built by antirez as a "hobbyist open source project." It is a small core, and will stay that way. It is an infrastructure component. It is fully open-source.
Redis Labs is the company that raised 60mil venture funding, and Redis Labs' business model is exactly to create such "distributions" of Redis for their customers. These distributions are not open-source; they are the thing that has the weird license.
Redis Labs does not own or control Redis. antirez owns and controls Redis.
You could create a similar company of your own which is also selling Redis distributions (as a product, as a service, etc.) and make money off of doing so. Since Redis itself—the thing antirez develops—is open-source, nothing stops you from doing this, just like nothing stops Redis Labs from doing this. No "unpaid open-source developers" are needed.
antirez just happens to work for Redis Labs—but not in the sense that they can really tell him what to do. They just want to pay him to ensure he keeps making it, because their business depends on the continued health of the open-source project at its core. (It's about the same situation as Guido van Rossum working for Dropbox, or Yukihiro Matsumoto working for Heroku.)
If you want to talk about the well-being of some project, downstream of Redis, to build a full-featured multipurpose DBMS "tool" with Redis at its core, that is fully open-source... well, that project doesn't exist. Redis Labs is not(!) that project; Redis Labs is a services company. But there's nothing stopping such a project from existing, and there's nothing stopping a foundation from springing up around it to take care of it and its developers (like e.g. the Linux Foundation, or the PostgreSQL EU+US foundations.)
But that software project, if it existed, would no more be "the Redis project" than https://openresty.org/en/ is "the Nginx project." It'd just be a customized distribution, by separate people.
We have been super excited about Redis Streams for quite some time now. But we are not seeing any implementation usecases, tools...or even your awesome blog posts around streaming usecases.
It would be great to see some more literature on the streaming stuff and the things that you can do (and are doing) with it... and to see it given top-level importance (beyond just a new data structure).
This is super awesome stuff !
Because I prefer not to have both in production.
Whereas with Redis streams I had to write code in my application to periodically poll and claim unacked messages that had been pending for more than some threshold time.
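For readers who haven't had to write that loop, a rough sketch of what it can look like with a redis-py style client is below. It lists the group's pending entries, keeps those that have been idle longer than a threshold, and claims them for the current consumer. `xpending_range` and `xclaim` are redis-py methods wrapping `XPENDING` and `XCLAIM`; the helper names, the batch size and the idea of calling this from a timer are assumptions, not anything taken from the comment above.

```python
def select_stale(pending, min_idle_ms):
    """Return IDs of pending entries idle for at least min_idle_ms."""
    return [
        entry["message_id"]
        for entry in pending
        if entry["time_since_delivered"] >= min_idle_ms
    ]


def claim_stale(client, stream, group, consumer, min_idle_ms, batch=100):
    """Claim messages other consumers received but never acknowledged."""
    # One page of the group's pending entries list (XPENDING with a range).
    pending = client.xpending_range(stream, group, min="-", max="+", count=batch)
    stale_ids = select_stale(pending, min_idle_ms)
    if not stale_ids:
        return []
    # XCLAIM re-checks idle time on the server, so a message that was
    # acknowledged or claimed in the meantime is simply skipped.
    return client.xclaim(stream, group, consumer,
                         min_idle_time=min_idle_ms, message_ids=stale_ids)
```

The application would call `claim_stale(...)` periodically, process whatever comes back, and `XACK` each message once handled.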
As soon as this happens I will have no use for Redis. I already have better scalable datastores. I often use Redis as a cache for those. If Redis isn't a low-latency, consistent key-value/key-structure store anymore then in my mind it isn't Redis anymore.
How is this a logical step in the context of microservices? I feel there is a trade-off. If I have to scale a service to N instances where each of them caches the same type of data, I'm potentially duplicating that data N times, which means I need to reserve that much additional RAM per instance. I can avoid this by storing the data centrally in a very fast cache, aka Redis. On the other hand, even the fastest standalone cache will be much slower than accessing data in process memory.
Given that the stability and availability of your central data store in your model is likely more important than the N instances talking to it, caching data on those instances is definitely beneficial, rather than asking it for the same data over and over.
You need not reserve that much additional RAM; modern infrastructures end up with unused RAM anyway, by virtue of being optimised for the CPU usage of stateless microservices. Why not put it to use? A smart cache eviction strategy can help optimise what gets held in RAM and what doesn't. A Redis client working in tandem with the Redis server can help with that and save you from writing a bunch of code to do this.
Actually, they don't. If you're scaling a multi-user system to millions of users across hundreds/thousands of instances of your micro-service, they're likely to cache different data because different user sessions will be handled by each instance of your app. So, local caching in each micro-service is valuable. You just need to set an upper limit on how much memory to consume.
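As a rough illustration of "local caching with an upper limit", here is a minimal in-process LRU cache that sits in front of a slower central store. It is a plain-Python sketch, not the server-assisted client caching discussed above: the `fetch` callable stands in for a call to Redis or any other store, the limit is counted in entries rather than bytes, and invalidation of stale entries is not handled.

```python
from collections import OrderedDict


class LocalCache:
    """Bounded in-process cache that evicts the least recently used key."""

    def __init__(self, fetch, max_entries=1024):
        self._fetch = fetch          # called on a miss, e.g. a Redis GET
        self._max = max_entries      # upper limit on cached entries
        self._data = OrderedDict()   # oldest entry first

    def get(self, key):
        if key in self._data:
            self._data.move_to_end(key)      # mark as most recently used
            return self._data[key]
        value = self._fetch(key)             # miss: ask the central store
        self._data[key] = value
        if len(self._data) > self._max:
            self._data.popitem(last=False)   # evict least recently used
        return value


cache = LocalCache(fetch=lambda key: "value-for-" + key, max_entries=2)
print(cache.get("session:1"))  # fetched from the "central store"
print(cache.get("session:1"))  # served from process memory
```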
> I/O threading is not going to happen in Redis AFAIK, because after much consideration I think it’s a lot of complexity without a good reason
vs.
> the way I want to scale Redis is by improving the support for multiple Redis instances to be executed in the same host, especially via Redis Cluster
Having to run multiple Redis instances, with a separate "Redis Cluster [proxy]", sounds like "I didn't originally use threads, and while threads are technically the correct solution (because it's the standard way to do things and not all that complicated), I don't feel like rewriting Redis to use threads the way it should have been done from the start; thus I've come up with an alternative hack that is not ideal but makes my life easier". While I have always argued that memcached is not automatically better solely because it is multi-threaded, the idea that an additional point of failure – a technically unnecessary proxy/broker – is somehow a better solution than threads seems like an attempt to save face for mistakes made during early development.
As for the ACL:
> You get a library from the internet, and it looks to work well. Now why on the earth such library, that you don’t know line by line, should be able to call “FLUSHALL” and flush away your database instantly
What can I say other than "give me a break". Every large codebase has hundreds/thousands of dependencies, and any one of them could slip in a malicious block of code. My analysis: anyone who would be concerned about some library calling FLUSHALL would never even install Redis - a C program that, for all we know, is spamming emails or running a Bitcoin miner while it runs. The fact is, all 3rd party software/dependencies come with risk; if I'm trusting your software not to screw me while I sleep, I think I can trust a basic library not to hijack my Redis server with a FLUSHALL command. Really, this is not even remotely a valid argument. Whatsoever.
Then there's this:
> maybe you just hired a junior developer that is keeping calling “KEYS” on the Redis instance, while your company Redis policy is “No KEYS command”.
Ok, now you're just making shit up. Here's an idea: you don't need ACL to restrict access to KEYS, as that command should have been deprecated and entirely removed in favour of SCAN long ago. You mentioned "I think it’s a lot of complexity without a good reason" regarding multi-threaded Redis; yet here you are advocating an entire, complex ACL layer... for what purpose? For some imaginary malicious scenario where my Redis library is going to call FLUSHALL, or a "junior developer" is allowed to call "KEYS *" – a command that should not even be available in 2019?
I started this comment intending to provide unfiltered – but polite/constructive – feedback. The more I reread your original post, the more frustrated I got at the fact you seem to have lost all common sense. You're trying to justify adding a massive amount of complexity, including a clustering/proxy strategy instead of simple threading, and ACL to save us from imaginary threats. I've previously been in the stage of development that I believe you might be in now – you've run out of truly "good ideas" that would actually benefit your user base, and are now attempting to justify new features as necessary, when the truth is you're just looking to write new code that you personally find to be "new and exciting", rather than iterating on the existing, boring codebase.
I've spent the past 4-5 years advocating that memcached is effectively dead because of Redis. The more you try to make Redis some kind of ACL-secured, clustered database – rather than the lightweight cache store we learned to love – the more likely another project will step in and replace Redis the same way Redis replaced memcached.
* Clients will stay simple, RESP3 is backward compatible with RESP2
* ACLs are mostly an anti-fool protection.
* True multi-threading is impossibly hard to implement, but there will be some ad-hoc workarounds.
* Better persistence will be getting better.
* Existing data structures solve everything, so their number won't be extended.
* Read @antirez twitter to stay tuned.
https://en.wikipedia.org/wiki/Access_control_list
> she/he (not sure)
English is really strange around gender, and idiomatic styles have changed over time. It has always been a valid style in English to use "they" to refer to a single person of unknown gender [1]. This feels natural to English speakers when the subject is unknown and could be one of many potential people, as in:
When the first guest arrives at the party, give them a balloon.
Here, "them" only refers to a single person, but it's correct.
For the past hundred years or so, it was also common to use "he" to refer to an unknown hypothetical person. You see a lot of textbooks that do this. In theory, this wasn't supposed to come across as being exclusionary, but obviously it is. It's an explicitly gendered pronoun.
The feminist movement rightly drew attention to this problem, and writers experimented with a variety of approaches. Sometimes, at the beginning of a book you'll see an explicit "apology" saying something like "we use 'he' but don't mean it to only apply to males". Some simply switched to using "she" for everything. Others switch between "he" and "she" throughout, randomly or use the longer "he or she".
But, lately, it seems like the style is settling down towards simply using the already established singular they for all of these cases. To a native speaker, it feels a little weird at first, but you quickly get used to it. It's less jarring than "he" or "she" for most readers now.
It has a lot going for it:
* It has more established history than any other form.
* It doesn't force the author to pick an arbitrary meaningless gender.
* It's shorter than the awkward "he or she" (which also still forces you to decide which gender to put first).
* It doesn't exclude people who prefer neither "he" nor "she" as their pronoun.
So, if you're trying to figure out what pronoun to use when you don't know a specific one that is correct, default to "they".
[1]: https://en.wikipedia.org/wiki/Singular_they
while gender/sex is often beside the point, number often matters. we can keep the gendered pronouns for the few cases where it's needed, and use ungendered pronouns in the general case, but that requires an unambiguous singular ungendered pronoun.
English already dropped it with 2nd person, settling on the plural (and formal) “you” as the sole pronoun and dropping the informal singular “thee”; and for 2nd person distinguishing number in the pronoun is probably more often an impediment to discussion than 3rd person (a 3rd person pronoun needs an explicit referent in the framing context anyway, so you don't lose much by not adding a reminder of number into the pronoun itself.)
Even if we could, it's still not clear what the ideal solution would be. There are cases where you need to communicate ambiguous number too. It might be nice to have pronouns that distinguish "definitely more than one" from "potentially more than one".
No, no it is not. I assure you, it is not correct.
If this is a concern, rewriting the sentence gives a solution.
"I’ve the feeling she/he (not sure) is not the only one.."
Change to "I’ve the feeling that user is not the only one..."
Your assurances don't mean much compared to usage going back to at least 1382: https://en.wikipedia.org/wiki/Singular_they
https://public.oed.com/blog/a-brief-history-of-singular-they...
Language evolves, and this means that the rules of grammar do too. The MLA already allows pronoun usage to match the preferred pronoun of whoever you are writing about, and is considering allowing the singular they in general. Some style guidelines already allow it. Others still proscribe it.
Plenty of news organizations, including the Washington Post, have set their style guidelines to use the singular they.
It's grammatically correct in plenty of places today, and the current trend is for that to increase, not decrease. You're fighting a losing battle here.
Let me rephrase that reply: Everyone is entitled to his own opinion.
Everyone takes the singular, but in that reply, I paired it with the plural their.
Here's a screenshot: https://i.imgur.com/KKbFXbD.png
That said, it was still a good read. Thank you for the time and effort you put into Redis.
I think there would be value in a "simple Redis": a well-maintained tool for a certain class of problems which do not require a distributed database.
EDIT: I also want to stress that, at the same time, there was a simplification effort in the newer releases. For instance, the Lua scripting side-effects problems are going to be removed completely by simplifying how the replication of Lua scripts is performed. Redis 5 is already like that, but the dead code was not yet removed for safety. Redis 6 will completely remove many parts of the code. So the emphasis is not just on adding, but also on refactoring. ACLs themselves made it possible to refactor authentication into separate functions inside acl.c to lower the overall complexity.
Redis is the most simple, rock-solid piece of software I have ever had the pleasure of using. I find the comparison with FoundationDB jarring.
Don't let the negative comments like that sway your mindset; I think your choices are spot on and the proof is in the pudding.
A quick glance suggests that this would still be similarly possible today. Redis is by no means perfect, but it deserves its reputation for being accessible!
PS: Thanks for the mention of the Lua side effects changes. I'll be curious to read up about your solutions to those puzzles.
I would say that's mostly because, when it gets down to brass tacks, Redis (ignoring the orchestration in Cluster) really is a fairly simple interpreter with simple memory management (largely delegated to the malloc implementation) that glues together a fair number of nice data structure/algorithm implementations that have little codependence. That last point is really, really important for its developer accessibility. (antirez called this modularity elsewhere I think.) It also helps keep code size down.
So work on RESP3, in my view, is very welcome.