I’m part of the Muze team at Charts.com. Over the years I’ve seen lots of people struggle to find the right balance between a low-level visualization kernel (like d3) and black-box configurable charts (HighCharts, FusionCharts).
So we decided to build Muze with a data-first approach: you load your data into an in-browser DataModel, run relational-algebra-enabled data operators to get the right subset of data, and then just pass it to the Muze engine, which automatically renders the best visualization for it.
Any changes to data (including application of data operations) automatically updates the visualization, without you having to do anything else.
A couple of added benefits:
- With other libraries, if you want to connect multiple charts (for cross-interactivity, drill-down, etc.), you have to manually write the ‘glue’ code. With Muze, all charts rendered from the same DataModel are automatically connected (enabling cross-filtering).
- Muze allows faceting of data out of the box with a multi-grid layout.
- Composability of visualizations allows you to create any kind of cartesian visualization with Muze, without having to wait for the charting library vendor to release it as a ‘new chart type’.
- Muze exposes a developer-first API for enabling interactivity and customizations. You can use the low-level API to create complex interactions.
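To make the data-first flow concrete, here is a toy sketch of what "relational data operators before rendering" looks like, in plain JavaScript over row objects. This is an illustration only, not Muze's actual DataModel API; all names here are made up:

```javascript
// Toy relational operators (select/project) over plain row objects --
// an illustration of the flow, not Muze's real DataModel API.
const select = (rows, pred) => rows.filter(pred);
const project = (rows, fields) =>
  rows.map(r => Object.fromEntries(fields.map(f => [f, r[f]])));

const sales = [
  { region: 'East', month: 'Jan', revenue: 120 },
  { region: 'West', month: 'Jan', revenue: 90 },
  { region: 'East', month: 'Feb', revenue: 150 },
];

// Subset the data first; the subset is what gets handed to a renderer.
const eastOnly = select(sales, r => r.region === 'East');
const slim = project(eastOnly, ['month', 'revenue']);
console.log(slim); // [{ month: 'Jan', revenue: 120 }, { month: 'Feb', revenue: 150 }]
```

In Muze the equivalent operators run on the DataModel itself, and the resulting instance is what you pass to the engine.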
We’ve literally just launched this last month or so, so I’d love some feedback if you can spare the time.
> I’ve seen lots of people who struggle to find the perfect balance between low-level visualization kernel (like d3), or black-box configurable charts (HighCharts, FusionCharts)
This is the biggest pain point for me with most current solutions. Either development time is super fast (e.g., Tableau, Periscope) but going beyond 80% is difficult, or development time is much longer (e.g., d3 or APIs built on it) but you get full customization and getting to 100% is straightforward. For me, there is certainly a need to develop an 80% solution fast, but I then always want to redo the whole thing with a lower-level solution. I would prefer to piggyback off the 80% solution to get to 100% in the same software. That's a huge win for me. Thanks for providing a solution to this end; I will definitely play around with this.
Hi, thanks for the reply! Actually, apologies for throwing your name next to Tableau like that. I think your product does a great job incorporating things like R/Python scripting to allow more flexibility in how data can be manipulated within the product. In this sense I prefer Periscope to Tableau (and in many other senses, actually).
A problem I encountered (granted, over a year ago) was creating grouped bar charts with confidence intervals. Bars were grouped on some discrete x-axis labels. The suggested solution for confidence intervals on grouped bars was to use a scatter plot to draw them, but this clumped them all at the x-label position, not at the center of each bar. matplotlib, for example, treats the visualization as an object, in which case to add confidence intervals you just query the bar objects for their positions and place line segments of the desired widths at the centers of the tops of the bars (or wherever; you have full control over this). So in general, what I want is a marriage of these two paradigms: quick development of a visualization based on data, but then the ability to switch to viewing and manipulating the visualization as a collection of instantiated objects with full control over their attributes.
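For what it's worth, the geometry involved is simple once bar positions are queryable. A small sketch (all names made up) of computing the x-centers of the bars in one group, so CI segments can sit on each bar rather than on the shared category label:

```javascript
// Given a grouped bar chart, compute the x-center of each bar within one
// category band, so a confidence-interval segment can be centered on the
// bar instead of on the category label. Illustrative names and layout.
function barCenters(categoryIndex, nGroups, { bandWidth = 1, padding = 0.2 } = {}) {
  const inner = bandWidth * (1 - padding);     // width available for the bars
  const barWidth = inner / nGroups;            // each bar in the group
  const start = categoryIndex * bandWidth + (bandWidth - inner) / 2;
  return Array.from({ length: nGroups },
    (_, g) => start + barWidth * (g + 0.5));   // center of bar g
}

// Two bars at category 0 with no padding: centers land inside the band,
// not at the shared label position.
console.log(barCenters(0, 2, { padding: 0 })); // [0.25, 0.75]
```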
I am open to revisiting over any development periscope has made to this end.
Yeah, the hack you described for CIs is typical of "80% charting". We have a list of probably thousands of longtail visualization requests and we're way past the 80/20 point.
These days customers who want to go 100% use the Python/R editors and do their custom visualization there. So you do your SQL query like usual, but then pipe it to Python/R for the visualization. Have you tried that, and has it worked for you? Or do you prefer another model?
Maybe try Muze to build all those long-tail viz :) given that in Muze you don't have to think of any viz as a chart type, but rather as a composition of layers. Would love to talk to you guys!
If you don't mind, could you list the other things that make you prefer periscope to tableau? Also when you say R and python scripting, do you mean using them to prepare your data or to do something else?
I'm trying to implement the Programmatic trellis layout, but having difficulty re-creating it because I can't see the structure of the data. I started with the "yo muze" generator and am trying to manipulate with my own data.
We had plans to move DataModel (which manages all data ops) to the server side. We even have a half-baked DataModel in Scala, which we thought we would complete once we understood some use cases. But currently we have put it on hold.
We would love to know your:
- use case
- number of data points
- ops on data on the server side
One use case could be a data visualization similar to what I built in 
To build the visualization in , I used 3 datasets in CSV format from a Kaggle competition, and I implemented the charts using dc.js and Leaflet.js. The charts were interactive and I managed to filter the data even in the map.
The largest dataset was 284 MB, which was still ok and didn't crash my browser.
There were 2 drawbacks to my approach: 1. All the data was in the browser; if my data were bigger (~1 GB), it would crash the browser. 2. If I deployed the visualization to a server (for example AWS), rendering would become extremely slow, as it has to download all the data to the browser...
It seems to me like you could leverage any number of analytical engines that expose relational interfaces, rather than go to the trouble of building your own relational model. What are the goals in building first, rather than integrating?
So here is the thing with our DataModel. Every time you perform an op on a DataModel, it creates another instance. Performing multiple such operations creates a DAG where each node is an instance of DataModel and each edge is an operation.
We have auto-interactivity, which propagates data (dimension) pulses along the network. Any node which is attached to a visualization receives those pulses and changes the visual.
So far I have not found any relational interface which exposes this DAG and an API for it to the user. Hence we thought of building this.
Having said that, we might use some established relational interface and do the propagation ourselves.
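A stripped-down sketch of the idea in plain JavaScript (an illustration, not our actual implementation): each operation returns a new node that remembers its parent, so the edges of the directed graph are operations, and a pulse propagates down the graph to any node that has a listener attached:

```javascript
// Toy DAG of DataModel instances: each op creates a child node, and a
// "pulse" walks the directed edges. Not the real Muze implementation.
class Node {
  constructor(rows, parent = null) {
    this.rows = rows;
    this.parent = parent;
    this.children = [];
    this.listeners = [];
    if (parent) parent.children.push(this);
  }
  select(pred) { return new Node(this.rows.filter(pred), this); } // edge = op
  onPulse(fn) { this.listeners.push(fn); }
  propagate(pulse) {                 // walk the directed graph: O(n) in nodes
    this.listeners.forEach(fn => fn(pulse));
    this.children.forEach(c => c.propagate(pulse));
  }
}

const root = new Node([{ year: 2017, v: 1 }, { year: 2018, v: 2 }]);
const recent = root.select(r => r.year >= 2018);

const seen = [];
recent.onPulse(p => seen.push(p));   // a "visualization" subscribing to a node
root.propagate({ year: 2018 });      // e.g. the user brushes year 2018
console.log(seen); // [{ year: 2018 }]
```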
The implementation you are discussing sounds pretty elegant. I am most familiar with Power BI from a data viz perspective, but have used most of the enterprise viz tools out there.
The thing that always struck me about Power BI (and also Qlik) is that it is very much a model-first tool. Visualization is secondary to the model, to the extent that much of the friction I see in new users has been treating it as a reporting/layout/visualization tool when, in fact, it is a data modeling tool with a visualization engine strapped on.
One of the big drawbacks with Power BI is that it has a terribly inefficient implementation for propagating filter contexts for visual interactions (this is their translation of your "auto interactivity, which propagates data (dimensions) pulse along the network"). I do not know the internal implementation, but I am relatively certain that visual interactions are ~O(N!) in the number of visual elements on a report page, based on my experience of performance scaling across a wide range of reports. Regardless, one of the best practices is to limit a Power BI report page to a small number of visualizations (recommendations of the cutoff value vary, and types of visuals can also impact this).
If I understand you correctly, you are calculating the minimum set of recalculations/re-renderings necessary, based on the data element that a user has interacted with. This should be something much closer to O(N) in the number of visuals to propagate user selections to other visuals. I am making an assumption that most visuals should interact, as typically the scope of a single report should have a high degree of intersection of dimensionality across all report elements.
I do not know of any analytics engine that exposes the sort of DAG and associated API you are discussing, either. The reason for my initial question was simply because that sort of engine is a product in and of itself. There are plenty of columnstore databases (and following other paradigms, but optimized for OLAP workloads) out there. It seems like biting off a lot to tackle both the data engine and the visualization tier at the same time.
The big reason that I ask is that this sort of approach to visualization seems to me to benefit greatly from a data model that supports transaction-level detail. The type of interactivity that you expose is extremely powerful. I have seen interactive tools hamstrung by data models that do not allow sufficient interaction. As soon as you put interactivity in front of users, in my experience, they want to do more with the data. If you are limited to datasets that can live comfortably in the browser, that seems a showstopper to me, as it will require pre-aggregation to fit most of the datasets I've seen; pre-aggregation negates many benefits of interactive data exploration.
I'll be taking a much further dive into your product either this weekend or next. I'm very interested.
You are absolutely correct: the propagation for us is O(n), as the graph is directed. But the problem there is multifold. Once a node receives a propagation pulse, it tries to figure out the affected subset using the dimensions received in the pulse. This requires joining, hence a chance of building an O(mn) cartesian product. If you look at the https://www.charts.com/muze/examples/view/crossfiltering-wit... example, the contribution bars that are drawn when the first chart is dragged require a join followed by a groupBy.
Which is why performing this in a browser environment, even for a small amount of data (say 10k rows), is a nightmare. There are ways to address this, but in the browser you hit the limit pretty soon.
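To illustrate the cost: a naive join followed by a groupBy in plain JavaScript (illustrative code, not our implementation). The nested loop is where the O(mn) blowup comes from, and it runs on every pulse:

```javascript
// Naive join: worst case O(m*n) pairs examined.
function join(left, right, on) {
  const out = [];
  for (const l of left)              // O(m)
    for (const r of right)           // O(n) per left row
      if (l[on] === r[on]) out.push({ ...l, ...r });
  return out;
}

// Simple sum-aggregation by key.
function groupBy(rows, key, field) {
  const acc = {};
  for (const r of rows) acc[r[key]] = (acc[r[key]] || 0) + r[field];
  return acc;
}

const days = [{ day: 'Mon' }, { day: 'Tue' }];
const tweets = [
  { day: 'Mon', n: 3 }, { day: 'Mon', n: 2 }, { day: 'Tue', n: 5 },
];
console.log(groupBy(join(days, tweets, 'day'), 'day', 'n'));
// { Mon: 5, Tue: 5 }
```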
We wanted the concept to be validated first, hence we have built it for the browser only. But we would love to hear from / learn from / discuss with you on this before we go ahead and build the data model on the server.
Another ambiguity with interaction is the visual effect of the interaction: questions like "do you really want all your charts to be cross-connected?" An in-house survey showed us there is no certain answer, and what kind of visual effect should happen on interaction differs from person to person and is a function of the use case. Which is why we have gone for explicitly chosen behaviour, like:
    muze.ActionModel.for(...canvases)     /* for all the charts in the page */
      .enableCrossInteractivity()         /* allow default cross interactivity */
      .for(tweetsByDay, tweetsByDate)     /* but for the first two canvases in the example */
    })                                    /* if selection using mouse click or brushing happens, filter data */
We are still writing the docs for this. We hope to finish all these docs in two weeks' time.
I'm happy to continue this discussion in further detail and share my experience. You can get in touch with me at the email address in my profile if you'd like.
You're hitting a very important question in your fourth paragraph about ambiguity of desired effect from interaction. I often catch myself thinking I've heard every use case and built most of them in various viz tools. But I have learned that I am always wrong when I think that. I frequently encounter people asking for new things and it is always a toss-up whether what they want is trivial and novel or impossible and obvious.
I tend to be a data-guy much more than a viz-guy, but I fully understand the value of viz for actually presenting knowledge. Like I said, I'm interested in trying out your tool more.
Customers I've worked with that have small datasets would typically range into the 10M order of magnitude for a primary fact, though we had smaller outliers. Additionally, it would be common to have wide dimensions that could be KBs/record, which can add up quickly.
Might I suggest giving Perspective.js a look? Supports many of the same visualizations as Muze (and some it doesn't, specifically datagrids), is user-configurable, written in WebAssembly (C++) for extreme performance, and can run trivially on the server via node.js - there is even a CLI version:
Interestingly enough, while this is as good as it gets, my initial reaction to domain name was negative. From previous experience I learned that domains like that are taken by squatters or companies that... Well, don't really know how to capitalize on them. Kudos to OP for actually offering something to do with charts! :)
I really wish there were a legal limit to how long you can squat, say 5 years, before it goes to a lottery system for a pool of people who applied to be a part of the lottery and can show legitimate use for it.
I've been a career data analyst for 12-ish years. At first I didn't get the reference to Tableau, because I use Tableau for about 5-8 hours every day. I've played around with every new charting library since Flex because I've always wanted to create a free version of Tableau that gets me 80% of what I use Tableau for, but with 1% of the frustration of using Tableau. The problem was, I could never figure out how Tableau is able to create its visualizations so easily just by drag and drop. Every library makes you think of the chart you want to make beforehand, but as an analyst, I work on the data first and then spend an almost equal amount of time finding the most intuitive visualization for the trend I'm trying to convey. So I had just put that idea on hold indefinitely.
I went through the tutorial and I have to say... oh man, this is amazing. Building a Tableau clone is now possible! I hope you guys don't go under, because it's going to take me a while, but I'm super excited!
Does this work on mobile? Also, when I click "Play" the chart takes at least 1-2 seconds to render. Is that just your code-running engine, or does every visualization have that lag?
Hey, it does work on mobile. However, remember that visualizations like crosstabs and SPLOMs are not meant to be displayed as-is in a space-constrained area. There are multiple ways to handle this situation; at this point in time, Muze does not change the layout based on available space.
The web framework fetches the data, does some additional checking on the data and schema, processes the visible code and renders it. That is probably why you are seeing the lag.
Also, there are a few areas where Muze's performance needs to be improved. We are doing a release to address this soon.
The charts do work on mobile.
Also the lag you are seeing is only a limitation of our code engine which has to fetch and process the sample in an iframe due to security constraints.
In a normal environment where the library is loaded via a script tag the charts should render very quickly.
We currently use Plotly quite a bit where I work for a customer facing website with a wide variety of charts. Does anyone know what some of the tangible benefits might be to migrating to this instead of using Plotly?
Having only browsed both Plotly and this project, this is my understanding:
Plotly seems like it's just the charting/graphing layer. A common use case (and increasingly an expectation) is that a series of graphs on a single page be responsive and cross-filterable. For instance, if you click on a single element in one chart, it should filter the related charts accordingly. Additionally, these filters should build on each other, and the developer/analyst should be able to define that.
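A toy version of that glue code in plain JavaScript (all names illustrative): each chart contributes a predicate, and every chart re-renders from the intersection of the active filters, so filters naturally build on each other:

```javascript
// Sketch of hand-written cross-filter "glue": each chart registers a
// predicate, and all charts re-render from the intersection of them.
const filters = new Map();           // chartId -> predicate
const applyAll = rows =>
  [...filters.values()].reduce((acc, pred) => acc.filter(pred), rows);

const data = [
  { region: 'East', year: 2018, v: 1 },
  { region: 'West', year: 2018, v: 2 },
  { region: 'East', year: 2019, v: 3 },
];

// User clicks "East" on chart A, then brushes year >= 2019 on chart B:
filters.set('chartA', r => r.region === 'East');
filters.set('chartB', r => r.year >= 2019);   // filters compose
console.log(applyAll(data)); // [{ region: 'East', year: 2019, v: 3 }]
```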
Really, you're now doing a form of data modelling and in the domain of BI, and Plotly isn't going to help you figure any of this out.
Tableau and Power BI have gotten traction by building products that not only include but prioritize this form of modelling. Once you define your data model, you get the multi-dimensional charting for free.
The appeal to me of this library (not having done a thorough G2) is an open source alternative to those products that integrates charts and data modelling easily.
Voyager is a recommendation system based on the characteristics of the variables present in the data. The user is mostly limited by the offerings of Voyager (same story with https://github.com/vega/polestar).
Vega is a descriptive version of d3. We found it hard to debug and to create complex viz with.
Vega-Lite is a concise and more intuitive version of Vega, though.
However, Muze was created to start directly from data, with layouting, composable layers, automatic cross-interaction and a robust interaction mental model. Muze is inspired by the Vega-Lite InfoVis paper and the Snap-Together Visualization paper.
Hi, one of the Vega-Lite authors here. I'm glad you like Vega-Lite and were inspired by our InfoVis paper. Vega-Lite is much more mature now than when we wrote the InfoVis paper, so I suggest you check it out. Compared to Muze, Vega-Lite embraces full declarativeness.
Hey I think you are Dominik (guessed from your handle). Thanks for the reply.
The Vega-Lite paper and the layered grammar of graphics were the biggest motivations for writing Muze. Vega-Lite is still my go-to viz library for my IPython and JS work. Hence there are healthy intersections between Vega-Lite and Muze terminologies and concepts.
Looks very nice at first glance. I'm just digging into each example visualization. Noticed that the "Bubble with temporal axes" seems to peg my browser (Chrome 69 on Mid 2015 2.5 GHz Intel Core i7 Macbook Pro).
Thanks for bringing this up. It's possible to create a visualization like this easily using Muze. We will create an example and keep you updated. Should we reach out to you once done? (If yes, let me know how!)
Thanks Jerry2. The animation was created using Adobe After Effects and exported using the Bodymovin plugin as a JSON file along with keyframe images. The exported animation can be loaded using the Lottie framework: https://airbnb.io/lottie/. Hope this helps.
Exactly what I saw. This isn't a matter of a slow server; some examples are taking way too many resources. Freezes.
In the case of many charts in scope (I need a refresh every 10 seconds), this will turn into blocking in no time.
Testing performance in the Firefox debugger already freezes with one chart and no refresh, and I didn't see the garbage collector do anything significant. (Memory leaks...)
Too bad, because the charts are good. But the performance should come with a big warning.
So we create an instance of DataModel once, and then we perform operations like filtering, projection of columns, sorting, etc. Every time an operation is performed, a new instance of DataModel is returned; hence it's immutable. But under the hood only one copy of the data resides in the system, shared by all instances of DataModel: for every operation we just record a formula and save it on the DataModel instance. The data for that particular instance is computed on demand based on the formula. It's not pure immutability by definition, hence pseudo-immutable.
However, not every operation supports formula storing. Operations like joining and grouping create new data.
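A minimal sketch of that pseudo-immutability in plain JavaScript (illustrative, not the real DataModel): the rows are shared across all instances, and each derived instance only records a predicate that is evaluated on demand:

```javascript
// Pseudo-immutable DataModel sketch: one shared copy of the rows, with
// each derived instance recording a formula instead of copying data.
// Joins/groupBys (which materialize new data) are deliberately omitted.
class DataModel {
  constructor(rows, formula = r => true) {
    this.rows = rows;          // the single shared copy of the data
    this.formula = formula;    // predicate recorded instead of copied rows
  }
  select(pred) {               // returns a new instance; rows stay shared
    const prev = this.formula;
    return new DataModel(this.rows, r => prev(r) && pred(r));
  }
  getData() {                  // computed on demand from the formula
    return this.rows.filter(this.formula);
  }
}

const base = new DataModel([{ v: 1 }, { v: 2 }, { v: 3 }]);
const mid = base.select(r => r.v > 1).select(r => r.v < 3);
console.log(mid.getData());        // [{ v: 2 }]
console.log(base.rows === mid.rows); // true -- data is shared, not copied
```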
We are updating the docs rapidly. All this info would be on the docs soon.
FusionCharts is a configurable JS charting library. You choose your chart, define the parameters (highly configurable, though) and then render it. FusionCharts' strength is ease of use, backward compatibility, theming etc.
Muze is data-first. You start with data, apply any operations (if needed), then render. Muze automatically detects the right chart for that and then renders. Also Muze allows you to compose any kind of cartesian visualization, as it follows grammar of graphics.
So if I've to explain this in a spectrum, it goes like this:
d3 (very powerful, high learning curve, you can do anything)
Muze (data-first, Grammar of graphics oriented, compose viz)
FusionCharts (chart-first, lots of depth in configuration, but you can't extend it yourself with new chart types)