I evaluated this for creating local development data sets from our production data. It’s good, but it’s a big and complicated tool, and I remember feeling there was a real impedance mismatch with how we wanted to fit it into our workflow.
In the end I wrote django-devdata to do a similar thing: export anonymised, referentially correct relational data from a large Django site. Configuration is just code in Django settings, and data can be exported/imported pretty quickly. Happy to help anyone get set up with it if it’s useful to others!
I would be interested to know what impedance mismatch you are referring to, and what problems you had to deal with.
It would also be interesting to know when that was. There have been a lot of improvements, especially in the last year.
I am always trying to improve the tool. Feedback is very valuable for this.
Hey, thanks for the reply, and thanks for the tool, it's a great project.
It's hard to describe exactly what I mean by the impedance mismatch, but generally this tool seemed (based on a small amount of research a while ago) to be a primarily GUI-based tool that requires Java, requires quite a lot of up-front knowledge to use or to edit configuration, and has no understanding of our application.
On the other hand, the solution we ended up using (django-devdata) was code-based rather than GUI-based, with configuration checked in to source control, code reviewed, etc. It's a Python dependency, which helped as most of our software and tooling was in Python (no one had Java installed). The config format is pretty approachable when making small updates, with no need to learn much of a new tool, and we did very regular database updates on a schema with ~500 tables. Lastly, as the configuration was just Python code in our codebase, it was easy to integrate with the rest of our application and to re-use utils, validation, etc.
Obviously django-devdata wouldn't be suitable for projects that aren't Django sites, so it's far more limited, but that integration was handy, and I'd probably re-implement it for Rails or any other ORM or language I worked with if necessary, as it's only a few hundred lines of code.
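For anyone wondering what "anonymised, referentially correct" means in practice, here is a rough sketch of the core idea in plain Python. This is not django-devdata's actual API: the tables, field names, and the `anonymise_email` and `export_subset` helpers are all made up for illustration, and a real setup would read rows through the ORM rather than from lists.

```python
import hashlib

# Hypothetical in-memory "tables"; a real tool would query the database.
USERS = [
    {"id": 1, "email": "alice@example.com"},
    {"id": 2, "email": "bob@example.com"},
    {"id": 3, "email": "carol@example.com"},
]
ORDERS = [
    {"id": 10, "user_id": 1, "total": 25},
    {"id": 11, "user_id": 3, "total": 40},
    {"id": 12, "user_id": 3, "total": 5},
]


def anonymise_email(email):
    """Swap an email for a stable placeholder derived from a hash.

    Illustrative only: an unsalted hash of an email is guessable, so a
    real setup would use a salt or generate fake values instead.
    """
    digest = hashlib.sha256(email.encode("utf-8")).hexdigest()[:12]
    return f"user-{digest}@example.invalid"


def export_subset(orders, users, order_ids):
    """Export the chosen orders plus exactly the users they reference."""
    picked_orders = [dict(o) for o in orders if o["id"] in order_ids]
    # Follow the foreign key so no exported order points at a missing user.
    needed_user_ids = {o["user_id"] for o in picked_orders}
    picked_users = [
        {**u, "email": anonymise_email(u["email"])}
        for u in users
        if u["id"] in needed_user_ids
    ]
    return {"users": picked_users, "orders": picked_orders}


dump = export_subset(ORDERS, USERS, order_ids={11, 12})
```

The point is that the subset is chosen by following foreign keys, so every exported row still has the rows it points at, while sensitive fields are rewritten on the way out.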
Thank you for your interest!
It is true that the tool supports mainly relational database systems.
I don't quite understand what ElasticSearch support could mean. Is ElasticSearch a DBMS at all, or just a search engine?
I'm sorry if this question is stupid, but I've really never had anything to do with ElasticSearch.
Yes, ES is a search engine, but under the hood it's really just a non-relational DB with Lucene on top of it.
I guess what I would love is a visual representation of the relations between different fields. (Since ES is not relational, you obviously have to define these relations yourself.)
There are, for example, aggregation functions at your disposal (https://www.elastic.co/guide/en/elasticsearch/reference/curr...)
ES offers a tool called Kibana that lets you run these functions on top of your data (and even visualize it), but I never actually liked it because it's pretty cumbersome.
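To make the aggregation point a bit more concrete, here is a rough sketch of a terms aggregation request body built in Python. The `size`, `aggs` and `terms` keys are part of the Elasticsearch query DSL; the `orders` index, the `status` field and the `terms_aggregation` helper are made-up names for illustration.

```python
import json


def terms_aggregation(field, bucket_count=10):
    """Build a search body that groups documents by the values of `field`."""
    return {
        "size": 0,  # return no individual hits, only aggregation buckets
        "aggs": {
            f"by_{field}": {
                "terms": {"field": field, "size": bucket_count}
            }
        },
    }


# "status" is a hypothetical keyword field on a hypothetical "orders" index.
body = terms_aggregation("status")
print(json.dumps(body, indent=2))
```

A body like this would be sent to the index's `_search` endpoint (for example `POST /orders/_search`), and the response would contain one bucket per distinct value of the field with a document count for each.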