Same, I kept searching for the normal login form. Since the Google option was billed as "more secure", I assumed it was more secure than a normal login form hidden somewhere.
Very cool app--wanted to leave you some feedback after my first test render finished.
Loved the visuals. I can tell that your analysis separates out harmony and percussion as advertised, since the video reacts very well to transients in the music. This is a common criticism I have of music visualizers--often they don't react to transients quickly enough to communicate the percussion patterns, and that's not an issue here.
The whole process of creating the video was very easy. Some more control over the images used would be cool, but I'm not sure how that would work. I do also like that the images are generated and unique for each project. Perhaps, if it's possible, there could be an advanced feature to upload a pool of images to seed the generation. For example, if I wanted to create a video using landscape images of a particular type (mountains, forests, etc.), this kind of feature would be very useful.
I'm not sure I'll be a paying customer for this product, as I don't have a commercial use for music visualization at the moment, but I'll be playing around with it some more for sure. I have a big collection of synthesized versions of classical music I created for a project that never materialized, so I can see using this tool to revisit that music and perhaps create videos to share. Thanks for sharing this very cool project!
Thanks! In previous feedback we saw that uploading your own images to create a new visual theme is the most requested feature. So we're definitely thinking about that.
I haven't seen StyleGAN used in such an abstract way -- the results are really beautiful, very biological. Sometimes it reminds me of cellular automata, at least the very high-resolution ones that stabilize into cellular-looking blobs. Are these trained on some input data, or is it parametrically produced somehow? It's been a few years since I was reading about GANs, so I can't really put together how you're getting these images and textures.
If there is a paired dataset, I guess this could be "easy". The StyleGAN "input" is essentially used to control parameters at various stages within the network, so you could adjust them one at a time, or on varying schedules, to get that sort of gradual effect.
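To make that concrete, here's a minimal plain-NumPy sketch of the idea -- no actual StyleGAN involved, and the layer count, dimensions, and schedules are all made up for illustration:

```python
import numpy as np

# Plain-NumPy stand-in for the idea above; no actual StyleGAN here.
# `w` is a (num_layers, dim) style matrix where each row would feed one
# synthesis stage. Giving each row its own schedule produces the kind
# of gradual, layer-wise drift described.

rng = np.random.default_rng(0)
num_layers, dim = 8, 512  # made-up sizes
w_base = rng.standard_normal((num_layers, dim))
w_target = rng.standard_normal((num_layers, dim))

def styles_at(t, base_period=2.0):
    """Blend each layer between two styles on its own (slower) schedule."""
    w = np.empty_like(w_base)
    for layer in range(num_layers):
        # Later layers cycle more slowly: staggered schedules per stage.
        phase = (np.sin(2 * np.pi * t / (base_period * (layer + 1))) + 1) / 2
        w[layer] = (1 - phase) * w_base[layer] + phase * w_target[layer]
    return w

frames = [styles_at(t / 30) for t in range(90)]  # 3 seconds at 30 fps
```

Each frame's style matrix would then be fed through the synthesis network; only the per-layer scheduling is the point here.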
I know that randomly instantiated neural networks can produce some pretty trippy image transformations as well, so maybe there is a way to bootstrap without paired data.
Lastly, I doubt this is on the right track, but it would be cool if you could produce appropriately styled images with a "compression" approach, i.e., trying to fit the audio information into some small, visually meaningful latent space, and then using that to generate images.
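A toy version of that compression idea, with a random projection standing in for a learned encoder (all sizes and names here are made up):

```python
import numpy as np

# Toy version of the "compression" idea: squeeze each audio frame into
# a small latent vector that could then drive a generator. The random
# projection is a stand-in for a learned encoder.

rng = np.random.default_rng(42)
sr, frame_len, latent_dim = 22050, 1024, 16

audio = rng.standard_normal(sr * 2)  # stand-in for a real track
projection = rng.standard_normal((frame_len // 2 + 1, latent_dim))

def frame_to_latent(frame):
    spectrum = np.abs(np.fft.rfft(frame))     # magnitude spectrum
    spectrum = np.log1p(spectrum)             # compress dynamic range
    return spectrum @ projection / frame_len  # project down to latent_dim

latents = np.stack([frame_to_latent(audio[i:i + frame_len])
                    for i in range(0, len(audio) - frame_len, frame_len)])
```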
edit: OK, just watched the first example; back to square one for me. It's literally pulling stuff from paintings.
We're using audio analysis and applying it to control the output of a Generative Adversarial Network (GAN) trained on a particular set of images, which defines the visual theme.
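For the curious, here is a rough illustration of what "audio analysis controlling the output" can look like in general -- a toy, not our production code: an onset-strength envelope drives how far the latent vector moves per frame, so transients produce visible jumps.

```python
import numpy as np

# Illustrative toy, not the product's actual code: an onset-strength
# envelope computed from the audio drives how far a GAN latent vector
# moves each frame, so percussive transients cause visible jumps.

rng = np.random.default_rng(1)

def onset_envelope(audio, frame_len=1024):
    frames = [audio[i:i + frame_len]
              for i in range(0, len(audio) - frame_len, frame_len)]
    mags = [np.abs(np.fft.rfft(f)) for f in frames]
    # Spectral flux: summed positive change in magnitude between frames.
    flux = np.array([np.maximum(mags[i] - mags[i - 1], 0).sum()
                     for i in range(1, len(mags))])
    return flux / (flux.max() + 1e-9)

audio = rng.standard_normal(22050)  # stand-in for one second of audio
env = onset_envelope(audio)

z = np.zeros(512)  # GAN latent vector
latents = []
for strength in env:
    # Louder transient -> bigger random step through latent space.
    z = z + 0.5 * strength * rng.standard_normal(512)
    latents.append(z.copy())
```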
Thanks! I didn't notice the first example, where it's more apparent that paintings are used as source material -- are the other examples on the landing page produced from paintings or some other original source? It would be an interesting task to try to design images knowing they will be used as the GAN's inspiration.
Also, out of curiosity, did you determine anything about the legality of training the GAN on copyrighted images and deciding its output is its own creative work, versus using public domain images?
This is incredible, and an entertaining use case for StyleGAN. Big fan of the first example track (Kupla x j'san - raindrops), which caught my attention. Keep up the great work!
While it's impressive that an AI managed to do that, you can find much higher quality content made with existing specialized tools.
The best organic, cellular-automata-like video clip synchronized with music that I know of is:
https://youtu.be/_7wKjTf_RlI
A little warning: renders can take a long time these days. We've been trying to increase our quota for GPU instances on GCP, but they've refused our requests. We tried contacting the GCP team, but no reply yet.
If you know someone from the GCP team (sales/support), please let us know! We would like to increase our GPU instance quota.
Great work, the videos on the landing page are stunning! I'd love to give it a try with my own tracks if there was an option to log in without Google as others have also suggested.
Is the name a tribute to Mr. Mescudi?
As a developer of VJ applications for more than 25 years, I find this deeply resonates with me.
I will wait until I can create an account with my email rather than Google before playing with it.
I watched the four example videos. While impressive, the first one has so much flickering that I had to stop watching it. The second example is perfect, and the connection to the music comes through very well. The third one lost me; I am a big fan of ambient and cosmic music, but the result did not seem connected. The fourth one is much better, and you perceive the waves of the musical input very nicely.
While using a very different technological approach, the three videos on the right side have a lot in common with the results you see on Shadertoy.
Many music videos on YouTube are just a static picture; there is an excellent market for your idea. Maybe try to have more examples based on what you see prevalent on YouTube.
I uploaded one of my own tunes, one I made with my wife. The video it generated was spectacular. I was so impressed that the relatively high price tag is not a deal-breaker. Love this.
I made it all the way to the "Google login", pressed it, and was prompted for _way more_ than my email, like my advertising preferences and any other information I've made available.
It is possible to make that optional, and by all means you should; that is possible with Google sign-in scopes. Even if you want my public profile info and not just my email, that should be optional.
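For reference, requesting only the minimal scope in Google's OAuth flow is just a matter of the `scope` parameter. Something like this (the client ID and redirect URI below are placeholders):

```python
from urllib.parse import urlencode

# Sketch of requesting only the minimal scope in Google's OAuth flow.
# The endpoint and scope names are Google's documented ones; the client
# ID and redirect URI below are placeholders.

params = {
    "client_id": "YOUR_CLIENT_ID.apps.googleusercontent.com",
    "redirect_uri": "https://example.com/oauth/callback",
    "response_type": "code",
    "scope": "openid email",  # minimal: no profile, no extra data
}
auth_url = "https://accounts.google.com/o/oauth2/v2/auth?" + urlencode(params)
```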
Ok, we will add user+pw. I've done some research in the meantime:
- FB is not that easy, since you can't set the email address as a required field, and we use that to send the final render. We could let users fill it in within the app, though.
Cool and interesting idea, but like most have said, I won't be logging in using Google OAuth. Hope the demos are an actual display of the tech instead of "hand made".
I don't really have a good solution for it. I've toyed around with deep learning based SaaS type projects though so I know cloud server costs are a major factor.
What sort of worked was routing the data so the inference ran on my own local server (i.e., my workstation), although the downside is you can't use your GPU for other stuff. At the time I looked into it, three months of cloud server costs were enough to buy a GPU workstation. I also tried to do whatever I could to optimize inference.
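A toy sketch of that routing setup, with Python's in-process queue standing in for whatever cloud queue (Pub/Sub, SQS, etc.) would actually carry jobs to the local box:

```python
import queue
import threading

# Toy version of the routing setup: render jobs land in a queue (in
# production this would be a cloud queue such as Pub/Sub or SQS), and a
# worker on your own GPU workstation pulls and processes them, so the
# expensive inference never runs on rented cloud GPUs.

jobs = queue.Queue()
results = {}

def local_gpu_worker():
    while True:
        job = jobs.get()
        if job is None:  # shutdown sentinel
            break
        # run_inference(job) would go here, on the local GPU
        results[job] = f"rendered-{job}"

worker = threading.Thread(target=local_gpu_worker)
worker.start()
for job_id in ("track-1", "track-2"):
    jobs.put(job_id)
jobs.put(None)
worker.join()
```

The trade-off is exactly the one mentioned above: the workstation's GPU is tied up whenever jobs are flowing.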
I have no idea what kind of volume you're getting, but I imagine it'd be even worse for video-based GAN stuff. Maybe go for quality and target the super high end? You probably have a much better idea than me.
All the best, will be interested to know how it goes
Do you have a YouTube demo or something? Personally, I don't log in to something unless I'm seriously interested, and I have no idea what this is without a demo.
I'm OK with you storing my password; I'd create a unique one, and that seems secure.
I don't want to log in via Google though, I see it as less privacy friendly.
It's also arguably less secure; if my Google account is compromised, the attacker gains access to other services.
Email/password auth is coming within the week.
My co-founder has a YouTube channel where he will be going into technical detail how this all works. He will be posting a video in the coming weeks. https://www.youtube.com/channel/UCNIkB2IeJ-6AmZv7bQ1oBYg
Firstly, like others have said, I don't want to use Google to sign in.
Also, why must I sign in? This seems like a tool, and thus doesn't need a login.
Which makes me ask... what's the pricing? When I do "sign up", do I only get one video? One minute? A limit of some type?
Also, what's the legal? Do you own the video? Can I monetize it?
An FAQ can probably help answer the above.
Google -> fair point, we’ll add some more options
Signing in is needed though. We could maybe find a way without it, but it would be a big hassle. At minimum we would need an email address for when the render is done.
For now, we have a very basic pricing scheme. We add a subtle watermark to the video. If you pay to remove it, all the renders (including new ones) for that project will be without the watermark. For now, we don't impose any limits except on audio length and file size. Do you think this is a fair model? Or would something else be better?
Legal -> it's your video. But we're not responsible for storage.
1. Upload your audio track
2. Pick a visual theme
3. Render your video
Enjoy!
Any feedback is welcome!
You can always look up a song on YouTube and use youtube-dl with the audio flag to grab an MP3 file, if you really do want to try it out.
- FB
- Twitter
- Apple

Would email+pw be necessary? I'm maybe biased; I generally prefer using and implementing third-party authentication.
I would consider using Apple for something like this. But limiting it to third-party auth is pretty unfriendly to users.
- FB is not that easy, since you can't set the email address as a required field. We use that to send the final render. We could let it be filled in in the app though.
- Twitter will probably be added
- Apple can be investigated
It's a pretty costly operation since the marginal cost is very high compared to other SaaS products. We'll probably iterate on the pricing model.
What do you think a good model would be?