Show HN: WZRD – Visualize your music with StyleGAN

(wzrd.ai)

76 points | by acnops 1274 days ago

22 comments

  • thih9 1274 days ago
    > Why is [logging in via Google] more secure? When logging in with one of the buttons above, we can't store your password.

    I'm OK with you storing my password; I'd create a unique one, so that seems secure enough.

    I don't want to log in via Google though, I see it as less privacy friendly.

    It's also arguably less secure; if my Google account is compromised, the attacker gains access to other services.

    • acnops 1274 days ago
      Good point, we’ll add more login options.
    • XCSme 1274 days ago
      Same, I kept searching for the normal login form. Because the Google button was said to be "more secure", I assumed there was a normal login form hidden somewhere that it was being compared against.
  • WORLD_ENDS_SOON 1274 days ago
    Very cool app--wanted to leave you some feedback after my first test render finished.

    Loved the visuals. I can tell that your analysis separates out harmony and percussion as advertised, since the video reacts very well to transients in the music. This is a common criticism I have of music visualizers: often they don't react to transients quickly enough to communicate the percussion patterns, and that's not an issue here.

    The whole process of creating the video was very easy. Some more control over the images used would be cool, but I'm not sure how that'd work. I do also like that the images are generated and unique for each project. Perhaps, if it's possible, there could be an advanced feature to upload a pool of images to seed the generation. For example, if I wanted to create a video using landscape images of a particular type (mountains, forests, etc.), this kind of feature would be very useful.

    I'm not sure I'll be a paying customer for this product, as I don't have a commercial use for music visualization at the moment, but I'll be playing around with it some more for sure. I have a big collection of synthesized versions of classical music I created for a project that never materialized, so I can see using this tool to revisit that music and perhaps create videos to share. Thanks for sharing this very cool project!

    • acnops 1274 days ago
      Thanks! In previous feedback we saw that uploading your own images to create a new visual theme is the most requested feature. So we're definitely thinking about that.
  • jazzyjackson 1274 days ago
    I haven't seen StyleGAN used in such an abstract way -- the results are really beautiful, very biological. Sometimes it reminds me of cellular automata, at least the very high resolution ones that stabilize into cellular-looking blobs. Are these trained on some input data, or are they produced parametrically somehow? It's been a few years since I was reading about GANs, so I can't really put together how you're getting the images and textures here.
    • boreas 1274 days ago
      Hope this gets a response - I don't see a paper.

      If there is a paired dataset, I guess this could be "easy". The StyleGAN "input" is essentially used to control parameters at various stages within the network, so you could adjust them one at a time, or on varying schedules or something, to get that sort of gradual effect.
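
      Roughly what I mean, as a toy sketch with made-up numbers (the per-layer style vectors are how typical StyleGAN code is structured; the schedule here is only a guess at one possibility, and the generator itself is omitted):

        import numpy as np

        # Toy sketch: move each of StyleGAN's per-layer style vectors on its own
        # easing curve, so different layers change at different rates over time.
        # Each w_t would be fed to a pre-trained synthesis network (not shown)
        # to render one video frame.
        n_layers, w_dim, fps, duration_s = 14, 512, 30, 10
        rng = np.random.default_rng(0)
        w_start = rng.normal(size=(n_layers, w_dim))
        w_end = rng.normal(size=(n_layers, w_dim))

        frames = []
        for t in np.linspace(0.0, 1.0, fps * duration_s):
            # Higher layers lag slightly behind lower ones -- one possible schedule.
            schedule = np.array([t ** (1 + i / n_layers) for i in range(n_layers)])
            w_t = (1 - schedule)[:, None] * w_start + schedule[:, None] * w_end
            frames.append(w_t)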

      I know that randomly instantiated neural networks can produce some pretty trippy image transformations as well, so maybe there is a way to bootstrap without paired data.

      Lastly, I doubt this is on the right track, but it would be cool if you could produce appropriately styled images with a "compression" approach, i.e., trying to fit the audio information into some small, visually meaningful latent space and then using that to generate images.

      edit: ok, just watched the first example, back to square one for me. It's literally pulling stuff from paintings.

      • acnops 1274 days ago
        We're using audio analysis and applying it to control the output of a Generative Adversarial Network (GAN) trained on a particular set of images, which defines the visual theme.
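
        For illustration, a very rough sketch of the kind of mapping involved (this is not our exact pipeline; the harmonic/percussive split, the per-pitch-class directions, and the 0.05 step size are all arbitrary choices for the example):

          import numpy as np
          import librosa

          # Sketch: percussion drives the speed of movement through the latent
          # space, harmony picks the direction. The generator call is omitted;
          # each w below would be rendered to one video frame.
          y, sr = librosa.load("track.mp3", sr=22050)
          y_harm, y_perc = librosa.effects.hpss(y)

          onset_env = librosa.onset.onset_strength(y=y_perc, sr=sr)  # per-frame energy
          chroma = librosa.feature.chroma_cqt(y=y_harm, sr=sr)       # (12, n_frames)
          onset_env = onset_env / (onset_env.max() + 1e-8)

          rng = np.random.default_rng(0)
          directions = rng.normal(size=(12, 512))  # one latent direction per pitch class
          w = rng.normal(size=512)

          latents = []
          for i in range(min(len(onset_env), chroma.shape[1])):
              step = chroma[:, i] @ directions      # harmony -> direction
              w = w + 0.05 * onset_env[i] * step    # percussion -> speed
              latents.append(w.copy())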

        My co-founder has a YouTube channel where he will be going into technical detail about how this all works. He will be posting a video in the coming weeks. https://www.youtube.com/channel/UCNIkB2IeJ-6AmZv7bQ1oBYg

        • jazzyjackson 1274 days ago
          Thanks! I didn't notice the first example, where it's more apparent that paintings are used as the source material -- are the other examples on the landing page produced from paintings or some other original source? It would be an interesting task to try and design images knowing they will be used as the GAN's inspiration.

          Also, out of curiosity, did you determine anything about the legality of training the GAN on copyrighted images and deciding that its output is its own creative work, versus using public-domain images?

  • canadianwriter 1274 days ago
    Might be interesting but I'm missing some info.

    Firstly, like others have said, I don't want to use Google to sign in.

    Also, why must I sign in? This seems like a tool and thus doesn't need a login.

    Which makes me ask... what's the pricing? When I do "sign up", do I only get one video? One minute? A limit of some type?

    Also, what's the legal situation? Do you own the video? Can I monetize it?

    An FAQ can probably help answer the above.

    • acnops 1274 days ago
      Hi, thank you for your feedback!

      Google -> fair point, we’ll add some more options

      Signing in is needed though. We could maybe find a way without it, but it would be a big hassle. At minimum we would need an email address for when the render is done.

      For now, we have a very basic pricing scheme. We add a subtle watermark to the video. If you pay to remove it, all the renders (also new ones) for that project will be without the watermark. For now, we don't impose any limits except on audio length and file size. Do you think this is a fair model? Or would something else be better?

      Legal -> it’s your video. But we’re not responsible for storage.

  • Jonanin 1274 days ago
    Why do I have to sign in using Google?
    • acnops 1274 days ago
      We’ll add some more options. Thank you for the feedback!
  • tonyteate 1274 days ago
    This is incredible, and an entertaining use case for StyleGAN. Big fan of the first example track (Kupla x j'san - raindrops), which caught my attention. Keep up the great work!
  • The_rationalist 1274 days ago
    While it's impressive that an AI manages to do that, you can find much higher-quality content made with existing specialized tools. The best organic, cellular-automata-like video clip synchronized with music that I know of is: https://youtu.be/_7wKjTf_RlI
    • acnops 1274 days ago
      Amazing! I do think these two methods will become complementary, since the output can be very different. Neither can do what the other can.
  • acnops 1274 days ago
    A little warning that renders can take a long time these days. We've been trying to increase our quota on GCP for GPU instances, but they've refused our requests. We tried contacting the GCP team, but no reply yet.

    If you know someone from the GCP team (sales/support), please let us know! We would like to increase our GPU instance quota.

  • acnops 1274 days ago
    Hi! Today, we're launching WZRD, an online platform to create immersive videos using AI.

    1. Upload your audio track

    2. Pick a visual theme

    3. Render your video

    Enjoy!

    Any feedback is welcome!

    • afkqs 1274 days ago
      Great work, the videos on the landing page are stunning! I'd love to give it a try with my own tracks if there were an option to log in without Google, as others have also suggested. Is the name a tribute to Mr. Mescudi?
    • FraKtus 1274 days ago
      Congratulations!

      As a developer of VJ applications for more than 25 years, I find this deeply resonates with me.

      I will wait until I can create an account with my email rather than using Google before playing with it.

      I watched the 4 example videos. While impressive, the first one has so much flickering that I had to stop watching it. The second example is perfect, and the connection to the music comes through very well. The third one lost me; I am a big fan of ambient and cosmic music, but the result did not seem connected. The fourth one is much better, and you perceive the waves of the musical input very nicely.

      While using a very different technological approach, the three videos on the right have a lot in common with the results from Shadertoy.

      Many music videos on YouTube are just a static picture, so there is an excellent market for your idea. Maybe try to add more examples based on what is prevalent on YouTube.

    • tr1pzz 1274 days ago
      To see the kind of videos you can create using WZRD, take a look at our Vimeo channel: https://vimeo.com/neuralsynesthesia
    • nailer 1274 days ago
      Hi! Nearly all of my music is on Spotify, I don't have MP3s or WAVs of it.
      • jazzyjackson 1274 days ago
        It can hardly be called 'your' music then, can it? :P

        You can always look up a song on YouTube and use youtube-dl with the audio flag to grab an mp3 file, if you really do want to try it out.
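
        Roughly the equivalent of youtube-dl -x --audio-format mp3 <url> through its Python API (needs ffmpeg installed; the URL below is just a placeholder):

          import youtube_dl

          # Download the best available audio and convert it to mp3 via ffmpeg.
          opts = {
              "format": "bestaudio/best",
              "postprocessors": [{
                  "key": "FFmpegExtractAudio",
                  "preferredcodec": "mp3",
              }],
          }
          with youtube_dl.YoutubeDL(opts) as ydl:
              ydl.download(["https://www.youtube.com/watch?v=EXAMPLE"])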

  • harel 1273 days ago
    I uploaded one of my own tunes, one which I made with my wife. The video generated was spectacular. I was so impressed that the relatively high price tag is not a deal breaker. Love this.
  • inglor 1274 days ago
    I made it all the way to the "Google login", pressed it, and was prompted for _way more_ than my email, like my advertising preferences and whatever other information was available.

    It is possible to make that optional with Google sign-in scopes, and by all means you should. Even if you want my public profile info and not just my email, that should be optional.
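
    For example, the consent screen only asks for what is in the scope parameter; something like this (the client ID and redirect URI are placeholders) requests nothing beyond an email:

      from urllib.parse import urlencode

      # Build a Google OAuth 2.0 authorization URL with minimal scopes:
      # "openid email" only -- no profile, no ads preferences, nothing else.
      params = {
          "client_id": "YOUR_CLIENT_ID.apps.googleusercontent.com",
          "redirect_uri": "https://example.com/oauth/callback",
          "response_type": "code",
          "scope": "openid email",
      }
      print("https://accounts.google.com/o/oauth2/v2/auth?" + urlencode(params))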

    • acnops 1274 days ago
      That’s not good indeed. We actually only store the email and a link to the profile pic, so I’ll see if we can limit the requested scopes as much as possible.
  • acnops 1274 days ago
    Google being the only login option is getting some criticism. What would you like as authentication options?

    - FB
    - Twitter
    - Apple
    - Would email+pw be necessary?

    I’m maybe biased; I generally prefer using and implementing third-party authentication.

    • handedness 1274 days ago
      I'm surprised that with the number of complaints about Google auth, FB and Twitter are what come to mind as viable alternatives.

      I would consider using Apple for something like this. But limiting it to third-party auth is pretty unfriendly to users.

      • acnops 1274 days ago
        Ok, we will add user+pw. I've done some research in the meantime:

        - FB is not that easy, since you can't set the email address as a required field. We use that to send the final render. We could let it be filled in within the app, though.

        - Twitter will probably be added

        - Apple can be investigated

  • nimmen 1274 days ago
    Cool and interesting idea, but like most have said, I won't be logging in using Google OAuth. Hope the demos are an actual display of the tech rather than "hand made".
  • MelkboerWouWou 1274 days ago
    Great tool, looking forward to the new dimension this will add to parties and music shows in the future!
  • gbh444g 1274 days ago
    That's really cool. How much computation (FLOPs) goes into rendering 10 sec of audio?
    • acnops 1274 days ago
      It’s a GPU backend, and rendering actually takes longer than the length of the audio.
      • ackbar03 1274 days ago
        How much does it cost you in server costs to host this? GPU cloud servers aren't cheap; how much do you plan to charge for it in the future?
        • acnops 1272 days ago
          You can look at the GPU costs for GCP here https://cloud.google.com/compute/gpus-pricing. You also have the instance cost, and other costs like storage, transmission, etc.
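
          As a back-of-the-envelope illustration (the prices and render time below are made up, not actual GCP numbers):

            # Illustrative per-render cost estimate -- all numbers are placeholders.
            gpu_per_hour = 1.00        # GPU attached to the instance, $/hour
            instance_per_hour = 0.35   # the VM itself, $/hour
            render_hours = 0.5         # a few-minute track can take this long to render

            cost = (gpu_per_hour + instance_per_hour) * render_hours
            print(f"~${cost:.2f} per render, before storage and egress")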

          It's a pretty costly operation since the marginal cost is very high compared to other SaaS products. We'll probably iterate on the pricing model.

          What do you think a good model would be?

          • ackbar03 1271 days ago
            I don't really have a good solution for it. I've toyed around with deep-learning-based SaaS-type projects, though, so I know cloud server costs are a major factor.

            What sort of worked was routing the data so the inference ran on my own local server (i.e., my workstation), although the downside is that you can't use your GPU for other stuff. At the time I looked into it, three months of cloud server costs were enough to buy a GPU workstation. I also tried to do whatever I could to optimize inference.
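
            Concretely, the "routing" was nothing fancier than a small HTTP service on the workstation that the cloud frontend forwards jobs to; a toy sketch (run_inference is a placeholder for whatever model is being served):

              from flask import Flask, jsonify, request

              # Tiny job endpoint running on the GPU workstation; the cloud
              # frontend just POSTs work here instead of spinning up cloud GPUs.
              app = Flask(__name__)

              def run_inference(payload):
                  return {"result": "..."}   # placeholder for the real model

              @app.route("/infer", methods=["POST"])
              def infer():
                  return jsonify(run_inference(request.get_json()))

              if __name__ == "__main__":
                  app.run(host="0.0.0.0", port=8000)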

            I have no idea what kind of volume you're getting, but I imagine it'd be even worse for video-based GAN stuff. Maybe go for quality and target the super-high end? You probably have a much better idea than me.

            All the best, will be interested to know how it goes

  • silicon2401 1274 days ago
    Do you have a YouTube demo or something? Personally, I don't log in to something unless I'm seriously interested, and I have no idea what this is without a demo.
    • acnops 1274 days ago
      Demo coming very soon!
  • kyriakos 1274 days ago
    Such a great idea. Good luck.
  • prut 1274 days ago
    Great
  • Ievgeniia 1274 days ago
    Do you have an API?
  • ukrwoodeast 1274 days ago
    Do you have an API?
  • dapinto 1274 days ago
    Amazing visuals! This needs more upvotes
  • stencht 1274 days ago
    Wow looks nice!