Quaternion Knowledge Graph Embeddings (2019)

(arxiv.org)

100 points | by teleforce 9 days ago

3 comments

  • vslira 8 days ago
    I'm not nearly smart enough to state this confidently, but doesn't [1] imply that exotic embeddings can always be replaced by larger Euclidean embeddings?

    [1] https://en.wikipedia.org/wiki/Nash_embedding_theorems

    • hansvm 8 days ago
      No.

      Consider a square (with graph shortest-path distances), for example. All Euclidean embeddings have a minimum error of 20-40% or so. If you try to embed any graph containing that subgraph (embedding friends, advertisers, words, ...), you'll similarly have guaranteed error. Relaxing the square to a squircle, you'll see that error even for a really simple manifold.

      Riemannian manifolds are a bit more special.
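A minimal sketch of why the square cannot be embedded exactly, assuming unit edge lengths so that opposite corners sit at shortest-path distance 2. It uses Schoenberg's criterion, which the thread does not mention: a finite metric embeds isometrically in some Euclidean space only if sum_ij c_i c_j d(i, j)^2 <= 0 for every weight vector c that sums to zero. The function names are illustrative. This only shows that an exact embedding is impossible; it does not reproduce the 20-40% figure.

```python
def cycle_distance(i, j, n=4):
    """Shortest-path distance between corners i and j of an n-cycle with unit edges."""
    k = abs(i - j) % n
    return min(k, n - k)


def line_distance(i, j):
    """Distance between integer points on a line (trivially Euclidean)."""
    return abs(i - j)


def schoenberg_form(weights, dist):
    """Evaluate sum_ij c_i c_j d(i, j)^2 for weights c that sum to zero."""
    assert sum(weights) == 0, "weights must sum to zero"
    n = len(weights)
    return sum(
        weights[i] * weights[j] * dist(i, j) ** 2
        for i in range(n)
        for j in range(n)
    )


# Alternating weights on the square give a positive value, which violates
# the criterion, so no exact Euclidean embedding exists.
square_value = schoenberg_form([1, -1, 1, -1], cycle_distance)

# Three collinear points satisfy the criterion for the same kind of test.
line_value = schoenberg_form([1, -2, 1], line_distance)

print(square_value, line_value)
```

A positive value for a single weight vector is enough to rule out an exact embedding in any number of Euclidean dimensions, which is why adding dimensions does not help here.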

  • palad1n 9 days ago
    Is anyone using this? It's from 2019.
    • VHRanger 8 days ago
      These exotic (not plain linear algebra) embedding representations are often slow to take off unless they have an obvious use case.

      The other one that I've always been curious about is Poincaré Embeddings [1], where the embedding also has a hierarchical representation of the space.

      There are issues with these becoming popular:

      1. Querying the embeddings requires more math knowledge than just "lol cosine similarity". It also requires writing code for the query.

      2. You can often easily match the performance with regular embeddings by just adding dimensions and training more. So the advantage of exotic embeddings has to be with the information in the more complex mathematical abstraction.

      So they need a killer use case to become popular; it's hard to move the needle.

      [1]: https://arxiv.org/abs/1705.08039
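To make point 1 concrete, here is a rough sketch contrasting the two kinds of query, assuming embeddings are plain tuples of floats and that Poincaré embeddings lie strictly inside the unit ball. The distance formula is the one given in the linked paper [1]; the function names are made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """The usual query for ordinary vector embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def poincare_distance(u, v):
    """Hyperbolic distance between two points inside the unit ball."""
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1 + 2 * sq_diff / ((1 - sq_u) * (1 - sq_v)))


origin_to_half = poincare_distance((0.0, 0.0), (0.5, 0.0))
near_center = poincare_distance((0.1, 0.0), (-0.1, 0.0))
near_boundary = poincare_distance((0.9, 0.0), (-0.9, 0.0))

print(origin_to_half, near_center, near_boundary)
```

Distances blow up near the boundary of the ball, which is what gives the space room for tree-like hierarchies, and also why a nearest-neighbor index built around cosine similarity cannot be reused as-is.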

      • nico 8 days ago
        Super interesting

        Do you think it makes sense to have a group of models, each with more ad-hoc embeddings, and coordinate them to respond according to the domain of the input?

        Do multi-modal models use the same embedding type/structure for an image, sound, text?

        • VHRanger 6 days ago
          I think your first question is open for people to explore.

          The answer to the second is yes: it's all vector embeddings, and they're aligned to each other by training on a dataset of matched pairs (e.g. images with captions).

          The real use for exotic embeddings will have to be in analyzing the embeddings themselves I think, otherwise it's easier to shove normal vectors downstream into other models.

    • adultSwim 8 days ago
      QuatE is implemented in PyKEEN, a library for KnowEdge graph EmbeddiNgs, https://github.com/pykeen/pykeen
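For anyone wondering what QuatE computes, this is a rough pure-Python sketch of its scoring function as I understand it from the paper: rotate each head quaternion by the unit-normalized relation quaternion using the Hamilton product, then take the inner product with the tail. It deliberately does not use PyKEEN's API; all names and the list-of-tuples layout are assumptions made for illustration.

```python
import math


def hamilton(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def normalize(q):
    """Scale a quaternion to unit length so it acts as a pure rotation."""
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)


def quate_score(head, relation, tail):
    """Score a (head, relation, tail) triple; each argument is a list of quaternions."""
    score = 0.0
    for h, r, t in zip(head, relation, tail):
        rotated = hamilton(h, normalize(r))
        score += sum(a * b for a, b in zip(rotated, t))
    return score


# Toy one-quaternion embeddings: the relation rotates the head onto the tail.
head = [(1.0, 0.0, 0.0, 0.0)]
relation = [(0.0, 2.0, 0.0, 0.0)]
matching_tail = [(0.0, 1.0, 0.0, 0.0)]
opposite_tail = [(0.0, -1.0, 0.0, 0.0)]

print(quate_score(head, relation, matching_tail))
print(quate_score(head, relation, opposite_tail))
```

A higher score means the triple is judged more plausible; real models learn many quaternions per entity and relation rather than one.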
    • cess11 9 days ago
      If someone is, it's probably pretty niche. RDF is a bit simpler and still not commonly used due to its perceived complexity compared to RDBMS tables or JSON.
      • nl 9 days ago
        This completely misunderstands what this is.

        In this context "knowledge graph" means a representation of knowledge independent of its serialization format (i.e., RDF vs whatever).

        It's usable wherever you have a key value (key:value of relationship) or a triple (key:relationship:something).

        This is probably 85% of everything that is stored in a database, with the exception being paths that vary enough that you can't use the path as a key or relationship. In particular, most trees are suitable if you collapse the path down to a single relationship.

        I've done a bunch of work on graph embedding. They are very effective for use in anything that can be thought of as a recommendation system ("this person would like these books") or similarity ("this person is similar to these people").

        Back when I was working on them I found Starspace wonderfully easy and effective: https://github.com/facebookresearch/StarSpace

        • DrDroop 8 days ago
          It is wrong to call this a "complete misunderstanding", as the paper clearly deals with triplets and formulates a different encoding scheme for entities and relations. OK, so it is not RDF or even hypergraphs, but it is dealing with a graph structure.
          • nl 8 days ago
            Sure, but as I explain, almost everything can fit that structure. The misunderstanding is the "probably pretty niche" part, since related techniques (graph embedding generally) are so widely used.
        • cess11 9 days ago
          I didn't compare RDF directly to the technique in TFA.
  • ziofill 9 days ago
    I always wondered why bother with quaternions and abstruse math when at the end of the day they behave like some matrices of floats.
    • barfbagginus 9 days ago
      I recently took the plunge and learned unit quaternions and dual quaternions for representing rotations, compound rotation-translations, and kinematic chains and geometric interpolation (serial and parallel robot arms, character skeletons, skinning).

      The advantages I found are:

      Unit quaternions represent rotations with less redundancy than matrices

      It's easier and more intuitive to derive, manipulate, simplify, interpolate, and solve quaternions than matrices.

      They're less abstract/more concrete than matrices.

      Composing rotations with quaternions takes fewer multiplications than with matrices (16 versus 27 for the naive products).

      Dual quaternion inverse kinematics are easier to derive and faster than matrices

      Unit and Dual quaternions have an efficient implicit form, which further speeds up IK. See: https://www.researchgate.net/profile/Neil-Dantam/publication...

      • captaincaveman 8 days ago
        • barfbagginus 7 days ago
          Quaternions make it hard to imagine why gimbal lock would even have to be a problem.

          Normally it happens because we have to solve a rotation into potentially redundant roll, pitch and yaw factors.

          In Q, you can just write the axis of rotation as a unit vector quaternion v = v1 i + v2 j + v3 k. That pure vector quaternion represents a 180 degree rotation around v. Other rotations around v are interpolations between v and the identity rotation scalar 1:

          w = a + b v, where a^2 + b^2 = 1.

          This is a circular analogue of linear interpolation. Indeed, if t in [0, 1] is the rotation expressed as a fraction of a full turn (so the angle is 2 pi t), we observe:

          a = cos(pi t), b = sin(pi t)

          And we can reduce

          w = a + sqrt(1-a^2)*v

          Where a in [-1, 1].

          This representation is so nice. There is no need to solve roll, pitch, and yaw. Just pick unit rotation axis v, then twist the scalar knob a to set the rotation amount.

          This is quite human!
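The construction above can be sketched in plain Python, assuming t is the fraction of a full turn (so the rotation angle is 2 pi t) and v is a unit axis. The helper names are hypothetical, and the sandwich product w p w^-1 used to apply the rotation is the standard one rather than something spelled out in the comment.

```python
import math


def quat_mul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def rotation_quaternion(axis, t):
    """Build w = a + b v with a = cos(pi t), b = sin(pi t) for a unit axis v."""
    a = math.cos(math.pi * t)
    b = math.sin(math.pi * t)
    vx, vy, vz = axis
    return (a, b * vx, b * vy, b * vz)


def rotate(w, point):
    """Rotate a 3D point by the unit quaternion w via w * p * conjugate(w)."""
    conjugate = (w[0], -w[1], -w[2], -w[3])
    pure = (0.0, point[0], point[1], point[2])
    result = quat_mul(quat_mul(w, pure), conjugate)
    return result[1:]


z_axis = (0.0, 0.0, 1.0)

# t = 0.25 is a quarter turn about z, which sends the x axis to the y axis.
quarter = rotate(rotation_quaternion(z_axis, 0.25), (1.0, 0.0, 0.0))

# t = 0.5 makes w the pure quaternion v, i.e. a 180 degree rotation about v.
half = rotate(rotation_quaternion(z_axis, 0.5), (1.0, 0.0, 0.0))

print(quarter, half)
```

Note that no roll, pitch, or yaw is ever solved for: the axis is given directly and t is the single knob.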

    • godelski 9 days ago
      The problem is most people were taught math wrong and this typically only gets corrected when you get to high level mathematics like abstract algebra.

      Imaginary numbers aren't "imaginary"; they are how to do consistent math in a 2D framework. Quaternions? Well that's about 4D. Poincaré once said that math isn't about numbers, but relationships between numbers. For some reason we don't talk about mathematical structure (explicitly) until the late stages. I wouldn't necessarily call these concepts "abstruse" but they are a bit more abstract. A big part of the problem, though, is that we often ignore the ground we are building upon, so when you finally look at it, it is new and confusing. But then again, the success of Bourbaki's New Math is arguable [0]

      [0] https://en.wikipedia.org/wiki/New_Math#In_other_countries

      • auraai 9 days ago
        > Poincaré once said that math isn't about numbers, but relationships between numbers

        He described math more poetically and profoundly: "the art of giving the same name to different things"

      • aap_ 9 days ago
        > Quaternions? Well that's about 4D

        Quaternions are about 3D. You need 4 numbers for it, but the space that they operate on is 3D.

        • raincole 9 days ago
          He's talking from a mathematical perspective, not a computer graphics one.

          (Unit) quaternions happen to work as rotations in 3D space. But the quaternions' algebraic structure is indeed 4D, just like that of the imaginary numbers is 2D.

        • ducttapecrown 8 days ago
          Since you can multiply a quaternion by a quaternion, the quaternions act on themselves, so they indeed act on a 4D space. They can also act on a couple of 3D spaces: the unit sphere in 4D, called the 3-sphere, or the imaginary quaternions.

          In math, this is called a representation of a Lie group, and there are representations in all dimensions.

    • hi-v-rocknroll 9 days ago
      For 3D graphics, it makes rotation and translation simpler by adding another dimension. Maybe you had a less-than-stellar teacher for this topic. If so, here's a master lecturer elucidating it:

      https://youtu.be/mHVwd8gYLnI

      https://en.wikipedia.org/wiki/Transformation_matrix#Examples...

      • klyrs 9 days ago
        To be fair, some of us learned about quaternions over finite fields before anything about 3D graphics. The first time I heard about this application of them, I was gobsmacked. Saying this because my professor for that course was quite excellent.
        • hi-v-rocknroll 9 days ago
          Nice! For undergrad, I was 2 courses shy of a math major || CS major, but then I saw neither was ABET accredited. Wat? Not to worry, because the CS & Eng major was accredited but required physics and chemistry for engineers series, interfacing, computer architecture, and a load and store (or was it compare and swap?) more electives approximating an EE major. It was really an EE/CS program but they couldn't call it that for historical and intercampus political reasons.

          I think most of the difference in mastery of a given topic in an academic setting comes down to the skill and interactivity of a competent, expert lecturer who also has expertise in the additional domains of public speaking and teaching. Ken Joy and Sean Davis were the virtuosos of teaching at UC Davis: massive and accessible brains. Intellectual curiosity and a semi-extroverted personality help too.

        • lanstin 3 days ago
          Or over the reals. I have studied quaternions a number of times but never studied 3D graphics. They are interesting algebraically as part of the sequence of division algebras: reals, complex numbers, quaternions, octonions, nothing. https://en.m.wikipedia.org/wiki/Hurwitz%27s_theorem_(composi...
      • globalnode 8 days ago
        hurts me how he keeps saying over and over that they were invented by someone named Heisenberg.
    • somenameforme 8 days ago
      One of the big motivations for quaternions is performance. Not only are they much more compact than a rotation matrix (4 floats vs. 9), but many important operations, such as composing rotations, are also much quicker. The only real downside of them is that they seem to drive everybody to this obsession to try to truly understand them. After years of using them, I've learned to embrace my ignorance. The underlying math is not terribly complex (hur hur), but the relationships between interdependent periodic rotations, all packed in a tidy imaginary number package, are simply not something you're going to get a good intuitive feel for, numerically speaking.
      • ubj 8 days ago
        > The only real downside of them is that they seem to drive everybody to this obsession to try to truly understand them.

        This. If you only want to understand *how to apply* quaternions to represent rotations, it's not too hard to learn. However, if you want to understand *why* they work, that's a totally different story.

        Many of my colleagues still use Euler angles to represent rotations. I always use quaternions, and am a bit baffled why people are so averse to them.

        • noone_important 8 days ago
          If you have some knowledge of group theory, the 'why' is pretty straightforward.

          The unit quaternions form a group isomorphic to SU(2), which is a double cover of SO(3). The latter describes rotations in 3 dimensions. Thus you can express any rotation in 3D with unit quaternions.

      • GuB-42 8 days ago
        If you are still interested, try this: https://marctenbosch.com/quaternions/

        The idea is that quaternions are the wrong way to look at the problem and that a better approach would be to use geometric algebra, bivectors and rotors. The formulas are essentially the same as with quaternions, but at least for me, this approach makes more intuitive sense. It also works in dimensions other than 3, which matters to the author, as he is the creator of "4D Toys" and, hopefully, Miegakure.

        Here is another talk on the subject, by a different author: https://www.youtube.com/watch?v=htYh-Tq7ZBI It is not specifically about quaternions, it is about multiplying vectors and what you get from that, and it includes quaternions.

    • authorfly 9 days ago
      Because they let you take things that are typically unrelated (e.g. two types of embeddings), understand and manipulate them in a simpler mathematical model (a single vector), and then apply the most fundamental mathematical operations (e.g. cosine similarity) to get simpler outputs/metrics. In other words, it enables a form of creativity; it makes you see things like the math in Transformers in a less anxious light as you understand how these things can overlay into a simpler intuitive understanding.
      • redwood 9 days ago
        Sounds similar to the power of LoRa; critical building blocks to building efficient scalable serving
    • pinkmuffinere 9 days ago
      As a concrete example, feedback controllers for attitude control (i.e., pointing something) are imo easier to develop in quaternions than with 3x3 matrices. For one, the quaternion formulation isn't over-constrained by the extra parameters.

      This is essentially because quaternions are a remarkably good representation of rotations.

      • TeMPOraL 8 days ago
        Related, 3D math is done with 4x4 matrices, not 3x3 as the "3D" would make someone think, because you need that extra dimension to represent translations in matrix form, and once you do that, you can compose arbitrary transforms through multiplication.
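A small sketch of the 4x4 trick described above, assuming column vectors with a trailing 1 (homogeneous coordinates) and plain nested lists for matrices. The function names are invented for the example.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def translation(tx, ty, tz):
    """Translation lives in the fourth column, which a 3x3 matrix lacks."""
    return [
        [1, 0, 0, tx],
        [0, 1, 0, ty],
        [0, 0, 1, tz],
        [0, 0, 0, 1],
    ]


def quarter_turn_about_z():
    """90 degree rotation about z, embedded in the upper-left 3x3 block."""
    return [
        [0, -1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
    ]


def apply(matrix, point):
    """Apply a 4x4 transform to a 3D point by appending 1 as the fourth coordinate."""
    column = [point[0], point[1], point[2], 1]
    out = [sum(matrix[i][k] * column[k] for k in range(4)) for i in range(4)]
    return tuple(out[:3])


# With column vectors the rightmost matrix acts first: rotate, then translate.
combined = matmul(translation(5, 0, 0), quarter_turn_about_z())
moved = apply(combined, (1, 0, 0))

print(moved)
```

Once both operations are matrices, any chain of rotations and translations collapses into a single 4x4 product that can be applied to every point.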
    • meindnoch 8 days ago
      Rotation matrices (i.e. orthogonal matrices with determinant 1) are a complicated manifold in 9 dimensional space.

      Rotation quaternions (i.e. unit quaternions) are simply the unit sphere in 4 dimensional space.

    • truckerbill 9 days ago
      'Why bother with programming languages when at the end of the day they are just pages of text'
      • TeMPOraL 8 days ago
        Or even why bother with computer science and engineering, when at the end of the day, it's all just about counting very fast.
    • pizza 8 days ago
      Simplicity depends on the observer