TransE treats relations as translations, but a translation cannot model a symmetric relation unless it collapses to r = 0, and it handles inverse and composition patterns only partially. RotatE fixes this by working in complex number space and treating each relation as a rotation on the unit circle. One small geometric insight unlocks all four fundamental relation patterns simultaneously.
TransE embeds knowledge graph facts as translations: h + r ≈ t. After its introduction in 2013, it was a standard reference point for knowledge graph embedding for years. But three categories of relation patterns break it:
All three failures have the same root: translations in Euclidean space don't form a rich enough group. You need an algebraic structure that supports more operations. The answer: rotations in complex space.
A complex number z = a + bi lives in a 2D plane: a is the "real" coordinate, b is the "imaginary" coordinate. You can think of it as a 2D vector (a, b). But complex numbers have one property vectors don't: multiplication rotates and scales simultaneously.
Multiplying z₁ = r₁e^(iθ₁) by z₂ = r₂e^(iθ₂) gives r₁r₂e^(i(θ₁+θ₂)). The magnitudes multiply; the angles add. If we restrict to unit complex numbers (|z|=1), so z = e^(iθ), then multiplication becomes pure rotation: angles add, magnitudes stay 1.
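The multiplication rule above can be checked numerically. The following is a minimal sketch using Python's standard `cmath` module; the helper name `from_polar` and the example magnitudes and angles are illustrative choices, not part of the original lesson.

```python
import cmath
import math

def from_polar(magnitude: float, angle: float) -> complex:
    """Build the complex number magnitude * e^(i * angle)."""
    return cmath.rect(magnitude, angle)

z1 = from_polar(2.0, math.pi / 6)   # magnitude 2, angle 30 degrees
z2 = from_polar(3.0, math.pi / 3)   # magnitude 3, angle 60 degrees
product = z1 * z2

# cmath.polar returns (magnitude, angle in radians).
magnitude, angle = cmath.polar(product)
print(magnitude)            # close to 6.0: magnitudes multiply
print(math.degrees(angle))  # close to 90.0: angles add

# Two unit complex numbers: the product stays on the unit circle.
u = from_polar(1.0, 0.4) * from_polar(1.0, 1.1)
print(abs(u))               # close to 1.0
print(cmath.phase(u))       # close to 1.5 = 0.4 + 1.1
```

Restricting every factor to magnitude 1 is what turns multiplication into a pure rotation, which is the property RotatE relies on.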
RotatE embeds each entity as a complex vector h ∈ ℂ^d (d complex numbers = 2d real numbers). Each relation is also a complex vector r ∈ ℂ^d, but with the constraint that |r_i| = 1 for all components i. This means each relation component is e^(iθ_i) — a rotation by angle θ_i in the i-th complex plane.
All unit complex numbers sit on the unit circle. Multiplying two of them adds their angles, so the product is again a point on the circle.
RotatE's core equation: for a true triple (h, r, t), each component of h rotated by r should land at the corresponding component of t: t = h ∘ r.
Here ∘ is element-wise complex multiplication. Since |r_i|=1, each r_i = e^(iθᵢ). So the equation becomes: for each dimension i, h_i · e^(iθᵢ) = t_i. This is a rotation of h_i by angle θ_i.
The scoring function measures how far h∘r is from t, using the distance in the complex plane: d_r(h, t) = ‖h ∘ r − t‖. A small distance means a plausible triple.
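The rotation and the distance can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation: the function names (`relation_from_phases`, `rotate`, `distance`) are hypothetical, the embeddings are tiny hand-picked vectors, and the Euclidean norm over complex components is an assumed choice of norm.

```python
import cmath
import math
from typing import List

def relation_from_phases(phases: List[float]) -> List[complex]:
    """Turn phase angles theta_i into unit-modulus components e^(i * theta_i)."""
    return [cmath.exp(1j * theta) for theta in phases]

def rotate(head: List[complex], relation: List[complex]) -> List[complex]:
    """Element-wise complex product h ∘ r."""
    return [h * r for h, r in zip(head, relation)]

def distance(head: List[complex], relation: List[complex], tail: List[complex]) -> float:
    """Euclidean norm of h ∘ r - t (assumed norm; smaller means more plausible)."""
    rotated = rotate(head, relation)
    return math.sqrt(sum(abs(x - t) ** 2 for x, t in zip(rotated, tail)))

# Toy example with d = 2 complex dimensions.
head = [1 + 0j, 0 + 2j]
relation = relation_from_phases([math.pi / 2, math.pi])  # rotate by 90 and 180 degrees
true_tail = [0 + 1j, 0 - 2j]    # exactly where the rotation lands
wrong_tail = [1 + 0j, 0 + 2j]   # the unrotated head

print(distance(head, relation, true_tail))   # close to 0
print(distance(head, relation, wrong_tail))  # close to sqrt(18), about 4.24
```

Storing only the phases and rebuilding `e^(iθ)` on the fly keeps the constraint |r_i| = 1 satisfied by construction, so no separate normalization step is needed.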
The entity h is shown as a teal point on the unit circle. The relation r rotates it by angle θ (orange arrow), and the rotated point h∘r should land at the tail entity t (purple).
Here is where RotatE's geometric insight pays off. Every fundamental relation pattern corresponds to a simple constraint on the rotation angle θ:
Symmetry (θ = π): If r_i = e^(iπ) = −1, then h ∘ r = −h and t ∘ r = −t. For a symmetric relation, we need h ∘ r = t AND t ∘ r = h. With θ = π: h · (−1) = t means t = −h, and t · (−1) = h means h = −t. Both hold simultaneously. Rotation by π maps every point to its antipode, and rotating that point by π again returns it to the original. (θ = 0 also satisfies both conditions, but only trivially, with t = h.)
Antisymmetry (θ ≠ 0, π): For most relations, rotating h by θ lands at t but rotating t by θ does NOT land at h (it lands somewhere else). This is automatic for any θ ∉ {0, π}.
Inverse (θ₂ = −θ₁): If relation r₁ has angle θ and its inverse r₂ has angle −θ, then h ∘ r₁ = t implies t ∘ r₂ = t · e^(−iθ) = h · e^(iθ) · e^(−iθ) = h. The inverse relation rotates back. Perfect.
Composition (θ₃ = θ₁ + θ₂): Applying rotation θ₁ then θ₂ is equivalent to rotation θ₁+θ₂. This is exactly the angle-addition property of complex multiplication. Composition relations naturally decompose into angle sums.
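The four angle constraints above can be verified on a single complex component. This is a small sketch with arbitrary example values; the helper `rot` and the specific angles (0.7, 1.2, 1.9) are illustrative assumptions.

```python
import cmath
import math

def rot(theta: float) -> complex:
    """Unit complex number e^(i * theta): a rotation by theta."""
    return cmath.exp(1j * theta)

h = 0.8 + 0.3j  # one component of an arbitrary entity embedding

# Symmetry: theta = pi. Applying the relation twice returns to h.
r_sym = rot(math.pi)
t_sym = h * r_sym
back = t_sym * r_sym

# Antisymmetry: theta outside {0, pi}. Applying the relation to t does not return to h.
r_anti = rot(0.7)
t_anti = h * r_anti
not_back = t_anti * r_anti

# Inverse: theta_2 = -theta_1 undoes the first rotation.
r1, r2 = rot(0.7), rot(-0.7)
round_trip = (h * r1) * r2

# Composition: rotating by 0.7 then 1.2 equals rotating once by 1.9.
two_hops = (h * rot(0.7)) * rot(1.2)
one_hop = h * rot(1.9)

print(abs(back - h))            # close to 0
print(abs(not_back - h))        # clearly nonzero
print(abs(round_trip - h))      # close to 0
print(abs(two_hops - one_hop))  # close to 0
```

In a full model each of the d dimensions has its own angle, so a relation is symmetric only when every component sits at θ_i ∈ {0, π}.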
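Like TransE, RotatE needs negative samples: corrupted triples that let the model distinguish true facts from false ones. With k negatives (h′ᵢ, r, t′ᵢ) per true triple, the sigmoid function σ, and the distance d_r(h, t) = ‖h ∘ r − t‖, the loss function is: L = −log σ(γ − d_r(h, t)) − (1/k) Σᵢ log σ(d_r(h′ᵢ, t′ᵢ) − γ)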
The first term pushes the distance of the true triple low (well below the margin γ). The second term pushes the distance of each negative triple high (well above γ). γ is a fixed margin hyperparameter (the paper uses γ = 9.0 for FB15k-237).
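The two terms can be sketched directly from that description. The code below assumes the loss has the form −log σ(γ − d_pos) minus the average of log σ(d_neg − γ) over the negatives (the uniformly weighted version from Sun et al., 2019); the function names and the example distances are hypothetical.

```python
import math
from typing import Sequence

def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def margin_loss(pos_distance: float, neg_distances: Sequence[float], gamma: float) -> float:
    """Assumed loss form: uniform weight 1/k on each of the k negatives."""
    positive_term = -log_sigmoid(gamma - pos_distance)
    negative_term = -sum(log_sigmoid(d - gamma) for d in neg_distances) / len(neg_distances)
    return positive_term + negative_term

gamma = 9.0  # margin value the text quotes for FB15k-237

# Well-trained case: true triple far below the margin, negatives far above it.
good = margin_loss(1.0, [15.0, 20.0], gamma)
# Badly-trained case: the roles are reversed.
bad = margin_loss(15.0, [1.0, 2.0], gamma)

print(good)  # small, below 0.01
print(bad)   # large, above 10
```

Because both terms pass through a sigmoid, the loss saturates once a triple is comfortably on the correct side of γ, so easy examples contribute almost no gradient.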
Standard negative sampling picks corrupted triples uniformly at random. But random negatives are often too easy — obvious nonsense like (Einstein, capital-of, Germany). The model learns quickly that these are wrong and stops getting useful gradient signal.
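Self-adversarial sampling draws negative triples non-uniformly, weighting each by the current model's own score f_r(h, t) = −‖h ∘ r − t‖ (higher means more plausible): p(h′ⱼ, r, t′ⱼ) = exp(α · f_r(h′ⱼ, t′ⱼ)) / Σᵢ exp(α · f_r(h′ᵢ, t′ᵢ)). In practice, the paper applies p as a weight on each negative's loss term rather than literally resampling.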
High-scoring negatives (those whose rotated head lands close to the tail, so the model currently thinks they are plausible) receive more weight. These are the "hard negatives": triples the model almost believes are true, so they provide the most informative gradient signal. α acts as a temperature that controls the sharpness of the weighting.
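The weighting is a softmax over the negatives' scores, scaled by α. The sketch below assumes that form and uses made-up scores (negated distances); the function name `self_adversarial_weights` is hypothetical.

```python
import math
from typing import List, Sequence

def self_adversarial_weights(neg_scores: Sequence[float], alpha: float) -> List[float]:
    """Softmax over alpha * score, where a higher score means a more plausible negative."""
    scaled = [alpha * s for s in neg_scores]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative scores: negated distances for three corrupted triples.
neg_scores = [-2.0, -5.0, -12.0]  # the first is the hardest negative

uniform = self_adversarial_weights(neg_scores, alpha=0.0)
mild = self_adversarial_weights(neg_scores, alpha=1.0)
sharp = self_adversarial_weights(neg_scores, alpha=2.0)

print(uniform)  # three equal weights of 1/3
print(mild)     # most of the weight on the first negative
print(sharp)    # even more concentrated on the first negative
```

With α = 0 the scheme reduces to uniform negative sampling, which makes α a convenient knob for comparing the two regimes.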
As α increases, the distribution shifts: higher α concentrates sampling on the hardest (highest-scoring) negatives.
RotatE is evaluated on four standard benchmarks: FB15k, FB15k-237 (a harder subset of FB15k that removes inverse relations to prevent test leakage), WN18, and WN18RR (same issue: WN18 leaks, WN18RR is clean).
| Model | FB15k-237 MRR | FB15k-237 H@10 | WN18RR MRR | WN18RR H@10 |
|---|---|---|---|---|
| TransE | 0.279 | 44.1% | 0.226 | 50.1% |
| DistMult | 0.241 | 41.9% | 0.430 | 49.0% |
| ComplEx | 0.247 | 42.8% | 0.440 | 51.0% |
| ConvE | 0.325 | 50.1% | 0.430 | 52.0% |
| RotatE | 0.338 | 53.3% | 0.476 | 57.1% |
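RotatE has the best result in the table on all four metrics. Its margin over the strongest baseline is largest on WN18RR: +0.036 MRR over ComplEx (0.476 vs. 0.440) and +5.1 points H@10 over ConvE (57.1% vs. 52.0%), compared with +0.013 MRR and +3.2 points H@10 over ConvE on FB15k-237. WN18RR contains many composition relations (hypernym chains), the pattern where RotatE's angle-composition property shines.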
RotatE covers every relation pattern that TransE, DistMult, and ComplEx can model, and adds the ones each of them misses. Understanding how illuminates the precise role of complex space in KG embedding.
RotatE vs TransE: TransE: h + r = t in ℝ^d. RotatE: h ∘ r = t in ℂ^(d/2), that is, d/2 complex numbers and therefore the same d real parameters. Both use one vector per entity and per relation. But multiplying by a unit complex number behaves differently from adding a real vector: angles wrap around the circle, so a nonzero rotation can undo itself (θ = π), and that is what makes all four relation patterns expressible.
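The contrast shows up most clearly on symmetric relations. The sketch below uses a hypothetical `transe_distance` helper and made-up real vectors to show that, under translation, a triple that holds in one direction misses by exactly 2‖r‖ in the other, so symmetry forces r = 0.

```python
import math
from typing import Sequence

def transe_distance(head: Sequence[float], relation: Sequence[float], tail: Sequence[float]) -> float:
    """Euclidean norm of h + r - t for real-valued embeddings."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

head = [1.0, 2.0]
relation = [0.5, -1.0]
tail = [h + r for h, r in zip(head, relation)]  # constructed so (h, r, t) holds exactly

forward = transe_distance(head, relation, tail)    # (h, r, t): perfect fit
backward = transe_distance(tail, relation, head)   # (t, r, h): the symmetric counterpart
relation_norm = math.sqrt(sum(r * r for r in relation))

print(forward)            # 0.0
print(backward)           # equals 2 * relation_norm, about 2.24
print(2 * relation_norm)
```

The backward error vanishes only when the relation vector is zero, in which case every entity is related to itself and to nothing else. A rotation by π has no such problem, because it is a nonzero operation that is its own inverse.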
RotatE vs ComplEx: ComplEx uses: score = Re(Σ h_i · r_i · t̄_i), where t̄ is the complex conjugate of t. This is a bilinear form in complex space. RotatE constrains |r_i|=1, which ComplEx doesn't. The constraint is what forces relations to be pure rotations — ComplEx can scale (grow/shrink embeddings) while RotatE cannot. The constraint costs some expressiveness but gains the geometric interpretation and handles composition cleanly.
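The effect of the unit-modulus constraint can be seen on a one-dimensional toy case where the tail is the head scaled by 2. This is a sketch under stated assumptions: the function names are hypothetical, the embeddings are hand-picked, and the RotatE side is probed with a simple grid search over angles rather than training.

```python
import cmath
import math
from typing import List

def complex_score(head: List[complex], relation: List[complex], tail: List[complex]) -> float:
    """ComplEx score Re(sum_i h_i * r_i * conj(t_i)); the relation modulus is unconstrained."""
    return sum(h * r * t.conjugate() for h, r, t in zip(head, relation, tail)).real

def rotate_distance(head: List[complex], phases: List[float], tail: List[complex]) -> float:
    """RotatE-style distance with r_i = e^(i * phase_i), so |r_i| = 1 by construction."""
    return math.sqrt(sum(abs(h * cmath.exp(1j * p) - t) ** 2
                         for h, p, t in zip(head, phases, tail)))

head = [1 + 1j]
tail = [2 + 2j]  # same direction as head, twice the modulus

# ComplEx may use a relation with modulus 2, which maps head exactly onto tail.
scaling_relation = [2 + 0j]
score = complex_score(head, scaling_relation, tail)

# RotatE can only rotate: scan 360 angles and keep the smallest distance.
angles = [2 * math.pi * k / 360 for k in range(360)]
best = min(rotate_distance(head, [a], tail) for a in angles)
gap = abs(tail[0]) - abs(head[0])  # difference in modulus that no rotation can close

print(score)  # 8.0
print(best)   # equals gap, about 1.41
print(gap)
```

In RotatE the entity moduli therefore have to carry any scale information themselves, since the relation can change only the phase.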
| Property | TransE | DistMult | ComplEx | RotatE |
|---|---|---|---|---|
| Symmetric | No (r=0) | Yes | Yes | Yes (θ=π) |
| Antisymmetric | Yes | No | Yes | Yes |
| Inverse | Partially | No | Yes | Yes (θ₂=−θ₁) |
| Composition | Partially | No | No | Yes (θ₃=θ₁+θ₂) |
| Relation as | Translation | Scaling | Complex scaling | Complex rotation |
Among these four models, RotatE is the only one that handles all four patterns simultaneously. Composition is the critical gap: ComplEx can handle symmetric, antisymmetric, and inverse relations but cannot naturally compose relations. This is where RotatE's angle-addition property matters most.
RotatE belongs to a geometric tradition in KG embedding: representing relational semantics as geometric operations. Rotations were the key insight; subsequent work extended to hyperbolic spaces, higher-dimensional rotations, and combinations with attention.
| Method | Geometric operation | Space | New capability |
|---|---|---|---|
| TransE (2013) | Translation | ℝ^d | — |
| RotatE (2019) | Rotation (unit-modulus r) | ℂ^(d/2) | All 4 patterns |
| QuatE (2019) | Quaternion rotation | ℍ^(d/4) | 3D rotations |
| HAKE (2020) | Translation + rotation | ℝ×ℂ | Hierarchies |
| PairRE (2021) | Paired relation vectors (element-wise scaling of head and tail) | ℝ^d × ℝ^d | Complex patterns |
Key papers
"We model each relation as a rotation from the source entity to the target entity in the complex vector space."
— Sun et al. (2019)