Skip to content

YingboMa/SIMDPolynomials.jl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

44 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SIMDPolynomials

Stable Dev Build Status Coverage


SIMDPolynomials.jl uses bit-packed monomials so that most of the operations on multivariate monomials can be done in a few CPU instructions. Currently, it is only optimized for relatively small polynomials. Contributions, especially optimizations for large polynomials, are welcome!

Examples:

julia> using SIMDPolynomials, BenchmarkTools

julia> x, y, z, t = [PackedMonomial{4,7}(i) for i in 0:3]; # PackedMonomial with maximum of 4 variables and 7 bits of exponents.

julia> p = x * y + 3 * (z * t)
x₀x₁ + 3x₂x₃

julia> q = (p + 1) * p
x₀²x₁² + 6x₀x₁x₂x₃ + 9x₂²x₃² + x₀x₁ + 3x₂x₃

julia> @btime gcd($p, $q)
  4.019 μs (94 allocations: 8.06 KiB)
x₀x₁ + 3x₂x₃

julia> begin
           c1 = 10*(x * z + x)
           c2 = 2*(x^2 + z)
           c3 = 2*(2 - z  )
           c4 = 20*(x * z^2)
           e1 = 0
           e2 = 5
           e3 = 7
           e4 = 10
           p = c1 * y^e1 + c2 * y^e2 + c3 * y^e3 + c4 * y^e4
           q = prod(i->p + i, 0:3)
       end;

julia> @btime for i in 0:3
           gcd($p + i, $q)
       end
  350.086 μs (1159 allocations: 588.06 KiB)

The same micro-benchmark using AbstractAlgebra:

julia> using AbstractAlgebra, BenchmarkTools

julia> R, (x, y, z, t) = PolynomialRing(AbstractAlgebra.Integers{Int}(), [:x, :y, :z, :t], ordering=:deglex);

julia> p = x * y + 3 * (z * t)
x*y + 3*z*t

julia> q = (p + 1) * p
x^2*y^2 + 6*x*y*z*t + 9*z^2*t^2 + x*y + 3*z*t

julia> @btime gcd($p, $q) # SIMDPolynomials.jl is 30x faster
  119.795 μs (1320 allocations: 89.17 KiB)
x*y + 3*z*t

julia> begin
           c1 = 10*(x * z + x)
           c2 = 2*(x^2 + z)
           c3 = 2*(2 - z  )
           c4 = 20*(x * z^2)
           e1 = 0
           e2 = 5
           e3 = 7
           e4 = 10
           p = c1 * y^e1 + c2 * y^e2 + c3 * y^e3 + c4 * y^e4
           q = prod(i->p + i, 0:3)
       end;

julia> @btime for i in 0:3 # SIMDPolynomials.jl is 14x faster
           gcd($p + i, $q)
       end
  4.934 ms (32235 allocations: 3.43 MiB)

The same micro-benchmark using DynamicPolynomials:

julia> using DynamicPolynomials, BenchmarkTools

julia> @polyvar x y z t;

julia> p = x * y + 3 * (z * t)
xy + 3zt

julia> q = (p + 1) * p
x²y² + 6xyzt + 9z²t² + xy + 3zt

julia> @btime gcd($p, $q)  # SIMDPolynomials.jl is 65x faster
  264.561 μs (4962 allocations: 298.19 KiB)
xy + 3zt

julia> begin
           c1 = 10*(x * z + x)
           c2 = 2*(x^2 + z)
           c3 = 2*(2 - z  )
           c4 = 20*(x * z^2)
           e1 = 0
           e2 = 5
           e3 = 7
           e4 = 10
           p = c1 * y^e1 + c2 * y^e2 + c3 * y^e3 + c4 * y^e4
           q = prod(i->p + i, 0:3)
       end;

julia> @btime for i in 0:3 # SIMDPolynomials.jl is 82x faster
           gcd($p + i, $q)
       end
  28.943 ms (529642 allocations: 31.20 MiB)