عجفت الغور

KLL approximate quantiles

statistics

  • Probably what APPROX_QUANTILES in SQL is doing
  • Minimize space complexity
  • Probably approximately correct
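    • Failure is okay, as long as its probability is bounded
  • Epsilon and delta: epsilon is the allowed rank error, delta is the allowed probability of exceeding it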
  • Union bound
    • For 2 events (A, B), the probability that at least one of them happens is at most the sum of their probabilities: P(A or B) <= P(A) + P(B)
  • Mergeability allows you to have parallelism: sketches built on separate shards can be merged into one
  • Shrinking the lower compactors in the stack does not hurt the accuracy guarantee, and it puts an upper bound on memory usage
  • at some point it doesn’t matter because you’ve seen enough
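  • Weights tell you about compaction: an item that has survived h compactions has weight 2^h, which is how far a single compaction at that level can move a rank
    • Changes in rank are symmetric: a compaction shifts a rank by +w, -w, or 0, with +w and -w equally likely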
  • repeated compaction
  • https://apache.github.io/datasketches-python/main/quantiles/kll.html
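
The notes above can be made concrete with a toy compactor stack. This is a minimal sketch for intuition only, not the DataSketches implementation: the class name ToyKLL and all of its methods are made up for illustration, and every level has the same capacity k, whereas real KLL shrinks the lower levels to save memory. Level h holds items of weight 2^h; a compaction sorts a full level, keeps every other item (random offset), and promotes the survivors one level up with doubled weight.

```python
import random


class ToyKLL:
    """Hypothetical toy compactor stack: level h holds items of weight 2**h."""

    def __init__(self, k=128, seed=None):
        self.k = k                      # capacity of every level (assumption: fixed, not shrinking)
        self.levels = [[]]              # levels[h] is the buffer of compactor h
        self.rng = random.Random(seed)

    def update(self, x):
        self.levels[0].append(x)
        self._compress()

    def _compress(self):
        # Walk up the stack once, compacting every level that is at capacity.
        h = 0
        while h < len(self.levels):
            if len(self.levels[h]) >= self.k:
                self._compact(h)
            h += 1

    def _compact(self, h):
        buf = sorted(self.levels[h])
        # Keep one item behind if the count is odd, so total weight is preserved.
        leftover = [buf.pop()] if len(buf) % 2 else []
        offset = self.rng.randint(0, 1)     # even or odd positions, chosen at random
        promoted = buf[offset::2]           # survivors move up with doubled weight
        self.levels[h] = leftover
        if h + 1 == len(self.levels):
            self.levels.append([])
        self.levels[h + 1].extend(promoted)

    def merge(self, other):
        # Mergeability: concatenate buffers level by level, then re-compact.
        while len(self.levels) < len(other.levels):
            self.levels.append([])
        for h, buf in enumerate(other.levels):
            self.levels[h].extend(buf)
        self._compress()

    def total_weight(self):
        return sum(len(buf) << h for h, buf in enumerate(self.levels))

    def rank(self, x):
        # Estimated number of inserted items that are <= x.
        return sum(1 << h for h, buf in enumerate(self.levels) for v in buf if v <= x)

    def quantile(self, q):
        items = sorted((v, 1 << h) for h, buf in enumerate(self.levels) for v in buf)
        target = q * self.total_weight()
        running = 0
        for v, w in items:
            running += w
            if running >= target:
                return v
        return items[-1][0]
```

Two such sketches can be filled on separate halves of a stream and then combined with merge, which is the parallelism the notes refer to. Because a compaction of an even-sized buffer replaces 2m items of weight w with m items of weight 2w, the total weight always equals the number of items inserted; only individual ranks move, by at most the weight of the compacted level.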