Identifying the Components

Matthijs van Leeuwen, Universiteit Utrecht, Netherlands
Jilles Vreeken, Universiteit Utrecht, Netherlands
Arno Siebes, Universiteit Utrecht, Netherlands

Abstract

Most, if not all, databases are mixtures of samples from different distributions. In many cases, however, nothing is known about the source components of these mixtures. Therefore, many methods that induce models regard a database as sampled from a single data distribution. Models that do take into account that databases actually are sampled from mixtures of distributions are often superior to those that do not, independent of whether this is modelled explicitly or implicitly. Transaction databases are no different with regard to data distribution…