Identifying the Components
Matthijs van Leeuwen, Universiteit Utrecht, Netherlands
Jilles Vreeken, Universiteit Utrecht, Netherlands
Arno Siebes, Universiteit Utrecht, Netherlands
Abstract
Most, if not all, databases are mixtures of samples from different distributions. In many cases, however, nothing is known about the source components of these mixtures. Therefore, many methods that induce models regard a database as sampled from a single data distribution. Models that do take into account that databases actually are sampled from mixtures of distributions are often superior to those that do not, independent of whether this is modelled explicitly or implicitly. Transaction databases are no different with regard to data distribution…