While browsing the internet for material on a hot topic, machine learning, I found tutorials created by Microsoft about their SQL Server Data Mining Extensions. Microsoft provides the AdventureWorksDW2012 sample database under the Microsoft Public License and uses it in its data mining tutorials. Some time ago this inspired me to build and test a more general and presumably more scalable solution, and to evaluate some of the MLlib algorithms available in Apache Spark. Of course, the results of the computations should be comparable, at least to some degree, with those from the Microsoft tutorials.

