Data lakes for big data and machine learning