Sunday, January 3, 2016

Google Auto Awesome Video - Is it a machine learning solution?

Wishing you all a very happy and wonderful New Year 2016!

I am happy to start the year with a post covering some aspects of machine learning. This post was actually inspired by the New Year's Eve celebration.

At the New Year's Eve celebration with friends, I captured some moments using the Google Photos app. The next morning I woke up to a notification on my phone asking whether I would like to review and save a video made from the photos of the New Year's Eve event, with some nice background music added. Google calls these Auto Awesome videos in the Google Photos app.

I was happy to see a video that had been made automatically and was ready to share. There is also a manual mode where we can choose the photos for the video, but my interest is in the automatic creation, and I started thinking about how this could have been designed.

At first glance I sensed that this could be a machine learning implementation. With my limited data science knowledge, I thought I would offer some guesswork on how it could have been designed to run at large scale for millions of tenants on the server side.

Let us understand the requirement in detail. Given a collection of images, we have to perform the following:

  • Categorize the images into groups and pick the group corresponding to a specific event, say the New Year celebration in this case.
  • To improve accuracy, check for and eliminate any irrelevant images that went into the group by mistake.
  • Judge the mood of the event and add appropriate background music

Now let us analyse the type of machine learning solutions that could have potentially been used for this design:

  • The first part of the problem is categorizing the images into groups based on some parameters. A clustering algorithm could be a good fit for this. Given a dataset, clustering partitions it into groups based on features of the data. In our scenario the grouping could be based on the time each photo was taken, but I have seen cases where grouping is done based on the image background and the people involved.
  • Next is to eliminate outliers in the grouping, since some photos might have accidentally gone into the group. An anomaly detection algorithm can be run to eliminate those outlier images from the collection.
  • The final step is to understand the mood of the images and add relevant background music to the video. A sentiment analysis algorithm applied to the pictures could potentially help to understand their mood.

Disclaimer: This is purely my own guesswork about the design, and Google might have done it in a different way :)
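
To make the first two guessed steps a little more concrete, here is a minimal Python sketch. It is only an illustration of the idea, not how Google does it. Everything in it is an assumption of mine: the `Photo` class, the `people` labels (imagined as coming from some face recogniser), the two-hour gap used to split events, and the rule that a photo is an outlier when it shows none of the people who appear in at least half of the group. A real system would more likely use a proper clustering and anomaly detection library on richer image features.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Photo:
    name: str
    taken_at: datetime
    people: frozenset  # hypothetical labels from a face recogniser


def group_by_time_gap(photos, max_gap=timedelta(hours=2)):
    """Cluster photos into events: a gap longer than max_gap starts a new group."""
    groups = []
    for photo in sorted(photos, key=lambda p: p.taken_at):
        if groups and photo.taken_at - groups[-1][-1].taken_at <= max_gap:
            groups[-1].append(photo)
        else:
            groups.append([photo])
    return groups


def drop_outliers(group, min_share=0.5):
    """Keep photos showing at least one person seen in min_share of the group."""
    counts = Counter(person for photo in group for person in photo.people)
    regulars = {p for p, n in counts.items() if n / len(group) >= min_share}
    return [photo for photo in group if photo.people & regulars]


# Made-up sample data: four photos from the party, one from the day before.
photos = [
    Photo("a.jpg", datetime(2015, 12, 31, 21, 0), frozenset({"anna", "ben"})),
    Photo("b.jpg", datetime(2015, 12, 31, 21, 30), frozenset({"anna"})),
    Photo("c.jpg", datetime(2015, 12, 31, 22, 10), frozenset({"anna", "ben"})),
    Photo("d.jpg", datetime(2015, 12, 31, 22, 20), frozenset()),  # accidental shot
    Photo("e.jpg", datetime(2015, 12, 30, 9, 0), frozenset({"carl"})),
]

groups = group_by_time_gap(photos)
event = max(groups, key=len)   # treat the largest group as the event
kept = drop_outliers(event)    # photos that would go into the video
print([p.name for p in kept])
```

The mood and music step is left out here, since it would need an actual image model rather than a few lines of plain Python.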