Friday, September 11, 2015

Google Summer of Code 2015 Experience


Hi everyone, I was selected for Google Summer of Code 2015 under the CloudCV organization. In this blog post, I am going to describe the work that I did over the summer. I hope it will be useful :P

My project with CloudCV was 'Integrating Dropbox, Google Drive and S3 and building REST APIs for CloudCV'.



It has been an awesome experience this summer. I learned a lot of new technologies and worked with them in a very short span of time.

TL;DR

In short, the project aimed to develop the following functionality:
  • Integrate third-party authentication through Dropbox and Google
  • Create a new database schema for CloudCV
  • Create REST APIs for the new database model
  • Modify and integrate NVIDIA's DIGITS framework (a deep learning framework) and add the concept of users to DIGITS
  • Create Cloud Storage upload APIs for uploading several gigabytes of model jobs to Dropbox, Google Drive and Amazon S3
  • Create Cloud Storage download APIs for downloading the training and validation datasets from Dropbox, Google Drive and Amazon S3
Now, here is the project in more detail:
  • Integrate third-party authentication through Dropbox and Google:

    For third-party authentication, I used the django-allauth package. I researched the authentication packages available for Django quite thoroughly before finally choosing django-allauth. It handled almost all the cases well; for the few it did not, I had to modify it to fit our needs. Integrating it was not that tough, but creating a beautiful UI was a big problem for me. I used Materialize CSS (based on Material Design) to create the layouts.
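To give a rough idea of what this integration involves, here is a minimal sketch of a Django settings fragment that enables django-allauth with the Dropbox and Google providers. It is not the actual CloudCV configuration: the list of Django apps is an assumption for illustration, and newer django-allauth releases may need extra settings (for example an additional middleware entry), so check the documentation for the version you install.

```python
# settings.py fragment (a sketch, not the real CloudCV settings).

# django-allauth has traditionally relied on django.contrib.sites.
SITE_ID = 1

INSTALLED_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.sites",
    # Core allauth apps
    "allauth",
    "allauth.account",
    "allauth.socialaccount",
    # One entry per third-party login provider
    "allauth.socialaccount.providers.dropbox",
    "allauth.socialaccount.providers.google",
]

AUTHENTICATION_BACKENDS = [
    # Regular username/password login
    "django.contrib.auth.backends.ModelBackend",
    # Login through allauth (including social accounts)
    "allauth.account.auth_backends.AuthenticationBackend",
]
```

Besides this, the allauth URLs have to be included in the project's URL configuration (`allauth.urls`), and the client ID and secret for each provider have to be registered, typically through the Django admin.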
  • Create a new database schema

    Creating a database schema and normalizing it properly has always been a challenge for me. I brainstormed a lot about the schema, and after a lot of discussion with my mentors we settled on the design shown at http://www.deshraj.in/cloudcv_db.
  • Create REST APIs for the new database model

    This was the first time I built REST APIs. Before going into depth, I would like to say that REST is just AWESOME ;) I loved the concept. For creating the APIs, I chose DRF (Django REST Framework). DRF has a lot of out-of-the-box functionality that helps a developer very much in building a RESTful architecture.
  • Integrating workspaces in DIGITS

    This task involved working on the DIGITS framework and modifying it to add the concept of workspaces and users. About DIGITS: the Deep Learning GPU Training System is a webapp for training deep neural networks. The official source code repository of DIGITS is https://github.com/NVIDIA/DIGITS. The basic idea behind creating workspaces in DIGITS is to give researchers and data scientists a platform to work together collaboratively. After integrating workspaces, I needed to connect Django (the CloudCV server) with Flask (DIGITS) so that both supported a single authentication system. So, the session was managed by the CloudCV server, and the DIGITS server got read-only access to the sessions so that it could check the logged-in user. For managing sessions, I used the Redis_Session Fork. Sharing a single session between Django and Flask is a crucial part of the project.
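The read-only session check on the DIGITS side can be sketched roughly as below. This is only an illustration of the idea, not the project's code: a plain dict stands in for Redis, the `session:` key prefix and the JSON encoding of the session data are assumptions, and the function name is hypothetical. What does come from Django itself is the default cookie name `sessionid` and the `_auth_user_id` key under which Django stores the logged-in user's id.

```python
import json

SESSION_COOKIE_NAME = "sessionid"  # Django's default session cookie name
KEY_PREFIX = "session:"            # assumed key prefix in the shared store


def get_logged_in_user_id(session_store, cookies):
    """Return the user id from the shared session, or None if not logged in.

    session_store: any mapping with .get(); a stand-in for a Redis client.
    cookies: mapping of cookie names to values from the incoming request.
    The store is only read, never written.
    """
    session_key = cookies.get(SESSION_COOKIE_NAME)
    if session_key is None:
        return None
    raw = session_store.get(KEY_PREFIX + session_key)
    if raw is None:
        return None  # unknown or expired session
    try:
        data = json.loads(raw)  # assumes sessions are stored as JSON
    except ValueError:
        return None
    if not isinstance(data, dict):
        return None
    # Django keeps the authenticated user's id under this key.
    return data.get("_auth_user_id")
```

On the Flask side, a function like this could be called with `request.cookies` before serving a page, redirecting to the CloudCV login page when it returns None.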
  • Building the Cloud Storage API

    The Cloud Storage API is one of the prominent features of the CloudCV fork of DIGITS. Using this API, models can easily be uploaded to and downloaded from cloud storage services like Dropbox, Google Drive and Amazon S3. On the technology side, Boto is used to fetch data from S3 buckets.
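Files of several gigabytes are usually sent to cloud storage in parts rather than in one request. I am not describing the exact CloudCV implementation here; the following is just a minimal sketch of reading a file in fixed-size parts, where `upload_part` is a hypothetical callback standing in for the real storage call. The 5 MiB default follows the S3 multipart rule that every part except the last must be at least 5 MiB.

```python
import io

# S3 multipart uploads require parts of at least 5 MiB, except the last one.
MIN_PART_SIZE = 5 * 1024 * 1024


def iter_parts(fileobj, part_size=MIN_PART_SIZE):
    """Yield (part_number, data) pairs; part numbers start at 1."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    part_number = 1
    while True:
        chunk = fileobj.read(part_size)
        if not chunk:
            break  # end of file
        yield part_number, chunk
        part_number += 1


def upload_in_parts(fileobj, upload_part, part_size=MIN_PART_SIZE):
    """Send the file part by part and return the total number of bytes sent.

    upload_part(part_number, data) is a hypothetical callback that would
    wrap the actual Dropbox, Google Drive or S3 call.
    """
    total = 0
    for part_number, chunk in iter_parts(fileobj, part_size):
        upload_part(part_number, chunk)
        total += len(chunk)
    return total
```

For example, `upload_in_parts(open(path, "rb"), my_callback)` would never hold more than one part in memory at a time, which matters when a model job is larger than the available RAM.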
  • Plans after GSoC 2015

    I have been contributing to open source projects for around 11 months and I love doing it. So, I am continuing my work on CloudCV. I am also contributing to the main repository of NVIDIA DIGITS, to add other functionality and help the researchers and data scientists around the world who use it.

It has been the most challenging summer for me, and it improved my coding skills a lot. I would recommend that you start contributing to open source projects as early as possible, because that teaches you how big organizations like Mozilla, Google, Microsoft, etc. work on their projects. It also goes on your CV, which is a big plus.

If you have any comments, please leave them below the post. Sorry for my bad composition :) :)
Lastly, I would say that 

