Moogle

TL;DR: I am the author of Moogle. Moogle is free, open source and hosted on GitHub.
A demo site will be available soon!

You are searching for the sushi restaurant that a friend of yours recommended last month: you type “sushi restaurant” into your smartphone and you get a tweet from John talking about Tokyo Sushi. You also get a comment you wrote on Facebook, an SMS message sent to your brother and a bookmark in your browser, all about the same restaurant. Now imagine that you can do this from your smartphone, laptop, tablet or smart TV. Something so basic, yet so far from reality. This is Moogle - My Own Google, the search engine for private data.

Moogle icon

A few years ago all our private data were stored on laptops and personal computers: we used to manage emails with client programs like Mozilla Thunderbird and Microsoft Outlook, and to store our documents and photos in local folders.

But today much of our private data are stored in the shared web-based infrastructure known as “the cloud”: we use web-based email services like Google Gmail and Yahoo Mail; we store and share documents with services like Dropbox, Google Drive and Microsoft SkyDrive; we share photos and interact with our friends on social networks like Facebook, Twitter and Instagram; we communicate using messaging applications like Google Hangouts, Facebook chat, Skype, SMS and WhatsApp; we have bookmarks on our laptops, on our office desktops and on our mobile phones.

Statistics show how popular these services are: 1.5 million emails are sent per second, with 112 emails per day for the average corporate user; Dropbox reached 100 million users in November 2012; Facebook reaches 1 billion active users every month; in 2009, 1 billion Facebook chat messages and 10 billion WhatsApp messages were sent every day.

Having our personal data in the cloud is handy, but there is a drawback: search is complicated because the information is distributed over several services. When users want to find an address and do not remember whether they read it in a file on their laptop, in an email or in a Facebook post, they might waste a lot of time searching through different platforms. What they want instead is a single place to query and get results from the entire set of their private information distributed in the cloud. This is Moogle - My Own Google.

Moogle is a web application that lets users search across all their distributed private data, whether emails, documents, posts on social networks, SMS and chat messages on mobile phones, or bookmarks. Furthermore, Moogle performs full-text search: the actual content of documents is searched, not only the metadata that describe them. Advanced text analysis is also performed while indexing data, with well-known techniques like stop-word removal, lexical analysis, stemming, synonym injection, etc. Note that these features are not always available in the cloud: Dropbox, Twitter and messaging applications on mobile phones, for instance, offer only basic keyword-based search.
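To make the text analysis steps above more concrete, here is a minimal sketch in plain Python of what stop-word removal, stemming and index-time synonym injection can look like. It is only a toy illustration and not Moogle's code: Moogle delegates this work to Apache Solr. The stop-word list, the synonym table, the naive stemmer and the document ids are all assumptions made up for this example.

```python
import re
from collections import defaultdict

# Illustrative resources only (assumptions, not Moogle's actual configuration).
STOP_WORDS = {"a", "an", "and", "at", "for", "in", "of", "on", "the", "to"}
SYNONYMS = {"restaurant": ["diner"]}


def stem(token):
    """Naive suffix stripping; real analyzers use e.g. the Porter stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text, inject_synonyms=False):
    """Lexical analysis, stop-word removal, stemming, optional synonyms."""
    terms = []
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        if token in STOP_WORDS:
            continue
        stemmed = stem(token)
        terms.append(stemmed)
        if inject_synonyms:
            # Synonyms are added at index time only, next to the original term.
            terms.extend(stem(s) for s in SYNONYMS.get(stemmed, []))
    return terms


def build_index(documents):
    """Map each analyzed term to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in analyze(text, inject_synonyms=True):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every analyzed query term."""
    terms = analyze(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical items coming from different providers.
documents = {
    "tweet-1": "John: loved the Tokyo Sushi restaurant!",
    "sms-7": "Meeting moved to Monday",
    "bookmark-3": "Best restaurants in Amsterdam",
}
index = build_index(documents)
matches = search(index, "sushi restaurants")
```

Because both the documents and the query go through the same analysis, the plural "restaurants" in the query still matches the singular "restaurant" in the tweet, and a query for "diner" finds items that only mention a restaurant. A basic keyword-based search would miss both.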

Moogle is the result of a project I have been working on over the last few years as my Master’s degree thesis. The project, together with a sketch of its business and marketing plans, was pitched at some venture capital meetings in Washington DC and Amsterdam, and it raised some interest.

The project was named Moogle - My Own Google because it is a highly evocative name. Yet it is just a working name, used only in this initial phase.

Screenshots

Managing data providers
The main search page
Search results

Resources

Moogle is hosted on GitHub. A demo website will be available soon! In the meantime you can read my thesis (English) and some slides (Italian).

Key Technologies

Key technologies used in Moogle:

  • Python, Django
  • PostgreSQL, SQLAlchemy, Redis, redis-py
  • Apache Solr, solrpy
  • OAuth, requests-oauthlib, Facebook Graph API, Twitter REST API, Dropbox Core API, Google Gmail API, Google Drive API
  • Git, GitHub