PhotoHunt: An example of social graph integration
If you are unfamiliar with the PhotoHunt application, you can read the Java server overview here and can download the app from the Google+ GitHub page.
In this post I’m going to share some ideas that came up while experimenting with the PhotoHunt application and while thinking and talking about odd edge cases in creating social graphs and implementing social sign-in. Note that I don’t consider myself a security or authorization expert; this post is intended to get you thinking about social graphs.
The PhotoHunt application was developed as a showcase for Google+ Sign-in. The goal was a simple but comprehensive example of how external social networks can simplify building a friend graph that is unique to your application. It also demonstrates Google+ Sign-in as a primary sign-in solution. The simple social graph integration shows up in the application UI as a feature that lets you see the photos your friends have uploaded to PhotoHunt. Note that this excludes more challenging, computationally expensive relationships such as “friends of friends”. PhotoHunt leaves these out because there are more common problems worth showing than the performance problems that come with large, expensive, and potentially confusing queries.
In this post, I’m going to give a high level view of the PhotoHunt graph and discuss design tradeoffs considered when building these graph relationships.
The PhotoHunt graph
To give you a high level view of the database model, we elected to go with the following design, which is rather de-normalized for simplicity:
The user, specific to PhotoHunt, is connected to other users (again, specific to PhotoHunt) through the DirectedUserToEdge table. Each row in this table makes a single, one-way connection from one user to another. When a new user joins the PhotoHunt app, we want to add their friends from other social networks to save them the effort of adding those users manually. Because each edge is directed, we must have two edges per relationship between users. The result is a social graph unique to PhotoHunt, consisting of Users and “edges” connecting them to their friends, as shown in the following diagram:
This simple graph is the full relationship model that matters for PhotoHunt. In the friends’ photos list, users only see photos from friends one degree away. Our application does not store or cache the Google+ graph because it does not need to: we remember who the user is on Google+ within our user table and nothing more.
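The two-edges-per-relationship idea can be sketched in a few lines of Python. This is only an illustration, not the PhotoHunt server code: the `PhotoHuntGraph` class, its method names, and the integer user IDs are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PhotoHuntGraph:
    """Hypothetical in-memory stand-in for the directed edge table."""

    # Each entry is a directed (owner_user_id, friend_user_id) pair.
    edges: set = field(default_factory=set)

    def add_friendship(self, user_a, user_b):
        # One relationship is stored as two directed edges, one per direction.
        self.edges.add((user_a, user_b))
        self.edges.add((user_b, user_a))

    def friends_of(self, user_id):
        # Only friends one degree away; no "friends of friends" traversal.
        return sorted(friend for owner, friend in self.edges if owner == user_id)


graph = PhotoHuntGraph()
graph.add_friendship(1, 2)
graph.add_friendship(1, 3)
print(graph.friends_of(1))
```

Looking up a user's friends is then a single filter on the owner side of the edge, which is the kind of cheap query the friends' photos list needs.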
The problem with multiple social logins
Intuitively, you might want to key off of a user’s email address. However, most people have multiple email addresses, and in many cases users use these accounts in ways you don’t expect. For example, if I have a Twitter account that uses a different email (or no email!) and I want to bring its graph to my application, that account’s email address cannot be matched against the credentials available for the current user.
So, for PhotoHunt to integrate social graph data from existing relationships on other networks, each network must provide a unique identifier for its users. That identifier is what lets us determine whether existing PhotoHunt users are in a new user’s social graph. Revisiting what we discussed before, the social graph from Google+ is integrated into PhotoHunt by constructing edges between the user and those of their Google+ friends who are already on the PhotoHunt application.
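As a rough sketch of that matching step, assuming we already have the list of network identifiers for a user's friends and a lookup from network identifier to PhotoHunt user ID (the function name, the `g-…` identifiers, and the dictionary are hypothetical, not part of PhotoHunt):

```python
def build_edges(new_user_id, friend_network_ids, users_by_network_id):
    """Return directed edges between a user and friends already in the app.

    friend_network_ids: identifiers of the user's friends on the external
        network (for example Google+ IDs).
    users_by_network_id: hypothetical lookup of external ID -> app user ID.
    """
    edges = []
    for network_id in friend_network_ids:
        friend_id = users_by_network_id.get(network_id)
        # Skip friends who have not joined the app, and skip self-links.
        if friend_id is None or friend_id == new_user_id:
            continue
        edges.append((new_user_id, friend_id))
        edges.append((friend_id, new_user_id))
    return edges


users_by_google_id = {"g-100": 1, "g-200": 2}
edges = build_edges(3, ["g-100", "g-999", "g-200"], users_by_google_id)
print(edges)
```

Note that the match is made on the network's own identifier, never on an email address, so a friend with several email addresses is still found.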
The problem with sign-in
Given that I just said keying off of emails is a bad thing, you’re probably thinking that the email should never be stored for the user at all. However, when multiple social logins are used for sign-in, what happens when you lose an account that you were previously using to authenticate? You could get locked out of your socially connected service. In some cases, this leaves an abandoned account on the service, which is bad for both the service and the user.
To address this, a primary email should be retained for a user so that the service can contact them, at a minimum, for account recovery. In addition, you might want to store a password for access using an email login; if you do, store it as a salted hash rather than in symmetrically encrypted form, so that it cannot be recovered if your database leaks. Upon disconnecting the primary authorization solution, you could prompt the user to create new credentials so they don’t lock themselves out, or you could set up a user’s account recovery settings during the signup flow.
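If you do keep a password for email login, a common approach is to store only a salted, slow hash of it rather than anything reversible. Here is a minimal sketch using only the Python standard library; the function names and the iteration count are assumptions for illustration, and a production system should follow current guidance or use a vetted password-hashing library.

```python
import hashlib
import hmac
import secrets

# Illustrative work factor only; tune it for your own hardware and guidance.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
```

Only the salt and the digest would be stored alongside the user's primary email; the password itself is never kept.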
Making it better
You will notice that we did make a specific choice to couple the Google+ profile information directly with the user associated with the PhotoHunt application. This was a deliberate design choice to simplify the application’s code and to showcase how awesome the profile information and APIs for Google+ are. However, if we were to generalize this to multiple accounts, even multiple Google+ accounts (e.g. for folks with apps for domains), then we could adopt a different model for storing each individual connected account.
I’m going to propose an alternative model with an integration that could be used to further generalize access to accounts used for searching across social graphs. Put concisely, a connected accounts model removes the Google+ specific details from the user model. Queries to users, made to generate edges unique to PhotoHunt, should only touch the account types appropriate to the account being connected. When disconnecting an account, as opposed to removing an account, we could simply remove the connected account from the user’s associated accounts and would remove the edges as appropriate. Only the values relevant to the application are kept within the user table and are verified by the user accordingly. Here is an example of how such a model could exist:
So there you have it, you can add additional accounts, even multiple Google+ accounts, and your users can more easily grow their PhotoHunt graph.
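One way to picture such a connected accounts model is sketched below. All class and field names here are hypothetical and chosen for the example; the idea is only that the user record holds application-level data, each connected account is a separate record, and each edge remembers which connected account produced it so that disconnecting can remove the right edges.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConnectedAccount:
    provider: str          # e.g. "google_plus"; the label is an assumption
    provider_user_id: str  # unique identifier on that network


@dataclass
class User:
    user_id: int
    primary_email: str     # kept for contact and account recovery
    accounts: set = field(default_factory=set)


@dataclass
class Edge:
    owner_id: int
    friend_id: int
    source: ConnectedAccount  # the account whose graph produced this edge


def disconnect(user, account, edges):
    """Detach one connected account and drop the edges it contributed."""
    user.accounts.discard(account)
    return [edge for edge in edges if edge.source != account]


personal = ConnectedAccount("google_plus", "g-100")
work = ConnectedAccount("google_plus", "g-work")
alice = User(1, "alice@example.com", {personal, work})
edges = [Edge(1, 2, personal), Edge(2, 1, personal), Edge(1, 3, work)]

remaining = disconnect(alice, personal, edges)
print(remaining)
```

Disconnecting the personal account leaves the user record and the work account's edge intact, which is the difference between disconnecting an account and removing one.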
Getting clever
Before you get clever at growing your users’ edge graphs as I’m suggesting here, you would most likely want to add a confirmation dialog when extending friends to your PhotoHunt network and would want to be careful that you’re not misusing or violating a policy in the way you are doing it.
If it’s within policy for the connected accounts you are integrating, you can get clever when adding additional accounts to the PhotoHunt app. For example, what if a user has a friend who has public URLs associated with their Google+ account, but the user hasn’t circled that friend? You have the identifiers to make this connection and could then try to draw the two together in order to grow that graph.
Additionally, you could look at the profiles and external accounts for a user and create additional email and profile links based on the URLs associated with their account. For example, if you discovered a URL for a user’s Flickr account in the public list of the user’s Google+ URLs, you could suggest connecting the user’s friend graph through this sort of match.
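A small sketch of that URL matching, assuming you already have the list of public profile URLs for a user and that suggesting connections this way is within policy. The `KNOWN_HOSTS` table and the provider labels are made up for the example.

```python
from urllib.parse import urlparse

# Hypothetical mapping from host name to a provider label used by the app.
KNOWN_HOSTS = {
    "flickr.com": "flickr",
    "twitter.com": "twitter",
}


def suggest_accounts(public_urls):
    """Return (provider, url) pairs for URLs that point at known networks."""
    suggestions = []
    for url in public_urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[len("www."):]
        provider = KNOWN_HOSTS.get(host)
        if provider is not None:
            suggestions.append((provider, url))
    return suggestions


urls = [
    "https://www.flickr.com/photos/example",
    "http://example.com/blog",
    "not a url",
]
suggestions = suggest_accounts(urls)
print(suggestions)
```

The result is only a list of suggestions; the user should still confirm any connection before edges are added to their graph.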
Conclusions and examples of multi-login
Growing your own application’s graph is a useful way to drive engagement on your site. In general, the larger this graph is, the more engagement you can expect. To encourage that growth, design your site so that growing the graph is as simple as possible for the user. PhotoHunt is a great example of how this is possible and is a fantastic starting point for modeling such a social graph on your own sites.
The following are examples of multi-login and connecting various accounts to social graphs worth checking out for inspiration: