An open-source scalable personal cloud


As mentioned earlier, clients are in constant communication with both the metadata service and the storage service. Communication with the metadata service is bidirectional: clients notify the server of local file changes at any time and receive events about remote modifications.

To stay up to date, clients need to be aware of every change made in their remote repository. This can be achieved in two ways: clients can periodically ask the metadata service whether anything has changed (the pull strategy), or the metadata service can notify clients when a change occurs (the push strategy). These two strategies have been compared many times. Below we briefly describe their advantages and disadvantages.

  • Pull. In this strategy, the information is available on the server and clients periodically ask for changes. If many clients were connected simultaneously, the server could collapse under the volume of requests, most of which would be unnecessary because nothing had changed. Since we intend to build a scalable system that can accommodate a large number of users, this strategy is not suitable.
  • Push. This is the opposite case: clients wait for the server to notify them of changes, which saves many requests to the server. The drawback is that clients have to maintain an open connection to receive notifications.

Because we prioritized server bandwidth and load, we opted for push-based communication between desktop clients and servers.
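
The trade-off between the two strategies can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: the classes PullServer and PushServer and the function simulate are hypothetical names introduced here, each client polls once per tick, and a registered callback stands in for the open connection that a push client must maintain. It is not the system's actual protocol.

```python
from typing import Callable, List, Optional, Set, Tuple


class PullServer:
    """Hypothetical server that only answers when a client asks."""

    def __init__(self) -> None:
        self.version = 0
        self.requests_handled = 0

    def commit(self) -> None:
        self.version += 1

    def poll(self, known_version: int) -> Optional[int]:
        # Every poll costs one request, even when nothing has changed.
        self.requests_handled += 1
        return self.version if self.version != known_version else None


class PushServer:
    """Hypothetical server that notifies subscribers when a change occurs."""

    def __init__(self) -> None:
        self.version = 0
        self.subscribers: List[Callable[[int], None]] = []
        self.messages_sent = 0

    def subscribe(self, callback: Callable[[int], None]) -> None:
        # Stands in for the open connection each client has to maintain.
        self.subscribers.append(callback)

    def commit(self) -> None:
        self.version += 1
        for notify in self.subscribers:
            notify(self.version)
            self.messages_sent += 1


def simulate(num_clients: int, ticks: int, change_ticks: Set[int]) -> Tuple[int, int]:
    """Return (requests handled under pull, messages sent under push)."""
    pull, push = PullServer(), PushServer()
    pull_known = [0] * num_clients
    push_known = [0] * num_clients

    for i in range(num_clients):
        def on_event(version: int, i: int = i) -> None:
            push_known[i] = version
        push.subscribe(on_event)

    for tick in range(ticks):
        if tick in change_ticks:
            pull.commit()
            push.commit()
        # Assumption: every pull client polls once per tick.
        for i in range(num_clients):
            latest = pull.poll(pull_known[i])
            if latest is not None:
                pull_known[i] = latest

    # Both strategies leave every client with the same final version.
    assert pull_known == push_known
    return pull.requests_handled, push.messages_sent


print(simulate(num_clients=100, ticks=60, change_ticks={10}))  # (6000, 100)
```

With 100 clients, 60 ticks and a single change, the pull server handles 6000 requests while the push server sends 100 notifications. In this model the gap grows with the number of clients and the polling frequency, not with the number of changes.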

Clients can perform two types of server calls: synchronous and asynchronous.

Synchronous calls

Clients must perform some requests in order to be initialized. This type of call is blocking, i.e., when the client makes the request, it is blocked until the server responds.

Each time a client starts, it needs to apply all changes made since it was last running. Because notifications follow a push strategy, any notification that is not processed when the server sends it is lost. Therefore, clients must ask the server for the current state of the files and work out for themselves which changes have occurred.
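
The start-up comparison can be sketched as a diff between the state the client last saw and the state the server reports. The function diff_state and the representation of state as a mapping from path to version number are assumptions made for illustration. The sketch also assumes the local mapping holds the last synchronized state, so a path missing on the server is treated as a remote deletion and not as a new local file.

```python
from typing import Dict, List, Tuple


def diff_state(local: Dict[str, int], remote: Dict[str, int]) -> List[Tuple[str, str]]:
    """List the actions needed to bring the local state up to the remote one."""
    actions: List[Tuple[str, str]] = []
    for path, remote_version in remote.items():
        local_version = local.get(path)
        if local_version is None:
            actions.append(("download", path))   # file created while offline
        elif local_version < remote_version:
            actions.append(("update", path))     # newer version on the server
    for path in local:
        if path not in remote:
            actions.append(("delete", path))     # file removed while offline
    return sorted(actions)


local_state = {"a.txt": 1, "b.txt": 2, "c.txt": 1}
remote_state = {"a.txt": 3, "b.txt": 2, "d.txt": 1}
print(diff_state(local_state, remote_state))
# [('delete', 'c.txt'), ('download', 'd.txt'), ('update', 'a.txt')]
```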

Because this type of call is blocking, the time needed to process it is critical. If the server does not respond within a given time period, more requests could accumulate, especially at peak hours.

Asynchronous calls

On the other hand, some other requests require significant processing time to ensure data consistency or are not time-dependent. Decoupling the request from the response in these scenarios is vital to ensure the system’s scalability.

An example of an asynchronous call is when the client wants to upload the metadata of a new version. When the client finishes uploading a file to the data server, it sends a request providing the metadata, with all the necessary information. The server, after checking data consistency, sends an event to all interested devices.

From the time the message is sent to the server until the notification is received, the client can perform other tasks.
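
The decoupling of request and response can be sketched with in-process queues and a worker thread. Everything here is an assumption for illustration: the names metadata_worker and run_demo, the message fields, and the use of Python's queue module in place of whatever messaging layer the real system uses. The consistency check is reduced to a placeholder.

```python
import queue
import threading
from typing import Dict, List


def metadata_worker(requests: queue.Queue, devices: Dict[str, queue.Queue]) -> None:
    """Consume commit requests and send an event to every registered device."""
    while True:
        request = requests.get()
        if request is None:  # sentinel value: stop the worker
            break
        # Placeholder for the server-side consistency check.
        event = {"file": request["file"], "version": request["version"],
                 "status": "committed"}
        for inbox in devices.values():
            inbox.put(event)


def run_demo() -> List[str]:
    requests: queue.Queue = queue.Queue()
    devices = {"laptop": queue.Queue(), "phone": queue.Queue()}
    worker = threading.Thread(target=metadata_worker, args=(requests, devices))
    worker.start()

    log: List[str] = []
    # The client sends the commit request and does not wait for a reply.
    requests.put({"file": "report.txt", "version": 2})
    log.append("request sent")
    log.append("client does other work")

    # The result arrives later as an event, like any other notification.
    event = devices["laptop"].get(timeout=5)
    log.append(f"notified: {event['file']} v{event['version']} {event['status']}")

    requests.put(None)
    worker.join()
    return log


for line in run_demo():
    print(line)
```

The client's only blocking step is the final read from its own inbox, which it is free to defer for as long as it has other work to do.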