A.nnotate: Document Annotation Using Cloud Computing
From Entrepedia: The Entrepreneurship Wiki
Introduction
A.nnotate is a cloud-computing-based application that allows people to upload documents, images and snapshots of webpages and then send them to other people, who can annotate and comment on the content.
Fred Howell and Robert Cannon launched the service in 2007 with a view to providing a tool that would allow for collaboration in the production of large-scale documents[1]. Their idea had grown from their research on more sophisticated ways of structuring and organizing large quantities of documents using the Semantic Web.
The idea behind A.nnotate is fairly straightforward, and the concept is similar to other collaborative document production tools such as Zoho and Google Docs. Unlike these tools, however, A.nnotate is not an on-line word processor: you still use regular desktop tools for editing, and use A.nnotate for getting feedback on a read-only PDF copy, in the same way as you would send around a printed copy and gather comments written in the margin attached to highlighted sections[2]. Howell explains that whilst on-line applications such as Google Docs are great for collaboration, they often lack the rich functionality of desktop applications (never mind the fact that it’s not always convenient to have a group of people editing a document all at the same time). A.nnotate, on the other hand, allows users the best of both worlds: the full functionality of desktop applications and the potential of web-based collaboration. What’s more, the service is not restricted to one file format: users can work not only with office documents but also with PDFs, images and files in a wide variety of formats, and they can take a “snapshot” of a live webpage.
The business model behind the venture is relatively straightforward. On the one hand, there is the on-line, or hosted, part of the website, which uses cloud computing to provide a free service to users. Revenue is generated by means of a credits-based system which limits the number of documents that can be uploaded; users can top up their credits as required or sign up for a monthly plan. Discounts are offered according to the size of the documents uploaded and the number of credits purchased[3]. The hosted side of the business has been successful, attracting thousands of subscribers in just one year. On the other hand, Howell explains that the hosted service functions well as a “shop window” which draws in the larger, corporate customers who account for the majority of the business’s revenue. These customers will often purchase a licensed version of the software to run behind the safety of their own firewall.
Start-ups vs Spin-outs
Feeling increasingly restricted by the small-scale user bases in the world of academic research, Howell and Cannon decided that starting their own business was the right path to choose. Whilst small-scale, funded application development is suited to researching and developing ideas, Howell explains that it does not allow them to mature into real-world applications with a diverse and heterogeneous user base.
Howell is keen to point out that although the ideas behind the business are rooted in his academic research, it began as a start-up in its own right and not as a university spin-out. The distinction is critical: “Spin-outs are great,” says Howell, “and it really helped us to go through the spin-out process in one of our previous ventures, but the university always retains a large chunk of your business and ultimately you reach a point where it becomes a problem to take the decisions necessary to grow the business when they have such a significant stake in your company.”
A.nnotate benefited from a £50,000 research and development grant provided by the SMART Awards scheme. This provided Howell and Cannon with the opportunity to develop their idea. “The SMART award was really useful”, explains Howell, “somewhere in between the extremes of market-driven product development and pure research, it gave us the means to really develop our idea. Also, having to report back to the SMART scheme helped to make sure we thought carefully about things such as due diligence and progressed in a serious and rigorous manner.”
Yet developing the software and the system architecture was not the biggest problem they faced. Howell explains that even now, the business expends more effort on marketing than anything else. “Marketing was the biggest challenge we faced,” says Howell, “you’ve got a great idea, even a great website but how do you tell people about it?” With Publications List, a previous venture, they had used Google Ads with great success and had learnt through experience the difficulty and importance of choosing a search term that would bring the traffic rolling in to their site. Howell admits that the optimum search term for A.nnotate still eludes them, but through a combination of on-line marketing, including reviews in blogs, and distributing pamphlets and leaflets at events such as BarCamp, they have been able to market the business with some degree of success.
Using the Cloud
At the heart of the business operations is a system built around cloud computing. One of the biggest problems created by the growth of the business was the storage required as customers uploaded more and more data to its servers. Quite simply, the growth of the system was limited by the rate at which documents were uploaded to its servers. The solution, Howell explains, was simple: “by moving our documents over to Amazon S3 storage, we were able to have access to what is essentially an infinite storage device.” Even though the documents are initially uploaded to the A.nnotate server, there is a batch job which runs nightly, moving them over onto the company’s S3 storage. This model has served them well: by using the Amazon S3 service, A.nnotate now has access to unlimited storage capacity on a pay-per-gigabyte basis, which is a considerable advantage as users continue to upload their files and the business continues to grow. What’s more, as all the heavy content of the application is served from the Amazon S3 servers, the A.nnotate webserver is able to operate using a relatively small amount of bandwidth.
Yet the transition to cloud-based storage has not been without its problems. Howell explains some of the challenges of designing the system architecture so that the functionality is carried out on the core web server and the content is served from Amazon’s S3 system. He also points out that relying solely on Amazon to serve its content is one of the weaknesses of the system: every now and then the service suffers from reliability issues, such as minor outages, and only limited technical support is available when this happens. “When you have your own dedicated server, you can just ring up and someone will get on the case, it’s much harder with S3,” Howell explains. Yet he is relatively relaxed about this: whilst Amazon may have pioneered cloud storage with its S3 system, as the market becomes more competitive, the pressures of competition will force it to focus on issues such as reliability.
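The nightly move described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not A.nnotate's actual job: the function name `move_uploads_to_store`, the directory layout and the `upload` callable are all hypothetical, and the storage call is passed in as a parameter so the logic can be read (and run) without any cloud account.

```python
from pathlib import Path
from typing import Callable, List


def move_uploads_to_store(upload_dir: Path, upload: Callable[[Path, str], None]) -> List[str]:
    """Copy every file under upload_dir to remote storage, then delete the local copy.

    `upload(local_path, key)` is a hypothetical callable that raises on failure.
    Returns the storage keys that were moved successfully.
    """
    moved = []
    files = sorted(p for p in upload_dir.rglob("*") if p.is_file())
    for path in files:
        # Use the path relative to the upload directory as the storage key.
        key = path.relative_to(upload_dir).as_posix()
        try:
            upload(path, key)
        except Exception:
            # Keep the local file so the next nightly run can retry it.
            continue
        # Only remove the local copy once the upload has succeeded.
        path.unlink()
        moved.append(key)
    return moved
```

With the boto3 library, the `upload` callable could plausibly wrap an S3 client's `upload_file(Filename, Bucket, Key)` method, and the function could be scheduled once a night with a tool such as cron. Deleting the local file only after a successful upload means that an S3 outage delays the move rather than losing documents.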
Storage vs Processing Power
Storage was not the only issue encountered by Howell and Cannon with respect to A.nnotate’s architecture: as the business grows, they are becoming increasingly aware of greater demands on processing power. “Whilst you could see the transition of our storage to S3 as a sort of cloud level 1,” says Howell, “we’re now looking at being able to scale our processing power as a sort of cloud level 2.” As the A.nnotate architecture continues to evolve, virtual servers on-demand will play a key role in its ability to absorb growth.
Again, Amazon has been one of the pioneers in this area, and its EC2 service allows users to create and destroy virtual servers from the command line. The advantage of this approach is that, as well as having access to a highly scalable server base, the business is not tied into a 12-month minimum contract, as is the norm when purchasing servers. It also allows dedicated servers to be created in a matter of minutes, which has allowed the business to provide private instances of A.nnotate to corporate customers, many of whom have then gone on to purchase a licensed version.
However, there are a number of issues presented by the use of virtual servers. One of these is price: contrary to what would be expected, virtual servers often work out more expensive than commissioning physical ones. Howell explains that upon inquiring, he found it would be more expensive to run the A.nnotate site on virtual servers with a performance equivalent to their physical counterparts. The increased costs are partially offset, however, by the fact that a business only pays for the servers it uses and can create new servers on demand.
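The trade-off between a contracted physical server and a pay-as-you-go virtual one can be made concrete with a little arithmetic. The sketch below is illustrative only: the prices are invented for the example (they are not A.nnotate's or Amazon's figures), the function names are hypothetical, and amounts are held in pence so that the sums stay exact.

```python
HOURS_PER_MONTH = 730  # average month: 8760 hours in a year / 12


def on_demand_cost(hourly_price: int, hours_used: int) -> int:
    """Monthly cost of a virtual server billed only for the hours it runs."""
    return hourly_price * hours_used


def break_even_hours(monthly_price: int, hourly_price: int) -> float:
    """Hours per month below which the on-demand server is the cheaper option."""
    return monthly_price / hourly_price


# Hypothetical prices, in pence.
PHYSICAL_MONTHLY = 10_000  # contracted physical server: 100.00 per month
VIRTUAL_HOURLY = 20        # equivalent virtual server: 0.20 per hour

always_on = on_demand_cost(VIRTUAL_HOURLY, HOURS_PER_MONTH)
threshold = break_even_hours(PHYSICAL_MONTHLY, VIRTUAL_HOURLY)

print(f"Virtual server running all month: {always_on / 100:.2f}")
print(f"Physical server on contract:      {PHYSICAL_MONTHLY / 100:.2f}")
print(f"Virtual is cheaper below {threshold:.0f} hours per month")
```

Under these assumed prices, a virtual server that runs around the clock costs more than its physical counterpart, but one that is only needed for part of the month, for example a short-lived private instance for a customer trial, comes out cheaper.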
It seems though, that as A.nnotate continues to grow, the business will look more and more towards this “cloud level 2” as a means to deliver the scalability required to cope with increased processing demand.
Cloud Suppliers
A.nnotate has built itself upon the S3 and EC2 services provided by Amazon. Howell explains that this is largely because Amazon was something of a pioneer in the provision of cloud computing services, but the market is beginning to evolve and, in spite of obvious barriers to entry (the huge infrastructure costs required in setting up a service like Amazon S3), it is becoming more and more competitive. There are now a number of options when it comes to contracting cloud services: Mosso (Rackspace), FlexiScale and Microsoft Azure are all muscling in on the scene, and Howell is optimistic that this increase in competition will see cloud infrastructure become more reliable and cost-efficient in the future.
In addition to this, he gives content distribution as just one example of how the cloud is continuing to evolve. So-called Content Distribution Networks spread data geographically in order to improve response times and avoid transferring data across the world. For example, data used primarily by users in the United States would be hosted in the USA, and data used primarily by the service’s British customers would be hosted somewhere in the U.K. The technology is still in its early stages, but with the rise of a new generation of web content that relies heavily on video and multimedia, it can only grow in importance.
In this case study, we have seen that A.nnotate has been successful in its early adoption of cloud technology, but it is also clear that the continuous evolution of cloud services and increased competition amongst providers present an abundance of opportunities that will be seized by the current generation of emerging entrepreneurs.
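The geographical idea can be sketched in a few lines of Python. This is a deliberately simplified illustration: the host names and the `REPLICAS` table are hypothetical, and real content distribution networks usually direct users to a nearby copy automatically (for instance through DNS) rather than through an explicit lookup like this one.

```python
# Hypothetical replica hosts, keyed by two-letter country code.
REPLICAS = {
    "US": "us-replica.example.com",
    "GB": "uk-replica.example.com",
}
DEFAULT_HOST = "origin.example.com"  # fallback when no nearby replica exists


def host_for(country_code: str) -> str:
    """Return the replica host closest to the user's country."""
    return REPLICAS.get(country_code.upper(), DEFAULT_HOST)


def url_for(country_code: str, path: str) -> str:
    """Build the URL from which a user in the given country fetches a file."""
    return f"https://{host_for(country_code)}/{path.lstrip('/')}"
```

A British user requesting `docs/report.pdf` would be pointed at the U.K. replica, a user in the United States at the U.S. one, and anyone else at the origin server.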
References
- [1] Enhancing documents with annotations and machine-readable structured information using Notate, Robert Cannon and Fred Howell, 04/03/2007
- [2] A.nnotate and (online) word processors for document collaboration, A.nnotate.com, 02/06/2008
- [3] See Hosted annotation plans, A.nnotate.com, for more information



