1.0 OVERVIEW
Codenvy is a cloud IDE with nearly 100,000 developers who use it to code, build, and test applications.
This article explains the architecture of Codenvy's various technologies and sites.
Our architecture is driven by the different offerings we intend to launch:
- Codenvy.com - A hosted cloud IDE with support, SLAs, and hardware.
- Codenvy Enterprise - Enable organizations to code, build, test and deploy applications, on their own servers.
- Codenvy ISV - Drive and measure technology engagement of published SDKs and APIs with promoted Factories, monetizable plug-ins, and IDElets. A Factory is a way to launch a temporary code, build, test, debug workspace with policies. An IDElet is an embeddable code, build, test, debug workflow that can be inserted into another product.
- Codenvy Platform - A cloud IDE engine to provide developers a way to develop, test, and run tooling plug-ins and applications.
The Codenvy Platform is used as an engine to deliver Codenvy.com, Codenvy Enterprise, and Codenvy ISV. It can also be used to create other IDEs with any branding the implementer desires. This SDK is similar in structure to the Eclipse Platform, but engineered for a cloud environment. It also provides support for developing plug-ins for build, run, test, and debugging workflows, which typically operate outside of the IDE itself.
The architecture discussion will be in two parts: 1) The SDK that drives Codenvy Platform, and 2) the components that extend the SDK to create Codenvy.com.
2.0 THE DIFFERENCE BETWEEN A CLOUD IDE AND A DESKTOP IDE
The primary technical difference between the two types of IDEs is that a desktop IDE vendor typically expects the packaging, build, test, and runtime environments to be installed and managed outside the tool itself. There are exceptions: some advanced IDEs install these additional components onto the host machine themselves, but that is not the norm.
With a cloud IDE environment, the IDE exists entirely in the cloud; however, usually the cloud IDE vendor must also provide the build and runtime environment in a hosted location in addition to the executing IDE.
This distinction offers both promise and challenges. The promise for cloud IDEs is that, being entirely in the cloud, the cloud IDE vendor can provision more components, do it faster, and potentially reduce failure rates from configuration. The challenges include the additional overhead of managing a workspace, which now includes an IDE in addition to projects, code, builders, and runners.
2.1 APPROACHES TO CLOUD IDE WORKSPACE MANAGEMENT
There are generally two approaches taken with managing workspaces in the cloud:
- A user is granted a dedicated VM where they can gain privileges, including installing files and other software that can be edited, built, or executed.
- A user shares pooled resources that are homogeneously structured, and commands for IDE functions (e.g., refactor), build functions, and run functions are distributed across different clusters optimized for each function. The more homogeneous the development environments are, the denser the operation of the system is for the operator. The downside is that these systems are less flexible and less likely to be a perfect match for developers on their project.
Codenvy.com provides both configurations for users.
For users on a premium plan, we operate the first configuration, where they are given a dedicated VM with an SSH option. (This plan is to be made available during Q4/2013.) Different build and runtime software can be installed on the VM. The IDE, which is running within the Codenvy system, can then be re-configured to point the build, run, and debug commands to execute processes that reside within the dedicated VM.
For users on the free plan, we operate a set of centralized servers broken into three clusters: one for IDEs, one for builders, and one for runners. There is a queue system that sits between each cluster and throttles the activity back and forth between each cluster. Each cluster can scale independently of the others, and allows for a higher density across the population. IDE clusters scale on their bottleneck of I/O & memory, build clusters on compute, and run clusters on memory.
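The throttling role of the queue between two clusters can be illustrated with a bounded queue. This is only a minimal sketch, not Codenvy's implementation: the class name `ClusterQueue`, its methods, and the choice of rejecting work when the queue is full are all assumptions made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch: a bounded queue that throttles work passed from one cluster to the next. */
final class ClusterQueue {
    private final BlockingQueue<String> pending;

    ClusterQueue(int capacity) {
        this.pending = new ArrayBlockingQueue<>(capacity);
    }

    /** Called by the upstream cluster; returns false when the queue is full (downstream is saturated). */
    boolean submit(String request) {
        return pending.offer(request);
    }

    /** Called by a downstream node with spare capacity; returns null when nothing is waiting. */
    String next() {
        return pending.poll();
    }

    /** Number of requests currently waiting for the downstream cluster. */
    int backlog() {
        return pending.size();
    }
}
```

Because the upstream side only ever sees an accept or reject answer, each cluster can be resized on its own bottleneck without the other clusters needing to know about it.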
3.0 CODENVY PLATFORM SDK ARCHITECTURE
Since developers are coming up with a variety of different development workflows, we needed to design an engine that would allow for the creation of different cloud IDEs that could change the behavior of the system, along with its feel. There are some well-defined approaches to structuring plug-ins that have been generated by Eclipse and JetBrains over the years, and these interfaces could be extended to a multi-tenanted cloud environment.
The SDK encompasses:
- A cloud client runtime that discovers, registers, loads, and manages plug-ins. The runtime is also responsible for managing a set of multi-tenanted inbound and outbound external connections to systems participating in a tooling workflow.
- A cloud client SDK that enables the development of multi-tenant, cloud client plug-ins. The SDK provides a common model for working with resources, events, and user interfaces. Plug-ins can be integrated with, or layered upon one another in an extensible, well-defined format.
- A set of standard plug-ins, which can be excluded from the runtime (if necessary), to provide functionality relating to core development tooling. The current set of plug-ins includes: Git, Java, CSS, Python, HTML, Ruby, XML, PHP, JavaScript, Browser Shell, and Maven.
- A default IDE that contains a structured workbench for organizing code repositories, projects, files, build integration, runtime/debugger integration, and deployment / publish workflows. This IDE is a package of integrated plug-ins that deliver a set of default tooling. This IDE is accessed through a browser or a set of RESTful Web Services that represent each of the tooling functions.
3.1 PLUG-IN ARCHITECTURE
The foundation of building an IDE is the ability to create, package, deploy and update a plug-in. Plug-ins are extensible in their own right, and can be layered such that one plug-in can invoke and extend another plug-in.
The SDK is designed as a two-tiered application with Web application and server-side application services. Both tiers are extensible and can be modified by third parties.
Plug-ins are authored using Java, GWT, and CDI. Injection and aspect-oriented programming are used throughout the interface system as techniques for extending plug-ins, connecting plug-ins together, and linking plug-ins into the IDE itself. The injection and interface system of CDI makes it possible to create a very clean, simple interface fronting a plug-in. GWT is used because it has a number of optimizations for generating standard, performant JavaScript code that can operate across a number of browsers.
This is a blank plug-in that adds a menu item to the IDE that changes state when selected. This plug-in can be compiled, tested, and validated in its own workflow. Injections are used to extend the constructor of the Java class, and these injections can be used to add in additional parameters that are passed into the Extension itself.
package com.codenvy.ide.extension.demo;

import com.codenvy.ide.api.editor.EditorAgent;
import com.codenvy.ide.api.extension.Extension;
import com.codenvy.ide.api.ui.action.ActionManager;
import com.codenvy.ide.api.ui.workspace.WorkspaceAgent;
import com.google.gwt.resources.client.ImageResource;
import com.google.gwt.user.client.Window;
import com.google.inject.Inject;
import com.google.inject.Singleton;

/**
 * Extension used to demonstrate the IDE 2.0 SDK features.
 *
 * @author <a href="mailto:nzamosenchuk@exoplatform.com">Nikolay Zamosenchuk</a>
 */
@Singleton
@Extension(title = "Demo extension", version = "3.0.0")
public class DemoExtension {

    // The collaborators are handed to the extension by the injector through this constructor.
    @Inject
    public DemoExtension(final WorkspaceAgent workspace, ActionManager actionManager, EditorAgent editorAgent) {
        // NOTE: menu, projectOpenedExpression, ExtendedCommand, and Expression are neither
        // declared nor imported in the listing as published; they are kept here unchanged.
        menu.addMenuItem("Project/Some Project Operation", new ExtendedCommand() {
            // The item is only shown in the context of an opened project.
            @Override
            public Expression inContext() {
                return projectOpenedExpression;
            }

            @Override
            public ImageResource getIcon() {
                return null; // no icon
            }

            @Override
            public void execute() {
                Window.alert("This is a test item. The item changes enable/disable state when something happens.");
            }

            @Override
            public Expression canExecute() {
                return null;
            }

            @Override
            public String getToolTip() {
                return null; // no tooltip
            }
        });
    }
}
3.2 PLUG-IN SERVICES & APIS
Plug-ins have a variety of services and APIs that they can use. There are three categories of services provided.
3.2.1 IDE CLIENT APIS
These are the typical APIs a developer would expect to find in a platform to help them create visual extensions to the client-side portion of the IDE itself. This includes packages for preferences, menus, views, helpers, wizards, toolbars, selections, editors, hotkeys, and so on. There is also a set of commons libraries that contain UI components and other utility functions.
There is also a set of top-level, standardized views that are provided with the default IDE. These views house the various perspectives and panels, and can be accessed directly by the plug-in itself. There are views for the console, editor, project explorer, menus, forms, shell, and wizards.
3.2.2 IDE SERVER SERVICES
In order for a plug-in to access a workspace operating within a cloud environment, there is a standard set of RESTful APIs that can be used to allow a plug-in to interact with the cloud development environment. The cloud manages the build cluster, the run cluster, the virtual file system where all files are persisted, the connection pool between Codenvy and outside/3rd party services, identity management server for managing user credentials, interfaces for billing/subscriptions, and other high compute functions that are better offloaded for processing within the cloud instead of being handled within the browser.
Codenvy’s platform is different from most cloud IDEs. A developer’s workspace is virtualized across different physical resources that are used to service different IDE functions. Dependency management, build, runners and code assistants can execute on different clusters of physical nodes. In order to properly virtualize access to all of these resources, we needed to implement a VFS that underpins the services and physical resources but also has a native understanding of IDE behavior.
An example of the offloaded services is refactoring. Since a refactoring command may need to alter the contents across many files, including renaming some and deleting others, handling refactoring as a cloud service instead of within the browser is much more effective. This capability is exposed as a set of RESTful services that can be directly accessed by the plug-in itself.
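From the plug-in side, offloading a refactoring amounts to describing the change and sending it to a server endpoint rather than touching files in the browser. The sketch below only builds such a request description. The URL layout, the JSON field names, and the class `RefactorRequest` are hypothetical and are not taken from the Codenvy API.

```java
import java.net.URI;

/** Hypothetical sketch of a plug-in side description of a server-side rename refactoring call. */
final class RefactorRequest {
    final String method = "POST";
    final URI uri;
    final String jsonBody;

    RefactorRequest(String baseUrl, String workspace, String project, String oldName, String newName) {
        // The path layout is an assumption for illustration, not a documented Codenvy route.
        this.uri = URI.create(baseUrl + "/" + workspace + "/refactor/rename");
        // Values are assumed to contain no characters that need JSON escaping.
        this.jsonBody = "{\"project\":\"" + project + "\","
                + "\"from\":\"" + oldName + "\","
                + "\"to\":\"" + newName + "\"}";
    }
}
```

The point of the shape is that the browser sends one small message, while the work of rewriting, renaming, and deleting files happens next to the virtual file system.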
3.3 PLUG-IN LIFECYCLE
A plug-in is a combination of code that instructs the server side and the client side to operate in a certain way. Since the client side is GWT, and the core IDE itself is also authored in GWT, a plug-in that is loaded into the system must be compiled with the rest of the IDE to generate a new IDE each time. The core IDE GWT code and the GWT extensions authored by the plug-in provider are compiled together into an integrated, optimized set of JavaScript that is the UI itself. The plug-in is packaged as a single JAR file that is combined with the IDE WAR during its creation phase.
It is possible to run the SDK runtime with a loaded set of plug-ins and then activate or de-activate which plug-ins are actively displayed to the user through configuration files. The configurability of plug-ins at boot time allows for some fine-tuned control over the memory and CPU imprint of the IDE as a whole.
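Boot-time activation of this kind can be pictured as a filter over the loaded plug-ins driven by a properties file. The following is a sketch under assumptions: the key format `plugin.<id>.active` and the class `PluginActivation` are invented for illustration, and the source does not describe the actual configuration file format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical sketch: decide at boot time which loaded plug-ins are shown to the user. */
final class PluginActivation {
    /** Returns the loaded plug-ins that the configuration marks active, preserving load order. */
    static List<String> activePlugins(List<String> loaded, String config) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(config));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> active = new ArrayList<>();
        for (String id : loaded) {
            // A plug-in with no entry is treated as active (an assumption of this sketch).
            String flag = props.getProperty("plugin." + id + ".active", "true");
            if (Boolean.parseBoolean(flag)) {
                active.add(id);
            }
        }
        return active;
    }
}
```

With a configuration containing `plugin.php.active=false`, a runtime that has loaded `git`, `java`, and `php` would present only `git` and `java`.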
ISVs and other 3rd party developers need ways to build and test the plug-ins that they write. There are three modes of plug-in deployment:
- Stand Alone. Plug-ins are authored on a dedicated machine, compiled, and then booted with a customized Codenvy runtime that is a WAR file. This is generally done on a desktop, but can also be done as a hosted service as well. There is a standardized plug-in project template that comes with an associated set of maven build commands to enable this packaging within the downloadable version of the Codenvy SDK.
- Hosted SDK. Located at codenvy-sdk.com, this is a shadow site running the Codenvy platform. It provides a basic IDE where plug-ins can be created, compiled, and executed. Executed plug-ins are loaded in an instantiated IDE. All of this takes place in a browser, and instantiated IDEs are running in a different process that is a Tomcat server. Essentially, developers are using a shadow Codenvy to create more Codenvy instances with one plug-in loaded to test behavior and functionality. Each "run" of the plug-in type causes another JVM to launch with a different IDE load configuration.
- Codenvy.com. We incorporate approved plug-ins into the production build of Codenvy.com, build a set of Selenium tests to automate the testing of the interfaces, and then activate for the community to make use of.
Currently, all loaded plug-ins are activated. We will eventually create a mechanism that allows named users at Codenvy.com to identify which plug-ins are active within their named workspace.
4.0 CODENVY.COM ARCHITECTURE
We use the engine that powers the platform to create Codenvy.com, an elastic, hosted, and supported environment. We envision Codenvy.com as a continuously living entity that is accessed by a variety of different roles. Because of this vision, Codenvy.com is primarily a system with different interfaces. Developers use browsers to make use of the system. Devops and ISVs have programmatic interfaces to access workflows and to create IDEs on demand, and internal audiences have configuration interfaces to look at analytics data, administer users/workspaces, and dictate how the system should operate while unattended.
4.1 CLIENT-SERVER COMMUNICATIONS
Browsers interact with Codenvy through WebSocket and HTTP REST connections. The connections are established when a workspace session is initiated. Communications between the browser and the server are limited, and restricted only to functions that require server access. Server access functions are related to accessing new files, periodic auto-save, build services, event logging, and runner services. There is no heartbeat, and the protocol has been optimized to minimize the network traffic.
We use WebSocket communications when we need an ongoing, two-way exchange. This happens most often during a collaboration session for multi-user editing and chat windows. Since WebSockets consume a larger amount of resources on the server, we use HTTP REST connections for any command that is one-way, or fire-and-forget, such as a user initiating a build request, a refactoring, submitting a PaaS deploy, or submitting a log event.
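The rule described above can be written down as a small lookup. This is an illustrative sketch only; the enum values and the class `TransportSelector` are hypothetical names that mirror the examples in the text.

```java
/** Hypothetical sketch: pick a transport based on whether the operation is an ongoing exchange. */
final class TransportSelector {
    enum Transport { WEBSOCKET, HTTP_REST }

    enum Operation { COLLABORATIVE_EDIT, CHAT, BUILD_REQUEST, REFACTORING, PAAS_DEPLOY, LOG_EVENT }

    static Transport transportFor(Operation op) {
        switch (op) {
            case COLLABORATIVE_EDIT:
            case CHAT:
                // Ongoing, two-way sessions justify the higher server-side cost of a WebSocket.
                return Transport.WEBSOCKET;
            default:
                // One-way, fire-and-forget commands go over plain HTTP REST.
                return Transport.HTTP_REST;
        }
    }
}
```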
4.2 TEMPORARY VS. NAMED WORKSPACES
Codenvy has the concept of a temporary workspace. A temporary workspace is a workspace that has a project, code repository, and some code, build, test, debug services, but structured into a quarantined zone such that work cannot be long-term persisted. Additionally, a temporary workspace will be destroyed if it is left idle for an extended period or if the browser is closed. In a temporary workspace, there are certain behaviors that must be repeated for each action, such as authenticating to an outside provider, since the system will not persist any of those credentials. A permanent workspace is bound to a user or an organization, and the projects inside it are persisted. Additionally, the system will persist account information and credentials to outside services, such as SSH keys to GitHub, and allow for public / private project functions.
Temporary workspaces behave similarly for both non-authenticated and authenticated users. For developers that only want to work in a temporary space, all factoried projects begin in a temporary workspace. For those that have a registered account, they can then copy the project into their permanent workspace. For those that do not have an account, they can execute the authentication process and the project will be copied after authentication is confirmed.
The architectural difference between a temporary workspace and a named workspace is whether persistent storage (LDAP) is used to store related account information. The temporary workspace has a full user profile; however, since that information is kept entirely in-memory, any destruction of the tenant itself will wipe the entire project space. We also place all temporary tenants into an isolated portion of the virtual file system to be able to run batch cleanup and reporting algorithms against those files.
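The in-memory nature of temporary tenants, together with destruction after an idle period, can be sketched as a map of last-activity timestamps that a periodic job sweeps. The class `TemporaryWorkspaces`, its method names, and the idle limit are assumptions for illustration; the source does not state how idleness is tracked or how long "extended" is.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical sketch: temporary workspaces tracked only in memory and reaped when idle. */
final class TemporaryWorkspaces {
    private final long idleLimitMillis;
    private final Map<String, Long> lastActivity = new HashMap<>();

    TemporaryWorkspaces(long idleLimitMillis) {
        this.idleLimitMillis = idleLimitMillis;
    }

    /** Records activity for a workspace, creating its entry on first use. */
    void touch(String workspaceId, long nowMillis) {
        lastActivity.put(workspaceId, nowMillis);
    }

    /** Destroys every workspace idle for longer than the limit; returns how many were destroyed. */
    int reap(long nowMillis) {
        int destroyed = 0;
        Iterator<Map.Entry<String, Long>> it = lastActivity.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > idleLimitMillis) {
                it.remove(); // nothing was persisted, so removing the entry wipes the tenant
                destroyed++;
            }
        }
        return destroyed;
    }

    boolean exists(String workspaceId) {
        return lastActivity.containsKey(workspaceId);
    }
}
```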
4.3 USER MANAGEMENT & AUTHENTICATION SERVICES
Users can be authenticated through either a form-based login or OAuth. We use Java Authentication and Authorization Service as the core technology to handle authentication. Authentication will happen any time a user attempts to access a protected resource, which is URL-referenced. Some URL references are unprotected (public) and others are protected (private). After authentication occurs, there is an attempt by the system to determine which workspace and project that user should gain access to. If the URL being referenced does not have an explicit workspace reference, then the user is re-routed to a different workspace and tenant selection algorithm.
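The access flow described above can be summarized as a three-way decision. This sketch is a simplification under assumptions: the names `AccessRouter` and `Decision` are hypothetical, and it assumes that an unauthenticated request for a public URL is served directly, which the text does not spell out.

```java
/** Hypothetical sketch of the routing decision made when a URL-referenced resource is requested. */
final class AccessRouter {
    enum Decision { SERVE, AUTHENTICATE, SELECT_WORKSPACE }

    /**
     * @param isProtected    whether the URL refers to a protected (private) resource
     * @param authenticated  whether the caller already has an authenticated session
     * @param workspaceInUrl the workspace named in the URL, or null if there is none
     */
    static Decision route(boolean isProtected, boolean authenticated, String workspaceInUrl) {
        if (isProtected && !authenticated) {
            return Decision.AUTHENTICATE; // form-based login or OAuth
        }
        if (authenticated && workspaceInUrl == null) {
            return Decision.SELECT_WORKSPACE; // fall back to workspace and tenant selection
        }
        return Decision.SERVE;
    }
}
```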
4.4 ANONYMOUS VS. NAMED USERS
A named user is one that has an associated account and is granted configurable permissions to access protected resources. An anonymous user is one who is accessing the system without any identifiable credentials that can be associated with an account. Anonymous users can exist in one of two scenarios:
- A user without a cookie initiates a Factory operation, placing them into a temporary workspace.
- A user gains access to a public project URL and enters the product through that URL in read-only mode.
The behavior of the system is different for anonymous users and named users. With anonymous users, administrators can configure which features of the product are available through a set of parameters in configuration files. Anonymous users must either create an account or authenticate (either outside the product, or within it) in order for any disabled capabilities to be re-activated. Anonymous users must also have all their traffic routed to an IDE cluster that is quarantined from other IDE clusters that handle named user traffic. This routing and isolation takes place using HAProxy and cloud administration IP that we have authored.
Architecturally, anonymous users have an automatically generated user name that is not persisted into LDAP. They are in-memory only. Both anonymous and named users have an associated set of permissions, and are members of groups. Anonymous users’ associated groups and permissions are pre-determined and cannot be altered.
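An anonymous user of this kind can be modeled as an object that lives only in memory, with a generated name and a fixed, read-only set of groups. The sketch below is an assumption-laden illustration: the `anonymous-` name prefix, the group name, and the class `AnonymousUser` are all hypothetical.

```java
import java.util.Collections;
import java.util.Set;
import java.util.UUID;

/** Hypothetical sketch: an in-memory anonymous user with a generated name and fixed groups. */
final class AnonymousUser {
    // Pre-determined membership; Collections.singleton returns an immutable set.
    private static final Set<String> FIXED_GROUPS = Collections.singleton("anonymous");

    final String name;

    private AnonymousUser(String name) {
        this.name = name;
    }

    /** Generates a user with a unique name; nothing is written to a persistent store. */
    static AnonymousUser create() {
        return new AnonymousUser("anonymous-" + UUID.randomUUID());
    }

    /** Any attempt to modify the returned set throws UnsupportedOperationException. */
    Set<String> groups() {
        return FIXED_GROUPS;
    }
}
```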
4.5 PUBLIC VS. PRIVATE PROJECTS
Codenvy supports the concept of public and private projects. A public project is one where the project space, or any of its files, is accessible by anyone who has access to the URL. We will eventually expose all of these public projects through a navigation directory on the Web site, but today it requires explicit awareness of the URL to gain access.
We use ACLs to control behavior for workspaces, projects and files. ACLs are set on our virtual file system for a workspace, each project and the associated set of files in subdirectories. We can propagate ACL properties through any subdirectory of a root. For workspaces that are designated as entirely public (such as with our Community plan), we make a standard set of ACLs at the root of the virtual file system and make it unchangeable. For private projects, we then make these ACLs modifiable. There are both groups and individuals that can have ACLs set and modified.
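One way to picture ACL propagation through subdirectories is a lookup that walks from a file up toward the root and applies the nearest ACL it finds. This is a sketch of that idea, not the Codenvy virtual file system: the class `VfsAcl`, the read-only permission model, and the `*` wildcard for "anyone" are assumptions, and inheritance at lookup time is substituted here for whatever propagation mechanism the real system uses.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: read ACLs set on directories and inherited by everything beneath them. */
final class VfsAcl {
    private final Map<String, Set<String>> readersByPath = new HashMap<>();

    /** Sets the principals (users or groups) allowed to read the given path and its subtree. */
    void setReaders(String path, Set<String> principals) {
        readersByPath.put(path, new HashSet<>(principals));
    }

    /** Walks from the path up to the root; the nearest explicit ACL decides. */
    boolean canRead(String path, String principal) {
        String current = path;
        while (true) {
            Set<String> readers = readersByPath.get(current);
            if (readers != null) {
                return readers.contains(principal) || readers.contains("*");
            }
            if (current.equals("/")) {
                return false; // no ACL anywhere on the way up: deny
            }
            int slash = current.lastIndexOf('/');
            current = slash <= 0 ? "/" : current.substring(0, slash);
        }
    }
}
```

Setting `*` on a public workspace root and a named principal on a private project root then gives every file below each root the expected behavior without per-file entries.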
4.6 INVITE VS. FACTORY
When a user is invited into another user’s workspace, they must have an appropriate key to gain access. For existing Codenvy users that are invited to a workspace, we automatically associate the invitation key from the invited user into the workspace of the inviter. For invites to email addresses that are not associated with Codenvy accounts, we generate a temporary key, store it along with their email in our LDAP repository, and then send an email to the user with the link & key. When the user clicks on the link, we apply the temporary key and the user is then granted access into the account that they were invited into.
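The invitation flow for an email address without a Codenvy account can be sketched as issuing a random key, storing it with the email, and consuming it when the link is clicked. In this illustration an in-memory map stands in for the LDAP repository, and the class `InviteKeys`, the use of a UUID as the key, and the single-use behavior are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch: temporary invitation keys stored alongside the invited email address. */
final class InviteKeys {
    // key -> invited email; a stand-in for the persistent repository described in the text
    private final Map<String, String> pending = new HashMap<>();

    /** Generates a temporary key for the email; the caller would embed it in the emailed link. */
    String invite(String email) {
        String key = UUID.randomUUID().toString();
        pending.put(key, email);
        return key;
    }

    /** Applies the key: returns the invited email and consumes the key, or null if it is unknown. */
    String redeem(String key) {
        return pending.remove(key);
    }
}
```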
For users that access a public workspace, a key is not required. We create an automatic access key that grants the anonymous or named user certain access rights to the workspace itself. These are generally read file rights, along with a limited number of project, build, and run functions.
For users that click on a Factory, we create a separate temporary workspace. Users are not invited to use a Factory. Any user with access to the Factory URL can make use of that Factory. If the user accessing a Factory is already authenticated, then we can recognize that and allow them to copy the contents into their existing workspace. If the user is not authenticated, we’ll create an anonymous user account during the temporary workspace session.
4.7 COLLABORATION MODE
When a file is opened, it can either be done with a standard editor or a collaboration editor.
A standard editor only operates in a single JVM and has no awareness of what is occurring in other JVMs. This means that the editor is only available to the user who explicitly opened it up, and it is not suitable for true collaboration mode. The standard editor can be launched for environments where there are weak connections or long ping times on the network which can affect WebSocket performance. With a standard editor, the amount of memory space consumed is controlled and users have a guarantee on saves, which are entirely manual.
A collaboration editor is the default editor type in the Codenvy platform. We have extended Google's Collide open source project to enable simultaneous editing of files by multiple users. The collaboration editor controls the locks of files in the virtual file system and then coordinates changes made by each user on the file in real time. The Collide editor asynchronously saves changes to the file on a periodic basis, and save events are propagated back to the end user's screen to make them aware of the persistence change. Additionally, the collaboration editor synchronizes events between the various users that are within the same editor / file combination at the same time.
4.8 MULTI-TENANCY
There are different tenancy models for the IDE cluster, the builder, and the runner.
- IDE Cluster. Within this cluster, we run a single JVM per node that consumes all of the available memory and compute available on the node. Within the JVM we use some in-house IP to enable multi-tenancy within the JVM itself. Each workspace is allocated its own thread, memory, and virtual file space. When a workspace is instantiated with its own IDE in a single JVM, we configure our router to map HTTP requests and WebSocket connections to the right IDE. The IDE itself is mapped to the appropriate directory within the virtual file system. In general, we are able to operate about 300 IDE tenants in a single JVM operating on a medium instance at AWS before HAProxy has to spin up another IDE server.
- Build Cluster. In the build cluster, there are a set of queues that front-end the cluster. There is a BuildManager service that sits in front of the queues that is accessible over RESTful Web Services. The BuildManager handles the message collection, ordering and processing. Build messages are routed to a queue based upon the incoming client context, such as paid / free account. There is an administration console that specifies how many nodes are allocated to a queue. We run one queue of processing for the Community tier (free), and then there is a dedicated queue for each Premium subscription (paid). Using this model, we can then allocate multiple hardware nodes to a single workspace, and the queue manager can load balance workspace requests across different build nodes. The client IDE periodically polls the builder assigned to its process to gather output and log files for display back in the browser itself.
- Runner Cluster. Similar to build clusters, there is a queue system that sits between the build and run clusters. Either the build cluster can place orders onto the runner cluster, or the commands can come directly from the IDE. Each node on the runner cluster operates a multi-tenant deployment of CloudFoundry. CloudFoundry cartridges are used to determine the environment configuration of the server that boots.
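The queue selection performed in front of the build cluster can be sketched as follows: one shared queue for the Community tier and one dedicated queue per Premium subscription. This is a minimal illustration, not the BuildManager service itself; the class `BuildManagerSketch`, the queue naming scheme, and the use of a null subscription ID to mean "free account" are assumptions.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch: route build requests to a queue based on the client's account context. */
final class BuildManagerSketch {
    static final String COMMUNITY_QUEUE = "community";

    private final Map<String, Queue<String>> queues = new HashMap<>();

    /**
     * Enqueues the request and returns the name of the queue it was routed to.
     *
     * @param premiumSubscriptionId the paid subscription, or null for a free (Community) account
     */
    String submit(String buildRequest, String premiumSubscriptionId) {
        String queueName = premiumSubscriptionId == null
                ? COMMUNITY_QUEUE
                : "premium-" + premiumSubscriptionId;
        queues.computeIfAbsent(queueName, k -> new ArrayDeque<>()).add(buildRequest);
        return queueName;
    }

    /** Number of build requests waiting in the named queue. */
    int backlog(String queueName) {
        Queue<String> q = queues.get(queueName);
        return q == null ? 0 : q.size();
    }
}
```

Keeping a separate queue per paid subscription is what allows an administrator to assign a different number of build nodes to each queue without free-tier traffic competing for them.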
In Part 2 of this article we will present the following Codenvy topics: the virtual file system used, how logging and analytics are implemented, hosted APIs, cluster management, the overall picture, the release model, and the Scrum process used for development.
About the Author
Tyler Jewell is CEO of Codenvy and a venture partner with Toba Capital where he focuses on developer investments. He sits on the board of Sauce Labs, WSO2, Exo Platform and Codenvy along with making investments in Cloudant, ZeroTurnaround, InfoQ, and AppHarbor.