With the recent release of code generation tools such as the Spring Roo framework from SpringSource, Skyway Builder Community Edition version 6.3, and Blu Age's M2Spring, there is a renewed focus on the role of code generation in developing enterprise Java applications.
Model Driven Development (MDD), which includes code generation, is gaining more interest now because several artifacts in a typical Java application can be auto-generated. A typical Java web application includes artifacts such as Data Access Object (DAO) classes, XML-to-Java mapping files, and Spring and Log4J configuration files, all of which are good candidates for auto-generation.
Roo is a round-trip code generator framework that generates most of the artifacts required in Spring-based web applications. It provides a command-line shell with tab completion, context-aware operations, and command hinting. It also constructs Spring applications in a standard directory layout, manages the build configuration files, helps developers create domain objects, and provides automatic web-tier generation for REST-based web user interfaces.
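As an illustration, a typical Roo session builds an application skeleton from a handful of shell commands. The transcript below is approximate, based on the Roo 1.x shell; exact command names and options may differ, and the shell's tab completion and hinting show what is actually available:

```
roo> project --topLevelPackage com.example.shop
roo> persistence setup --provider HIBERNATE --database HYPERSONIC_IN_MEMORY
roo> entity --class ~.domain.Product
roo> field string --fieldName name
roo> controller all --package ~.web
```

Each command incrementally updates the project on disk, and the generated web tier scaffolds CRUD screens for the entity.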
Skyway Builder Community Edition version 6.3 was released last month. The new version supports Spring MVC scaffolding for generating Spring-based Java CRUD applications from new or pre-existing domain models. It integrates with the Spring DSL and can generate code for Spring MVC and Spring Web Flow solutions. Skyway Builder Enterprise Edition (EE) 6.3, the commercial edition of Skyway Builder, integrates with IBM Rational Software Architect to transform UML into working Spring applications. It also offers DWR (JavaScript/JSON) support for developing RIA applications using Spring services, and project-level customization of code generation templates using JET technology.
IBM's MDD tool, called Rational Rhapsody, supports UML2 and SysML, requirements traceability, application code generation, and design for testing (DFT) features. Rhapsody is a round-trip model driven solution and supports capturing the project requirements using requirements diagrams, use-case diagrams, sequence diagrams, activity diagrams and state charts. Users then create traceability links from the model to the requirements, automatically providing traceability, impact analysis and coverage documentation. It also supports model-driven testing (MDT) which is a new paradigm that brings the benefits of MDD to the testing process. MDT allows engineers to iteratively simulate a design to locate errors early in the process, automate tedious testing, incorporate requirements-based testing to validate the design against requirements, and use IBM Rational Rhapsody Automatic Test Generation Add On capabilities to automatically create coverage tests from the design.
Blu Age recently joined the group of code generation tool providers with the release of their M2Spring product. M2Spring combines MagicDraw UML with Blu Age Agile Model Transformation to model and automatically generate application code based on the Spring architecture. It generates Spring-based web application classes and artifacts in the Service (business rules, application services, and web services), Presentation (user interface, user roles, and security policies), and Persistence (business objects, DAO implementations, DAO finders) layers. M2Spring supports several modeling and JEE technologies, including UML 2.2, OCL 2.0, XMI 2.1, EMF UML2 2.x XMI, Struts, Spring, and Hibernate.
Other Java development tools that support code generation are Lombok and Spoon. Project Lombok offers features such as auto-generation of default getter/setter methods, automatic resource management (using the @Cleanup annotation), and annotation-driven exception handling.
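To make the Lombok example concrete, the sketch below shows hand-written Java that is roughly equivalent to what Lombok's @Getter/@Setter and @Cleanup annotations expand to at compile time. The class and field names are illustrative, not taken from Lombok's documentation:

```java
import java.io.IOException;
import java.io.InputStream;

public class LombokEquivalent {
    // With Lombok you would write:  @Getter @Setter private String name;
    private String name;

    // What @Getter/@Setter would generate:
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    // With Lombok:  @Cleanup InputStream in = ...;  followed by in.read();
    // @Cleanup expands to a try/finally that closes the resource:
    public static int readFirstByte(InputStream in) throws IOException {
        try {
            return in.read();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) {
        LombokEquivalent bean = new LombokEquivalent();
        bean.setName("demo");
        System.out.println(bean.getName());
    }
}
```

The point of the tool is that the boilerplate above disappears from the source file while still existing in the compiled class.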
InfoQ spoke with Ben Alex, Spring Roo project lead, about the role of code generation in Java application development. He said that most developers use code generation every day, whether they're using Eclipse's "generate getters/setters" feature or perhaps copying a fragment of working code from elsewhere. The key motivation is to implement a solution to a particular requirement as quickly as possible and avoid having to rediscover the best approach every time. Modern-day code generators embody these same motivations, and simply expand the scope of what you can do beyond "generate getters/setters" to a more productive level - such as building multiple layers of the application. And just as with "generate getters/setters", modern-day generators are easy to use, emit code that is equivalent to what you'd have written yourself, and allow you to modify the generated code without thinking about it.
What should the software architects and developers look for in a Code Generation framework?
Developers should evaluate code generators much as they would evaluate a tool like an IDE. I'd ask questions like: will this software make me considerably more productive? Is it easy to use and learn? Is this software allowed to be used in my organisation? Am I locked in, or can I stop using it easily? Is this software likely to be actively maintained? Is there a community around this software? Does it seem of high quality (number of unresolved defects, end-user blogs, architectural clarity, etc)? Is there an easy way to add extra functionality via plugins, add-ons, etc?
Beyond questions you'd use for an IDE, you should ask some questions specific to code generators: Will this code generator incrementally and automatically maintain my code in the long-term (ie what round-trip support does it offer)? Is the way I control the generator appealing given my knowledge, skills and experience? Does it work well with my preferred IDE? Will it work with new versions of my IDE or might it break if I install a new version of the generator or my IDE? Is the code it generates natural, clean, consistent and efficient? Is the usage approach flexible enough to let me work in my natural way? Does it work automatically or do I need to remember to do something? Will new versions of the code generator (or its plug-ins or add-ons) potentially break my project?
In most applications, the developers need to hand-code business logic and validation rules that are not easy to auto-generate. How do the code generation tools address this requirement?
Hunt and Thomas, in The Pragmatic Programmer, described the two main types of generators: passive and active.
Passive generation is run once to produce a result that is subsequently maintained by hand. Many developers have used passive generators via IDE gestures like "generate getters/setters". The output of a passive generator is determined by the developer indicating what they would like generated when they dispatch the generation command. The trouble with passive generators is that you must maintain the code by hand forever more, as a passive generator will not update the generated code. This is a particular problem if you change something the generated code relies on, as the generated code will break and users will generally delete it and re-run the generator.
Active generators, on the other hand, produce a result every time they are run. Active generators require a way of storing control information so they can execute later and produce the required result. The control information techniques vary considerably, but are generally a combination of automatically-derived metadata (eg from source code parsing and binding) together with user preferences defined via Java annotations, JavaDoc tags, XML files, special GUI configuration systems etc. Quality active generators (like Roo) allow a developer to simply write their custom logic in a standard source file as usual and the generator will detect this and automatically update just the files which are affected by any changes. There is no need to tell systems like Roo you'd like to write custom code, as this is an expected and normal part of using the system.
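The active-generation model described above can be sketched in a few lines of Java. This is a hypothetical, minimal generator (not Roo's actual implementation): each run re-derives the accessor source from the control information, so regenerating after a change simply replaces the stale output instead of leaving it to be maintained by hand:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AccessorGenerator {
    // Control information: field name -> type. In a real active generator
    // this would come from source parsing, annotations, or an XML descriptor.
    public static String generate(String className, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("public class " + className + " {\n");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            String name = f.getKey();
            String type = f.getValue();
            String cap = Character.toUpperCase(name.charAt(0)) + name.substring(1);
            sb.append("    private ").append(type).append(' ').append(name).append(";\n");
            sb.append("    public ").append(type).append(" get").append(cap)
              .append("() { return ").append(name).append("; }\n");
            sb.append("    public void set").append(cap).append('(').append(type)
              .append(' ').append(name).append(") { this.").append(name)
              .append(" = ").append(name).append("; }\n");
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "String");
        System.out.println(generate("Customer", fields));
    }
}
```

Adding or renaming a field in the control map and re-running `generate` yields a fresh, consistent class; a passive generator would instead leave the previously emitted code untouched.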
How do you enforce good architectural and design practices in a scenario where both auto generation and manual coding are part of the application development process?
Many modern generators are used to produce the initial application skeleton. This means the generated applications reflect whatever best practices are embodied in the generator. These best practices are then continuously encouraged via the normal operation of the generator, as any further code it generates for the user will reflect those same architectural patterns. Such consistency is a major bonus for organisations with teams of differing skill sets or projects that will require long-term maintenance.
Naturally one of the key questions a prospective user should ask is whether the best practices embodied in the generator really do represent best practices. Generators used by multiple users and across multiple projects are usually the best, as a lot of design feedback would have been incorporated into such generators. Prospective users should also look at the actual architecture emitted and critically review it for testability, maintainability, design integrity, duplication, lock-in avoidance etc. Additionally, most quality generators will have the architecture they generate publicly certified by some reputed organisation (eg SpringSource certifies Roo architecture as best practice).
How do other approaches like Grails (where most of the code is generated behind-the-scenes at execution time) compare with a traditional code generation framework?
Grails, Rails, Django and so on offer a passive generator facility that stubs the initial application skeleton and specific artifacts like web controllers. They then perform at runtime (via a dynamic language or reflective techniques) what would have been performed by an active generator at development time.
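The contrast can be sketched in plain Java: a development-time generator emits a concrete accessor into the compiled class, while a Grails-style framework resolves the same property reflectively at runtime. The classes and names below are illustrative only, not how any of these frameworks is actually implemented:

```java
import java.lang.reflect.Method;

public class RuntimeVsGenerated {
    public static class Book {
        private final String title;
        public Book(String title) { this.title = title; }
        // Development-time path: this accessor exists in the compiled class,
        // as an active generator would have emitted it.
        public String getTitle() { return title; }
    }

    // Runtime path: resolve the accessor reflectively on each call, roughly
    // what a dynamic framework does when no generated code is present.
    public static Object readProperty(Object bean, String property) throws Exception {
        String getter = "get"
                + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        Method m = bean.getClass().getMethod(getter);
        return m.invoke(bean);
    }

    public static void main(String[] args) throws Exception {
        Book b = new Book("Refactoring");
        System.out.println(b.getTitle());             // compile-time path
        System.out.println(readProperty(b, "title")); // runtime path
    }
}
```

Both paths produce the same answer; the trade-off is between compile-time checking and tooling support on one side and runtime flexibility on the other.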
There are well-documented advantages and disadvantages to static language compile-time techniques versus dynamic language runtime techniques. These are fundamentally different development paradigms and neither paradigm is superior for every single project. SpringSource has both Roo and Grails so end users can choose the paradigm they prefer and still enjoy tremendous productivity and the same quality Spring underpinnings.
What do you see as the future of Model Driven Software Development (MDSD) in general and Code Generation in particular?
I believe the objective of finding higher levels of programming abstraction will continue, but the current trend (as measured by the number of new frameworks and people adopting them) appears to favour programming language innovation and DSLs. A major point of difference between generation tools remains their DSL approach, with some featuring simple commands (like "create-controller") and others offering advanced shells (with tab completion, contextual awareness, command hiding etc). These more advanced shells not only lower the training requirements for users, but they also simplify the use of increasingly powerful (and higher level of abstraction) commands.
Given the above, code generation will continue to play an important role in software development. Static languages still dominate the overwhelming majority of language use (source: Tiobe) and such languages depend on code generation techniques for programmer productivity. Even frameworks that specialise in dynamic languages still use code generation for areas like the initial application structure, key application artifacts and so on. Trends like higher-level DSLs, more advanced and usable shells, innovative compilation unit separation and very high quality round-trip support are underpinning the next phase of code generation evolution.
InfoQ also spoke with Jack Kennedy from the Skyway team about code generation in software development. Responding to a question on the role of code generation in projects using Agile and Lean software development methodologies like Scrum, XP, or Kanban, Jack said:
Regardless of the development methodology used, pattern-based code generation and automation techniques influence all phases of the software development lifecycle (analysis/design/build/test/deploy). Code generation would start with decorating the conceptual elements in analysis, carrying those through to more specificity in pattern-based design, and finally, transforming the design into pattern-based code and test units. For agile and lean software development methodologies, the motivation for using pattern-based code generation and automation techniques is significant. Although quite successful for relatively small projects, a key challenge for the wider adoption of these development methodologies within the software industry is the ability to scale to large, distributed, and complex delivery requirements. Most of the concern centers on the need for highly skilled developers and the perceived lack of predictable large work effort estimations. Introducing pattern-based code generation and automation techniques into the full development lifecycle provides organizations the opportunity to infuse junior-level developers into the mix and reduces the guesswork with proven and repeatable effort estimation for generation components. As modern methodologies call for continuous integration and demonstrable development iterations, it becomes even more crucial to generate functioning applications with minimal time investments and consistent quality.
Scaffolding helps with time, effort, and code quality as most of the application artifacts (like web, Log4J and Spring configuration files) are auto-generated. Is this where we get more value out of a code generation tool?
Fundamentally, a software generation platform provides value by automating development tasks while conforming to the implementation requirements of the target deployment environment. Implicit in this approach are the value propositions of consistency, quality, and reuse. This value can be measured by comparing the "cost" of the input requirements to the system with the "value" of the functional outputs of the system. As in all things, the goal is to minimize the cost and to maximize the value. In the past, code generators were limited to generating one artifact for each "concept" that was input to the generation system. With regards to data modeling, this generally meant that the data model could be created using one of a variety of formats, and that the "Class" implementation of the data model could be generated. Many UML solutions provide this level of generation exclusively. In this case however, many teams did not find enough value in the output to warrant the investment in the input.
As the implementation technology for creating enterprise applications has simplified and the software generation space has matured, the scope of functionality that can be automatically generated has increased tremendously. The ability to bootstrap and generate fully functional applications that offer CRUD-based capabilities has created a large enough value proposition to move many developers to adopt modern RAD environments and tooling that include the use of DSLs and/or generation technologies. This was clearly a tipping point for generation, but it does not represent the end state or final estimation of value to be realized. As the industry moves to offer support for a wider set of functional generation options and input formats, we will see a larger number of developers moving to adopt generative techniques.
What do you see as the future of Model Driven Development and Code Generation?
The primary focus for the near future will likely be the bridging of the gaps that exist between traditional MDD approaches/tooling and the emerging DSLs for rapid application development. The DSLs that are currently in use will continue to become more powerful and sophisticated, incorporating security, validation, and application flow. Generative technologies will move well past the simple "code generator" stage and will become a standard feature of the build and development processes. The options for the types of software assets that can be derived from UML and DSL investments will continue to grow and should include options for generating a greater percentage of application code across a wider set of technology stacks.
DSLs will continue to gain adoption as developers find greater value in the functionality that can be generated in an automated way from these targeted, simplified syntaxes. The tooling options for working with these DSLs will become more sophisticated and will provide many different input styles and formats including textual and visual environments. We would expect that eventually software vendors will compete on their ability to generate and run optimized deployments of DSL-based applications.