Feedback from EU Cassandra Summit

For the second time in the same year, I was able to attend a major Cassandra event: the EU Summit in London in December 2014.

Again, the format was quite similar to the one in San Francisco held earlier in September: the first day was a training day and the second was reserved for conferences.

I The training day

For this summit I chose to attend the Cassandra/Spark training, and that was definitely the right choice. In the morning we had an introduction to the Spark architecture and to the installation, settings and tuning of DataStax Enterprise, which embeds Cassandra and Spark in a single product.

The trainer (Patrick Callaghan) presented in detail all the Spark configuration parameters (SPARK_MASTER_OPTS, SPARK_WORKER_OPTS, SPARK_WORKER_MEMORY, SPARK_WORKER_CORES …) that impact memory and CPU usage.

The morning finished with a crash course on the RDD concept, an introduction to the Cassandra-Spark connector and a hands-on exercise: starting the Spark master and workers, then connecting to the REPL and the default web interface as a basic health check.

The afternoon session started with a presentation of RDD lineage, the acyclic dependency graph, lazy evaluation and persistence strategies in Spark. To be honest, most of those topics are quite common, but Patrick pushed the explanation into detail by giving real examples of how and when persisting data with Spark can make a huge difference in terms of performance for your pipeline.

There was also a big chapter on Spark partitioning/shuffling strategies and how some operations like joins, grouping and sorting may result in shipping a lot of data over the network because they change the previous partitioning.

One striking example is the very common map() function. If you have a PairRDD (an RDD consisting of <key,value> pairs), using map() will remove any previous partitioner and may cause data shuffling further down the processing pipeline.

After the theory, the practice. The set of exercises was well designed and the difficulty level built up progressively.

In a nutshell, this training is really worth attending.

II The conference day

The Keynote

The first hour of the keynote, presented by Billy Bosworth, was essentially the same as the one in San Francisco, with different guests invited on stage. The second part was a small recap by Jonathan Ellis of the new features in 2.1 and some announcements of features coming in 3.0:

hints storage: hints are no longer stored in Cassandra tables but as plain append-only files, à la commitlog. This will improve hint delivery and avoid coordinators being overwhelmed by hints and the related compaction in case of multiple node failures

JSON syntax support for CQL: with 3.0 you can insert or retrieve data formatted as JSON, which greatly simplifies your day if you need to send data directly to third parties using JSON as an exchange format. It also opens the door to potentially interesting automatic object mapping leveraging existing libraries like the Jackson mapper
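Based on the announcement, the syntax should look roughly like the sketch below (the users table and its columns are made up for illustration):

```cql
-- Hypothetical table, illustrating the announced 3.0 JSON syntax
INSERT INTO users JSON '{"login": "jdoe", "name": "John Doe", "country": "UK"}';

SELECT JSON login, name, country FROM users WHERE login = 'jdoe';
```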

UDF: a few words about user-defined functions and their syntax. Jonathan did not expand much on the subject because there was a complete talk on it by Robert Stupp later

Global indexes: a long-awaited feature. Until now, secondary indexes are distributed locally, meaning that you suffer from the fan-out phenomenon when your cluster size grows. With global indexes the approach is more classic: the whole index partition sits on the same node. What about wide rows? Jonathan did not give any detail on that

The Conference

Lesson Learned — From SQL to Cassandra with Billions Contacts Data

I started the first conference session of the day as a speaker, with Brice Dutheil. Essentially this talk presented the work we did at Libon to migrate data from Oracle to Cassandra.

In the first part we presented the Libon business, the functional use cases and the need to migrate away from Oracle.

Then we introduced the 3-phase migration strategy, with a double run and zero downtime. We played with Cassandra timestamps to give live production updates a higher priority than the batch migration process.
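The trick can be sketched in CQL (hypothetical table and timestamp values): the migration batch writes with an explicitly low timestamp, so any live update, written with the current microsecond timestamp, always wins under Cassandra's last-write-wins conflict resolution.

```cql
-- Batch migration: write with a timestamp lower than any live write
INSERT INTO contacts (user_id, contact_id, name)
VALUES (1, 42, 'Alice from Oracle')
USING TIMESTAMP 1000;

-- Live production update: default (current) timestamp, always higher,
-- so it silently overrides the migrated value on conflict
INSERT INTO contacts (user_id, contact_id, name)
VALUES (1, 42, 'Alice updated live');
```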

We also explained in detail the strategy to confine code refactoring to the persistence layer only, leveraging the huge code coverage we have with existing unit and integration tests. The data model is designed to scale linearly with the growing number of users and contacts because we always take care to have user_id/contact_id as components of the partition key.
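A sketch of the principle, with a made-up schema (not the actual Libon one):

```cql
-- user_id in the partition key spreads data evenly across the cluster,
-- so the model scales linearly as users and contacts grow
CREATE TABLE contacts (
    user_id bigint,
    contact_id bigint,
    name text,
    phone text,
    PRIMARY KEY ((user_id), contact_id)
);
```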

The last part of the talk focused on tooling and data type optimization, with the usage of Achilles to optimize performance and simplify the coding.

If you missed this session, the video is here.

User Defined Functions in Cassandra 3.0

This was a very interesting talk presenting the sexy feature of user-defined functions (UDF) coming in Cassandra 3.0, given by Robert Stupp, the Cassandra committer who implemented the feature!

In 3.0 you will be able to define your own functions using various languages supported by the JVM. Java can be used of course, but also Scala, Groovy … You can code your UDFs in Javascript too, but Robert does not recommend it because it is less performant than plain Java. The Javascript code will be executed by the Nashorn engine shipped with Java 8, which makes me think that Java 8 may be required for Cassandra 3.0 …

A new CQL syntax has been introduced to let users push their UDFs to the server:

CREATE FUNCTION sin(value double)
RETURNS double
LANGUAGE javascript
AS 'Math.sin(value);';

Once pushed to the server, the code will be compiled and distributed to all nodes so that your UDF can be executed on any node. The UDFs can be used in the SELECT clause of your CQL queries:

SELECT sin(angle) FROM trigonometry WHERE partition = ...;

A simple UDF is used as a building block for aggregate functions. The aggregate declaration syntax is:

CREATE AGGREGATE minimum (int)
STYPE int //type of the state
SFUNC minState //state UDF called for each row

The initial state is set to null. For each fetched row (CQL row, not physical partition), the state function (minState here) is called and the state is updated with the value it returns.

You can further tune the initial state value and the final computation of the state with the extended syntax for aggregates:

CREATE AGGREGATE average (int)
STYPE tuple<int, bigint> //running (sum, count)
SFUNC averageState
FINALFUNC averageFinal
INITCOND (0, 0);
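Once declared, the aggregate can be used like a built-in function in queries (the table here is hypothetical):

```cql
SELECT average(temperature) FROM sensor_data WHERE sensor_id = 42;
```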

Robert recommended making your UDFs pure, meaning no side effects, no socket opening, no file access, because the UDF code is executed server-side and side effects may cause performance degradation or, worse, crash the JVM.

When asked about sandboxing, the answer was that right now there is no hard verification of the UDF code, because checking code for side effects is a complex problem in itself. One can think of a white list of allowed Java package imports, but it would be too restrictive: users sometimes want to import their own libraries in UDFs for custom behaviour.

Blacklisting packages or classes also proves to be hard, and not enough, because you can never be sure that the black list is comprehensive. Suppose I create my own class, MyFile, encapsulating the core file class: my custom class will completely bypass the black list and thus allow me to perform expensive side effects server-side…

Last but not least, Robert gave us a quick preview of what will be possible to achieve with UDFs.

The video of the talk is now online here.

Cassandra at Apple at Massive Scale

I missed this talk at the San Francisco summit, so I wanted to catch up. Apart from the impressive figures showing the number of Cassandra nodes deployed at Apple, the talk was very low-level, focusing mainly on the technical issues they encountered and how they fixed them and contributed the fixes back to the code base.

How was Cassandra used at Apple, and for which use cases? No information on that. I was quite disappointed by this talk, considering the initial hype. In Sankalp Kohli's defence (he was the speaker), the legal department probably imposed some restrictions on the internal content that could be publicly exposed.

The video is not available; legal restrictions again, I guess.

Hit the Turbo Button on Your Cassandra Application Performance

Yet another talk by the superstar Patrick McFadin. This time Patrick focused on the usage of the driver and some common mistakes:

  • prepare your statements only once; re-preparing the same statement many times is completely useless and hurts performance
  • for insert performance, use executeAsync()
  • batch statements are NOT for performance, they are for eventual atomicity

Many people get bitten by batches. Abusing batch statements is very bad because the job that is not done by the client application is delegated to the coordinator. For very large batches with different partition keys, the coordinator will keep the payload in its JVM heap and wait for all statements to be acknowledged by the different nodes before releasing the resources. This can lead to a chain of undesired events: heap pressure, long JVM garbage collections, node flapping, etc.

The only sensible case where a batch optimizes network traffic is when all the statements inside the batch have the same partition key, which is unlikely most of the time.
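Such a same-partition batch can be sketched in CQL as follows (the contacts table is hypothetical; every statement targets the partition user_id = 1, so a single replica set handles the whole batch):

```cql
BEGIN UNLOGGED BATCH
  INSERT INTO contacts (user_id, contact_id, name) VALUES (1, 10, 'Bob');
  INSERT INTO contacts (user_id, contact_id, name) VALUES (1, 11, 'Eve');
APPLY BATCH;
```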

As a counter-measure to bad usage of batches, some thresholds have been introduced in the code. Above a certain batch size, Cassandra will raise a warning message in the log; above another threshold, the batch will fail. Interestingly, the thresholds are not set on the number of statements in a batch but on the payload size.

Then Patrick exposed some performance improvements from the row cache refactoring in 2.1. The row cache can now keep the most recent cells in memory, which is well suited to time-series data models where you need to access recent data.

The talk finished with the annual rant about storage. People sometimes expect the impossible from rotational disks: with around 10 ms of access time at best, there is no way to push the performance of a node above some limit. SSDs, in contrast, have access times two orders of magnitude lower, around 70 microseconds for the best. In a nutshell, you get what you pay for; there is no magic.

The video of the talk is here.

Add a Bit of ACID to Cassandra

This was a quite unusual talk: the speakers presented their own fork of Cassandra, called C*One, that allows ACID transactions. Indeed, they drastically changed the masterless architecture to assign roles to Cassandra nodes. There are two roles: update servers and storage servers. In the classical master/slave architecture, the master role is given to one server; in C*One, each role is assigned to a data center instead, so there is one DC dedicated to transaction management and another one to storage.

All reads can target the storage DC directly, whereas upserts must go through the transaction DC. Consensus in this DC is achieved using QUORUM.

In addition, clients become fat because they know all about the topology and act as their own coordinators.

For the transactional part, Oleg Anastasyev explained in detail that a pessimistic locking mechanism is used in combination with an implementation of Lamport timestamps to guarantee temporal ordering of the cells. A transaction in C*One consists of several steps and can be rolled back.

The design seems innovative, though it suffers from the same limitation as all master/slave architectures: all writes must go through the transaction DC. However, the size of this DC can be increased. This is the trade-off they accept to pay for ACID transactions.

Right now C*One is not open-sourced. When I asked Oleg why they keep it closed source, he told me that they have designed a special infrastructure (network connections, server storage …) to make C*One work, and it may not be easy to replicate this architecture. In a word, C*One has been designed to meet their requirements closely and carries a lot of optimizations for their business logic; it may not be a good fit for another project.

The video of the talk is here

Spotify: How to Use Apache Cassandra

This was a very interesting talk explaining how Spotify pushes the usage of Cassandra internally, empowering the devs and giving them more responsibility. "You code it, you maintain it" is the motto. Until 2013, developers pushed their code into production while the SRE (site reliability engineering) teams were responsible for the production clusters and were on call. Since 2013, the developers themselves are involved in on-call duty and the SRE teams act as facilitators and Cassandra experts.

This approach is the right one because, unlike with traditional RDBMS, the Cassandra data model has a crucial impact on runtime performance, so the developers should be accountable for their data model design. And what better way to make them accountable than putting them on on-call duty?

Then the speaker, Jimmy Mårdell, explained how managing repair in a big cluster (100+ nodes) is a complex task. Fortunately those issues are mostly solved by incremental repair in Cassandra 2.1.

The end of the talk shed some light on the new DateTieredCompactionStrategy, which was developed internally at Spotify before being merged into the open-source code base. The explanation went into great detail and I found the pictures very telling; an example is worth a thousand words. If you're interested in how it is implemented, just start watching the video here

The video of the talk is here

Wrap up

Having attended the EU Summit last year, I can see a real difference with this edition. The number of attendees tripled, and people are no longer asking whether Cassandra is an appropriate choice; they are now asking how they can leverage Cassandra in their infrastructure. I've also met some interesting people from academia with new projects using Cassandra, such as the Geo Sun project at the University of Reunion (a French territory in the Indian Ocean). The idea is to have a network of weather sensors (temperature, pressure, …), save the raw data into Cassandra, and run data mining algorithms on it to provide predictions and consequently adjust the local electricity production.

This kind of project fits the sensor and IoT (Internet of Things) use cases, which are perfect for Cassandra. Coupled with Spark and the DataStax Cassandra/Spark connector, you have a powerful platform for big data ingestion and analytics.

Feedback from SF Cassandra Summit

After 2 days of a high-paced Cassandra Summit in San Francisco, it's time to sit back and give a little feedback on the event.

The first impression is that the conference was quite well organised. There were enough staff at the registration desk that the process went smoothly. It was a little bit crowded at rush hour, around 7:30–8 AM, but it did not look like a zoo with long queues and people pushing.

Content-wise, the summit was split into 2 days, the first dedicated to training and the second to conferences. There were training sessions for Cassandra fresh starters, for data modeling and for performance tuning; the 3 sessions covered a wide range of attendee requirements. The conference-day themes split into real production use cases for Cassandra, Cassandra-related technical talks and, again, performance tuning in production.

I The training day

Since I already have some knowledge of Cassandra, I attended the Advanced Performance Tuning session held by Aaron Morton, a Cassandra veteran. The training was quite interesting; he re-used some material from the previous year but added new chapters to cover new features. The slides were very well structured, starting with a definition of what performance is and what goals we want to achieve, before looking into the details of the metrics. He then showed the close relationship between latency and throughput.

Aaron is a good storyteller; every figure and metric was explained alongside a real production issue he had encountered. That made the training more interactive, with people asking questions to dig into particular topics. At the end of the day, he introduced a checklist on Cassandra performance tuning methodology, a very detailed document about what to look for, in which order, and which metrics to collect in order to troubleshoot Cassandra performance issues. This list is definitely a must-have.

II The conference day

The first keynote

The conference started at 8:15 AM sharp with Billy Bosworth, DataStax CEO, opening the show. He explained why, in this new digital era of always-online businesses, Cassandra's high availability and quick response times are the perfect match. He then displayed a Gaussian curve with 20% of legacy systems using traditional SQL solutions on the left, 20% of bleeding-edge NoSQL technologies used by early adopters and techies on the right, and the big belly of 60% in the middle consisting of running applications shifting to the NoSQL paradigm, explaining how Cassandra can and should address those.

Interestingly enough, he gave some examples of real production use cases powered by Cassandra. He invited Jef Ludwig, VP Engineering at Sony Network Entertainment, on stage to explain how DataStax helps them power the whole Sony Entertainment data platform. That was a very strong reference for Cassandra as a rock-solid OLTP solution that scales.

The next special guest invited on stage was even more surprising: Yi Li, CEO of Orbeus. This small Californian start-up is developing an awesome digital image recognition service. Yi did a quick live demo for the audience with her application, recognising the sex, age and mood of any face captured by her iPad camera, among other features. Truly amazing, although the demo was quite lengthy.

I think DataStax made a great move showcasing such start-ups. The message is clear: "You are a small start-up. If you enroll in our start-up program, you may get a free showcase at one of the biggest NoSQL conferences in the US". Nice, isn't it?

The tech keynote

The second keynote was presented by Jonathan Ellis, co-founder of DataStax and chairman of the Apache Cassandra project. He announced the fresh release of the long-awaited Cassandra 2.1 and went through the new features of this version.

First, for developers, the introduction of the new "User Defined Type" (UDT). It's basically a custom type you define statically to nest arbitrary types inside; you can even nest a UDT inside another UDT. The most common example is a user having an address UDT comprising street name, street number, state and zip code. With this new feature it'll be dead easy to save JSON messages inside Cassandra.
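The address example can be sketched in CQL (field and table names are mine, not from the keynote):

```cql
CREATE TYPE address (
    street_number int,
    street_name text,
    state text,
    zip_code text
);

-- In 2.1, a UDT column must be declared frozen (stored as a single blob)
CREATE TABLE users (
    login text PRIMARY KEY,
    name text,
    home_address frozen<address>
);
```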

The second interesting feature for developers is static columns. They are called static because they are defined on a clustered table (a table with a compound primary key) and relate to the partition key only: a static column is shared among all the clustered rows of a partition. To clarify the concept, think about a blog post with comments. The blog post title, author, creation date and content would be static columns, with the post id as the partition key; the clustered data are people's comments on the post, with the comment date as the clustering column.
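The blog-post example sketched in CQL (column names are mine):

```cql
CREATE TABLE blog_posts (
    post_id uuid,
    title text STATIC,        -- shared by all comments of the post
    author text STATIC,
    content text STATIC,
    comment_date timestamp,   -- clustering column: one row per comment
    comment text,
    PRIMARY KEY ((post_id), comment_date)
);
```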

On the performance side, Cassandra 2.1 is up to 50% faster than the previous version, thanks to some internal optimisations on memtables; there is a blog post explaining those perf improvements in detail. He also explained how Cassandra manages to deliver not only fast response times overall but consistently fast response times, even at the 99th percentile, quite an amazing optimisation. Last but not least, dynamic re-sampling of the partition key index used for dichotomic search will help optimise memory usage.

But the 2.1 release did not forget the ops. For them, a new optimisation of the repair process: Cassandra will now mark SSTables that are already repaired so they can be skipped during the next repair. Before, the repair time grew linearly with your data set; now it is proportional to the data creation rate, which is far more scalable. Another interesting improvement is the loading of new SSTable chunks created during compaction into the OS page cache. This helps avoid dips in read latency at the end of a compaction, thus helping to achieve the aforementioned consistently fast response times.

Counters have been redesigned in 2.1 to be more reliable and resilient to node failures. You will no longer get counter over-counts when a node resurrects after a failure and replays its commit log. More details can be found in this counter blog post.

The conferences

I mostly targeted conferences on performance tuning because I'm interested in the subject, so my feedback may be biased. Anyway, let's have a look at some of them:

TitanDB, Scaling Relationship Data and Analysis with Cassandra (speaker Matthias Broecheler): Matthias is the lead developer of TitanDB and presented the framework. The idea of TitanDB is to implement the TinkerPop specs on top of various data stores (Cassandra/HBase/BerkeleyDB…). What makes TitanDB stand out from the crowd, said Matthias, is its scalability compared to other graph datastores. He then presented some of the graph traversal API used in TitanDB to query data. It looks nice but is kind of complex to handle if you're not familiar with the TinkerPop stack. The talk then got into the details of how vertices and attributes are mapped to Cassandra. For complex graphs with high-cardinality vertices, TitanDB can partition them to make queries faster. Note that there is even a query engine inside TitanDB that will optimise the query plan for you.

Although the ideas and architecture behind TitanDB are very nice and appealing, I felt that the domain is somehow too complex and that the framework kind of "hides" this complexity, trying to optimise performance on behalf of the users. It resembles Hibernate's attempt to hide the complexity of SQL from developers, and I'm not sure it's the right path to choose. Nevertheless, having a framework that can scale on graphs (under which conditions scalability is guaranteed is another debate…) is nice

Lesser Known Features of Cassandra 2.0 and 2.1 (speaker Aaron Morton again!): this talk was more like a catalog of numerous small features not sufficiently highlighted in public talks. Aaron listed the most interesting ones:

  • new logging framework (Logback) for Cassandra to make the logging configuration easier and more dynamic (no need to restart the server!). See CASSANDRA-5883
  • new join_ring toggle to make a node go into hibernation when set to "false". It is useful when bringing a dead node back into the cluster after a long time, to avoid it serving stale data. Nice trick to know for ops. See CASSANDRA-6961
  • pluggable configuration loader. You no longer need to configure Cassandra with the cassandra.yaml file; you can now plug in your own config manager. I doubt the utility of this feature, although it's always nice to have the choice. See CASSANDRA-5045
  • CQL3 now supports column aliases, a nice-to-have feature especially useful when you need to give an alias to a function call like `writetime()` or `ttl()`. See CASSANDRA-5075
  • new min & max column names stored in the SSTable metadata to accelerate slice queries. It's an "old" feature of 2.0.x, but it's nice to know that slice queries can now benefit from it to skip unnecessary SSTables on disk. See CASSANDRA-5514
  • new tool "sstablelevelreset" to force LeveledCompaction tables to reset their level to 0 and thus re-compact all SSTables. Before, the trick was to remove the JSON manifest file. See CASSANDRA-5271
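The column aliases mentioned above can be sketched as follows (the users table is hypothetical):

```cql
SELECT login,
       writetime(name) AS name_written_at,
       ttl(name) AS name_ttl
FROM users WHERE login = 'jdoe';
```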

This talk was nice and technical, ideal for folks like me who know Cassandra internals quite well. I'm not sure beginners would appreciate it for what it is really worth

Real Data Models of Silicon Valley (speaker Patrick McFadin): this time, Patrick McFadin, Cassandra chief evangelist at DataStax, was on stage. As always he was very comfortable with a big audience and the presentation was very fluid. Patrick went into detail about the new "User Defined Type" (UDT) and also mentioned the new tuple type, which is just a variant of the UDT. He demonstrated what UDTs can bring to Cassandra modelling with a simple example of how one can model documents represented as JSON in Cassandra. It's amazing to see how a big document hierarchy can be neatly mapped into a CQL3 table using UDTs. He then explained the meaning of the "frozen" keyword and the practical reason for it to be there: backward compatibility. Patrick also gave a glimpse of UDTs in Cassandra 3.0, when they will be fully functional: you will be able to modify every field of a UDT atomically, whereas the 2.1 version of a UDT is basically just a blob.

As always with Patrick McFadin, the talk was a show, with lots of fun (a big troll on the frozen keyword) but still quite interesting and very technical

CQL Under the Hood (speaker: Robbie Strickland): I did not have the opportunity to attend this talk because it was scheduled at the same time as Patrick McFadin's, but I stumbled upon the slides. They are definitely worth reading and should be at the top of your reading list for effective data modelling in Cassandra

Performance Tuning Cassandra in AWS (speaker Randy Bliss): this talk illustrated how FamilySearch, a Mormon-backed website offering a genealogy service, leverages Cassandra on AWS. First Randy introduced the business use cases of FamilySearch, then he highlighted some tuning done on AWS to increase cluster performance. Among other things, switching to the TokenAware load balancing strategy on the driver reduced inter-node latency and offered better throughput. Similarly, they increased the number of threads dedicated to reads & writes up to 128 to achieve the desired performance. 128 seems a pretty high number compared to the default of 32, but for their use case it was a winning move

The talk was too focused on the FamilySearch business, with not enough perf tuning content for me, so I'm a little bit disappointed

Common Cassandra Performance Patterns Seen Through Histograms (speaker Chris Eniry): this tech talk is definitely a must-see for any Cassandra ops. Even though I already knew about Cassandra histograms, Chris shed light on new aspects of those figures, especially the fact that the displayed metrics are only a snapshot in time of the server status. To get a high-level overview, he recommended building heat maps by taking regular snapshots over a long period of time. He then showed the difference between cfhistograms, which exposes metrics at the table level, and proxyhistograms, which relate more to the cluster. Analyzing both histograms is interesting and gives you some hints about what could be wrong inside the cluster. Since histograms are temporal views, you should not rely only on them for performance investigation

A very good tech talk for perf tuning. Watch it once the Summit videos are online

Cassandra Doctor at Apple (speaker: Richard Low): Richard Low is a well-known Cassandra speaker, having been nominated MVP last year at the Cassandra EU Summit. He now works at Apple and exposed some performance issues he dealt with recently. He took a detective approach, first presenting the symptoms and consequences before digging further into performance and code analysis. The first issue was a client having high read/write latency. After looking at the perf metrics, it appeared that the latency resembled typical cross-DC latency. Richard found out that there was a bug in the Java driver that set "null" as the local data center in a system table, making the client select the wrong DC as its local DC. Another highlighted bug was a suspicious SSTable state after compaction: he noticed that some very old SSTables (having lower generation IDs in their names) were still hanging around after several compactions. Looking into the source code of Cassandra, he caught a nasty counter bug and filed a JIRA. Interestingly, the bug was fixed almost the same day. Open source power! Note that this talk, along with the Cassandra at Apple at Massive Scale talk I didn't attend, showed how extensively Apple uses Cassandra in their infrastructure. Go get the videos once they're online!

This talk is an incentive to learn more about Cassandra internals. You don't need to know in detail how every piece of the database works together, but having the big picture in mind does help a lot in narrowing down root causes. He urged people not to hesitate to look into the source code when they have a Java stack trace at hand; it can save your day. Personally, that's what I did yesterday when helping a customer, and looking into the source code definitely helped find the root cause

The lightning talks

The lightning talks session is a kind of tradition to wrap up the conference. The idea is dead simple: you have 5 minutes to pitch your idea/talk. The timer is displayed on the big screen in the background; after 5 minutes, the bell rings and you're out. Having had the chance to do a lightning talk last November at the Cassandra EU Summit in London, I must admit that time runs really, really fast when you're on stage.

This year, Christian Hasker, responsible for Apache Cassandra community development, led the show. There were some interesting talks. One was about Instaclustr, a Cassandra-in-the-cloud provider, proving that Apache Cassandra is now at the core of lots of new startup businesses. Patricia Gorla, this year's MVP, did a quick presentation of an SSTable generator, a worthy tool to generate raw SSTables from CQL3, really nice. Brian Lynch then introduced the new support for Cassandra on GCE. They experienced some issues configuring Cassandra on GCE but finally made it: now you can select and deploy a Cassandra cluster on GCE in one click, with all the perf tuning integrated. The last lightning talk I liked was the one by Apple, presenting how they architected Cassandra to serve as a distributed cache. Though technically interesting, I'm afraid it could give some folks the bad idea of using Cassandra as a caching solution without proper tuning.

All the extra

Apart from the main conference and training sessions, there were some interesting spots to visit during this Summit. The first was the Cassandra LIVE room, where you could drop by and hear people from Netflix, Sony, Instagram … talk about their experience using Cassandra in production. Another sweet spot was the Meet the Experts room, where you could just drop by and grab any expert there to answer your questions. The room was really crowded at rush hours; this was the place to be for techies.

Wrap up

This was my first Cassandra Summit in SF and it was intense: 2 days of a non-stop stream of technical info to ingest. The organisation was excellent, with no chaotic queues for food or registration. I've met a lot of interesting people and put a face on some of the folks I used to exchange with online. The Cassandra Summit is definitely worth attending, and I can't wait for the next Cassandra Summit Europe happening in London this fall.

ReCaptcha login form with Spring Security

Today I'll show you how to customize Spring Security to create a login form with ReCaptcha verification.

The captcha is based on Google ReCaptcha plugin. More info on ReCaptcha here

Read more of this post

Design Pattern: the Asynchronous Dispatcher

Today we'll revisit a common design pattern, the Observer, and one of its derivatives for multi-threaded processing: the Asynchronous Dispatcher.

Read more of this post

GWT JSON integration with Spring MVC

In the previous post, I presented GWT RPC integration with Spring. In this post we'll see how to achieve JSON backend service integration between GWT and Spring MVC. We'll re-use the same StockWatcher application and change the RPC communication into JSON requests.

Read more of this post

GWT RPC integration with Spring

Recently I started studying GWT, a new web framework, to broaden my skill set. The main idea of GWT is to let you code the GUI part in Java (with static compilation and type checking), then translate this code into Javascript.

For the backend, GWT relies on an RPC system using plain old servlets. Each RPC service is published as a distinct servlet, so you end up having as many servlets as there are distinct RPC services.

Not only is this approach suboptimal for server-side resource management, it also pollutes your web.xml with many servlet declarations.

Ideally we should have a unique servlet serving as a router, dispatching RPC calls to the appropriate service beans managed by Spring. Spring is an appropriate framework for managing business logic in the backend, thanks to its industry-wide adoption and its mature extension portfolio. Needless to say, this design can easily be adapted for JEE containers too.

Disclaimer: the code presented below was inspired by projects like spring4GWT and gwtrpc-spring. I took the same approach and modified the URL mapping part, so the original credits go to them.

Read more of this post

Java 8 Lambda in details part V : Functional interface definition and lambda expression implementation

In this last post we’ll look at the functional interface formal definition and the way lambda expressions are implemented.

Read more of this post