Patching a Liferay Cluster With No Down Time

This article describes the steps customers can follow to patch the Liferay instances in a cluster without taking the cluster environment down. This method, known as a rolling change, has limitations: DO NOT use it if the fix pack or hotfix changes the database schema or makes significant API changes to cached objects.

It is also important to note that when starting the patched nodes, make sure each node is fully started before starting the next one. The best practice is to start the nodes serially and verify that additional Cluster Link messages are being generated. By design, Liferay Portal requires some delay between the start of each node; for example, the Quartz scheduler needs to elect a master in order to determine which node is responsible for running scheduled tasks. The safest way to start the nodes in a cluster is therefore to wait until the previous node has started up completely.
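
A minimal sketch of this serial startup is shown below, assuming a simple Python script run from an administration host. The node list, the SSH-based start command, and the health-check URL (the portal login page on port 8080) are all placeholders; substitute whatever startup mechanism and status page apply to your environment.

    import subprocess
    import time
    import urllib.request

    # Placeholder values: replace the hosts, start command, and health URL
    # with the ones used in your own environment.
    NODES = ["node1.example.com", "node2.example.com"]
    START_CMD = "/opt/liferay/tomcat/bin/startup.sh"

    def wait_until_up(host, timeout=900, interval=15):
        """Poll the node's login page until it answers with HTTP 200 or the timeout expires."""
        url = "http://%s:8080/c/portal/login" % host
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                if urllib.request.urlopen(url, timeout=10).status == 200:
                    return True
            except OSError:
                pass
            time.sleep(interval)
        return False

    for host in NODES:
        # Start this node (here via SSH), then block until it is fully up
        # before the loop moves on to the next node.
        subprocess.run(["ssh", host, START_CMD], check=True)
        if not wait_until_up(host):
            raise RuntimeError("Node %s did not start within the expected time" % host)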

Resolution

Important Note:

The instructions below assume a load balancer is used to manage the cluster. Each Liferay customer will have a different number of nodes in the cluster, so adjust the instructions accordingly. Note that because the load balancer does not distinguish the patched nodes from the others, users may still experience the bug until all nodes have been patched.

Limitations

Do NOT use the rolling change process if:

  • Upgrading the JDK
  • Upgrading the OS

Use the rolling change process only for: 

  • Applying a fix pack or hotfix that does not make any changes to the database.
  • Applying a fix pack or hotfix that does not contain significant core changes, in the sense that the differences between the two patch levels (check by using Patching Tool v18 and v19) contain only .jsp* or static resource changes (e.g., images, HTML), with no Java/class file modifications or library (.jar) upgrades. One way to verify this is shown in the sketch after this list.
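
One way to verify this condition is to compare a pre-patch backup of the deployed web application with the patched tree (for example, from a test environment) and look at which file types changed. The sketch below assumes such a comparison between two local directory paths, both of which are placeholders; it is not a substitute for reviewing the fix pack documentation.

    import filecmp
    import os

    # Placeholder paths: a backup of the web application taken before patching
    # and the patched tree from a test environment.
    OLD_TREE = "/backup/ROOT"
    NEW_TREE = "/opt/liferay/tomcat/webapps/ROOT"

    # Extensions that indicate core/library changes and rule out a rolling change.
    RISKY_EXTENSIONS = {".class", ".jar", ".java"}

    def changed_files(old_dir, new_dir):
        """Yield relative paths that differ between, or exist in only one of, the two trees."""
        stack = [("", filecmp.dircmp(old_dir, new_dir))]
        while stack:
            prefix, cmp_obj = stack.pop()
            for name in cmp_obj.diff_files + cmp_obj.left_only + cmp_obj.right_only:
                yield os.path.join(prefix, name)
            for name, sub in cmp_obj.subdirs.items():
                stack.append((os.path.join(prefix, name), sub))

    risky = [path for path in changed_files(OLD_TREE, NEW_TREE)
             if os.path.splitext(path)[1].lower() in RISKY_EXTENSIONS]

    if risky:
        print("Java/class/library changes detected -- do NOT use a rolling change:")
        for path in risky:
            print("  " + path)
    else:
        print("Only JSP/static resource changes detected; a rolling change may be possible.")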

Instructions

  1. Remove one or more nodes from the load balancer so that they no longer receive new connections. As stated above, the number of nodes to remove at a time depends on your needs (a consolidated sketch of the full loop appears after this list).
  2. Wait until all connections have been terminated.
  3. Stop the Liferay instances on the terminated nodes.
  4. Patch the Liferay instances on these nodes.
  5. Start the Liferay instances on these nodes one at a time. As stated above, one primary reason is that Liferay Portal's Quartz scheduler requires a delay between node startups in order to elect a master.
  6. Once each Liferay instance has been started, add the nodes back to the load balancer.
  7. Repeat the steps above for all remaining nodes.
  8. Once all the nodes have been added back to the load balancer, clear the cache. To clear the cache, navigate to Admin > Control Panel > Server Administration. In the Resources tab, click Execute for each action needed to clear the caches.
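
The sketch below ties the steps above together in one loop, assuming the same Python script approach. Every command in it is a placeholder, including the lb-admin load balancer CLI, the SSH transport, the Tomcat shutdown/startup scripts, the Patching Tool invocation, and the wait times, so treat it as an outline of the procedure rather than something to run as-is. The final cache-clearing step is performed in the Control Panel, as described in step 8.

    import subprocess
    import time

    NODES = ["node1.example.com", "node2.example.com", "node3.example.com"]
    BATCH_SIZE = 1        # how many nodes to take out of rotation at a time
    DRAIN_SECONDS = 300   # how long to wait for existing connections to finish

    def run_on(host, command):
        """Run a command on a node (placeholder: SSH transport)."""
        subprocess.run(["ssh", host, command], check=True)

    def set_in_rotation(host, enabled):
        """Placeholder: use your load balancer's own API or CLI here."""
        subprocess.run(["lb-admin", "enable" if enabled else "disable", host], check=True)

    for i in range(0, len(NODES), BATCH_SIZE):
        batch = NODES[i:i + BATCH_SIZE]
        for host in batch:
            set_in_rotation(host, False)      # step 1: remove from the load balancer
        time.sleep(DRAIN_SECONDS)             # step 2: wait for connections to drain
        for host in batch:
            run_on(host, "/opt/liferay/tomcat/bin/shutdown.sh")                  # step 3: stop
            run_on(host, "/opt/liferay/patching-tool/patching-tool.sh install")  # step 4: patch
            run_on(host, "/opt/liferay/tomcat/bin/startup.sh")                   # step 5: start
            time.sleep(120)                   # wait for a full startup before continuing
            set_in_rotation(host, True)       # step 6: add back to the load balancer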
