When entries in the jgroupsping table are not removed, clustering does not always work properly

Issue

  • Unicast clustering is configured between two Liferay 7.2 SP3 nodes, but clustering does not always work properly. The cluster returns to normal operation when the entries in the jgroupsping table are deleted. The following errors have been observed:
    2021-11-29 10:40:15.841 ERROR [TQ-Bundler-9,liferay-channel-control,liferay-77ff6c5695-9cg4s-32926][TCP:99] JGRP000034: liferay-77ff6c5695-9cg4s-32926: failure sending message to liferay-6bcd5d8f4d-k8cqq-32934: java.net.SocketTimeoutException: connect timed out
    2021-11-29 10:40:16.142 ERROR [TQ-Bundler-9,liferay-channel-control,liferay-77ff6c5695-9cg4s-32926][TCP:99] JGRP000034: liferay-77ff6c5695-9cg4s-32926: failure sending message to liferay-6bcd5d8f4d-k8cqq-32934: java.net.SocketTimeoutException: connect timed out
    2021-11-29 10:40:17.051 ERROR [TQ-Bundler-9,liferay-channel-transport-0,liferay-77ff6c5695-9cg4s-8287][TCP:99] JGRP000034: liferay-77ff6c5695-9cg4s-8287: failure sending message to liferay-6bcd5d8f4d-k8cqq-13644: java.net.SocketTimeoutException: connect timed out
    2021-11-29 10:40:17.447 INFO  [VERIFY_SUSPECT.TimerThread-22,liferay-channel-control,liferay-77ff6c5695-9cg4s-32926][JGroupsReceiver:93] Accepted view [liferay-77ff6c5695-9cg4s-32926|5] (2) [liferay-77ff6c5695-9cg4s-32926, liferay-77ff6c5695-km8x5-47418]
    2021-11-29 10:40:17.474 INFO  [default-2][ClusterSchedulerEngine:926] 25 MEMORY_CLUSTERED jobs started running on this node
    2021-11-29 10:40:17.737 INFO  [VERIFY_SUSPECT.TimerThread-17,liferay-channel-transport-0,liferay-77ff6c5695-9cg4s-8287][JGroupsReceiver:93] Accepted view [liferay-77ff6c5695-9cg4s-8287|5] (2) [liferay-77ff6c5695-9cg4s-8287, liferay-77ff6c5695-km8x5-5224]

Environment

  • Liferay DXP 7.2 SP3
  • Unicast Clustering
  • OpenShift

Resolution

  • The jgroupsping table still contains DB entries for members that crashed but were not removed for some reason. This is why the cluster resumes normal operation whenever users truncate the table.
  • JGroups appears to be failing to clean up that table. According to the following JGroups mailing list thread, the clear_table_on_view_change parameter appears to have been replaced with remove_all_data_on_view_change. Reference: https://sourceforge.net/p/javagroups/mailman/message/36479197/
    • Set clear_table_on_view_change to true in the JDBC_PING tag of jdbc_ping_config.xml on each cluster node, so that DB table entries for members that crashed but were not removed are cleared automatically.
    • More information is available in the JGroups documentation.
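
    As a rough illustration of the setting described above, the JDBC_PING tag in jdbc_ping_config.xml might look like the following sketch. The connection values are placeholders and are not taken from this case. clear_table_on_view_change is the attribute named in this article; whether the JGroups version bundled with your installation expects that name or remove_all_data_on_view_change is an assumption to verify against the JGroups documentation for that version.

    ```xml
    <!-- Sketch only: the bracketed connection values are placeholders. -->
    <!-- clear_table_on_view_change is the attribute this article recommends; -->
    <!-- confirm that your JGroups version still accepts this attribute name. -->
    <JDBC_PING
        connection_driver="[JDBC_DRIVER_CLASS]"
        connection_url="[JDBC_URL]"
        connection_username="[DB_USERNAME]"
        connection_password="[DB_PASSWORD]"
        clear_table_on_view_change="true"
    />
    ```

    The same change needs to be applied on every cluster node, and each node restarted, before stale rows are expected to be cleared on a view change.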

Additional Information

  • According to the support matrix, OpenShift is not supported with Liferay 7.2.
    However, the 'Global Service Team' channel might be able to assist in this regard.