I have a Kafka setup with three brokers in Kubernetes, set up according to the guide at https://github.com/Yolean/kubernetes-kafka. The following error message appears when producing messages from a Java client:
2018-06-06 11:15:44.103 ERROR 1 --- [ad | producer-1] o.s.k.support.LoggingProducerListener : Exception thrown when sending a message with key='null' and payload='[...redacted...]': org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for topicname-0: 30001 ms has passed since last append
The listeners are set up to allow SSL producers/consumers from the outside world:
    advertised.host.name = null
    advertised.listeners = OUTSIDE://kafka-0.mydomain.com:32400,PLAINTEXT://:9092
    advertised.port = null
    listener.security.protocol.map = PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL,OUTSIDE:SSL
    listeners = OUTSIDE://:9094,PLAINTEXT://:9092
    inter.broker.listener.name = PLAINTEXT
    host.name =
    port.name = 9092
The OUTSIDE listeners are listening on kafka-0.mydomain.com, kafka-1.mydomain.com, etc. The plaintext listeners are listening on any IP, since they are cluster-local to Kubernetes.
The producer settings:
    kafka:
      bootstrap-servers: kafka.mydomain.com:9092
      properties:
        security.protocol: SSL
      producer:
        batch-size: 100
        buffer-memory: 1048576 # 1MB
        retries: 1
        ssl:
          key-password: redacted
          keystore-location: file:/var/private/ssl/kafka.client.keystore.jks
          keystore-password: redacted
          truststore-location: file:/var/private/ssl/kafka.client.truststore.jks
          truststore-password: redacted
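For readers more familiar with the plain Java client than with Spring, the producer settings above should correspond roughly to the properties in the sketch below. The class name `ProducerSettingsSketch` is made up for illustration, and the `request.timeout.ms` entry is an assumption: it is not set anywhere in my configuration, but 30000 ms is the client default and looks like the most likely source of the "30001 ms" in the error. The sketch only builds and prints the properties; it does not connect to a broker.

```java
import java.util.Properties;

// Hypothetical helper: mirrors the Spring producer settings as plain Kafka client properties.
class ProducerSettingsSketch {

    static Properties build() {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", "kafka.mydomain.com:9092");
        p.setProperty("security.protocol", "SSL");
        p.setProperty("batch.size", "100");        // bytes, not a record count
        p.setProperty("buffer.memory", "1048576"); // 1MB
        p.setProperty("retries", "1");
        // The plain client expects a file system path here, without Spring's "file:" prefix.
        p.setProperty("ssl.keystore.location", "/var/private/ssl/kafka.client.keystore.jks");
        p.setProperty("ssl.keystore.password", "redacted");
        p.setProperty("ssl.key.password", "redacted");
        p.setProperty("ssl.truststore.location", "/var/private/ssl/kafka.client.truststore.jks");
        p.setProperty("ssl.truststore.password", "redacted");
        // Assumption: not set in the original config; 30000 is the client default.
        p.setProperty("request.timeout.ms", "30000");
        return p;
    }

    public static void main(String[] args) {
        // Print the effective settings so they can be compared with the producer's startup log.
        build().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Comparing this output with the `ProducerConfig values` block the producer logs at startup is one way to check that the Spring settings are actually applied as intended.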
- The errors started appearing when the broker was moved to SSL.
- On the server side everything is running as expected: there are no errors in the log, and I can connect to the broker manually with a Kafka client tool.
- The errors appear intermittently: sometimes it sends 30+ messages per second, sometimes it sends nothing at all. It may work like a charm for hours and then just spam timeouts for a little while.
What could it be?