We are using ipvs for L4 load balancing; it forwards packets to the L7 backends in IPIP tunnel mode.
There are three ipvs systems, each configured with source hashing for persistence. Sometimes ipvs sends packets to the wrong backend.
For example, ipvs 1 receives a packet from client 22.214.171.124 and sends it to backend realserver 1. The same client’s next packet is received by ipvs 2, which sends it to backend realserver 2. Realserver 2 knows nothing about this connection, because it was initiated with realserver 1, so realserver 2 ends the connection with an RST packet.
This is not limited to one particular client; all clients show the same behavior.
To my understanding, all of the L4 ipvs systems should pick the same real server for a given client because of the source hashing algorithm.
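
To make that expectation concrete, here is a rough Python model of source hashing. It is a sketch, not the kernel's code: the table size, the multiplicative hash, and the names `build_table` and `pick` are assumptions chosen for illustration. The point it shows is that a bucket table filled from the real server list gives the same answer on every director only if every director holds the same list, in the same order.

```python
TABLE_SIZE = 256  # number of hash buckets (assumed for this model)


def build_table(real_servers):
    """Fill the buckets round-robin from the server list, in list order."""
    return [real_servers[i % len(real_servers)] for i in range(TABLE_SIZE)]


def pick(table, client_ip):
    """Map a client IPv4 address to a bucket using a stand-in hash."""
    a, b, c, d = (int(part) for part in client_ip.split("."))
    addr = (a << 24) | (b << 16) | (c << 8) | d
    bucket = (addr * 2654435761) % TABLE_SIZE  # illustrative hash only
    return table[bucket]


if __name__ == "__main__":
    client = "198.51.100.10"  # documentation address, hypothetical client
    director_1 = build_table(["rs1", "rs2", "rs3"])
    director_2 = build_table(["rs2", "rs1", "rs3"])  # same servers, other order
    print("director 1 picks", pick(director_1, client))
    print("director 2 picks", pick(director_2, client))
```

In this model the two directors disagree for some clients as soon as the order, membership, or weights of their real server lists differ, for example when a health check removes a server on one director only.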
I built an identical setup in the lab but couldn’t reproduce the problem. The setup with the issue is in production, so I cannot make any large changes to it for debugging purposes.
Keepalived is used to manage the ipvs.
Any directions on how to debug this issue with minimal impact would be really helpful.
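
One read-only check that should not disturb production is to capture the output of `ipvsadm -Ln` on each ipvs system and compare the real server lists. The Python sketch below assumes the usual `ipvsadm -Ln` table layout; the sample text, the addresses, and the function names `parse_ipvsadm` and `diff_directors` are hypothetical and only illustrate the idea.

```python
def parse_ipvsadm(text):
    """Parse `ipvsadm -Ln` text into {service: {"scheduler", "reals"}}."""
    services = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("TCP", "UDP", "FWM") and len(fields) >= 3:
            # Virtual service line: protocol, address:port, scheduler
            current = " ".join(fields[:2])
            services[current] = {"scheduler": fields[2], "reals": []}
        elif (fields[0] == "->" and current is not None
              and len(fields) >= 4 and fields[3].isdigit()):
            # Real server line: address:port, forwarding method, weight
            services[current]["reals"].append(
                (fields[1], fields[2], int(fields[3])))
    return services


def diff_directors(a, b):
    """List services whose scheduler or ordered real server list differs."""
    problems = []
    for svc in sorted(set(a) | set(b)):
        if svc not in a or svc not in b:
            problems.append(f"{svc}: present on only one director")
        elif a[svc] != b[svc]:
            problems.append(f"{svc}: scheduler or real server list differs")
    return problems


# Hypothetical captures from two directors (documentation addresses).
SAMPLE_1 = """\
TCP  192.0.2.10:80 sh
  -> 198.51.100.1:80              Tunnel  1      0          0
  -> 198.51.100.2:80              Tunnel  1      0          0
"""
SAMPLE_2 = """\
TCP  192.0.2.10:80 sh
  -> 198.51.100.2:80              Tunnel  1      0          0
  -> 198.51.100.1:80              Tunnel  1      0          0
"""

if __name__ == "__main__":
    for problem in diff_directors(parse_ipvsadm(SAMPLE_1),
                                  parse_ipvsadm(SAMPLE_2)):
        print(problem)
```

The comparison is order-sensitive on purpose, since a different ordering, weight, or set of healthy real servers on one system is one plausible reason for the directors to disagree.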
PS: I know source hashing isn’t perfectly consistent, but the number of packets being sent to the wrong real servers is too high to explain that way. We have other clusters where we have never seen this issue.