An error occurred while fetching the assigned iteration of the selected issue.
Redis slaves get disconnected when high rate of writes occur
@twk3 noticed this last week when we attempted to clear the cache with the Rake task, and I just tried to requeue about 7000 Sidekiq jobs. When this happens, we see this error:
Redis::CommandError: MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
From the logs, I don't see anything obvious why this occurs. On the master, the logs show:
2017-04-27_14:51:59.54581 redis4 redis: 4942:M 27 Apr 14:51:59.545 * Background saving started by pid 48139
2017-04-27_14:52:10.79892 redis4 redis: 4942:M 27 Apr 14:52:10.798 # Connection with slave client id #121783 lost.
2017-04-27_14:52:30.94452 redis4 redis: 4942:S 27 Apr 14:52:30.944 # Connection with slave 10.45.2.10:6379 lost.
2017-04-27_14:52:30.94472 redis4 redis: 4942:S 27 Apr 14:52:30.944 # Connection with slave 10.45.2.8:6379 lost.
2017-04-27_14:52:30.94485 redis4 redis: 4942:S 27 Apr 14:52:30.944 # Connection with slave 10.45.2.6:6379 lost.
2017-04-27_14:52:30.94514 redis4 redis: 4942:S 27 Apr 14:52:30.945 * SLAVE OF 10.45.2.6:6379 enabled (user request from 'id=5322146 addr=10.45.2.9:53407 fd=14 name=sentinel-c05dc551-cmd age=20 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
2017-04-27_14:52:30.95250 redis4 redis: 4942:S 27 Apr 14:52:30.952 # CONFIG REWRITE executed with success.
2017-04-27_14:52:31.84497 redis4 redis: 4942:S 27 Apr 14:52:31.844 * Connecting to MASTER 10.45.2.6:6379
2017-04-27_14:52:31.84508 redis4 redis: 4942:S 27 Apr 14:52:31.844 * MASTER <-> SLAVE sync started
2017-04-27_14:52:31.84560 redis4 redis: 4942:S 27 Apr 14:52:31.845 * Non blocking connect for SYNC fired the event.
2017-04-27_14:52:31.84615 redis4 redis: 4942:S 27 Apr 14:52:31.846 * Master replied to PING, replication can continue...
2017-04-27_14:52:31.84677 redis4 redis: 4942:S 27 Apr 14:52:31.846 * Partial resynchronization not possible (no cached master)
2017-04-27_14:56:47.73400 redis4 redis: 48139:C 27 Apr 14:56:47.733 * DB saved on disk
2017-04-27_14:56:48.52876 redis4 redis: 48139:C 27 Apr 14:56:48.528 * RDB: 326 MB of memory used by copy-on-write
2017-04-27_14:56:49.64912 redis4 redis: 4942:S 27 Apr 14:56:49.649 * Background saving terminated with success
2017-04-27_14:57:28.39439 redis4 redis: 4942:S 27 Apr 14:57:28.394 * Full resync from master: a1f0b3161f4d979757fb0b1add6770aaacb712db:202464877266
But redis3 shows:
2017-04-27_14:51:58.05732 redis3 redis: 36524:S 27 Apr 14:51:58.057 * Background saving started by pid 9094
2017-04-27_14:52:09.89311 redis3 redis: 36524:M 27 Apr 14:52:09.893 # Connection with master lost.
2017-04-27_14:52:09.89325 redis3 redis: 36524:M 27 Apr 14:52:09.893 * Caching the disconnected master state.
2017-04-27_14:52:09.89335 redis3 redis: 36524:M 27 Apr 14:52:09.893 * Discarding previously cached master state.
2017-04-27_14:52:09.89349 redis3 redis: 36524:M 27 Apr 14:52:09.893 * MASTER MODE enabled (user request from 'id=3638510 addr=10.45.2.10:47810 fd=7 name=sentinel-32fcc701-cmd age=467916 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
2017-04-27_14:52:10.62011 redis3 redis: 36524:M 27 Apr 14:52:10.620 # CONFIG REWRITE executed with success.
2017-04-27_14:52:11.29629 redis3 redis: 36524:M 27 Apr 14:52:11.296 * Slave 10.45.2.9:6379 asks for synchronization
2017-04-27_14:52:11.29643 redis3 redis: 36524:M 27 Apr 14:52:11.296 * Full resync requested by slave 10.45.2.9:6379
2017-04-27_14:52:11.29648 redis3 redis: 36524:M 27 Apr 14:52:11.296 * Can't attach the slave to the current BGSAVE. Waiting for next BGSAVE for SYNC
2017-04-27_14:52:31.84848 redis3 redis: 36524:M 27 Apr 14:52:31.848 * Slave 10.45.2.7:6379 asks for synchronization
2017-04-27_14:52:31.84859 redis3 redis: 36524:M 27 Apr 14:52:31.848 * Full resync requested by slave 10.45.2.7:6379
2017-04-27_14:52:31.84865 redis3 redis: 36524:M 27 Apr 14:52:31.848 * Can't attach the slave to the current BGSAVE. Waiting for next BGSAVE for SYNC
2017-04-27_14:53:11.40794 redis3 redis: 36524:M 27 Apr 14:53:11.407 * Slave 10.45.2.10:6379 asks for synchronization
2017-04-27_14:53:11.40805 redis3 redis: 36524:M 27 Apr 14:53:11.407 * Full resync requested by slave 10.45.2.10:6379
2017-04-27_14:53:11.40807 redis3 redis: 36524:M 27 Apr 14:53:11.407 * Can't attach the slave to the current BGSAVE. Waiting for next BGSAVE for SYNC
2017-04-27_14:53:11.49186 redis3 redis: 36524:M 27 Apr 14:53:11.491 * Slave 10.45.2.8:6379 asks for synchronization
2017-04-27_14:53:11.49196 redis3 redis: 36524:M 27 Apr 14:53:11.491 * Full resync requested by slave 10.45.2.8:6379
2017-04-27_14:53:11.49198 redis3 redis: 36524:M 27 Apr 14:53:11.491 * Can't attach the slave to the current BGSAVE. Waiting for next BGSAVE for SYNC
2017-04-27_14:57:25.36343 redis3 redis: 9094:C 27 Apr 14:57:25.363 * DB saved on disk
2017-04-27_14:57:26.17853 redis3 redis: 9094:C 27 Apr 14:57:26.178 * RDB: 1308 MB of memory used by copy-on-write
2017-04-27_14:57:27.40864 redis3 redis: 36524:M 27 Apr 14:57:27.408 * Background saving terminated with success
2017-04-27_14:57:27.40877 redis3 redis: 36524:M 27 Apr 14:57:27.408 * Starting BGSAVE for SYNC with target: disk
2017-04-27_14:57:28.39469 redis3 redis: 36524:M 27 Apr 14:57:28.394 * Background saving started by pid 9527