
I've been writing a backup script for sharded replica sets, and it's almost done, except that I can't get it to start the balancer back up once everything else has finished.

Here's the command I'm trying to use to start the balancer back up; keep in mind that this is being run on the actual mongos server via SSH.

sudo -s
mongo -u username -p password --authenticationDatabase db
use config
sh.setBalancerState(true)
exit
exit
exit

I keep getting the following error whenever the script hits the startBalancer function, which runs the above code.

SyncClusterConnection::udpate prepare failed:  mongo-conf-0.foo.bar.com:27019:10276 
DBClientBase::findN: transport error: mongo-conf-0.foo.bar.com:27019 
ns: admin.$cmd query: { resetError: 1 }

I've tried checking against the exit status of the mongo shell process, using something like

if (code != 0) {
  return next('repeat');
} else {
  return next();
}

but regardless of what actually occurs in the mongo-shell, the exit code seems to always be 0.

Any ideas on how I can verify that the mongos process is actually connected to all three config servers before I try to re-enable the balancer? I assume the problem is that the mongos server tries to connect to the config server before its mongod process has had a chance to finish starting up (part of the backup process for sharded replica sets is shutting down one of the config servers).
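One way to work around the exit code always being 0 is to look at the shell's output as well. The following is only a sketch: `balancerStepResult` and `FAILURE_PATTERNS` are hypothetical names, the patterns are taken from the error text quoted above, and it assumes the script has captured the shell's output as a string.

```javascript
// Hypothetical helper: decide whether the balancer step should be repeated.
// The patterns come from the error text quoted in the question; adjust as needed.
const FAILURE_PATTERNS = [/transport error/i, /prepare failed/i];

function balancerStepResult(exitCode, output) {
  // Treat a non-zero exit code OR a known error message as a failure.
  const failed =
    exitCode !== 0 || FAILURE_PATTERNS.some((pattern) => pattern.test(output));
  // 'repeat' mirrors the next('repeat') convention used in the script above.
  return failed ? 'repeat' : undefined;
}
```

The result could then be handed to `next`, assuming that calling `next(undefined)` behaves like calling `next()` in the script's flow-control library.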

cngzz1
Alexej Magura
  • **NOTE**, while (as a last resort) I can check against the output that the command produces, I'd rather not... although I just realized that I could check to see if the output is as expected and then repeat the function if it isn't. – Alexej Magura Dec 13 '13 at 00:15
  • Why do you shut down the config server? I don't think that step is necessary. – Antonios Aug 26 '14 at 15:08

2 Answers


There is an easier way to run commands against mongo than what you are doing:

mongo -u username -p password --authenticationDatabase db --eval="sh.stopBalancer()"

mongo -u username -p password --authenticationDatabase db --eval="sh.startBalancer()"

There is no need for sudo or multiple exits. The command returns when it's finished.

You can check the status of the balancer with:

mongo -u username -p password --authenticationDatabase db --eval="sh.isBalancerRunning()"
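If the status check is driven from a Node.js script, something like the following might help turn the shell output into a boolean. It is a sketch under assumptions: `balancerStatusArgs` and `parseBalancerStatus` are hypothetical names, the expression is wrapped in `print()` so that the value is written to stdout, `--quiet` is used to suppress the connection banner, and the shell is assumed to print `true` or `false` for this helper.

```javascript
// Hypothetical helpers; the credentials are the placeholders used in the answer.
function balancerStatusArgs(username, password, authDb) {
  return [
    '--quiet',
    '-u', username,
    '-p', password,
    '--authenticationDatabase', authDb,
    '--eval', 'print(sh.isBalancerRunning())',
  ];
}

// Returns true/false when the last non-empty line of output is a boolean,
// or null when the output is something else (for example an error message).
function parseBalancerStatus(stdout) {
  const lines = stdout
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== '');
  const last = lines[lines.length - 1];
  if (last === 'true') return true;
  if (last === 'false') return false;
  return null;
}
```

The argument list could be passed to `child_process.execFile('mongo', args, callback)` and the captured stdout to `parseBalancerStatus`. A `null` result would then be the signal to retry rather than to trust the exit code.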
JJussi

Have you tried using the sh.startBalancer() helper instead?

Rather than doing a straight update, it takes a timeout argument for how long to wait for balancing to start, as well as a sleep interval for how long to sleep between checks. Here's the code from the shell by way of explanation:

mongos> sh.startBalancer
function ( timeout, interval ) {
    sh.setBalancerState( true )
    sh.waitForBalancer( true, timeout, interval )
}

So, you could even break it up and use the waitForBalancer helper if you wished. For reference, here is the equivalent stopBalancer command erroring out when I tried to stop it with a config server down:

mongos> sh.stopBalancer(2000, 100)
Waiting for active hosts...
Waiting for active host adamc-mbp.local:30999 to recognize new settings... (ping : Tue Dec 31 2013 19:51:32 GMT+0000 (GMT))
Waiting for the balancer lock...
Waiting again for active hosts after balancer is off...
Tue Dec 31 19:51:39.243 error: {
    "$err" : "error creating initial database config information :: caused by :: SyncClusterConnection::udpate prepare failed:  localhost:29000:9001 socket exception [FAILED_STATE] server [localhost:29000] ",
    "code" : 8005
} at src/mongo/shell/query.js:128
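Breaking the helper up as described above could look roughly like this. It is only a sketch: `enableBalancerWithRetry` and its parameters are hypothetical, and it assumes that `setBalancerState` or `waitForBalancer` throws when a config server is unreachable, as the stopBalancer output above suggests.

```javascript
// Sketch only: enableBalancerWithRetry is a hypothetical name.
// `shell` is expected to be the mongo shell's `sh` object.
function enableBalancerWithRetry(shell, attempts, timeoutMs, intervalMs) {
  for (var i = 1; i <= attempts; i++) {
    try {
      shell.setBalancerState(true);
      shell.waitForBalancer(true, timeoutMs, intervalMs);
      return true; // both calls returned without throwing
    } catch (e) {
      // For example, a config server that is still starting up.
      if (typeof print === 'function') print('attempt ' + i + ' failed: ' + e);
      // sleep() is a mongo shell global; skipped where it is not defined.
      if (i < attempts && typeof sleep === 'function') sleep(intervalMs);
    }
  }
  return false; // every attempt threw
}
```

In the shell this would be called as `enableBalancerWithRetry(sh, 5, 2000, 100)`, with the numbers chosen purely for illustration.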
Adam C