mongodb - Mongo Auto Balancing Not Working
I'm running into an issue where one of my shards is at 100% CPU usage while I'm storing files in MongoDB (using GridFS). When I shut down writes to the database, usage drops to 0%. However, the auto balancer is on and does not appear to be balancing anything. I have 50% of the data on one shard with 100% CPU usage, and virtually all the others at 7-8%.
Any ideas?
mongos> version()
3.0.6
Auto balancing: enabled
Storage engine: WiredTiger
I have this general architecture:
2 - routers
3 - config servers
8 - shards (2 shards per server - 4 servers)
No replica sets!
https://docs.mongodb.org/v3.0/core/sharded-cluster-architectures-production/
Log details
Router 1 log:
2016-01-15t16:15:21.714-0700 network [conn3925104] end connection [ip]:[port] (63 connections open)
2016-01-15t16:15:23.256-0700 network [lockpinger] socket recv() timeout [ip]:[port]
2016-01-15t16:15:23.256-0700 network [lockpinger] socketexception: remote: [ip]:[port] error: 9001 socket exception [recv_timeout] server [ip]:[port]
2016-01-15t16:15:23.256-0700 network [lockpinger] dbclientcursor::init call() failed
2016-01-15t16:15:23.256-0700 network [lockpinger] scoped connection [ip]:[port],[ip]:[port],[ip]:[port] not being returned pool
2016-01-15t16:15:23.256-0700 w sharding [lockpinger] distributed lock pinger '[ip]:[port],[ip]:[port],[ip]:[port]/[ip]:[port]:1442579303:1804289383' detected exception while pinging. :: caused :: syncclusterconnection::update prepare failed: [ip]:[port] (ip) failed:10276 dbclientbase::findn: transport error: [ip]:[port] ns: admin.$cmd query: { getlasterror: 1, fsync: 1 }
2016-01-15t16:15:24.715-0700 network [mongosmain] connection accepted [ip]:[port] #3925105 (64 connections open)
2016-01-15t16:15:24.715-0700 network [conn3925105] end connection [ip]:[port] (63 connections open)
2016-01-15t16:15:27.717-0700 network [mongosmain] connection accepted [ip]:[port] #3925106 (64 connections open)
2016-01-15t16:15:27.718-0700 network [conn3925106] end connection [ip]:[port] (63 connections open)
Router 2 log:
2016-01-15t16:18:21.762-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' acquired, ts : 56997e3d110ccb8e38549a9d
2016-01-15t16:18:24.316-0700 sharding [lockpinger] cluster [ip]:[port],[ip]:[port],[ip]:[port] pinged @ fri jan 15 16:18:24 2016 distributed lock pinger '[ip]:[port],[ip]:[port],[ip]:[port]/[ip]:[port]:1442579454:1804289383', sleeping 30000ms
2016-01-15t16:18:24.978-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' unlocked.
2016-01-15t16:18:35.295-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' acquired, ts : 56997e4a110ccb8e38549a9f
2016-01-15t16:18:38.507-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' unlocked.
2016-01-15t16:18:48.838-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' acquired, ts : 56997e58110ccb8e38549aa1
2016-01-15t16:18:52.038-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' unlocked.
2016-01-15t16:18:54.660-0700 sharding [lockpinger] cluster [ip]:[port],[ip]:[port],[ip]:[port] pinged @ fri jan 15 16:18:54 2016 distributed lock pinger '[ip]:[port],[ip]:[port],[ip]:[port]/[ip]:[port]:1442579454:1804289383', sleeping 30000ms
2016-01-15t16:19:02.323-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' acquired, ts : 56997e66110ccb8e38549aa3
2016-01-15t16:19:05.513-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' unlocked.
Problematic shard log:
2016-01-15t16:21:03.426-0700 w sharding [conn40] finding split vector files.fs.chunks on { files_id: 1.0, n: 1.0 } keycount: 137 numsplits: 200715 lookedat: 46 took 17364ms
2016-01-15t16:21:03.484-0700 command [conn40] command admin.$cmd command: splitvector { splitvector: "files.fs.chunks", keypattern: { files_id: 1.0, n: 1.0 }, min: { files_id: objectid('5650816c827928d710ef5ef9'), n: 1 }, max: { files_id: maxkey, n: maxkey }, maxchunksizebytes: 67108864, maxsplitpoints: 0, maxchunkobjects: 250000 } ntoreturn:1 keyupdates:0 writeconflicts:0 numyields:216396 reslen:8318989 locks:{ global: { acquirecount: { r: 432794 } }, database: { acquirecount: { r: 216397 } }, collection: { acquirecount: { r: 216397 } } } 17421ms
2016-01-15t16:21:03.775-0700 sharding [lockpinger] cluster [ip]:[port],[ip]:[port],[ip]:[port] pinged @ fri jan 15 16:21:03 2016 distributed lock pinger '[ip]:[port],[ip]:[port],[ip]:[port]/[ip]:[port]:1441718306:765353801', sleeping 30000ms
2016-01-15t16:21:04.321-0700 sharding [conn40] request split points lookup chunk files.fs.chunks { : objectid('5650816c827928d710ef5ef9'), : 1 } -->> { : maxkey, : maxkey }
2016-01-15t16:21:08.243-0700 sharding [conn46] request split points lookup chunk files.fs.chunks { : objectid('5650816c827928d710ef5ef9'), : 1 } -->> { : maxkey, : maxkey }
2016-01-15t16:21:10.174-0700 w sharding [conn37] finding split vector files.fs.chunks on { files_id: 1.0, n: 1.0 } keycount: 137 numsplits: 200715 lookedat: 60 took 18516ms
2016-01-15t16:21:10.232-0700 command [conn37] command admin.$cmd command: splitvector { splitvector: "files.fs.chunks", keypattern: { files_id: 1.0, n: 1.0 }, min: { files_id: objectid('5650816c827928d710ef5ef9'), n: 1 }, max: { files_id: maxkey, n: maxkey }, maxchunksizebytes: 67108864, maxsplitpoints: 0, maxchunkobjects: 250000 } ntoreturn:1 keyupdates:0 writeconflicts:0 numyields:216396 reslen:8318989 locks:{ global: { acquirecount: { r: 432794 } }, database: { acquirecount: { r: 216397 } }, collection: { acquirecount: { r: 216397 } } } 18574ms
2016-01-15t16:21:10.989-0700 w sharding [conn25] finding split vector files.fs.chunks on { files_id: 1.0, n: 1.0 } keycount: 137 numsplits: 200715 lookedat: 62 took 18187ms
2016-01-15t16:21:11.047-0700 command [conn25] command admin.$cmd command: splitvector { splitvector: "files.fs.chunks", keypattern: { files_id: 1.0, n: 1.0 }, min: { files_id: objectid('5650816c827928d710ef5ef9'), n: 1 }, max: { files_id: maxkey, n: maxkey }, maxchunksizebytes: 67108864, maxsplitpoints: 0, maxchunkobjects: 250000 } ntoreturn:1 keyupdates:0 writeconflicts:0 numyields:216396 reslen:8318989 locks:{ global: { acquirecount: { r: 432794 } }, database: { acquirecount: { r: 216397 } }, collection: { acquirecount: { r: 216397 } } } 18246ms
2016-01-15t16:21:11.365-0700 sharding [conn37] request split points lookup chunk files.fs.chunks { : objectid('5650816c827928d710ef5ef9'), : 1 } -->> { : maxkey, : maxkey }
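The shard log repeats the same split vector request on the same key range several times, each run taking roughly 17 to 18 seconds. A minimal sketch for pulling those numbers out of a shard log, assuming the lowercased field names exactly as they appear in this post (`keycount`, `numsplits`, `lookedat`, `took`). The function names are hypothetical and not part of any MongoDB tooling, and the key estimate rests on the assumption that `keycount` is the number of keys between two split points:

```python
import re

# Field names are copied from the (lowercased) shard log in this post.
SPLIT_VECTOR_RE = re.compile(
    r"\[(?P<conn>conn\d+)\] finding split vector (?P<ns>\S+) on .*?"
    r"keycount: (?P<keycount>\d+) numsplits: (?P<numsplits>\d+) "
    r"lookedat: (?P<lookedat>\d+) took (?P<took_ms>\d+)ms"
)


def parse_split_vector_warnings(log_text):
    """Return one dict per 'finding split vector' warning found in log_text."""
    rows = []
    for match in SPLIT_VECTOR_RE.finditer(log_text):
        row = {"conn": match["conn"], "ns": match["ns"]}
        for field in ("keycount", "numsplits", "lookedat", "took_ms"):
            row[field] = int(match[field])
        rows.append(row)
    return rows


def approx_keys_in_range(row):
    """Rough key count for the scanned range.

    Assumption: keycount is the number of keys between split points,
    so keycount * numsplits approximates the keys in the whole range.
    """
    return row["keycount"] * row["numsplits"]


# First warning from the problematic shard log, copied verbatim.
sample = (
    "2016-01-15t16:21:03.426-0700 w sharding [conn40] finding split vector "
    "files.fs.chunks on { files_id: 1.0, n: 1.0 } keycount: 137 "
    "numsplits: 200715 lookedat: 46 took 17364ms"
)
rows = parse_split_vector_warnings(sample)
for row in rows:
    print(row["conn"], row["took_ms"], approx_keys_in_range(row))
```

Under that assumption, a single range from one `files_id` up to `maxkey` would hold tens of millions of keys, which would be consistent with one shard doing most of the work.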
For the splitting error: upgrading to MongoDB v3.0.8+ resolved it.
I'm still having an issue with the cluster not balancing itself. The shard key is an MD5 checksum, so unless the files have very similar MD5s (not likely), there is still investigating to do. I'm using range-based partitioning.
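One thing worth checking while investigating: as far as I know, the 3.0 balancer compares the number of chunks per shard, not the bytes stored, so a range that never gets split can hold most of the data without triggering a migration. A minimal sketch of that check, assuming chunk documents shaped like those in `config.chunks` (with `ns` and `shard` fields) and the migration thresholds from the MongoDB 3.0 documentation (2, 4, or 8 depending on the total chunk count). The function names and the shard names in the sample are hypothetical:

```python
from collections import Counter


def migration_threshold(total_chunks):
    """Chunk-count difference that triggers balancing (per the 3.0 docs)."""
    if total_chunks < 20:
        return 2
    if total_chunks < 80:
        return 4
    return 8


def chunk_imbalance(chunk_docs, ns, all_shards=()):
    """Count chunks per shard for ns and report whether the spread
    between the fullest and emptiest shard reaches the threshold."""
    # Seed with zero so shards that own no chunks are still counted.
    counts = Counter({shard: 0 for shard in all_shards})
    counts.update(doc["shard"] for doc in chunk_docs if doc["ns"] == ns)
    if not counts:
        return counts, False
    spread = max(counts.values()) - min(counts.values())
    return counts, spread >= migration_threshold(sum(counts.values()))


# Hypothetical data: 80 chunks spread evenly over 8 shards.
even_docs = [
    {"ns": "files.fs.chunks", "shard": f"shard{i % 8:04d}"} for i in range(80)
]
even_counts, even_should_migrate = chunk_imbalance(even_docs, "files.fs.chunks")

# Same data plus 8 extra chunks on one shard.
skewed_docs = even_docs + [
    {"ns": "files.fs.chunks", "shard": "shard0000"} for _ in range(8)
]
skewed_counts, skewed_should_migrate = chunk_imbalance(
    skewed_docs, "files.fs.chunks"
)

print(dict(even_counts), even_should_migrate)
print(dict(skewed_counts), skewed_should_migrate)
```

In the even case nothing would be migrated even if one of those chunks were far larger than the rest, which is why per-shard chunk counts and per-shard data size are worth comparing side by side.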