mongodb - Mongo Auto Balancing Not Working


I'm running into an issue where one of my shards is at 100% CPU usage while I'm storing files in MongoDB (using GridFS). When I shut down writes to the DB, usage drops to 0%. However, the auto balancer is on and does not appear to be balancing anything. I have about 50% of the data on the one shard that is at 100% CPU usage, with virtually all the others at 7-8%.

Any ideas?

mongos> version()
3.0.6

Auto balancing: enabled

Storage engine: WiredTiger

General architecture: 2 routers, 3 config servers, 8 shards (2 shards per server across 4 servers), no replica sets!

https://docs.mongodb.org/v3.0/core/sharded-cluster-architectures-production/
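One thing worth checking with this setup: as far as the MongoDB 3.0 documentation describes it, the balancer evens out the number of chunks per shard for each collection, not the data size or CPU load. A shard that holds a few very large, unsplit chunks can therefore look balanced to it. The sketch below is only an illustration of that rule, using the migration thresholds listed in the 3.0 docs (2, 4, or 8 depending on the total chunk count). The function names and the shard names are made up for the example.

```javascript
// Migration thresholds as listed in the MongoDB 3.0 sharding docs:
// fewer than 20 chunks -> 2, 20 to 79 chunks -> 4, 80 or more -> 8.
function migrationThreshold(totalChunks) {
  if (totalChunks < 20) { return 2; }
  if (totalChunks < 80) { return 4; }
  return 8;
}

// chunkCounts maps shard name -> chunk count for ONE sharded collection.
// In the mongo shell the counts could be gathered with something like:
//   db.getSiblingDB("config").chunks.aggregate([
//     { $match: { ns: "files.fs.chunks" } },
//     { $group: { _id: "$shard", chunks: { $sum: 1 } } }
//   ])
function balancerCheck(chunkCounts) {
  var total = 0;
  var max = -Infinity;
  var min = Infinity;
  Object.keys(chunkCounts).forEach(function (shard) {
    var n = chunkCounts[shard];
    total += n;
    if (n > max) { max = n; }
    if (n < min) { min = n; }
  });
  var threshold = migrationThreshold(total);
  return {
    totalChunks: total,
    spread: max - min,          // most-loaded minus least-loaded shard
    threshold: threshold,
    wouldMigrate: (max - min) >= threshold
  };
}

// Hypothetical counts: chunk-wise even, so no migration would be expected,
// no matter how much data each individual chunk holds.
var example = balancerCheck({ shardA: 25, shardB: 25, shardC: 24 });
```

If the chunk counts come back roughly even while the data sizes are not, the balancer is probably doing what it was designed to do, and the real problem is chunks that are not being split.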

log details

router 1 log:

2016-01-15t16:15:21.714-0700 network  [conn3925104] end connection [ip]:[port] (63 connections open)
2016-01-15t16:15:23.256-0700 network  [lockpinger] socket recv() timeout  [ip]:[port]
2016-01-15t16:15:23.256-0700 network  [lockpinger] socketexception: remote: [ip]:[port] error: 9001 socket exception [recv_timeout] server [ip]:[port]
2016-01-15t16:15:23.256-0700 network  [lockpinger] dbclientcursor::init call() failed
2016-01-15t16:15:23.256-0700 network  [lockpinger] scoped connection [ip]:[port],[ip]:[port],[ip]:[port] not being returned pool
2016-01-15t16:15:23.256-0700 w sharding [lockpinger] distributed lock pinger '[ip]:[port],[ip]:[port],[ip]:[port]/[ip]:[port]:1442579303:1804289383' detected exception while pinging. :: caused :: syncclusterconnection::update prepare failed:  [ip]:[port] (ip) failed:10276 dbclientbase::findn: transport error: [ip]:[port] ns: admin.$cmd query: { getlasterror: 1, fsync: 1 }
2016-01-15t16:15:24.715-0700 network  [mongosmain] connection accepted [ip]:[port] #3925105 (64 connections open)
2016-01-15t16:15:24.715-0700 network  [conn3925105] end connection [ip]:[port] (63 connections open)
2016-01-15t16:15:27.717-0700 network  [mongosmain] connection accepted [ip]:[port] #3925106 (64 connections open)
2016-01-15t16:15:27.718-0700 network  [conn3925106] end connection [ip]:[port] (63 connections open)

router 2 log:

2016-01-15t16:18:21.762-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' acquired, ts : 56997e3d110ccb8e38549a9d
2016-01-15t16:18:24.316-0700 sharding [lockpinger] cluster [ip]:[port],[ip]:[port],[ip]:[port] pinged @ fri jan 15 16:18:24 2016 distributed lock pinger '[ip]:[port],[ip]:[port],[ip]:[port]/[ip]:[port]:1442579454:1804289383', sleeping 30000ms
2016-01-15t16:18:24.978-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' unlocked.
2016-01-15t16:18:35.295-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' acquired, ts : 56997e4a110ccb8e38549a9f
2016-01-15t16:18:38.507-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' unlocked.
2016-01-15t16:18:48.838-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' acquired, ts : 56997e58110ccb8e38549aa1
2016-01-15t16:18:52.038-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' unlocked.
2016-01-15t16:18:54.660-0700 sharding [lockpinger] cluster [ip]:[port],[ip]:[port],[ip]:[port] pinged @ fri jan 15 16:18:54 2016 distributed lock pinger '[ip]:[port],[ip]:[port],[ip]:[port]/[ip]:[port]:1442579454:1804289383', sleeping 30000ms
2016-01-15t16:19:02.323-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' acquired, ts : 56997e66110ccb8e38549aa3
2016-01-15t16:19:05.513-0700 sharding [balancer] distributed lock 'balancer/[ip]:[port]:1442579454:1804289383' unlocked.

problematic shard log:

2016-01-15t16:21:03.426-0700 w sharding [conn40] finding split vector files.fs.chunks on { files_id: 1.0, n: 1.0 } keycount: 137 numsplits: 200715 lookedat: 46 took 17364ms
2016-01-15t16:21:03.484-0700 command  [conn40] command admin.$cmd command: splitvector { splitvector: "files.fs.chunks", keypattern: { files_id: 1.0, n: 1.0 }, min: { files_id: objectid('5650816c827928d710ef5ef9'), n: 1 }, max: { files_id: maxkey, n: maxkey }, maxchunksizebytes: 67108864, maxsplitpoints: 0, maxchunkobjects: 250000 } ntoreturn:1 keyupdates:0 writeconflicts:0 numyields:216396 reslen:8318989 locks:{ global: { acquirecount: { r: 432794 } }, database: { acquirecount: { r: 216397 } }, collection: { acquirecount: { r: 216397 } } } 17421ms
2016-01-15t16:21:03.775-0700 sharding [lockpinger] cluster [ip]:[port],[ip]:[port],[ip]:[port] pinged @ fri jan 15 16:21:03 2016 distributed lock pinger '[ip]:[port],[ip]:[port],[ip]:[port]/[ip]:[port]:1441718306:765353801', sleeping 30000ms
2016-01-15t16:21:04.321-0700 sharding [conn40] request split points lookup chunk files.fs.chunks { : objectid('5650816c827928d710ef5ef9'), : 1 } -->> { : maxkey, : maxkey }
2016-01-15t16:21:08.243-0700 sharding [conn46] request split points lookup chunk files.fs.chunks { : objectid('5650816c827928d710ef5ef9'), : 1 } -->> { : maxkey, : maxkey }
2016-01-15t16:21:10.174-0700 w sharding [conn37] finding split vector files.fs.chunks on { files_id: 1.0, n: 1.0 } keycount: 137 numsplits: 200715 lookedat: 60 took 18516ms
2016-01-15t16:21:10.232-0700 command  [conn37] command admin.$cmd command: splitvector { splitvector: "files.fs.chunks", keypattern: { files_id: 1.0, n: 1.0 }, min: { files_id: objectid('5650816c827928d710ef5ef9'), n: 1 }, max: { files_id: maxkey, n: maxkey }, maxchunksizebytes: 67108864, maxsplitpoints: 0, maxchunkobjects: 250000 } ntoreturn:1 keyupdates:0 writeconflicts:0 numyields:216396 reslen:8318989 locks:{ global: { acquirecount: { r: 432794 } }, database: { acquirecount: { r: 216397 } }, collection: { acquirecount: { r: 216397 } } } 18574ms
2016-01-15t16:21:10.989-0700 w sharding [conn25] finding split vector files.fs.chunks on { files_id: 1.0, n: 1.0 } keycount: 137 numsplits: 200715 lookedat: 62 took 18187ms
2016-01-15t16:21:11.047-0700 command  [conn25] command admin.$cmd command: splitvector { splitvector: "files.fs.chunks", keypattern: { files_id: 1.0, n: 1.0 }, min: { files_id: objectid('5650816c827928d710ef5ef9'), n: 1 }, max: { files_id: maxkey, n: maxkey }, maxchunksizebytes: 67108864, maxsplitpoints: 0, maxchunkobjects: 250000 } ntoreturn:1 keyupdates:0 writeconflicts:0 numyields:216396 reslen:8318989 locks:{ global: { acquirecount: { r: 432794 } }, database: { acquirecount: { r: 216397 } }, collection: { acquirecount: { r: 216397 } } } 18246ms
2016-01-15t16:21:11.365-0700 sharding [conn37] request split points lookup chunk files.fs.chunks { : objectid('5650816c827928d710ef5ef9'), : 1 } -->> { : maxkey, : maxkey }
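To get a feel for how large that last chunk is, the numbers in the "finding split vector" warnings can be pulled out with a small parser. This is a rough sketch, not an official tool: the function name is made up, and the size estimate assumes that `keycount` is the number of keys between consecutive split points and that `lookedat` is the remainder after the last one, which is my reading of the log fields rather than something the log states.

```javascript
// Extract the counters from a "finding split vector" warning line.
// Returns null when the line does not match.
function parseSplitVectorWarning(line) {
  var re = /keycount:\s*(\d+)\s+numsplits:\s*(\d+)\s+lookedat:\s*(\d+)\s+took\s+(\d+)ms/i;
  var m = re.exec(line);
  if (!m) { return null; }
  var keyCount = parseInt(m[1], 10);
  var numSplits = parseInt(m[2], 10);
  var lookedAt = parseInt(m[3], 10);
  return {
    keyCount: keyCount,
    numSplits: numSplits,
    lookedAt: lookedAt,
    tookMs: parseInt(m[4], 10),
    // ASSUMPTION: one split point per keyCount keys, lookedAt keys left over.
    approxKeysInChunk: keyCount * numSplits + lookedAt
  };
}

var info = parseSplitVectorWarning(
  "2016-01-15t16:21:03.426-0700 w sharding [conn40] finding split vector " +
  "files.fs.chunks on { files_id: 1.0, n: 1.0 } keycount: 137 " +
  "numsplits: 200715 lookedat: 46 took 17364ms"
);
```

Under that assumption, the single chunk running from objectid('5650816c827928d710ef5ef9') to maxkey holds on the order of 27 million fs.chunks documents, and every splitvector call rescans it for 17 to 18 seconds. That would fit with the CPU load on this one shard.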

For the splitting error, upgrading to MongoDB v3.0.8+ resolved it.

I'm still having an issue with the cluster balancing itself. The shard key is an MD5 checksum, so unless many files have similar MD5s (not likely), there is still some investigating to do. I'm using range-based partitioning.
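For anyone following along, here is a toy model of range-based partitioning, to show why the key distribution matters. It is a simplification with invented chunk boundaries and shard names, and it uses plain numbers in place of real shard key values. Keys that are spread evenly over the key space (as MD5 values should be) fan out across chunks, while keys that only ever increase all land in the last chunk. The shard log above shows the key pattern { files_id: 1, n: 1 } with an ObjectId lower bound, so it may be worth confirming which of the two cases the real keys fall into.

```javascript
// Hypothetical chunk table, sorted by lower bound. Each chunk owns the keys
// from its own `min` up to (not including) the next chunk's `min`.
var chunks = [
  { min: -Infinity, shard: "shard0" },
  { min: 100, shard: "shard1" },
  { min: 200, shard: "shard2" }   // last chunk: 200 up to "maxkey"
];

// Return the shard owning `key`: the last chunk whose lower bound is <= key.
function routeKey(chunkTable, key) {
  var owner = chunkTable[0];
  for (var i = 0; i < chunkTable.length; i++) {
    if (key >= chunkTable[i].min) { owner = chunkTable[i]; }
  }
  return owner.shard;
}

// Count how many of the given keys each shard would receive.
function countByShard(chunkTable, keys) {
  var counts = {};
  keys.forEach(function (k) {
    var shard = routeKey(chunkTable, k);
    counts[shard] = (counts[shard] || 0) + 1;
  });
  return counts;
}

var spread = countByShard(chunks, [50, 150, 250]);           // evenly spread keys
var increasing = countByShard(chunks, [250, 260, 270, 280]); // ever-growing keys
```

In this model the increasing keys all go to shard2 until that chunk is split and part of it is migrated, which is the pattern to look for in `sh.status()` output.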

