2017-07-05 §
17:41 <apergos> re-enabled puppet on stat1003 (last dataset nfs client), manually mounted /mnt/data because puppet run has an unrelated error [production]
16:33 <jynus> restart mysql on db2062 [production]
16:04 <ema> restart pybal on lvs200[12] to make them reconnect to conf2001 [production]
16:03 <ema> restart pybal on lvs200[45] to make them reconnect to conf2001 [production]
15:54 <jynus> restart mysql on db2072 [production]
15:30 <apergos> re-enabled puppet on stat1002, did a manual run, dataset filesystem available again there [production]
15:09 <apergos> re-enabled puppet on snapshot6,7, still watching dataset1001 performance [production]
15:09 <ema> restart pybal on lvs2003 to make it reconnect to conf2001 [production]
14:45 <ema> bounce pybal on lvs2006, not synced with etcd information [production]
14:40 <moritzm> rebooting restbase1012 for kernel update [production]
14:19 <moritzm> rebooting logstash100[4-6] for kernel update [production]
14:00 <moritzm> rebooting logstash100[1-3] for kernel update [production]
13:59 <ema> cache_misc: upgrade to varnish 4.1.7-1wm1 and reboot for kernel update [production]
13:48 <apergos> re-enabling puppet on snapshot1001, 1005 for testing [production]
13:46 <moritzm> rebooting restbase1011 for kernel update [production]
13:44 <zeljkof> EU SWAT finished! [production]
13:43 <zfilipin@tin> Synchronized wmf-config/Wikibase-production.php: SWAT: [[gerrit:362986|Set Wikibase readFullEntityIdColumn setting to false]] (duration: 00m 42s) [production]
13:35 <zfilipin@tin> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:363043|Enable WikiLove for ckbwiki (T169563)]] (duration: 00m 43s) [production]
13:24 <zfilipin@tin> Synchronized dblists/closed.dblist: SWAT: [[gerrit:361686|Reopen nlwikinews (T168764)]] (duration: 02m 50s) [production]
13:21 <jmm@puppetmaster1001> conftool action : set/pooled=inactive; selector: mw1196.eqiad.wmnet [production]
13:18 <apergos> power cycled dataset1001, crashed, unresponsive on mgmt console [production]
13:18 <zfilipin@tin> Synchronized dblists/closed.dblist: SWAT: [[gerrit:361686|Reopen nlwikinews (T168764)]] (duration: 02m 50s) [production]
13:16 <elukey> reboot conf2001 for kernel updates [production]
13:09 <moritzm> rebooting restbase1010 for kernel update [production]
12:49 <marostegui> Force BBU relearn on db1016 - T166344 [production]
12:36 <marostegui> Move labsdb1010 main general replication thread to a named replication thread called db1095 - T153743 [production]
12:33 <marostegui> Stop all replication threads on db1095 for maintenance - T153743 [production]
12:32 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1085 - T153743 (duration: 02m 49s) [production]
12:29 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1051 - T168661 (duration: 02m 50s) [production]
12:16 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1051 - T168661 (duration: 02m 51s) [production]
12:11 <apergos> puppet is currently disabled again on snapshots 1,5,6,7 and on dataset1001; we saw the same nfs issue shortly after reboot, with no dump processes going, as snapshots 5,6,7 had not remounted the filesystem [production]
11:20 <moritzm> rebooting wtp2* servers for kernel update [production]
11:14 <moritzm> rebooting restbase1009 for kernel update [production]
10:56 <hashar> restarting Jenkins for plugin upgrades [production]
10:45 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1072 - T168661 (duration: 02m 59s) [production]
10:41 <marostegui> Run redact_sanitarium on s6 databases db1102 - T153743 [production]
10:41 <moritzm> rebooting wtp1001 for kernel update [production]
10:37 <moritzm> rebooting restbase1008 for kernel update [production]
10:32 <apergos> rebooting snapshot hosts to clean up hung nfs client processes [production]
10:30 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1072 - T168661 (duration: 02m 51s) [production]
10:24 <apergos> rebooted dataset1001 to unstick nfsd and pick up new kernel, re-enabled puppet [production]
10:14 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1066 - T168661 (duration: 02m 50s) [production]
10:11 <moritzm> rebooting restbase1007 for kernel update [production]
10:01 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1066 - T168661 (duration: 02m 50s) [production]
09:57 <marostegui> Deploy alter table on s1 eqiad hosts - T168661 [production]
09:48 <godog> move 'instances' graphite hierarchy out of the way, do not delete yet - T143405 [production]
09:27 <marostegui> Stop MySQL on db1085 for maintenance - T153743 [production]
09:21 <godog> upload nginx_1.11.10-1+wmf2 to jessie-wikimedia and nginx_1.11.10-1+wmf2~stretch1 to stretch-wikimedia [production]
09:17 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1085 - T153743 (duration: 02m 50s) [production]
08:44 <apergos> puppet disabled and processes accessing dataset1001 exported filesystem shot, on: stat1002,3, snapshot1001,5,6,7, while investigation continues [production]