2019-08-13
15:42 <gehel@cumin2001> END (FAIL) - Cookbook sre.elasticsearch.rolling-reboot (exit_code=99) [production]
15:39 <bblack> puppet re-enabled on lvs1014, lvs1016, icinga1001 [production]
15:35 <XioNoX> depool eqsin for cr2-eqsin upgrade [production]
15:32 <bblack> disabled puppet on lvs1014, lvs1016, icinga1001 ahead of deploying https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/528885/ - T229621 [production]
15:32 <gehel@cumin2001> START - Cookbook sre.elasticsearch.rolling-reboot [production]
15:30 <XioNoX> rollback ospf + bgp changes on cr2-eqord [production]
15:19 <XioNoX> restart cr2-eqord - T227886 [production]
15:12 <XioNoX> disable all peering and transit on cr2-eqord [production]
15:01 <XioNoX> increase ospf cost of cr2-eqord<->cr2-eqiad link (+1000) [production]
14:57 <ema> cp5002: reboot for kernel upgrade [production]
14:42 <gehel@cumin2001> END (FAIL) - Cookbook sre.elasticsearch.rolling-reboot (exit_code=99) [production]
14:42 <gehel@cumin2001> START - Cookbook sre.elasticsearch.rolling-reboot [production]
14:31 <gehel@cumin2001> END (FAIL) - Cookbook sre.elasticsearch.rolling-reboot (exit_code=99) [production]
14:31 <gehel@cumin2001> START - Cookbook sre.elasticsearch.rolling-reboot [production]
14:29 <XioNoX> rollback: disable all peering and transit on cr2-eqdfw [production]
14:18 <XioNoX> reboot cr2-eqdfw for software upgrade - T227886 [production]
14:14 <XioNoX> disable all peering and transit on cr2-eqdfw [production]
14:04 <volans@cumin2001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
14:04 <volans@cumin2001> START - Cookbook sre.hosts.decommission [production]
13:20 <jbond42> rolling update of postgresql-9.6 [production]
13:07 <jijiki> rolling restart hhvm on api servers in eqiad [production]
12:57 <jijiki> Restart hhvm on mw1235 [production]
12:17 <fsero@puppetmaster1001> conftool action : set/pooled=true; selector: dnsdisc=sessionstore|citoid|cxserver|eventgate-analytics|eventgate-main|termbox|blubberoid|mathoid|zotero,name=eqiad [production]
12:08 <_joe_> restarted php-fpm on mw1221 [production]
12:03 <fsero@> helmfile [EQIAD] Ran 'apply' command on namespace 'sessionstore' for release 'production' . [production]
12:00 <fsero@> helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' . [production]
11:56 <fsero@> helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' . [production]
11:56 <fsero@> helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' . [production]
11:49 <fsero@> helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' . [production]
11:44 <fsero> recreating cxserver blubber and sessionstore namespace - T228836 [production]
11:39 <fsero@> helmfile [EQIAD] Ran 'apply' command on namespace 'mathoid' for release 'production' . [production]
11:35 <gehel> restart wdqs-blazegraph on wdqs2001 [production]
11:34 <gehel> restart wdqs-updater on wdqs2001 [production]
11:30 <fsero@> helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-main' for release 'main' . [production]
11:29 <fsero@> helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' . [production]
11:25 <fsero@> helmfile [EQIAD] Ran 'apply' command on namespace 'citoid' for release 'production' . [production]
11:21 <fsero> recreating citoid eventgate-analytics eventgate-main mathoid namespace - T228836 [production]
11:20 <fsero@> helmfile [EQIAD] Ran 'apply' command on namespace 'termbox' for release 'production' . [production]
11:18 <raynor> EU SWAT finished [production]
11:15 <pmiazga@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:529925|Undeploy editor gender surveys (T227793)]] (duration: 00m 48s) [production]
11:13 <fsero> recreating termbox namespace - T228836 [production]
11:06 <oblivian@> helmfile [EQIAD] Ran 'apply' command on namespace 'zotero' for release 'production' . [production]
11:04 <fsero> resetting net.netfilter.nf_conntrack_tcp_timeout_time_wait to 65 in kubernetes2006 [production]
10:59 <_joe_> [eqiad] downtiming zotero on icinga for 10 minutes while recreating the deployment with helmfile [production]
10:57 <oblivian@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
10:57 <oblivian@cumin1001> START - Cookbook sre.hosts.downtime [production]
10:56 <oblivian@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
10:56 <oblivian@cumin1001> START - Cookbook sre.hosts.downtime [production]
10:49 <oblivian@> helmfile [EQIAD] Ran 'apply' command on namespace 'kube-system' for release 'coredns' . [production]
10:44 <oblivian@> helmfile [EQIAD] Ran 'apply' command on namespace 'kube-system' for release 'coredns' . [production]