2019-06-19
10:50 <ema@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
10:47 <ema@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
10:38 <ladsgroup@deploy1001> scap-helm termbox finished [production]
10:38 <ladsgroup@deploy1001> scap-helm termbox cluster codfw completed [production]
10:38 <ladsgroup@deploy1001> scap-helm termbox upgrade -f termbox-values.yaml production stable/termbox [namespace: termbox, clusters: codfw] [production]
10:36 <moritzm> rebooting mx2001 for kernel security update [production]
10:35 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
10:35 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
10:33 <akosiaris@deploy1001> scap-helm termbox finished [production]
10:33 <akosiaris@deploy1001> scap-helm termbox cluster staging completed [production]
10:33 <akosiaris@deploy1001> scap-helm termbox upgrade -f termbox-staging-values.yaml staging stable/termbox [namespace: termbox, clusters: staging] [production]
10:30 <jbond42> update late-install so it installs the correct puppet version https://gerrit.wikimedia.org/r/c/operations/puppet/+/515087 [production]
10:30 <ema@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
10:30 <moritzm> installing glibc and ca-certificates-java updates from stretch point release [production]
10:29 <akosiaris@deploy1001> scap-helm termbox finished [production]
10:29 <akosiaris@deploy1001> scap-helm termbox cluster eqiad completed [production]
10:29 <akosiaris@deploy1001> scap-helm termbox upgrade -f termbox-values.yaml production stable/termbox [namespace: termbox, clusters: eqiad] [production]
10:27 <ema@cumin1001> END (FAIL) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=99) [production]
10:23 <ema@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
10:21 <ema@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
10:05 <ema> cp3030: increase varnish-be thread_pool_max from 12000 (250 * 48) to 14400 (300 * 48) to observe impact on fetcherrors [production]
10:03 <ema@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
10:02 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Fully repool db1077 (duration: 00m 55s) [production]
10:01 <ema@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
09:56 <ema@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
09:54 <ema@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
09:49 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: More traffic to db1077 (duration: 00m 55s) [production]
09:36 <ema@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
09:34 <ema@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
09:34 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: More traffic to db1077 (duration: 00m 55s) [production]
09:29 <ema@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
09:25 <ema@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
09:24 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Slowly repool db1077 T225981 (duration: 01m 00s) [production]
09:19 <XioNoX> jnt push to esams, remove old protect-old-lvs-servers term + update syslog target T224128 [production]
09:14 <marostegui> Start MySQL on db1077 - s3 labsdb lag should start catching up T225981 [production]
09:13 <akosiaris@puppetmaster1001> conftool action : set/pooled=yes; selector: name=kubernetes2001.* [production]
09:09 <ema@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
09:06 <akosiaris> repool kubernetes2002, kubernetes2003. Point proven, chasing down lead [production]
09:06 <akosiaris> repool kubernetes2002, kubernetes2003. Point proven, chasing down load [production]
09:06 <akosiaris@puppetmaster1001> conftool action : set/pooled=yes; selector: name=kubernetes2002.* [production]
09:06 <akosiaris@puppetmaster1001> conftool action : set/pooled=yes; selector: name=kubernetes2003.* [production]
09:05 <ema@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
09:03 <ema@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
08:57 <akosiaris> depool kubernetes200{2,3} for the same out discards investigation [production]
08:56 <ema@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
08:56 <akosiaris@puppetmaster1001> conftool action : set/pooled=no; selector: name=kubernetes2003.* [production]
08:56 <akosiaris@puppetmaster1001> conftool action : set/pooled=no; selector: name=kubernetes2002.* [production]
08:54 <akosiaris> uncordon kubernetes2001, reschedule some pods on it. Investigating out discards still [production]
08:51 <XioNoX> jnt push to codfw, remove old protect-old-lvs-servers term + update syslog target T224128 [production]
08:43 <ema@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]