2019-09-13
11:11 <elukey> reboot an-conf100* (Analytics Zookeeper nodes - not yet in production) for kernel upgrades [production]
11:10 <elukey> reboot an-tool1007 (runs turnilo) for kernel upgrades [production]
11:08 <jmm@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
11:08 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
11:05 <godog> silence kartotherian pages for 2h, known issue [production]
10:47 <vgutierrez> rebooting acmechief-test servers to catch up latest kernel upgrades [production]
10:42 <akosiaris@> helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . [production]
10:41 <moritzm> reimage restbase2009 to stretch T224553 [production]
10:38 <moritzm> repool restbase1018 after reimage to stretch and completed Cassandra bootstrap [production]
10:36 <akosiaris@> helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . [production]
10:36 <akosiaris@> helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . [production]
10:13 <vgutierrez> disable ATS-TLS debug options on cp5001 - T232298 [production]
10:09 <akosiaris@> helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . [production]
09:46 <gehel> re-enabling /geoline on maps1004 - T232817 [production]
09:45 <@> helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' . [production]
09:44 <@> helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' . [production]
09:42 <@> helmfile [CODFW] Ran 'apply' command on namespace 'blubberoid' for release 'production' . [production]
09:40 <godog> install linux-perf-4.9 on maps1002 and attempt to capture a stack sample [production]
09:38 <gehel> drop /geoshape and restart kartotherian on maps1004 - T232817 [production]
09:27 <gehel> restart kartotherian on maps1004 - T232817 [production]
09:24 <gehel> deny access to /geoline on maps1004 - T232817 [production]
09:11 <oblivian@puppetmaster1001> conftool action : set/pooled=true; selector: dnsdisc=kartotherian,name=eqiad [production]
09:08 <godog> downtime kartotherian pages for 1h in codfw [production]
09:01 <oblivian@puppetmaster1001> conftool action : set/pooled=inactive; selector: name=elastic1046.eqiad.wmnet [production]
09:00 <oblivian@puppetmaster1001> conftool action : set/pooled=inactive; selector: name=elastic1017.eqiad.wmnet [production]
08:57 <oblivian@puppetmaster1001> conftool action : set/pooled=false; selector: dnsdisc=kartotherian,name=eqiad [production]
08:52 <godog> downtime kartotherian pages for 1h [production]
08:48 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0) [production]
08:48 <jmm@cumin2001> Updating IPMI password on 1 hosts - jmm@cumin2001 [production]
08:47 <jmm@cumin2001> START - Cookbook sre.hosts.ipmi-password-reset [production]
08:47 <jmm@cumin2001> END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99) [production]
08:47 <jmm@cumin2001> START - Cookbook sre.hosts.ipmi-password-reset [production]
08:45 <gehel> stop tilerator on maps to help reduce load [production]
08:37 <_joe_> rolling restart of kartotherian [production]
08:33 <_joe_> restarting kartotherian on maps1003, all workers seem stuck [production]
05:58 <oblivian@deploy1001> Synchronized w/fatal-error.php: Adding core dump function to fatal-error (duration: 01m 04s) [production]
05:40 <_joe_> live-hacking mw1348, setting rlimit_core = unlimited to allow core dumps to be taken [production]
05:17 <effie> Rolling restart php-fpm across the fleet for 536400 [production]
04:53 <vgutierrez> restarting ats-tls on cp4021 and cp2002 to pick up the new SSL session cache timeout - T231849 [production]
04:50 <eileen> process-control config revision is 43a2677bcf - turned off gender import [production]
02:23 <eileen> civicrm revision changed from c5ab5aea9e to 45dbfdb96f, config revision is 1da8391a9a [production]
01:09 <XioNoX> add IPv6 sampling to cr1-eqiad [production]
01:07 <XioNoX> enable netflow sampling on cr2-codfw [production]
2019-09-12
23:35 <XioNoX> enable netflow sampling on cr1-codfw [production]
23:21 <urandom> decommissioning Cassandra, restbase2009-b -- T224553 [production]
23:19 <jforrester@deploy1001> Synchronized wmf-config/CommonSettings.php: T223602 Read config from JSON, not serialised PHP on testwiki (duration: 01m 03s) [production]
23:18 <jforrester@deploy1001> Synchronized multiversion/MWConfigCacheGenerator.php: T223602 Add ability to read config from JSON, not serialised PHP (duration: 01m 04s) [production]
23:10 <eileen> process-control config revision is 1da8391a9a [production]
22:53 <ayounsi@cumin2001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
22:48 <ayounsi@cumin2001> START - Cookbook sre.ganeti.makevm [production]