2020-08-19 §
18:19 <urbanecm@deploy1001> Synchronized static/images/mobile/copyright/: b9043331c1c1b352256cffd471b9ff128806607c: Update project wordmarks (T254788; sync 1/2) (duration: 01m 06s) [production]
18:15 <mutante> rebooting webperf2002 VM on ganeti level (outside OS) to upgrade from 8 to 16GB RAM (T260192) [production]
18:15 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: a6f8354e7599a5e92bea060807065f5b42c540e5: Enable $wgMFNoindexPages for all wikis (T255458) (duration: 01m 07s) [production]
18:13 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
18:13 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
18:13 <ppchelko@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . [production]
17:38 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
17:38 <mutante> decom'ing releases2001.codfw.wmnet ( [production]
17:37 <dzahn@cumin1001> START - Cookbook sre.hosts.decommission [production]
16:39 <ppchelko@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . [production]
16:37 <ppchelko@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . [production]
16:32 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
16:30 <andrew@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:41 <rzl> finished exercising the switchdc cookbooks with --live-test for now, all changes reverted including re-enabling puppet on cumin1001 [production]
15:38 <rzl@cumin1001> END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) [production]
15:37 <rzl@cumin1001> START - Cookbook sre.switchdc.mediawiki.08-start-maintenance [production]
15:34 <rzl@cumin1001> END (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0) [production]
15:34 <rzl@cumin1001> START - Cookbook sre.switchdc.mediawiki.08-restore-ttl [production]
15:33 <rzl@cumin1001> END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=99) [production]
15:33 <rzl@cumin1001> START - Cookbook sre.switchdc.mediawiki.08-restore-ttl [production]
15:31 <jbond42> update java.security https://gerrit.wikimedia.org/r/c/operations/puppet/+/593467 [production]
15:30 <oblivian@cumin1001> conftool action : set/ttl=300; selector: dnsdisc=api-rw [production]
15:26 <rzl@cumin1001> END (FAIL) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=99) [production]
15:26 <rzl@cumin1001> START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl [production]
15:22 <rzl@cumin1001> END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=99) [production]
15:22 <rzl@cumin1001> START - Cookbook sre.switchdc.mediawiki.08-restore-ttl [production]
15:18 <godog> prometheus codfw lvextend --resizefs --size +80G /dev/mapper/vg--ssd-prometheus--ops [production]
15:17 <rzl@cumin1001> END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) [production]
15:17 <rzl@cumin1001> START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl [production]
15:16 <rzl@cumin1001> END (FAIL) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=99) [production]
15:16 <rzl@cumin1001> START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl [production]
15:14 <rzl@cumin1001> END (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0) [production]
15:14 <rzl@cumin1001> START - Cookbook sre.switchdc.mediawiki.08-restore-ttl [production]
15:08 <rzl@cumin1001> END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=99) [production]
15:08 <rzl@cumin1001> START - Cookbook sre.switchdc.mediawiki.08-restore-ttl [production]
15:06 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
15:04 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
14:50 <rzl@cumin1001> END (FAIL) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=99) [production]
14:50 <rzl@cumin1001> START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl [production]
14:50 <rzl> running the switchdc cookbooks with --live-test, simulating a switch to eqiad where we're already running, no production impact is expected [production]
14:47 <rzl@cumin1001> END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0) [production]
14:47 <rzl@cumin1001> START - Cookbook sre.switchdc.mediawiki.00-disable-puppet [production]
14:41 <rzl> disable puppet on cumin1001 for switchdc testing [production]
14:35 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
14:33 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
14:27 <ppchelko@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . [production]
13:38 <ppchelko@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . [production]
13:34 <gehel> depooling wdqs1007 and restarting blazegraph [production]
13:29 <_joe_> depooling and disabling puppet on restbase1024 for further investigation [production]
13:27 <ppchelko@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' . [production]